bunnei
6df6caaf5f
Merge pull request #3143 from ReinUsesLisp/indexing-bug
...
gl_device: Deduce indexing bug from device instead of heuristic
2019-11-26 21:53:12 -05:00
ReinUsesLisp
ef4446cb11
gl_shader_decompiler: Fix casts from fp32 to f16
...
Casts from f32 to f16 zeroes the higher half of the target register.
2019-11-25 22:22:33 -03:00
ReinUsesLisp
410d44ce05
gl_device: Deduce indexing bug from device instead of heuristic
...
The heuristic to detect AMD's driver was not working properly since it
also included Intel. Instead of using heuristics to detect it, compare
the GL_VENDOR string.
2019-11-25 16:15:22 -03:00
bunnei
2899c93818
Merge pull request #3158 from ReinUsesLisp/srgb-blit
...
gl_texture_cache: Apply sRGB on blits
2019-11-24 20:47:13 -05:00
bunnei
33a6b45a6c
Merge pull request #3155 from bunnei/fix-asynch-gpu-wait
...
gpu_thread: Don't spin wait if there are no GPU commands.
2019-11-24 20:19:25 -05:00
bunnei
b03242067d
Merge pull request #3098 from ReinUsesLisp/shader-invalidations
...
gl_shader_cache: Miscellaneous changes to shaders
2019-11-24 19:36:30 -05:00
ReinUsesLisp
74fff717aa
gl_texture_cache: Apply sRGB on blits
...
glBlitFramebuffer keeps in mind GL_FRAMEBUFFER_SRGB's state. Enable this
depending on the target surface pixel format.
2019-11-24 18:13:33 -03:00
bunnei
b7031b2b9d
Merge pull request #3105 from ReinUsesLisp/fix-stencil-reg
...
maxwell_3d: Fix stencil_back_func_mask offset
2019-11-24 13:53:23 -05:00
bunnei
e81e0036b4
Merge pull request #3145 from ReinUsesLisp/buffer-cache-init
...
buffer_cache: Remove brace initialized for objects with default constructor
2019-11-24 02:55:02 -05:00
bunnei
9ec84fc592
gpu_thread: Don't spin wait if there are no GPU commands.
2019-11-23 15:17:28 -05:00
bunnei
4ed183ee42
Merge pull request #3141 from ReinUsesLisp/gl-position
...
gl_shader_gen: Apply default value to gl_Position
2019-11-23 13:23:46 -05:00
ReinUsesLisp
dc2e83fa31
gl_device: Reserve base bindings on limited devices
...
SSBOs and other resources are limited per pipeline on Intel and AMD.
Heuristically reserve resources per stage having in mind the reported
OpenGL limits.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
e3d7334be9
gl_state: Skip null texture binds
...
glBindTextureUnit doesn't support null textures. Skip binding these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
919ac2c4d3
gl_rasterizer: Disable compute shaders on Intel
...
Intel's proprietary driver enters in a corrupt state when compute
shaders are executed. For now, disable these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp
894ad74b87
gl_shader_cache: Hack shared memory size
...
The current shared memory size seems to be smaller than what the game
actually uses. This makes Nvidia's driver consistently blow up; in the
case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked
just fine. For now keep this hack since it's still progress over the
previous hardcoded shared memory size.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
e35b9597ef
gl_shader_decompiler: Normalize image bindings
2019-11-22 21:28:49 -03:00
ReinUsesLisp
36d9b409fc
gl_shader_decompiler: Normalize cbuf bindings
...
Stage and compute shaders were using a different binding counter.
Normalize these.
2019-11-22 21:28:49 -03:00
ReinUsesLisp
f936b86c7c
gl_rasterizer: Add missing cbuf counter reset on compute
2019-11-22 21:28:49 -03:00
ReinUsesLisp
180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization
2019-11-22 21:28:49 -03:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType
2019-11-22 21:28:48 -03:00
ReinUsesLisp
0f23359a44
gl_rasterizer: Bind graphics images to draw commands
...
Images were not being bound to draw invocations because these would
require a cache invalidation.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
...
Local memory size in compute shaders was stubbed with an arbitary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
dbeb523879
gl_shader_cache: Specialize shared memory size
...
Shared memory was being declared with an undefined size. Specialize from
guest GPU parameters the compute shader's shared memory size.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
4f5d8e4342
gl_shader_cache: Specialize shader workgroup
...
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
dc9961f341
shader/texture: Handle TLDS texture type mismatches
...
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.
E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp
32c1bc6a67
shader/texture: Deduce texture buffers from locker
...
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
73aaf365e7
buffer_cache: Remove brace initialized for objects with default constructor
2019-11-20 16:00:40 -03:00
Fernando Sahmkow
cc81c0ce64
Texture_Cache: Redo invalid Surfaces handling.
...
This commit aims to redo the full setup of invalid textures and
guarantee correct behavior across backends in the case of finding one by
using black dummy textures that match the target of the expected
texture.
2019-11-20 14:59:35 -04:00
ReinUsesLisp
24f4198cee
shader/other: Reduce DEPBAR log severity
...
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
ReinUsesLisp
bc10714dcf
gl_shader_gen: Apply default value to gl_Position
...
Nvidia has sane default output values for varyings, but the other
vendors don't apply these. To properly emulate this we would have to
analyze the shader header. For the time being, apply the same default
Nvidia applies so we get the same behaviour on non-Nvidia drivers.
2019-11-19 20:32:01 -03:00
bunnei
b0819e2ffb
Merge pull request #3086 from ReinUsesLisp/format-lookups
...
texture_cache: Use a flat table instead of switch for texture format lookups
2019-11-19 18:29:17 -05:00
Fernando Sahmkow
c8473f399e
Shader_IR: Address Feedback
2019-11-18 07:34:34 -04:00
bunnei
a8295d2c53
Merge pull request #3047 from ReinUsesLisp/clip-control
...
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
2019-11-15 12:09:19 -05:00
ReinUsesLisp
4681381a34
format_lookup_table: Address feedback
...
format_lookup_table: Drop bitfields
format_lookup_table: Use std::array for definition table
format_lookup_table: Include <limits> instead of <numeric>
2019-11-14 20:57:30 -03:00
ReinUsesLisp
80eacdf89b
texture_cache: Use a table instead of switch for texture formats
...
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
ReinUsesLisp
48a1687f51
texture_cache: Drop abstracted ComponentType
...
Abstracted ComponentType was not being used in a meaningful way.
This commit drops its usage.
There is one place where it was being used to test compatibility between
two cached surfaces, but this one is implied in the pixel format.
Removing the component type test doesn't change the behaviour.
2019-11-14 18:21:42 -03:00
greggameplayer
c6bc13d0aa
correct the implementation of RGBA16UI
2019-11-14 21:37:39 +01:00
Fernando Sahmkow
cd0f5dfc17
Shader_IR: Implement TXD instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
f3d1b370aa
Shader_IR: Implement FLO instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
95137a04e1
Shader_Bytecode: Add encodings for FLO, SHF and TXD
2019-11-14 11:15:26 -04:00
Fernando Sahmkow
b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
...
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp
7990220df7
maxwell_3d: Fix stencil_back_func_mask offset
...
stencil_back_func_mask and stencil_back_mask were misplaced. This commit
addresses that issue.
2019-11-13 16:35:17 -03:00
Rodrigo Locatti
cf770a68a5
Merge pull request #3084 from ReinUsesLisp/cast-warnings
...
video_core: Treat implicit conversions as errors
2019-11-13 02:16:22 -03:00
Rodrigo Locatti
fb9418798d
video_core: Enable sign conversion warnings
...
Enable sign conversion warnings but don't treat them as errors.
2019-11-11 18:00:37 -03:00
bunnei
0fc596de6e
Merge pull request #3082 from ReinUsesLisp/fix-lockers
...
gl_shader_cache: Fix locker constructors
2019-11-09 13:58:36 -05:00
ReinUsesLisp
18c1cb68fd
video_core: Treat implicit conversions as errors
2019-11-08 22:49:39 +00:00
ReinUsesLisp
096f339a2a
video_core: Silence implicit conversion warnings
2019-11-08 22:48:50 +00:00
bunnei
a056d8de16
Merge pull request #3080 from FernandoS27/glsl-fix
...
GLSLDecompiler: Correct Texture Gather Offset.
2019-11-08 15:56:29 -05:00
ReinUsesLisp
bfa973a62b
gl_shader_cache: Fix locker constructors
...
Properly pass engine when a shader is being constructed from memory.
2019-11-07 20:43:31 -03:00
ReinUsesLisp
3ab0514698
gl_shader_cache: Enable extensions only when available
...
Silence GLSL compilation warnings.
2019-11-07 20:08:42 -03:00
ReinUsesLisp
cd66395944
gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not available
2019-11-07 20:08:42 -03:00
ReinUsesLisp
56e237d1f9
shader_ir/warp: Implement FSWZADD
2019-11-07 20:08:41 -03:00
ReinUsesLisp
08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics
2019-11-07 20:08:41 -03:00
Fernando Sahmkow
3d7c284e0f
GLSLDecompiler: Correct Texture Gather Offset.
...
This commit corrects the argument ordering in textureGatherOffset.
2019-11-07 11:43:56 -04:00
bunnei
b6ae48966d
Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx
...
shader/control_flow: Abstract repeated code chunks in BRX tracking
2019-11-07 01:30:01 -05:00
Morph
0e8a3bf3e5
buffer_cache: Add missing includes ( #3079 )
...
`boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included.
Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.
2019-11-07 06:25:53 +00:00
bunnei
344d15f61e
Merge pull request #3070 from ReinUsesLisp/shader-warnings
...
shader_ir: Reduce severity of warnings
2019-11-07 00:47:24 -05:00
ReinUsesLisp
e9d2fad984
gl_rasterizer: Remove front facing hack
2019-11-07 01:52:18 -03:00
ReinUsesLisp
f1facaeaef
gl_shader_decompiler: Fix typo "y_negate"->"y_direction"
2019-11-07 01:52:18 -03:00
ReinUsesLisp
e2ea0c3e11
gl_shader_manager: Remove unused variable in SetFromRegs
2019-11-07 01:52:18 -03:00
ReinUsesLisp
f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
...
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulated pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
Rodrigo Locatti
ff5a0f370c
shader/control_flow: Specify constness on caller lambdas
...
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
Update src/video_core/shader/control_flow.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07 01:44:09 -03:00
ReinUsesLisp
7b069252f8
shader/control_flow: Use callable template instead of std::function
2019-11-07 01:44:08 -03:00
ReinUsesLisp
46c3047283
shader/control_flow: Abstract repeated code chunks in BRX tracking
...
Remove copied and pasted for cycles into a common templated function.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
ae7dfa93be
shader/control_flow: Silence Intellisense cast warnings
2019-11-07 01:44:08 -03:00
ReinUsesLisp
deb1b54eed
shader/control_flow: Remove brace initializer in std containers
...
These containers have a default constructor.
2019-11-07 01:44:08 -03:00
ReinUsesLisp
39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings
2019-11-07 01:43:38 -03:00
ReinUsesLisp
c4374d0d41
shader/arithmetic: Reduce RRO stub severity
2019-11-07 01:43:38 -03:00
ReinUsesLisp
35d40b74b3
shader/texture: Remove NODEP warnings
...
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
bunnei
468576284d
Merge pull request #3057 from ReinUsesLisp/buffer-sub-data
...
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
2019-11-06 10:08:55 -05:00
Rodrigo Locatti
654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
...
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
bunnei
21e07df7b7
Merge pull request #2914 from FernandoS27/fermi-fix
...
Fermi2D: limit blit area to only available area
2019-11-05 20:45:24 -05:00
bunnei
1bdae0fe29
common_func: Use std::array for INSERT_PADDING_* macros.
...
- Zero initialization here is useful for determinism.
2019-11-03 22:22:41 -05:00
ReinUsesLisp
442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
...
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp
76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
...
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements
to a fast. This path has an extra memcpy but updates the buffer without
orphaning or waiting for previous calls. It can be seen as a better
model for "push constants" that can upload a whole UBO instead of 256
bytes.
This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24
Instead of using the stream buffer, this commits moves constant buffers
uploads to calls of glNamedBufferSubData and from my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
Fernando Sahmkow
23cabc98db
Shader_IR: Fix regression on TLD4
...
Originally on the last commit I thought TLD4 acted the same as TLD4S and
didn't have a mask. It actually does have a component mask. This commit
corrects that.
2019-10-30 21:14:57 -04:00
Rodrigo Locatti
658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
...
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow
9293c3a0f2
Shader_IR: Fix TLD4 and add Bindless Variant.
...
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
bunnei
2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
...
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei
b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
...
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
Rodrigo Locatti
3d0cde6a75
gl_state: Use std::array::fill instead of std::fill
...
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30 01:30:31 +00:00
ReinUsesLisp
ce20ed8e4e
gl_state: Move dirty checks to individual apply calls instead of Apply
...
This requires removing constness from some methods, but for consistency
it's removed in all methods.
2019-10-29 21:27:25 -03:00
ReinUsesLisp
3c6557c235
gl_state: Remove ApplyDefaultState
...
OpenGL has defaults values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp
d3651b0b82
gl_state: Change SetDefaultViewports to use default constructor
2019-10-29 21:27:24 -03:00
ReinUsesLisp
c7698d0bc8
gl_state: Minor style changes
2019-10-29 21:27:24 -03:00
ReinUsesLisp
a14d202ac2
gl_state: Remove unused Citra TextureUnits
2019-10-29 21:27:24 -03:00
ReinUsesLisp
28fece8e9b
gl_state: Move initializers from constructor to class declaration
2019-10-29 21:27:23 -03:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
...
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.
Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti
2ec5b55ee3
Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup
...
maxwell_3d: Remove unused entries
2019-10-29 23:46:33 +00:00
Rodrigo Locatti
c5d9589942
Merge pull request #3037 from FernandoS27/new-formats
...
video_core: Implement texture format E5B9G9R9_SHAREDEXP.
2019-10-28 01:36:58 -03:00
ReinUsesLisp
fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture
2019-10-28 00:23:42 -03:00
ReinUsesLisp
538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo
2019-10-28 00:14:38 -03:00
ReinUsesLisp
961fe4d19b
maxwell_3d: Remove unused method GetStageTextures
2019-10-28 00:14:29 -03:00
Fernando Sahmkow
3f9262195b
Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.
...
This commit implements the E5B9G9R9 Texture format into the general
system and OpenGL backend.
2019-10-27 16:44:09 -04:00
bunnei
6909b2f0f9
Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d
...
maxwell_3d: Silence implicit conversion warnings
2019-10-27 15:08:59 -04:00
ReinUsesLisp
3e469cecc1
maxwell_3d: Silence implicit conversion warnings
...
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
ReinUsesLisp
bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
...
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
ReinUsesLisp
a5aa1bb174
astc: Silence implicit conversion warnings
2019-10-27 03:04:50 -03:00
Rodrigo Locatti
26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
...
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow
be856a38d6
Shader_IR: Address Feedback.
2019-10-26 15:38:30 -04:00
Rodrigo Locatti
a0d79085c4
Merge pull request #3027 from lioncash/lookup
...
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti
d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
...
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow
e3afd6595a
Shader_IR: Clang format
2019-10-25 09:01:32 -04:00
ReinUsesLisp
78f3e8a757
gl_shader_cache: Implement locker variants invalidation
2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3
gl_shader_disk_cache: Store and load fast BRX
2019-10-25 09:01:31 -04:00
ReinUsesLisp
fa2c297f3e
const_buffer_locker: Minor style changes
2019-10-25 09:01:31 -04:00
ReinUsesLisp
7b81ba4d8a
gl_shader_decompiler: Move entries to a separate function
2019-10-25 09:01:31 -04:00
Fernando Sahmkow
1244f2d368
Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.
2019-10-25 09:01:31 -04:00
Fernando Sahmkow
a05120ec0b
Shader_IR: Correct typo in Consistent method.
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
33fcec3502
Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
8909f52166
Shader_IR: Implement Fast BRX and allow multi-branches in the CFG.
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
acd6441134
Shader_Cache: setup connection of ConstBufferLocker
2019-10-25 09:01:29 -04:00
Fernando Sahmkow
1a58f45d76
VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.
2019-10-25 09:01:29 -04:00
Fernando Sahmkow
2ef696c85a
Shader_IR: Implement BRX tracking.
2019-10-25 09:01:29 -04:00
Rodrigo Locatti
5062728669
Merge pull request #3028 from lioncash/constexpr
...
shader_bytecode: Make Matcher constexpr capable
2019-10-24 15:10:40 -03:00
Lioncash
7fdf991097
shader_bytecode: Make Matcher constexpr capable
...
Greatly shrinks the amount of generated code for GetDecodeTable().
Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
Lioncash
382717172e
shader_ir: Use std::array with pair instead of unordered_map
...
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash
1f5401c89c
video_core/shader: Resolve instances of variable shadowing
...
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow
c4a0aa9207
Merge pull request #2995 from ReinUsesLisp/ignore-gmem
...
shader_ir/memory: Ignore global memory when tracking fails
2019-10-22 13:22:43 -04:00
Fernando Sahmkow
7ecf9f7228
Merge pull request #2983 from lioncash/fallthrough
...
gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases
2019-10-22 13:16:46 -04:00
Fernando Sahmkow
1509d2ffbd
Shader_Ir: Fix TLD4S from using a component mask.
...
TLD4S always outputs 4 values, the previous code checked a component
mask and omitted those values that weren't part of it. This commit
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp
1ea07954fb
shader_ir/memory: Ignore global memory when tracking fails
...
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.
In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.
The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
ReinUsesLisp
e3107788e6
maxwell_3d: Reduce FlushMMEInlineDraw logging to Trace
2019-10-20 03:43:17 -03:00
Rodrigo Locatti
dc5eedef71
Merge pull request #2994 from lioncash/fmt
...
video_core/shader/ast: Minor changes to ASTPrinter
2019-10-18 01:05:25 -03:00
Lioncash
074b38b7a9
video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions
...
These can also trivially be made const member functions, with the
addition of a few consts.
2019-10-17 20:59:48 -04:00
Lioncash
222f4b45eb
video_core/shader/ast: Make ASTManager::Print a const member function
...
Given all visiting functions never modify the nodes, we can trivially
make this a const member function.
2019-10-17 20:56:39 -04:00
Rodrigo Locatti
fd922ddb01
Merge pull request #2993 from lioncash/vulkan-expr
...
vk_shader_decompiler: Mark operator() function parameters as const references
2019-10-17 21:46:49 -03:00
Lioncash
7831e86c34
video_core/shader/ast: Make ExprPrinter members private
...
This member already has an accessor, so there's no need for it to be
public.
2019-10-17 20:39:36 -04:00
Lioncash
a2eccbf075
video_core/shader/ast: Make Indent() return a string_view
...
The returned string is simply a substring of our constexpr tabs
string_view, so we can just use a string_view here as well, since the
original string_view is guaranteed to always exist.
Now the function is fully non-allocating.
2019-10-17 20:29:00 -04:00
Lioncash
15d177a6ac
video_core/shader/ast: Make Indent() private
...
It's never used outside of this class, so we can narrow its scope down.
2019-10-17 20:26:13 -04:00
Lioncash
7f6a8a33d4
video_core/shader/ast: Rename Ident() to Indent()
...
This can be confusing, given "ident" is generally used as a shorthand
for "identifier".
2019-10-17 20:26:13 -04:00
Lioncash
081530686c
video_core/shader/ast: Make use of fmt where applicable
...
Makes a few strings nicer to read and also eliminates a bit of string
churn with operator+.
2019-10-17 20:26:10 -04:00
Lioncash
c6bec9aa10
vk_shader_decompiler: Mark operator() function parameters as const references
...
These parameters aren't actually modified in any way, so they can be
made const references.
2019-10-17 19:44:00 -04:00
Rodrigo Locatti
219fdcb9d9
Merge pull request #2966 from FernandoS27/astc-formats
...
Implement a series of ASTC formats and R4G4B4A4 format
2019-10-17 19:24:11 -03:00
Rodrigo Locatti
a21b88ef8f
Merge pull request #2979 from lioncash/macro
...
video_core/macro_interpreter: Make definitions of most private enums/unions hidden
2019-10-17 19:21:09 -03:00
Fernando Sahmkow
c0eb1aecfd
Fermi2D: Use a different formula for delimiting blit areas.
2019-10-17 18:21:01 -04:00
Lioncash
125caf5d6e
video_core/macro_interpreter: Make definitions of most private enums/unions hidden
...
This allows the implementation of these types to change without
requiring a rebuild of everything that includes the macro interpreter
header.
2019-10-17 17:55:46 -04:00
bunnei
9fe8072c67
Merge pull request #2980 from lioncash/warn
...
maxwell_3d: Silence truncation warnings
2019-10-17 14:02:16 -04:00
Fernando Sahmkow
57a46c69f1
Fermi2D: limit blit area to only available area
...
Normaly OpenGL does not care if the areas exceed the texture regions but
other backends such as Vulkan do care about the limits of this areas.
This PR crops the areas of the blit in order that they don't surpass the
limits of the textures. This should help Vulkan and faulty OpenGL
drivers
2019-10-17 10:38:44 -04:00
Rodrigo Locatti
60c602e4e7
Merge pull request #2978 from lioncash/doxygen
...
video_core/texture_cache: Amend Doxygen references
2019-10-16 22:09:40 -03:00
Rodrigo Locatti
e00b529a89
Merge pull request #2982 from lioncash/surface
...
texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
2019-10-16 19:43:32 -03:00
bunnei
ef9b31783d
Merge pull request #2912 from FernandoS27/async-fixes
...
General fixes to Async GPU
2019-10-16 10:34:48 -04:00
Rodrigo Locatti
60315060b1
Merge pull request #2984 from lioncash/fallthrough2
...
video_core/surface: Add missing break in PixelFormatFromTextureFormat()
2019-10-15 23:08:34 -03:00
Lioncash
cf9e13c255
video_core/surface: Add missing break in PixelFormatFromTextureFormat()
...
Prevents fallthrough into the following case.
2019-10-15 21:53:15 -04:00
Rodrigo Locatti
14f3cebcd4
Merge pull request #2981 from lioncash/copy
...
gl_shader_decompiler: Minor cleanup-related changes
2019-10-15 21:07:25 -03:00
Lioncash
6947bf8e44
vk_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
...
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:40:58 -04:00
Lioncash
b42a74ff2c
gl_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
...
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:38:55 -04:00
Lioncash
a24e8bf9cf
texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
...
We can take these by const reference and avoid making unnecessary
copies, preventing some atomic reference count increments and
decrements.
2019-10-15 19:31:33 -04:00
Lioncash
77b4916b33
control_flow: Silence truncation warnings
...
This can be trivially fixed by making the input size a size_t.
CFGRebuildState's constructor parameter is already a std::size_t, so
this just makes the size type fully conform with it.
2019-10-15 19:10:28 -04:00
Lioncash
4f16ce9294
gl_shader_decompiler: Make ExprDecompiler's GetResult() a const member function
...
This is only ever used to read, but not write, the resulting string, so
we can enforce this by making it a const member function.
2019-10-15 19:02:59 -04:00
Lioncash
67df3f7742
gl_shader_decompiler: Use a std::string_view with GetDeclarationWithSuffix()
...
This allows the function to be completely non-allocating for inputs of
all sizes (i.e. there's no heap cost for an input to convert to a
std::string_view).
2019-10-15 19:00:48 -04:00
Lioncash
04a1161354
gl_shader_decompiler: Fold flow_var constant into GetFlowVariable()
...
This is only ever used within this function, so we can narrow it's scope
down.
2019-10-15 18:58:36 -04:00
Lioncash
2f2ab9b5bc
gl_shader_decompiler: Mark ASTDecompiler/ExprDecompiler parameters as const references where applicable
...
These member functions don't actually modify the input parameter, so we
can make this explicit with the use of const.
2019-10-15 18:57:02 -04:00
Lioncash
b8a62adcf1
gl_shader_decompiler: Pass by reference to GenerateTextureArgument()
...
Avoids an unnecessary atomic reference count increment and decrement.
2019-10-15 18:29:37 -04:00
Lioncash
d1d7ce74d2
gl_shader_decompiler: Use std::holds_alternative within GenerateTexture()
...
This only ever queries if the type exists within the variant, but
doesn't actually do anything with the return value. We can just use
std::holds_alternative for this use case.
2019-10-15 18:25:48 -04:00
Lioncash
67658dd6e8
shader/node: std::move Meta instance within OperationNode constructor
...
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
Lioncash
9760795bfb
gl_shader_decompiler: Avoid unnecessary copies of MetaImage
...
MetaImage contains a std::vector, so copying here could result in
unnecessary reallocations. Given the operation lives throughout the
entire scope, this is safe to do.
2019-10-15 18:14:55 -04:00
Lioncash
c9c75f9587
maxwell_3d: Silence truncation warnings
...
A trivial warning caused by not using size_t as the argument types
instead of u32.
2019-10-15 17:51:35 -04:00
bunnei
2299950de1
Merge pull request #2972 from lioncash/system
...
{bcat, gpu, nvflinger}: Remove trivial usages of the global system accessor
2019-10-15 17:49:12 -04:00
Lioncash
b25b94400e
video_core/gpu: Remove use of the global system accessor
...
We can just make use of the reference member variable instead of
accessing the global system instance.
2019-10-15 16:39:30 -04:00
Lioncash
524eb15513
video_core/texture_cache: Amend Doxygen references
...
Amends the doxygen comments so that they properly resolve. While we're
at it, we can correct some typos and fix up some of the comments'
formatting in order to make them slightly nicer to read.
2019-10-15 15:40:00 -04:00
Lioncash
ac4dbd3b25
common: Rename binary_find.h to algorithm.h
...
Makes the header more general for other potential algorithms in the
future. While we're at it, include a missing <functional> include to
satisfy the use of std::less.
2019-10-15 15:24:50 -04:00
Fernando Sahmkow
cfc2f30dc4
AsyncGpu: Address Feedback
2019-10-11 13:41:15 -04:00
bunnei
2ba273e49e
Merge pull request #2928 from ReinUsesLisp/dirty-depth-bounds
...
maxwell_3d: Add dirty flags for depth bounds values
2019-10-09 15:44:30 -04:00
bunnei
6b5e50d20e
Merge pull request #2927 from ReinUsesLisp/polygon-offset-units
...
gl_rasterizer: Fix polygon offset units
2019-10-09 15:38:52 -04:00
Fernando Sahmkow
f32a49d3d8
Surfaces: Implement R4G4B4A4U format.
2019-10-09 12:57:02 -04:00
Fernando Sahmkow
b9ddb517b1
Surfaces: Implement ASTC 6x6 10x10 12x12 8x6 6x5
2019-10-09 12:44:31 -04:00
ReinUsesLisp
3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
...
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp
632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG
2019-10-07 14:48:58 -03:00
ReinUsesLisp
58b597c5ec
gl_shader_disk_cache: Properly ignore existing cache
...
Previously old entries where appended to the file even if the shader
cache was ignored at boot. Address that issue.
2019-10-06 18:00:20 -03:00
Lioncash
f883cd4f0e
video_core/control_flow: Eliminate variable shadowing warnings
2019-10-05 09:14:27 -04:00
Lioncash
25702b6256
video_core/control_flow: Eliminate pessimizing moves
...
These can inhibit the ability of a compiler to perform RVO.
2019-10-05 09:14:27 -04:00
Lioncash
d82b181d44
video_core/ast: Unindent most of IsFullyDecompiled() by one level
2019-10-05 09:14:27 -04:00
Lioncash
6c41d1cd7e
video_core/ast: Make ShowCurrentState() take a string_view instead of std::string
...
Allows the function to be non-allocating in terms of the output string.
2019-10-05 09:14:27 -04:00
Lioncash
3c54edae24
video_core/ast: Eliminate variable shadowing warnings
2019-10-05 09:14:26 -04:00
Lioncash
5a0a9c7449
video_core/ast: Replace std::string with a constexpr std::string_view
...
Same behavior, but without the need to heap allocate
2019-10-05 09:14:26 -04:00
Lioncash
3a20d9734f
video_core/ast: Default the move constructor and assignment operator
...
This is behaviorally equivalent and also fixes a bug where some members
weren't being moved over.
2019-10-05 09:14:26 -04:00
Lioncash
43503a69bf
video_core/{ast, expr}: Organize forward declaration
...
Keeps them alphabetically sorted for readability.
2019-10-05 09:14:26 -04:00
Lioncash
50ad745585
video_core/expr: Supply operator!= along with operator==
...
Provides logical symmetry to the interface.
2019-10-05 09:14:26 -04:00
Lioncash
8eb1398f8d
video_core/{ast, expr}: Use std::move where applicable
...
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05 09:14:23 -04:00
Lioncash
8e0c80f269
video_core/ast: Supply const accessors for data where applicable
...
Provides const equivalents of data accessors for use within const
contexts.
2019-10-05 08:22:03 -04:00
David
3728bbc22a
Merge pull request #2888 from FernandoS27/decompiler2
...
Shader_IR: Implement a full control flow decompiler for the shader IR.
2019-10-05 21:52:20 +10:00
ReinUsesLisp
fe7f20e659
maxwell_3d: Add dirty flags for depth bounds values
...
This is useful in Vulkan where we want to update depth bounds without
caring if it's enabled or disabled through vkCmdSetDepthBounds.
2019-10-05 04:07:47 +00:00
Fernando Sahmkow
538f5880ff
GL_Renderer: Remove lefting snippet.
2019-10-04 19:59:55 -04:00
Fernando Sahmkow
9f2719d1a4
Gl_Rasterizer: Protect CPU Memory mapping from multiple threads.
2019-10-04 19:59:53 -04:00
Fernando Sahmkow
3f104464de
Core: Wait for GPU to be idle before shutting down.
2019-10-04 19:59:53 -04:00
Fernando Sahmkow
ffc2ce89a0
Nvdrv: Do framelimiting only in the CPU Thread
2019-10-04 19:59:50 -04:00
Fernando Sahmkow
5b5e60ffec
GPU_Async: Correct fences, display events and more.
...
This commit uses guest fences on vSync event instead of an articial fake
fence we had.
It also corrects to keep signaling display events while loading the game
as the OS is suppose to send buffers to vSync during that time.
2019-10-04 19:59:48 -04:00
Fernando Sahmkow
ab47a660c8
Texture_Cache: Blit Deduction corrections and simplifications.
2019-10-04 18:53:47 -04:00
Fernando Sahmkow
2036504a82
TextureCache: Add the ability to deduce if two textures are depth on blit.
2019-10-04 18:53:46 -04:00
Fernando Sahmkow
e6eae4b815
Shader_ir: Address feedback
2019-10-04 18:52:57 -04:00
Fernando Sahmkow
3c09d9abe6
Shader_Ir: Address Feedback and clang format.
2019-10-04 18:52:57 -04:00
Fernando Sahmkow
507a9c6a40
vk_shader_decompiler: Correct Branches inside conditionals.
2019-10-04 18:52:56 -04:00
Fernando Sahmkow
000ad558dd
vk_shader_decompiler: Clean code and be const correct.
2019-10-04 18:52:55 -04:00
Fernando Sahmkow
7c756baa77
Shader_IR: clean up AST handling and add documentation.
2019-10-04 18:52:55 -04:00
Fernando Sahmkow
5ea740beb5
Shader_IR: Correct OutwardMoves for Ifs
2019-10-04 18:52:54 -04:00
Fernando Sahmkow
100a4bd988
vk_shader_compiler: Don't enclose branches with if(true) to avoid crashing AMD
2019-10-04 18:52:54 -04:00
Fernando Sahmkow
189a50bc2a
gl_shader_decompiler: Refactor and address feedback.
2019-10-04 18:52:53 -04:00
Fernando Sahmkow
b3c46d6948
Shader_IR: corrections and clang-format
2019-10-04 18:52:53 -04:00
Fernando Sahmkow
466cd52ad4
vk_shader_compiler: Correct SPIR-V AST Decompiling
2019-10-04 18:52:52 -04:00
Fernando Sahmkow
2e9a810423
Shader_IR: allow else derivation to be optional.
2019-10-04 18:52:52 -04:00
Fernando Sahmkow
ca9901867e
vk_shader_compiler: Implement the decompiler in SPIR-V
2019-10-04 18:52:51 -04:00
Fernando Sahmkow
0366c18d87
Shader_IR: mark labels as unused for partial decompile.
2019-10-04 18:52:51 -04:00
Fernando Sahmkow
47e4f6a52c
Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes.
2019-10-04 18:52:50 -04:00
Fernando Sahmkow
38fc995f6c
gl_shader_decompiler: Implement AST decompiling
2019-10-04 18:52:50 -04:00
Fernando Sahmkow
6fdd501113
shader_ir: Declare Manager and pass it to appropiate programs.
2019-10-04 18:52:49 -04:00
Fernando Sahmkow
8be6e1c522
shader_ir: Corrections to outward movements and misc stuffs
2019-10-04 18:52:48 -04:00
Fernando Sahmkow
4fde66e609
shader_ir: Add basic goto elimination
2019-10-04 18:52:48 -04:00
Fernando Sahmkow
c17953978b
shader_ir: Initial Decompile Setup
2019-10-04 18:52:47 -04:00
ReinUsesLisp
69c806feb6
gl_rasterizer: Fix polygon offset units
...
For some reason hardware divides polygon offset units by two. This is
visible since drivers multiply the application requested polygon offset
by two.
2019-10-01 02:00:23 -03:00
ReinUsesLisp
f926230ab1
gl_shader_decompiler: Add tailing return for HUnpack2
2019-09-24 01:03:59 -03:00
ReinUsesLisp
25bfaffdff
gl_shader_decompiler: Fix clang build issues
2019-09-24 01:03:27 -03:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
...
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
...
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Fernando Sahmkow
822ca65d69
Merge pull request #2891 from FearlessTobi/rod-tex
...
video_core: Implement RGBX16F and lower Surface Copy log severity
2019-09-22 09:11:28 -04:00
David
3bfba23362
Merge pull request #2867 from ReinUsesLisp/configure-framebuffers-clean
...
gl_rasterizer: Remove unused code paths from ConfigureFramebuffers
2019-09-22 23:10:07 +10:00
Fernando Sahmkow
68f5aff64f
Maxwell3D: Corrections and refactors to MME instance refactor
2019-09-22 07:23:13 -04:00
FearlessTobi
01fc969a5f
Fix clang-format
2019-09-22 02:21:56 +02:00
FearlessTobi
366e900376
fermi_2d: Lower surface copy log severity to DEBUG
2019-09-22 02:18:57 +02:00
FearlessTobi
55d272efe6
video_core: Implement RGBX16F PixelFormat
2019-09-22 02:16:44 +02:00
Rodrigo Locatti
9286976948
Merge pull request #2878 from FernandoS27/icmp
...
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
...
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
...
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp
4de0f1e1c8
shader_bytecode: Add SULD encoding
2019-09-21 17:31:46 -03:00
Fernando Sahmkow
527b841c15
Shader_IR: ICMP corrections and fixes
2019-09-21 14:28:03 -04:00
David
9ad42fb0cf
Merge pull request #2868 from ReinUsesLisp/fix-mipmaps
...
maxwell_to_gl: Fix mipmap filtering
2019-09-21 19:57:09 +10:00
David Marcec
01a4afee42
Mark DrawArrays as LOG_TRACE
...
There's no reason to clog logs with DrawArray.
2019-09-21 15:43:58 +10:00
bunnei
bbe82d62b0
Merge pull request #2846 from ReinUsesLisp/fixup-viewport-index
...
gl_shader_decompiler: Avoid writing output attribute when unimplemented
2019-09-20 17:11:20 -04:00
bunnei
88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
...
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow
433e764bb0
Rasterizer: Correct introduced bug where a conditional render wouldn't stop a draw call from executing
2019-09-20 15:44:28 -04:00
Fernando Sahmkow
4b81d19a1a
Shader_IR: Implement ICMP.
2019-09-19 20:56:29 -04:00
Fernando Sahmkow
7761e44d18
Rasterizer: Refactor and simplify DrawBatch Interface.
2019-09-19 11:41:33 -04:00
Fernando Sahmkow
d2ea592ddb
Rasterizer: Address Feedback and conscerns.
2019-09-19 11:41:32 -04:00
Fernando Sahmkow
c17655ce74
Rasterizer: Refactor draw calls, remove deadcode and clean up.
2019-09-19 11:41:31 -04:00
Fernando Sahmkow
7606da5611
VideoCore: Corrections to the MME Inliner and removal of hacky instance management.
2019-09-19 11:41:29 -04:00
Fernando Sahmkow
ba02d564f8
Video Core: initial Implementation of InstanceDraw Packaging
2019-09-19 11:41:27 -04:00
bunnei
b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
...
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp
0526bf1895
shader_ir/warp: Implement SHFL
2019-09-17 17:44:07 -03:00
ReinUsesLisp
2dd6411753
maxwell_to_gl: Fix mipmap filtering
...
OpenGL texture filters follow GL_<texture_filter>_MIPMAP_<mipmap_filter>
but we were using them in the opposite way.
2019-09-17 03:32:24 -03:00
ReinUsesLisp
af809b491e
gl_rasterizer: Remove unused code paths from ConfigureFramebuffers
2019-09-17 02:50:42 -03:00
Fernando Sahmkow
393cc3ef2f
Merge pull request #2851 from ReinUsesLisp/srgb
...
renderer_opengl: Fix sRGB blits
2019-09-15 10:38:10 -04:00
Fernando Sahmkow
b8b1747704
Merge pull request #2824 from ReinUsesLisp/mme
...
Revert "Revert #2466 " and stub FirmwareCall 4
2019-09-15 06:17:04 -04:00
Rodrigo Locatti
193bfefce4
maxwell_3d: Update firmware 4 call stub commentary
2019-09-14 22:51:18 -03:00
Fernando Sahmkow
daae327e86
Merge pull request #2857 from ReinUsesLisp/surface-srgb
...
video_core/surface: Add function to detect sRGB surfaces
2019-09-14 03:53:21 -04:00
Fernando Sahmkow
18fac59050
Merge pull request #2858 from ReinUsesLisp/vk-device
...
vk_device: Add miscellaneous features and minor style changes
2019-09-14 03:52:06 -04:00
ReinUsesLisp
01d96e1136
vk_device: Add miscellaneous features and minor style changes
...
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of commentaries
2019-09-13 02:10:07 -03:00
ReinUsesLisp
99e23bd0fd
video_core/surface: Add function to detect sRGB surfaces
...
This is required for proper conversion to RGBA8_UNORM or RGBA8_SRGB
surfaces when a backend can target both native and converted ASTC.
2019-09-13 00:27:04 -03:00
ReinUsesLisp
6b997c8f7f
renderer_opengl: Fix rebase mistake
2019-09-11 00:09:37 -03:00
ReinUsesLisp
36abf67e79
shader/image: Implement SUATOM and fix SUST
2019-09-10 20:22:31 -03:00
Fernando Sahmkow
e60d281a01
gl_rasterizer: Correct sRGB Fix regression
2019-09-10 19:31:42 -03:00
ReinUsesLisp
78574746bd
renderer_opengl: Fix sRGB blits
...
Removes the sRGB hack of tracking if a frame used an sRGB rendertarget
to apply at least once to blit the final texture as sRGB. Instead of
doing this apply sRGB if the presented image has sRGB.
Also enable sRGB by default on Maxwell3D registers as some games seem to
assume this.
2019-09-10 19:31:42 -03:00
bunnei
34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
...
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
bunnei
c7ec7bc1f5
Merge pull request #2810 from ReinUsesLisp/mme-opt
...
maxwell_3d: Avoid moving macro_params
2019-09-10 11:55:45 -04:00
ReinUsesLisp
17a9b0178d
gl_shader_decompiler: Avoid writing output attribute when unimplemented
2019-09-06 15:02:12 -03:00
ReinUsesLisp
1f43e5296f
gl_shader_decompiler: Keep track of written images and mark them as modified
2019-09-05 23:26:05 -03:00
ReinUsesLisp
7228e22098
texture_cache: Minor changes
2019-09-05 23:25:15 -03:00
ReinUsesLisp
322d0200c8
gl_rasterizer: Apply textures and images state
2019-09-05 20:35:51 -03:00
ReinUsesLisp
80ec2feee8
gl_rasterizer: Add samplers to compute dispatches
2019-09-05 20:35:51 -03:00
ReinUsesLisp
954fc02fdd
gl_rasterizer: Minor code changes
2019-09-05 20:35:51 -03:00
ReinUsesLisp
04cdecb7a1
gl_state: Split textures and samplers into two arrays
2019-09-05 20:35:51 -03:00
ReinUsesLisp
6170337001
gl_rasterizer: Implement image bindings
2019-09-05 20:35:51 -03:00
ReinUsesLisp
5edf24b510
gl_state: Add support for glBindImageTextures
2019-09-05 20:35:51 -03:00
ReinUsesLisp
2424eefad2
texture_cache: Pass TIC to texture cache
2019-09-05 20:35:51 -03:00
ReinUsesLisp
3a450c1395
kepler_compute: Implement texture queries
2019-09-05 20:35:51 -03:00
ReinUsesLisp
2e5b5c2358
gl_rasterizer: Split SetupTextures
2019-09-05 20:35:51 -03:00
Fernando Sahmkow
4ee9949639
Merge pull request #2804 from ReinUsesLisp/remove-gs-special
...
gl_shader_cache: Remove special casing for geometry shaders
2019-09-05 16:03:46 -04:00
bunnei
03badbdd9b
Merge pull request #2833 from ReinUsesLisp/fix-stencil
...
gl_rasterizer: Fix stencil testing
2019-09-05 15:27:31 -04:00
ReinUsesLisp
0f7b813d65
gl_shader_decompiler: Implement shared memory
2019-09-05 01:40:24 -03:00
ReinUsesLisp
4de04eba39
shader_ir: Implement LD_S
...
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
f17415d431
shader_ir: Implement ST_S
...
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
David
d34fa7c4fa
Merge pull request #2802 from ReinUsesLisp/hsetp2-pred
...
half_set_predicate: Fix HSETP2 predicate assignments
2019-09-05 12:26:39 +10:00
ReinUsesLisp
6177cbdbe1
gl_shader_decompiler: Fixup slow path
2019-09-04 15:03:51 -03:00
ReinUsesLisp
7bbc98cfc3
gl_rasterizer: Fix stencil testing
...
* Fix stencil dirty flags tracking when stencil is disabled
* Attach stencil on clears (previously it only attached depth)
* Attach stencil on drawing regardless of stencil testing being enabled
2019-09-04 01:59:09 -03:00
ReinUsesLisp
5f309b88db
Revert "Revert #2466 " and stub FirmwareCall 4
2019-09-04 01:55:45 -03:00
ReinUsesLisp
77ef4fa907
shader/shift: Implement SHR wrapped and clamped variants
...
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp
701dedcfad
maxwell_3d: Avoid moving macro_params
2019-09-04 01:55:01 -03:00
ReinUsesLisp
42e1bb6d46
gl_shader_cache: Remove special casing for geometry shaders
...
Now that ProgramVariants holds the primitive topology we no longer need
to keep track of individual geometry shaders topologies.
2019-09-04 01:54:43 -03:00
ReinUsesLisp
dfae2d141a
half_set_predicate: Fix predicate assignments
2019-09-04 01:54:23 -03:00
ReinUsesLisp
9cf52d027d
gl_device: Disable precise in fragment shaders on bugged drivers
2019-09-04 01:54:00 -03:00
ReinUsesLisp
03276e7490
gl_shader_decompiler: Fixup AMD's slow path type
2019-09-04 01:54:00 -03:00
ReinUsesLisp
6c449793b8
gl_shader_decompiler: Rework GLSL decompiler type system
...
GLSL decompiler type system was broken. We converted all return values
to float except for some cases where returning we couldn't and
implicitly broke the rule of returning floats (e.g. for bools or bool
pairs).
Instead of doing this introduce class Expression that knows what type a
return value has and when a consumer wants to use the string it asks for
it with a required type, emitting a runtime error if types are
incompatible.
This has the disadvantage that there's more C++ code, but we can emit
better GLSL code that's easier to read.
2019-09-04 01:54:00 -03:00
bunnei
19af91434e
Merge pull request #2793 from ReinUsesLisp/bgr565
...
renderer_opengl: Implement RGB565 framebuffer format
2019-09-03 22:36:32 -04:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
...
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
...
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
bunnei
137d165672
Merge pull request #2826 from ReinUsesLisp/macro-binding
...
maxwell_3d: Fix macro binding cursor
2019-09-03 22:32:42 -04:00
bunnei
50b5bb44a0
Merge pull request #2765 from FernandoS27/dma-fix
...
MaxwellDMA: Fixes, corrections and relaxations.
2019-09-01 13:13:05 -04:00
ReinUsesLisp
52a41f482f
maxwell_3d: Fix macro binding cursor
2019-09-01 05:01:11 -03:00
Rodrigo Locatti
4d4f9cc104
video_core: Silent miscellaneous warnings ( #2820 )
...
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silent -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silent unused variable warnings
* buffer_cache: Silent -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
ReinUsesLisp
878adee0a3
gl_buffer_cache: Add missing include
...
RasterizerInterface was considered an incomplete object by clang.
2019-08-29 22:02:52 +00:00
bunnei
a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
...
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
bunnei
e424615839
Merge pull request #2783 from FernandoS27/new-buffer-cache
...
Implement a New LLE Buffer Cache
2019-08-29 13:07:01 -04:00
bunnei
f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
...
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp
e3534700d7
shader_ir/conversion: Split int and float selector and implement F2F H1
2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8
shader_ir/conversion: Implement F2I F16 Ra.H1
2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00
float_set_predicate: Add missing negation bit for the second operand
2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23
shader_ir: Implement VOTE
...
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't match exactly to Nvidia's code
VOTE.ALL Rd, PT, PT;
Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow
83ec2091c1
Buffer Cache: Adress Feedback.
2019-08-21 12:14:27 -04:00
Fernando Sahmkow
6ce2c85047
Buffer_Cache: Implement flushing.
2019-08-21 12:14:26 -04:00
Fernando Sahmkow
de8ff8a1c6
Buffer_Cache: Implement barriers.
2019-08-21 12:14:25 -04:00
Fernando Sahmkow
286f4c446a
Buffer_Cache: Optimize and track written areas.
2019-08-21 12:14:25 -04:00
Fernando Sahmkow
5f4b746a1e
BufferCache: Rework mapping caching.
2019-08-21 12:14:24 -04:00
Fernando Sahmkow
86d8563314
Buffer_Cache: Fixes and optimizations.
2019-08-21 12:14:23 -04:00
Fernando Sahmkow
862bec001b
Video_Core: Implement a new Buffer Cache
2019-08-21 12:14:22 -04:00
bunnei
d654b3d82e
Merge pull request #2769 from FernandoS27/commands-flush
...
GPU: Flush commands on every dma pusher step.
2019-08-21 10:29:56 -04:00
bunnei
dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
...
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
...
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
80702aa88f
renderer_opengl: Implement RGB565 framebuffer format
2019-08-21 02:28:31 -03:00
ReinUsesLisp
9cdf5c6c31
renderer_opengl: Use block linear swizzling for CPU framebuffers
2019-08-21 02:17:14 -03:00
ReinUsesLisp
8ad7268c75
renderer_opengl: Use VideoCore pixel format
2019-08-21 02:16:40 -03:00
ReinUsesLisp
9a76e94b3d
gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfig
2019-08-21 01:55:25 -03:00
bunnei
ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
...
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
bunnei
87bbefe55f
Merge pull request #2768 from ReinUsesLisp/hsetp2-fix
...
decode/half_set_predicate: Fix predicates
2019-08-18 08:50:54 -04:00
ReinUsesLisp
2ff8044806
shader_ir: Implement NOP
2019-08-04 03:02:55 -03:00
ReinUsesLisp
ec0da3ef64
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-04 02:50:55 -03:00
Fernando Sahmkow
e52c895559
GPU: Flush commands on every dma pusher step.
...
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
bunnei
52f54c728d
Merge pull request #2592 from FernandoS27/sync1
...
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
ReinUsesLisp
77f1a676a1
decode/half_set_predicate: Fix predicates
2019-07-26 00:12:38 -03:00
Fernando Sahmkow
a452ff983d
MaxwellDMA: Fixes, corrections and relaxations.
...
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei
b0ff3179ef
Merge pull request #2739 from lioncash/cflow
...
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei
4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
...
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
...
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei
9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
...
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
ReinUsesLisp
104641db07
shader/decode: Implement S2R Tic
2019-07-22 16:16:10 -03:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
...
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei
27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
...
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow
11f4e739bd
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
...
This commit takes care of implementing the F16 Variants of the
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
7a35178ee2
Maxwell3D: Reorganize and address feedback
2019-07-20 10:18:35 -04:00
Fernando Sahmkow
1158777737
Shader_Ir: Change Debug Asserts for Log Warnings
2019-07-19 22:15:34 -04:00
ReinUsesLisp
45c162444d
shader/half_set_predicate: Fix HSETP2 implementation
2019-07-19 22:21:22 -03:00
ReinUsesLisp
6c4985edc9
shader/half_set_predicate: Implement missing HSETP2 variants
2019-07-19 22:20:47 -03:00
Lioncash
c1c89411da
video_core/control_flow: Provide operator!= for types with operator==
...
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash
1780e0e3d0
video_core/control_flow: Prevent sign conversion in TryGetBlock()
...
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash
a162a844d2
video_core/control_flow: Remove unnecessary BlockStack copy constructor
...
This is the default behavior of the copy constructor, so it doesn't need
to be specified.
While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash
56bc11d952
video_core/control_flow: Use std::move where applicable
...
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash
e7b39f47f8
video_core/control_flow: Use the prefix variant of operator++ for iterators
...
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash
6885e7e7ec
video_core/control_flow: Use empty() member function for checking emptiness
...
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash
45fa12a05c
video_core: Resolve -Wreorder warnings
...
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash
47df844338
video_core/control_flow: Make program_size for ScanFlow() a std::size_t
...
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash
3df9558593
video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
...
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash
1109db86b7
video_core/shader/decode: Prevent sign-conversion warnings
...
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei
63bda67a34
Merge pull request #2738 from lioncash/shader-ir
...
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow
5a06e33859
Shader_Ir: correct clang format
2019-07-18 10:09:26 -04:00
Fernando Sahmkow
43f57d668c
GPU: Add missing puller methods.
...
This adds some missing puller methods. We don't assert them as these are
nop operations for us.
2019-07-18 08:54:42 -04:00
Fernando Sahmkow
3a3fee5abf
MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
...
This log was just to know which games used DMA. It's no longer
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow
d3b71ff80d
Gl_Texture_Cache: Remove assert on component type in GetFormatTuple
...
Textures can have different components types in different orders. This
assert was completely inprecise and the effectiveness of such is better
handled by case and within the texture cache.
2019-07-18 08:20:31 -04:00
Fernando Sahmkow
0b65e9335e
Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
...
This commit reduces the sevirity of asserts for FP precision and
rounding as this are well known and have little to no consequences in
gpu's accuracy.
2019-07-18 08:17:19 -04:00
ReinUsesLisp
74632c76ce
gl_shader_decompiler: Rename bufferImage to imageBuffer
...
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp
87909d327f
gl_shader_cache: Fix newline on buffer preprocessor definitions
2019-07-18 01:16:15 -03:00
ReinUsesLisp
e7bdf8b22a
textures: Fix texture buffer size calculation
2019-07-18 01:07:08 -03:00
ReinUsesLisp
84027f4808
gl_texture_cache: Do not set texture parameters to buffers
2019-07-18 01:06:26 -03:00
ReinUsesLisp
73b2dc6d4f
gl_texture_cache: Add missing break in CreateTexture
2019-07-18 01:04:18 -03:00
Fernando Sahmkow
4be61013a1
GL_State: Feedback and fixes
2019-07-17 17:29:56 -04:00
Fernando Sahmkow
5ad889f6fd
Maxwell3D: Address Feedback
2019-07-17 17:29:55 -04:00
Fernando Sahmkow
7826f0afd9
Texture_Cache: Rebase Fixes
2019-07-17 17:29:54 -04:00
Fernando Sahmkow
8cdbfe69b1
GL_Rasterizer: Corrections to Clearing.
2019-07-17 17:29:54 -04:00
Fernando Sahmkow
0ff4a5fa39
Maxwell3D: Correct marking dirtiness on CB upload
2019-07-17 17:29:53 -04:00
Fernando Sahmkow
fec32fed18
GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing
2019-07-17 17:29:52 -04:00
Fernando Sahmkow
a081dea8ab
Maxwell3D: Implement State Dirty Flags.
2019-07-17 17:29:51 -04:00
Fernando Sahmkow
0d3db58657
Maxwell3D: Rework CBData Upload
2019-07-17 17:29:50 -04:00
Fernando Sahmkow
f2e7b29c14
Maxwell3D: Rework the dirty system to be more consistant and scaleable
2019-07-17 17:29:49 -04:00
Fernando Sahmkow
e42bcf2314
maxwell3d: Implement Conditional Rendering
...
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
Fernando Sahmkow
223a535f3f
Merge pull request #2740 from lioncash/bra
...
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash
bebbdc2067
shader_ir: std::move Node instance where applicable
...
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.
This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash
60926ac16b
shader_ir: Rename Get/SetTemporal to Get/SetTemporary
...
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash
44d87ff641
shader_ir: Remove unused includes
...
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow
d614193e49
Shader_Ir: Correct tracking to track from right to left
2019-07-16 15:06:59 -04:00
Fernando Sahmkow
b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
...
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash
e2d7dda166
shader/decode/other: Correct branch indirect argument within BRA handling
...
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
ReinUsesLisp
2a4044a858
gl_shader_cache: Fix clang-format issues
2019-07-15 20:33:51 -03:00
ReinUsesLisp
6b0d017675
gl_shader_decompiler: Stub local memory size
2019-07-15 17:38:25 -03:00
ReinUsesLisp
56bca83bde
gl_shader_cache: Address review commentaries
2019-07-15 17:38:25 -03:00
ReinUsesLisp
bbecd13697
gl_shader_cache: Address CI issues
2019-07-15 17:38:25 -03:00
ReinUsesLisp
725ba6cf63
gl_rasterizer: Implement compute shaders
2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
...
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei
b77a1ed67a
Merge pull request #2705 from FernandoS27/tex-cache-fixes
...
GPU: Fixes to Texture Cache and Include Microprofiles for GL State/BufferCopy/Macro Interpreter
2019-07-14 22:44:36 -04:00
ReinUsesLisp
afa8096df5
shader: Allow tracking of indirect buffers without variable offset
...
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
bunnei
3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
...
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow
2ac7472d3f
Texture_Cache: Address Feedback
2019-07-14 17:42:39 -04:00
Fernando Sahmkow
0f54b541f4
Texture_Cache: Remove some unprecise fallback case and clang format
2019-07-14 12:00:32 -04:00
Fernando Sahmkow
5818959e54
Texture_Cache: Force Framebuffer reset if an active render target is unregistered.
2019-07-14 12:00:31 -04:00
Fernando Sahmkow
913b7a6872
GPU: Add a microprofile for macro interpreter
2019-07-14 12:00:30 -04:00
Fernando Sahmkow
a9943222f2
GL_State: Add a microprofile timer to OpenGL state.
2019-07-14 12:00:30 -04:00
Fernando Sahmkow
5c1e1a148e
Gl_Texture_Cache: Measure Buffer Copy Times
2019-07-14 12:00:29 -04:00
Fernando Sahmkow
5d31bab69a
Texture_Cache: Correct Linear Structural Match.
2019-07-14 12:00:28 -04:00
Fernando Sahmkow
4882c058fd
Merge pull request #2690 from SciresM/physmem_fixes
...
Implement MapPhysicalMemory/UnmapPhysicalMemory
2019-07-14 09:16:46 -04:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
...
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
bunnei
bb67091c77
Merge pull request #2609 from FernandoS27/new-scan
...
Implement a New Shader Scanner, Decompile Flow Stack and implement BRX BRA.CC
2019-07-11 17:36:23 -04:00
ReinUsesLisp
0eb0c24269
gl_shader_decompiler: Fix gl_PointSize redeclaration
2019-07-11 16:10:59 -03:00
ReinUsesLisp
aca40de224
gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array
2019-07-11 04:27:00 -03:00
bunnei
fd066ffbce
Merge pull request #2697 from lioncash/doc
...
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
2019-07-10 16:38:09 -04:00
bunnei
7fb7054bc8
Merge pull request #2686 from ReinUsesLisp/vk-scheduler
...
vk_scheduler: Drop execution context in favor of views
2019-07-10 16:35:48 -04:00
bunnei
206ec29f17
Merge pull request #2691 from lioncash/override
...
video_core: Add missing override specifiers
2019-07-10 16:25:43 -04:00
Fernando Sahmkow
f2549739d1
shader_ir: Add comments on missing instruction.
...
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Michael Scire
a1845d1dd3
prefer system reference over global accessor
2019-07-09 08:11:35 -07:00
Fernando Sahmkow
2de7649311
shader_ir: limit explorastion to best known program size.
2019-07-09 08:14:43 -04:00
Fernando Sahmkow
e7c6045a03
control_flow: Correct block breaking algorithm.
2019-07-09 08:14:43 -04:00
Fernando Sahmkow
dc4a93594c
control_flow: Assert shaders bigger than limit.
2019-07-09 08:14:42 -04:00
Fernando Sahmkow
e7a88f0ab3
control_flow: Address feedback.
2019-07-09 08:14:42 -04:00
Fernando Sahmkow
34357b110c
shader_ir: Correct parsing of scheduling instructions and correct sizing
2019-07-09 08:14:41 -04:00
Fernando Sahmkow
cfb3db1a32
shader_ir: Correct max sizing
2019-07-09 08:14:40 -04:00
Fernando Sahmkow
d45fed3030
shader_ir: Remove unnecessary constructors and use optional for ScanFlow result
2019-07-09 08:14:40 -04:00
Fernando Sahmkow
01b21ee1e8
shader_ir: Corrections, documenting and asserting control_flow
2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5533b440c
shader_ir: Unify blocks in decompiled shaders.
2019-07-09 08:14:39 -04:00
Fernando Sahmkow
926b80102f
shader_ir: Decompile Flow Stack
2019-07-09 08:14:38 -04:00
Fernando Sahmkow
459fce3a8f
shader_ir: propagate shader size to the IR
2019-07-09 08:14:37 -04:00
Fernando Sahmkow
8a6fc529a9
shader_ir: Implement BRX & BRA.CC
2019-07-09 08:14:37 -04:00
Fernando Sahmkow
c218ae4b02
shader_ir: Remove the old scanner.
2019-07-09 08:14:36 -04:00
Fernando Sahmkow
8af6e6a052
shader_ir: Implement a new shader scanner
2019-07-09 08:14:36 -04:00
Lioncash
c04785c928
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
...
must_reconfigure isn't a parameter for this function any more, so it can
be replaced with current_state.
While we're at it, we can make the parameters of the declaration match
the same name as the ones in the definition.
2019-07-09 02:08:15 -04:00
Michael Scire
697206092e
Prevent merging of device mapped memory blocks.
...
This sets the DeviceMapped attribute for GPU-mapped memory blocks,
and prevents merging device mapped blocks. This prevents memory
mapped from the gpu from having its backing address changed by
block coalesce.
2019-07-08 22:52:05 -07:00
ReinUsesLisp
c9d886c84e
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
...
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.
In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias
be020f7621
Delete decode_integer_set.cpp
2019-07-07 21:40:33 +02:00
ReinUsesLisp
d0966b9f7c
shader/texture: Add F16 support for TLDS
2019-07-07 16:05:56 -03:00
Lioncash
cbdd6cd1c0
vk_sampler_cache: Remove unused includes
...
These are no longer used within this header, so they can be removed.
2019-07-07 13:40:36 -04:00
Lioncash
4b27680639
video_core: Add missing override specifiers
2019-07-07 13:38:39 -04:00
ReinUsesLisp
86a874a2fc
vk_scheduler: Drop execution context in favor of views
...
Instead of passing by copy an execution context through out the whole
Vulkan call hierarchy, use a command buffer view and fence view
approach.
This internally dereferences the command buffer or fence forcing the
user to be unable to use an outdated version of it on normal usage.
It is still possible to keep store an outdated if it is casted to
VKFence& or vk::CommandBuffer.
While changing this file, add an extra parameter for Flush and Finish to
allow releasing the fence from this calls.
2019-07-07 03:30:22 -03:00
ReinUsesLisp
79a23ca5f0
buffer_cache: Avoid [[nodiscard]] to make clang-format happy
2019-07-06 01:17:05 -03:00
ReinUsesLisp
83050c9495
buffer_cache: Try to fix MinGW build
2019-07-06 01:14:05 -03:00
ReinUsesLisp
f7691ebe57
gl_rasterizer: Fix nullptr dereference on disabled buffers
2019-07-06 00:37:56 -03:00
ReinUsesLisp
7ecf64257a
gl_rasterizer: Minor style changes
2019-07-06 00:37:55 -03:00
ReinUsesLisp
9cdc576f60
gl_rasterizer: Fix vertex and index data invalidations
2019-07-06 00:37:55 -03:00
ReinUsesLisp
1fa21fa192
gl_buffer_cache: Implement with generic buffer cache
2019-07-06 00:37:55 -03:00
ReinUsesLisp
32c0212b24
buffer_cache: Implement a generic buffer cache
...
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp
2bcae41a73
gl_buffer_cache: Remove global system getters
2019-07-06 00:37:55 -03:00
ReinUsesLisp
02ab844934
gl_device: Query SSBO alignment
2019-07-06 00:37:55 -03:00
ReinUsesLisp
d14fbfb9b5
gl_buffer_cache: Implement flushing
2019-07-06 00:37:55 -03:00
ReinUsesLisp
345f852bdb
gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache
2019-07-06 00:37:55 -03:00
ReinUsesLisp
8155b12d3d
gl_buffer_cache: Rework to support internalized buffers
2019-07-06 00:37:55 -03:00
ReinUsesLisp
f8ba72d491
gl_buffer_cache: Store in CachedBufferEntry the used buffer handle
2019-07-06 00:37:55 -03:00
ReinUsesLisp
b54fb8fc4c
gl_buffer_cache: Return used buffer from Upload function
2019-07-06 00:37:55 -03:00
ReinUsesLisp
a6d2f52fc3
gl_rasterizer: Add some commentaries
2019-07-06 00:37:55 -03:00
ReinUsesLisp
2b9d4088ec
gl_rasterizer: Make DrawParameters rasterizer instance const
2019-07-06 00:37:55 -03:00
ReinUsesLisp
2e39c20da5
gl_rasterizer: Move index buffer uploading to its own method
2019-07-06 00:37:55 -03:00
Fernando Sahmkow
d20ede40b1
NVServices: Styling, define constructors as explicit and corrections
2019-07-05 15:49:32 -04:00
Fernando Sahmkow
b391e5f638
NVFlinger: Correct GCC compile error
2019-07-05 15:49:31 -04:00
Fernando Sahmkow
0335a25d1f
NVServices: Make NVEvents Automatic according to documentation.
2019-07-05 15:49:29 -04:00
Fernando Sahmkow
7d1b974bca
GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardware
2019-07-05 15:49:26 -04:00
Fernando Sahmkow
f2e026a1d8
gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme.
2019-07-05 15:49:20 -04:00
Fernando Sahmkow
0706d633bf
nv_host_ctrl: Make Sync GPU variant always return synced result.
2019-07-05 15:49:20 -04:00
Fernando Sahmkow
600dddf88d
Async GPU: do invalidate as synced operation
...
Async GPU: Always invalidate synced.
2019-07-05 15:49:19 -04:00
Fernando Sahmkow
c13433aee4
Gpu: use an std mutex instead of a spin_lock to guard syncpoints
2019-07-05 15:49:18 -04:00
Fernando Sahmkow
eef55f493b
Gpu: Mark areas as protected.
2019-07-05 15:49:16 -04:00
Fernando Sahmkow
a45643cb3b
nv_services: Stub CtrlEventSignal
2019-07-05 15:49:15 -04:00
Fernando Sahmkow
8942047d41
Gpu: Implement Hardware Interrupt Manager and manage GPU interrupts
2019-07-05 15:49:14 -04:00
Fernando Sahmkow
82b829625b
video_core: Implement GPU side Syncpoints
2019-07-05 15:49:11 -04:00
Zach Hilman
772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
...
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow
3b9d89839d
texture_cache: Address Feedback
2019-07-05 09:46:53 -04:00
Fernando Sahmkow
30b176f92b
texture_cache: Correct Texture Buffer Uploading
2019-07-04 19:38:19 -04:00
Zach Hilman
ad50cd7df9
gl_shader_cache: Make CachedShader constructor private
...
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Zach Hilman
da5a537029
Merge pull request #2563 from ReinUsesLisp/shader-initializers
...
gl_shader_cache: Use static constructors for CachedShader initialization
2019-07-03 20:20:05 -04:00
Fernando Sahmkow
4705d1b523
rasterizer_cache: Protect inherited caches from submission level
2019-07-01 04:32:01 -04:00
ReinUsesLisp
6e1db6b703
texture_cache: Pack sibling queries inside a method
2019-06-29 20:47:46 -03:00
ReinUsesLisp
8eae66907e
texture_cache: Use std::vector reservation for sampled_textures
2019-06-29 20:10:31 -03:00
ReinUsesLisp
f6f1a8f26a
texture_cache: Style changes
2019-06-29 19:52:37 -03:00
ReinUsesLisp
dd9ace502b
texture_cache: Use std::array for siblings_table
2019-06-29 18:54:13 -03:00
ReinUsesLisp
3f3c3ca5f9
texture_cache: Address feedback
2019-06-29 17:29:39 -03:00
Fernando Sahmkow
223ca80753
texture_cache: Correct variable naming.
2019-06-25 19:35:08 -04:00
Fernando Sahmkow
5aeabd9a17
gl_texture_cache: Correct asserts
2019-06-25 19:26:59 -04:00
Fernando Sahmkow
88bc39374f
texture_cache: Corrections, documentation and asserts
2019-06-25 18:36:19 -04:00
Fernando Sahmkow
c0abc7124d
surface_params: Corrections, asserts and documentation.
2019-06-25 18:03:25 -04:00
Fernando Sahmkow
fb234560b0
copy_params: use constexpr for constructor
2019-06-25 17:42:50 -04:00
Fernando Sahmkow
18d24fbdd0
gl_texture_cache: Corrections and fixes
2019-06-25 17:40:08 -04:00
Fernando Sahmkow
36665ce0b2
gl_resource_manager: Correct MakeStreamCopy
2019-06-25 17:32:04 -04:00
Fernando Sahmkow
58c8a44e7a
texture_cache: Query MemoryManager from the system
2019-06-25 17:26:00 -04:00
ReinUsesLisp
7565389700
texture_cache: Include "core/core.h"
2019-06-24 02:15:57 -03:00
ReinUsesLisp
e723441e37
gl_texture_cache: Explicitly add indirect include
2019-06-24 02:13:55 -03:00
ReinUsesLisp
34841a41c3
texture_cache/surface_view: Address feedback
2019-06-24 02:09:56 -03:00
ReinUsesLisp
0837290992
texture_cache/surface_base: Address feedback
2019-06-24 02:08:52 -03:00
ReinUsesLisp
75de730e28
video_core/surface: Address feedback
2019-06-24 02:07:11 -03:00
ReinUsesLisp
10a83653ee
decode/texture: Address feedback
2019-06-24 02:05:05 -03:00
ReinUsesLisp
4504302abc
renderer_opengl/utils: Remove unused includes and unused forward declaration
2019-06-24 02:03:37 -03:00
ReinUsesLisp
4b2ff1e00e
gl_texture_cache: Address some feedback
2019-06-24 02:01:44 -03:00
ReinUsesLisp
0b6df52109
gl_shader_disk_cache: Address feedback
2019-06-24 01:59:32 -03:00
ReinUsesLisp
b8b05a484a
gl_shader_decompiler: Address feedback
2019-06-24 01:56:38 -03:00
ReinUsesLisp
4d63f97945
shader_bytecode: Include missing <array>
2019-06-24 01:51:02 -03:00
bunnei
a9f3c54871
Merge pull request #2579 from ReinUsesLisp/fix-aoffi-test
...
gl_device: Fix TestVariableAoffi test
2019-06-21 15:28:55 -04:00
Fernando Sahmkow
d1812316e1
texture_cache: Style and Corrections
2019-06-20 21:24:47 -04:00
Fernando Sahmkow
51ba60b27e
shader_cache: Correct versioning and size calculation.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
97c8c9f49a
texture_cache: Eliminate linear textures fallthrough
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6acdae0e4c
texture_cache: Correct format R16U as sibling
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
d7587842eb
texture_cache: Implement texception detection and texture barriers.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
198a0395bb
texture_cache: Corrections to buffers and shadow formats use.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fed773a86c
texture_cache: Implement Irregular Views in surfaces
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
082740d34d
surface: Correct format S8Z24
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
03d489dcf5
texture_cache: Initialize all siblings to invalid pixel format.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9422cf7c10
gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
fac3706253
gl_texture_cache: Correct Image Blit
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
7232a1ed16
decoders: correct block calculation
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3dd7643214
texture_cache: Use siblings textures on Rebuild and fix possible error on blitting
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
4db28f72f6
texture_cache: Remove old rasterizer cache
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
2d83553ea7
texture_cache: Implement siblings texture formats.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
cb728797b0
fermi2d: Correct Origin Mode
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
a56f687793
texture_cache: correct texture buffer on surface params
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b01f9c8a70
texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability.
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
561ce29c98
texture_cache: correct mutex locks
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
b7de31ac97
shader_ir: Fix image copy rebase issues
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
6f69f06873
texture_cache: Don't Image Copy if component types differ
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
9f755218a1
texture_cache: move some large methods to cpp files
2019-06-20 21:38:34 -03:00
Fernando Sahmkow
3809041c24
texture_cache: Optimize GetSurface and use references on functions that don't change a surface.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow
60bf761afb
texture_cache: Implement Buffer Copy and detect Turing GPUs Image Copies
2019-06-20 21:38:33 -03:00