Conversation
hhyyrylainen
left a comment
There was a problem hiding this comment.
This improves performance quite a bit. Here's one benchmark I just did:
Benchmark results for CloudBenchmark v1
Cloud spawn score: 414
Absorber score: 418.667
Many spawners score: 366.373
Cloud sim multiplier before under 60 FPS: 6.5846
Stress test spawners: 92
Stress test average FPS: 291.833
Stress test min FPS: 56
Total test duration: 220.1s
CPU: AMD Ryzen 9 9950X3D 16-Core Processor (used tasks: 16, native: 8, sim threads: True)
GPU: AMD Radeon RX 7900 XTX (RADV NAVI31)
OS: Linux
So the functionality seems good.
I had some code tweaking related comments still and one that might give ever so slightly more performance still.
|
I have no idea why but moving PrecalculateWorldShiftVectors to constructor totally breaks the behaviour of the cloouds so I reverted that change |
I think it's because it depends on CloudResolution, which is initialized in _Ready() |
|
Ah you're right, so when it's called in constructor it takes the default value |
hhyyrylainen
left a comment
There was a problem hiding this comment.
I did some slight code improvements and one final set of testing and while I still don't fully understand how the 81 entries in the cache are enough I did not encounter any errors while testing.
| /// Do not ever modify this dictionary after construction, it is thread unsafe. | ||
| /// This dictionary contains 81 values. And is filled in _Ready once the cloud size is known. | ||
| /// </summary> | ||
| private Dictionary<int, Vector2> cachedWorldShiftVectors = null!; |
There was a problem hiding this comment.
Just a little comment about this: since it doesn't depend on local parameters, wouldn't it be better to have this static and initialize it once in the first _Ready() call? Otherwise each instance here would have the same dictionary everywhere. Also, I'd use a FrozenDictionary here, that has almost zero overhead on read operations after initialization, even faster than ImmutableDictionary.
There was a problem hiding this comment.
Also, I'd use a FrozenDictionary here, that has almost zero overhead on read operations after initialization, even faster than ImmutableDictionary.
It doesn't support CollectionsMarshal.GetValueRefOrNullRef thus FrozenDictionary is slower as it doesn't allow that special fast access.
And also we only have 2 compound cloud planes so recalculating the values twice is not a big problem. A bigger problem is static variables which cause all kinds of problems with hot reloading in Godot, which is why I prefer to avoid static variables as much as possible in Thrive (this is a Godot-specific thing).
There was a problem hiding this comment.
I see, I didn't know that about static variables, thanks.
About the FrozenDictionary, it does support GetValueRefOrNullRef and FrozenDictionary lookup is O(1) vs ImmutableDictionary's O(log N) because it uses a balanced binary tree. Here's a benchmark.
There was a problem hiding this comment.
I tried swapping the type of the object but it causes a compile error in the CollectionsMarshal.GetValueRefOrNullRef call. So I'll merge this as that change clearly needs more than a simple type change.
Increases performance od compund currents by reducing number of calculations.
Master:
Benchmark results for CloudBenchmark v1
Cloud spawn score: 109
Absorber score: 110.431
Many spawners score: 106.686
Cloud sim multiplier before under 60 FPS: 2.1077
Stress test spawners: 20
Stress test average FPS: 109.294
Stress test min FPS: 58
Total test duration: 105s
CPU: 12th Gen Intel(R) Core(TM) i5-12450H (used tasks: 12, native: 6, sim threads: True)
GPU: NVIDIA GeForce RTX 3050 Laptop GPU
OS: Linux
This branch:
Benchmark results for CloudBenchmark v1
Cloud spawn score: 108
Absorber score: 123.627
Many spawners score: 116.02
Cloud sim multiplier before under 60 FPS: 2.8615
Stress test spawners: 32
Stress test average FPS: 112.57
Stress test min FPS: 26
Total test duration: 124.9s
CPU: 12th Gen Intel(R) Core(TM) i5-12450H (used tasks: 12, native: 6, sim threads: True)
GPU: NVIDIA GeForce RTX 3050 Laptop GPU
OS: Linux
Progress Checklist
Note: before starting this checklist the PR should be marked as non-draft.
break existing features:
https://wiki.revolutionarygamesstudio.com/wiki/Testing_Checklist
(this is important as to not waste the time of Thrive team
members reviewing this PR)
styleguide.
Before merging all CI jobs should finish on this PR without errors, if
there are automatically detected style issues they should be fixed by
the PR author. Merging must follow our
styleguide.