add benchmarking for TraverseStrategy concept #4834
alokkumardalei-wq wants to merge 2 commits into typelevel:main
Thank you for the PR! I'd suggest addressing a couple of issues with the PR first:
Hello @satorg, thank you for your feedback. Secondly, regarding avoiding screenshots for textual information and providing markdown-formatted code blocks instead: I think I fully comply with this, except for the benchmarking results, where I have to show you the difference in B/op across the different types and therefore need to highlight it. If something else can be done, or if I have done anything wrong here, please let me know. If everything is fine, please let me know whether the PR is ready for review. Thank you.
If this works, you can send a PR to scala/scala too, where TailCall works the same way.
Motivation and Context
This PR introduces TraverseStrategy to optimize traverse and sequence operations for lazily-evaluated and inherently stack-safe data structures (such as Vector, List, Chain, and Eval).
Previously, traverse and traverseVoid incurred significant and redundant Eval allocations, causing excessive Garbage Collection (GC) pressure and calculation overhead. By introducing TraverseStrategy with Direct and ViaEval execution paths, we can execute traversals strictly when the type is already stack-safe, completely bypassing the Eval wrapping overhead.
Furthermore, we updated cats.instances implementations (like List and Vector) to use a balanced divide-and-conquer recursion (runHalf) to prevent stack overflows on high-element iterations.
I used AI to learn about TraverseStrategy, Garbage Collection (GC) pressure in Scala, the differences between strict evaluation (List/Vector) and safe-check lazy types (Eval, Either), and how to write and execute JMH benchmarks.
Changes Made
- TraverseStrategy: Added Direct and ViaEval paths inside cats.Apply.
- Updated List and Vector to use balanced recursion (runHalf) for traverseVoid, guaranteeing stack safety.
- Added cats.bench.TraverseStrategyBench to statistically measure throughput changes.
Code Highlights
To provide deeper context on the implementation structure:
1. Building the Execution Strategy (cats/Apply.scala)
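As a rough orientation for this section, the choice between a direct path and an Eval-wrapped path can be pictured with a toy model. This is only a sketch in plain Scala, not the code in cats/Apply.scala: `StrategySketch`, `Deferred`, `ViaDeferred`, `sum`, and the allocation counter are all hypothetical names invented for illustration, and `Deferred` is a crude, non-trampolined stand-in for a lazy wrapper.

```scala
object StrategySketch {
  // Hypothetical counter: how many lazy wrappers were allocated so far.
  var wrappersAllocated: Int = 0

  // Hypothetical stand-in for a lazy wrapper; NOT cats.Eval and not stack-safe.
  final class Deferred[A](thunk: () => A) {
    wrappersAllocated += 1
    def value: A = thunk()
  }

  // Two evaluation models, named after the idea in the text (assumed shape).
  sealed trait Strategy
  case object Direct extends Strategy      // combine results immediately
  case object ViaDeferred extends Strategy // wrap every step in a Deferred first

  // Sums a list using the chosen strategy; both paths give the same result.
  def sum(xs: List[Int], strategy: Strategy): Int = strategy match {
    case Direct =>
      xs.foldLeft(0)(_ + _)
    case ViaDeferred =>
      xs.foldLeft(new Deferred[Int](() => 0)) { (acc, x) =>
        new Deferred[Int](() => acc.value + x)
      }.value
  }
}
```

Both strategies return the same sum, but the deferred path allocates one wrapper per element plus one for the seed, which is the kind of overhead a direct path avoids.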
We introduced TraverseStrategy with two specific evaluation models. Direct targets lazily evaluated native types that don't need Eval wrapping logic, while ViaEval provides a safe fallback that wraps each step in Eval.
2. Fixing List & Vector Stack Overflows (cats/instances/vector.scala)
Previously, the nested loops inside Vector.traverseVoid were not stack-safe. We refactored the strict loops to traverse the structure via a runHalf divide-and-conquer implementation, which keeps the recursion depth logarithmic in the number of elements and so guarantees stack safety across very large inputs.
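The balanced recursion idea can be sketched in plain Scala as follows. This is a simplified illustration under assumptions, not the code in cats/instances/vector.scala: it uses Option[Unit] in place of a general applicative effect, and `RunHalfSketch`, `both`, and the index-range signature of `runHalf` are hypothetical.

```scala
object RunHalfSketch {
  // Stand-in for an applicative map2 on Option[Unit]; the right side is
  // passed by name so it is skipped once the left side has failed.
  private def both(a: Option[Unit], b: => Option[Unit]): Option[Unit] =
    a.flatMap(_ => b)

  // Runs f on every element for its effect only. The index range is split
  // in half at each level, so recursion depth grows with log2(size)
  // instead of with size.
  def traverseVoid[A](xs: Vector[A])(f: A => Option[Unit]): Option[Unit] = {
    def runHalf(start: Int, end: Int): Option[Unit] =
      if (end - start <= 0) Some(())
      else if (end - start == 1) f(xs(start))
      else {
        val mid = start + (end - start) / 2
        both(runHalf(start, mid), runHalf(mid, end))
      }
    runHalf(0, xs.length)
  }
}
```

For 100,000 elements the recursion is only about 17 frames deep, whereas a naive element-by-element recursion would need one frame per element.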
3. Implementing Strategy Instances in Core Types (cats/data/Kleisli.scala)
We propagated the new traverseStrategy upwards through nested typeclasses so that they adapt to the type they wrap. For example, Kleisli defines its strategy lazily, mirroring the stack-safety model of the inner F type.
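The delegation described here might look roughly like the following. It is a self-contained sketch, not the code in cats/data/Kleisli.scala: `HasStrategy`, `KleisliStrategy`, `optionStrategy`, the `lookups` counter, and the local `Kleisli` case class are hypothetical stand-ins defined only for this example.

```scala
object KleisliStrategySketch {
  // Assumed shape of the strategy type for this sketch.
  sealed trait Strategy
  case object Direct extends Strategy
  case object ViaEval extends Strategy

  // Hypothetical minimal typeclass that exposes a strategy for F.
  trait HasStrategy[F[_]] { def traverseStrategy: Strategy }

  // Local stand-in for a function A => F[B].
  final case class Kleisli[F[_], A, B](run: A => F[B])

  // The wrapper's instance reuses the inner F's strategy. The lazy val
  // postpones the lookup until first use and caches the answer.
  final class KleisliStrategy[F[_], A](inner: HasStrategy[F]) {
    type K[B] = Kleisli[F, A, B]
    val instance: HasStrategy[K] = new HasStrategy[K] {
      lazy val traverseStrategy: Strategy = inner.traverseStrategy
    }
  }

  // Example inner instance that counts how often it is consulted.
  var lookups: Int = 0
  val optionStrategy: HasStrategy[Option] = new HasStrategy[Option] {
    def traverseStrategy: Strategy = { lookups += 1; Direct }
  }
}
```

Building the wrapper instance does not consult the inner instance at all; the first access does so once, and later accesses reuse the cached value.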
Similar to Vector, Chain needs to construct its internal traversal trees without exceeding the JVM stack limit. We used G.traverseStrategy combined with a width = 128 grouping factor when processing elements in traverseFilterViaChain.
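To illustrate the grouping idea, here is a simplified sketch in plain Scala. It is not the traverseFilterViaChain implementation: it works on a Vector, uses Option as the effect, and splits oversized ranges in half rather than reproducing the actual tree shape. Only the width value of 128 comes from the text; `ChunkedTraverseSketch` and the signature of `traverseFilter` are assumptions.

```scala
object ChunkedTraverseSketch {
  // Grouping factor taken from the text; the rest is a hypothetical simplification.
  val width: Int = 128

  // f returns None to abort the whole traversal, Some(None) to drop an
  // element, and Some(Some(b)) to keep b.
  def traverseFilter[A, B](xs: Vector[A])(f: A => Option[Option[B]]): Option[Vector[B]] = {
    def loop(start: Int, end: Int): Option[Vector[B]] =
      if (end - start <= width) {
        // Leaf: handle at most `width` elements in one strict pass.
        val out = Vector.newBuilder[B]
        var i = start
        var ok = true
        while (ok && i < end) {
          f(xs(i)) match {
            case Some(kept) =>
              kept.foreach(b => out += b)
              i += 1
            case None =>
              ok = false
          }
        }
        if (ok) Some(out.result()) else None
      } else {
        // Branch: split the range so the recursion stays shallow.
        val mid = start + (end - start) / 2
        loop(start, mid).flatMap(l => loop(mid, end).map(r => l ++ r))
      }
    loop(0, xs.length)
  }
}
```

Handling each group of up to 128 elements in a tight loop keeps the number of recursive combine steps small, while the splitting above the leaves keeps the stack depth bounded.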
Benchmark Results
To verify these performance improvements statistically, I ran the JMH TraverseStrategyBench suite locally. The results demonstrate significant throughput scaling without stack overflows, even on Iterables with 10,000+ elements.
You can clearly see that lazy types like Eval have a much lower B/op, which means less GC pressure, than strict types like Either in every iteration.
Demo video:
This video demonstrates how we produced the benchmarking results:
Demo