feat(retry): add JitterFactor to RetryPolicy by javier-aliaga · Pull Request #69 · dapr/durabletask-go

javier-aliaga · 2026-03-05T11:36:52Z

Context

When orchestrations retry failed activities or sub-orchestrations, the retry delay is calculated using exponential backoff (InitialRetryInterval * BackoffCoefficient^attempt). In systems running many concurrent orchestration instances, all retries sharing the same policy configuration will compute identical delays, causing them to fire at the exact same time.
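To make the synchronization concrete, here is a minimal sketch of the backoff formula quoted above. The `retryPolicy` type, `backoffDelay`, and `examplePolicy` are hypothetical stand-ins written for illustration, not the library's actual types or functions; only the field names `InitialRetryInterval` and `BackoffCoefficient` come from the description.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// retryPolicy is a hypothetical, trimmed-down stand-in for RetryPolicy.
type retryPolicy struct {
	InitialRetryInterval time.Duration
	BackoffCoefficient   float64
}

// backoffDelay computes InitialRetryInterval * BackoffCoefficient^attempt.
// It has no per-instance input, so every instance sharing the policy
// gets the same delay for the same attempt number.
func backoffDelay(p retryPolicy, attempt int) time.Duration {
	scale := math.Pow(p.BackoffCoefficient, float64(attempt))
	return time.Duration(float64(p.InitialRetryInterval) * scale)
}

// examplePolicy returns illustrative values (assumed, not defaults).
func examplePolicy() retryPolicy {
	return retryPolicy{InitialRetryInterval: time.Second, BackoffCoefficient: 2}
}

func main() {
	p := examplePolicy()
	for attempt := 0; attempt < 4; attempt++ {
		// Prints 1s, 2s, 4s, 8s for the example policy.
		fmt.Printf("attempt %d: %v\n", attempt, backoffDelay(p, attempt))
	}
}
```

Because the result depends only on the policy and the attempt number, two instances that fail at the same moment will also retry at the same moment.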

Problem

Without jitter, concurrent retries create thundering herd scenarios: all failing instances retry simultaneously, overwhelming downstream services and increasing the likelihood of repeated failures. This is a well-known distributed systems anti-pattern. Additionally, the existing computeNextDelay function had a bug where the MaxRetryInterval comparison used < instead of >, meaning the delay was never actually capped before being returned.
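The effect of an inverted comparison can be sketched as follows. This is an assumed reconstruction for illustration only: `capBuggy`, `capFixed`, and the example constants are hypothetical, and the actual body of computeNextDelay may be shaped differently.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values (assumptions, not taken from the PR).
const (
	exampleDelay = 16 * time.Second
	exampleMax   = 10 * time.Second
)

// capBuggy uses the wrong comparison direction: a delay that exceeds
// maxInterval never enters the branch, so it is returned uncapped.
func capBuggy(delay, maxInterval time.Duration) time.Duration {
	if maxInterval > 0 && delay < maxInterval {
		return maxInterval
	}
	return delay
}

// capFixed clamps the delay only when it exceeds maxInterval.
func capFixed(delay, maxInterval time.Duration) time.Duration {
	if maxInterval > 0 && delay > maxInterval {
		return maxInterval
	}
	return delay
}

func main() {
	fmt.Println("buggy:", capBuggy(exampleDelay, exampleMax)) // 16s, cap ignored
	fmt.Println("fixed:", capFixed(exampleDelay, exampleMax)) // 10s
}
```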

Solution

Add a JitterFactor field to RetryPolicy (range [0.0, 1.0]) that introduces controlled randomness to retry delays:

- Deterministic jitter: Seeded by firstAttempt + attempt so replays produce identical delays, preserving orchestrator replay safety.
- Configurable reduction: A factor of 0.5 allows up to 50% reduction of the computed delay; 0.0 disables jitter entirely (backward compatible).
- Validation clamping: Validate() clamps out-of-range values to [0.0, 1.0] instead of returning an error.
- Bug fix: Corrected the MaxRetryInterval comparison (< → >) so the delay is properly capped before jitter is applied.
- Nil-safety: WithActivityRetryPolicy and WithChildWorkflowRetryPolicy now handle nil policy gracefully.
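The jitter behavior described in the list can be sketched roughly as below. This is not the PR's code: `clampJitter`, `applyJitter`, and the example helpers are hypothetical names, and combining the seed as `firstAttempt.UnixNano() + attempt` is an assumption about what "seeded by firstAttempt + attempt" means.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// exampleDelay is an illustrative, already-capped delay (assumed value).
const exampleDelay = 10 * time.Second

// exampleFirstAttempt returns a fixed timestamp used for the demo.
func exampleFirstAttempt() time.Time {
	return time.Date(2026, 3, 5, 11, 36, 52, 0, time.UTC)
}

// clampJitter forces the factor into [0.0, 1.0] instead of rejecting it.
func clampJitter(f float64) float64 {
	if f < 0 {
		return 0
	}
	if f > 1 {
		return 1
	}
	return f
}

// applyJitter reduces delay by a pseudo-random fraction in [0, factor).
// The generator is seeded from firstAttempt and attempt (assumed
// combination), so the same inputs always produce the same output.
func applyJitter(delay time.Duration, factor float64, firstAttempt time.Time, attempt int) time.Duration {
	factor = clampJitter(factor)
	if factor == 0 {
		return delay // zero jitter leaves the delay unchanged
	}
	seed := firstAttempt.UnixNano() + int64(attempt)
	r := rand.New(rand.NewSource(seed))
	reduction := r.Float64() * factor
	return time.Duration(float64(delay) * (1 - reduction))
}

func main() {
	first := exampleFirstAttempt()
	for attempt := 0; attempt < 3; attempt++ {
		fmt.Printf("attempt %d: %v\n", attempt, applyJitter(exampleDelay, 0.5, first, attempt))
	}
}
```

Seeding from values that are already part of the orchestration history, rather than from the wall clock or a global generator, is what keeps the result stable across replays in this sketch.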

Introduce JitterFactor field to desynchronize concurrent retries using deterministic randomness (seeded by firstAttempt + attempt) for replay safety. Clamp value to [0.0, 1.0] in Validate(). Fix MaxRetryInterval comparison (was <, now >) so delay is properly capped before jitter is applied.

Tests cover deterministic replay safety, delay reduction, zero-jitter passthrough, per-attempt variation, MaxRetryInterval capping, and validation clamping.

Signed-off-by: Javier Aliaga <[email protected]>

javier-aliaga changed the title ~~feat(retry): add JitterFactor to RetryPolicy with unit tests~~ feat(retry): add JitterFactor to RetryPolicy Mar 9, 2026

javier-aliaga force-pushed the add-jitter-option branch from 58d196b to 94c5af8 March 10, 2026 15:16

javier-aliaga marked this pull request as ready for review March 11, 2026 14:54

javier-aliaga requested a review from a team as a code owner March 11, 2026 14:54

javier-aliaga force-pushed the add-jitter-option branch from 94c5af8 to bb6579b March 11, 2026 15:53

javier-aliaga mentioned this pull request Mar 11, 2026

Feat: Add jitterfactor to retry policy dapr/java-sdk#1687 (Open, 3 tasks)

javier-aliaga wants to merge 1 commit into dapr:main from javier-aliaga:add-jitter-option
