Skip to content

Conversation

@YaroslavStryhun
Copy link
Collaborator

@YaroslavStryhun YaroslavStryhun commented Sep 9, 2025

Hello!

  • Type: bug fix | new feature | code quality | documentation
  • Link to issue:

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the Contributing Guidelines?
  • I have checked that another pull request for this purpose does not exist.
  • I wrote some tests for this PR.

Small description of change:

Thanks

Summary by CodeRabbit

  • New Features
    • Exponential backoff for job retries to smooth retry pacing and reduce load spikes.
    • Optional per-job backoff settings when enqueuing, or use global defaults.
    • New configuration options: backoff initial delay and factor.
    • Automatic rescheduling after failures with increasing delays based on the configured strategy.
    • Defaults: disabled unless an initial delay > 0 is set; factor defaults to 3.
    • Backward compatible: behavior is unchanged when no backoff is configured.

@YaroslavStryhun YaroslavStryhun requested a review from a team as a code owner September 9, 2025 11:06
@coderabbitai
Copy link

coderabbitai bot commented Sep 9, 2025

Walkthrough

Adds a BackoffStrategyDTO value object. Extends Queue::put with an optional BackoffStrategyDTO and passes a serialized backoff payload. Updates qless Lua core to accept/store per-job backoff data, add global backoff defaults, and alter retry handling by rescheduling failed/expired jobs using exponential backoff when configured.

Changes

Cohort / File(s) Summary
DTO introduction
src/Queues/DTO/BackoffStrategyDTO.php
New value object with constructor for initialDelay and factor, getters, and toArray() returning associative array (keys: initial_delay, factor).
Queue API and payload update
src/Queues/Queue.php
Queue::put signature extended with optional ?BackoffStrategyDTO $backoffStrategyDTO. Includes use import. Adds backoff JSON to enqueue payload: per-job backoff serialized via DTO or empty array.
Qless core backoff support
src/qless-core/qless.lua
Adds config defaults backoff-initial-delay (0) and backoff-factor (3). QlessQueue:put accepts options.backoff and stores as job field backoff (JSON). In lock invalidation, computes retry-based delay using per-job backoff or global defaults; if initial delay > 0, removes lock and reschedules job with exponential delay, updating state to scheduled.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App as Application
  participant Queue as Queue::put
  participant Redis as Redis/Qless (qless.lua)
  participant Core as qless.lua Core

  App->>Queue: put(class, data, ..., backoffDTO?)
  Queue->>Redis: PUT job with payload (backoff JSON if provided)
  Redis->>Core: QlessQueue:put(options)
  Core-->>Redis: Store job with backoff field (if present)
  Redis-->>Queue: jid
  Queue-->>App: jid

  note over Core,Redis: Later: lock invalidation / retry handling
  Core->>Core: Read job.backoff or global defaults
  alt initial_delay == 0
    Core-->>Core: Immediate retry path (no backoff delay)
  else initial_delay > 0
    Core->>Core: retry_count from history
    Core->>Core: delay = initial_delay * (factor^retry_count)
    Core->>Redis: Remove lock, reschedule for now+delay, set state=scheduled
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

Pre-merge checks (4 warnings, 1 inconclusive)

❌ Failed checks (4 warnings, 1 inconclusive)
Check name Status Explanation Resolution
Linked Issues Check ⚠️ Warning The pull request implements retry backoff logic but does not address the linked CORE-9170 issue, which focuses on the inner uploader problem for Microsoft documents, so it does not comply with the stated objectives. Align the changes to implement one of the proposed mitigations for the inner uploader problem described in CORE-9170 or create a new issue for the retry delay feature and retarget this PR accordingly.
Out of Scope Changes Check ⚠️ Warning All modifications introduce retry delay and exponential backoff support, which are unrelated to isolating problematic document processing as outlined in the CORE-9170 issue, making these changes out of scope. Remove or split out the backoff retry logic into a separate pull request or issue, and focus this PR solely on the uploader problem objectives defined in CORE-9170.
Description Check ⚠️ Warning The current pull request description remains as the unfilled template boilerplate and does not include any information about the actual code changes, so it fails to describe any part of the implementation. Please update the description to include a concise summary of the new backoff retry logic in BackoffStrategyDTO, Queue::put, and qless.lua, explaining the purpose and scope of these changes.
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title Check ❓ Inconclusive The title “Add reties delay logic” is related to the main change of introducing retry delay (backoff) functionality but contains a typo and does not clearly convey the exponential backoff strategy being implemented. It partially reflects the changeset but lacks the precision and clarity expected in a concise PR title. Consequently, it is inconclusive whether it meets the standards for a well-formed title. Please correct the spelling and improve specificity by renaming the title to something like “Add exponential backoff for retry delays” so that it accurately and succinctly describes the primary change.
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch CORE-9170-add-retry-delay

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
src/Queues/Queue.php (2)

89-91: Public API: new optional arg — update PHPDoc.

Add the new parameter to the method docblock to keep IDE hints accurate.

  * @param  string[]|null $depends   A list of JIDs this job must wait on before executing.
+ * @param  \Qless\Queues\DTO\BackoffStrategyDTO|null $backoffStrategyDTO Optional per-job retry backoff.
  * @return string The job identifier.

125-127: Don’t always send a backoff payload.

Right now an empty JSON array is always sent; the Lua side will treat it as “present”. Either (a) omit the pair when DTO is null (preferred), or (b) decode-and-ignore empties in Lua (see my Lua fix below). If you want to do it here, build the varargs array conditionally.

// Outside this hunk, replace the direct ->put(...) with:
$args = [
    '', $this->name, $jid, $className, $putData,
    is_null($delay) ? 0 : $delay,
    'priority', is_null($priority) ? 0 : $priority,
    'tags', json_encode($tags ?: [], JSON_UNESCAPED_SLASHES),
    'retries', is_null($retries) ? 5 : $retries,
    'depends', json_encode($depends ?: [], JSON_UNESCAPED_SLASHES),
];
if ($backoffStrategyDTO) {
    $args[] = 'backoff';
    $args[] = json_encode($backoffStrategyDTO->toArray(), JSON_UNESCAPED_SLASHES);
}
$jid = call_user_func_array([$this->client, 'put'], $args);
src/qless-core/qless.lua (1)

1980-1993: Optional: clamp extreme delays.

Consider a max delay (configurable) to prevent runaway scheduling for large retry_count.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5e47114 and 7e45708.

📒 Files selected for processing (3)
  • src/Queues/DTO/BackoffStrategyDTO.php (1 hunks)
  • src/Queues/Queue.php (3 hunks)
  • src/qless-core/qless.lua (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Queues/Queue.php (1)
src/Queues/DTO/BackoffStrategyDTO.php (2)
  • BackoffStrategyDTO (5-46)
  • toArray (39-45)
🪛 GitHub Actions: PR checks and tests
src/Queues/Queue.php

[error] 127-127: QlessException: ERR Error running script (call to f_15c3572b35acf0713ccad4d65166bf6c84b5cdfa): @user_script:1608: Lua redis() command arguments must be strings or integers

🔇 Additional comments (3)
src/Queues/DTO/BackoffStrategyDTO.php (1)

39-45: LGTM — array shape matches Lua expectations (snake_case keys).

src/Queues/Queue.php (1)

15-15: Import looks correct.

src/qless-core/qless.lua (1)

389-392: Config defaults: LGTM.

Good defaults with “0 = disabled” semantics for initial delay.

Comment on lines +1590 to +1608
local job_fields = {
'jid' , jid,
'klass' , klass,
'data' , raw_data,
'priority' , priority,
'tags' , cjson.encode(tags),
'state' , ((delay > 0) and 'scheduled') or 'waiting',
'worker' , '',
'expires' , 0,
'queue' , self.name,
'retries' , retries,
'remaining', retries,
'time' , string.format("%.20f", now)
}
if options['backoff'] then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(options['backoff']))
end
redis.call('hmset', QlessJob.ns .. jid, job_fields)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix pipeline breaker: HMSET with a table (must unpack).

This is the CI error (“Lua redis() command arguments must be strings or integers”). HMSET needs flat field/value pairs.

-  redis.call('hmset', QlessJob.ns .. jid, job_fields)
+  redis.call('hmset', QlessJob.ns .. jid, unpack(job_fields))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
local job_fields = {
'jid' , jid,
'klass' , klass,
'data' , raw_data,
'priority' , priority,
'tags' , cjson.encode(tags),
'state' , ((delay > 0) and 'scheduled') or 'waiting',
'worker' , '',
'expires' , 0,
'queue' , self.name,
'retries' , retries,
'remaining', retries,
'time' , string.format("%.20f", now)
}
if options['backoff'] then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(options['backoff']))
end
redis.call('hmset', QlessJob.ns .. jid, job_fields)
local job_fields = {
'jid' , jid,
'klass' , klass,
'data' , raw_data,
'priority' , priority,
'tags' , cjson.encode(tags),
'state' , ((delay > 0) and 'scheduled') or 'waiting',
'worker' , '',
'expires' , 0,
'queue' , self.name,
'retries' , retries,
'remaining', retries,
'time' , string.format("%.20f", now)
}
if options['backoff'] then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(options['backoff']))
end
redis.call('hmset', QlessJob.ns .. jid, unpack(job_fields))
🤖 Prompt for AI Agents
In src/qless-core/qless.lua around lines 1590-1608, HMSET is being called with
the job_fields table itself which causes the Redis Lua client error; change the
call to pass the field/value pairs as separate arguments by unpacking the
job_fields table (use table.unpack(job_fields)) so HMSET receives flat
string/integer args instead of a table.

Comment on lines +1604 to +1607
if options['backoff'] then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(options['backoff']))
end
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid double-encoding and ignore empty backoff.

options.backoff arrives JSON-encoded from PHP; encoding again stores a JSON string literal. Decode first and only persist non-empty tables.

-  if options['backoff'] then
-    table.insert(job_fields, 'backoff')
-    table.insert(job_fields, cjson.encode(options['backoff']))
-  end
+  if options['backoff'] then
+    local raw = options['backoff']
+    local backoff = (type(raw) == 'string') and (cjson.decode(raw) or {}) or raw
+    if type(backoff) == 'table' and next(backoff) ~= nil then
+      table.insert(job_fields, 'backoff')
+      table.insert(job_fields, cjson.encode(backoff))
+    end
+  end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if options['backoff'] then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(options['backoff']))
end
if options['backoff'] then
local raw = options['backoff']
local backoff = (type(raw) == 'string') and (cjson.decode(raw) or {}) or raw
if type(backoff) == 'table' and next(backoff) ~= nil then
table.insert(job_fields, 'backoff')
table.insert(job_fields, cjson.encode(backoff))
end
end
🤖 Prompt for AI Agents
In src/qless-core/qless.lua around lines 1604-1607, options['backoff'] is
arriving as a JSON-encoded string so re-encoding it stores a JSON string
literal; decode it first (use pcall/cjson.decode), verify the decoded value is a
non-empty table (next(decoded) ~= nil), and only then add 'backoff' and the
properly encoded JSON of that table to job_fields; if decode fails or the table
is empty, do not insert anything.

Comment on lines +1966 to +1971
local backoff_json = redis.call('hget', Qless.ns .. jid, 'backoff')
local backoff_config = {}
if backoff_json then
backoff_config = cjson.decode(backoff_json)
end

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Read backoff from the correct job key.

Using Qless.ns builds “ql:”, not the job hash.

-        local backoff_json = redis.call('hget', Qless.ns .. jid, 'backoff')
+        local backoff_json = redis.call('hget', QlessJob.ns .. jid, 'backoff')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
local backoff_json = redis.call('hget', Qless.ns .. jid, 'backoff')
local backoff_config = {}
if backoff_json then
backoff_config = cjson.decode(backoff_json)
end
local backoff_json = redis.call('hget', QlessJob.ns .. jid, 'backoff')
local backoff_config = {}
if backoff_json then
backoff_config = cjson.decode(backoff_json)
end
🤖 Prompt for AI Agents
In src/qless-core/qless.lua around lines 1966–1971, the code reads backoff JSON
from redis.call('hget', Qless.ns .. jid, 'backoff') which concatenates the
namespace and jid (producing "ql:<jid>") instead of the job hash key; change the
key to the job hash (e.g. Qless.ns .. 'jobs:' .. jid or use the existing helper
like Qless.jobKey(jid) if available) so the hget targets the correct job hash,
then decode backoff_json as before.

Comment on lines +1972 to +1989
local initial_delay = tonumber(backoff_config['initial_delay'])
local backoff_factor = tonumber(backoff_config['factor'])
if initial_delay == nil then
initial_delay = tonumber(Qless.config.get('backoff-initial-delay', 0))
end
if backoff_factor == nil then
backoff_factor = tonumber(Qless.config.get('backoff-factor', 3))
end
if initial_delay == 0 then
table.insert(jids, jid)
else
local job = Qless.job(jid)
local job_history = job:history()
local retry_count = #job_history - 1
if retry_count < 0 then retry_count = 0 end

local delay = initial_delay * (backoff_factor ^ retry_count)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Derive retry_count from retries/remaining, not history length.

History length is not equal to attempts and can explode delays. Use (retries - remaining), clamped to >= 0.

-            local job = Qless.job(jid)
-            local job_history = job:history()
-            local retry_count = #job_history - 1
-            if retry_count < 0 then retry_count = 0 end
-
-            local delay = initial_delay * (backoff_factor ^ retry_count)
+            local total_retries = tonumber(redis.call('hget', QlessJob.ns .. jid, 'retries')) or 0
+            local remaining     = tonumber(redis.call('hget', QlessJob.ns .. jid, 'remaining')) or 0
+            local retry_count   = math.max(0, total_retries - remaining)
+            local delay         = initial_delay * (backoff_factor ^ retry_count)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
local initial_delay = tonumber(backoff_config['initial_delay'])
local backoff_factor = tonumber(backoff_config['factor'])
if initial_delay == nil then
initial_delay = tonumber(Qless.config.get('backoff-initial-delay', 0))
end
if backoff_factor == nil then
backoff_factor = tonumber(Qless.config.get('backoff-factor', 3))
end
if initial_delay == 0 then
table.insert(jids, jid)
else
local job = Qless.job(jid)
local job_history = job:history()
local retry_count = #job_history - 1
if retry_count < 0 then retry_count = 0 end
local delay = initial_delay * (backoff_factor ^ retry_count)
local initial_delay = tonumber(backoff_config['initial_delay'])
local backoff_factor = tonumber(backoff_config['factor'])
if initial_delay == nil then
initial_delay = tonumber(Qless.config.get('backoff-initial-delay', 0))
end
if backoff_factor == nil then
backoff_factor = tonumber(Qless.config.get('backoff-factor', 3))
end
if initial_delay == 0 then
table.insert(jids, jid)
else
local total_retries = tonumber(redis.call('hget', QlessJob.ns .. jid, 'retries')) or 0
local remaining = tonumber(redis.call('hget', QlessJob.ns .. jid, 'remaining')) or 0
local retry_count = math.max(0, total_retries - remaining)
local delay = initial_delay * (backoff_factor ^ retry_count)
🤖 Prompt for AI Agents
In src/qless-core/qless.lua around lines 1972 to 1989, the code computes
retry_count from job history length which can be incorrect and inflate backoff;
instead fetch the job's configured retries and current remaining attempts and
compute retry_count = math.max(0, retries - remaining). Update the code to get
numeric retries and remaining from the job (e.g. call the appropriate job API to
read retries and remaining, coerce to numbers), clamp to >= 0, and use that
retry_count when calculating delay.

Comment on lines +17 to +21
public function __construct(int $initialDelay, int $factor)
{
$this->initialDelay = $initialDelay;
$this->factor = $factor;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Validate inputs (non-negative delay, factor >= 1).

Protect against misconfiguration early.

   public function __construct(int $initialDelay, int $factor)
   {
+        if ($initialDelay < 0) {
+            throw new \InvalidArgumentException('initialDelay must be >= 0');
+        }
+        if ($factor < 1) {
+            throw new \InvalidArgumentException('factor must be >= 1');
+        }
         $this->initialDelay = $initialDelay;
         $this->factor = $factor;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public function __construct(int $initialDelay, int $factor)
{
$this->initialDelay = $initialDelay;
$this->factor = $factor;
}
public function __construct(int $initialDelay, int $factor)
{
if ($initialDelay < 0) {
throw new \InvalidArgumentException('initialDelay must be >= 0');
}
if ($factor < 1) {
throw new \InvalidArgumentException('factor must be >= 1');
}
$this->initialDelay = $initialDelay;
$this->factor = $factor;
}
🤖 Prompt for AI Agents
In src/Queues/DTO/BackoffStrategyDTO.php around lines 17 to 21, the constructor
does not validate inputs; add checks to ensure $initialDelay is non-negative and
$factor is >= 1 before assigning to properties, and throw an
InvalidArgumentException (with clear messages like "initialDelay must be >= 0"
and "factor must be >= 1") when validations fail so misconfiguration is caught
early; perform validation first, then set $this->initialDelay and $this->factor
only if values pass.

@YaroslavStryhun YaroslavStryhun changed the title [CORE-9170] Add reties delay logic Add reties delay logic Sep 9, 2025
@YaroslavStryhun YaroslavStryhun deleted the CORE-9170-add-retry-delay branch September 9, 2025 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants