Performance: cap failed actions retention period to three months by default#1310
Conversation
```php
$deleted_failed_entries = $this->clean_actions( array( ActionScheduler_Store::STATUS_FAILED ), $cutoff_failed, $batch_size );
$count_statuses         = empty( $statuses_to_purge ) ? count( $this->default_statuses_to_purge ) : count( $statuses_to_purge );
$batch_size             = (int) ( ( ( $count_statuses * $batch_size ) - count( $deleted_failed_entries ) ) / $count_statuses );
}
```
The `$batch_size` recalculation here can produce zero when `$count_statuses` is 1 (a single status via the `action_scheduler_default_cleaner_statuses` filter) and all `$batch_size` items are deleted. A `per_page` of 0 in `query_actions()` skips the `LIMIT` clause entirely, resulting in an unbounded query that returns all matching rows for deletion.
With the default 2 statuses this can't happen (the minimum result is 10), but an early return when the budget is exhausted would be safer:
```php
if ( $batch_size <= 0 ) {
	return $deleted_failed_entries;
}
```

```php
/**
 * Default queue cleaner process used by queue runner.
 *
 * @since 3.9.4 By default, failed actions are removed after three months.
 */
```
This is a backward-incompatible change to documented behavior. `docs/perf.md` currently states: "By default, Action Scheduler does not automatically delete old failed actions." This PR directly reverses that guarantee.
The change is well-justified (failed-action accumulation is a documented problem across P2s and merchant cases), but it would be good to:
- Update `docs/perf.md` to reflect the new default behavior.
- Document a clean opt-out path for sites that intentionally keep failed actions (e.g., for debugging, auditing, or compliance). Currently, returning a very large value from `action_scheduler_retention_period_for_failed` is the only workaround, which isn't obvious.
- Consider a changelog/upgrade notice, since this changes data retention defaults.
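As a sketch of that non-obvious opt-out (assuming the filter name and signature introduced in this PR), a site that intentionally keeps failed actions could register something like:

```php
// Hypothetical opt-out sketch: effectively disable cleanup of failed actions
// by returning a very large retention period (here, ~100 years).
// Assumes WordPress' YEAR_IN_SECONDS constant and the new filter from this PR.
add_filter( 'action_scheduler_retention_period_for_failed', function ( $retention_period ) {
	return 100 * YEAR_IN_SECONDS;
} );
```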
I'll update the file to include a note about the BC break and instructions for replicating the original behavior. I have also added the corresponding filter for this case.
jorgeatorres left a comment:
I left some comments for discussion, @kalessil, but nothing too serious. I really like the improvements! 💯
```php
$lifespan_default = apply_filters( 'action_scheduler_retention_period_by_default', $lifespan );

/**
 * Set the retention period in seconds for actions with a failed status. If the
 * action_scheduler_default_cleaner_statuses filter includes the failed status,
 * this filter's result will be ignored, and the retention period for failed
 * actions will match that of the other statuses.
 *
 * @param int $retention_period Retention period in seconds.
 */
$lifespan_failed = apply_filters( 'action_scheduler_retention_period_for_failed', 3 * $this->month_in_seconds );
```
This is probably excessive caution, and the existing `$lifespan` didn't consider this, but I wonder if we should pass these lifespans through `absint()`, or at least `abs()`, so there's no chance we end up with a cutoff date in the future.
Do you mean only for this specific filter, @jorgeatorres?
I think it should apply to all lifespan variables, unless I'm missing something. My concern is that a negative value could result in a date in the future, so we end up deleting more than necessary. Am I wrong?
With the piggy-back style of cleaning as it is now, that's technically possible. I'll go ahead with extra defence against the mentioned cases.
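A minimal sketch of that extra defence, reusing the lifespan variables from the diff above (clamping the filtered values to non-negative integers is the only change):

```php
// Sketch only: clamp filtered retention periods to non-negative integers so a
// negative filter return value can never produce a cutoff date in the future.
// absint() is WordPress' absolute-integer helper.
$lifespan_default = absint( apply_filters( 'action_scheduler_retention_period_by_default', $lifespan ) );
$lifespan_failed  = absint( apply_filters( 'action_scheduler_retention_period_for_failed', 3 * $this->month_in_seconds ) );
```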
```php
$batch_size = (int) ( ( ( $count_statuses * $batch_size ) - count( $deleted_failed_entries ) ) / $count_statuses );
if ( $batch_size <= 0 ) {
	return $deleted_failed_entries;
}
}

$deleted_entries = $this->clean_actions( $statuses_to_purge, $cutoff_default, $batch_size );
```
While this is clever, I wonder if we should always let actions in other statuses be deleted. If there's a large backlog of failed actions, we would only be removing failed actions for a while. Maybe we can split the 'budget' from the start (even if we give more to failed) so that we always clean up at least some actions of the other statuses?
Removing failed actions will use part of the execution budget, but at least half will remain available for cleaning other statuses.
The recalculation `$batch_size = (int) ( ( ( $count_statuses * $batch_size ) - count( $deleted_failed_entries ) ) / $count_statuses );` ensures that, even after removing failed entries, the remaining batch size is distributed evenly among the other statuses.
This way we remain within the same execution budget, and cleaning failed statuses does not prevent other statuses from being processed.
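The split can be illustrated with concrete numbers (assuming a per-status batch size of 20, which matches the "minimum result is 10" observation earlier in this thread):

```php
// Hypothetical walk-through of the budget split with the default two statuses.
$batch_size     = 20; // assumed default per-status batch size
$count_statuses = 2;  // the default number of statuses to purge

// Failed cleanup runs first and may consume up to one full batch.
$deleted_failed = 20;

// The total budget is $count_statuses * $batch_size = 40. After failed cleanup,
// the remainder is split evenly across the statuses still to be processed:
$remaining = (int) ( ( ( $count_statuses * $batch_size ) - $deleted_failed ) / $count_statuses );
// $remaining is 10, so each remaining status still gets a bounded LIMIT.
```

With a single status, the same arithmetic yields `(20 - 20) / 1 = 0`, which is the edge case discussed below.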
For improved execution budget management, please refer to #1312. Here, I prefer a simpler approach.
Yes, that's true when there are at least two statuses returned by the `action_scheduler_default_cleaner_statuses` filter (which is the default). But when there's just one, the total budget is `$batch_size * 1 = $batch_size` and the failed budget is also `$batch_size`, so we're left with no budget for the remaining statuses whenever there are at least `$batch_size` failed actions, which wouldn't be uncommon.
Gotcha, I see your point. I'll look into handling this edge case better.
```php
add_filter( 'action_scheduler_retention_period_for_failed', function ( $retention_period ) {
	return 12 * MONTH_IN_SECONDS;
} );
```
I wonder if we could change the filter so that returning `0` actually prevents removal, and document that here as a way to bring back the old behavior. I prefer the new one, but it might be a way to soften the backwards-incompatible change.
Sure, that leaves a loophole where A-S can be blamed for bloating tables, but it's a fair point to provide a way to restore the original behaviour. I'll add the necessary changes today.
Yes, I don't think this is a blocker, but the filter exists so if for some reason someone needs to do this they can use it instead of having to do very strange things. And, in any case, it'd be a conscious choice.
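One way the cleaner could honour `0` as an explicit opt-out (a hypothetical sketch; variable names taken from the diffs above, and the `' seconds ago'` cutoff construction mirrors the existing cleaner code):

```php
$lifespan_failed = (int) apply_filters( 'action_scheduler_retention_period_for_failed', 3 * $this->month_in_seconds );

// Hypothetical handling: treat 0 as "never delete failed actions", restoring
// the pre-change behaviour as a conscious, documented choice.
if ( 0 === $lifespan_failed ) {
	$deleted_failed_entries = array();
} else {
	$cutoff_failed          = as_get_datetime_object( $lifespan_failed . ' seconds ago' );
	$deleted_failed_entries = $this->clean_actions( array( ActionScheduler_Store::STATUS_FAILED ), $cutoff_failed, $batch_size );
}
```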
Partially addresses ACTSCH-15 #912.
The issue resurfaces in stats shared under p1772090743156589-slack-C08SVLULL3B. This PR limits the retention period for failed actions to three months to prevent failed actions from continuously increasing the A-S table size.
We are addressing A-S scalability issues in HVM caused by the accumulation of failed actions, which increases database table size and slows A-S performance. As the database is often the main bottleneck, slower A-S queries further impact performance, especially during BFCM.
Testing Instructions: