Fix background jobs that previously failed and were never completed properly #3113

Draft
tw4l wants to merge 5 commits into main from issue-3067-rerun-bg-jobs-not-finished

@tw4l tw4l commented Jan 15, 2026

Fixes #3067

This PR includes two migrations:

  • One that fixes background jobs that never had the finished or success fields set properly in the database.
  • One that finds all crawl and profile files that have not been replicated (if replica locations are set) and creates new background jobs to replicate them.
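As a rough illustration of the first migration, here is a minimal sketch of a MongoDB filter and update that could backfill the missing fields. The field names `finished` and `success` come from the description above; the `started` field, the cutoff, and both helper names are assumptions made for this example and are not taken from the PR's code.

```python
from datetime import datetime, timezone


def stale_job_filter(cutoff: datetime) -> dict:
    """Match jobs started before `cutoff` whose finished or success is unset.

    In MongoDB, comparing a field to None matches both null and missing values.
    `started` is a hypothetical field name used for illustration.
    """
    return {
        "started": {"$lt": cutoff},
        "$or": [{"finished": None}, {"success": None}],
    }


def mark_failed_update(now: datetime) -> list:
    """Pipeline-style update (MongoDB 4.2+) that only fills in unset fields.

    $ifNull keeps an existing value and falls back to the default otherwise.
    """
    return [
        {
            "$set": {
                "finished": {"$ifNull": ["$finished", now]},
                "success": {"$ifNull": ["$success", False]},
            }
        }
    ]


cutoff = datetime(2026, 1, 1, tzinfo=timezone.utc)
job_filter = stale_job_filter(cutoff)
job_update = mark_failed_update(datetime.now(timezone.utc))
```

The two values could then be passed to a collection's `update_many(job_filter, job_update)`. Once `finished` and `success` are set, the existing retry endpoints can treat these jobs like any other failed job.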

Work in progress. The migration numbers may need to change depending on the relative merge timing of this PR, #3111, and #3112.

tw4l added 5 commits March 25, 2026 11:44
Some background jobs previously failed and did not have success
or finished fields set due to bugs. This migration targets those
jobs to update these fields so that the existing API endpoints for
retrying background jobs can be used.

This is preferable to simply retrying older failed replication
background jobs, as it's possible that the objects they correlate
to have been deleted or changed since and so those old background
jobs would no longer be applicable.
@tw4l tw4l force-pushed the issue-3067-rerun-bg-jobs-not-finished branch from 0bf8c12 to acbddbf Compare March 25, 2026 15:45

crawls_match_query = {
    "oid": {"$in": orgs_with_replicas},
    "files": {"$elemMatch": {"replicas": {"$in": [None, []]}}},

We may want to instead check whether the length of the replicas array differs from the number of configured replica locations. It's possible for a file to be replicated to one location but not to another.
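One way the suggested length check could look is sketched below, assuming the query is built per org because the number of configured replica locations may differ between orgs. The function names and the `num_locations` parameter are hypothetical; only `oid`, `files`, and `replicas` appear in the snippet under review.

```python
def crawls_missing_replicas_query(oid, num_locations: int) -> dict:
    """Match crawls in one org with a file whose replicas list is not full.

    {"$not": {"$size": n}} also matches files where replicas is null or
    missing, since $size only matches arrays of exactly that length.
    """
    return {
        "oid": oid,
        "files": {
            "$elemMatch": {"replicas": {"$not": {"$size": num_locations}}}
        },
    }


def file_is_under_replicated(file: dict, num_locations: int) -> bool:
    """Pure-Python equivalent of the per-file condition above."""
    return len(file.get("replicas") or []) != num_locations


query = crawls_missing_replicas_query("org-1", 2)
```

A length comparison only detects that some location is missing. It does not say which one, and it would not notice a replica stored in a location that is no longer configured.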


profiles_match_query = {
    "oid": {"$in": orgs_with_replicas},
    "resource.replicas": {"$in": [None, []]},

We may want to instead check whether the length of the replicas array differs from the number of configured replica locations. It's possible for a file to be replicated to one location but not to another.
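For the partially replicated case, a migration could also work out which locations are missing rather than only comparing lengths. The sketch below assumes each replica entry is a dict with a `name` key identifying its storage location; that shape and the helper name are assumptions for illustration, not confirmed by this PR.

```python
def missing_replica_locations(resource: dict, configured: list[str]) -> list[str]:
    """Return configured replica location names absent from the resource.

    Treats a null or missing `replicas` field as an empty list. The `name`
    key on each replica entry is an assumed shape.
    """
    present = {replica["name"] for replica in (resource.get("replicas") or [])}
    return [name for name in configured if name not in present]


# A profile file replicated to "replica-a" only, with two locations configured.
profile_resource = {"replicas": [{"name": "replica-a"}]}
to_replicate = missing_replica_locations(
    profile_resource, ["replica-a", "replica-b"]
)
```

With a list of missing locations in hand, a new replication job could be created per location instead of re-replicating to every configured location.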

Development

Successfully merging this pull request may close these issues.

[Task]: Re-run previously failed background jobs that never had finished/success set
