
Better Quarantine: REST Interfacing #363

@ric-evans

Here are a few enhancements for bundle quarantine:

  • Dedicated REST endpoint for quarantine: POST @ /Bundles/actions/quarantine/UUID
    • This will guarantee certain fields are updated as required: reason, etc.
    • Disable PATCH @ /Bundles/UUID for status='quarantine'
  • Dedicated REST endpoint for de-quarantine: DELETE @ /Bundles/actions/quarantine/UUID
  • More: TBD
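
As a rough illustration of the endpoint behavior proposed above, here is a minimal sketch using plain functions and an in-memory dict in place of the real REST server and database. The handler names, the `QuarantineError` class, the `original_status` and `quarantined_at` fields, and the error codes in the messages are all assumptions made for illustration; only the routes, the `status` value 'quarantine', and the `reason` field come from the list above.

```python
from datetime import datetime, timezone


class QuarantineError(Exception):
    """Raised when a request breaks the quarantine rules sketched here."""


# Hypothetical in-memory stand-in for the Bundles collection, keyed by UUID.
BUNDLES = {}


def post_quarantine(uuid, body):
    """Sketch of POST /Bundles/actions/quarantine/UUID."""
    reason = body.get("reason")
    if not reason:
        # The dedicated endpoint is what guarantees 'reason' gets set.
        raise QuarantineError("400: 'reason' is required")
    bundle = BUNDLES[uuid]
    bundle["original_status"] = bundle["status"]  # assumed field name
    bundle["status"] = "quarantine"
    bundle["reason"] = reason
    bundle["quarantined_at"] = datetime.now(timezone.utc).isoformat()
    return bundle


def delete_quarantine(uuid):
    """Sketch of DELETE /Bundles/actions/quarantine/UUID (de-quarantine)."""
    bundle = BUNDLES[uuid]
    if bundle["status"] != "quarantine":
        raise QuarantineError("409: bundle is not quarantined")
    # Return the bundle to the status it had before quarantine.
    bundle["status"] = bundle.pop("original_status")
    bundle["reason"] = ""
    return bundle


def patch_bundle(uuid, body):
    """Sketch of PATCH /Bundles/UUID with status='quarantine' disabled."""
    if body.get("status") == "quarantine":
        raise QuarantineError("400: use POST /Bundles/actions/quarantine/UUID")
    BUNDLES[uuid].update(body)
    return BUNDLES[uuid]
```

With this shape, the only way into or out of quarantine is through the two dedicated handlers, so they are the natural place to enforce required fields.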

From @blinkdog:

I think #201 is still relevant because quarantine isn't quite fixed yet.

I think the three major steps to 'fixing' quarantine are probably:

1. Adding counting: both a quarantine_total_count (the count of all quarantines since the Bundle record was created) and a quarantine_count or quarantine_streak, meaning the most recent count of quarantines performed by a single component type (e.g., GlobusReplicator has sent this bundle to quarantine 5 times).

2. Adding a "Try everything again" button: all the bundles come out of quarantine and go back to the pipeline for another shot. At first, this is a manual tool for the operators (maybe ./ltacmd bundle repair --uuid $BUNDLE_UUID and ./ltacmd bundle repair --all). Later, it becomes a tool for automation to invoke regularly, so bundles keep retrying.

3. Having that automation check the streak count and change bundles over a certain limit to some other status (maybe operator), so that they don't go back in the retry bin and can be used to raise something on a dashboard, send an e-mail, send a Slack alert, etc.
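
The three steps above could be sketched roughly as follows. This is one possible reading, not a design decision: the streak is assumed to reset whenever a different component quarantines the bundle, `STREAK_LIMIT`, `quarantine_component`, `original_status`, and the function names are hypothetical, and the "operator" status is only the "maybe" suggested above.

```python
STREAK_LIMIT = 5  # hypothetical threshold for escalating to an operator


def record_quarantine(bundle, component, reason):
    """Quarantine a bundle and keep score (step 1)."""
    bundle["quarantine_total_count"] = bundle.get("quarantine_total_count", 0) + 1
    if bundle.get("quarantine_component") == component:
        # Same component as last time: the streak continues.
        bundle["quarantine_streak"] = bundle.get("quarantine_streak", 0) + 1
    else:
        # A different component: assume the streak starts over.
        bundle["quarantine_component"] = component
        bundle["quarantine_streak"] = 1
    bundle["original_status"] = bundle["status"]
    bundle["status"] = "quarantine"
    bundle["reason"] = reason


def repair(bundles, limit=STREAK_LIMIT):
    """Release quarantined bundles for another try (step 2),
    escalating those whose streak has reached the limit (step 3)."""
    released, escalated = [], []
    for bundle in bundles:
        if bundle["status"] != "quarantine":
            continue
        if bundle.get("quarantine_streak", 0) >= limit:
            bundle["status"] = "operator"  # out of the retry bin
            escalated.append(bundle["uuid"])
        else:
            bundle["status"] = bundle.pop("original_status")
            released.append(bundle["uuid"])
    return released, escalated
```

The `escalated` list is what a dashboard, e-mail, or Slack alert could be driven from; the counters themselves would be maintained by the dedicated REST actions rather than by each component.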

Now underlying all this are those new REST actions; they can 'keep score' and move things in and out of quarantine in a predictable way.

From @ric-evans:

I did something similar in ewms: I made a field that holds the history of status changes.
https://github.com/Observation-Management-Service/ewms-workflow-management-service/blob/667a4d713d777915ff616e3fe333af328968537f/wms/taskforce_launch_control.py#L42-L49
I used "phase" instead of "status", but this made retries and the like queryable in MongoDB using filters and counts.
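
To make the history-field idea concrete, here is a small pure-Python sketch. It is not a copy of the linked ewms code: the `status_history` field name, the entry keys (`status`, `source`, `timestamp`), and both helper functions are assumptions for illustration. In a MongoDB-backed service the same array would be stored on the record and counted with query filters rather than in Python.

```python
from datetime import datetime, timezone


def append_status(record, status, source):
    """Set the current status and append the change to the history."""
    record.setdefault("status_history", []).append(
        {
            "status": status,
            "source": source,  # which component made the change
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
    )
    record["status"] = status


def count_status(record, status, source=None):
    """Count how often the record entered `status`, optionally by one source."""
    return sum(
        1
        for entry in record.get("status_history", [])
        if entry["status"] == status
        and (source is None or entry["source"] == source)
    )
```

For example, `count_status(bundle, "quarantine", source="GlobusReplicator")` would answer "how many times has GlobusReplicator quarantined this bundle?" without keeping separate counters.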


Note: This could also apply to transfer-request quarantine.

Labels: enhancement (New feature or request), low priority (not today, but perhaps tomorrow)
