feat(deploy): break three config-only DAG edges for parallelism #3844

Closed
Rico Lin (ricolin) wants to merge 1 commit into feat/parallel-deploy-orchestrator from feat/dag-edge-removal-phase1

@ricolin (Member)

Summary

Audit of install-time behaviour vs. declared dependencies in the parallel deployment orchestrator found three edges where the declared dependency is a configuration-only reference (endpoint URL stored in a Helm values template) rather than a real install-time API call. Removing them lets the orchestrator schedule components earlier.

Edges removed

| Component | Old deps | New deps | Why the removed edge is safe |
| --- | --- | --- | --- |
| neutron | keystone, nova, ovn, coredns | keystone, ovn, coredns | roles/neutron/tasks/main.yml only makes Neutron API calls (openstack.cloud.network, subnets) and does not touch Nova at install time. |
| magnum | barbican, heat | keystone | roles/magnum/vars/main.yml references barbican_client / heat_client in magnum.conf. These strings are only dereferenced when a user later creates a cluster. |
| rook-ceph-cluster | rook-ceph, ceph, barbican | rook-ceph, ceph, keystone | The role creates an identity user and catalog entry for the RGW. The previous Barbican dependency was mis-declared; the real dependency is Keystone. |
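To illustrate why dropping an edge lets a component start earlier, here is a minimal sketch that assigns each component the earliest wave it could run in. It is not the orchestrator's code in pkg/dag: the `Graph` type and `waves` function are hypothetical names, and the `nova -> keystone` edge is an assumption taken from the critical path shown under "Measured impact".

```go
package main

import "fmt"

// Graph maps a component to the components it must wait for.
// (Hypothetical type; not the orchestrator's real data structure.)
type Graph map[string][]string

// waves returns the earliest wave each component can run in: 0 for a
// component with no dependencies, otherwise 1 + the latest wave among
// its dependencies. It returns an error if the graph contains a cycle.
func waves(g Graph) (map[string]int, error) {
	const visiting = -1 // marker for a node that is still on the DFS stack
	level := map[string]int{}

	var visit func(name string) (int, error)
	visit = func(name string) (int, error) {
		if l, ok := level[name]; ok {
			if l == visiting {
				return 0, fmt.Errorf("dependency cycle through %q", name)
			}
			return l, nil
		}
		level[name] = visiting
		wave := 0
		for _, dep := range g[name] {
			l, err := visit(dep)
			if err != nil {
				return 0, err
			}
			if l+1 > wave {
				wave = l + 1
			}
		}
		level[name] = wave
		return wave, nil
	}

	for name := range g {
		if _, err := visit(name); err != nil {
			return nil, err
		}
	}
	return level, nil
}

func main() {
	// Assumption: nova waits for keystone. Components that are not keys
	// in the map (keystone, ovn, coredns) are treated as having no deps.
	before := Graph{
		"nova":    {"keystone"},
		"neutron": {"keystone", "nova", "ovn", "coredns"},
	}
	after := Graph{
		"nova":    {"keystone"},
		"neutron": {"keystone", "ovn", "coredns"},
	}
	for label, g := range map[string]Graph{"before": before, "after": after} {
		w, err := waves(g)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: nova=wave %d, neutron=wave %d\n", label, w["nova"], w["neutron"])
	}
}
```

Under these assumptions neutron moves from the wave after nova into the same wave as nova, which is the effect the neutron row in the table describes.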

Readiness check added

Along with the Keystone dependency, roles/rook_ceph_cluster/tasks/main.yml now waits for the keystone-api Deployment to be Available before the subsequent openstack.cloud.* calls, preventing a race with Keystone's rollout. This mirrors the cert-manager secret wait added for Octavia in #3834.

The other two removals (neutron -> nova, magnum -> barbican/heat) don't need new gates: audit of the affected tasks/ files confirms they make no install-time API calls against the removed dependencies.

Measured impact

On atmosphere-molecule-aio-ovn the current critical path is:

... -> keystone -> Wave6 -> nova -> neutron -> (octavia || manila) = ~12 min tail

After these changes the expected critical path is:

... -> keystone -> Wave6 -> (nova || neutron) -> (octavia || manila) = ~8.5 min tail

Expected saving: ~3 min 20 s on the critical path (about 9.6% of the measured 34m 40s non-prepull deploy time in #3841).

Validation

  • go test ./pkg/dag/ ./internal/deploy/ passes.
  • go build ./cmd/atmosphere succeeds.
  • Release note passes vale with zero errors / warnings / suggestions.
  • DCO signed.

Stacking

This PR is based on #3818 (feat/parallel-deploy-orchestrator) so it is self-contained once #3818 lands on main. #3841 will be rebuilt as #3834 + #3835 + this PR on top of main.

Signed-off-by: Rico Lin [email protected]
Co-authored-by: Copilot [email protected]

Two edges in the parallel deploy orchestrator DAG are
configuration-only references (endpoint URLs stored in Helm values
templates) rather than real install-time API calls. Remove them so the
orchestrator schedules affected components earlier:

- magnum no longer depends on octavia, barbican, or heat. magnum.conf
  references barbican_client and heat_client but those strings are
  only dereferenced when a user later creates a cluster. magnum does
  depend on glance at install time to upload the cluster image.
- rook-ceph-cluster no longer depends on barbican; the real dependency
  was keystone, now declared correctly.

Add a keystone-api readiness wait in roles/rook_ceph_cluster/tasks/main.yml
to prevent a race with Keystone's rollout before the subsequent
openstack.cloud.* calls, and pre-create the service project and domain
to avoid racing Keystone's keystone-user jobs.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
@ricolin Rico Lin (ricolin) force-pushed the feat/dag-edge-removal-phase1 branch from c112285 to a3dc226 on April 23, 2026 16:52
@ricolin Rico Lin (ricolin) changed the base branch from main to feat/parallel-deploy-orchestrator April 23, 2026 16:52
@ricolin (Member, Author)

Squash-merged into #3818 (feat/parallel-deploy-orchestrator). Closing.
