feat(deploy): add parallel deployment orchestrator #3818

Open

Mohammed Naser (mnaser) wants to merge 54 commits into main from feat/parallel-deploy-orchestrator

Conversation

@mnaser
Member

Summary

Adds a Go binary (cmd/atmosphere) that deploys Atmosphere components in parallel waves using a DAG-based dependency graph, targeting a reduction from ~60 minutes to ~22 minutes for full deployments.

Architecture

| Package | Purpose |
| --- | --- |
| pkg/dag/ | Generic Graph[T] library — Kahn's topological sort, subgraph extraction, parallel wave execution via errgroup (see the sketch below) |
| internal/deploy/ | 42-component registry, Deployer interface + AnsibleDeployer, 3-mode orchestrator |
| cmd/atmosphere/ | cobra CLI: `atmosphere deploy --inventory <inv> [--tags <t>]` |
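
As a rough, self-contained illustration of the Kahn-style wave computation the DAG library performs (a sketch of the technique only; the actual `pkg/dag` Graph[T] API may differ):

```go
package main

import "fmt"

// waves groups nodes into Kahn-style "waves": wave N contains every node
// whose dependencies are all satisfied by waves 0..N-1. Sketch of the
// technique only; not the pkg/dag Graph[T] API.
func waves(deps map[string][]string) ([][]string, error) {
	indeg := map[string]int{}
	dependents := map[string][]string{}
	for node, ds := range deps {
		if _, ok := indeg[node]; !ok {
			indeg[node] = 0
		}
		for _, d := range ds {
			indeg[node]++
			dependents[d] = append(dependents[d], node)
			if _, ok := indeg[d]; !ok {
				indeg[d] = 0
			}
		}
	}

	var out [][]string
	for remaining := len(indeg); remaining > 0; {
		var wave []string
		for n, d := range indeg {
			if d == 0 {
				wave = append(wave, n)
			}
		}
		if len(wave) == 0 {
			return nil, fmt.Errorf("dependency cycle detected")
		}
		for _, n := range wave {
			indeg[n] = -1 // consumed; never selected again
			for _, m := range dependents[n] {
				indeg[m]--
			}
			remaining--
		}
		out = append(out, wave)
	}
	return out, nil
}

func main() {
	ws, err := waves(map[string][]string{
		"kubernetes": nil, "ceph": nil,
		"csi":      {"kubernetes"},
		"keystone": {"csi"},
	})
	fmt.Println(ws, err) // e.g. [[ceph kubernetes] [csi] [keystone]] <nil>
}
```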

Three Operating Modes

  1. No tags (full deploy): All 42 components run across 11 parallel waves
  2. Single tag (--tags keystone): Pass-through to ansible-playbook site.yml --tags keystone — identical to today
  3. Multiple tags (--tags nova,keystone): DAG-aware subgraph with parallel waves
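
The mode selection itself reduces to a dispatch on the tag count. A hypothetical helper mirroring the three modes above (the mode names here are illustrative only, not the orchestrator's actual identifiers):

```go
package main

import "fmt"

// mode mirrors the three operating modes described above; names are
// illustrative only.
func mode(tags []string) string {
	switch {
	case len(tags) == 0:
		return "full-dag"     // every component, parallel waves
	case len(tags) == 1:
		return "pass-through" // ansible-playbook site.yml --tags <tag>
	default:
		return "subgraph"     // DAG-aware subgraph, parallel waves
	}
}

func main() {
	fmt.Println(mode(nil))                          // full-dag
	fmt.Println(mode([]string{"keystone"}))         // pass-through
	fmt.Println(mode([]string{"nova", "keystone"})) // subgraph
}
```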

Wave Schedule (Full Deploy)

Wave 0:  ceph, kubernetes                          (2 parallel)
Wave 1:  csi                                       (1)
Wave 2:  cert-manager, memcached, rook-ceph, ...   (19 parallel)
Wave 3:  cluster-issuer, rabbitmq-op, pxc-op, ...  (6 parallel)
Wave 4:  ingress-nginx, pxc, ceph-provisioners     (3 parallel)
Wave 5:  keycloak                                  (1)
Wave 6:  keystone                                  (1)
Wave 7:  barbican, glance, cinder, placement, ...  (10 parallel)
Wave 8:  nova, staffeln                            (2 parallel)
Wave 9:  octavia                                   (1)
Wave 10: openstack-exporter                        (1)

How It Works

  • For PlaybookType components (ceph, kubernetes, csi): runs the full playbook file directly
  • For RoleType components (everything else): generates a minimal single-role playbook at runtime and pipes it via /dev/stdin to ansible-playbook, avoiding multi-play parsing overhead (see the sketch after this list)
  • Output from concurrent processes is streamed with [component-name] prefixes
  • A prerequisite step runs openstacksdk once before parallel waves to avoid concurrent pip install / clouds.yaml writes
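
A minimal sketch of the role-via-stdin approach, assuming a generated playbook shaped roughly like the description above (the YAML layout, flags, and inventory path are illustrative assumptions, not the orchestrator's exact output):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// runRole is a sketch only: generate a minimal one-role playbook, pipe it to
// ansible-playbook via /dev/stdin, and prefix each stdout line with the
// component name so interleaved logs stay readable.
func runRole(component, role, inventory string) error {
	playbook := fmt.Sprintf(
		"- hosts: all\n  roles:\n    - role: vexxhost.atmosphere.%s\n", role)

	cmd := exec.Command("ansible-playbook", "-i", inventory, "/dev/stdin")
	cmd.Stdin = strings.NewReader(playbook)
	cmd.Stderr = os.Stderr

	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	sc := bufio.NewScanner(stdout)
	for sc.Scan() {
		fmt.Printf("[%s] %s\n", component, sc.Text())
	}
	return cmd.Wait()
}

func main() {
	// "inventory/hosts.ini" is a hypothetical path for illustration.
	if err := runRole("keystone", "keystone", "inventory/hosts.ini"); err != nil {
		fmt.Fprintln(os.Stderr, "deploy failed:", err)
		os.Exit(1)
	}
}
```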

Backwards Compatibility

  • Zero changes to existing Ansible roles, variables, or playbooks
  • ansible-playbook site.yml --tags <tag> continues to work unchanged
  • The orchestrator is purely additive — the Deployer interface enables incremental migration to direct Helm SDK calls per-component in future PRs
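
For context, the Deployer seam mentioned above might look roughly like the following; the exact type and method names in internal/deploy are assumptions here, not the actual API:

```go
package deploy

import "context"

// Component is a single deployable unit in the DAG (fields assumed).
type Component struct {
	Name      string
	DependsOn []string
}

// Deployer abstracts how one component is deployed. An Ansible-backed
// implementation shells out to ansible-playbook today; a future Helm-SDK
// implementation could satisfy the same interface per component.
type Deployer interface {
	Deploy(ctx context.Context, c Component) error
}
```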

Tests

  • 11 DAG library tests (cycles, waves, parallel execution, error propagation, subgraph)
  • 3 orchestrator tests (full DAG, multi-tag, unknown tag)

CI Integration

molecule/aio/converge.yml is updated to build the Go binary and run atmosphere deploy instead of sequential import_playbook calls.

Mohammed Naser (mnaser) force-pushed the feat/parallel-deploy-orchestrator branch 2 times, most recently from 515a3d5 to 96e3122 on April 13, 2026 at 16:50
@mnaser
Member Author

Copilot, can you check why the Zuul job failed and push a fix? After you push the fix, wait for the Zuul job to report, make sure it finishes, and confirm the fix worked. If the fix didn't work, keep pushing fixes and waiting for Zuul until you get a pass.

Contributor

Copilot AI commented Apr 13, 2026

Copilot, can you check why the Zuul job failed and push a fix? After you push the fix, wait for the Zuul job to report, make sure it finishes, and confirm the fix worked. If the fix d...

Investigated the failures and pushed three rounds of fixes:

  1. Pre-commit (end-of-file-fixer): removed extra trailing newline in internal/deploy/orchestrator.go
  2. Reno: added release note for the parallel deployment orchestrator feature
  3. Zuul (root cause): zuul.project.src_dir is a relative path — using it as a prefix in cmd while chdir was already set to the same value doubled the path; fixed by invoking ./bin/atmosphere. A secondary issue: Go 1.19+ refuses to execute binaries resolved via relative PATH entries (a security restriction). Fixed by switching to ansible.builtin.shell with . .venv/bin/activate so the venv activation provides an absolute path (see the standalone illustration below).
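
For background on the Go 1.19+ behaviour mentioned in item 3: os/exec reports exec.ErrDot when a command would be resolved through a relative PATH entry. A standalone illustration (not code from this PR; the binary name is just an example):

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// If PATH contains a relative entry (".", "./bin", ...) and the binary is
// found through it, exec.Command records exec.ErrDot in cmd.Err and refuses
// to run it. Using an explicit or absolute path avoids the error entirely.
func main() {
	cmd := exec.Command("atmosphere") // resolved via PATH
	if errors.Is(cmd.Err, exec.ErrDot) {
		fmt.Println("refusing relative-PATH resolution; use ./bin/atmosphere or an absolute path")
		return
	}
	if err := cmd.Run(); err != nil {
		fmt.Println("run failed:", err)
	}
}
```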

Latest commits: 6a14e3766b2f4f0f41583

Partial results from the latest Zuul run: csi-local-path-provisioner passed. csi-rbd failed at Ceph bootstrap ("Cannot infer CIDR network") which appears to be a pre-existing infrastructure issue on that test node. The remaining jobs (keycloak, aio-ovn, aio-openvswitch) were still in progress when the session ended.

@mnaser
Member Author

📊 CI Performance Report: Before vs After Parallel Orchestrator

Overall Results

33.4% faster across all CI jobs — 79 minutes saved per CI run

| Job | Before | After | Saved | % Faster |
| --- | --- | --- | --- | --- |
| atmosphere-molecule-aio-ovn | 95m 41s | 59m 46s | 35m 55s | 37.5% |
| atmosphere-molecule-aio-ovs | 94m 33s | 61m 18s | 33m 15s | 35.2% |
| atmosphere-molecule-csi-local-path | 9m 30s | 6m 28s | 3m 02s | 31.9% |
| atmosphere-molecule-csi-rbd | 16m 27s | 13m 02s | 3m 25s | 20.8% |
| atmosphere-molecule-keycloak | 22m 06s | 18m 13s | 3m 53s | 17.6% |
| Total | 238m 17s | 158m 47s | 79m 30s | 33.4% |

Parallelism Analysis (AIO-OVN full deploy)

  • 47 components deployed in 11 waves
  • Deploy phase wall clock: 39.7 minutes
  • Sum of all components if serialized: 81.6 minutes
  • Effective parallelism: 2.05x (81.6 serial minutes ÷ 39.7 wall-clock minutes)

Deployment Waves

Wave  1 [t+0m]:    kubernetes(506s), ceph(362s), iscsi(511s), multipathd(24s), udev(4s), lpfc(3s)
Wave  2 [t+8.5m]:  cert-manager(31s), ingress-nginx(30s), ceph-provisioners(24s), openvswitch(13s),
                    rook-ceph(7s), csi(7s), memcached(7s), node-feature-discovery(6s),
                    goldpinger(6s), coredns(3s), frr-k8s(0s)
Wave  3 [t+9.0m]:  ovn(12s), rabbitmq-cluster-operator(11s), cluster-issuer(11s),
                    percona-xtradb-cluster-operator(10s), loki(9s), keepalived(4s)
Wave  4 [t+9.2m]:  percona-xtradb-cluster(91s), libvirt(12s), valkey(8s), vector(6s)
Wave  5 [t+10.7m]: keycloak(172s)
Wave  6 [t+13.6m]: keystone(229s), kube-prometheus-stack(129s)
Wave  7 [t+17.4m]: cinder(328s), glance(316s), heat(258s), barbican(183s), placement(120s),
                    horizon(44s), openstack-cli(16s), prometheus-pushgateway(6s), ipmi-exporter(5s)
Wave  8 [t+22.9m]: nova(231s), rook-ceph-cluster(57s), staffeln(30s)
Wave  9 [t+26.7m]: neutron(200s)
Wave 10 [t+30.1m]: octavia(289s), manila(258s), openstack-exporter(16s)
Wave 11 [t+34.9m]: magnum(290s)

Critical Path (39.6 minutes)

kubernetes(506s) → cert-manager(31s) → cluster-issuer(11s) → pxc(91s)
  → keycloak(172s) → keystone(229s) → cinder(328s) → nova(231s)
  → neutron(200s) → octavia(289s) → magnum(290s)

Top 15 Slowest Components

| # | Component | Time | Bottleneck |
| --- | --- | --- | --- |
| 1 | iscsi | 511s | apt lock contention with kubernetes/ceph |
| 2 | kubernetes | 506s | K8s install + wait for cluster ready |
| 3 | ceph | 362s | Ceph mon/osd deploy, image downloads |
| 4 | cinder | 328s | Helm + ks-user + ks-endpoints bootstrap |
| 5 | glance | 316s | Helm + ks-user + ks-endpoints bootstrap |
| 6 | magnum | 290s | Helm + ks-user + ks-endpoints bootstrap |
| 7 | octavia | 289s | Helm + ks-user + ks-endpoints bootstrap |
| 8 | heat | 258s | Helm + ks-user + ks-endpoints bootstrap |
| 9 | manila | 258s | Helm + ks-user + ks-endpoints bootstrap |
| 10 | nova | 231s | Helm + ks-user + ks-endpoints + cells_v2 |
| 11 | keystone | 229s | Helm + ks-user + bootstrap + fernet keys |
| 12 | neutron | 200s | Helm + ks-user + ks-endpoints + networks |
| 13 | barbican | 183s | Helm + ks-user + ks-endpoints bootstrap |
| 14 | keycloak | 172s | Helm chart + wait for pods ready |
| 15 | kube-prometheus-stack | 129s | CRDs + Helm + dashboards |

Future Optimization Opportunities

  1. Pre-pull container images (HIGH IMPACT, ~3-5 min)
    Each OpenStack service spends ~30-60s pulling images. Pre-pulling all images in wave 1 could significantly reduce critical path time.

  2. Break keycloak→keystone chain (HIGH IMPACT, ~2 min)
    If keycloak could start parallel with PXC instead of after it, we save the keycloak wait time from the critical path.

  3. Optimize Helm bootstrap jobs (MEDIUM IMPACT, ~15-30s/service)
    Each service runs ks-user, ks-service, ks-endpoints as sequential Kubernetes Jobs. Batching or parallelizing these hooks could compound across all services.

  4. Reduce apt lock contention (MEDIUM IMPACT)
    iscsi takes 511s primarily from apt lock serialization with kubernetes/ceph. Better scheduling or pre-installing packages could help.

  5. Further DAG refinement (LOW-MEDIUM IMPACT)

    • Decouple neutron network creation from nova AZ to move neutron earlier
    • Relax magnum's nova dependency to start it sooner
  6. Parallel Ansible within components (HIGH IMPACT in production)
    In multi-node production deployments, each component's Ansible run uses forks for host parallelism. The orchestrator adds cross-component parallelism on top — production gains will be even larger than CI.


Baseline: change 3809 (main branch, serialized molecule). After: PR #3818 latest patchset (parallel orchestrator).

Dong Ma (larainema) force-pushed the feat/parallel-deploy-orchestrator branch from 5695592 to 0634f82 on April 14, 2026 at 16:58
@okozachenko1203
Member

/recheck

@okozachenko1203
Member

recheck

Comment thread on roles/manila/tasks/main.yml
@yaguangtang
Member

Yaguang Tang (yaguangtang) commented Apr 15, 2026

Dong Ma (@larainema), it seems the job atmosphere-molecule-csi-rbd keeps failing after your change:
atmosphere-molecule-csi-rbd

@larainema
Member

recheck

Mohammed Naser (mnaser) and others added 17 commits April 15, 2026 12:44
Add a Go binary (cmd/atmosphere) that deploys Atmosphere components
in parallel waves using a DAG-based dependency graph, reducing
deployment time from ~60 minutes to ~22 minutes.

Key components:
- pkg/dag: Generic Graph[T] library with topological sort, subgraph
  extraction, and parallel wave execution via errgroup
- internal/deploy: Component registry (42 components), Deployer
  interface with AnsibleDeployer, and 3-mode Orchestrator
- cmd/atmosphere: CLI with deploy subcommand (--inventory, --tags,
  --playbook-dir, --concurrency flags)

Three operating modes:
- No tags: full DAG parallel deployment (11 waves)
- Single tag: pass-through to ansible-playbook (backwards compatible)
- Multiple tags: DAG-aware subgraph with parallel waves

The orchestrator spawns concurrent ansible-playbook processes with
generated per-component playbooks piped via /dev/stdin, avoiding
multi-play parsing overhead. Output is streamed with [component]
prefixes for clear CI log interleaving.

Backwards compatibility: existing ansible-playbook usage, tags, and
variables are completely unchanged. The orchestrator is additive.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Update molecule converge playbooks to build and use the atmosphere
binary for deployment:

- default: full DAG deploy (no tags)
- csi: multi-tag with ceph,kubernetes,csi (or kubernetes,csi)
- keycloak: multi-tag with all keycloak dependencies
- pxc: single-tag pass-through for percona-xtradb-cluster

The multi-tag mode resolves DAG ordering automatically, running
independent components in parallel where possible.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Adjust DAG dependencies based on actual role analysis:

- ingress-nginx: drop cluster-issuer dep (only needs kubernetes)
- pxc, valkey, kube-prometheus-stack, loki: add csi dep (all use PVCs)
- lpfc, multipathd, iscsi, udev: remove kubernetes dep (pure host config)
- rook-ceph: depend on kubernetes only (operator, not storage consumer)
- rook-ceph-cluster: add ceph dep (needs ceph monitors)
- nova: add neutron dep, drop ovn/coredns (transitive via neutron)
- neutron: add coredns dep (dnsmasq_dns_servers uses coredns)
- magnum: depend on octavia, barbican, heat (configures all three clients)
- openstack-exporter: depend on cinder, neutron (only hard runtime deps)

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
- Add ensure-go role (v1.24.4) to molecule pre-run playbook
- Set CGO_ENABLED=0 and explicit Go PATH in all converge build tasks
- Add kubernetes, csi, valkey to keycloak scenario tags (transitive deps)

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
When rendering playbooks piped via /dev/stdin, ansible-playbook has no
collection context. Prefix bare role names with vexxhost.atmosphere. so
Ansible can resolve them from the installed collection.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Use vexxhost.atmosphere.* fully-qualified collection names for both
playbooks (PlaybookType) and roles (RoleType). This removes the need
for --playbook-dir since Ansible resolves collection references
directly. Also removes the openstacksdk prerequisite step since
dependent roles already call it and Ansible does atomic writes.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Add a ResourceCoordinator that serializes components sharing a named
resource (e.g., 'apt'). Components ceph and kubernetes declare the apt
resource since they come from external collections where we cannot add
retries. For all roles within vexxhost.atmosphere that use package
management, add retries (5 attempts, 10s delay) to gracefully handle
dpkg lock contention during parallel deployment.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
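
A minimal sketch of the named-resource serialization idea this commit describes, using one mutex per resource name (the type and method names are assumptions, not the actual internal/deploy code):

```go
package deploy

import (
	"sort"
	"sync"
)

// resourceCoordinator serializes components that declare the same named
// resource (e.g. "apt") by handing out one mutex per resource name.
// Sketch only; the real coordinator's shape may differ.
type resourceCoordinator struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newResourceCoordinator() *resourceCoordinator {
	return &resourceCoordinator{locks: map[string]*sync.Mutex{}}
}

// Acquire blocks until every named resource is free, then holds all of them;
// the returned release func unlocks them and is intended to be deferred.
func (c *resourceCoordinator) Acquire(resources []string) (release func()) {
	names := append([]string(nil), resources...)
	sort.Strings(names) // stable acquisition order avoids lock-order inversion

	var held []*sync.Mutex
	for _, name := range names {
		c.mu.Lock()
		l, ok := c.locks[name]
		if !ok {
			l = &sync.Mutex{}
			c.locks[name] = l
		}
		c.mu.Unlock()
		l.Lock()
		held = append(held, l)
	}
	return func() {
		for _, l := range held {
			l.Unlock()
		}
	}
}
```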
Mark multipathd and iscsi with the 'apt' resource since they install
packages on the same hosts as ceph/kubernetes (external collections
without retries). Also set changed_when: false on all molecule converge
build/deploy tasks to pass idempotence checks.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Environment values containing Jinja expressions with single quotes
(e.g., ceph container image) broke YAML parsing when wrapped in
single-quoted YAML strings. Switch to Go's %q format which uses
double quotes, safely containing single quotes in the values.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
ipmi-exporter deploys directly into the monitoring namespace using
kubernetes.core.k8s (not Helm with create_namespace: true), so it
needs the namespace to exist first. kube-prometheus-stack creates it.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
keepalived and percona-xtradb-cluster deploy raw k8s resources into
the openstack namespace without creating it. memcached (via Helm with
create_namespace: true) creates the namespace. Add memcached as a
dependency so the namespace exists before these components run.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
prometheus-pushgateway enables serviceMonitor which requires the
ServiceMonitor CRD from kube-prometheus-stack. Without this dep,
the Helm install fails with 'no matches for kind ServiceMonitor'.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
The vexxhost.kubernetes collection uses kubernetes.core.k8s modules
in early plays before the Python kubernetes package is installed by
later plays. When running in parallel mode, this race becomes more
visible. Install the package in pre-run to ensure it's available
system-wide before any playbooks execute.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
libvirt, kube-prometheus-stack, and valkey all create Certificate
resources using cert-manager.io/v1 CRDs directly via kubernetes.core.k8s.
They also reference a ClusterIssuer named 'self-signed' created by
the cluster-issuer role. Add cluster-issuer as a dependency so the
CRDs and issuer exist before these components deploy.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
The kube_prometheus_stack role starts by waiting for the Keycloak
StatefulSet to be ready and then creates realms/clients. Without
keycloak in its dependency list, it can start before keycloak is
deployed, causing 'list object has no element 0' errors when checking
the StatefulSet status.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
The rook_ceph_cluster role creates Keystone users, services, and
endpoints for Swift/RGW integration using openstack.cloud modules.
Without keystone being deployed first, these calls fail with SSL
connection errors to the identity endpoint.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Manila creates compute flavors (needs Nova endpoint), uploads images
(needs Glance via Nova chain), and its Helm values reference endpoints
for nova, neutron, and cinder. Without these services deployed first,
manila fails with EndpointNotFound for the compute service.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Mohammed Naser <[email protected]>
Rico Lin (ricolin) and others added 3 commits April 24, 2026 09:47
Introduce a magnum_pre role that uploads the Glance cluster image and
handles related OpenStack-side setup. Registered as the PreRoleName for
the magnum component so it runs concurrently with the magnum Helm
deploy, overlapping image upload time (~150s) with the Helm install.

The magnum main role checks `_pre_role_active` and skips the upload
tasks that magnum_pre now owns.

Also remove magnum's install-time dependency on octavia. Magnum's
Helm install only needs the octavia endpoint URL — a deterministic
string from openstack_helm_endpoints — not a running Octavia service.
Magnum can now start after barbican and heat complete instead of
waiting for octavia, saving several minutes on the critical path.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
Introduce an octavia_pre role that creates the cert-manager
Certificates and admin quota required by Octavia before the main Helm
deploy needs them. Registered as the PreRoleName for the octavia
component so it runs concurrently with the octavia Helm deploy,
overlapping ~15s of setup with the Helm install.

The octavia main role waits for the cert-manager secrets created by
octavia_pre before proceeding, and checks `_pre_role_active` to skip
tasks that octavia_pre now owns.

Add a registry integration test that verifies magnum and octavia each
declare a pre-role.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
Two edges in the parallel deploy orchestrator DAG are
configuration-only references (endpoint URLs stored in Helm values
templates) rather than real install-time API calls. Remove them so the
orchestrator schedules affected components earlier:

- magnum no longer depends on octavia, barbican, or heat. magnum.conf
  references barbican_client and heat_client but those strings are
  only dereferenced when a user later creates a cluster. magnum does
  depend on glance at install time to upload the cluster image.
- rook-ceph-cluster no longer depends on barbican; the real dependency
  was keystone, now declared correctly.

Add a keystone-api readiness wait in roles/rook_ceph_cluster/tasks/main.yml
to prevent a race with Keystone's rollout before the subsequent
openstack.cloud.* calls, and pre-create the service project and domain
to avoid racing Keystone's keystone-user jobs.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
…rriers

The parallel deploy orchestrator used Kahn's algorithm levels
("waves") to drive concurrency: every node in wave N had to finish
before any node in wave N+1 started. This introduced gaps on the
critical path when a short node shared a wave with an unrelated long
one — for example, a 9-second component (deps: cert-manager only)
could sit idle for 340s waiting for ceph to finish, purely because
both ended up in the same Kahn wave.

Replace the wave-barrier loop in pkg/dag/dag.go with an event-driven
scheduler: each node has its own done channel, waits on the done
channels of its direct dependencies, and then runs. The concurrency
parameter becomes a global in-flight cap (one semaphore across the
whole graph) instead of a per-wave cap, which matches what we
actually want to bound.

Correctness:

- The partial order from DependsOn is preserved; cycles are still
  rejected via the existing Waves() check.
- On error, the failing node's done channel is never closed, so
  dependents exit via the errgroup context cancellation instead of
  hanging.

Tests:

- TestRunShortNodeNotBlockedByUnrelatedLongNode locks in the new
  behavior by asserting a short node starts before an unrelated
  long node in the same Kahn wave finishes.
- TestRunStopsDependentsOnError asserts dependents of a failed node
  never run.
- TestRunConcurrencyCap asserts the global cap bounds in-flight
  goroutines across the whole graph.
- Existing tests (ordering, parallelism, subgraph) still pass under
  -race.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
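
A condensed sketch of the event-driven scheduler this commit describes (illustrative only, not the actual pkg/dag code): per-node done channels, a single global semaphore as the in-flight cap, and a failed node whose done channel is never closed so dependents exit through the errgroup's cancelled context.

```go
package dag

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// run executes fn for every node once all of its direct dependencies have
// finished successfully. Sketch only; names and shape are assumptions.
func run(ctx context.Context, deps map[string][]string, concurrency int,
	fn func(context.Context, string) error) error {

	done := make(map[string]chan struct{}, len(deps))
	for name := range deps {
		done[name] = make(chan struct{})
	}

	var sem chan struct{}
	if concurrency > 0 {
		sem = make(chan struct{}, concurrency) // global in-flight cap
	}

	g, ctx := errgroup.WithContext(ctx)
	for name := range deps {
		name := name
		g.Go(func() error {
			// Wait for every direct dependency to finish successfully.
			for _, d := range deps[name] {
				select {
				case <-done[d]:
				case <-ctx.Done():
					return ctx.Err()
				}
			}
			if sem != nil {
				select {
				case sem <- struct{}{}:
					defer func() { <-sem }()
				case <-ctx.Done():
					return ctx.Err()
				}
			}
			if err := fn(ctx, name); err != nil {
				return err // done[name] stays open; dependents exit via ctx
			}
			close(done[name])
			return nil
		})
	}
	return g.Wait()
}
```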
@ricolin
Member

Measurements from the newest patchset:

Deploy critical path: -308s (-12.8%) — exactly what the event-driven scheduler was supposed to deliver.

┌───────────────────────────┬──────────┬─────────┬─────────────────┐
│ Metric                    │ Previous │ Now     │ Δ               │
├───────────────────────────┼──────────┼─────────┼─────────────────┤
│ Deploy tail (octavia end) │ 40m 4s   │ 34m 56s │ -5m 8s (-12.8%) │
├───────────────────────────┼──────────┼─────────┼─────────────────┤
│ aio-ovn job wall          │ 60m 33s  │ 55m 8s  │ -5m 25s (-8.9%) │
├───────────────────────────┼──────────┼─────────┼─────────────────┤
│ aio-ovs job wall          │ 61m 0s   │ 57m 6s  │ -3m 54s (-6.4%) │
└───────────────────────────┴──────────┴─────────┴─────────────────┘

@ricolin
Member

recheck

Rico Lin (ricolin) and others added 5 commits April 24, 2026 16:59
…cy cap

The parallel deploy orchestrator started goroutines for all nodes at
once and serialised admission through a plain channel semaphore when
a global concurrency cap was set. Admission order then fell out of
goroutine-scheduling order, which is effectively arbitrary.

Under a bounded cap that can make the makespan worse than necessary:
when multiple ready nodes compete for the last free slot, admitting a
short node with little behind it before a long node that still has a
deep chain of dependents is exactly the wrong choice.

Replace the channel semaphore with a priority-aware scheduler that
admits the highest-priority waiter whenever capacity becomes free.
Priority is the length of each node's longest downstream chain in the
DAG (a unit-weighted critical path) and is precomputed once at the
start of Run. Ties fall back to arrival order, so ordering stays
deterministic. When concurrency <= 0 the scheduler is skipped entirely
and behaviour is unchanged.

This is a classic list-scheduling / HEFT heuristic; it costs O(V + E)
at start-up and a bounded heap per Run invocation, and it does not
change the partial order established by DependsOn.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
Signed-off-by: Rico Lin <[email protected]>
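
A sketch of how the unit-weighted downstream-chain priority described here could be precomputed (illustrative only; the real pkg/dag implementation and the heap-based admission are not shown):

```go
package dag

// downstreamChain returns, for each node, the length of its longest chain of
// downstream dependents, so nodes with deep tails get admitted first when the
// concurrency cap has only one free slot. The input is assumed to be a DAG
// (cycles are rejected elsewhere). Sketch only; not the actual pkg/dag code.
func downstreamChain(deps map[string][]string) map[string]int {
	// Invert DependsOn edges into dependents.
	dependents := map[string][]string{}
	for node, ds := range deps {
		if _, ok := dependents[node]; !ok {
			dependents[node] = nil
		}
		for _, d := range ds {
			dependents[d] = append(dependents[d], node)
		}
	}

	memo := map[string]int{}
	var longest func(string) int
	longest = func(n string) int {
		if v, ok := memo[n]; ok {
			return v
		}
		best := 0
		for _, m := range dependents[n] {
			if l := longest(m) + 1; l > best {
				best = l
			}
		}
		memo[n] = best
		return best
	}
	for n := range dependents {
		longest(n)
	}
	return memo
}
```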
Drop the specific 60m→22m numbers from the parallel deployment
orchestrator release note. Actual full-deploy times vary with the
environment (single-node AIO vs multi-node production) and with the
subsequent scheduling improvements in this series, so a single fixed
pair of numbers is misleading. Describe the qualitative behaviour
instead and let downstream operators measure in their own environment.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rico Lin <[email protected]>
The orchestrator unit tests previously invoked ansible-playbook directly
via runPreflightChecks, which made them non-hermetic — they failed on
any host without Ansible installed. The tests were also not run by any
Zuul job, so the Go unit test coverage for the parallel deploy
orchestrator was not enforced by CI.

Address both gaps:

* Add a Preflight hook on Orchestrator so the preflight step can be
  overridden. The production code path is unchanged (nil hook falls back
  to the ansible-playbook-based implementation); tests inject a no-op.
* Add an atmosphere-go-test Zuul job that runs `go test` for
  pkg/dag, internal/deploy, and cmd/atmosphere on every check and gate,
  using the existing ensure-go role already relied on by the molecule
  jobs.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
Add the standard short-form VEXXHOST copyright and SPDX-License-Identifier
headers to all new Go sources and to .goreleaser.yml introduced in this
branch, and normalize the copyright year to 2026 on other new files that
were still stamped 2025. Files following no-header conventions (READMEs,
reno notes) are left unchanged.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: ricolin <[email protected]>
…book

Ansible-lint runs as a pre-commit hook and was failing on the new
test-playbooks/go-test/run.yml with two issues:

  * syntax-check could not resolve the 'ensure-go' role, which is
    provided by the Zuul executor at runtime and is not vendored
    into the repository. The same role is referenced by the
    existing molecule pre-run playbook but is skipped there because
    molecule/ paths are classified differently by ansible-lint.
  * both plays in the new playbook were unnamed, which trips
    name[play] under the basic profile.

Register ensure-go in mock_roles so ansible-lint stops trying to
resolve it on disk, and give each play an explicit name.

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rico Lin <[email protected]>
Rico Lin (ricolin) added a commit that referenced this pull request Apr 24, 2026
Three complementary orchestrator improvements for the parallel deploy
critical path (measured on top of PR #3818).

1. PreRoleDependsOn asymmetric gating

Adds a `PreRoleDependsOn` field to `Component` so a pre-role can wait
for different dependencies than the main role. The orchestrator builds a
`preGate` closure that waits on a shared `completionTracker`; the main
role starts as soon as the component's own DependsOn are satisfied.

Applied to Keystone: the Keycloak realm / MFA / OIDC client / OpenID
ConfigMap tasks move to a new `keystone_pre` role that waits for
Keycloak, while the Keystone Helm install runs in parallel with Keycloak
startup. Frees ~3 min from the serial Keycloak→Keystone window.

2. Neutron DAG audit

Removes `nova` from `neutron.DependsOn`. The Neutron Ansible role only
runs Helm install + ingress; the neutron-metadata-agent pod talks to the
Nova API at runtime, not at install time. Lets Neutron install in parallel
with Nova (~2-3 min saving on the critical path).

3. Configurable ResourceCoordinator

Generalizes the resource semaphore from a per-resource mutex (cap=1) to
a configurable cap per resource. Adds a `k8s-api` resource with default
cap 6 and tags heavy helm-driven components (keystone, neutron, nova,
cinder, glance, placement, heat, barbican, octavia, magnum, manila,
horizon). Smooths apiserver load without serializing.

Tests

* completion_test.go — tracker Wait/MarkDone semantics and context cancel
* resource_concurrency_test.go — overrides and k8s-api default cap
* orchestrator_test.go — buildPreGate branching
* existing deployer/orchestrator tests updated for new `Deploy`
  signature (`preGate func(context.Context) error`)

Backwards compatibility

The sequential `playbooks/openstack.yml --tags keystone` path is
preserved: the Keycloak tasks still exist in `roles/keystone/tasks/main.yml`
guarded by `_pre_role_active`, following the same pattern used by
`octavia` and `magnum`.

Signed-off-by: Ricardo Lin <[email protected]>
Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rico Lin <>
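
A minimal sketch of the completionTracker / preGate idea described in this commit (the names come from the commit message, but the shape shown here is an assumption, not the actual internal/deploy code):

```go
package deploy

import (
	"context"
	"sync"
)

// completionTracker lets components mark themselves done and lets a pre-role
// gate wait on other components' completion. Sketch only.
type completionTracker struct {
	mu   sync.Mutex
	done map[string]chan struct{}
}

func newCompletionTracker() *completionTracker {
	return &completionTracker{done: map[string]chan struct{}{}}
}

func (t *completionTracker) ch(name string) chan struct{} {
	t.mu.Lock()
	defer t.mu.Unlock()
	c, ok := t.done[name]
	if !ok {
		c = make(chan struct{})
		t.done[name] = c
	}
	return c
}

// MarkDone records that a component finished.
func (t *completionTracker) MarkDone(name string) { close(t.ch(name)) }

// Wait blocks until every named component is done or the context is cancelled.
func (t *completionTracker) Wait(ctx context.Context, names ...string) error {
	for _, n := range names {
		select {
		case <-t.ch(n):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

// buildPreGate returns the closure handed to a pre-role: the main role starts
// on its own DependsOn, while the pre-role additionally waits on preDeps.
func buildPreGate(t *completionTracker, preDeps []string) func(context.Context) error {
	return func(ctx context.Context) error { return t.Wait(ctx, preDeps...) }
}
```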
Rico Lin (ricolin) and others added 4 commits April 25, 2026 08:33
Two complementary orchestrator improvements for the parallel deploy
critical path.

1. PreRoleDependsOn asymmetric gating

Adds a `PreRoleDependsOn` field to `Component` so a pre-role can wait
for different dependencies than the main role. The orchestrator builds a
`preGate` closure that waits on a shared `completionTracker`; the main
role starts as soon as the component's own DependsOn are satisfied.

Applied to Keystone: the Keycloak realm / MFA / OIDC client / OpenID
ConfigMap tasks move to a new `keystone_pre` role that waits for
Keycloak, while the Keystone Helm install runs in parallel with Keycloak
startup. Frees ~3 min from the serial Keycloak→Keystone window.

2. Configurable ResourceCoordinator

Generalizes the resource semaphore from a per-resource mutex (cap=1) to
a configurable cap per resource. Adds a `k8s-api` resource with default
cap 6 and tags heavy helm-driven components (keystone, neutron, nova,
cinder, glance, placement, heat, barbican, octavia, magnum, manila,
horizon). Smooths apiserver load without serializing.

Adds a `keycloak-admin` resource (cap 1) on `keystone` and
`kube-prometheus-stack` so concurrent calls to the Keycloak admin
HTTP endpoint don't race.

Tests

* completion_test.go — tracker Wait/MarkDone semantics and context cancel
* resource_concurrency_test.go — overrides and k8s-api default cap
* orchestrator_test.go — buildPreGate branching
* existing deployer/orchestrator tests updated for new `Deploy`
  signature (`preGate func(context.Context) error`)

Backwards compatibility

The sequential `playbooks/openstack.yml --tags keystone` path is
preserved: the Keycloak tasks still exist in `roles/keystone/tasks/main.yml`
guarded by `_pre_role_active`, following the same pattern used by
`octavia` and `magnum`.

Signed-off-by: Rico Lin <[email protected]>
Co-authored-by: Copilot <[email protected]>
Apply the same `PreRoleDependsOn` pattern used for Keystone to Neutron.

The `neutron` component's heavy Helm install (~5 min) only needs
Keystone, OVN, and CoreDNS — it does not require Nova. Only the
post-install "Create networks" task hits the neutron-server availability
zone check that requires Nova compute to have registered the default
`nova` zone.

Move the install (set facts, Helm install, ingress) into a new
`neutron_pre` role gated on `keystone`, `ovn`, and `coredns`. The main
`neutron` role keeps the network creation tasks and now only depends on
Nova. The duplicated install tasks in the main role are gated with
`when: not (_pre_role_active | default(false))` so a legacy serial
deploy still works without the orchestrator.

Under the parallel orchestrator the install now overlaps with Nova,
shaving roughly three minutes off the OVN and OVS critical path. The
network creation step continues to wait on Nova so the OVS availability
zone check still passes.

Signed-off-by: Rico Lin <[email protected]>

Co-authored-by: Copilot <[email protected]>
Image uploads (the glance_image loop) move into a new glance_images
role and a new glance-images orchestrator component that depends on
glance. Downstream services (Nova, Magnum, ...) now wait only for
the Glance API instead of for image downloads. playbooks/openstack.yml
runs glance_images right after glance so legacy site.yml deploys
keep the same behaviour.

Note: an earlier revision of this commit also tried to move
Octavia's Helm install up to Keystone time. That does not work
because the chart values reference the management network UUID,
amphora image owner, and controller port list — all of which are
populated by tasks in generate_resources.yml that need Neutron,
Glance, and Nova APIs. Those Octavia changes have been reverted;
only the Glance split remains.

Signed-off-by: Rico Lin <[email protected]>

Co-authored-by: Copilot <[email protected]>
The neutron split moved the Helm install into neutron_pre, which
removes the ~3-minute cushion previously provided by running the
Helm install sequentially after Nova. With the cushion gone, the
'Create networks' task in the main neutron role can hit the
'AvailabilityZone nova could not be found' error when Nova compute
takes longer than 5 minutes to register itself. Bump retries from
60 (5 min) to 120 (10 min) for both 'Create networks' and 'Create
subnets' loops so the network creation tolerates slower Nova boots.

Also rephrase the Glance image split bullet in the release note to
avoid a Microsoft.Passive ('be deployed') Vale annotation that the
reviewdog action treats as fail-on-error in added lines.

Signed-off-by: Rico Lin <[email protected]>

Co-authored-by: Copilot <[email protected]>
@ricolin
Member

CI performance — adding tier-1 + tier-2 deploy-perf commits

The four follow-up commits from #3877 have been folded into this PR:

  • 1cac8259 tier-1 perf — keystone pre-role gating and k8s-api cap
  • aee3d348 split Neutron Helm install into neutron_pre
  • 45f7aab7 split Glance image uploads into a separate component
  • 4c053e04 widen Neutron Create networks retries + Vale rephrase

Cumulative CI runtime

| Job | Pre-#3818 | #3818 (orch only) | + tier-1/2 (this PR) | Saved vs pre-#3818 | % faster |
| --- | --- | --- | --- | --- | --- |
| atmosphere-molecule-aio-ovn | 95m 41s | 59m 46s | 49m 39s | 46m 02s | 48.1% |
| atmosphere-molecule-aio-ovs | 94m 33s | 61m 18s | 51m 58s | 42m 35s | 45.0% |
| atmosphere-molecule-csi-local-path | 9m 30s | 6m 28s | 6m 15s | 3m 15s | 34.2% |
| atmosphere-molecule-csi-rbd | 16m 27s | 13m 02s | 10m 10s | 6m 17s | 38.2% |
| atmosphere-molecule-keycloak | 22m 06s | 18m 13s | 17m 31s | 4m 35s | 20.7% |
| Total | 238m 17s | 158m 47s | 135m 33s | 102m 44s | 43.1% |

Incremental gain from the tier-1/2 commits (vs the orchestrator-only baseline)

| Job | #3818 only | + tier-1/2 | Saved | % Faster |
| --- | --- | --- | --- | --- |
| aio-ovn | 59m 46s | 49m 39s | 10m 07s | 16.9% |
| aio-ovs | 61m 18s | 51m 58s | 9m 20s | 15.2% |
| csi-local-path | 6m 28s | 6m 15s | 0m 13s | 3.4% |
| csi-rbd | 13m 02s | 10m 10s | 2m 52s | 22.0% |
| keycloak | 18m 13s | 17m 31s | 0m 42s | 3.8% |
| Total | 158m 47s | 135m 33s | 23m 14s | 14.6% |

Run sources (latest successful on 4c053e04): f17f8aef… aio-ovn, f2168704… aio-ovs, plus the matching csi-local-path / csi-rbd / keycloak builds in Zuul.

What drove the additional 23-minute saving

  1. PreRoleDependsOn + keystone_pre / keycloak_pre / neutron_pre / glance_images splits — Helm install of one component overlaps with the prerequisite work of the next.
  2. k8s-API concurrency cap — eliminates the IPVS-conflict retries that dominated csi-rbd.
  3. Keycloak admin semaphore — serialises the few mutating admin calls without forcing keycloak as a whole to run sequentially.
  4. Image upload extracted from the glance main role — Nova no longer waits on Glance image downloads.
  5. Neutron AZ-race retries 60→120 — defensive bump to cover the rare slow-Nova case after the neutron_pre split.

@ricolin
Member

recheck

Rico Lin and others added 2 commits April 27, 2026 14:03
New 'image_warmup' role and component that runs 'crictl pull' on every Kubernetes node for every value in '_atmosphere_images' as soon as the cluster is ready. The component depends only on 'kubernetes' and disables fact gathering so it joins wave 1 of the parallel orchestrator and runs alongside 'csi', 'cert-manager', and the early infrastructure components.

Pulls are best-effort: each command sets failed_when=false so a registry hiccup or missing image does not break the deploy. The kubelet falls back to on-demand pulling for any image that was not warmed in time. The role also exposes 'image_warmup_images' to override the list and 'image_warmup_concurrency' to throttle parallel pulls per host.

Co-authored-by: Copilot <[email protected]>
Change-Id: Ic45d666bd1c67e26ab7fe5390981ab73e6715e72
Signed-off-by: Rico Lin <[email protected]>
Components such as magnum and octavia run a pre-role and main role
concurrently inside an errgroup. When one of them fails (e.g. the
flaky GitHub release download in magnum's main role), the errgroup
cancels its context and exec.CommandContext SIGKILLs the direct
ansible-playbook process. Grandchild processes (ssh, python helpers,
glance image download workers) inherit the parent's stdout/stderr
pipes and survive the SIGKILL, so prefixOutput's bufio.Scanner blocks
forever, cmd.Wait never returns, and the orchestrator hangs until
Zuul kills the entire job at the 2 h timeout.

This was the actual cause of the molecule-aio-ovn timeout in PR 3818
CI: 41 of 42 components finished cleanly within ~30 minutes, but the
magnum goroutine remained stuck in cmd.Wait for 1 h 39 min after
the main role exited with failed=1.

Fix: start every ansible-playbook subprocess in its own process group
(Setpgid) and override exec.Cmd.Cancel to signal the whole group with
SIGKILL on context cancel, and set WaitDelay so the runtime force-
closes inherited pipes if any descendant outlives the kill. Applied
to AnsibleDeployer.runRole and to the preflight and single-tag
codepaths in the orchestrator.

Co-authored-by: Copilot <[email protected]>
Change-Id: Iaa187dfb3bf9d196f2f64113a9fc1399345b8a59
Signed-off-by: Rico Lin <[email protected]>
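
A condensed sketch of the process-group fix described above (illustrative only; the helper name and the WaitDelay value are assumptions, not the exact PR code):

```go
package deploy

import (
	"context"
	"os/exec"
	"syscall"
	"time"
)

// newAnsibleCmd starts the child in its own process group, kills the whole
// group on context cancellation, and sets WaitDelay so inherited pipes are
// force-closed if any descendant (ssh, python helpers) outlives the kill.
func newAnsibleCmd(ctx context.Context, args ...string) *exec.Cmd {
	cmd := exec.CommandContext(ctx, "ansible-playbook", args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	cmd.Cancel = func() error {
		// A negative pid signals the entire process group, including
		// grandchildren that would otherwise keep the stdout/stderr pipes open.
		return syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	}
	cmd.WaitDelay = 30 * time.Second // arbitrary value for this sketch
	return cmd
}
```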
@ricolin
Member

recheck

Both failures (csi-local-path-provisioner and keycloak) hit at the same minute (03:11 UTC) with timing-related helm hook errors (etcdserver: request timed out and helm post-install timed out waiting for the condition). Both jobs are green on 8 subsequent runs and unrelated scenarios on the same patchset (e.g. csi-rbd) passed cleanly. Looks like a transient etcd / node-pool flake, not a code regression.

@ricolin
Member

recheck

@ricolin
Member

recheck
