Commit f239aa7

docs: synchronize foundational and user documentation with failover stack
Updated README, ARCHITECTURE, and CONFIGURATION to reflect the completion of the automatic failover, quorum durability, and cluster membership work.

- Unified failover documentation by merging controlled and automatic paths into website/docs/concepts/failover.mdx.
- Added ChaosFailoverProof mission to transit-cli to provide a live verification path for ElectionMonitor.
- Integrated chaos-failover-proof into 'just screen' and the default verification suite.
- Fixed a bug in LocalEngine::is_leader that incorrectly defaulted to true when a consensus provider was configured but no handle was bound.
- Updated AGENTS.md and GUIDE.md status to match the latest implementation milestones.

• [MSN] No active missions on the board
• [EXC] Board idle, no stories queued or active
• [HLT] 2 warnings, no structural errors detected
1 parent 0306630 commit f239aa7
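
The `LocalEngine::is_leader` fix named in the commit message is not among the hunks shown on this page, so the following is only a minimal sketch of the described behavior, not the transit-core source. `EngineSketch`, `LeaseHandle`, and their fields are hypothetical names, and treating an engine with no consensus provider as writable is an assumption.

```rust
/// Hypothetical stand-in for a bound consensus handle (not the transit-core type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LeaseHandle {
    /// Whether this node currently holds the writable lease.
    pub holds_lease: bool,
}

/// Hypothetical engine state, reduced to the two facts the bug depended on.
pub struct EngineSketch {
    /// True when a consensus provider is configured for this engine.
    pub provider_configured: bool,
    /// The bound handle, if any has been attached yet.
    pub handle: Option<LeaseHandle>,
}

impl EngineSketch {
    /// Leadership check with the corrected default.
    pub fn is_leader(&self) -> bool {
        match (self.provider_configured, self.handle) {
            // Assumption: with no provider the engine is single-node and writable.
            (false, _) => true,
            // The fixed case: a provider without a bound handle must not
            // claim leadership.
            (true, None) => false,
            // Otherwise leadership follows the lease.
            (true, Some(handle)) => handle.holds_lease,
        }
    }
}
```

The point of the match is that the configured-but-unbound case gets its own arm and fails closed, instead of sharing a permissive default with the single-node case.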

15 files changed: +381 -136 lines changed


AGENTS.md

Lines changed: 4 additions & 4 deletions
@@ -51,18 +51,18 @@ Follow the formal procedural loops and checklists defined in:

 Keep `just screen` as the default human proof path. If verification gets richer, improve that path instead of making the operator memorize an expanding command list.

-## Current Status (2026-03-25)
+## Current Status (2026-03-30)

 - **Kernel Done:** Single-node local engine with branch, merge, and tiered storage verified.
 - **Server Done:** Networked daemon with framed protocol, remote CLI, and tail sessions verified.
 - **Integrity Done:** Verifiable lineage primitives, manifest roots, and checkpoints landed.
 - **Materialization Done:** Branch-aware materialization kernel and Prolly Tree snapshots landed.
-- **Consensus Slice Done:** Initial consensus kernel and leader-enforcement slice landed.
-- **Proof Ready:** `just screen` covers local, tiered, networked, integrity, and materialization end-to-end flows.
+- **Consensus Done:** Lease-backed consensus, controlled failover, and leader enforcement verified.
+- **Quorum & Failover Done:** Quorum durability mode, cluster membership, automatic leader election, and election monitoring verified. A follower can automatically acquire an expired lease and become the writable primary; the former primary is fenced.
+- **Proof Ready:** `just screen` covers local, tiered, networked, integrity, materialization, and controlled failover end-to-end flows.

 ## Next Missions

-- **Replication Planning:** Decomposing the staged multi-node replication model into voyages and ready stories.
 - **Board Hygiene:** Keeping mission/epic intent, generated artifacts, and pacemaker state aligned with execution.
 - **Client Libs:** Promoting external usage via a dedicated Rust client first; other language bindings can follow later.

ARCHITECTURE.md

Lines changed: 16 additions & 5 deletions
@@ -125,7 +125,7 @@ Server mode exposes the same storage engine behind a network API:

 The server should not invent a second storage format or branch model.

-The first implementation step is a thin daemon bootstrap that opens the shared engine and binds a listener. The current server slice now layers provisional remote root creation, append, read, snapshot-tail, branch creation, merge creation, and lineage inspection operations on top of that bootstrap, wrapped in a framed request/response envelope with correlation IDs plus explicit acknowledgement and error semantics. Tail streaming now uses logical session IDs with `open/poll/cancel` operations and credit-based delivery so the semantics do not collapse into one socket or underlay assumption. The first CLI client surface now mirrors those remote workflows directly, while richer client surfaces remain downstream. The public server surface remains explicitly single-node. Separately, the shared engine now has a bounded controlled failover slice for published-frontier readiness, explicit lease handoff, and former-primary fencing, but that slice still sits below quorum acknowledgement, automatic election, and multi-primary behavior.
+The first implementation step is a thin daemon bootstrap that opens the shared engine and binds a listener. The current server slice now layers provisional remote root creation, append, read, snapshot-tail, branch creation, merge creation, and lineage inspection operations on top of that bootstrap, wrapped in a framed request/response envelope with correlation IDs plus explicit acknowledgement and error semantics. Tail streaming now uses logical session IDs with `open/poll/cancel` operations and credit-based delivery so the semantics do not collapse into one socket or underlay assumption. The first CLI client surface now mirrors those remote workflows directly, while richer client surfaces remain downstream. The shared engine now has a full failover stack: controlled handoff, automatic leader election via `ElectionMonitor`, quorum-based durability, and cluster membership. Multi-primary behavior remains explicitly out of scope.

 The transport boundary is also explicit: `transit` defines an application protocol above the transport layer. TCP, QUIC, or other ordinary transports can carry that protocol, and secure meshes such as WireGuard remain optional deployment underlays rather than protocol replacements.

@@ -182,7 +182,8 @@ Suggested durability modes:

 - `memory`: acknowledged after in-memory acceptance, for tests only
 - `local`: acknowledged after local durable write
-- `replicated`: acknowledged after the published handoff frontier is durable enough for read-only replica catch-up and promotion readiness; it does not imply follower hydration, quorum acknowledgement, or automatic failover
+- `replicated`: acknowledged after the published handoff frontier is durable enough for read-only replica catch-up and promotion readiness
+- `quorum`: acknowledged after a majority of configured cluster peers have confirmed receipt
 - `tiered`: acknowledged only after the relevant segment state is durable in the remote tier

 ## Read Path
@@ -228,9 +229,18 @@ The initial reference model should be simple:
 - acknowledged records are immutable
 - recovery must never expose unacknowledged bytes as committed history

-The first controlled failover slice preserves that model by allowing a caught-up read-only replica to become the writable primary only through explicit lease handoff. Former primaries are fenced after handoff so stale leaders cannot continue acknowledged writes. This slice remains explicitly below quorum acknowledgement, election, and multi-primary behavior.
+The failover model preserves that invariant through three complementary components:

-Any move toward replicated or multi-writer semantics must preserve those invariants and define conflict rules directly.
+- **ClusterMembership:** Nodes discover each other and maintain heartbeats to calculate quorum size.
+- **ElectionMonitor:** A background worker that polls for lease expiration and triggers automatic leader election via the `ConsensusProvider`.
+- **ObjectStoreConsensus:** A provider that uses optimistic locking on the remote tier to ensure that only one node can acquire a writable lease at a time.
+
+Failover is supported through two paths:
+
+- **Controlled failover:** A caught-up read-only replica becomes the writable primary through explicit lease handoff. The former primary is fenced after handoff.
+- **Automatic failover:** The `ElectionMonitor` detects primary failure (lease expiry) and triggers automatic acquisition by an eligible follower.
+
+Both paths preserve the one-writer-per-stream-head invariant. Multi-primary behavior remains explicitly out of scope.

 ## Processing And Materialization

@@ -326,7 +336,8 @@ That means lineage metadata should be cheap to create and easy to query.

 These areas are important but should stay explicit future work until designed:

-- distributed consensus and cross-node replication
+- multi-primary or multi-writer semantics
+- dynamic cluster rebalancing and automatic data sharding
 - compaction or projection layers above immutable history
 - authn/authz and multi-tenant isolation
 - query surfaces beyond ordered log replay and tailing
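
The automatic failover path added to ARCHITECTURE.md above (lease expiry detected by a poller, then acquisition through optimistic locking so that only one node wins) can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the `ObjectStoreConsensus` or `ElectionMonitor` implementation: `Lease`, `LeaseStore`, `observe_expired`, `acquire_at`, and `is_writable` are hypothetical names, and a version counter stands in for whatever conditional-write mechanism the remote tier actually provides.

```rust
/// Hypothetical lease record as it might be stored in the remote tier.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Lease {
    pub holder: String,
    pub expires_at_ms: u64,
    /// Bumped on every successful write; used for optimistic locking.
    pub version: u64,
}

/// In-memory stand-in for the shared lease object (assumption: a real
/// provider would use a conditional write against object storage).
#[derive(Debug, Default)]
pub struct LeaseStore {
    current: Option<Lease>,
}

impl LeaseStore {
    pub fn read(&self) -> Option<Lease> {
        self.current.clone()
    }

    /// Compare-and-swap: the write lands only if the stored version still
    /// equals the version the caller observed (0 means "no lease yet").
    pub fn put_if_version(&mut self, expected: u64, new: Lease) -> bool {
        let stored = self.current.as_ref().map_or(0, |lease| lease.version);
        if stored != expected {
            return false;
        }
        self.current = Some(new);
        true
    }
}

/// One monitor poll: returns the observed version if the lease is absent
/// or expired, or `None` while the current lease is still live.
pub fn observe_expired(store: &LeaseStore, now_ms: u64) -> Option<u64> {
    match store.read() {
        None => Some(0),
        Some(lease) if now_ms >= lease.expires_at_ms => Some(lease.version),
        Some(_) => None,
    }
}

/// Attempt acquisition against a previously observed version. A node that
/// lost the race holds a stale version and is rejected.
pub fn acquire_at(
    store: &mut LeaseStore,
    observed_version: u64,
    node_id: &str,
    now_ms: u64,
    lease_ms: u64,
) -> bool {
    let next = Lease {
        holder: node_id.to_string(),
        expires_at_ms: now_ms + lease_ms,
        version: observed_version + 1,
    };
    store.put_if_version(observed_version, next)
}

/// Fencing check: only the current, unexpired holder may acknowledge writes.
pub fn is_writable(store: &LeaseStore, node_id: &str, now_ms: u64) -> bool {
    matches!(store.read(), Some(lease) if lease.holder == node_id && now_ms < lease.expires_at_ms)
}
```

In this model two followers can both observe the same expired version, but only the first `acquire_at` call succeeds. The second is rejected on the version mismatch, and the former primary fails `is_writable` once its lease has lapsed.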

CONFIGURATION.md

Lines changed: 9 additions & 6 deletions
@@ -148,15 +148,18 @@ The current bootstrap implementation wires `data_dir` through `transit server ru

 ### `[replication]`

-Replication is deferred scope, but the config surface should be explicit once introduced.
+Replication and failover settings for clustered deployments.

 | Key | Type | Default | Description |
 |-----|------|---------|-------------|
-| `mode` | String | `"single-node"` | Initial deployment model. |
-| `sync_quorum` | Integer | `1` | Number of nodes required for future replicated ack. |
-| `peer_urls` | Array | `[]` | Planned peer list for future replicated topologies. |
-
-Until replication exists, `single-node` should remain the only supported value.
+| `mode` | String | `"single-node"` | Deployment model: `single-node` or `cluster`. |
+| `node_id` | String | null | Unique identity for this node (overrides `[node].id`). |
+| `consensus_root` | String | null | Object-store path used for shared leases and elections. |
+| `lease_duration_secs` | Integer | `10` | TTL for the primary lease. |
+| `election_poll_interval_ms` | Integer | `1000` | How often the `ElectionMonitor` checks lease health. |
+| `quorum_size` | Integer | `1` | Number of nodes required for `quorum` durability. |
+
+`durability` mode `quorum` depends on these settings to discover peers and calculate the required majority.

 ### `[telemetry]`
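
To make the new `[replication]` defaults concrete, the table above can be mirrored as a plain Rust struct. This is a minimal sketch, not the transit configuration type: `ReplicationConfig`, `polls_per_lease`, and `majority` are hypothetical names, only the keys and default values come from the table, and the majority formula is the usual floor(n/2) + 1, which is an assumption about how the required majority is derived.

```rust
/// Hypothetical mirror of the `[replication]` table; field names follow the
/// documented keys, defaults follow the documented default column.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReplicationConfig {
    pub mode: String,
    pub node_id: Option<String>,
    pub consensus_root: Option<String>,
    pub lease_duration_secs: u64,
    pub election_poll_interval_ms: u64,
    pub quorum_size: usize,
}

impl Default for ReplicationConfig {
    fn default() -> Self {
        Self {
            mode: "single-node".to_string(),
            node_id: None,
            consensus_root: None,
            lease_duration_secs: 10,
            election_poll_interval_ms: 1000,
            quorum_size: 1,
        }
    }
}

impl ReplicationConfig {
    /// How many monitor polls fit inside one lease TTL; `None` if the poll
    /// interval is zero. Useful as a sanity check that expiry is noticed
    /// well within a lease period.
    pub fn polls_per_lease(&self) -> Option<u64> {
        (self.lease_duration_secs * 1000).checked_div(self.election_poll_interval_ms)
    }
}

/// Assumed majority rule for a cluster of `cluster_size` nodes.
pub fn majority(cluster_size: usize) -> usize {
    cluster_size / 2 + 1
}
```

With the documented defaults the monitor polls ten times per lease period, and a three-node cluster would need two acknowledgements under the assumed majority rule.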

GUIDE.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ Examples:

 At the current bootstrap stage, the shared-engine server exposes provisional remote root creation, append, read, snapshot-tail, branch creation, merge creation, and lineage inspection operations through `transit-core::server::RemoteClient`. The first wire shape now includes request correlation plus explicit acknowledgement and error envelopes, and the first tail-session model now uses logical `open/poll/cancel` operations with credit-based delivery rather than assuming one long-lived socket. The `transit server` CLI namespace now mirrors those workflows directly with `create-root`, `append`, `read`, `branch`, `merge`, `lineage`, `tail-open`, `tail-poll`, and `tail-cancel`. The surface is still explicitly single-node; replication-aware behavior is downstream work, and secure transports such as WireGuard remain optional underlays instead of becoming the `transit` protocol.

-The first replicated failover slice is proof-oriented and intentionally bounded. A caught-up read-only replica can be promoted through an explicit lease handoff, the former primary is fenced, and the proof surface makes the non-claims explicit: this is not quorum acknowledgement, automatic election, or multi-primary behavior.
+The failover stack now supports two paths. **Controlled failover** lets an operator explicitly hand off the writable lease from a primary to a caught-up read-only replica. **Automatic failover** uses an `ElectionMonitor` that detects an expired primary lease and triggers a follower to acquire it via optimistic locking; only one node wins the race, and the former primary is fenced. The engine also supports a `quorum` durability mode where appends block until a majority of configured cluster peers have acknowledged, and a `ClusterMembership` surface for node discovery and quorum calculation. Multi-primary behavior remains explicitly out of scope.

 ## Modeling Conversations

Justfile

Lines changed: 3 additions & 1 deletion
@@ -28,7 +28,7 @@ screen:
 cd "$repo_root"

 rm -rf "$screen_root"
-mkdir -p "$screen_root/object-store" "$screen_root/local-engine" "$screen_root/integrity" "$screen_root/materialization" "$screen_root/tiered-engine" "$screen_root/controlled-failover" "$screen_root/networked-server"
+mkdir -p "$screen_root/object-store" "$screen_root/local-engine" "$screen_root/integrity" "$screen_root/materialization" "$screen_root/tiered-engine" "$screen_root/controlled-failover" "$screen_root/chaos-failover" "$screen_root/networked-server"

 announce "Build workspace"
 just build
@@ -38,6 +38,8 @@ screen:
 just run mission tiered-engine-proof --root "$screen_root/tiered-engine"
 announce "Prove controlled failover"
 just run mission controlled-failover-proof --root "$screen_root/controlled-failover"
+announce "Prove chaos failover"
+just run mission chaos-failover-proof --root "$screen_root/chaos-failover"
 announce "Prove networked server"
 just run mission networked-server-proof --root "$screen_root/networked-server"
 announce "Prove integrity proof"

README.md

Lines changed: 8 additions & 10 deletions
@@ -93,16 +93,14 @@ This repository is at the bootstrap stage.

 Today it contains:

-- a Rust workspace with `transit-core` and `transit-cli`
-- a Nix flake and Rust toolchain bootstrap
-- a `Justfile` with a human-facing `just screen` verification path for local-engine proof, tiered publication/restore proof, controlled failover proof, networked single-node server proof, integrity proof, materialization proof, object-store probing, and the current Keel board view
-- a local durable engine that can append, replay, branch, merge, recover from trailing uncommitted active-head bytes, publish rolled immutable segments to object storage, and cold-restore published history from remote manifests
-- an initial shared-engine server bootstrap that can open the same local engine, bind a daemon listener, shut down deterministically, and serve provisional remote root creation, append/read/tail, branch/merge, and lineage-inspection operations through a framed request/response envelope with correlation IDs, explicit acknowledgement and error semantics, and logical tail sessions with credit-based delivery, without introducing a second storage path
-- a first CLI client surface for remote root creation, append, read, branch, merge, lineage inspection, and logical tail-session workflows
-- a first Rust client library surface plus a native proof example that exercises create_root, append, read, tail, branch, merge, and lineage against a locally started server
-- a first controlled failover proof path that shows promotion readiness, explicit lease handoff, and former-primary fencing while keeping local, replicated, tiered, quorum, and multi-primary guarantees explicit
-- a first networked mission proof path that validates the live single-node server and keeps the `transit` protocol explicitly distinct from optional secure underlays such as WireGuard
-- an initial `object_store` integration with a filesystem probe command
+- **Local Engine:** Durable local append, replay, branch, merge, and crash recovery with trailing-byte truncation.
+- **Tiered Storage:** Native publication to object storage and cold restore from remote manifests.
+- **Failover Stack:** Controlled handoff with lease fencing, automatic leader election via `ElectionMonitor`, and quorum-based durability.
+- **Networked Server:** Single-node daemon bootstrap with a framed request/response protocol and logical tail sessions.
+- **Integrity:** Staged verification from checksums to manifest roots and lineage checkpoints.
+- **Materialization:** Incremental processing with Prolly Tree snapshots and checkpoint-based resume.
+- **Clients:** Native Rust client library and a feature-complete CLI for operations and proofs.
+- **Verification:** A unified `just screen` path that runs the full suite of human-verifiable missions.

 The implementation work now has a real scaffold to grow from instead of needing to reverse-engineer direction later.
