Before writing any code, read these three documents — they are the source of truth:
- docs/HARNESS.md — 4-layer verification loop, expected outputs, Tcl harness pitfalls, command addition checklist
- docs/RUST_STYLE.md — Rust coding standards: error handling, file size limits, clone avoidance, iterator patterns, assertion requirements, checked arithmetic
- docs/DST_GUIDE.md — Deterministic simulation testing methodology: SimulatedRng, VirtualTime, buggify fault injection, shadow state comparison
Violations of RUST_STYLE.md or DST_GUIDE.md will be caught by code review and CI. Do not skip them.
Invoke these skills to inject domain expertise before working on specific areas:
| Skill | When to use |
|---|---|
| /distributed-systems | Before working on replication, gossip, CRDTs, or multi-node code |
| /actor-model | Before working on sharded_actor, connection handler, or message passing |
| /dst | Before writing or modifying DST tests, fault injection, or shadow state |
| /tigerstyle | Before modifying executor or data structure code (assertions, checked arithmetic) |
| /rust-dev | Before writing any Rust code in this project (style, conventions, patterns) |
| /formal-verification | Before working on TLA+ specs, Stateright models, Kani proofs, or Maelstrom |
Skills are defined in .claude/agents/ and are invocable via /skill-name in Claude Code.
For EVERY code change, follow this order:
Before writing any code:
- Identify which components are affected
- Check if new I/O abstractions are needed for testability
- Determine if actor boundaries need modification
- Document the approach in a brief plan
Every function/method must have:
- `debug_assert!` for preconditions at entry
- `debug_assert!` for postconditions/invariants after mutations
- Checked arithmetic (use `checked_add`, `checked_sub`, etc.)
- Explicit error handling (no `.unwrap()` in production paths)
- Early returns for error cases
Every new component must be:
- Simulatable: All I/O through trait abstractions
- Deterministic: No hidden randomness, use `SimulatedRng`
- Time-controllable: Use `VirtualTime`, not `std::time`
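A minimal sketch of a component that satisfies all three constraints. The `Clock` and `Rng` traits here are simplified stand-ins for the abstractions in src/io/mod.rs, and `ExpiryChecker` is a hypothetical component, not project code:

```rust
use std::time::Duration;

// Simplified stand-ins for the project's I/O abstractions.
pub trait Clock {
    fn now_millis(&self) -> u64;
}

pub trait Rng {
    fn next_u64(&mut self) -> u64;
}

// Generic over its I/O, so tests can plug in VirtualTime / SimulatedRng
// while production wires up the real clock and RNG.
pub struct ExpiryChecker<C: Clock, R: Rng> {
    pub clock: C,
    pub rng: R,
}

impl<C: Clock, R: Rng> ExpiryChecker<C, R> {
    // Time comes only from the injected clock, never std::time.
    pub fn is_expired(&self, deadline_millis: u64) -> bool {
        self.clock.now_millis() >= deadline_millis
    }

    // Jitter comes only from the injected RNG, never thread_rng().
    pub fn jitter(&mut self, base: Duration) -> Duration {
        base + Duration::from_millis(self.rng.next_u64() % 100)
    }
}
```

Because the component never names a concrete clock or RNG, a test can freeze time or fix the random stream and get the same result on every run.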
- Unit tests for pure logic
- DST tests with fault injection for I/O components
- Multiple seeds (minimum 10) for simulation tests
- Redis equivalence tests for command behavior changes
- Use Zipfian distribution (not uniform) for realistic workload tests
- State invariants must be checkable at any point
- Add `verify_invariants()` methods to stateful structs
- Run invariant checks in debug builds after every mutation
Note: VOPR (Viewstamped Operation Replicator) is TigerBeetle's specific DST tool for testing their Viewstamped Replication consensus protocol — it is NOT a synonym for DST in general. Our DST methodology follows the FoundationDB approach (seed-based determinism, simulated I/O, fault injection). The verify_invariants() pattern comes from TigerStyle / Design-by-Contract, not VOPR.
This project follows Simulation-First Development inspired by FoundationDB and TigerBeetle. The core principle: if you can't simulate it, you can't test it properly.
All I/O operations must go through abstractions that can be:
- Simulated: Deterministic, controllable behavior
- Fault-injected: Network partitions, disk failures, message drops
- Time-controlled: Fast-forward time, test timeout scenarios
```rust
// GOOD: I/O through trait abstraction
trait ObjectStore: Send + Sync {
    async fn put(&self, key: &str, data: &[u8]) -> Result<()>;
    async fn get(&self, key: &str) -> Result<Vec<u8>>;
}

// BAD: Direct I/O that can't be simulated
std::fs::write(path, data)?;
```

Components communicate via message passing, not shared mutable state:
```rust
// GOOD: Actor owns state exclusively
struct PersistenceActor {
    state: PersistenceState, // Owned, not shared
    rx: mpsc::Receiver<Message>,
}

// BAD: Shared state with mutex
struct SharedPersistence {
    state: Arc<Mutex<PersistenceState>>, // Contention, hard to test
}
```

Assertions (REQUIRED for every mutation):
```rust
// GOOD: Assert preconditions and postconditions
fn hincrby(&mut self, field: &str, increment: i64) -> Result<i64> {
    debug_assert!(!field.is_empty(), "Precondition: field must not be empty");
    let new_value = self.value.checked_add(increment)
        .ok_or_else(|| Error::Overflow)?;
    self.value = new_value;
    debug_assert_eq!(self.get(field), Some(new_value),
        "Postcondition: value must equal computed result");
    Ok(new_value)
}

// BAD: No assertions, silent failures
fn hincrby(&mut self, field: &str, increment: i64) -> i64 {
    self.value += increment; // Can overflow!
    self.value
}
```

Checked Arithmetic (REQUIRED):
- Use `checked_add`, `checked_sub`, `checked_mul`, `checked_div`
- Return explicit errors on overflow, never wrap silently
- Use `saturating_*` only when saturation is the correct behavior
Control Flow:
- Prefer early returns, avoid deep nesting
- No hidden allocations - be explicit about `Vec::with_capacity`
- No panics in production paths, explicit `Result<T, E>`
Naming:
- Functions that can fail: return `Result<T, E>`
- Functions that assert: use `debug_assert!`
- Unsafe blocks: document why they're safe
Every stateful struct should be verifiable:
```rust
// GOOD: Struct with verification method
impl RedisSortedSet {
    /// Verify all invariants hold - call in debug builds after mutations
    fn verify_invariants(&self) {
        debug_assert_eq!(
            self.members.len(),
            self.sorted_members.len(),
            "Invariant: members and sorted_members must have same length"
        );
        debug_assert!(
            self.is_sorted(),
            "Invariant: sorted_members must be sorted by (score, member)"
        );
        for (member, _) in &self.sorted_members {
            debug_assert!(
                self.members.contains_key(member),
                "Invariant: every sorted member must exist in members map"
            );
        }
    }

    pub fn add(&mut self, member: String, score: f64) {
        // ... mutation logic ...
        #[cfg(debug_assertions)]
        self.verify_invariants();
    }
}
```

Design-by-Contract Checklist for New Structs:
- Define all invariants in comments
- Implement a `verify_invariants()` method
- Call verification after every public mutation method
- Add `#[cfg(debug_assertions)]` to avoid release overhead
Systems must remain stable under partial failures:
- Graceful degradation when dependencies fail
- Bounded queues with backpressure
- Timeouts on all external operations
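These properties can be illustrated with std primitives alone. This is a sketch, not project code: the real server would use the tokio equivalents (a bounded `mpsc::channel(cap)` plus `tokio::time::timeout`), and `bounded_queue_demo` is an illustrative name:

```rust
use std::sync::mpsc::{sync_channel, RecvTimeoutError, TrySendError};
use std::time::Duration;

/// Demonstrates a bounded queue (backpressure) and a timed receive.
/// Returns (hit_backpressure, timed_out).
pub fn bounded_queue_demo() -> (bool, bool) {
    // Bounded queue: try_send fails fast instead of growing an
    // unbounded backlog when the consumer falls behind.
    let (tx, rx) = sync_channel::<u32>(2);
    assert!(tx.try_send(1).is_ok());
    assert!(tx.try_send(2).is_ok());
    let hit_backpressure = matches!(tx.try_send(3), Err(TrySendError::Full(_)));

    // Drain the queue, then show a bounded wait instead of blocking forever.
    assert!(matches!(rx.recv(), Ok(1)));
    assert!(matches!(rx.recv(), Ok(2)));
    let timed_out = matches!(
        rx.recv_timeout(Duration::from_millis(50)),
        Err(RecvTimeoutError::Timeout)
    );
    (hit_backpressure, timed_out)
}
```

The producer learns about overload immediately (shed load, retry, or degrade) instead of silently queueing work it can never finish.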
Read docs/HARNESS.md first. It documents the full 4-layer verification loop, expected outputs, Tcl harness pitfalls, and the command addition checklist. Everything below is supplementary.
Priority Order (most to least important):
```rust
#[test]
fn test_with_fault_injection() {
    for seed in 0..50 { // Minimum 10 seeds, prefer 50+
        let mut harness = SimulationHarness::new(seed);
        harness.enable_chaos(); // Random faults

        // Run test scenario
        harness.run_scenario(|ctx| async {
            // Test logic here
        });

        harness.verify_invariants();
    }
}
```

DST tests MUST:
- Run with multiple seeds (minimum 10, ideally 50+)
- Include fault injection (network drops, delays, failures)
- Verify invariants after each operation
- Be deterministic given the same seed
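The determinism requirement boils down to: same seed, same stream. As a toy illustration (not the project's `SimulatedRng`, which is the real abstraction), here is a seed-deterministic linear congruential generator using Knuth's MMIX constants:

```rust
/// Toy seed-deterministic RNG: a linear congruential generator.
/// Illustrative only; use SimulatedRng in real DST tests.
pub struct SimRng {
    state: u64,
}

impl SimRng {
    pub fn new(seed: u64) -> Self {
        // Avoid the all-zero fixed point without changing other seeds.
        Self { state: seed.max(1) }
    }

    pub fn next_u64(&mut self) -> u64 {
        // Knuth MMIX LCG constants; wrapping arithmetic is intentional
        // here (the generator works modulo 2^64 by design).
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state
    }
}
```

Two instances built from the same seed emit identical streams forever, which is exactly what makes a failing simulation reproducible from its seed alone.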
- Test pure logic without I/O
- Use `InMemoryObjectStore` for storage tests
- Include edge cases: empty inputs, max values, overflow
- Test component interactions
- Use real async runtime but simulated I/O
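The kind of in-memory store these storage tests rely on can be sketched as follows. This version is synchronous for brevity (the project's actual `ObjectStore` trait is async), and `InMemoryStore` is an illustrative stand-in for `InMemoryObjectStore`:

```rust
use std::collections::HashMap;

/// Simplified, synchronous sketch of an in-memory object store for tests.
#[derive(Default)]
pub struct InMemoryStore {
    objects: HashMap<String, Vec<u8>>,
}

impl InMemoryStore {
    /// Store a copy of the payload under the key.
    pub fn put(&mut self, key: &str, data: &[u8]) {
        self.objects.insert(key.to_string(), data.to_vec());
    }

    /// Borrow the stored payload, if present.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.objects.get(key).map(|v| v.as_slice())
    }
}
```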
Run the official Redis test suite against our implementation:
```bash
./scripts/run-redis-compat.sh                             # default test files
./scripts/run-redis-compat.sh unit/type/string            # specific test file
./scripts/run-redis-compat.sh unit/type/incr unit/expire  # multiple files
```

IMPORTANT: Tcl Harness Pitfalls (read before modifying commands)
The Tcl harness crashes the entire test file on unimplemented commands. A single `ERR unknown command 'FOO'` kills all remaining tests in that file. This means:
- Adding a stub that returns an error is better than not parsing at all. Parse unknown commands as `Command::Unknown(name)` so they return a clean error instead of crashing the RESP parser.
- Error message format matters. The Tcl tests use `assert_error "*pattern*"` with glob matching. Redis uses these exact formats:
  - `ERR wrong number of arguments for 'xxx' command`
  - `ERR value is not an integer or out of range`
  - `ERR value is not a valid float`
  - `ERR syntax error`
  - `WRONGTYPE Operation against a key holding the wrong kind of value`
- Double ERR prefix bug. `connection_optimized.rs::encode_error_into` prepends `ERR` to messages. If your parser error string already starts with `ERR`, the client sees `ERR ERR ...`. The function now checks for this, but keep it in mind.
- `perf_config.toml` affects tests. The root config has `num_shards = 1`, which is required for Tcl tests (MULTI/EXEC and Lua scripts need all keys on one shard). The `docker-benchmark/perf_config.toml` has `num_shards = 16` for throughput. Do not change the root config to multi-shard without understanding the consequences for transactions and Lua.
- `max_size` in `perf_config.toml` limits request size. It was previously 1MB, which caused `string.tcl` to crash on 4MB payloads. It is now 512MB. If you see "buffer overflow" in server logs, check this value.
- MULTI/EXEC state lives at the connection level (`connection_optimized.rs`), not in the per-shard executor. The executor still has transaction state for the simulation/DST path, but the production server intercepts MULTI/EXEC/DISCARD/WATCH before routing to shards.
- WATCH uses GET-snapshot comparison. At WATCH time, the connection does `GET key` and stores the result. At EXEC time, it does `GET key` again and compares. If different, EXEC returns a nil array. This is correct but differs from Redis's internal dirty-flag approach.
- Shard-aggregated commands. DBSIZE, SCAN, KEYS, EXISTS, FLUSHDB/FLUSHALL, and DEL are handled specially in `sharded_actor.rs` to fan out across all shards. If you add a new command that needs to see all keys, add aggregation there.
- The TIME command returns real wall-clock time via `SystemTime::now()` at the sharded_actor level, not virtual time from the executor.
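The error-format and `map_err` points above can be sketched in a few lines. `parse_increment` is a hypothetical helper, but the error string is the exact Redis format the Tcl tests glob-match:

```rust
/// Hypothetical parsing helper showing the map_err pattern.
fn parse_increment(raw: &str) -> Result<i64, String> {
    raw.parse::<i64>()
        // A raw Rust error ("invalid digit found in string") would fail
        // the Tcl assert_error patterns; map to the exact Redis string.
        .map_err(|_| "ERR value is not an integer or out of range".to_string())
}
```

Note the error string already carries the `ERR` prefix, so the double-prefix check in `encode_error_into` matters for strings like this one.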
Current Tcl compatibility status:
| Suite | Pass | Blocker |
|---|---|---|
| unit/type/incr | 28/28 | - |
| unit/expire | all | - |
| unit/type/string | 72 pass, 0 err | LCS command not implemented (crashes test file) |
| unit/multi | 20/56 | SWAPDB (database swapping not implemented) |
| ACL (--features acl) | 536/536 lib tests | ACL DST + 21 unit tests pass; Tcl acl.tcl/acl-v2.tcl blocked on ACL LOG, DRYRUN |
To add a new command without breaking the harness:
- Add a variant to the `Command` enum in `command.rs`
- Update the `get_primary_key()`, `get_keys()`, `name()`, and `is_read_only()` match arms
- Add parsing in BOTH `parser.rs` AND `commands.rs` (zero-copy parser). Use `map_err` to return Redis-compatible error strings, never raw Rust parse errors.
- Add execution in the `executor/mod.rs` dispatch + the appropriate `*_ops.rs` file
- Add DST coverage in `executor_dst.rs` with shadow state tracking (see `/dst` skill section 8)
- If the command needs all-shard visibility, add aggregation in `sharded_actor.rs`
- If adding `keepttl` or similar fields to `Command::Set`, update ALL struct literal constructions (there are ~25+ across test files, script_ops.rs, etc.)
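A condensed sketch of the first two checklist steps, using a hypothetical STRLEN-style variant. The real `Command` enum in command.rs has many more variants and match arms; this only shows the shape of the change:

```rust
/// Drastically reduced Command enum for illustration.
pub enum Command {
    Get { key: String },
    Strlen { key: String }, // the newly added variant
}

impl Command {
    pub fn name(&self) -> &'static str {
        match self {
            Command::Get { .. } => "GET",
            Command::Strlen { .. } => "STRLEN", // new match arm
        }
    }

    pub fn get_primary_key(&self) -> Option<&str> {
        match self {
            // New variant added to the existing key-bearing arm.
            Command::Get { key } | Command::Strlen { key } => Some(key.as_str()),
        }
    }

    pub fn is_read_only(&self) -> bool {
        match self {
            Command::Get { .. } | Command::Strlen { .. } => true,
        }
    }
}
```

Because every match is exhaustive, the compiler flags each arm that still needs the new variant, which is what makes this step mechanical rather than error-prone.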
- Use Maelstrom for distributed correctness
- Test under network partitions
- Verify consistency guarantees
- Run: `./maelstrom/maelstrom test -w lin-kv --bin ./target/release/maelstrom_kv_replicated`
Use Zipfian distribution for key access patterns in DST tests:
```rust
// GOOD: Realistic hot/cold key pattern
KeyDistribution::Zipfian { num_keys: 100, skew: 1.0 }

// BAD: Uniform distribution (unrealistic)
KeyDistribution::Uniform { num_keys: 100 }
```

With skew=1.0, the top 10 keys receive ~40% of accesses (like real workloads).
| File | Purpose |
|---|---|
| src/io/mod.rs | I/O abstractions (Clock, Network, RNG) |
| src/simulator/ | DST harness and fault injection |
| src/simulator/dst_integration.rs | DST with Zipfian distribution |
| src/streaming/simulated_store.rs | Fault-injectable object store |
| src/buggify/ | Probabilistic fault injection |
| tests/redis-tests/ | Official Redis Tcl test suite (git submodule) |
| scripts/run-redis-compat.sh | Wrapper to run Tcl tests against our server |
| src/security/acl/ | ACL system (users, commands, patterns, file) |
| src/security/acl_dst.rs | ACL DST harness with shadow state |
| tests/acl_dst_test.rs | ACL DST multi-seed integration tests |
| src/streaming/wal.rs | WAL entry format, writer, reader, rotator |
| src/streaming/wal_store.rs | WAL storage trait + local/simulated impls |
| src/streaming/wal_actor.rs | WAL group commit actor (turbopuffer pattern) |
| src/streaming/wal_config.rs | WAL configuration (fsync policies, rotation) |
| src/streaming/wal_dst.rs | WAL DST harness with crash/recovery testing |
| tests/wal_dst_test.rs | WAL DST multi-seed integration tests |
```rust
// 1. Define trait for the I/O operation
pub trait ObjectStore: Send + Sync + 'static {
    fn put(&self, key: &str, data: &[u8]) -> impl Future<Output = IoResult<()>>;
}

// 2. Create production implementation
pub struct LocalFsObjectStore { path: PathBuf }

// 3. Create simulated implementation with fault injection
pub struct SimulatedObjectStore {
    inner: InMemoryObjectStore,
    fault_config: FaultConfig,
    rng: SimulatedRng,
}

// 4. Use generic in component
pub struct StreamingPersistence<S: ObjectStore> {
    store: S,
    // ...
}
```

```rust
enum Message {
    DoWork(Work),
    Shutdown { response: oneshot::Sender<()> },
}

async fn run(mut self) {
    while let Some(msg) = self.rx.recv().await {
        match msg {
            Message::DoWork(w) => self.handle_work(w).await,
            Message::Shutdown { response } => {
                self.cleanup().await;
                let _ = response.send(());
                break;
            }
        }
    }
}
```

When you need to connect sync code (like command execution) to async actors:
```rust
// Use std::sync::mpsc for fire-and-forget from sync context
let (tx, rx) = std::sync::mpsc::channel();

// Bridge task drains sync channel into async actor
async fn bridge(rx: Receiver<Delta>, actor: ActorHandle) {
    loop {
        // recv_timeout returns Result (Ok, not Some); the timeout keeps
        // the loop responsive to shutdown
        if let Ok(delta) = rx.recv_timeout(Duration::from_millis(50)) {
            actor.send(delta);
        }
        if shutdown.load(Ordering::SeqCst) {
            break;
        }
    }
}
```

- Seed-based reproduction: All simulations use seeds. Save failing seeds to reproduce bugs.
- Trace logging: Use the `tracing` crate with structured logging.
- Invariant checks: Add assertions for state invariants after each operation.

- Batch operations: Accumulate deltas in a write buffer before flushing
- Async I/O: Use `tokio::spawn` for background operations
- Zero-copy where possible: Use `Bytes` for large data transfers
- Profile before optimizing: Use `cargo flamegraph`
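The batching point can be illustrated with a toy write buffer that flushes on a size threshold. `WriteBuffer` and its fields are illustrative names, not project types, and real code would perform the async I/O inside `flush`:

```rust
/// Toy delta batcher: accumulates until flush_at, then flushes as a batch.
pub struct WriteBuffer {
    pending: Vec<String>,
    flush_at: usize,
    pub flushes: usize, // exposed so the batching behavior is observable
}

impl WriteBuffer {
    pub fn new(flush_at: usize) -> Self {
        Self {
            pending: Vec::with_capacity(flush_at), // explicit allocation
            flush_at,
            flushes: 0,
        }
    }

    /// Accumulate a delta; flush only when the batch threshold is reached.
    pub fn push(&mut self, delta: String) {
        self.pending.push(delta);
        if self.pending.len() >= self.flush_at {
            self.flush();
        }
    }

    pub fn flush(&mut self) {
        if self.pending.is_empty() {
            return; // nothing to write; avoid empty I/O calls
        }
        // Real code would hand the batch to async I/O here.
        self.pending.clear();
        self.flushes += 1;
    }
}
```

Ten single-delta writes collapse into two batched flushes (plus one final partial flush), which is the write-amplification win batching buys.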
All performance comparisons MUST use Docker containers to ensure fair, reproducible results.
| Concern | Docker Solution |
|---|---|
| Resource fairness | Both servers get identical CPU/memory limits |
| Environment consistency | Same OS, kernel, networking stack |
| Reproducibility | Anyone can run identical benchmarks |
| Apples-to-apples | Eliminates "my machine is faster" bias |
Location: docker-benchmark/
| Setting | Value | Rationale |
|---|---|---|
| CPU Limit | 2 cores | Realistic server constraint |
| Memory Limit | 1GB | Realistic server constraint |
| Network | Host networking | Eliminates Docker NAT overhead |
| Requests | 100,000 | Statistically significant |
| Clients | 50 concurrent | Realistic load |
| Pipeline depths | 1, 16, 64 | Tests different access patterns |
```bash
# REQUIRED for any performance claims
cd docker-benchmark
./run-benchmarks.sh
# This runs:
# 1. Official Redis 7.4 (port 6379)
# 2. Rust implementation (port 3000)
# 3. Both non-pipelined and pipelined tests
```

```bash
# DON'T use these for Redis vs Rust comparisons:
cargo run --release --bin quick_benchmark  # Local only, no Redis comparison
cargo run --release --bin benchmark        # Local only, unfair conditions
```

The local benchmarks (quick_benchmark, benchmark) are useful for:
- Quick smoke tests during development
- Profiling and optimization work
- Regression detection between commits
But they MUST NOT be used for Redis vs Rust performance claims in documentation.
When updating performance numbers:
- Always run Docker benchmarks - never use local numbers for comparisons
- Run multiple times - take median of 3 runs to reduce variance
- Document the environment - include Docker version, host OS
- Include all pipeline depths - P=1, P=16, P=64 tell different stories
```bash
# Example workflow for updating benchmarks
cd docker-benchmark
for i in 1 2 3; do
  echo "=== Run $i ==="
  ./run-benchmarks.sh 2>&1 | tee run_$i.log
done
# Use median results in BENCHMARK_RESULTS.md
```

Every commit must be fully tested and documented:
- Full Test Suite: Run `cargo test --release` - all tests must pass
- DST Tests: Run streaming and simulator DST tests with multiple seeds
- Local Smoke Test: Run `cargo run --release --bin quick_benchmark` for quick validation
- Docker Benchmarks: Run `docker-benchmark/run-benchmarks.sh` if updating performance claims
- Documentation: Update `BENCHMARK_RESULTS.md` with Docker benchmark results if metrics change
```bash
# 1. Run full test suite
cargo test --release

# 2. Run DST batch tests (if modifying streaming/persistence)
cargo test streaming_dst --release

# 3. Run Maelstrom linearizability tests (if modifying replication)
./maelstrom/maelstrom test -w lin-kv --bin ./target/release/maelstrom_kv_replicated \
  --node-count 3 --time-limit 60 --rate 100

# 4. Quick local smoke test
cargo run --release --bin quick_benchmark

# 5. Docker benchmarks (REQUIRED for performance claims)
cd docker-benchmark && ./run-benchmarks.sh

# 6. Update BENCHMARK_RESULTS.md with Docker benchmark results

# 7. Commit with descriptive message
git commit -m "Description of changes

- What was added/changed
- Test results summary
- Docker benchmark comparison (if performance-related)"
```

| Change Type | Required Updates |
|---|---|
| New feature | Tests, README feature list |
| Performance change | Docker benchmarks, BENCHMARK_RESULTS.md (Docker numbers only) |
| Bug fix | Regression test |
| Replication change | Maelstrom test run |
| Streaming change | DST tests with fault injection |
| Command behavior change | Redis equivalence test |
| Any Redis comparison | Docker benchmarks (NEVER local benchmarks) |
When updating BENCHMARK_RESULTS.md, include:
```markdown
## [Date] - [Change Description]

### Test Configuration
- **Method**: Docker benchmarks (docker-benchmark/run-benchmarks.sh)
- **Docker**: [version]
- **Host OS**: [OS version]
- **CPU Limit**: 2 cores per container
- **Memory Limit**: 1GB per container

### Results (Docker - Fair Comparison)
| Operation | Redis 7.4 | Rust Implementation | Relative |
|-----------|-----------|---------------------|----------|
| SET (P=1) | X req/s | Y req/s | Z% |
| GET (P=1) | X req/s | Y req/s | Z% |
| SET (P=16)| X req/s | Y req/s | Z% |
| GET (P=16)| X req/s | Y req/s | Z% |
```

Important: All Redis vs Rust comparison numbers MUST come from Docker benchmarks.
ADRs in docs/adr/ are living decision logs that track architectural decisions as they evolve. They are not static documents - update them as you work.
| ADR | Title | Summary |
|---|---|---|
| 001 | Simulation-First Development | FoundationDB/TigerBeetle-style DST |
| 002 | Actor-per-Shard Architecture | Lock-free message passing |
| 003 | TigerStyle Coding Standards | Safety-first engineering |
| 004 | Anna KVS CRDT Replication | Eventual/causal consistency |
| 005 | Streaming Persistence | Cloud-native S3 storage |
| 006 | Zero-Copy RESP Parser | High-performance parsing |
| 007 | Feature Flag Optimizations | Measurable performance tuning |
| 008 | Docker Benchmark Methodology | Fair performance comparisons |
| 009 | Security - TLS and ACL | Optional TLS and Redis 6.0+ ACL |
During development, add a Decision Log entry when you:
- Choose between alternative implementations
- Discover constraints that affect the architecture
- Defer or reject a planned feature
- Change approach based on learnings
- Make trade-offs during implementation
After commits, update Implementation Status when:
- New components are implemented
- Features are validated/tested
- Items move from "Not Yet Implemented" to "Implemented"
- Add a Decision Log entry (most common):
  `| 2026-01-15 | Use batch inserts for persistence | Reduces write amplification |`
- Update Implementation Status tables as work progresses
- Update main sections only if the decision fundamentally changes the approach
When making a decision that doesn't fit existing ADRs:
- Copy `docs/adr/000-template.md` to `docs/adr/NNN-descriptive-title.md`
- Fill in Context, Decision, Consequences
- Add an initial Decision Log entry
- Update the `docs/adr/README.md` index
Core dependencies and their purposes:
- `tokio`: Async runtime
- `bincode`: Binary serialization (3-5x smaller than JSON)
- `crc32fast`: Checksums for data integrity
- `tracing`: Structured logging