
Claude Code Guidelines for redis-rust

⚠️ MANDATORY: Read These First

Before writing any code, read these three documents — they are the source of truth:

  1. docs/HARNESS.md — 4-layer verification loop, expected outputs, Tcl harness pitfalls, command addition checklist
  2. docs/RUST_STYLE.md — Rust coding standards: error handling, file size limits, clone avoidance, iterator patterns, assertion requirements, checked arithmetic
  3. docs/DST_GUIDE.md — Deterministic simulation testing methodology: SimulatedRng, VirtualTime, buggify fault injection, shadow state comparison

Violations of RUST_STYLE.md or DST_GUIDE.md will be caught by code review and CI. Do not skip them.

On-Demand Skills

Invoke these skills to inject domain expertise before working on specific areas:

| Skill | When to use |
|-------|-------------|
| /distributed-systems | Before working on replication, gossip, CRDTs, or multi-node code |
| /actor-model | Before working on sharded_actor, connection handler, or message passing |
| /dst | Before writing or modifying DST tests, fault injection, or shadow state |
| /tigerstyle | Before modifying executor or data structure code (assertions, checked arithmetic) |
| /rust-dev | Before writing any Rust code in this project (style, conventions, patterns) |
| /formal-verification | Before working on TLA+ specs, Stateright models, Kani proofs, or Maelstrom |

Skills are defined in .claude/agents/ and are invocable via /skill-name in Claude Code.

⚠️ MANDATORY: Default Development Workflow

For EVERY code change, follow this order:

1. Plan First (Architecture)

Before writing any code:

  • Identify which components are affected
  • Check if new I/O abstractions are needed for testability
  • Determine if actor boundaries need modification
  • Document the approach in a brief plan

2. TigerStyle First

Every function/method must have:

  • debug_assert! for preconditions at entry
  • debug_assert! for postconditions/invariants after mutations
  • Checked arithmetic (use checked_add, checked_sub, etc.)
  • Explicit error handling (no .unwrap() in production paths)
  • Early returns for error cases

3. DST-Compatible Design

Every new component must be:

  • Simulatable: All I/O through trait abstractions
  • Deterministic: No hidden randomness, use SimulatedRng
  • Time-controllable: Use VirtualTime, not std::time
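To make the time-controllable requirement concrete, here is a minimal sketch of a virtual clock behind a trait. The `Clock`/`VirtualClock` names and `now_millis` API are illustrative assumptions, not the project's actual `VirtualTime` types:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

// Time abstraction: production code uses a real clock, tests use a
// simulated one, so timeout paths run instantly and deterministically.
trait Clock: Send + Sync {
    fn now_millis(&self) -> u64;
}

// Simulated clock: tests advance time explicitly instead of sleeping.
struct VirtualClock {
    now: AtomicU64,
}

impl VirtualClock {
    fn new() -> Self {
        Self { now: AtomicU64::new(0) }
    }

    // Fast-forward time; a timeout test "waits" 30s in microseconds.
    fn advance(&self, d: Duration) {
        self.now.fetch_add(d.as_millis() as u64, Ordering::SeqCst);
    }
}

impl Clock for VirtualClock {
    fn now_millis(&self) -> u64 {
        self.now.load(Ordering::SeqCst)
    }
}
```

A component that takes `&dyn Clock` (or a generic `C: Clock`) can then be driven through timeout scenarios without any real waiting.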

4. Write Tests Before/With Code

  • Unit tests for pure logic
  • DST tests with fault injection for I/O components
  • Multiple seeds (minimum 10) for simulation tests
  • Redis equivalence tests for command behavior changes
  • Use Zipfian distribution (not uniform) for realistic workload tests

5. Design-by-Contract (TigerStyle Assertions)

  • State invariants must be checkable at any point
  • Add verify_invariants() methods to stateful structs
  • Run invariant checks in debug builds after every mutation

Note: VOPR (Viewstamped Operation Replicator) is TigerBeetle's specific DST tool for testing their Viewstamped Replication consensus protocol — it is NOT a synonym for DST in general. Our DST methodology follows the FoundationDB approach (seed-based determinism, simulated I/O, fault injection). The verify_invariants() pattern comes from TigerStyle / Design-by-Contract, not VOPR.


Project Philosophy

This project follows Simulation-First Development inspired by FoundationDB and TigerBeetle. The core principle: if you can't simulate it, you can't test it properly.

Architecture Principles

1. Deterministic Simulation Testing (DST)

All I/O operations must go through abstractions that can be:

  • Simulated: Deterministic, controllable behavior
  • Fault-injected: Network partitions, disk failures, message drops
  • Time-controlled: Fast-forward time, test timeout scenarios
// GOOD: I/O through trait abstraction
trait ObjectStore: Send + Sync {
    async fn put(&self, key: &str, data: &[u8]) -> Result<()>;
    async fn get(&self, key: &str) -> Result<Vec<u8>>;
}

// BAD: Direct I/O that can't be simulated
std::fs::write(path, data)?;

2. Actor Architecture

Components communicate via message passing, not shared mutable state:

// GOOD: Actor owns state exclusively
struct PersistenceActor {
    state: PersistenceState,  // Owned, not shared
    rx: mpsc::Receiver<Message>,
}

// BAD: Shared state with mutex
struct SharedPersistence {
    state: Arc<Mutex<PersistenceState>>,  // Contention, hard to test
}

3. TigerStyle Coding

Assertions (REQUIRED for every mutation):

// GOOD: Assert preconditions and postconditions
fn hincrby(&mut self, field: &str, increment: i64) -> Result<i64> {
    debug_assert!(!field.is_empty(), "Precondition: field must not be empty");

    let current = self.get(field).unwrap_or(0);
    let new_value = current.checked_add(increment)
        .ok_or(Error::Overflow)?;

    self.set(field, new_value);

    debug_assert_eq!(self.get(field), Some(new_value),
        "Postcondition: stored value must equal computed result");

    Ok(new_value)
}

// BAD: No assertions, silent failures
fn hincrby(&mut self, field: &str, increment: i64) -> i64 {
    self.value += increment;  // Can overflow!
    self.value
}

Checked Arithmetic (REQUIRED):

  • Use checked_add, checked_sub, checked_mul, checked_div
  • Return explicit errors on overflow, never wrap silently
  • Use saturating_* only when saturation is the correct behavior
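A minimal sketch of the required pattern — overflow becomes an explicit `Err`, never a silent wrap. The `StrLenError` type and `append_len` helper are hypothetical names for illustration:

```rust
#[derive(Debug, PartialEq)]
enum StrLenError {
    Overflow,
}

// Checked arithmetic: the overflow case is a first-class error path,
// not a panic and not a wraparound.
fn append_len(current: u64, extra: u64) -> Result<u64, StrLenError> {
    current.checked_add(extra).ok_or(StrLenError::Overflow)
}
```

The same shape applies to `checked_sub`, `checked_mul`, and `checked_div`; reach for `saturating_*` only when clamping at the bound is genuinely the desired semantics.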

Control Flow:

  • Prefer early returns, avoid deep nesting
  • No hidden allocations - be explicit about Vec::with_capacity
  • No panics in production paths, explicit Result<T, E>

Naming:

  • Functions that can fail: return Result<T, E>
  • Functions with preconditions: open with debug_assert! checks at entry
  • Unsafe blocks: document why they're safe

4. Design-by-Contract (TigerStyle Assertions)

Every stateful struct should be verifiable:

// GOOD: Struct with verification method
impl RedisSortedSet {
    /// Verify all invariants hold - call in debug builds after mutations
    fn verify_invariants(&self) {
        debug_assert_eq!(
            self.members.len(),
            self.sorted_members.len(),
            "Invariant: members and sorted_members must have same length"
        );
        debug_assert!(
            self.is_sorted(),
            "Invariant: sorted_members must be sorted by (score, member)"
        );
        for (member, _) in &self.sorted_members {
            debug_assert!(
                self.members.contains_key(member),
                "Invariant: every sorted member must exist in members map"
            );
        }
    }

    pub fn add(&mut self, member: String, score: f64) {
        // ... mutation logic ...

        #[cfg(debug_assertions)]
        self.verify_invariants();
    }
}

Design-by-Contract Checklist for New Structs:

  • Define all invariants in comments
  • Implement verify_invariants() method
  • Call verification after every public mutation method
  • Add #[cfg(debug_assertions)] to avoid release overhead

5. Static Stability

Systems must remain stable under partial failures:

  • Graceful degradation when dependencies fail
  • Bounded queues with backpressure
  • Timeouts on all external operations
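The bounded-queue-with-backpressure bullet can be sketched with std's `sync_channel`; the real actors use tokio's bounded mpsc, but the principle is identical — a full queue makes the producer fail fast (or block) instead of growing memory without bound:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn demo_backpressure() -> bool {
    // Bounded queue with capacity 2: the bound is the backpressure signal.
    let (tx, _rx) = sync_channel::<u32>(2);

    tx.try_send(1).unwrap();
    tx.try_send(2).unwrap();

    // Third send hits the bound; the caller must shed load, retry
    // later, or propagate the pressure upstream.
    matches!(tx.try_send(3), Err(TrySendError::Full(3)))
}
```

With tokio the equivalent is `mpsc::channel(capacity)` plus `try_send`, with the same "full means slow down" contract.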

Testing Strategy

Read docs/HARNESS.md first. It documents the full 4-layer verification loop, expected outputs, Tcl harness pitfalls, and the command addition checklist. Everything below is supplementary.

Priority Order (most to least important):

1. DST Tests (REQUIRED for I/O components)

#[test]
fn test_with_fault_injection() {
    for seed in 0..50 {  // Minimum 10 seeds, prefer 50+
        let mut harness = SimulationHarness::new(seed);
        harness.enable_chaos();  // Random faults

        // Run test scenario
        harness.run_scenario(|ctx| async {
            // Test logic here
        });

        harness.verify_invariants();
    }
}

DST tests MUST:

  • Run with multiple seeds (minimum 10, ideally 50+)
  • Include fault injection (network drops, delays, failures)
  • Verify invariants after each operation
  • Be deterministic given the same seed
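Seed-based determinism is what makes the whole list above workable: the same seed must replay the exact same execution. As an illustration (not the project's `SimulatedRng`), a tiny LCG using Knuth's MMIX constants shows the property:

```rust
// Deterministic PRNG: identical seed -> identical sequence, so any
// failing DST run can be reproduced bit-for-bit from its seed.
struct TinyRng {
    state: u64,
}

impl TinyRng {
    fn new(seed: u64) -> Self {
        Self { state: seed.wrapping_mul(6364136223846793005).wrapping_add(1) }
    }

    fn next_u64(&mut self) -> u64 {
        // LCG step with Knuth's MMIX multiplier/increment.
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state
    }
}
```

Every source of randomness in a simulation (fault injection, scheduling, workload keys) must draw from one such seeded generator; a single stray `rand::random()` or `SystemTime` read breaks replayability.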

2. Unit Tests (REQUIRED for pure logic)

  • Test pure logic without I/O
  • Use InMemoryObjectStore for storage tests
  • Include edge cases: empty inputs, max values, overflow
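A sketch of what "include edge cases" means for a pure helper — empty input, normal input, and overflow each get an explicit assertion (`sum_checked` is a hypothetical example function):

```rust
// Pure logic: no I/O, trivially unit-testable.
fn sum_checked(values: &[i64]) -> Option<i64> {
    // try_fold short-circuits to None on the first overflow.
    values.iter().try_fold(0i64, |acc, &v| acc.checked_add(v))
}
```

The edge-case assertions then cover the empty slice, a normal sum, and a sum that overflows `i64`.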

3. Integration Tests

  • Test component interactions
  • Use real async runtime but simulated I/O

4. Redis Compatibility Tests (Official Tcl Suite)

Run the official Redis test suite against our implementation:

./scripts/run-redis-compat.sh                          # default test files
./scripts/run-redis-compat.sh unit/type/string         # specific test file
./scripts/run-redis-compat.sh unit/type/incr unit/expire  # multiple files

IMPORTANT: Tcl Harness Pitfalls (read before modifying commands)

The Tcl harness crashes the entire test file on unimplemented commands. One ERR unknown command 'FOO' kills all remaining tests in that file. This means:

  • Adding a stub that returns an error is better than not parsing at all. Parse unknown commands as Command::Unknown(name) so they return a clean error instead of crashing the RESP parser.
  • Error message format matters. The Tcl tests use assert_error "*pattern*" with glob matching. Redis uses these exact formats:
    • ERR wrong number of arguments for 'xxx' command
    • ERR value is not an integer or out of range
    • ERR value is not a valid float
    • ERR syntax error
    • WRONGTYPE Operation against a key holding the wrong kind of value
  • Double ERR prefix bug. connection_optimized.rs::encode_error_into prepends ERR to messages. If your parser error string already starts with ERR, the client sees ERR ERR .... The function now checks for this, but keep it in mind.
  • perf_config.toml affects tests. The root config has num_shards = 1 which is required for Tcl tests (MULTI/EXEC, Lua scripts need all keys on one shard). The docker-benchmark/perf_config.toml has num_shards = 16 for throughput. Do not change the root config to multi-shard without understanding the consequences for transactions and Lua.
  • max_size in perf_config.toml limits request size. It was previously 1MB which caused string.tcl to crash on 4MB payloads. Now 512MB. If you see "buffer overflow" in server logs, check this value.
  • MULTI/EXEC state is at the connection level (connection_optimized.rs), not per-shard executor. The executor still has transaction state for the simulation/DST path, but the production server intercepts MULTI/EXEC/DISCARD/WATCH before routing to shards.
  • WATCH uses GET-snapshot comparison. At WATCH time, the connection does GET key and stores the result. At EXEC time, it does GET key again and compares. If different, EXEC returns nil array. This is correct but different from Redis's internal dirty-flag approach.
  • Shard-aggregated commands. DBSIZE, SCAN, KEYS, EXISTS, FLUSHDB/FLUSHALL, and DEL are handled specially in sharded_actor.rs to fan out across all shards. If you add a new command that needs to see all keys, add aggregation there.
  • TIME command returns real wall-clock time via SystemTime::now() at the sharded_actor level, not virtual time from the executor.
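The Command::Unknown fallback described in the first bullet can be sketched as follows. This is a deliberately simplified stand-in for the project's parser and dispatch, with hypothetical names — the point is that an unrecognized command name becomes a clean Redis-shaped error reply rather than a parser crash:

```rust
// Simplified stand-in for the project's Command enum and dispatch.
enum Command {
    Ping,
    Unknown(String),
}

fn parse(name: &str) -> Command {
    match name.to_ascii_uppercase().as_str() {
        "PING" => Command::Ping,
        // Unknown names parse successfully into a sentinel variant
        // instead of aborting the RESP stream.
        other => Command::Unknown(other.to_string()),
    }
}

fn execute(cmd: &Command) -> String {
    match cmd {
        Command::Ping => "+PONG\r\n".to_string(),
        // A well-formed RESP error keeps the Tcl harness's glob
        // assertions (and the rest of the test file) alive.
        Command::Unknown(name) => {
            format!("-ERR unknown command '{}'\r\n", name.to_lowercase())
        }
    }
}
```

One unknown command then costs a single failed assertion instead of every remaining test in the file.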

Current Tcl compatibility status:

| Suite | Pass | Blocker |
|-------|------|---------|
| unit/type/incr | 28/28 | - |
| unit/expire | all | - |
| unit/type/string | 72 pass, 0 err | LCS command not implemented (crashes test file) |
| unit/multi | 20/56 | SWAPDB (database swapping not implemented) |
| ACL (--features acl) | 536/536 lib tests | ACL DST + 21 unit tests pass; Tcl acl.tcl/acl-v2.tcl blocked on ACL LOG, DRYRUN |

To add a new command without breaking the harness:

  1. Add variant to Command enum in command.rs
  2. Update get_primary_key(), get_keys(), name(), is_read_only() match arms
  3. Add parsing in BOTH parser.rs AND commands.rs (zero-copy parser). Use map_err to return Redis-compatible error strings, never raw Rust parse errors.
  4. Add execution in executor/mod.rs dispatch + the appropriate *_ops.rs file
  5. Add DST coverage in executor_dst.rs with shadow state tracking (see /dst skill section 8)
  6. If the command needs all-shard visibility, add aggregation in sharded_actor.rs
  7. If adding keepttl or similar fields to Command::Set, update ALL struct literal constructions (there are ~25+ across test files, script_ops.rs, etc.)

5. Linearizability Tests (Jepsen-style)

  • Use Maelstrom for distributed correctness
  • Test under network partitions
  • Verify consistency guarantees
  • Run: ./maelstrom/maelstrom test -w lin-kv --bin ./target/release/maelstrom_kv_replicated

6. Workload Realism (Zipfian Distribution)

Use Zipfian distribution for key access patterns in DST tests:

// GOOD: Realistic hot/cold key pattern
KeyDistribution::Zipfian { num_keys: 100, skew: 1.0 }

// BAD: Uniform distribution (unrealistic)
KeyDistribution::Uniform { num_keys: 100 }

With skew=1.0, a small set of hot keys dominates: for 100 keys, the top 10 receive roughly half of all accesses, mirroring real hot/cold workloads.
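The skew effect is easy to quantify: under a Zipfian distribution with skew s, key i (1-based, ranked by popularity) has weight 1/i^s. This helper (an illustrative sketch, not the project's `KeyDistribution` implementation) computes what share of accesses the hottest k keys receive:

```rust
// Share of accesses going to the top k of num_keys under Zipf(skew).
fn top_k_share(num_keys: usize, skew: f64, k: usize) -> f64 {
    let weight = |i: usize| 1.0 / (i as f64).powf(skew);
    let total: f64 = (1..=num_keys).map(weight).sum();
    let top: f64 = (1..=k).map(weight).sum();
    top / total
}
```

With skew=0.0 the distribution degenerates to uniform (top 10 of 100 get exactly 10%); with skew=1.0 the same 10 keys get over half the traffic, which is why uniform workloads hide hot-key bugs that Zipfian workloads expose.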

Key Files

| File | Purpose |
|------|---------|
| src/io/mod.rs | I/O abstractions (Clock, Network, RNG) |
| src/simulator/ | DST harness and fault injection |
| src/simulator/dst_integration.rs | DST with Zipfian distribution |
| src/streaming/simulated_store.rs | Fault-injectable object store |
| src/buggify/ | Probabilistic fault injection |
| tests/redis-tests/ | Official Redis Tcl test suite (git submodule) |
| scripts/run-redis-compat.sh | Wrapper to run Tcl tests against our server |
| src/security/acl/ | ACL system (users, commands, patterns, file) |
| src/security/acl_dst.rs | ACL DST harness with shadow state |
| tests/acl_dst_test.rs | ACL DST multi-seed integration tests |
| src/streaming/wal.rs | WAL entry format, writer, reader, rotator |
| src/streaming/wal_store.rs | WAL storage trait + local/simulated impls |
| src/streaming/wal_actor.rs | WAL group commit actor (turbopuffer pattern) |
| src/streaming/wal_config.rs | WAL configuration (fsync policies, rotation) |
| src/streaming/wal_dst.rs | WAL DST harness with crash/recovery testing |
| tests/wal_dst_test.rs | WAL DST multi-seed integration tests |
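The WAL entry format in src/streaming/wal.rs follows the usual length-prefix-plus-checksum framing. As a hypothetical illustration of that idea (the real code uses crc32fast; a trivial additive checksum stands in here to keep the sketch self-contained):

```rust
// Frame: [len: u32 LE][checksum: u32 LE][payload bytes].
fn encode_entry(payload: &[u8]) -> Vec<u8> {
    let checksum: u32 = payload.iter().map(|&b| b as u32).sum();
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&checksum.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

// Returns None for torn or corrupt entries, which is how recovery
// knows where the valid prefix of the log ends.
fn decode_entry(buf: &[u8]) -> Option<&[u8]> {
    let len = u32::from_le_bytes(buf.get(0..4)?.try_into().ok()?) as usize;
    let checksum = u32::from_le_bytes(buf.get(4..8)?.try_into().ok()?);
    let payload = buf.get(8..8 + len)?;
    let actual: u32 = payload.iter().map(|&b| b as u32).sum();
    (actual == checksum).then_some(payload)
}
```

The crash/recovery DST tests in src/streaming/wal_dst.rs exercise exactly this boundary: inject a crash mid-write and verify the reader stops cleanly at the last intact entry.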

Common Patterns

Creating Testable Components

// 1. Define trait for the I/O operation
pub trait ObjectStore: Send + Sync + 'static {
    fn put(&self, key: &str, data: &[u8]) -> impl Future<Output = IoResult<()>>;
}

// 2. Create production implementation
pub struct LocalFsObjectStore { path: PathBuf }

// 3. Create simulated implementation with fault injection
pub struct SimulatedObjectStore {
    inner: InMemoryObjectStore,
    fault_config: FaultConfig,
    rng: SimulatedRng,
}

// 4. Use generic in component
pub struct StreamingPersistence<S: ObjectStore> {
    store: S,
    // ...
}

Actor Shutdown Pattern

enum Message {
    DoWork(Work),
    Shutdown { response: oneshot::Sender<()> },
}

async fn run(mut self) {
    while let Some(msg) = self.rx.recv().await {
        match msg {
            Message::DoWork(w) => self.handle_work(w).await,
            Message::Shutdown { response } => {
                self.cleanup().await;
                let _ = response.send(());
                break;
            }
        }
    }
}

Bridging Sync to Async

When you need to connect sync code (like command execution) to async actors:

// Use std::sync::mpsc for fire-and-forget from sync context
let (tx, rx) = std::sync::mpsc::channel();

// Bridge drains the sync channel into the async actor. recv_timeout
// blocks, so run this on a dedicated/blocking thread (e.g. via
// tokio::task::spawn_blocking), not inside an async task. `shutdown`
// is an AtomicBool shared with the caller.
fn bridge(rx: Receiver<Delta>, actor: ActorHandle, shutdown: Arc<AtomicBool>) {
    loop {
        // recv_timeout returns Result, not Option; the timeout keeps
        // the loop responsive to shutdown.
        if let Ok(delta) = rx.recv_timeout(Duration::from_millis(50)) {
            actor.send(delta);
        }
        if shutdown.load(Ordering::SeqCst) {
            break;
        }
    }
}

Debugging Tips

  1. Seed-based reproduction: All simulations use seeds. Save failing seeds to reproduce bugs.
  2. Trace logging: Use tracing crate with structured logging.
  3. Invariant checks: Add assertions for state invariants after each operation.

Performance Guidelines

  1. Batch operations: Accumulate deltas in write buffer before flushing
  2. Async I/O: Use tokio::spawn for background operations
  3. Zero-copy where possible: Use Bytes for large data transfers
  4. Profile before optimizing: Use cargo flamegraph

Benchmarking Requirements

All performance comparisons MUST use Docker containers to ensure fair, reproducible results.

Why Docker-Only Benchmarks?

| Concern | Docker Solution |
|---------|-----------------|
| Resource fairness | Both servers get identical CPU/memory limits |
| Environment consistency | Same OS, kernel, networking stack |
| Reproducibility | Anyone can run identical benchmarks |
| Apples-to-apples | Eliminates "my machine is faster" bias |

Docker Benchmark Configuration

Location: docker-benchmark/

| Setting | Value | Rationale |
|---------|-------|-----------|
| CPU Limit | 2 cores | Realistic server constraint |
| Memory Limit | 1GB | Realistic server constraint |
| Network | Host networking | Eliminates Docker NAT overhead |
| Requests | 100,000 | Statistically significant |
| Clients | 50 concurrent | Realistic load |
| Pipeline depths | 1, 16, 64 | Tests different access patterns |

Running Docker Benchmarks

# REQUIRED for any performance claims
cd docker-benchmark
./run-benchmarks.sh

# This runs:
# 1. Official Redis 7.4 (port 6379)
# 2. Rust implementation (port 3000)
# 3. Both non-pipelined and pipelined tests

What NOT to Use for Comparisons

# DON'T use these for Redis vs Rust comparisons:
cargo run --release --bin quick_benchmark  # Local only, no Redis comparison
cargo run --release --bin benchmark        # Local only, unfair conditions

The local benchmarks (quick_benchmark, benchmark) are useful for:

  • Quick smoke tests during development
  • Profiling and optimization work
  • Regression detection between commits

But they MUST NOT be used for Redis vs Rust performance claims in documentation.

Updating BENCHMARK_RESULTS.md

When updating performance numbers:

  1. Always run Docker benchmarks - never use local numbers for comparisons
  2. Run multiple times - take median of 3 runs to reduce variance
  3. Document the environment - include Docker version, host OS
  4. Include all pipeline depths - P=1, P=16, P=64 tell different stories
# Example workflow for updating benchmarks
cd docker-benchmark
for i in 1 2 3; do
    echo "=== Run $i ==="
    ./run-benchmarks.sh 2>&1 | tee run_$i.log
done
# Use median results in BENCHMARK_RESULTS.md

Commit Requirements

Every commit must be fully tested and documented:

Pre-Commit Checklist

  1. Full Test Suite: Run cargo test --release - all tests must pass
  2. DST Tests: Run streaming and simulator DST tests with multiple seeds
  3. Local Smoke Test: Run cargo run --release --bin quick_benchmark for quick validation
  4. Docker Benchmarks: Run docker-benchmark/run-benchmarks.sh if updating performance claims
  5. Documentation: Update BENCHMARK_RESULTS.md with Docker benchmark results if metrics change

Commit Workflow

# 1. Run full test suite
cargo test --release

# 2. Run DST batch tests (if modifying streaming/persistence)
cargo test streaming_dst --release

# 3. Run Maelstrom linearizability tests (if modifying replication)
./maelstrom/maelstrom test -w lin-kv --bin ./target/release/maelstrom_kv_replicated \
    --node-count 3 --time-limit 60 --rate 100

# 4. Quick local smoke test
cargo run --release --bin quick_benchmark

# 5. Docker benchmarks (REQUIRED for performance claims)
cd docker-benchmark && ./run-benchmarks.sh

# 6. Update BENCHMARK_RESULTS.md with Docker benchmark results

# 7. Commit with descriptive message
git commit -m "Description of changes

- What was added/changed
- Test results summary
- Docker benchmark comparison (if performance-related)"

What Must Be Updated

| Change Type | Required Updates |
|-------------|------------------|
| New feature | Tests, README feature list |
| Performance change | Docker benchmarks, BENCHMARK_RESULTS.md (Docker numbers only) |
| Bug fix | Regression test |
| Replication change | Maelstrom test run |
| Streaming change | DST tests with fault injection |
| Command behavior change | Redis equivalence test |
| Any Redis comparison | Docker benchmarks (NEVER local benchmarks) |

Benchmark Documentation Format

When updating BENCHMARK_RESULTS.md, include:

## [Date] - [Change Description]

### Test Configuration
- **Method**: Docker benchmarks (docker-benchmark/run-benchmarks.sh)
- **Docker**: [version]
- **Host OS**: [OS version]
- **CPU Limit**: 2 cores per container
- **Memory Limit**: 1GB per container

### Results (Docker - Fair Comparison)
| Operation | Redis 7.4 | Rust Implementation | Relative |
|-----------|-----------|---------------------|----------|
| SET (P=1) | X req/s   | Y req/s             | Z%       |
| GET (P=1) | X req/s   | Y req/s             | Z%       |
| SET (P=16)| X req/s   | Y req/s             | Z%       |
| GET (P=16)| X req/s   | Y req/s             | Z%       |

Important: All Redis vs Rust comparison numbers MUST come from Docker benchmarks.

Architecture Decision Records (ADRs)

ADRs in docs/adr/ are living decision logs that track architectural decisions as they evolve. They are not static documents - update them as you work.

ADR Index

| ADR | Title | Summary |
|-----|-------|---------|
| 001 | Simulation-First Development | FoundationDB/TigerBeetle-style DST |
| 002 | Actor-per-Shard Architecture | Lock-free message passing |
| 003 | TigerStyle Coding Standards | Safety-first engineering |
| 004 | Anna KVS CRDT Replication | Eventual/causal consistency |
| 005 | Streaming Persistence | Cloud-native S3 storage |
| 006 | Zero-Copy RESP Parser | High-performance parsing |
| 007 | Feature Flag Optimizations | Measurable performance tuning |
| 008 | Docker Benchmark Methodology | Fair performance comparisons |
| 009 | Security - TLS and ACL | Optional TLS and Redis 6.0+ ACL |

When to Update an ADR

During development, add a Decision Log entry when you:

  • Choose between alternative implementations
  • Discover constraints that affect the architecture
  • Defer or reject a planned feature
  • Change approach based on learnings
  • Make trade-offs during implementation

After commits, update Implementation Status when:

  • New components are implemented
  • Features are validated/tested
  • Items move from "Not Yet Implemented" to "Implemented"

How to Update an ADR

  1. Add a Decision Log entry (most common):

    | 2026-01-15 | Use batch inserts for persistence | Reduces write amplification |
  2. Update Implementation Status tables as work progresses

  3. Update main sections only if the decision fundamentally changes the approach

Creating a New ADR

When making a decision that doesn't fit existing ADRs:

  1. Copy docs/adr/000-template.md to docs/adr/NNN-descriptive-title.md
  2. Fill in Context, Decision, Consequences
  3. Add initial Decision Log entry
  4. Update docs/adr/README.md index

Dependencies

Core dependencies and their purposes:

  • tokio: Async runtime
  • bincode: Binary serialization (3-5x smaller than JSON)
  • crc32fast: Checksums for data integrity
  • tracing: Structured logging