This document explains the container-first philosophy that drives this project's architecture.
This project embraces a container-first approach as a core design principle:
- Everything that can be containerized, is containerized
- Zero local tool installation required (except Docker itself)
- Maximum portability - runs identically on any Linux system
- Self-hosted infrastructure - no cloud dependencies or costs
These principles translate into concrete benefits:
- Portability: Works on any Linux system with Docker - no other setup needed
- Consistency: Eliminates "works on my machine" problems forever
- Simplicity: No complex dependency management or version conflicts
- Isolation: Each tool runs in its own environment
- Cost-effective: Designed for self-hosted runners with zero cloud costs
The Python CI container (`docker/python-ci.Dockerfile`) includes all necessary tools:
- Base Image: `python:3.11-slim` for a small image with a current Python
- Linter/Formatter: Ruff (replaces Black, Flake8, isort, and Pylint)
- Type Checker: mypy
- Testing: pytest, pytest-cov, pytest-asyncio, pytest-mock
- Security: Bandit, Safety
- Utilities: yamllint, pre-commit
- Coverage: XML and terminal coverage reports
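As a rough sketch of how the Dockerfile might install this toolset (an illustration only — see `docker/python-ci.Dockerfile` for the actual instructions and version pins):

```dockerfile
# Hypothetical excerpt; the real Dockerfile may pin exact versions.
FROM python:3.11-slim
RUN pip install --no-cache-dir \
    ruff mypy \
    pytest pytest-cov pytest-asyncio pytest-mock \
    bandit safety yamllint pre-commit
WORKDIR /app
```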
The project includes specialized containers for AI backdoor detection:

- `sleeper-eval-cpu` (`docker/sleeper-evaluation-cpu.Dockerfile`):
  - CPU-optimized for CI/CD pipelines
  - TransformerLens for residual stream analysis
  - Lightweight for quick detection tests
  - Used in PR validation workflows
- `sleeper-eval-gpu` (`docker/sleeper-evaluation.Dockerfile`):
  - GPU-enabled for comprehensive analysis
  - Full PyTorch with CUDA support
  - Advanced attention pattern analysis
  - Used for detailed model evaluation
These services are defined in the project's Docker Compose configuration:

```yaml
python-ci:
  build:
    context: .
    dockerfile: docker/python-ci.Dockerfile
  container_name: python-ci
  user: "${USER_ID:-1000}:${GROUP_ID:-1000}"
  environment:
    - PYTHONDONTWRITEBYTECODE=1
    - PYTHONPYCACHEPREFIX=/tmp/pycache

sleeper-eval-cpu:
  build:
    context: .
    dockerfile: docker/sleeper-evaluation-cpu.Dockerfile
  container_name: sleeper-eval-cpu
  environment:
    - PYTHONDONTWRITEBYTECODE=1
    - DEVICE=cpu
  volumes:
    - ./evaluation_results:/app/evaluation_results

sleeper-eval-gpu:
  build:
    context: .
    dockerfile: docker/sleeper-evaluation.Dockerfile
  container_name: sleeper-eval-gpu
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]
```

Key features:
- Runs as current user to avoid permission issues
- Python cache prevention enabled
- Mounts current directory as working directory
The `automation-cli` binary provides a unified interface to all CI stages:

```bash
# Build once
cargo build --release -p automation-cli

# Format checking
automation-cli ci run format

# Linting
automation-cli ci run lint-basic
automation-cli ci run lint-full

# Testing
automation-cli ci run test

# Security scanning
automation-cli ci run security

# Auto-formatting
automation-cli ci run autoformat

# Full CI pipeline (all checks)
automation-cli ci run full

# YAML/JSON validation
automation-cli ci run yaml-lint
automation-cli ci run json-lint

# List all available stages
automation-cli ci list
```

See `tools/rust/automation-cli/README.md` for the full command reference.
For more control, use Docker Compose directly:
```bash
# Run the Ruff formatter
docker compose run --rm python-ci ruff format .

# Run specific pytest tests
docker compose run --rm python-ci pytest tests/test_specific.py -v

# Run with a custom environment variable
docker compose run --rm -e CUSTOM_VAR=value python-ci command
```

To prevent permission issues with Python cache files:
1. Environment variables:
   - `PYTHONDONTWRITEBYTECODE=1` prevents `.pyc` file creation
   - `PYTHONPYCACHEPREFIX=/tmp/pycache` redirects the cache to a temp directory
2. Configuration files:
   - `pytest.ini` includes `-p no:cacheprovider` to disable the pytest cache
3. Container user permissions:
   - Containers run as the current user (`USER_ID:GROUP_ID`)
   - No files are created with root permissions
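The effect of the two environment variables can be verified from Python itself. This small script (an illustration; it runs on any Python 3.8+) spawns a child interpreter with the same settings the Compose services use and checks that Python honors them:

```python
import os
import subprocess
import sys

# Spawn a child interpreter with the same cache settings the Compose
# services set, and confirm Python reflects them in sys.
env = dict(
    os.environ,
    PYTHONDONTWRITEBYTECODE="1",
    PYTHONPYCACHEPREFIX="/tmp/pycache",
)
out = subprocess.run(
    [sys.executable, "-c",
     "import sys; print(sys.dont_write_bytecode, sys.pycache_prefix)"],
    env=env,
    capture_output=True,
    text=True,
).stdout.strip()
print(out)  # → True /tmp/pycache
```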
GitHub Actions workflows use the containerized approach:

```yaml
- name: Run Python Linting
  run: |
    automation-cli ci run lint-basic

- name: Run Tests with Coverage
  run: |
    automation-cli ci run test
```

This ensures:
- Consistent behavior between local and CI environments
- No need to install Python dependencies on runners
- Faster execution with cached Docker images
- Python 3.11 environment matches production
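Putting the pieces together, a workflow job might look like the following skeleton (a hypothetical sketch only — the job name, runner label, and paths are assumptions; the project's actual workflow files may differ):

```yaml
# Hypothetical job skeleton for a self-hosted runner.
jobs:
  python-ci:
    runs-on: self-hosted   # zero cloud cost, per the project's design
    steps:
      - uses: actions/checkout@v4
      - name: Build CLI once
        run: cargo build --release -p automation-cli
      - name: Run full pipeline
        run: ./target/release/automation-cli ci run full
```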
To add a new Python tool:

1. Update `docker/python-ci.Dockerfile`:

   ```dockerfile
   RUN pip install --no-cache-dir new-tool
   ```

2. Add a new `Stage` variant in `tools/rust/automation-cli/src/commands/ci/stages.rs` and handle it in `ci/mod.rs`:

   ```rust
   Stage::NewStage => {
       output::header("Running new tool");
       docker::run_python_ci(compose, &["new-tool", "."], &[])
   },
   ```

3. Rebuild the container:

   ```bash
   docker compose build python-ci
   ```
Troubleshooting common issues:

```bash
# Force rebuild without cache
docker compose build --no-cache python-ci

# Check build logs
docker compose build python-ci 2>&1 | tee build.log

# Verify user IDs
echo "USER_ID=$(id -u) GROUP_ID=$(id -g)"

# Run with explicit user
USER_ID=$(id -u) GROUP_ID=$(id -g) docker compose run --rm python-ci command

# Use BuildKit for faster builds
DOCKER_BUILDKIT=1 docker compose build python-ci

# Prune old images
docker image prune -f
```

What runs in containers:

- All Python tools: Ruff, mypy, pytest
- MCP server: Runs in its own container with all dependencies
- CI/CD operations: All pipeline steps use containers
- Development tools: Any tool that doesn't need Docker access
What stays on the host:
- Claude CLI: Requires host subscription authentication (machine-specific)
- Docker Compose: Obviously needs to run on the host
- GitHub Actions runner: Needs system-level access
- Git operations: Need access to host git configuration
Benefits of this approach:
- Zero Setup Time: Clone and run - no installation guides needed
- Perfect Reproducibility: The same environment for a solo developer across all machines
- No Version Conflicts: Each container has exactly what it needs
- Easy Updates: Just rebuild the container
- Self-Hosted Friendly: Optimized for personal infrastructure
Best practices:
- Always use helper scripts for common operations
- Keep containers lightweight - only install necessary tools
- Use specific versions in Dockerfile for reproducibility (e.g., Python 3.11)
- Leverage Docker layer caching on self-hosted runners
- Design for single-maintainer efficiency
- Run containers with user permissions to avoid file ownership issues
- Use multi-stage builds when appropriate for smaller final images
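To illustrate the last point, a multi-stage build keeps heavyweight build dependencies out of the final image. The sketch below is hypothetical — the stage names and tool list are assumptions, not the project's actual Dockerfiles:

```dockerfile
# Hypothetical two-stage build: install into a prefix in a throwaway
# stage, then copy only the installed files into the slim final image.
FROM python:3.11-slim AS builder
RUN pip install --no-cache-dir --prefix=/install ruff mypy pytest

FROM python:3.11-slim
COPY --from=builder /install /usr/local
WORKDIR /app
CMD ["python", "--version"]
```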
This container-first approach means:
- No README sections about installing dependencies
- No version compatibility matrices
- No "please install X, Y, Z first" instructions
- Just Docker, and everything works
Perfect for individual developers who want professional infrastructure without the complexity.