symbi-redteam

Governed autonomous penetration testing platform powered by Symbiont. An AI engagement controller orchestrates a multi-phase pen test across a curated offensive toolchain where every tool has a different risk profile, every action is Cedar policy-gated, and every finding is evidence-chained.

The Problem

Penetration testing firms face four persistent problems:

Scope creep — testers accidentally hit out-of-scope assets
Evidence chain integrity — tampering risk in findings
Junior tester supervision — unsupervised high-risk tool usage
Reporting overhead — 40% of engagement time writing reports

The Solution: ORGA-Governed Multi-Agent Pen Testing

Eight specialized agents execute a PTES-methodology pen test. Every tool invocation passes through Symbiont's ORGA (Observe-Reason-Gate-Act) loop with Cedar policy enforcement:

engagement-controller
├── recon agent         → nmap, whois, dig, whatweb, amass
├── enum agent          → nikto, gobuster, enum4linux, smbclient, snmpwalk
├── vuln-assess agent   → nmap NSE, nuclei, sqlmap (detect), searchsploit
├── exploit agent       → hydra, metasploit, sqlmap (exploit)  [human-gated]
├── post-exploit agent  → impacket, pypykatz, chisel, ligolo   [human-gated]
├── reflector agent     → distils phase findings into knowledge triples
└── reporter agent      → executive, technical, remediation reports

Between phases the controller invokes the bounded reflector agent, which reads the phase's findings and writes subject-predicate-object lessons to a knowledge store. The next phase's agent pulls those lessons via recall_knowledge before planning, so learning flows forward across the engagement without widening any phase agent's tool surface. Cedar's reflector.cedar uses a defensive forbid ... unless whitelist so the reflector can only touch store_knowledge, recall_knowledge, and query_findings — every scan/exploit action is rejected at the gate.
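
The `forbid ... unless` idea can be sketched outside Cedar as a default-deny allowlist. This is a minimal Python illustration, not the contents of reflector.cedar; the function name `reflector_gate` and the action name `hydra_bruteforce` are hypothetical.

```python
# Hypothetical illustration of a "forbid ... unless" gate: deny by default,
# permit only the three actions the README lists for the reflector.
REFLECTOR_ALLOWED = frozenset(
    {"store_knowledge", "recall_knowledge", "query_findings"}
)


def reflector_gate(action: str) -> str:
    """Return "permit" only for allowlisted actions, "deny" for everything else."""
    return "permit" if action in REFLECTOR_ALLOWED else "deny"


# "hydra_bruteforce" is an assumed action name used only for this example.
decisions = {
    action: reflector_gate(action)
    for action in ("recall_knowledge", "nmap_scan", "hydra_bruteforce")
}
print(decisions)
```

Because the check is phrased as "deny unless listed", a tool added to the platform later is rejected for the reflector without anyone having to remember to forbid it.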

The critical insight: The Gate operates outside LLM influence. An AI plans Metasploit usage; a human approves each exploitation attempt. Cedar policies cannot be bypassed through prompt injection, social engineering, or creative reasoning.

Architecture

┌─────────────────────────────────────────────────────────┐
│                  Engagement Controller                  │
│    Maintains state · Enforces methodology · Orchestrates│
└───────┬───────┬───────┬───────┬───────┬───────┬─────────┘
        │       │       │       │       │       │
   ┌────▼──┐ ┌─▼───┐ ┌─▼───┐ ┌▼────┐ ┌▼────┐ ┌▼────────┐
   │ Recon │ │Enum │ │Vuln │ │Expl.│ │Post │ │Reporter │
   │       │ │     │ │     │ │     │ │Expl.│ │         │
   └───┬───┘ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ └────┬────┘
       │        │       │       │       │          │
   ┌───▼────────▼───────▼───────▼───────▼──────────▼─────┐
   │          ToolClad Manifests (19 .clad.toml)         │
   │  Typed args · MCP schema · Evidence · Cedar metadata │
   ├─────────────────────────────────────────────────────┤
   │              MCP Tool Layer (35 tools)              │
   │  Rust implementations · Cedar-gated · Audit-logged  │
   ├─────────────────────────────────────────────────────┤
   │              Shell Wrappers (19 scripts)            │
   │  Arg validation · Timeout · JSON output · Defense   │
   ├─────────────────────────────────────────────────────┤
   │            Offensive Toolchain (Kali)               │
   │  nmap · nikto · nuclei · sqlmap · hydra · metasploit│
   │  impacket · pypykatz · chisel · ligolo · gobuster   │
   └─────────────────────────────────────────────────────┘

Risk-Tiered Tool Authorization

Risk Level	Tools	Authorization
Low	nmap, whois, dig, whatweb, amass	Auto-allowed within scope
Medium	nikto, gobuster, enum4linux, smbclient, snmpwalk	Rate-limited
Medium-High	nmap NSE, nuclei, sqlmap (detect), searchsploit	Non-production only
High	hydra, metasploit, sqlmap (exploit)	Human approval required
Highest	impacket, pypykatz, chisel, ligolo	Human approval + scope revalidation
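
The table above can be read as a lookup from tool to authorization requirement. The sketch below encodes it in Python under assumed names (`Requirement`, `requirement_for`, and keys such as `sqlmap:exploit` are illustrative, not identifiers from the repository).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    human_approval: bool
    scope_revalidation: bool
    note: str


# One entry per row of the risk table.
TIER_REQUIREMENTS = {
    "low": Requirement(False, False, "auto-allowed within scope"),
    "medium": Requirement(False, False, "rate-limited"),
    "medium-high": Requirement(False, False, "non-production only"),
    "high": Requirement(True, False, "human approval required"),
    "highest": Requirement(True, True, "human approval + scope revalidation"),
}

# Tool keys are assumptions; modes are split out where the table does so.
TOOL_TIERS = {
    **dict.fromkeys(["nmap", "whois", "dig", "whatweb", "amass"], "low"),
    **dict.fromkeys(
        ["nikto", "gobuster", "enum4linux", "smbclient", "snmpwalk"], "medium"
    ),
    **dict.fromkeys(
        ["nmap:nse", "nuclei", "sqlmap:detect", "searchsploit"], "medium-high"
    ),
    **dict.fromkeys(["hydra", "metasploit", "sqlmap:exploit"], "high"),
    **dict.fromkeys(["impacket", "pypykatz", "chisel", "ligolo"], "highest"),
}


def requirement_for(tool: str) -> Requirement:
    """Look up the authorization requirement; unknown tools are rejected."""
    tier = TOOL_TIERS.get(tool)
    if tier is None:
        raise ValueError(f"unknown tool: {tool}")
    return TIER_REQUIREMENTS[tier]
```

For example, `requirement_for("sqlmap:detect")` needs no approval while `requirement_for("sqlmap:exploit")` does, which mirrors how the same binary lands in two tiers depending on mode.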

Cedar Policy Model

Eight policy files enforce governance at every level:

Policy	Purpose
`scope.cedar`	Target CIDR enforcement, excluded assets
`tool-authorization.cedar`	Per-tool risk-tiered authorization
`phase-gates.cedar`	PTES methodology enforcement
`rate-limits.cedar`	Per-target and global frequency limits
`escalation.cedar`	Human approval with time-limited expiry
`evidence.cedar`	Evidence chain integrity requirements
`time-bounds.cedar`	Engagement window enforcement
`reflector.cedar`	Bounds the reflector to `store_knowledge` / `recall_knowledge` / `query_findings` via defensive `forbid ... unless`
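
Cedar evaluates each request statelessly, so per-target and global frequency limits presumably depend on counters kept elsewhere and passed in as context. The following is a sketch of one way such counting could work (a sliding window); the class and parameter names are assumptions, and nothing here is taken from rate-limits.cedar.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Track invocation timestamps per target and globally (hypothetical helper)."""

    def __init__(self, per_target: int, global_limit: int, window_s: float):
        self.per_target = per_target
        self.global_limit = global_limit
        self.window_s = window_s
        self._by_target = defaultdict(deque)
        self._all = deque()

    def _evict(self, queue: deque, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while queue and now - queue[0] >= self.window_s:
            queue.popleft()

    def allow(self, target: str, now: float) -> bool:
        """Record and permit the call only if both limits have headroom."""
        target_queue = self._by_target[target]
        self._evict(target_queue, now)
        self._evict(self._all, now)
        if (
            len(target_queue) >= self.per_target
            or len(self._all) >= self.global_limit
        ):
            return False
        target_queue.append(now)
        self._all.append(now)
        return True
```

Passing `now` explicitly keeps the limiter deterministic, which makes it easy to test and to replay against audit timestamps.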

Data Layer

SQLite stores structured engagement data: findings, tool runs, retests, and reflector-authored knowledge triples.

LanceDB provides semantic search across findings for cross-tool correlation and retest comparison. A service that moved from port 8080 to 8443 still gets matched. A finding described differently by a different scanner still gets correlated.

Knowledge store — a knowledge table of subject-predicate-object triples written exclusively by the reflector (e.g. (smb_null_session, enabled_on, 10.0.2.15:445, confidence=0.9)). Phase agents read it via recall_knowledge at phase entry to bias their plan. The triple shape keeps lessons concrete and small enough to inject into the next phase's prompt without token bloat. Pattern borrowed from symbiont-karpathy-loop.
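
To make the triple shape concrete, here is a small in-memory SQLite sketch. The column names, the confidence threshold, and the second triple are assumptions for illustration; the real schema lives in db/ and may differ.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE knowledge ("
    "subject TEXT, predicate TEXT, object TEXT, confidence REAL)"
)


def store_knowledge(subject: str, predicate: str, obj: str, confidence: float) -> None:
    """Write one subject-predicate-object lesson (the reflector's role)."""
    conn.execute(
        "INSERT INTO knowledge VALUES (?, ?, ?, ?)",
        (subject, predicate, obj, confidence),
    )


def recall_knowledge(min_confidence: float = 0.5, limit: int = 10) -> list:
    """Read the most confident lessons (a phase agent's role at phase entry)."""
    rows = conn.execute(
        "SELECT subject, predicate, object, confidence FROM knowledge "
        "WHERE confidence >= ? ORDER BY confidence DESC LIMIT ?",
        (min_confidence, limit),
    )
    return rows.fetchall()


store_knowledge("smb_null_session", "enabled_on", "10.0.2.15:445", 0.9)
# Hypothetical low-confidence lesson, filtered out by the default threshold.
store_knowledge("http_banner", "suggests", "outdated_server", 0.3)

lessons = recall_knowledge()
print(lessons)
```

Capping the result with `limit` reflects the point about token bloat: only a handful of short tuples need to reach the next phase's prompt.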

Evidence store archives all tool outputs with SHA-256 integrity hashing, creating a tamper-evident chain from discovery through reporting.
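
A tamper-evident chain of this kind can be sketched in a few lines: each record stores the SHA-256 of the tool output plus the hash of the previous record. The record fields and function names below are assumptions, not the evidence store's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _entry_hash(record: dict) -> str:
    """Hash the fields that define a record, in a stable key order."""
    body = {k: record[k] for k in ("tool", "output_sha256", "prev_hash")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_evidence(chain: list, tool: str, output: bytes) -> dict:
    """Append a record that commits to the output and to the previous record."""
    record = {
        "tool": tool,
        "output_sha256": hashlib.sha256(output).hexdigest(),
        "prev_hash": chain[-1]["entry_hash"] if chain else GENESIS,
    }
    record["entry_hash"] = _entry_hash(record)
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev = GENESIS
    for record in chain:
        if record["prev_hash"] != prev or record["entry_hash"] != _entry_hash(record):
            return False
        prev = record["entry_hash"]
    return True
```

Changing any archived output hash after the fact alters that record's `entry_hash`, which no longer matches the `prev_hash` stored by its successor, so `verify` returns `False`.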

Quick Start

Prerequisites

Docker
An Anthropic API key

Using the pre-built image

# Pull from GitHub Container Registry
docker pull ghcr.io/thirdkeyai/symbi-redteam:latest

# Set required environment variables
export ANTHROPIC_API_KEY=your-key
export SYMBIONT_MASTER_KEY=$(openssl rand -hex 32)

# Start the runtime
docker run --rm --network host --privileged \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e SYMBIONT_API_TOKEN="your-api-token" \
  -e SYMBIONT_MASTER_KEY="$SYMBIONT_MASTER_KEY" \
  ghcr.io/thirdkeyai/symbi-redteam:latest \
  up -p 9080 --http-port 9081 --http.token "your-webhook-token"

Building from source

To build locally (e.g., to customize agents, policies, or tools):

# Clone the repo
git clone https://github.com/ThirdKeyAI/symbi-redteam.git
cd symbi-redteam

# Build the container (first build ~15 min for Rust compilation)
docker compose build

# Start with local mounts for live editing
docker run --rm --network host --privileged \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e SYMBIONT_API_TOKEN="your-api-token" \
  -e SYMBIONT_MASTER_KEY="$SYMBIONT_MASTER_KEY" \
  -v ./policies:/app/policies:ro \
  -v ./scope:/app/scope:ro \
  -v ./agents:/app/agents:ro \
  -v ./scripts:/app/scripts \
  -v ./templates:/app/templates:ro \
  symbi-redteam:latest \
  up -p 9080 --http-port 9081 --http.token "your-webhook-token"

Interact via API

# Health check
curl -s http://localhost:9080/api/v1/health

# List loaded agents (8 agents from agents/ directory)
curl -s -H "Authorization: Bearer your-api-token" \
  http://localhost:9080/api/v1/agents

# Execute an agent
curl -s -X POST -H "Authorization: Bearer your-api-token" \
  -H "Content-Type: application/json" \
  http://localhost:9080/api/v1/agents/{agent-id}/execute \
  -d '{"input": "Scan 10.0.1.0/24 for open services"}'

# Swagger API docs (open is macOS-specific; use xdg-open on Linux)
open http://localhost:9080/swagger-ui/
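
The same execute call can be made from a script. The sketch below only builds the request with Python's standard library, mirroring the curl example; the agent ID `recon` is a placeholder assumption, since real IDs come from the agents listing.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9080/api/v1"


def build_execute_request(
    agent_id: str, prompt: str, api_token: str
) -> urllib.request.Request:
    """Build the POST request for /agents/{agent-id}/execute."""
    return urllib.request.Request(
        f"{BASE_URL}/agents/{agent_id}/execute",
        data=json.dumps({"input": prompt}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "recon" is a hypothetical agent ID used for illustration.
req = build_execute_request(
    "recon", "Scan 10.0.1.0/24 for open services", "your-api-token"
)

# To send it against a running runtime:
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(resp.status, resp.read().decode())
```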

Test individual tools

Tool wrappers can be tested directly inside the container without the full runtime:

docker run --rm --network host --privileged --user root \
  --entrypoint bash symbi-redteam:latest -c \
  '/app/scripts/tool-wrappers/nmap-wrapper.sh 10.0.1.5 service "" test-001'

Configure scope

Edit scope/scope.toml to define your engagement targets, then update policies/scope.cedar to match. In this demo the scope is hard-coded into the Cedar policies, so the two files must be kept in sync.
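
The kind of check a scope policy expresses (target inside an allowed CIDR, not on an exclusion list) can be sketched with Python's `ipaddress` module. The network and excluded address below are example values, not the contents of scope.toml.

```python
import ipaddress

# Example values only; a real engagement would load these from its scope file.
IN_SCOPE = [ipaddress.ip_network("10.0.1.0/24")]
EXCLUDED = [ipaddress.ip_address("10.0.1.1")]


def in_scope(target: str) -> bool:
    """Return True only for a valid IP inside an allowed CIDR and not excluded."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        return False  # hostnames and malformed input are rejected here
    if addr in EXCLUDED:
        return False
    return any(addr in net for net in IN_SCOPE)


for candidate in ("10.0.1.5", "10.0.1.1", "10.0.2.15"):
    print(candidate, in_scope(candidate))
```

Checking exclusions before inclusions means an excluded asset stays out of scope even when it sits inside an allowed range.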

Environment variables

Variable	Required	Description
`ANTHROPIC_API_KEY`	Yes	API key for LLM reasoning
`SYMBIONT_API_TOKEN`	Yes	Bearer token for the runtime REST API (port 9080)
`SYMBIONT_MASTER_KEY`	Yes	256-bit hex key for encryption (`openssl rand -hex 32`)
`SYMBI_LOG_LEVEL`	No	Log level: debug, info, warn, error (default: info)
`SLACK_BOT_TOKEN`	If approvals enabled	Slack bot token (`xoxb-…`) for chat.postMessage / chat.update
`SLACK_SIGNING_SECRET`	If approvals enabled	Slack app signing secret for webhook signature verification

Ports

Port	Purpose	Authentication
9080	Runtime REST API (agents, status, execute)	`SYMBIONT_API_TOKEN` via Bearer header
9081	HTTP Input webhook (agent invocation)	`--http.token` via Bearer header
9082	Slack approvals webhook (block_actions callbacks)	Slack signing secret
4317	OTLP gRPC (Jaeger trace collector)	None (local only)
16686	Jaeger UI	None (local only)

Observability

Audit trail

Every tool invocation is logged to .symbiont/audit/ as JSONL with SHA-256 hash chaining (configured in symbi.toml). In Docker, these are persisted to the host via the audit-logs/ volume mount:

# View recent audit entries
cat audit-logs/*.jsonl | jq .

# Filter by tool name
cat audit-logs/*.jsonl | jq 'select(.tool == "nmap_scan")'

# Filter by Cedar decision
cat audit-logs/*.jsonl | jq 'select(.cedar_decision == "deny")'
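
For summaries that are awkward in jq, the same JSONL can be processed in Python. This sketch counts denials per tool using the `tool` and `cedar_decision` fields from the filters above; the sample lines and the `hydra_bruteforce` tool name are made up for the example.

```python
import json
from collections import Counter


def count_denials(lines) -> Counter:
    """Count Cedar "deny" decisions per tool across JSONL audit lines."""
    denials = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        entry = json.loads(line)
        if entry.get("cedar_decision") == "deny":
            denials[entry.get("tool", "unknown")] += 1
    return denials


# Fabricated sample entries; real entries carry more fields.
sample = [
    '{"tool": "nmap_scan", "cedar_decision": "permit"}',
    '{"tool": "nmap_scan", "cedar_decision": "deny"}',
    '{"tool": "hydra_bruteforce", "cedar_decision": "deny"}',
]
print(count_denials(sample))
```

To run it on real logs, pass the lines of each file, for example by iterating over `pathlib.Path("audit-logs").glob("*.jsonl")` and opening each path.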

Distributed tracing with Jaeger

Symbiont 1.10.0+ supports W3C traceparent propagation via OpenTelemetry. Traces show the full ORGA loop per agent (Observe, Reason, Gate, Act) with cross-agent propagation through ask() calls.

1. Start Jaeger:

docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

2. Add telemetry config to symbi.toml:

[telemetry]
enabled = true
otlp_endpoint = "http://localhost:4317"

3. View traces:

Open http://localhost:16686 and select the symbi-redteam service. Each engagement run produces traces spanning all phase agents, with spans for:

Agent ORGA loop iterations
Cedar policy evaluations (permit/deny)
Tool executions (wrapper invocation + duration)
Inter-agent ask() calls (controller → phase agent)
Human approval gates (time-to-approve)

Log verbosity

# Increase log detail for debugging (set on the host, then pass each to docker run with -e)
export SYMBI_LOG_LEVEL=debug
export RUST_LOG=symbi=debug,cedar=info

Known limitations

Gobuster requires --exclude-length for SPA targets (like Juice Shop) that return 200 for all paths. The agent's reasoning phase handles this automatically.
Nuclei normally downloads its templates on first run. The Docker build pre-downloads them into the image, so no download is needed inside the container, but picking up template updates requires a rebuild.
Metasploit first-run initialization takes 30-60 seconds while the framework loads.
Non-root execution: The container runs as the symbi user by default. Tools requiring raw sockets (nmap SYN scans, chisel tunneling) need --cap-add NET_RAW --cap-add NET_ADMIN or --privileged for testing.
MCP tool registration: ToolClad manifests in tools/ auto-generate MCP schemas via toolclad schema. The Rust MCP tool definitions in src/ provide the runtime registration layer. The Symbiont runtime's ToolCladExecutor discovers manifests from tools/ and registers them as MCP tools automatically.

Slack approval relay (optional)

When enabled, human-gated tools (exploit, post-exploit) post an Approve/Deny prompt to Slack in addition to the CLI prompt. The first responder wins.

Slack app setup:

Create a Slack app at https://api.slack.com/apps
Bot Token Scopes: chat:write, chat:write.public, im:write
Interactivity & Shortcuts: enable; Request URL = https://<your-host>:9082/slack/events
Install to workspace; copy Bot Token (xoxb-…) and Signing Secret
Invite the bot to the approval channel: /invite @your-bot #symbi-approvals

Configure symbi.toml:

[approvals.slack]
enabled = true
bot_token_env = "SLACK_BOT_TOKEN"
signing_secret_env = "SLACK_SIGNING_SECRET"
channel = "#symbi-approvals"
approvers = ["U01ABC123", "U02DEF456"]   # Slack member IDs
dm_approvers = true
events_bind_addr = "0.0.0.0:9082"

Run with Slack enabled:

docker run --rm --network host --privileged \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -e SYMBIONT_API_TOKEN="..." \
  -e SYMBIONT_MASTER_KEY="..." \
  -e SLACK_BOT_TOKEN="xoxb-..." \
  -e SLACK_SIGNING_SECRET="..." \
  ghcr.io/thirdkeyai/symbi-redteam:latest \
  up -p 9080 --http-port 9081 --http.token "..."

v1 limitations:

Pending approvals are in-memory; on container restart they're lost and the agent re-prompts on retry.
Approver allowlist is static (Slack user_ids in symbi.toml). Per-engagement Cedar-mapped approvers are planned for v2.
Slack only. Teams/Mattermost are deferred.

Repository Structure

symbi-redteam/
├── agents/                    # 8 Symbiont DSL agent definitions
│   ├── engagement-controller.dsl  # Orchestrator
│   ├── recon.dsl                  # Reconnaissance
│   ├── enum.dsl                   # Enumeration
│   ├── vuln-assess.dsl            # Vulnerability assessment
│   ├── exploit.dsl                # Exploitation (human-gated)
│   ├── post-exploit.dsl           # Post-exploitation (human-gated)
│   ├── reflector.dsl              # Post-phase lesson extractor (bounded)
│   └── reporter.dsl              # Report generation
├── tools/                     # 19 ToolClad manifests (.clad.toml)
├── toolclad.toml              # Project-level custom type definitions
├── policies/                  # 8 Cedar policy files
├── src/                       # Rust MCP tool definitions
│   ├── recon_tools.rs            # 5 recon tools + parse + CVE lookup
│   ├── enum_tools.rs             # 5 enumeration tools
│   ├── vuln_tools.rs             # 4 vulnerability tools
│   ├── exploit_tools.rs          # 4 exploitation tools
│   ├── postexploit_tools.rs      # 4 post-exploitation tools
│   ├── evidence_tools.rs         # 5 evidence management tools
│   ├── knowledge_tools.rs        # store_knowledge + recall_knowledge
│   ├── reporting.rs              # 4 reporting tools
│   └── db.rs                     # SQLite + LanceDB layer
├── scripts/
│   ├── tool-wrappers/            # 19 sandboxed tool wrappers
│   └── parse-outputs/            # 9 output parsers
├── scope/                     # Engagement scope definition
├── db/                        # Database schema
├── templates/                 # Report templates
├── Dockerfile                 # Multi-stage: Rust builder + Kali runtime
├── docker-compose.yml         # Security-hardened container config
└── symbi.toml                 # Symbiont runtime configuration

ToolClad Integration

All 19 offensive tools have declarative ToolClad manifests in tools/. Each .clad.toml defines:

Typed parameters with validation (scope_target, port, enum, credential_file, msf_options, etc.)
Cedar metadata for policy evaluation (resource, action, risk_tier, human_approval)
MCP schema generation — auto-generate inputSchema/outputSchema from manifests
Evidence envelopes with SHA-256 hashing and structured output

Manifests use the executor escape hatch to delegate to existing shell wrappers, preserving defense-in-depth while adding ToolClad's typed validation layer:

Agent fills typed parameters → ToolClad validates → Shell wrapper executes → Evidence envelope

Custom types in toolclad.toml define project-specific enums and constraints: hydra_service, nmap_scan_type, severity_level, dns_record_type, scan_rate, msf_module_path, impacket_tool

# Validate all tool manifests (symbi tools CLI, v1.10.0+)
symbi tools validate

# Generate MCP schema for a tool
symbi tools schema nmap_scan

# Dry-run a tool
symbi tools test nmap_scan --arg target=10.0.1.5 --arg scan_type=service

# List all discovered tools
symbi tools list
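
To illustrate what typed validation ahead of a shell wrapper buys, here is a Python sketch that checks `nmap_scan` arguments before anything reaches a command line. It does not reproduce ToolClad's behavior; the enum values other than `service`, the optional `port` argument, and the function name are assumptions.

```python
import ipaddress

# Only "service" appears in this README; the other values are hypothetical.
NMAP_SCAN_TYPES = {"service", "ping", "syn"}


def validate_nmap_scan(args: dict) -> dict:
    """Return normalized arguments or raise ValueError on any invalid field."""
    # ip_address() rejects anything that is not a bare IP, including
    # strings that smuggle shell metacharacters.
    target = str(ipaddress.ip_address(args["target"]))

    scan_type = args["scan_type"]
    if scan_type not in NMAP_SCAN_TYPES:
        raise ValueError(f"scan_type must be one of {sorted(NMAP_SCAN_TYPES)}")

    port = args.get("port")
    if port is not None:
        port = int(port)
        if not 1 <= port <= 65535:
            raise ValueError("port must be between 1 and 65535")

    return {"target": target, "scan_type": scan_type, "port": port}


print(validate_nmap_scan({"target": "10.0.1.5", "scan_type": "service"}))
```

Because validation happens on structured values, an input such as `10.0.1.5; id` fails as a malformed address instead of being passed to the wrapper for quoting.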

Key Design Decisions

Kali base image — Provides the offensive toolchain via apt. Larger image but vastly simpler tool installation and dependency management than building from source.

Hierarchical multi-agent — The engagement controller delegates to phase agents via ask(). Only 2 agents are active concurrently (controller + current phase). This maps naturally to PTES methodology and keeps Cedar policies scoped per phase.
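
A phase gate of this kind can be pictured as an ordering check the controller must pass before delegating. The sketch below assumes a strictly linear phase order using the agent names from this README; the real phase-gates.cedar may allow skips or repeats that this toy model does not.

```python
# Assumed linear ordering of phases, named after the phase agents.
PHASES = ["recon", "enum", "vuln-assess", "exploit", "post-exploit", "reporter"]


class Engagement:
    """Track completed phases and refuse out-of-order transitions."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def can_enter(self, phase: str) -> bool:
        # A phase is enterable only when exactly its predecessors are done.
        return self.completed == PHASES[: PHASES.index(phase)]

    def complete(self, phase: str) -> None:
        if not self.can_enter(phase):
            raise PermissionError(f"phase {phase!r} is not permitted yet")
        self.completed.append(phase)


engagement = Engagement()
engagement.complete("recon")
print(engagement.can_enter("enum"), engagement.can_enter("exploit"))
```

With only the controller and one phase agent active at a time, a single piece of state like `completed` is enough to decide which phase may run next.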

Bounded reflector — Cross-phase learning is handled by a single-purpose reflector agent that can only write to the knowledge store. Separating "who learns" from "who acts" means accumulating procedural knowledge never widens any phase agent's tool surface. The forbid ... unless Cedar pattern catches future accidental widening.

Cedar over inline checks — Cedar policies are formally verifiable, updatable without code changes, and evaluated outside LLM influence. The Gate cannot be prompt-injected.

SQLite + LanceDB — Structured data in SQLite for queries, embeddings in LanceDB for semantic search. Single LanceDB collection with type discriminator avoids runtime changes.

Human approval via CLI — Symbiont's HumanCritic suspends the ORGA loop and prompts the operator. Approval tokens have configurable expiry (30-60 minutes) enforced by Cedar.
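
The expiry rule can be sketched as a token bound to one tool and one target with a time-to-live. The field names and the 30-minute default below are assumptions chosen from the range stated above; in the platform itself the expiry is enforced by Cedar, not by application code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalToken:
    """A human approval bound to a single tool and target (hypothetical shape)."""

    tool: str
    target: str
    approved_at: float      # seconds since epoch when the operator approved
    ttl_s: float = 30 * 60  # assumed default: 30 minutes

    def is_valid(self, tool: str, target: str, now: float) -> bool:
        """Valid only for the same tool and target, and only until the TTL elapses."""
        age = now - self.approved_at
        return tool == self.tool and target == self.target and 0 <= age < self.ttl_s


token = ApprovalToken("metasploit", "10.0.1.5", approved_at=1000.0)
print(token.is_valid("metasploit", "10.0.1.5", now=1000.0 + 60))
```

Binding the token to a target as well as a tool means an approval for one host cannot be reused against another, even inside the expiry window.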

Comparison

Capability	Raw Tools	symbi-redteam
Scope enforcement	Manual discipline	Cedar policy — automatic
Phase methodology	Tester judgment	Policy-gated transitions
Tool authorization	Honor system	Risk-tiered Cedar policies
Rate limiting	Manual	Automatic per-target + global
Human approval	Verbal/email	CLI prompt with timed expiry
Evidence integrity	Trust-based	SHA-256 hash chains
Audit trail	Manual notes	Cryptographic, tamper-evident
Report generation	40% of engagement time	Automated from evidence DB
Retest comparison	Manual analyst work	Semantic matching + delta reports
Cross-phase learning	Tester memory	Reflector-written knowledge triples, recalled by next phase

License

Apache 2.0 — see LICENSE for details.
