Wrap any LLM with budget control, memory, observability, sandboxed execution,
multi-agent orchestration, and guardrails — in one Python class.
Website · Docs · Discord · Reddit · YouTube
A harness gives you control over something powerful. It does not limit what the horse can do — it channels that power precisely where you need it, safely, predictably, every time.
LLMs are extraordinarily capable. They are also unpredictable, expensive, opaque, and stateless out of the box. A raw LLM call is a power source with no circuit breaker, no meter, and no safety rail.
Syrin is the harness. It wraps any LLM — OpenAI, Anthropic, Google, Ollama, or your own — with everything a Python developer needs to build production AI systems:
- Hard cost limits that actually stop execution before the bill becomes a problem
- Persistent memory across sessions with four distinct types and pluggable backends
- Isolated code execution so LLM-generated code never runs in your process
- 72+ lifecycle hooks so nothing in your agent is hidden from you
- Multi-agent orchestration with recursive decomposition and shared budget pools
- Guardrails for PII, content safety, output validation, and prompt injection defense
You write Python classes. Syrin handles the production concerns. Ship faster, safer, and with full visibility.
```bash
pip install syrin
```

```python
from syrin import Agent, Budget, Model
from syrin.enums import ExceedPolicy

class Analyst(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    budget = Budget(max_cost=0.10, exceed_policy=ExceedPolicy.STOP)
    system_prompt = "You are a precise financial analyst."

result = Analyst().run("Summarise Q3 revenue trends")
print(result.content)
print(f"Cost: ${result.cost:.6f}")
print(f"Tokens: {result.tokens.total_tokens}")
print(f"Remaining: ${result.budget_remaining:.4f}")
```

```text
Q3 revenue grew 14% YoY, driven by enterprise deals (+22%) offsetting
consumer softness (-3%). Gross margin held at 71%...
Cost: $0.000312
Tokens: 284
Remaining: $0.0997
```
The agent hard-stops at $0.10. No surprise invoices. No extra code.
Every other library treats cost as a logging concern. Syrin treats it as a runtime constraint. The agent checks its budget before every LLM call.
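Conceptually, a pre-call gate estimates the next call's cost and refuses to make the call when the estimate would bust the cap, then replaces the estimate with billed actuals afterwards. A minimal sketch of the pattern (illustrative names, not Syrin's internals):

```python
# Illustrative sketch of a pre-call budget gate. Not Syrin's internals;
# class and method names here are our own.
class BudgetExceeded(RuntimeError):
    pass

class BudgetGate:
    def __init__(self, max_cost: float):
        self.max_cost = max_cost
        self.spent = 0.0

    def check(self, est_tokens: int, price_per_1k: float) -> None:
        """Raise *before* the call if the estimate would bust the cap."""
        estimate = est_tokens / 1000 * price_per_1k
        if self.spent + estimate > self.max_cost:
            raise BudgetExceeded(
                f"estimated ${estimate:.4f} exceeds remaining "
                f"${self.max_cost - self.spent:.4f}"
            )

    def record(self, actual_cost: float) -> None:
        """Post-call: replace the estimate with billed actuals."""
        self.spent += actual_cost

gate = BudgetGate(max_cost=0.10)
gate.check(est_tokens=500, price_per_1k=0.00015)   # fine: ~$0.000075 estimated
gate.record(0.000312)                              # actual spend from the API
```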
```python
from syrin import Agent, Budget, Model, RateLimit
from syrin.enums import ExceedPolicy
from syrin.budget import BudgetThreshold

class ProductionAgent(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    budget = Budget(
        max_cost=1.00,                       # Hard cap per run
        reserve=0.10,                        # Hold back for the final reply
        exceed_policy=ExceedPolicy.STOP,     # STOP | WARN | IGNORE | SWITCH
        rate_limits=RateLimit(
            hour=10.00,                      # $10/hour ceiling
            day=100.00,                      # $100/day ceiling
            month=2000.00,                   # $2,000/month ceiling
        ),
        thresholds=[
            BudgetThreshold(at=80, action=lambda ctx: alert_ops(ctx)),
        ],
    )
```

Pre-call estimation, post-call actuals, threshold callbacks, rate-window enforcement — all declarative, zero boilerplate. The $47K runaway-agent incident? A `Budget(max_cost=50)` would have been a $50 error.
When your agent generates and runs code, it should not run in the same process as your application. Sandbox spawns a fresh subprocess for every execution — isolated memory, hard timeout, no shared state.
```python
import asyncio
from syrin.sandbox import Sandbox

async def main():
    async with Sandbox(python=True, bash=True, timeout=30, memory_mb=256) as sb:
        # Execute Python — in a fresh isolated subprocess
        result = await sb.exec_python("""
import statistics
data = [12, 45, 7, 89, 23, 56, 34, 78]
print(f"mean={statistics.mean(data):.1f} p95={sorted(data)[int(0.95 * (len(data) - 1))]}")
""")
        print(result.stdout)       # mean=43.0 p95=78
        print(result.exit_code)    # 0
        print(f"{result.duration_ms:.0f}ms")

        # Execute bash — same isolation, shell tooling available
        result = await sb.exec_bash("""
echo "Disk usage:" && du -sh /tmp
find /tmp -name "*.log" -mmin -60 | wc -l
""")
        print(result.stdout)

        # Write a file, let Python read it back
        await sb.write("data.csv", "name,score\nalice,95\nbob,87\ncarol,92\n")
        result = await sb.exec_python("""
import csv, os
with open(os.environ["SANDBOX_WORKSPACE"] + "/data.csv") as f:
    rows = list(csv.DictReader(f))
print(f"{len(rows)} rows, avg score={sum(int(r['score']) for r in rows)/len(rows):.1f}")
""")
        print(result.stdout)       # 3 rows, avg score=91.3

        # Read a file the agent wrote back out
        summary = await sb.read("data.csv")
        print(summary.decode()[:40])

asyncio.run(main())
```

```text
mean=43.0 p95=78
Disk usage:
12K	/tmp
0
3 rows, avg score=91.3
name,score
alice,95
bob,87
carol,92
```
Pre-install packages once and reuse them across all exec calls:
```python
sb = Sandbox(packages=["pandas", "matplotlib"], timeout=60)
result = await sb.exec_python("import pandas; print(pandas.__version__)")
# pandas is installed once before the first exec call, then cached
```

No external dependencies — the PROCESS backend uses only the Python standard library.
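The idea behind process-level isolation can be sketched with nothing but the standard library: run the code under a fresh interpreter in a throwaway workspace with a hard timeout. This is an illustration of the concept, not Syrin's PROCESS backend itself:

```python
import os
import subprocess
import sys
import tempfile

# Illustrative: process-level isolation using only the standard library.
# A sketch of the idea, not Syrin's PROCESS backend.
def run_isolated(code: str, timeout: float = 30.0) -> tuple[int, str]:
    """Run Python code in a fresh subprocess with its own throwaway workspace."""
    with tempfile.TemporaryDirectory() as workspace:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,                     # hard wall-clock limit
            cwd=workspace,                       # isolated working directory
            env={
                "PATH": os.environ.get("PATH", ""),
                "SANDBOX_WORKSPACE": workspace,  # mirrors Syrin's env var
            },
        )
    return proc.returncode, proc.stdout

rc, out = run_isolated("print(2 + 2)")
print(rc, out.strip())   # 0 4
```

The subprocess shares no memory with the parent, the workspace is deleted on exit, and a runaway script is killed by the timeout rather than hanging your application.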
```python
from syrin import Agent, Model
from syrin.enums import MemoryType

agent = Agent(model=Model.OpenAI("gpt-4o-mini", api_key="sk-..."))

# Persist across sessions
agent.remember("User is a TypeScript engineer at a fintech startup", memory_type=MemoryType.FACTS)
agent.remember("Prefers concise bullet-point answers", memory_type=MemoryType.FACTS)

# Semantic recall — top-k by relevance
memories = agent.recall("user preferences", limit=5)

# Forget outdated facts
agent.forget("previous role title")
```

| Type | What it stores |
|---|---|
| `FACTS` | Identity, preferences, persistent user facts |
| `HISTORY` | Past events and conversation summaries |
| `KNOWLEDGE` | General knowledge — ideal for vector/semantic search |
| `INSTRUCTIONS` | Skills, workflows, how-to procedures |
Swap backends with one line:
```python
from syrin.memory import Memory, PostgresConfig, QdrantConfig
from syrin.enums import MemoryBackend

# SQLite — zero config, single-process production
Memory(backend=MemoryBackend.SQLITE, path="~/.syrin/memory.db")

# Qdrant — semantic search at scale
Memory(backend=MemoryBackend.QDRANT, qdrant=QdrantConfig(url="...", api_key="..."))

# PostgreSQL — multi-agent shared store, pgvector
Memory(backend=MemoryBackend.POSTGRES, postgres=PostgresConfig(...))
```

Five swarm topologies — one class:
```python
from syrin.swarm import Swarm, SwarmConfig, BudgetPool
from syrin.enums import SwarmTopology

pool = BudgetPool(total=5.00)   # $5 shared; no agent exceeds its slice

swarm = Swarm(
    agents=[Researcher, FactChecker, Writer],
    config=SwarmConfig(
        topology=SwarmTopology.ORCHESTRATOR,
        budget_pool=pool,
    ),
)

result = swarm.run("Research and write a report on battery technology trends")
print(result.cost_breakdown)   # per-agent cost
print(result.budget_report)    # pool utilisation
```

| Topology | Behaviour |
|---|---|
| `ORCHESTRATOR` | First agent routes tasks to the rest dynamically |
| `PARALLEL` | All agents run concurrently; results merged |
| `CONSENSUS` | Multiple agents vote; winner selected by strategy |
| `REFLECTION` | Producer–critic loop until quality threshold met |
| `WORKFLOW` | Sequential, parallel, branch, and dynamic fan-out steps |
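For a sense of what a CONSENSUS strategy reduces to, here is a majority-vote sketch (our own helper, not Syrin's implementation; Syrin's selection strategies may differ):

```python
from collections import Counter

# Illustrative majority-vote strategy for a CONSENSUS-style topology.
def majority_vote(answers: list[str]) -> str:
    """Pick the answer most agents agreed on; ties fall to the first seen."""
    return Counter(answers).most_common(1)[0][0]

votes = ["lithium-ion", "solid-state", "lithium-ion"]
print(majority_vote(votes))   # lithium-ion
```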
Recursive decomposition — agents that spawn agents:
```python
from syrin import Agent, Budget, BudgetSplit, Model, Spawn
from syrin.sandbox import Sandbox

class DataAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    sandbox = Sandbox(python=True, bash=True, timeout=30)
    system_prompt = "Analyse data using code. Return concise summaries only."

class SummaryAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    system_prompt = "Summarise findings into bullet points."

class Orchestrator(Agent):
    model = Model.OpenAI("gpt-4o")
    budget = Budget(max_cost=0.50)
    agents = [DataAgent, SummaryAgent]   # RLMLoop auto-wired
    spawn = Spawn(
        max_depth=2,
        budget_split=BudgetSplit.EQUAL,  # divide budget evenly across children
        child_timeout=60.0,
    )
    sandbox = Sandbox(python=True, bash=True)  # propagated to all children

# inside an async context:
result = await Orchestrator().arun("Analyse the sales CSV and write an executive summary")
```

The orchestrator spawns specialist sub-agents at runtime. No raw file bytes enter the LLM context — all heavy processing happens inside the sandbox.
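`BudgetSplit.EQUAL` amounts to dividing what remains of the parent's budget evenly across children. A sketch of the arithmetic (illustrative names; the `reserve` parameter is our assumption, mirroring `Budget(reserve=...)`):

```python
# Illustrative sketch of an EQUAL budget split across spawned children.
# Names are ours, not Syrin's internals.
def split_equal(remaining: float, n_children: int, reserve: float = 0.0) -> float:
    """Each child gets an even share of what's left after the parent's reserve."""
    if n_children < 1:
        raise ValueError("need at least one child")
    return max(remaining - reserve, 0.0) / n_children

# Parent has $0.50 left, keeps $0.10 back, spawns two specialists:
print(f"${split_equal(0.50, 2, reserve=0.10):.2f} per child")   # $0.20 per child
```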
Every LLM call, tool invocation, budget event, memory operation, sandbox execution, and agent spawn fires a typed hook. No monkey-patching. No log parsing.
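At bottom, a typed hook system is a callback registry keyed by an enum. A minimal sketch of the pattern (not Syrin's implementation; the `Hook` members here are a tiny illustrative subset):

```python
from collections import defaultdict
from enum import Enum
from typing import Any, Callable

class Hook(Enum):                  # illustrative subset of event names
    AGENT_RUN_END = "agent_run_end"
    TOOL_CALL_END = "tool_call_end"

class Events:
    def __init__(self) -> None:
        self._listeners: dict[Hook, list[Callable[[Any], None]]] = defaultdict(list)

    def on(self, hook: Hook, fn: Callable[[Any], None]) -> None:
        """Register a listener for one typed event."""
        self._listeners[hook].append(fn)

    def emit(self, hook: Hook, ctx: Any) -> None:
        """Fire every listener registered for this event."""
        for fn in self._listeners[hook]:
            fn(ctx)

seen = []
events = Events()
events.on(Hook.AGENT_RUN_END, lambda ctx: seen.append(ctx["cost"]))
events.emit(Hook.AGENT_RUN_END, {"cost": 0.0003})
print(seen)   # [0.0003]
```

Because every listener is keyed by an enum member rather than a string, a typo in an event name fails at import time instead of silently never firing.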
```python
from syrin.enums import Hook

agent.events.on(Hook.AGENT_RUN_END, lambda ctx: metrics.record(ctx.cost, ctx.tokens))
agent.events.on(Hook.BUDGET_THRESHOLD, lambda ctx: pagerduty.alert(f"Budget at {ctx.percentage}%"))
agent.events.on(Hook.TOOL_CALL_END, lambda ctx: logger.info(f"Tool {ctx.name} → {ctx.duration_ms}ms"))
agent.events.on(Hook.MEMORY_RECALL, lambda ctx: trace.span("recall", memories=ctx.count))
agent.events.on(Hook.SANDBOX_EXEC_END, lambda ctx: print(f"Sandbox {ctx.language} exit={ctx.exit_code} {ctx.duration_ms:.0f}ms"))
agent.events.on(Hook.RLM_SPAWN, lambda ctx: print(f"Spawned {ctx.agent} at depth {ctx.depth}"))
agent.events.on(Hook.RLM_BUDGET_SPLIT, lambda ctx: print(f"Child budget: ${ctx.child_budget:.4f}"))
```

Or enable full tracing with one flag — no code changes:
```bash
python my_agent.py --trace
```

```python
from syrin import Agent, Model
from syrin.guardrails import PIIGuardrail, LengthGuardrail, GuardrailChain
from syrin.enums import GuardrailMode

class SafeAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    guardrails = GuardrailChain([
        PIIGuardrail(redact=True, mode=GuardrailMode.PRE_CALL),  # scrub input
        LengthGuardrail(max_length=4000),                        # cap output
    ])

result = SafeAgent().run("Process: call me at 555-123-4567")
print(result.content)
# "call me at ***-***-****"
```

Per-agent resource limits cover timeouts, step caps, context ceilings, and degrade policies:

```python
from syrin import Agent, Model
from syrin.resource import Resource, ResourceThreshold, DegradePolicy
from syrin.enums import OnExceed, RestoreWhen

class ManagedAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    resource = Resource(
        timeout=120,             # seconds per run
        max_steps=20,            # max LLM iterations
        max_context=50_000,      # token context ceiling
        on_exceed=OnExceed.STOP,
        thresholds=[
            ResourceThreshold(dimension="steps", at=80,
                              action=lambda ctx: logger.warning("Step limit near")),
        ],
        degrade=DegradePolicy(
            tool_to_disable="web_search",   # disable expensive tool at limit
            restore_when=RestoreWhen.NEVER,
        ),
    )
```

Serve any agent as an HTTP API:

```python
agent.serve(port=8000, enable_playground=True)
# → POST /chat   POST /stream   GET /playground   GET /health
```

Crash-proof checkpoints:
```python
from syrin.checkpoint import CheckpointConfig

agent = Agent(
    model=model,
    checkpoint_config=CheckpointConfig(dir="/tmp/checkpoints", auto_save=True),
)

agent.run("Begin long analysis...")

# Crash? Resume exactly where it left off:
agent.load_checkpoint("analysis-run-1")
```

Event-driven triggers:
```python
from syrin.watch import CronProtocol, WebhookProtocol

agent.watch(CronProtocol(cron="0 9 * * *"), task="Send morning briefing")
agent.watch(WebhookProtocol(path="/events"), task="Process incoming event")
```

Process large files without loading raw bytes into the LLM context:
```python
import asyncio
from syrin import Agent, Budget, BudgetSplit, Model, Spawn
from syrin.sandbox import Sandbox
from syrin.tool import tool

class IngestionAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    sandbox = Sandbox(bash=True, python=True, timeout=30)
    system_prompt = "Generate and validate data using bash and Python."

    @tool
    async def generate_dataset(self) -> str:
        """Generate a 10,000-line synthetic log file."""
        result = await self.sandbox.exec_bash("""
python3 -c '
import random
for _ in range(10000):
    ip = ".".join(str(random.randint(1, 254)) for _ in range(4))
    code = random.choice([200] * 5 + [404, 500])
    path = random.choice(["users", "orders"])
    print(f"{ip} GET /api/{path} {code}")
' > "$SANDBOX_WORKSPACE/access.log"
wc -l "$SANDBOX_WORKSPACE/access.log"
""")
        return result.stdout.strip()

    @tool
    async def analyze_log(self) -> str:
        """Parse the log and compute the error rate."""
        result = await self.sandbox.exec_python("""
import collections, os, pathlib
log = pathlib.Path(os.environ["SANDBOX_WORKSPACE"]) / "access.log"
codes = collections.Counter(line.split()[3] for line in log.open())
total = sum(codes.values())
print(f"Total: {total:,}  Errors: {codes['500']:,}  Rate: {codes['500']/total:.1%}")
""")
        return result.stdout.strip()

class Orchestrator(Agent):
    model = Model.OpenAI("gpt-4o")
    budget = Budget(max_cost=0.10)
    agents = [IngestionAgent]
    spawn = Spawn(max_depth=2, budget_split=BudgetSplit.EQUAL)
    sandbox = Sandbox(bash=True, python=True)
```

```text
Total: 10,000  Errors: 1,423  Rate: 14.2%
```
```python
from syrin import Agent, Budget, Model
from syrin.guardrails import PIIGuardrail
from syrin.enums import MemoryType

class SupportAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    budget = Budget(max_cost=0.05)
    guardrails = [PIIGuardrail(redact=True)]
    system_prompt = "You are a helpful customer support agent."

class EscalationAgent(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    system_prompt = "Handle escalated cases requiring senior judgment."

agent = SupportAgent()
agent.remember(f"Customer {user_id}: premium plan, joined 2023", memory_type=MemoryType.FACTS)

result = agent.run(user_message)
if result.confidence < 0.6:
    result = agent.handoff(EscalationAgent, context=result.content)
```

```python
from syrin import Agent, Budget, Model
from syrin.knowledge import Knowledge
from syrin.enums import KnowledgeBackend
from pydantic import BaseModel

class ContractRisk(BaseModel):
    risk_level: str              # low | medium | high | critical
    key_clauses: list[str]
    recommended_action: str

kb = Knowledge(
    sources=["contracts/"],
    backend=KnowledgeBackend.QDRANT,
    embedding_provider="openai",
)

class ContractReviewer(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    budget = Budget(max_cost=0.25)
    knowledge = kb
    output_type = ContractRisk

result = ContractReviewer().run("Review the indemnification clause in contract-2024-07.pdf")
risk: ContractRisk = result.output   # guaranteed typed
print(f"{risk.risk_level} — {risk.recommended_action}")
```

| Capability | Syrin | DIY / Other Frameworks |
|---|---|---|
| Hard budget enforcement | Declarative, pre-call + post-call | Not available or manual |
| Rate windows | hour / day / month built-in | Build and persist yourself |
| Threshold callbacks | `BudgetThreshold(at=80, ...)` | Write from scratch |
| Shared budget pools | Thread-safe `BudgetPool` | Implement locking |
| Sandboxed code execution | `Sandbox(python=True, bash=True)` — zero deps | Manual subprocess plumbing |
| Memory (4 types) | Built-in, auto-managed, backend-agnostic | Manual setup |
| Multi-agent (5 topologies + RLM) | Single `Swarm` or `agents=` | Complex orchestration code |
| Lifecycle hooks | 72+ typed events | Logging + parsing |
| Live debugger | Rich TUI (Pry) | Parse log files |
| Guardrails | PII, length, content, output validation | Per-project code |
| Checkpoints | Auto-save, crash recovery | DIY |
| RAG / Knowledge | GitHub, docs, PDFs, websites | Manual indexing pipeline |
| Structured output | Guaranteed Pydantic / JSON | Parse + validate manually |
| Resource limits | `Resource(timeout, max_steps, degrade=...)` | Manual counter logic |
| Type safety | `StrEnum` everywhere, `mypy --strict` passes | String literals |
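The structured-output guarantee boils down to parsing the model's reply and validating it against the declared schema, failing loudly on mismatch. A standard-library sketch of that loop's core (Syrin itself validates through Pydantic; the helper below is our own illustration):

```python
import json

# Illustrative: validate an LLM's JSON reply against a declared schema.
# A stdlib sketch of the idea; Syrin validates via Pydantic models.
REQUIRED = {"risk_level": str, "key_clauses": list, "recommended_action": str}

def validate_contract_risk(raw: str) -> dict:
    """Parse the reply and check every declared field's presence and type."""
    data = json.loads(raw)
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"field {field!r} missing or not {typ.__name__}")
    return data

raw = '{"risk_level": "high", "key_clauses": ["indemnification"], "recommended_action": "escalate to counsel"}'
risk = validate_contract_risk(raw)
print(risk["risk_level"])   # high
```

A framework can retry the call with the validation error appended to the prompt until the reply conforms, which is what makes the typed result a guarantee rather than a hope.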
```bash
# Minimal — no LLM providers
pip install syrin

# With OpenAI
pip install "syrin[openai]"

# With Anthropic
pip install "syrin[anthropic]"

# Multi-modal — voice, documents, vector stores
pip install "syrin[voice,pdf,vector]"

# Full install
pip install "syrin[openai,anthropic,serve,vector,postgres,pdf,voice]"
```

| Guide | Description |
|---|---|
| Introduction | Why Syrin exists — the AI Harness concept |
| Quick Start | First agent in 10 minutes |
| Budget Control | Caps, rate limits, thresholds, shared pools |
| Sandbox | Isolated Python + bash execution |
| Memory | 4 types, backends, decay curves |
| Sub-agent Spawning | Recursive decomposition, budget split |
| Multi-Agent Swarms | 5 topologies, A2A messaging, budget delegation |
| Resource Limits | Per-agent timeouts, step caps, degrade policies |
| Observability & Hooks | 72+ events, tracing, Pry debugger |
| Guardrails | PII, length, content filtering, output validation |
| Serving | HTTP API, streaming, playground |
| Checkpoints | State persistence, crash recovery |
| Migration v0.11 → v0.12 | Breaking changes and upgrade guide |
| Examples | Runnable code for every use case |
| Channel | What's there |
|---|---|
| Discord | Real-time help, showcase your agents, roadmap discussion |
| Reddit — r/syrin_ai | Longer posts, tutorials, use-case deep dives |
| YouTube — @syrin_dev | Walkthroughs, feature demos, production patterns |
| GitHub Discussions | RFCs, architecture questions, feature requests |
| Website | Product overview, roadmap, changelog |
Contributions are welcome. See CONTRIBUTING.md for guidelines on setting up the dev environment, running tests, and submitting pull requests.
MIT — see LICENSE for details.
Give developers control over AI. That's the harness.
