Wrap any LLM with budget control, memory, observability, sandboxed execution,
multi-agent orchestration, and guardrails — in one Python class.
Website · Docs · Discord · Reddit · YouTube
A harness gives you control over something powerful. It does not limit what the horse can do — it channels that power precisely where you need it, safely, predictably, every time.
LLMs are extraordinarily capable. They are also unpredictable, expensive, opaque, and stateless out of the box. A raw LLM call is a power source with no circuit breaker, no meter, and no safety rail.
Syrin is the harness. It wraps any LLM — OpenAI, Anthropic, Google, Ollama, or your own — with everything a Python developer needs to build production AI systems:
- Hard cost limits that actually stop execution before the bill becomes a problem
- Persistent memory across sessions with four distinct types and pluggable backends
- Isolated code execution so LLM-generated code never runs in your process
- 72+ lifecycle hooks so nothing in your agent is hidden from you
- Multi-agent orchestration with recursive decomposition and shared budget pools
- Guardrails for PII, content safety, output validation, and prompt injection defense
You write Python classes. Syrin handles the production concerns. Ship faster, safer, and with full visibility.
```bash
pip install syrin
```

```python
from syrin import Agent, Budget, Model
from syrin.enums import ExceedPolicy

class Analyst(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    budget = Budget(max_cost=0.10, exceed_policy=ExceedPolicy.STOP)
    system_prompt = "You are a precise financial analyst."

result = Analyst().run("Summarise Q3 revenue trends")
print(result.content)
print(f"Cost: ${result.cost:.6f}")
print(f"Tokens: {result.tokens.total_tokens}")
print(f"Remaining: ${result.budget_remaining:.4f}")
```

```text
Q3 revenue grew 14% YoY, driven by enterprise deals (+22%) offsetting
consumer softness (-3%). Gross margin held at 71%...
Cost: $0.000312
Tokens: 284
Remaining: $0.0997
```
The agent hard-stops at $0.10. No surprise invoices. No extra code.
Every other library treats cost as a logging concern. Syrin treats it as a runtime constraint. The agent checks its budget before every LLM call.
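Conceptually, a pre-call gate estimates the next call's cost and refuses to make the call when the estimate would bust the cap, then replaces the estimate with billed actuals afterwards. A minimal sketch of the pattern (illustrative names, not Syrin's internals):

```python
# Illustrative sketch of a pre-call budget gate. Not Syrin's internals;
# class and method names here are our own.
class BudgetExceeded(RuntimeError):
    pass

class BudgetGate:
    def __init__(self, max_cost: float):
        self.max_cost = max_cost
        self.spent = 0.0

    def check(self, est_tokens: int, price_per_1k: float) -> None:
        """Raise *before* the call if the estimate would bust the cap."""
        estimate = est_tokens / 1000 * price_per_1k
        if self.spent + estimate > self.max_cost:
            raise BudgetExceeded(
                f"estimated ${estimate:.4f} exceeds remaining "
                f"${self.max_cost - self.spent:.4f}"
            )

    def record(self, actual_cost: float) -> None:
        """Post-call: replace the estimate with billed actuals."""
        self.spent += actual_cost

gate = BudgetGate(max_cost=0.10)
gate.check(est_tokens=500, price_per_1k=0.00015)   # fine: ~$0.000075 estimated
gate.record(0.000312)                              # actual spend from the API
```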
```python
from syrin import Agent, Budget, Model, RateLimit
from syrin.enums import ExceedPolicy
from syrin.budget import BudgetThreshold

class ProductionAgent(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    budget = Budget(
        max_cost=1.00,                       # Hard cap per run
        reserve=0.10,                        # Hold back for the final reply
        exceed_policy=ExceedPolicy.STOP,     # STOP | WARN | IGNORE | SWITCH
        rate_limits=RateLimit(
            hour=10.00,                      # $10/hour ceiling
            day=100.00,                      # $100/day ceiling
            month=2000.00,                   # $2,000/month ceiling
        ),
        thresholds=[
            BudgetThreshold(at=80, action=lambda ctx: alert_ops(ctx)),
        ],
    )
```

Pre-call estimation, post-call actuals, threshold callbacks, rate-window enforcement — all declarative, zero boilerplate. The $47K runaway-agent incident? A `Budget(max_cost=50)` would have been a $50 error.
When your agent generates and runs code, it should not run in the same process as your application. Sandbox spawns a fresh subprocess for every execution — isolated memory, hard timeout, no shared state.
```python
import asyncio
from syrin.sandbox import Sandbox

async def main():
    async with Sandbox(python=True, bash=True, timeout=30, memory_mb=256) as sb:
        # Execute Python — in a fresh isolated subprocess
        result = await sb.exec_python("""
import statistics
data = [12, 45, 7, 89, 23, 56, 34, 78]
print(f"mean={statistics.mean(data):.1f} p95={sorted(data)[int(0.95 * (len(data) - 1))]}")
""")
        print(result.stdout)       # mean=43.0 p95=78
        print(result.exit_code)    # 0
        print(f"{result.duration_ms:.0f}ms")

        # Execute bash — same isolation, shell tooling available
        result = await sb.exec_bash("""
echo "Disk usage:" && du -sh /tmp
find /tmp -name "*.log" -mmin -60 | wc -l
""")
        print(result.stdout)

        # Write a file, let Python read it back
        await sb.write("data.csv", "name,score\nalice,95\nbob,87\ncarol,92\n")
        result = await sb.exec_python("""
import csv, os
with open(os.environ["SANDBOX_WORKSPACE"] + "/data.csv") as f:
    rows = list(csv.DictReader(f))
print(f"{len(rows)} rows, avg score={sum(int(r['score']) for r in rows)/len(rows):.1f}")
""")
        print(result.stdout)       # 3 rows, avg score=91.3

        # Read a file the agent wrote back out
        summary = await sb.read("data.csv")
        print(summary.decode()[:40])

asyncio.run(main())
```

```text
mean=43.0 p95=78
Disk usage:
12K	/tmp
0
3 rows, avg score=91.3
name,score
alice,95
bob,87
carol,92
```
Pre-install packages once and reuse them across all exec calls:
```python
sb = Sandbox(packages=["pandas", "matplotlib"], timeout=60)
result = await sb.exec_python("import pandas; print(pandas.__version__)")
# pandas is installed once before the first exec call, then cached
```

No external dependencies — the PROCESS backend uses only the Python standard library.
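The idea behind process-level isolation can be sketched with nothing but the standard library: run the code under a fresh interpreter in a throwaway workspace with a hard timeout. This is an illustration of the concept, not Syrin's PROCESS backend itself:

```python
import os
import subprocess
import sys
import tempfile

# Illustrative: process-level isolation using only the standard library.
# A sketch of the idea, not Syrin's PROCESS backend.
def run_isolated(code: str, timeout: float = 30.0) -> tuple[int, str]:
    """Run Python code in a fresh subprocess with its own throwaway workspace."""
    with tempfile.TemporaryDirectory() as workspace:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,                     # hard wall-clock limit
            cwd=workspace,                       # isolated working directory
            env={
                "PATH": os.environ.get("PATH", ""),
                "SANDBOX_WORKSPACE": workspace,  # mirrors Syrin's env var
            },
        )
    return proc.returncode, proc.stdout

rc, out = run_isolated("print(2 + 2)")
print(rc, out.strip())   # 0 4
```

The subprocess shares no memory with the parent, the workspace is deleted on exit, and a runaway script is killed by the timeout rather than hanging your application.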
```python
from syrin import Agent, Model
from syrin.enums import MemoryType

agent = Agent(model=Model.OpenAI("gpt-4o-mini", api_key="sk-..."))

# Persist across sessions
agent.remember("User is a TypeScript engineer at a fintech startup", memory_type=MemoryType.FACTS)
agent.remember("Prefers concise bullet-point answers", memory_type=MemoryType.FACTS)

# Semantic recall — top-k by relevance
memories = agent.recall("user preferences", limit=5)

# Forget outdated facts
agent.forget("previous role title")
```

| Type | What it stores |
|---|---|
| `FACTS` | Identity, preferences, persistent user facts |
| `HISTORY` | Past events and conversation summaries |
| `KNOWLEDGE` | General knowledge — ideal for vector/semantic search |
| `INSTRUCTIONS` | Skills, workflows, how-to procedures |
Swap backends with one line:
```python
from syrin.memory import Memory, PostgresConfig, QdrantConfig
from syrin.enums import MemoryBackend

# SQLite — zero config, single-process production
Memory(backend=MemoryBackend.SQLITE, path="~/.syrin/memory.db")

# Qdrant — semantic search at scale
Memory(backend=MemoryBackend.QDRANT, qdrant=QdrantConfig(url="...", api_key="..."))

# PostgreSQL — multi-agent shared store, pgvector
Memory(backend=MemoryBackend.POSTGRES, postgres=PostgresConfig(...))
```

Five swarm topologies — one class:
```python
from syrin.swarm import Swarm, SwarmConfig, BudgetPool
from syrin.enums import SwarmTopology

pool = BudgetPool(total=5.00)   # $5 shared; no agent exceeds its slice

swarm = Swarm(
    agents=[Researcher, FactChecker, Writer],
    config=SwarmConfig(
        topology=SwarmTopology.ORCHESTRATOR,
        budget_pool=pool,
    ),
)

result = swarm.run("Research and write a report on battery technology trends")
print(result.cost_breakdown)   # per-agent cost
print(result.budget_report)    # pool utilisation
```

| Topology | Behaviour |
|---|---|
| `ORCHESTRATOR` | First agent routes tasks to the rest dynamically |
| `PARALLEL` | All agents run concurrently; results merged |
| `CONSENSUS` | Multiple agents vote; winner selected by strategy |
| `REFLECTION` | Producer–critic loop until quality threshold met |
| `WORKFLOW` | Sequential, parallel, branch, and dynamic fan-out steps |
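For a sense of what a CONSENSUS strategy reduces to, here is a majority-vote sketch (our own helper, not Syrin's implementation; Syrin's selection strategies may differ):

```python
from collections import Counter

# Illustrative majority-vote strategy for a CONSENSUS-style topology.
def majority_vote(answers: list[str]) -> str:
    """Pick the answer most agents agreed on; ties fall to the first seen."""
    return Counter(answers).most_common(1)[0][0]

votes = ["lithium-ion", "solid-state", "lithium-ion"]
print(majority_vote(votes))   # lithium-ion
```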
Recursive decomposition — agents that spawn agents:
```python
from syrin import Agent, Budget, BudgetSplit, Model, Spawn
from syrin.sandbox import Sandbox

class DataAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    sandbox = Sandbox(python=True, bash=True, timeout=30)
    system_prompt = "Analyse data using code. Return concise summaries only."

class SummaryAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    system_prompt = "Summarise findings into bullet points."

class Orchestrator(Agent):
    model = Model.OpenAI("gpt-4o")
    budget = Budget(max_cost=0.50)
    agents = [DataAgent, SummaryAgent]   # RLMLoop auto-wired
    spawn = Spawn(
        max_depth=2,
        budget_split=BudgetSplit.EQUAL,  # divide budget evenly across children
        child_timeout=60.0,
    )
    sandbox = Sandbox(python=True, bash=True)  # propagated to all children

# inside an async context:
result = await Orchestrator().arun("Analyse the sales CSV and write an executive summary")
```

The orchestrator spawns specialist sub-agents at runtime. No raw file bytes enter the LLM context — all heavy processing happens inside the sandbox.
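`BudgetSplit.EQUAL` amounts to dividing what remains of the parent's budget evenly across children. A sketch of the arithmetic (illustrative names; the `reserve` parameter is our assumption, mirroring `Budget(reserve=...)`):

```python
# Illustrative sketch of an EQUAL budget split across spawned children.
# Names are ours, not Syrin's internals.
def split_equal(remaining: float, n_children: int, reserve: float = 0.0) -> float:
    """Each child gets an even share of what's left after the parent's reserve."""
    if n_children < 1:
        raise ValueError("need at least one child")
    return max(remaining - reserve, 0.0) / n_children

# Parent has $0.50 left, keeps $0.10 back, spawns two specialists:
print(f"${split_equal(0.50, 2, reserve=0.10):.2f} per child")   # $0.20 per child
```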
Every LLM call, tool invocation, budget event, memory operation, sandbox execution, and agent spawn fires a typed hook. No monkey-patching. No log parsing.
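At bottom, a typed hook system is a callback registry keyed by an enum. A minimal sketch of the pattern (not Syrin's implementation; the `Hook` members here are a tiny illustrative subset):

```python
from collections import defaultdict
from enum import Enum
from typing import Any, Callable

class Hook(Enum):                  # illustrative subset of event names
    AGENT_RUN_END = "agent_run_end"
    TOOL_CALL_END = "tool_call_end"

class Events:
    def __init__(self) -> None:
        self._listeners: dict[Hook, list[Callable[[Any], None]]] = defaultdict(list)

    def on(self, hook: Hook, fn: Callable[[Any], None]) -> None:
        """Register a listener for one typed event."""
        self._listeners[hook].append(fn)

    def emit(self, hook: Hook, ctx: Any) -> None:
        """Fire every listener registered for this event."""
        for fn in self._listeners[hook]:
            fn(ctx)

seen = []
events = Events()
events.on(Hook.AGENT_RUN_END, lambda ctx: seen.append(ctx["cost"]))
events.emit(Hook.AGENT_RUN_END, {"cost": 0.0003})
print(seen)   # [0.0003]
```

Because every listener is keyed by an enum member rather than a string, a typo in an event name fails at import time instead of silently never firing.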
```python
from syrin.enums import Hook

agent.events.on(Hook.AGENT_RUN_END, lambda ctx: metrics.record(ctx.cost, ctx.tokens))
agent.events.on(Hook.BUDGET_THRESHOLD, lambda ctx: pagerduty.alert(f"Budget at {ctx.percentage}%"))
agent.events.on(Hook.TOOL_CALL_END, lambda ctx: logger.info(f"Tool {ctx.name} → {ctx.duration_ms}ms"))
agent.events.on(Hook.MEMORY_RECALL, lambda ctx: trace.span("recall", memories=ctx.count))
agent.events.on(Hook.SANDBOX_EXEC_END, lambda ctx: print(f"Sandbox {ctx.language} exit={ctx.exit_code} {ctx.duration_ms:.0f}ms"))
agent.events.on(Hook.RLM_SPAWN, lambda ctx: print(f"Spawned {ctx.agent} at depth {ctx.depth}"))
agent.events.on(Hook.RLM_BUDGET_SPLIT, lambda ctx: print(f"Child budget: ${ctx.child_budget:.4f}"))
```

Or enable full tracing with one flag — no code changes:
```bash
python my_agent.py --trace
```

```python
from syrin import Agent, Model
from syrin.guardrails import PIIGuardrail, LengthGuardrail, GuardrailChain
from syrin.enums import GuardrailMode

class SafeAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    guardrails = GuardrailChain([
        PIIGuardrail(redact=True, mode=GuardrailMode.PRE_CALL),  # scrub input
        LengthGuardrail(max_length=4000),                        # cap output
    ])

result = SafeAgent().run("Process: call me at 555-123-4567")
print(result.content)
# "call me at ***-***-****"
```

Per-agent resource limits cover timeouts, step caps, context ceilings, and degrade policies:

```python
from syrin import Agent, Model
from syrin.resource import Resource, ResourceThreshold, DegradePolicy
from syrin.enums import OnExceed, RestoreWhen

class ManagedAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    resource = Resource(
        timeout=120,             # seconds per run
        max_steps=20,            # max LLM iterations
        max_context=50_000,      # token context ceiling
        on_exceed=OnExceed.STOP,
        thresholds=[
            ResourceThreshold(dimension="steps", at=80,
                              action=lambda ctx: logger.warning("Step limit near")),
        ],
        degrade=DegradePolicy(
            tool_to_disable="web_search",   # disable expensive tool at limit
            restore_when=RestoreWhen.NEVER,
        ),
    )
```

Serve any agent as an HTTP API:

```python
agent.serve(port=8000, enable_playground=True)
# → POST /chat   POST /stream   GET /playground   GET /health
```

Crash-proof checkpoints:
```python
from syrin.checkpoint import CheckpointConfig

agent = Agent(
    model=model,
    checkpoint_config=CheckpointConfig(dir="/tmp/checkpoints", auto_save=True),
)

agent.run("Begin long analysis...")

# Crash? Resume exactly where it left off:
agent.load_checkpoint("analysis-run-1")
```

Event-driven triggers:
```python
from syrin.watch import CronProtocol, WebhookProtocol

agent.watch(CronProtocol(cron="0 9 * * *"), task="Send morning briefing")
agent.watch(WebhookProtocol(path="/events"), task="Process incoming event")
```

Process large files without loading raw bytes into the LLM context:
```python
import asyncio
from syrin import Agent, Budget, BudgetSplit, Model, Spawn
from syrin.sandbox import Sandbox
from syrin.tool import tool

class IngestionAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini")
    sandbox = Sandbox(bash=True, python=True, timeout=30)
    system_prompt = "Generate and validate data using bash and Python."

    @tool
    async def generate_dataset(self) -> str:
        """Generate a 10,000-line synthetic log file."""
        result = await self.sandbox.exec_bash("""
python3 -c '
import random
for _ in range(10000):
    ip = ".".join(str(random.randint(1, 254)) for _ in range(4))
    code = random.choice([200] * 5 + [404, 500])
    path = random.choice(["users", "orders"])
    print(f"{ip} GET /api/{path} {code}")
' > "$SANDBOX_WORKSPACE/access.log"
wc -l "$SANDBOX_WORKSPACE/access.log"
""")
        return result.stdout.strip()

    @tool
    async def analyze_log(self) -> str:
        """Parse the log and compute the error rate."""
        result = await self.sandbox.exec_python("""
import collections, os, pathlib
log = pathlib.Path(os.environ["SANDBOX_WORKSPACE"]) / "access.log"
codes = collections.Counter(line.split()[3] for line in log.open())
total = sum(codes.values())
print(f"Total: {total:,}  Errors: {codes['500']:,}  Rate: {codes['500']/total:.1%}")
""")
        return result.stdout.strip()

class Orchestrator(Agent):
    model = Model.OpenAI("gpt-4o")
    budget = Budget(max_cost=0.10)
    agents = [IngestionAgent]
    spawn = Spawn(max_depth=2, budget_split=BudgetSplit.EQUAL)
    sandbox = Sandbox(bash=True, python=True)
```

```text
Total: 10,000  Errors: 1,423  Rate: 14.2%
```
```python
from syrin import Agent, Budget, Model
from syrin.guardrails import PIIGuardrail
from syrin.enums import MemoryType

class SupportAgent(Agent):
    model = Model.OpenAI("gpt-4o-mini", api_key="sk-...")
    budget = Budget(max_cost=0.05)
    guardrails = [PIIGuardrail(redact=True)]
    system_prompt = "You are a helpful customer support agent."

class EscalationAgent(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    system_prompt = "Handle escalated cases requiring senior judgment."

agent = SupportAgent()
agent.remember(f"Customer {user_id}: premium plan, joined 2023", memory_type=MemoryType.FACTS)

result = agent.run(user_message)
if result.confidence < 0.6:
    result = agent.handoff(EscalationAgent, context=result.content)
```

```python
from syrin import Agent, Budget, Model
from syrin.knowledge import Knowledge
from syrin.enums import KnowledgeBackend
from pydantic import BaseModel

class ContractRisk(BaseModel):
    risk_level: str              # low | medium | high | critical
    key_clauses: list[str]
    recommended_action: str

kb = Knowledge(
    sources=["contracts/"],
    backend=KnowledgeBackend.QDRANT,
    embedding_provider="openai",
)

class ContractReviewer(Agent):
    model = Model.OpenAI("gpt-4o", api_key="sk-...")
    budget = Budget(max_cost=0.25)
    knowledge = kb
    output_type = ContractRisk

result = ContractReviewer().run("Review the indemnification clause in contract-2024-07.pdf")
risk: ContractRisk = result.output   # guaranteed typed
print(f"{risk.risk_level} — {risk.recommended_action}")
```

| Capability | Syrin | DIY / Other Frameworks |
|---|---|---|
| Hard budget enforcement | Declarative, pre-call + post-call | Not available or manual |
| Rate windows | hour / day / month built-in | Build and persist yourself |
| Threshold callbacks | `BudgetThreshold(at=80, ...)` | Write from scratch |
| Shared budget pools | Thread-safe `BudgetPool` | Implement locking |
| Sandboxed code execution | `Sandbox(python=True, bash=True)` — zero deps | Manual subprocess plumbing |
| Memory (4 types) | Built-in, auto-managed, backend-agnostic | Manual setup |
| Multi-agent (5 topologies + RLM) | Single `Swarm` or `agents=` | Complex orchestration code |
| Lifecycle hooks | 72+ typed events | Logging + parsing |
| Live debugger | Rich TUI (Pry) | Parse log files |
| Guardrails | PII, length, content, output validation | Per-project code |
| Checkpoints | Auto-save, crash recovery | DIY |
| RAG / Knowledge | GitHub, docs, PDFs, websites | Manual indexing pipeline |
| Structured output | Guaranteed Pydantic / JSON | Parse + validate manually |
| Resource limits | `Resource(timeout, max_steps, degrade=...)` | Manual counter logic |
| Type safety | `StrEnum` everywhere, `mypy --strict` passes | String literals |
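The structured-output guarantee boils down to parsing the model's reply and validating it against the declared schema, failing loudly on mismatch. A standard-library sketch of that loop's core (Syrin itself validates through Pydantic; the helper below is our own illustration):

```python
import json

# Illustrative: validate an LLM's JSON reply against a declared schema.
# A stdlib sketch of the idea; Syrin validates via Pydantic models.
REQUIRED = {"risk_level": str, "key_clauses": list, "recommended_action": str}

def validate_contract_risk(raw: str) -> dict:
    """Parse the reply and check every declared field's presence and type."""
    data = json.loads(raw)
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"field {field!r} missing or not {typ.__name__}")
    return data

raw = '{"risk_level": "high", "key_clauses": ["indemnification"], "recommended_action": "escalate to counsel"}'
risk = validate_contract_risk(raw)
print(risk["risk_level"])   # high
```

A framework can retry the call with the validation error appended to the prompt until the reply conforms, which is what makes the typed result a guarantee rather than a hope.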
```bash
# Minimal — no LLM providers
pip install syrin

# With OpenAI
pip install "syrin[openai]"

# With Anthropic
pip install "syrin[anthropic]"

# Multi-modal — voice, documents, vector stores
pip install "syrin[voice,pdf,vector]"

# Full install
pip install "syrin[openai,anthropic,serve,vector,postgres,pdf,voice]"
```

| Guide | Description |
|---|---|
| Introduction | Why Syrin exists — the AI Harness concept |
| Quick Start | First agent in 10 minutes |
| Budget Control | Caps, rate limits, thresholds, shared pools |
| Sandbox | Isolated Python + bash execution |
| Memory | 4 types, backends, decay curves |
| Sub-agent Spawning | Recursive decomposition, budget split |
| Multi-Agent Swarms | 5 topologies, A2A messaging, budget delegation |
| Resource Limits | Per-agent timeouts, step caps, degrade policies |
| Observability & Hooks | 72+ events, tracing, Pry debugger |
| Guardrails | PII, length, content filtering, output validation |
| Serving | HTTP API, streaming, playground |
| Checkpoints | State persistence, crash recovery |
| Migration v0.11 → v0.12 | Breaking changes and upgrade guide |
| Examples | Runnable code for every use case |
| Channel | What's there |
|---|---|
| Discord | Real-time help, showcase your agents, roadmap discussion |
| Reddit — r/syrin_ai | Longer posts, tutorials, use-case deep dives |
| YouTube — @syrin_dev | Walkthroughs, feature demos, production patterns |
| GitHub Discussions | RFCs, architecture questions, feature requests |
| Website | Product overview, roadmap, changelog |
Contributions are welcome. See CONTRIBUTING.md for guidelines on setting up the dev environment, running tests, and submitting pull requests.
MIT — see LICENSE for details.
Give developers control over AI. That's the harness.
