Bitemporal, graph-backed intelligent agent memory system. Hybrid TypeScript/Python monorepo.
# TypeScript (bun workspaces)
bun install # Install all workspaces
bun run infra:up # Start all services (Observatory, Search, Tuner, DBs)
bun run infra:down # Stop infrastructure
bun run dev # Start all apps in dev mode
bun run build # Build all apps/packages
bun run test # Run Bun's native test runner
bun run typecheck # TypeScript validation
bun run lint # Biome linting
bun run format # Biome formatting
# Python apps (uv)
cd apps/search && uv sync # Install dependencies
cd apps/search && uv run pytest # Run tests
cd apps/search && uv run ruff check src tests # Lint
cd apps/search && uv run ruff format src tests # Format
cd apps/search && uv run search # Start service
cd apps/tuner && uv sync # Install tuner dependencies
cd apps/tuner && uv run tuner # Start tuner service
# OAuth verification
./scripts/verify-oauth-setup.sh # Verify local OAuth configuration
CRITICAL: All services require OAuth authentication in local development for production parity.
# First-time setup
cp .env.local.example .env # Copy environment template
bun run infra:up # Start Observatory + all services
./scripts/verify-oauth-setup.sh # Verify OAuth is working
Key Points:
- Observatory (port 6178) acts as OAuth authorization server
- All services (search, tuner, api, memory, ingestion) authenticate via token introspection (RFC 7662)
- No AUTH_ENABLED=false bypass - matches production behavior
- Default dev secrets in .env (override for production)
See docs/local-oauth-setup.md for full guide.
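As a rough illustration of the introspection step, the sketch below shows how a service might interpret an RFC 7662 response. The response field names (active, scope, exp) come from the RFC; the helper names buildIntrospectionBody and isAuthorized are hypothetical and are not taken from the Engram codebase.

```typescript
// Sketch only: field names follow RFC 7662; helper names are hypothetical.
interface IntrospectionResponse {
	active: boolean;
	scope?: string; // space-separated scope list
	exp?: number; // expiry, seconds since the Unix epoch
}

// Form-encoded body for a POST to the introspection endpoint.
function buildIntrospectionBody(token: string): string {
	return `token=${encodeURIComponent(token)}`;
}

// Decide whether an introspection result authorizes a request for one scope.
function isAuthorized(
	res: IntrospectionResponse,
	requiredScope: string,
	nowMs: number,
): boolean {
	if (!res.active) return false;
	if (res.exp !== undefined && res.exp * 1000 <= nowMs) return false;
	const granted = (res.scope ?? "").split(" ").filter(Boolean);
	return granted.includes(requiredScope);
}
```

A caller would POST the body to Observatory with its own client credentials and reject the request whenever isAuthorized returns false.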
- Formatter/Linter: Biome (tabs, double quotes, 100 char line width)
- Package Manager: bun only (never npm/yarn/pnpm)
- TypeScript: Version 7 (tsgo), strict mode, ESNext target, bundler module resolution
- Testing: Bun's native test runner with globals enabled
- Uses tsgo - native Go implementation with ~10x faster builds
- Target ESNext for latest ES2025 features (Set methods, Iterator helpers, Promise.try, etc.)
- Downlevel emit only supports ES2021+, so modern Bun runtime required
- Multi-threaded builds and parallel project compilation enabled by default
IMPORTANT: Run bun run lint and bun run typecheck before committing.
Prefer Bun's native APIs over Node.js equivalents for better performance:
| Task | Bun Native | Node.js (avoid) |
|---|---|---|
| HTTP Server | Bun.serve({ fetch }) | http.createServer() |
| WebSocket | Bun.serve({ websocket }) | ws package |
| File Read | await Bun.file(path).text() | fs.readFile() |
| File Write | await Bun.write(path, data) | fs.writeFile() |
| Hashing | new Bun.CryptoHasher("sha256") | crypto.createHash() |
| UUID | crypto.randomUUID() | crypto.randomUUID() (same) |
| Random Bytes | crypto.getRandomValues() | crypto.randomBytes() |
| Glob | new Bun.Glob(pattern) | glob/fast-glob packages |
| Module Path | import.meta.dir, import.meta.file | fileURLToPath() |
Bun.password (for future auth):
// Hash password with Argon2id
const hash = await Bun.password.hash("password", { algorithm: "argon2id" });
// Verify password
const valid = await Bun.password.verify("password", hash);
Bun.Glob (for file pattern matching):
// Create glob instance
const glob = new Bun.Glob("**/*.ts");
// Iterate matches
for await (const file of glob.scan({ cwd: "src" })) {
	console.log(file); // "index.ts", "utils/helper.ts", etc.
}
// Match against string
glob.match("src/index.ts"); // true
- Formatter/Linter: Ruff (88 char line width, Python 3.12+)
- Package Manager: uv only (never pip/poetry/pdm)
- Type Hints: Required for all function signatures
- Testing: pytest with pytest-asyncio
IMPORTANT: Run uv run ruff check and uv run pytest before committing.
apps/
├── api/ # Cloud REST API (Hono) - memory operations, OAuth 2.1 auth, rate limiting (port 6174)
├── console/ # Infrastructure Console - Next.js 16 management dashboard (port 6185)
├── ingestion/ # Event parsing pipeline, 8+ provider parsers, PII redaction (port 6175)
├── mcp/ # Engram MCP server - remember/recall/query/context tools (stdio + HTTP ingest)
├── memory/ # Graph persistence, turn aggregation, real-time pub/sub (NATS consumer)
├── observatory/ # Neural Observatory - Next.js 16 real-time session visualization (port 6178)
├── search/ # Python/FastAPI vector search - hybrid retrieval, multi-tier reranking (port 6176)
└── tuner/ # Python/FastAPI hyperparameter optimization with Optuna (port 6177)
packages/
├── benchmark/ # LongMemEval evaluation suite (Python) - MTEB/BEIR benchmarks
├── common/ # Utilities, errors, constants, testing fixtures
├── events/ # Zod event schemas (RawStreamEvent, ParsedStreamEvent)
├── graph/ # Bitemporal graph models, repositories, QueryBuilder, GraphPruner
├── infra/ # Pulumi IaC for GCP/GKE (VPC, GKE Autopilot, databases)
├── logger/ # Pino structured logging with PII redaction and lifecycle management
├── parser/ # Provider stream parsers, ThinkingExtractor, DiffExtractor, Redactor
├── storage/ # FalkorDB, NATS, PostgreSQL, Redis, GCS/blob clients
├── temporal/ # Rehydrator, TimeTravelService, ReplayEngine for time-travel
├── tsconfig/ # Shared TypeScript configuration (base.json)
├── tuner/ # TypeScript client, CLI, and trial executor for tuner service
└── vfs/ # VirtualFileSystem, NodeFileSystem, InMemoryFileSystem, PatchManager
Data Flow: External Agent → Ingestion → NATS → Memory → FalkorDB → Search → Qdrant
Storage: FalkorDB (graph), Qdrant (vectors), NATS+JetStream (events), PostgreSQL (OAuth tokens, Optuna)
Bitemporal: All nodes have vt_start/vt_end (valid time) + tt_start/tt_end (transaction time)
Key Patterns:
- See packages/storage/src/falkor.ts:1 for graph client
- See packages/graph/src/writer.ts:1 for bitemporal node creation
- See apps/search/src/search/retrieval/retriever.py:1 for hybrid search pipeline
- See apps/memory/src/aggregator.ts:1 for turn aggregation
- See packages/temporal/src/rehydrator.ts:1 for VFS time-travel
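To make the bitemporal fields concrete, here is a minimal sketch of an "as of" visibility check. It assumes epoch-millisecond timestamps, half-open intervals, and a large sentinel for intervals that are still open; the Bitemporal interface, OPEN_END, and visibleAt are illustrative names, not the @engram/graph API.

```typescript
// Sketch only: illustrative types, not the @engram/graph models.
interface Bitemporal {
	vt_start: number; // valid time: when the fact became true
	vt_end: number; // valid time: when it stopped being true
	tt_start: number; // transaction time: when the record was written
	tt_end: number; // transaction time: when the record was superseded
}

// Assumption: an open interval is encoded with a very large end value.
const OPEN_END = Number.MAX_SAFE_INTEGER;

// A record is visible for (validAt, knownAt) when both half-open
// intervals [start, end) contain the respective query point.
function visibleAt(node: Bitemporal, validAt: number, knownAt: number): boolean {
	const validOk = node.vt_start <= validAt && validAt < node.vt_end;
	const knownOk = node.tt_start <= knownAt && knownAt < node.tt_end;
	return validOk && knownOk;
}
```

Querying with knownAt set to a past instant gives the graph as it was recorded then, while validAt selects which facts were true at a given time.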
Parsers in packages/parser/src/providers/: Anthropic, OpenAI, Gemini, Claude Code, Cline, Codex, XAI, OpenCode
Registry Aliases: claude → anthropic, gpt/gpt-4 → openai, grok → xai, claude-code → claude_code
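The alias mapping can be pictured as a simple lookup that falls through to the canonical name. This is a sketch under that assumption; ALIASES and resolveProvider are hypothetical names, and the real registry in packages/parser/src/registry.ts may normalize names differently.

```typescript
// Sketch only: mirrors the alias list above; not the real registry code.
const ALIASES: Record<string, string> = {
	claude: "anthropic",
	gpt: "openai",
	"gpt-4": "openai",
	grok: "xai",
	"claude-code": "claude_code",
};

// Resolve an alias to its canonical provider key; unknown names pass through.
// Lowercasing and trimming are assumptions made for this sketch.
function resolveProvider(name: string): string {
	const key = name.trim().toLowerCase();
	return ALIASES[key] ?? key;
}
```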
| Tool | Purpose | When to Use Proactively |
|---|---|---|
| remember | Persist valuable information to long-term memory | When learning user preferences, architectural decisions, project conventions, debugging insights |
| recall | Search memories using semantic similarity | At session start to prime with prior knowledge; before making decisions to check for existing rationale |
| context | Assemble comprehensive context (memories + decisions + file history) | At the START of complex tasks; more thorough than recall alone |
| query | Execute read-only Cypher queries (local mode) | When semantic search can't handle complex lookups (date ranges, relationships, counts) |
| summarize | Condense text using client LLM | Before storing memories; to compress verbose logs or context |
| extract_facts | Parse unstructured text into atomic facts | Before remember when processing docs, logs, or chat history |
| enrich_memory | Auto-generate summary, keywords, category | Before remember - use output to set type and tags |
Resources (local mode): memory://{id}, session://{id}/transcript, file-history://{path}
Prompts (local mode): /e prime (initialize session), /e recap (review past session), /e why (find past decisions)
Sampling-Required Tools: summarize, extract_facts, and enrich_memory require the MCP client to support sampling capability (server requesting LLM completions from the client). If unsupported, these tools return available: false gracefully.
| Mode | Use Case | Auth |
|---|---|---|
| stdio (default) | CLI usage, local development | None needed |
| http | Remote access, cloud deployment | OAuth 2.1 bearer tokens |
HTTP Transport Configuration:
# Required for HTTP transport
MCP_TRANSPORT=http
MCP_HTTP_PORT=3010
# OAuth configuration
ENGRAM_AUTH_SERVER_URL=https://observatory.engram.rawcontext.com
ENGRAM_MCP_SERVER_URL=https://mcp.engram.rawcontext.com
ENGRAM_MCP_CLIENT_ID=engram-mcp-server
ENGRAM_MCP_CLIENT_SECRET=<secret>
# Session settings
SESSION_TTL_SECONDS=3600
MAX_SESSIONS_PER_USER=10
OAuth Endpoints (served by MCP server):
- GET /.well-known/oauth-protected-resource - RFC 9728 protected resource metadata
- GET /.well-known/oauth-authorization-server - RFC 8414 (proxied from Observatory)
OAuth Endpoints (served by Observatory):
- POST /api/auth/introspect - RFC 7662 token introspection
- POST /api/auth/device/token - Device flow token exchange
- GET /.well-known/oauth-authorization-server - RFC 8414 auth server metadata
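For orientation, the protected resource metadata document could look roughly like the object built below. The field names are defined by RFC 9728; the function name is hypothetical, and the advertised scope list is an assumption borrowed from the Cloud API scopes rather than something the MCP server is confirmed to publish.

```typescript
// Sketch only: field names per RFC 9728; values are illustrative.
interface ProtectedResourceMetadata {
	resource: string;
	authorization_servers: string[];
	scopes_supported?: string[];
	bearer_methods_supported?: string[];
}

// Build the document from the two URLs configured for HTTP transport
// (ENGRAM_MCP_SERVER_URL and ENGRAM_AUTH_SERVER_URL).
function buildResourceMetadata(
	mcpServerUrl: string,
	authServerUrl: string,
): ProtectedResourceMetadata {
	return {
		resource: mcpServerUrl,
		authorization_servers: [authServerUrl],
		// Assumption: scopes mirror those used by the Cloud API.
		scopes_supported: ["memory:read", "memory:write", "query:read"],
		bearer_methods_supported: ["header"],
	};
}
```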
Engram uses OAuth 2.1 for all authentication. Legacy API keys have been deprecated.
| Flow | Use Case | Grant Type | RFC |
|---|---|---|---|
| Device Flow | User authentication (MCP clients) | urn:ietf:params:oauth:grant-type:device_code | RFC 8628 |
| Client Credentials | Machine-to-machine (M2M) | client_credentials | RFC 6749 §4.4 |
Device Flow: User authenticates via Observatory web UI, MCP client polls for tokens.
Client Credentials: Services authenticate with client ID/secret, receive access token (no refresh token per spec).
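The polling half of the device flow can be sketched as a pure decision function. The error codes (authorization_pending, slow_down) and the 5-second back-off come from RFC 8628; the PollAction type and nextPollAction name are hypothetical and do not describe the Engram MCP client.

```typescript
// Sketch only: decides what a device-flow client does after each poll.
type PollAction =
	| { kind: "done"; accessToken: string }
	| { kind: "retry"; intervalSeconds: number }
	| { kind: "fail"; reason: string };

interface TokenResponse {
	access_token?: string;
	error?: string;
}

function nextPollAction(res: TokenResponse, intervalSeconds: number): PollAction {
	if (res.access_token) return { kind: "done", accessToken: res.access_token };
	switch (res.error) {
		case "authorization_pending":
			// User has not finished authenticating yet; keep the same interval.
			return { kind: "retry", intervalSeconds };
		case "slow_down":
			// RFC 8628: increase the polling interval by 5 seconds.
			return { kind: "retry", intervalSeconds: intervalSeconds + 5 };
		default:
			// access_denied, expired_token, or anything unexpected ends polling.
			return { kind: "fail", reason: res.error ?? "unknown_error" };
	}
}
```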
Engram uses prefixed tokens with CRC32 checksums for secret scanning compatibility:
| Token Type | Format | Example | Flow |
|---|---|---|---|
| User Access | egm_oauth_{random32}_{crc6} | egm_oauth_a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4_X7kM2p | Device Flow |
| Refresh Token | egm_refresh_{random32}_{crc6} | egm_refresh_a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4_Y8nL3q | Device Flow |
| Client Token | egm_client_{random32}_{crc6} | egm_client_a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4_Z9mN4r | Client Credentials |
Format breakdown:
- egm: Engram company identifier (3 chars)
- oauth/refresh/client: Token type identifier
- random32: 32 hex characters (128 bits of entropy)
- crc6: 6 Base62 characters (CRC32 checksum for offline validation)
The CRC32 checksum enables offline token validation, reducing false positives in secret scanning to near zero. Design inspired by GitHub's token format.
Validation: Use validateTokenChecksum() from apps/observatory/lib/device-auth.ts for offline validation.
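The sketch below illustrates how a checksum of this shape can be verified offline. It is not the validateTokenChecksum() implementation: the Base62 alphabet order, the zero-padding, and the choice to checksum everything before the final underscore are all assumptions, so the sample tokens in the table above will not necessarily pass it.

```typescript
// Sketch only: standard CRC-32 (reflected, polynomial 0xEDB88320) over ASCII input.
function crc32(input: string): number {
	let crc = 0xffffffff;
	for (let i = 0; i < input.length; i++) {
		crc ^= input.charCodeAt(i);
		for (let bit = 0; bit < 8; bit++) {
			crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
		}
	}
	return (crc ^ 0xffffffff) >>> 0;
}

// Assumption: digits, then uppercase, then lowercase; left-padded with "0".
const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

function toBase62(value: number, width: number): string {
	let out = "";
	let n = value;
	while (n > 0) {
		out = BASE62.charAt(n % 62) + out;
		n = Math.floor(n / 62);
	}
	return out.padStart(width, "0");
}

// Hypothetical helper: append a checksum to "egm_{type}_{random32}".
function makeToken(type: "oauth" | "refresh" | "client", random32: string): string {
	const payload = `egm_${type}_${random32}`;
	return `${payload}_${toBase62(crc32(payload), 6)}`;
}

// Hypothetical helper: check shape and checksum without any network call.
function hasValidChecksum(token: string): boolean {
	const match = /^(egm_(?:oauth|refresh|client)_[0-9a-f]{32})_([0-9A-Za-z]{6})$/.exec(token);
	if (!match) return false;
	const [, payload, checksum] = match;
	return payload !== undefined && toBase62(crc32(payload), 6) === checksum;
}
```

Because the check is purely local, a secret scanner can discard strings that merely resemble a token before making any request.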
Both user and client tokens support Demonstrating Proof-of-Possession (DPoP) for enhanced security:
- Client generates ephemeral key pair, includes JWK thumbprint (jkt) in token request
- All API requests include DPoP header with signed proof JWT
- Server validates proof matches token's bound key, preventing token theft/replay
Token Type: Returns DPoP instead of Bearer when DPoP is used.
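As a sketch of what a DPoP proof carries, the function below assembles the unsigned JWT header and claims defined by RFC 9449 (typ, alg, jwk, jti, htm, htu, iat). Signing with the ephemeral private key is deliberately omitted, ES256 is an assumed algorithm, and buildDpopProofParts is a hypothetical name.

```typescript
// Sketch only: builds the unsigned parts of a DPoP proof JWT (RFC 9449).
interface DpopHeader {
	typ: "dpop+jwt";
	alg: string;
	jwk: Record<string, string>; // public key only
}

interface DpopClaims {
	jti: string; // unique per proof, to prevent replay
	htm: string; // HTTP method of the request
	htu: string; // request URI without query or fragment
	iat: number; // issued-at, seconds since the Unix epoch
}

function buildDpopProofParts(
	publicJwk: Record<string, string>,
	method: string,
	url: string,
	jti: string,
	nowMs: number,
): { header: DpopHeader; claims: DpopClaims } {
	const htu = url.split(/[?#]/)[0] ?? url;
	return {
		header: { typ: "dpop+jwt", alg: "ES256", jwk: publicJwk }, // alg is an assumption
		claims: { jti, htm: method.toUpperCase(), htu, iat: Math.floor(nowMs / 1000) },
	};
}
```

The header and claims would then be base64url-encoded, signed, and sent in the DPoP request header.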
| Token Type | Lifetime | Refreshable |
|---|---|---|
| User Access | 7 days | Yes (via refresh token) |
| User Refresh | 30 days | Yes (rotates on use) |
| Client Access | 1 hour | No (request new token) |
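The lifetimes in the table translate directly into expiry arithmetic. The sketch below is illustrative only: the key names, the helper functions, and the 60-second renewal skew are assumptions, not values from the Engram code.

```typescript
// Sketch only: lifetimes copied from the table above, in milliseconds.
const DAY_MS = 24 * 60 * 60 * 1000;

const LIFETIME_MS = {
	user_access: 7 * DAY_MS,
	user_refresh: 30 * DAY_MS,
	client_access: 60 * 60 * 1000,
} as const;

type TokenKind = keyof typeof LIFETIME_MS;

function expiresAt(kind: TokenKind, issuedAtMs: number): number {
	return issuedAtMs + LIFETIME_MS[kind];
}

// True once the token is within skewMs of expiring (assumed default: 60 s).
function needsRenewal(
	kind: TokenKind,
	issuedAtMs: number,
	nowMs: number,
	skewMs = 60_000,
): boolean {
	return nowMs >= expiresAt(kind, issuedAtMs) - skewMs;
}
```

For a client access token, renewal means requesting a new token with the client credentials grant, since no refresh token is issued.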
- Token Generation: apps/observatory/lib/device-auth.ts
- Type Definitions: packages/common/src/types/auth.ts
- Auth Middleware: apps/api/src/middleware/auth.ts, apps/ingestion/src/auth.ts, apps/search/src/middleware/auth.py
- Starting any non-trivial task → context(task)
- Making architectural or design decisions → recall("decisions about X", type='decision')
- Working on files modified in previous sessions → include in context(task, files=[...])
- When user says "remember", "before", "last time", "we decided" → recall(query)
- User expresses a preference ("I prefer...", "always use...", "never...") → type: 'preference'
- You make an architectural decision with rationale → type: 'decision'
- You discover something non-obvious while debugging → type: 'insight'
- You learn a project convention or pattern → type: 'fact'
- Transient status ("working on X", "about to...")
- Obvious facts already in code comments
- Temporary workarounds without noting they're temporary
- Duplicate information already stored
| Trigger | Action |
|---|---|
| Starting a task | context(task, files, depth='medium') |
| User says "remember when..." | recall(query, filters={type: 'turn'}) |
| Before making a decision | recall("decisions about X", filters={type: 'decision'}) |
| Learn user preference | remember(content, type='preference', tags=[...]) |
| Make architectural choice | remember(content, type='decision', tags=[...]) |
| Debug discovery | remember(content, type='insight', tags=[...]) |
| Processing verbose content | extract_facts(text) → remember each fact |
- Memory: content, type, tags, project, vt_start, vt_end
- Session: id, agent_type, working_dir, summary
- Turn: user_content, assistant_preview, files_touched, tool_calls_count
- FileTouch: file_path, action (read/edit/create/delete)
Example: MATCH (m:Memory {type: 'decision'}) WHERE m.vt_end > $now RETURN m.content ORDER BY m.vt_start DESC LIMIT 5
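The $now placeholder in the example is a query parameter, so the caller supplies the current time alongside the Cypher text. A minimal sketch of pairing the two follows; the cypher and params field names and the recentDecisionsQuery helper are assumptions, not the query tool's documented input schema.

```typescript
// Sketch only: pairs the example query with its $now parameter.
interface CypherQuery {
	cypher: string;
	params: Record<string, number | string>;
}

function recentDecisionsQuery(nowMs: number): CypherQuery {
	return {
		cypher: [
			"MATCH (m:Memory {type: 'decision'})",
			"WHERE m.vt_end > $now",
			"RETURN m.content",
			"ORDER BY m.vt_start DESC",
			"LIMIT 5",
		].join(" "),
		// Assumption: timestamps are epoch milliseconds, as in the writer example.
		params: { now: nowMs },
	};
}
```

Passing the time as a parameter rather than interpolating it keeps the query text constant across calls.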
| Endpoint | Method | Purpose | Scope |
|---|---|---|---|
| /v1/health | GET | Health check | Public |
| /v1/memory/remember | POST | Store memory with deduplication | memory:write |
| /v1/memory/recall | POST | Hybrid search with reranking | memory:read |
| /v1/memory/query | POST | Read-only Cypher queries | query:read |
| /v1/memory/context | POST | Comprehensive context assembly | memory:read |
Authentication: All endpoints (except /v1/health) require OAuth 2.1 bearer token with appropriate scopes.
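A call to one of these endpoints might be assembled as in the sketch below. The path and the bearer scheme come from the table and the authentication note; the JSON body shape ({ query }) and the buildRecallRequest helper are assumptions for illustration.

```typescript
// Sketch only: describes a recall request without sending it.
interface HttpRequestSpec {
	url: string;
	method: "POST";
	headers: Record<string, string>;
	body: string;
}

function buildRecallRequest(
	baseUrl: string,
	accessToken: string,
	query: string,
): HttpRequestSpec {
	return {
		url: `${baseUrl.replace(/\/+$/, "")}/v1/memory/recall`,
		method: "POST",
		headers: {
			// The token needs the memory:read scope for this endpoint.
			Authorization: `Bearer ${accessToken}`,
			"Content-Type": "application/json",
		},
		body: JSON.stringify({ query }), // assumption: request body shape
	};
}
```

The resulting object maps directly onto fetch(spec.url, spec); for local development the base URL would be http://localhost:6174.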
All endpoints are prefixed with /v1/search:
| Endpoint | Method | Purpose |
|---|---|---|
| /v1/search/health | GET | Health check with Qdrant status |
| /v1/search/ready | GET | Kubernetes readiness probe |
| /v1/search/metrics | GET | Prometheus metrics |
| /v1/search/query | POST | Hybrid search with strategy (dense/sparse/hybrid) and reranking |
| /v1/search/multi-query | POST | LLM-driven query expansion (DMQR-RAG) |
| /v1/search/session-aware | POST | Two-stage hierarchical retrieval across sessions |
| /v1/search/embed | POST | Generate embeddings for external use |
Reranker Tiers: fast (FlashRank ~10ms), accurate (BGE cross-encoder ~50ms), code (Jina ~50ms), colbert (late interaction ~30ms), llm (gemini-3-flash-preview ~500ms)
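One way to use these figures is to pick a tier against a latency budget. The sketch below takes the approximate latencies from the list above; the fallback policy (drop to fast, then skip reranking) and the pickTier name are assumptions, not behavior of the search service.

```typescript
// Sketch only: approximate latencies copied from the tier list above.
type Tier = "fast" | "accurate" | "code" | "colbert" | "llm";

const TYPICAL_LATENCY_MS: Record<Tier, number> = {
	fast: 10,
	colbert: 30,
	accurate: 50,
	code: 50,
	llm: 500,
};

// Returns the preferred tier if it fits the budget, otherwise "fast",
// otherwise null to signal that reranking should be skipped.
function pickTier(preferred: Tier, budgetMs: number): Tier | null {
	if (TYPICAL_LATENCY_MS[preferred] <= budgetMs) return preferred;
	if (TYPICAL_LATENCY_MS.fast <= budgetMs) return "fast";
	return null;
}
```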
bun test # All tests
bun test -- --filter=@engram/graph # Single package
bun test -- --watch # Watch mode
# Python tests
cd apps/search && uv run pytest --cov=src --cov-report=html
cd apps/tuner && uv run pytest
See bunfig.toml for test configuration (timeouts, coverage, preloads).
# Local development
bun run infra:up # docker-compose.dev.yml
# Production (Pulumi)
cd packages/infra
bun run wake # Turn on GKE cluster and workloads
bun run sleep # Turn off expensive resources
bun run preview # Preview changes
bun run up # Deploy
Services: API (6174), Ingestion (6175), Search (6176), Tuner (6177), Observatory (6178), FalkorDB (6179), Qdrant (6180), NATS (6181), NATS Monitor (6182), PostgreSQL (6183), Optuna Dashboard (6184), Console (6185)
IMPORTANT: Use hf CLI, NOT the deprecated huggingface-cli.
hf auth login && hf auth whoami
hf upload <space-name> . . --repo-type space
hf download <repo-id>
YOU MUST leverage Engram for institutional knowledge. Don't start tasks blind.
At session start or when beginning significant work:
- Prime yourself - Call engram_context with a specific task description
- Check for precedent - Before architectural decisions, call engram_recall with type: 'decision'
- Review file history - If modifying files touched in past sessions, include them in context
When you learn something valuable:
- Store decisions - Architectural choices with rationale → engram_remember with type: 'decision'
- Store preferences - User preferences and conventions → type: 'preference'
- Store insights - Debugging discoveries, non-obvious learnings → type: 'insight'
This is NOT optional. Failure to use institutional memory leads to:
- Repeating past mistakes
- Contradicting previous decisions
- Missing established patterns
- Wasted user time re-explaining context
YOU MUST verify information before acting on it. Your training data becomes stale—APIs change, libraries update, best practices evolve.
Before implementing anything involving an external library or framework:
- Use Context7 MCP - ALWAYS call resolve-library-id then get-library-docs to retrieve current documentation
- Web search - Search for recent patterns, changelogs, breaking changes, and community best practices
- Cross-reference - Compare Context7 docs with web search results to catch discrepancies
This is NOT optional. Failure to ground your reasoning leads to:
- Deprecated API usage
- Security vulnerabilities from outdated patterns
- Incompatible dependency combinations
- Wasted user time debugging AI-generated hallucinations
YOU MUST:
- Context7 first - For ANY library work, call Context7 MCP before writing code
- Web search for mutations - APIs, configs, and best practices change. Search when uncertain
- Run linting before suggesting changes: bun run lint && bun run typecheck
- Preserve bitemporal fields - never remove vt_/tt_ fields from graph nodes
- Use the storage package - never create direct DB connections
YOU MUST NOT:
- Use import type for NestJS DI tokens (breaks injection)
- Create new packages without updating turbo.json
- Modify NATS subjects without updating consumers
- Skip the parser registry when adding providers
- Assume library APIs from training data—verify with Context7 + web search
- Include meta commentary about development process in code or docs (e.g., "Phase 1 of migration", "implements the plan from...", "this is a temporary solution until..."). Code should describe what it does, not its place in a roadmap.
| Purpose | Location |
|---|---|
| Biome config | /biome.json |
| Turbo tasks | /turbo.json |
| Bun test config | /bunfig.toml |
| Event schemas | /packages/events/src/schemas.ts |
| Graph models | /packages/graph/src/models/ |
| Search config (Py) | /apps/search/src/search/config.py |
| Search retriever (Py) | /apps/search/src/search/retrieval/retriever.py |
| Search embedders (Py) | /apps/search/src/search/embedders/ |
| Search rerankers (Py) | /apps/search/src/search/rerankers/ |
| Parser registry | /packages/parser/src/registry.ts |
| Rehydrator | /packages/temporal/src/rehydrator.ts |
| VFS | /packages/vfs/src/vfs.ts |
# View NATS streams
docker exec -it engram-nats-1 nats stream ls
# Query FalkorDB
docker exec -it engram-falkordb-1 redis-cli
> GRAPH.QUERY engram "MATCH (n) RETURN n LIMIT 5"
# Check Qdrant collections
curl http://localhost:6180/collections
# Search service health check
curl http://localhost:6176/v1/search/health
# Search service metrics
curl http://localhost:6176/v1/search/metrics
# Tuner service health
curl http://localhost:6177/v1/tuner/health
# Optuna Dashboard
open http://localhost:6184
Creating graph nodes (always include bitemporal fields):
// See packages/graph/src/writer.ts
await writer.writeNode("Session", {
	id: generateId(),
	vt_start: Date.now(),
	tt_start: Date.now(),
	// ... node-specific fields
});
Publishing events:
// See packages/storage/src/nats.ts
await nats.sendEvent("events.parsed", sessionId, event);
Hybrid search (Python):
# See apps/search/src/search/retrieval/retriever.py
results = await retriever.search(
    query="user question",
    strategy="hybrid",
    rerank=True,
    rerank_tier="accurate",
    limit=20,
)
Time-travel VFS reconstruction:
// See packages/temporal/src/rehydrator.ts
const rehydrator = createRehydrator();
const vfs = await rehydrator.rehydrate("session-123", 1640000000000);
const content = vfs.readFile("/src/index.ts");
Using parser registry:
// See packages/parser/src/registry.ts
import { defaultRegistry } from "@engram/parser";
const parser = defaultRegistry.get("anthropic"); // or "claude", "gpt", "xai"
const delta = parser.parse(rawEvent);