Summary
Explore and implement agentic memory capabilities within the Response API, enabling LLM agents to maintain persistent, structured, and retrievable memory across sessions and tasks.
Background
Conversation history in the Response API is currently:
- Flat: Linear sequence of messages
- Session-bound: Lost after session ends
- Unstructured: No semantic organization
- Passive: Only retrieved when explicitly referenced
Agentic memory transforms this into an active, structured, cross-session knowledge system that agents can read, write, and reason over.
Research Areas
1. Memory Types (MemGPT-inspired)
┌─────────────────────────────────────────────────────────────┐
│                      Memory Hierarchy                       │
├─────────────────────────────────────────────────────────────┤
│ Working Memory     │ Current context window (ephemeral)     │
├────────────────────┼────────────────────────────────────────┤
│ Episodic Memory    │ Past conversations & events (indexed)  │
├────────────────────┼────────────────────────────────────────┤
│ Semantic Memory    │ Facts, knowledge, preferences          │
├────────────────────┼────────────────────────────────────────┤
│ Procedural Memory  │ How-to knowledge, learned procedures   │
└────────────────────┴────────────────────────────────────────┘
2. Memory Operations
- Store: Agent explicitly stores important information
- Retrieve: Semantic search over memory
- Update: Modify existing memories
- Forget: Remove outdated/irrelevant memories
- Consolidate: Merge and summarize related memories
- Reflect: Generate insights from memory patterns
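A minimal in-process sketch of the Store, Retrieve, Update, and Forget operations listed above. It assumes embeddings are supplied by the caller and ranks by cosine similarity; `Item`, `MemStore`, and the method signatures are hypothetical and simpler than a production backend would need. Consolidate and Reflect are left out because they would need an LLM call.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// Item is a simplified memory record (hypothetical field set).
type Item struct {
	ID        string
	Content   string
	Embedding []float64
}

// MemStore is a toy in-process store; a real backend would persist data.
type MemStore struct{ items map[string]Item }

func NewMemStore() *MemStore { return &MemStore{items: map[string]Item{}} }

// Store inserts a memory, overwriting any existing entry with the same ID.
func (s *MemStore) Store(it Item) { s.items[it.ID] = it }

// Update replaces the content of an existing memory. A real system would
// also recompute the embedding here.
func (s *MemStore) Update(id, content string) error {
	it, ok := s.items[id]
	if !ok {
		return errors.New("memory not found: " + id)
	}
	it.Content = content
	s.items[id] = it
	return nil
}

// Forget removes a memory; unknown IDs are ignored.
func (s *MemStore) Forget(id string) { delete(s.items, id) }

// cosine returns the cosine similarity of a and b, or 0 if it is undefined.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Retrieve returns up to limit memories, most similar to query first.
// Ties are broken by ID so the order is deterministic.
func (s *MemStore) Retrieve(query []float64, limit int) []Item {
	out := make([]Item, 0, len(s.items))
	for _, it := range s.items {
		out = append(out, it)
	}
	sort.Slice(out, func(i, j int) bool {
		si, sj := cosine(query, out[i].Embedding), cosine(query, out[j].Embedding)
		if si != sj {
			return si > sj
		}
		return out[i].ID < out[j].ID
	})
	if limit < 0 {
		limit = 0
	}
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	s := NewMemStore()
	s.Store(Item{ID: "mem_1", Content: "Project budget is $50,000", Embedding: []float64{1, 0}})
	s.Store(Item{ID: "mem_2", Content: "User prefers weekly reports", Embedding: []float64{0, 1}})
	if err := s.Update("mem_1", "Project budget is $60,000"); err != nil {
		fmt.Println("update failed:", err)
	}
	s.Forget("mem_2")
	for _, it := range s.Retrieve([]float64{0.9, 0.1}, 10) {
		fmt.Println(it.ID, it.Content)
	}
}
```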
3. Memory-Augmented Response API
Extend Response API with memory-aware fields:
{
  "model": "gpt-4",
  "input": "What was the budget we discussed last week?",
  "memory_config": {
    "enabled": true,
    "memory_types": ["episodic", "semantic"],
    "retrieval_limit": 10,
    "auto_store": true
  },
  "memory_context": {
    "user_id": "user_123",
    "project_id": "project_abc",
    "scope": "project"
  }
}
Response includes memory operations:
{
  "id": "resp_xxx",
  "output": [...],
  "memory_operations": [
    {
      "operation": "retrieve",
      "memory_type": "episodic",
      "query": "budget discussion",
      "results": [{"id": "mem_xxx", "content": "...", "relevance": 0.92}]
    },
    {
      "operation": "store",
      "memory_type": "semantic",
      "content": "Project budget is $50,000",
      "metadata": {"confidence": 0.95, "source": "user_confirmed"}
    }
  ]
}
4. Memory Storage Architecture
┌─────────────────────────────────────────────────────────────┐
│                    Agentic Memory Store                     │
├─────────────────────────────────────────────────────────────┤
│ Vector Index (Milvus)      │ Semantic search over memories  │
├────────────────────────────┼────────────────────────────────┤
│ Document Store (Redis)     │ Full memory content + metadata │
├────────────────────────────┼────────────────────────────────┤
│ Graph Store (Neo4j?)       │ Memory relationships & links   │
├────────────────────────────┼────────────────────────────────┤
│ Time-Series Index          │ Temporal queries & decay       │
└────────────────────────────┴────────────────────────────────┘
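One way the first two layers might cooperate: the vector index returns ranked IDs and the document store hydrates them into full content. The sketch below assumes two hypothetical interfaces, `VectorIndex` and `DocumentStore`, backed by in-memory fakes. They stand in for Milvus and Redis and do not mirror either client library's API.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is one vector-search result.
type Hit struct {
	ID    string
	Score float64
}

// VectorIndex and DocumentStore are hypothetical interfaces standing in for
// the Milvus and Redis layers; they are not real client APIs.
type VectorIndex interface {
	Search(query []float64, k int) []Hit
}

type DocumentStore interface {
	Get(id string) (string, bool)
}

// bruteForceIndex is an in-memory fake that scores by dot product.
type bruteForceIndex map[string][]float64

func (ix bruteForceIndex) Search(q []float64, k int) []Hit {
	hits := make([]Hit, 0, len(ix))
	for id, v := range ix {
		var dot float64
		for i := 0; i < len(q) && i < len(v); i++ {
			dot += q[i] * v[i]
		}
		hits = append(hits, Hit{ID: id, Score: dot})
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID
	})
	if k < 0 {
		k = 0
	}
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

// mapStore is an in-memory fake document store.
type mapStore map[string]string

func (m mapStore) Get(id string) (string, bool) {
	c, ok := m[id]
	return c, ok
}

// Result pairs a memory's content with its relevance score.
type Result struct {
	ID        string
	Content   string
	Relevance float64
}

// Lookup searches the index, then hydrates each hit from the document store.
// Hits whose document is missing (for example, already forgotten) are skipped.
func Lookup(ix VectorIndex, docs DocumentStore, q []float64, k int) []Result {
	var out []Result
	for _, h := range ix.Search(q, k) {
		if content, ok := docs.Get(h.ID); ok {
			out = append(out, Result{ID: h.ID, Content: content, Relevance: h.Score})
		}
	}
	return out
}

func main() {
	ix := bruteForceIndex{"mem_1": {1, 0}, "mem_2": {0, 1}}
	docs := mapStore{"mem_1": "Project budget is $50,000", "mem_2": "User prefers weekly reports"}
	for _, r := range Lookup(ix, docs, []float64{1, 0}, 1) {
		fmt.Printf("%s %.2f %s\n", r.ID, r.Relevance, r.Content)
	}
}
```

Keeping the two layers behind separate interfaces means an index entry can briefly outlive its document; skipping dangling hits is one simple way to tolerate that.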
5. Memory Management Policies
- Importance scoring: Prioritize memories by relevance/usage
- Temporal decay: Gradually reduce importance of old memories
- Contradiction resolution: Handle conflicting memories
- Privacy controls: User-controlled memory retention
- Memory quotas: Limit memory per user/project/scope
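Importance scoring, temporal decay, and quotas can be combined into a single ranking step. The formula below (base importance, multiplied by an exponential half-life decay and a logarithmic usage boost) is an assumption chosen for illustration, not a policy the proposal specifies; `Record`, `Score`, and `EnforceQuota` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// Record holds the fields the policies above depend on (hypothetical shape).
type Record struct {
	ID          string
	Importance  float64 // base importance, e.g. in [0, 1]
	AccessCount int
	AccessedAt  time.Time
}

// Score = importance * 2^(-age/halfLife) * (1 + ln(1 + accessCount)).
// The decay halves the score every halfLife since the last access; a
// non-positive halfLife disables decay.
func Score(r Record, now time.Time, halfLife time.Duration) float64 {
	decay := 1.0
	if age := now.Sub(r.AccessedAt); age > 0 && halfLife > 0 {
		decay = math.Exp2(-age.Hours() / halfLife.Hours())
	}
	usage := 1 + math.Log1p(float64(r.AccessCount))
	return r.Importance * decay * usage
}

// EnforceQuota keeps the quota highest-scoring records and returns the rest
// as eviction candidates. The input slice is not modified.
func EnforceQuota(records []Record, quota int, now time.Time, halfLife time.Duration) (kept, evicted []Record) {
	ranked := append([]Record(nil), records...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return Score(ranked[i], now, halfLife) > Score(ranked[j], now, halfLife)
	})
	if quota < 0 {
		quota = 0
	}
	if quota > len(ranked) {
		quota = len(ranked)
	}
	return ranked[:quota], ranked[quota:]
}

func main() {
	now := time.Now()
	day := 24 * time.Hour
	records := []Record{
		{ID: "mem_old", Importance: 0.9, AccessCount: 0, AccessedAt: now.Add(-30 * day)},
		{ID: "mem_recent", Importance: 0.5, AccessCount: 3, AccessedAt: now.Add(-1 * day)},
	}
	kept, evicted := EnforceQuota(records, 1, now, 7*day)
	fmt.Println("kept:", kept[0].ID, "evicted:", evicted[0].ID)
}
```

Measuring age from the last access rather than from creation means frequently used memories resist decay; measuring from creation would be an equally valid choice.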
Potential Implementation
type AgenticMemory interface {
	// Store a new memory
	Store(ctx context.Context, memory *Memory) error

	// Retrieve relevant memories
	Retrieve(ctx context.Context, query string, opts RetrievalOptions) ([]*Memory, error)

	// Update existing memory
	Update(ctx context.Context, memoryID string, updates *MemoryUpdate) error

	// Delete memory
	Forget(ctx context.Context, memoryID string) error

	// Consolidate related memories
	Consolidate(ctx context.Context, memoryIDs []string) (*Memory, error)

	// Reflect on memory patterns
	Reflect(ctx context.Context, scope MemoryScope) ([]*Insight, error)
}

type Memory struct {
	ID          string
	Type        MemoryType // episodic, semantic, procedural
	Content     string
	Embedding   []float32
	Metadata    map[string]any
	Importance  float64
	CreatedAt   time.Time
	AccessedAt  time.Time
	AccessCount int
	Scope       MemoryScope
	Links       []MemoryLink
}

type MemoryScope struct {
	UserID    string
	ProjectID string
	SessionID string
	Global    bool
}
Use Cases
- Personal Assistant: Remember user preferences, past decisions, ongoing projects
- Code Agent: Remember codebase structure, past debugging sessions, architectural decisions
- Research Agent: Accumulate knowledge across research sessions
- Customer Support: Remember customer history, past issues, preferences
Success Metrics
- Memory retrieval precision/recall
- User satisfaction with "remembering" capability
- Reduction in repeated questions
- Agent task completion rate improvement
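The first metric can be computed per query from the retrieved memory IDs and a hand-labelled set of relevant IDs. A small sketch, assuming such labels exist; `PrecisionRecall` is a hypothetical helper, and duplicate IDs in the retrieved list are counted once.

```go
package main

import "fmt"

// PrecisionRecall compares retrieved memory IDs against the IDs judged
// relevant for a query. Duplicates in retrieved are counted once. Both
// values are 0 when their denominator is empty.
func PrecisionRecall(retrieved, relevant []string) (precision, recall float64) {
	rel := make(map[string]bool, len(relevant))
	for _, id := range relevant {
		rel[id] = true
	}
	seen := make(map[string]bool, len(retrieved))
	hits := 0
	for _, id := range retrieved {
		if seen[id] {
			continue
		}
		seen[id] = true
		if rel[id] {
			hits++
		}
	}
	if len(seen) > 0 {
		precision = float64(hits) / float64(len(seen))
	}
	if len(rel) > 0 {
		recall = float64(hits) / float64(len(rel))
	}
	return precision, recall
}

func main() {
	retrieved := []string{"mem_1", "mem_2", "mem_3", "mem_4"}
	relevant := []string{"mem_1", "mem_3", "mem_9"}
	p, r := PrecisionRecall(retrieved, relevant)
	fmt.Printf("precision=%.2f recall=%.2f\n", p, r) // prints "precision=0.50 recall=0.67"
}
```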
References
- MemGPT - Towards LLMs as Operating Systems
- Generative Agents - Interactive Simulacra of Human Behavior
- Reflexion - Language Agents with Verbal Reinforcement Learning
- Voyager - An Open-Ended Embodied Agent with Large Language Models
Related
- Parent PR: [Feat][Memory] Add OpenAI Response API support #802
- Depends on: [Feat][Router] Support Milvus as Response API storage backend #803 (Milvus for vector search), [Feat][Router] Support Redis as Response API storage backend #804 (Redis for fast access)
- Related: [Research] Explore Context Engineering in Response API #806 (Context Engineering)