English | 中文
Do the simplest thing that works.
A minimal, framework-agnostic toolkit for Context Engineering. Not another agent framework — just pure functions and a Context dataclass.
Built for learning — understand Context Engineering concepts through readable, minimal code.
"Most AI Agent failures are not failures of model capability, but failures of Context Engineering."
As LLMs become more powerful, the bottleneck shifts from model intelligence to how we manage context. Research on context rot shows that as context windows fill up, the model's ability to extract accurate information degrades — across all models.
Context Engineering is the discipline of dynamically assembling the optimal context for each reasoning step. It's not just "advanced prompting" — it's treating context as a finite resource with diminishing marginal returns, like memory in an operating system.
The four pillars of Context Engineering are:
| Pillar | Purpose | context-kit Module |
|---|---|---|
| Select | JIT retrieval — pull information on-demand | select |
| Write | Persist information outside the context window | memory |
| Compress | Reduce context size while preserving signal | Context.compress_* |
| Isolate | Distribute context across sub-agents | Framework-level |
context-kit implements the first three pillars as pure functions, letting you compose them with any agent framework.
Tip
📚 Deep Dive: For a comprehensive guide on Context Engineering patterns, see One Poem Suffices: Context Engineering and One Poem Suffices: Just-in-Time Context.
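The first three pillars can be composed into a single agent step. A minimal sketch, assuming stand-in helpers — `agent_step`, the retrieval placeholder string, and the 4-chars-per-token heuristic are illustrative, not context-kit API:

```python
def agent_step(history, task, memory_store, budget=8000):
    """One agent step composing Select / Write / Compress (illustrative sketch)."""
    # Select: JIT retrieval — stand-in for select.grep / select.read_file
    history = history + [{"role": "user", "content": f"[context for: {task}]"}]

    # Write: persist durable findings outside the window — stand-in for memory.create
    memory_store[f"/memories/{task}.md"] = f"findings for {task}"

    # Compress: evict oldest non-system turns once over a rough token budget
    def rough_tokens(msgs):
        return sum(len(m["content"]) for m in msgs) // 4  # crude heuristic

    while rough_tokens(history) > budget and len(history) > 2:
        history.pop(1)  # keep the system message at index 0
    return history
```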
Context is a limited resource with diminishing returns. As context windows fill up, model performance degrades. context-kit provides the building blocks to manage context effectively:
| Module | Purpose | Key Operations |
|---|---|---|
| Context | Core dataclass | compress_by_rule, compress_by_model, format conversion |
| Select | JIT Context Retrieval | list_dir, grep, read_file |
| Memory | Context Persistence | view, create, str_replace, insert, delete |
| Tools | Agent Integration | get_memory_tools, get_select_tools |
git clone https://github.com/keli-wen/context-kit.git
cd context-kit
uv sync # Install dependencies
uv run python examples/basic/00_minimal.py  # Run example

If you don't have uv installed:
# Install uv (recommended)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or use pip to build from source
git clone https://github.com/keli-wen/context-kit.git
cd context-kit
pip install -e . # Install in editable mode
pip install -e ".[example]" # With example dependencies
pip install -e ".[dev]"      # With dev dependencies

from context_kit import Context, memory, select
# 1. Create context from messages
messages = [
{"role": "system", "content": "You are helpful."},
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi there!"},
]
ctx = Context.from_openai(messages)
print(f"Messages: {len(ctx)}, Tokens: ~{ctx.estimate_tokens()}")
# 2. Compress by clearing old tool results (Claude Context Editing API style)
ctx = ctx.compress_by_rule(keep_tool_uses=3)
# 3. Export to different formats
openai_msgs = ctx.to_openai()
anthropic_msgs, system = ctx.to_anthropic()
google_contents, system = ctx.to_google()

The Context class manages conversation history with compression and format conversion:
from context_kit import Context
# Create from various formats
ctx = Context.from_openai(messages)
ctx = Context.from_anthropic(messages, system="You are helpful.")
ctx = Context.from_google(contents, system="You are helpful.")
# Compression operations
ctx = ctx.compress_by_rule(keep_tool_uses=3) # Clear old tool results
ctx = await ctx.compress_by_model(llm, keep_recent=3) # LLM summarization
# Export to any format
openai_msgs = ctx.to_openai()
anthropic_msgs, system = ctx.to_anthropic()

Note
Thinking/Reasoning Format Support: The Message class supports thinking blocks across providers (Anthropic type: "thinking", OpenAI/DeepSeek reasoning_content, Google thought: true). This feature is experimental — provider APIs may change. Please file an issue if you encounter compatibility problems.
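For illustration, here is the same assistant turn in each provider's documented wire format; these plain dicts sketch the shapes the Message class must handle, not context-kit internals (Anthropic thinking blocks also carry extra fields, omitted here):

```python
# Anthropic: thinking is a typed content block alongside text
anthropic_msg = {
    "role": "assistant",
    "content": [
        {"type": "thinking", "thinking": "Check auth.py first..."},
        {"type": "text", "text": "The bug is in auth.py."},
    ],
}

# OpenAI-compatible APIs such as DeepSeek: a reasoning_content field
deepseek_msg = {
    "role": "assistant",
    "reasoning_content": "Check auth.py first...",
    "content": "The bug is in auth.py.",
}

# Google Gemini: thought parts are marked with thought: True
google_parts = [
    {"thought": True, "text": "Check auth.py first..."},
    {"text": "The bug is in auth.py."},
]
```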
Aligned with Claude Context Editing API. Clears old tool results and thinking blocks to reduce context size:
# Clear old tool results, keep only last 3
ctx = ctx.compress_by_rule(keep_tool_uses=3)
# Exclude specific tools from clearing
ctx = ctx.compress_by_rule(keep_tool_uses=3, exclude_tools=["read_file"])
# Clear thinking blocks too
ctx = ctx.compress_by_rule(
keep_tool_uses=3,
clear_thinking=True,
keep_thinking_turns=1,
)
# Archive cleared content to memory (with retrieval guidance)
ctx = ctx.compress_by_rule(
keep_tool_uses=3,
memory_path="./agent_data", # Same path as get_memory_tools()
)
# Placeholder: "[Tool result cleared. Use memory_read('/memories/tool_001_grep.md') to retrieve.]"

Uses an LLM to summarize older conversation turns:
from context_kit.llm import from_litellm
llm = from_litellm(model="gpt-4o-mini")
ctx = await ctx.compress_by_model(
llm,
instruction="Preserve: key decisions, unresolved issues. Discard: exploratory attempts.",
keep_recent=3,
)

Progressive Disclosure pattern — start with an overview, narrow down, then load on demand:
from context_kit import select
# Step 1: Understand the map (low token cost)
entries = select.list_dir("./src", max_depth=2)
# Step 2: Narrow down
matches = select.grep(r"def \w+", "./src", file_pattern="*.py")
# Step 3: Load on demand
content = select.read_file("./src/auth.py", start_line=40, end_line=60)

Persist information outside the context window (Claude Memory Tool interface):
from context_kit import memory
memory.init(path="./agent_data")
# CRUD operations
memory.create("/memories/notes.md", "# Analysis\n\nKey findings...")
content = memory.view("/memories/notes.md")
memory.str_replace("/memories/notes.md", "old", "new")
memory.insert("/memories/notes.md", 3, "- New item")
memory.delete("/memories/notes.md")

Export tool definitions for agent frameworks:
from context_kit import tools
# Get all tools with shared memory_path
all_tools = tools.get_all_tools(
memory_path="./agent_data",
select_path="./src",
)
# Or get specific tool sets
memory_tools = tools.get_memory_tools(memory_path="./agent_data")
select_tools = tools.get_select_tools(select_path="./src")

Unified interface for different LLM providers:
from context_kit.llm import from_openai, from_anthropic, from_litellm
# OpenAI
llm = from_openai(model="gpt-4o-mini")
# Anthropic
llm = from_anthropic(model="claude-3-haiku-20240307")
# LiteLLM (any provider)
llm = from_litellm(model="gpt-4o-mini")

from camel.agents import ChatAgent
from camel.toolkits import FunctionTool
from context_kit import Context, tools as context_kit_tools
# Setup agent with context-kit tools
ctx_tools = [FunctionTool(f) for f in context_kit_tools.get_all_tools(
memory_path="./agent_data",
select_path="./src",
)]
agent = ChatAgent(
system_message="You are a helpful assistant.",
tools=ctx_tools,
)
# After conversation, compress context
history = agent.memory.get_context()
ctx = Context.from_openai(history)
compressed = ctx.compress_by_rule(keep_tool_uses=3, memory_path="./agent_data")

See examples/ directory:
| Example | Description |
|---|---|
| basic/00_minimal.py | Quick start |
| basic/01_select_tools.py | JIT Context Retrieval |
| basic/02_memory.py | Memory persistence |
| basic/03_compress_rules.py | Tool & thinking clearing |
| basic/04_compress_model.py | LLM summarization |
| integrations/camel/ | CAMEL framework integration |
| integrations/adk/ | Google ADK integration |
| Principle | Description |
|---|---|
| Minimal | Pure functions, lightweight core |
| Composable | Each module independent, combine as needed |
| Framework-agnostic | Works with any agent framework |
| Educational | Code as documentation for Context Engineering |
Image source: Anthropic - Building Effective Agents
Context engineering treats context as structured blocks (docs, tools, memory, history) rather than a single prompt string. Inspired by Google ADK's "context as a compiled view" thesis:
"Context is a compiled view over a richer stateful system... context engineering stops being prompt gymnastics and starts looking like systems engineering."
context-kit provides the primitives for this compilation: Select retrieves relevant blocks, Memory persists state, and Compress optimizes the final view.
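As a sketch of that compilation step: assemble the messages for one turn from structured blocks, admitting retrieved documents only while a rough token budget allows. The `compile_view` helper and its 4-chars-per-token heuristic are assumptions for illustration, not context-kit API:

```python
def compile_view(system, memory_notes, retrieved, history, budget=8000):
    """Compile a context view from blocks: system + memory, retrieved docs, history."""
    def rough_tokens(text):
        return len(text) // 4  # crude heuristic in place of a real tokenizer

    msgs = [{"role": "system", "content": f"{system}\n\n{memory_notes}"}]
    used = rough_tokens(msgs[0]["content"]) + sum(
        rough_tokens(m["content"]) for m in history
    )

    kept = []
    for doc in retrieved:  # admit retrieved blocks until the budget is hit
        if used + rough_tokens(doc) > budget:
            break
        kept.append(doc)
        used += rough_tokens(doc)

    if kept:
        msgs.append({"role": "user", "content": "Relevant context:\n" + "\n---\n".join(kept)})
    msgs.extend(history)
    return msgs
```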
- Not an agent framework (use CAMEL, LangGraph, etc.)
- Not an LLM API wrapper (use LiteLLM, OpenAI SDK, etc.)
- Not a proxy server
For sub-agent isolation (preventing context pollution), compose existing primitives at the framework level:
# Framework layer (not context-kit)
sub_ctx = Context.from_dict([{"role": "system", "content": system}])
sub_ctx = await run_sub_agent(sub_ctx, task)
summary = await sub_ctx.compress_by_model(llm, keep_recent=0)
parent_ctx = parent_ctx.add_message("assistant", summary_text)

- Context dataclass with format conversion (OpenAI, Anthropic, Google)
- compress_by_rule: Clear tool results & thinking blocks (Claude API aligned)
- compress_by_model: LLM-based summarization
- Memory module (Claude Memory Tool interface)
- Select module (list_dir, grep, read_file)
- LLM adapters (OpenAI, Anthropic, LiteLLM)
- Tools export (get_memory_tools, get_select_tools)
- Add Context.from_dict() convenience for thinking blocks
- Support thinking/reasoning formats: Anthropic (type: "thinking"), OpenAI/DeepSeek (reasoning_content), Google (thought: true)
- Add more integration examples (LangGraph, AutoGen)
- MCP server format export
- Streaming support for compress_by_model
- Semantic search with vector index for memory module (optional dependency)
- Building Effective Agents - Anthropic
- Architecting Context-Aware Multi-Agent Framework - Google ADK
- Context Rot Research - Chroma
- Claude Context Management API - Anthropic
- Claude Memory Tool - Anthropic
MIT
Contributions welcome! Please see CONTRIBUTING.md for details.

