A Cloudflare Workers proxy that translates Anthropic Messages API requests into NVIDIA's OpenAI-compatible API format and vice versa.
This allows any client built for the Anthropic API to seamlessly use NVIDIA NIM models hosted at https://integrate.api.nvidia.com/v1.
- Full Messages API translation — Converts Anthropic `POST /v1/messages` to NVIDIA `POST /v1/chat/completions`
- Streaming support — Translates OpenAI SSE streaming format to Anthropic SSE streaming format
- Thinking / Extended thinking — Maps NVIDIA `reasoning_content` (e.g. DeepSeek-R1) to Anthropic `thinking` content blocks
- Tool use / Function calling — Maps Anthropic tools to OpenAI function tools and back
- PROXY_API_KEY protection — Optional secret key to protect the worker endpoint; the NVIDIA key stays server-side
- Authentication passthrough — Accepts `x-api-key` or `Authorization: Bearer` headers
- Error mapping — Translates NVIDIA error responses to Anthropic error format
- Models endpoint — Proxies `GET /v1/models` to list available NVIDIA models
- CORS support — Full CORS headers for browser-based clients
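The streaming translation above can be pictured with a single chunk. This is a minimal sketch, not the worker's actual code (`openaiChunkToAnthropicEvent` is an illustrative name): it converts one OpenAI `chat.completion.chunk` delta into an Anthropic `content_block_delta` SSE frame.

```typescript
// Sketch: convert one OpenAI streaming chunk into an Anthropic SSE frame.
// Shapes follow the two public streaming formats; the function name is
// illustrative, not the worker's actual export.

interface OpenAIChunk {
  choices: {
    delta: { content?: string; reasoning_content?: string };
    finish_reason: string | null;
  }[];
}

function openaiChunkToAnthropicEvent(chunk: OpenAIChunk, blockIndex: number): string {
  const delta = chunk.choices[0].delta;
  // reasoning_content becomes a thinking_delta; plain content becomes a text_delta
  const event =
    delta.reasoning_content !== undefined
      ? { type: "content_block_delta", index: blockIndex, delta: { type: "thinking_delta", thinking: delta.reasoning_content } }
      : { type: "content_block_delta", index: blockIndex, delta: { type: "text_delta", text: delta.content ?? "" } };
  // Anthropic SSE frames carry an explicit event name plus a data payload
  return `event: content_block_delta\ndata: ${JSON.stringify(event)}\n\n`;
}
```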
```
Client (Anthropic SDK)        Cloudflare Worker            NVIDIA API
─────────────────────         ─────────────────            ──────────
POST /v1/messages      →      Translate request      →     POST /v1/chat/completions
                              (Anthropic → OpenAI)
Anthropic response     ←      Translate response     ←     OpenAI response
                              (OpenAI → Anthropic)
```
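In Worker terms, the flow above is a fetch handler that dispatches by method and path. A minimal routing sketch, assuming hypothetical names (`route`, `Env` fields mirror this README, not the actual source):

```typescript
// Sketch of the routing layer described above; helper names are illustrative.

export interface Env {
  NVIDIA_BASE_URL?: string;
  NVIDIA_API_KEY?: string;
  PROXY_API_KEY?: string;
}

// Map an incoming method + path to one of the proxy's handlers.
function route(pathname: string, method: string): string {
  if (method === "POST" && pathname === "/api/anthropic/v1/messages") return "anthropic-messages";
  if (method === "POST" && pathname === "/v1/chat/completions") return "openai-passthrough";
  if (method === "GET" && pathname === "/v1/models") return "models";
  if (method === "GET" && pathname === "/") return "health";
  return "not-found";
}
```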
| Anthropic | OpenAI / NVIDIA |
|---|---|
| `system` (top-level param) | `messages[0].role = "system"` |
| `messages[].content` (blocks) | `messages[].content` (string) |
| `max_tokens` (required) | `max_tokens` |
| `stop_sequences` | `stop` |
| `tools[].input_schema` | `tools[].function.parameters` |
| `tool_choice.type = "any"` | `tool_choice = "required"` |
| `tool_choice.type = "tool"` | `tool_choice = {type:"function",...}` |
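The mapping table above can be sketched as one translation function. This is a simplified sketch (`toOpenAIRequest` and the interfaces are illustrative, and it only handles text blocks, not tool definitions):

```typescript
// Sketch of the request mapping table: Anthropic Messages request -> OpenAI body.

type AnthropicContent = string | { type: string; text?: string }[];

interface AnthropicRequest {
  model: string;
  max_tokens: number;
  system?: string;
  stop_sequences?: string[];
  messages: { role: string; content: AnthropicContent }[];
  tool_choice?: { type: string; name?: string };
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  stop?: string[];
  messages: { role: string; content: string }[];
  tool_choice?: string | object;
}

function toOpenAIRequest(req: AnthropicRequest): OpenAIRequest {
  // Content blocks are flattened to a single string for OpenAI-style messages
  const flatten = (c: AnthropicContent) =>
    typeof c === "string" ? c : c.filter(b => b.type === "text").map(b => b.text ?? "").join("");

  const messages = req.messages.map(m => ({ role: m.role, content: flatten(m.content) }));
  // The top-level system param becomes a leading system message
  if (req.system) messages.unshift({ role: "system", content: req.system });

  let tool_choice: string | object | undefined;
  if (req.tool_choice?.type === "any") tool_choice = "required";
  else if (req.tool_choice?.type === "tool")
    tool_choice = { type: "function", function: { name: req.tool_choice.name } };

  return {
    model: req.model,
    max_tokens: req.max_tokens,
    stop: req.stop_sequences, // stop_sequences -> stop
    messages,
    ...(tool_choice !== undefined ? { tool_choice } : {}),
  };
}
```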
| OpenAI / NVIDIA | Anthropic |
|---|---|
| `choices[0].message.content` | `content[].type = "text"` |
| `choices[0].message.reasoning_content` | `content[].type = "thinking"` |
| `choices[0].message.tool_calls` | `content[].type = "tool_use"` |
| `finish_reason = "stop"` | `stop_reason = "end_turn"` |
| `finish_reason = "length"` | `stop_reason = "max_tokens"` |
| `finish_reason = "tool_calls"` | `stop_reason = "tool_use"` |
| `usage.prompt_tokens` | `usage.input_tokens` |
| `usage.completion_tokens` | `usage.output_tokens` |
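The response side of the table reduces to a few small renames. A sketch with illustrative helper names (`mapStopReason`, `mapUsage`, `toContentBlocks` are not the worker's actual exports; tool_calls handling is omitted):

```typescript
// Sketch of the response mapping table: OpenAI fields -> Anthropic fields.

function mapStopReason(finishReason: string): string {
  const map: Record<string, string> = {
    stop: "end_turn",
    length: "max_tokens",
    tool_calls: "tool_use",
  };
  return map[finishReason] ?? "end_turn";
}

function mapUsage(u: { prompt_tokens: number; completion_tokens: number }) {
  return { input_tokens: u.prompt_tokens, output_tokens: u.completion_tokens };
}

type Block = { type: string; text?: string; thinking?: string };

// reasoning_content (if any) becomes a thinking block placed before the text block
function toContentBlocks(msg: { content?: string | null; reasoning_content?: string | null }): Block[] {
  const blocks: Block[] = [];
  if (msg.reasoning_content) blocks.push({ type: "thinking", thinking: msg.reasoning_content });
  if (msg.content) blocks.push({ type: "text", text: msg.content });
  return blocks;
}
```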
- Node.js 18+
- Wrangler CLI
- An NVIDIA API key
Install dependencies:

```bash
npm install
```

Run locally:

```bash
npm run dev
```

Deploy:

```bash
npx wrangler deploy
```

Set your NVIDIA API key and (optionally) a proxy key to protect the endpoint:

```bash
npx wrangler secret put NVIDIA_API_KEY
npx wrangler secret put PROXY_API_KEY
```

Without `PROXY_API_KEY` (client sends the NVIDIA key directly):

```bash
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: nvapi-YOUR_NVIDIA_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello! What models do you offer?"}
    ]
  }'
```

With `PROXY_API_KEY` (NVIDIA key is server-side, client sends the proxy key):

```bash
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello! What models do you offer?"}
    ]
  }'
```

Python (Anthropic SDK):

```python
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_PROXY_API_KEY",  # or NVIDIA key if no PROXY_API_KEY is set
    base_url="https://your-worker.your-subdomain.workers.dev",
)

message = client.messages.create(
    model="meta/llama-3.1-8b-instruct",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing simply."}],
)
print(message.content[0].text)
```

TypeScript (Anthropic SDK):

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "YOUR_PROXY_API_KEY", // or NVIDIA key if no PROXY_API_KEY is set
  baseURL: "https://your-worker.your-subdomain.workers.dev",
});

const message = await client.messages.create({
  model: "meta/llama-3.1-8b-instruct",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
console.log(message.content[0].text);
```

You can also use the OpenAI-compatible endpoint directly without translation. This is useful for clients already using the OpenAI format or when you need exact compatibility with NVIDIA's API:
```bash
curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'
```

For models with reasoning support (e.g. `deepseek-ai/deepseek-r1`), enable thinking to get chain-of-thought reasoning as Anthropic-style thinking blocks:
```bash
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "deepseek-ai/deepseek-r1",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [
      {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ]
  }'
```

The response will include thinking blocks before the text block:

```json
{
  "content": [
    {"type": "thinking", "thinking": "Let me work through this step by step...", "signature": "proxy_sig_..."},
    {"type": "text", "text": "Here is the proof..."}
  ]
}
```

Streaming works the same way; set `"stream": true`:

```bash
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about AI."}
    ]
  }'
```

| Variable | Description | Default |
|---|---|---|
| `NVIDIA_BASE_URL` | NVIDIA API base URL | `https://integrate.api.nvidia.com/v1` |
| `NVIDIA_API_KEY` | NVIDIA API key (required when `PROXY_API_KEY` is set) | — |
| `PROXY_API_KEY` | Secret key to protect the proxy endpoint (optional) | — |
**Mode 1: No `PROXY_API_KEY` (passthrough)**
Clients send their NVIDIA API key directly via `x-api-key` or `Authorization: Bearer`. The worker forwards it to NVIDIA.

**Mode 2: With `PROXY_API_KEY` (protected)**
Clients authenticate with the proxy key. The NVIDIA key is stored server-side as `NVIDIA_API_KEY` and never exposed to clients. This is the recommended setup for production.
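The two modes boil down to one decision: which key goes upstream. A sketch under the assumptions above (`resolveUpstreamKey` is an illustrative name, not the worker's actual code):

```typescript
// Sketch of the two auth modes. In protected mode the NVIDIA key never
// leaves the worker; in passthrough mode the client's key is forwarded.

interface AuthEnv {
  NVIDIA_API_KEY?: string;
  PROXY_API_KEY?: string;
}

// Returns the key to send to NVIDIA, or null if the client is unauthorized.
function resolveUpstreamKey(env: AuthEnv, clientKey: string | null): string | null {
  if (env.PROXY_API_KEY) {
    // Mode 2: validate the proxy key, then swap in the server-side NVIDIA key
    if (clientKey !== env.PROXY_API_KEY) return null; // caller should respond 401
    return env.NVIDIA_API_KEY ?? null;
  }
  // Mode 1: the client supplied the NVIDIA key directly
  return clientKey;
}
```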
| Method | Path | Description |
|---|---|---|
| POST | `/api/anthropic/v1/messages` | Anthropic Messages API → NVIDIA completions |
| POST | `/v1/chat/completions` | OpenAI format passthrough to NVIDIA |
| GET | `/v1/models` | Proxy to NVIDIA models listing |
| GET | `/` | Health check |
MIT