emrcaca/nvidia-anthropic-adapter

English | Türkçe

NVIDIA Anthropic Adapter

A Cloudflare Workers proxy that translates Anthropic Messages API requests into NVIDIA's OpenAI-compatible API format and vice versa.

This allows any client built for the Anthropic API to seamlessly use NVIDIA NIM models hosted at https://integrate.api.nvidia.com/v1.

Features

  • Full Messages API translation — Converts Anthropic POST /v1/messages to NVIDIA POST /v1/chat/completions
  • Streaming support — Translates OpenAI SSE streaming format to Anthropic SSE streaming format
  • Thinking / Extended thinking — Maps NVIDIA reasoning_content (e.g. DeepSeek-R1) to Anthropic thinking content blocks
  • Tool use / Function calling — Maps Anthropic tools to OpenAI function tools and back
  • PROXY_API_KEY protection — Optional secret key to protect the worker endpoint; NVIDIA key stays server-side
  • Authentication passthrough — Accepts x-api-key or Authorization: Bearer headers
  • Error mapping — Translates NVIDIA error responses to Anthropic error format
  • Models endpoint — Proxies GET /v1/models to list available NVIDIA models
  • CORS support — Full CORS headers for browser-based clients

How It Works

Client (Anthropic SDK)        Cloudflare Worker           NVIDIA API
─────────────────────        ─────────────────           ──────────
POST /v1/messages      →     Translate request     →     POST /v1/chat/completions
                              (Anthropic → OpenAI)

Anthropic response     ←     Translate response    ←     OpenAI response
                              (OpenAI → Anthropic)

Request Translation

Anthropic                        OpenAI / NVIDIA
─────────                        ───────────────
system (top-level param)         messages[0].role = "system"
messages[].content (blocks)      messages[].content (string)
max_tokens (required)            max_tokens
stop_sequences                   stop
tools[].input_schema             tools[].function.parameters
tool_choice.type = "any"         tool_choice = "required"
tool_choice.type = "tool"        tool_choice = {type: "function", ...}
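
The table above can be sketched as a small translation function. This is an illustrative reconstruction of the mapping, not the worker's actual code, and it omits the tool and tool_choice fields:

```python
# Illustrative sketch of the Anthropic -> OpenAI request mapping (tool fields omitted).
def anthropic_to_openai(req: dict) -> dict:
    messages = []
    if "system" in req:
        # The top-level `system` param becomes the leading OpenAI system message.
        messages.append({"role": "system", "content": req["system"]})
    for msg in req["messages"]:
        content = msg["content"]
        if isinstance(content, list):
            # Flatten Anthropic content blocks into a plain string.
            content = "".join(b["text"] for b in content if b.get("type") == "text")
        messages.append({"role": msg["role"], "content": content})
    out = {"model": req["model"], "max_tokens": req["max_tokens"], "messages": messages}
    if "stop_sequences" in req:
        out["stop"] = req["stop_sequences"]
    return out
```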

Response Translation

OpenAI / NVIDIA                          Anthropic
───────────────                          ─────────
choices[0].message.content               content[].type = "text"
choices[0].message.reasoning_content     content[].type = "thinking"
choices[0].message.tool_calls            content[].type = "tool_use"
finish_reason = "stop"                   stop_reason = "end_turn"
finish_reason = "length"                 stop_reason = "max_tokens"
finish_reason = "tool_calls"             stop_reason = "tool_use"
usage.prompt_tokens                      usage.input_tokens
usage.completion_tokens                  usage.output_tokens
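
The response mapping can be sketched the same way. Again an illustrative reconstruction of the table, not the worker's source; the Anthropic envelope fields (id, role, model) are omitted:

```python
import json

# Illustrative sketch of the OpenAI -> Anthropic response mapping from the table above.
FINISH_TO_STOP = {"stop": "end_turn", "length": "max_tokens", "tool_calls": "tool_use"}

def openai_to_anthropic(resp: dict) -> dict:
    choice = resp["choices"][0]
    msg = choice["message"]
    content = []
    if msg.get("reasoning_content"):
        # Reasoning text (e.g. DeepSeek-R1) becomes a thinking block.
        content.append({"type": "thinking", "thinking": msg["reasoning_content"]})
    if msg.get("content"):
        content.append({"type": "text", "text": msg["content"]})
    for call in msg.get("tool_calls") or []:
        content.append({
            "type": "tool_use",
            "id": call["id"],
            "name": call["function"]["name"],
            "input": json.loads(call["function"]["arguments"]),
        })
    usage = resp.get("usage", {})
    return {
        "content": content,
        "stop_reason": FINISH_TO_STOP.get(choice["finish_reason"]),
        "usage": {
            "input_tokens": usage.get("prompt_tokens", 0),
            "output_tokens": usage.get("completion_tokens", 0),
        },
    }
```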

Quick Start

Prerequisites

  • Node.js and npm
  • A Cloudflare account (for deployment with Wrangler)
  • An NVIDIA API key (nvapi-...)

Local Development

npm install
npm run dev

Deploy to Cloudflare Workers

npx wrangler deploy

Set your NVIDIA API key and (optionally) a proxy key to protect the endpoint:

npx wrangler secret put NVIDIA_API_KEY
npx wrangler secret put PROXY_API_KEY

Usage

With curl

# Without PROXY_API_KEY (client sends NVIDIA key directly)
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: nvapi-YOUR_NVIDIA_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello! What models do you offer?"}
    ]
  }'
# With PROXY_API_KEY (NVIDIA key is server-side, client sends proxy key)
curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello! What models do you offer?"}
    ]
  }'

With the Anthropic Python SDK

import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_PROXY_API_KEY",  # or NVIDIA key if no PROXY_API_KEY is set
    base_url="https://your-worker.your-subdomain.workers.dev",
)

message = client.messages.create(
    model="meta/llama-3.1-8b-instruct",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing simply."}],
)
print(message.content[0].text)

With the Anthropic TypeScript SDK

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: "YOUR_PROXY_API_KEY",  // or NVIDIA key if no PROXY_API_KEY is set
  baseURL: "https://your-worker.your-subdomain.workers.dev",
});

const message = await client.messages.create({
  model: "meta/llama-3.1-8b-instruct",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
console.log(message.content[0].text);

OpenAI Format Passthrough

You can also use the OpenAI-compatible endpoint directly without translation. This is useful for clients already using the OpenAI format or when you need exact compatibility with NVIDIA's API:

curl http://localhost:8787/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'

Thinking (Extended Reasoning)

For models with reasoning support (e.g. deepseek-ai/deepseek-r1), enable thinking to get chain-of-thought reasoning as Anthropic-style thinking blocks:

curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "deepseek-ai/deepseek-r1",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [
      {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ]
  }'

The response will include thinking blocks before the text block:

{
  "content": [
    {"type": "thinking", "thinking": "Let me work through this step by step...", "signature": "proxy_sig_..."},
    {"type": "text", "text": "Here is the proof..."}
  ]
}

Streaming

curl http://localhost:8787/api/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_PROXY_API_KEY" \
  -d '{
    "model": "meta/llama-3.1-8b-instruct",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Write a haiku about AI."}
    ]
  }'

Configuration

Environment Variables

Variable          Description                                           Default
────────          ───────────                                           ───────
NVIDIA_BASE_URL   NVIDIA API base URL                                   https://integrate.api.nvidia.com/v1
NVIDIA_API_KEY    NVIDIA API key (required when PROXY_API_KEY is set)   (none)
PROXY_API_KEY     Secret key to protect the proxy endpoint (optional)   (none)

Authentication Modes

Mode 1: No PROXY_API_KEY (passthrough)

Clients send their NVIDIA API key directly via x-api-key or Authorization: Bearer. The worker forwards it to NVIDIA.

Mode 2: With PROXY_API_KEY (protected)

Clients authenticate with the proxy key. The NVIDIA key is stored server-side as NVIDIA_API_KEY and never exposed to clients. This is the recommended setup for production.

Endpoints

Method   Path                         Description
──────   ────                         ───────────
POST     /api/anthropic/v1/messages   Anthropic Messages API → NVIDIA chat completions
POST     /v1/chat/completions         OpenAI format passthrough to NVIDIA
GET      /v1/models                   Proxy to NVIDIA models listing
GET      /                            Health check

License

MIT

About

A lightweight API proxy that adapts Anthropic-compatible requests to NVIDIA endpoints. It enables seamless integration of Anthropic-style APIs with NVIDIA AI services for easier migration and unified tooling.
