Skip to content

REST API

The REST API lets you call any model — whether hosted on Cloudflare or by a third-party provider like OpenAI, Anthropic, or Google — through the same Cloudflare API, with all AI Gateway features — logging, caching, rate limiting, and more — applied automatically.

No provider SDKs or API keys are needed. Authentication and billing are handled through your Cloudflare account. Third-party models are billed via Unified Billing, while Workers AI models follow Workers AI pricing.

Endpoints

Four endpoints are available, each suited to different use cases:

EndpointFormatUse case
POST /ai/runEnvelope with model, inputAll models and modalities (LLM, image, TTS, ASR)
POST /ai/v1/chat/completionsOpenAI chat completionsLLMs — OpenAI SDK compatible
POST /ai/v1/responsesOpenAI Responses APIAgentic workflows — OpenAI SDK compatible
POST /ai/v1/messagesAnthropic Messages APILLMs — Anthropic SDK compatible

Authentication

Authenticate with a Cloudflare API token that has AI Gateway permission. Pass it in the Authorization header.

Model naming

Third-party models use the author/model format:

  • openai/gpt-4.1 — OpenAI
  • anthropic/claude-sonnet-4 — Anthropic
  • google/gemini-3-flash — Google
  • xai/grok-3 — xAI

Workers AI models use the @cf/author/model format (for example, @cf/moonshotai/kimi-k2.6). Workers AI requests also require the cf-aig-gateway-id header — refer to Call a Workers AI model for details.

Browse available models in the model catalog.

/ai/run — universal endpoint

Accepts any model with its per-model schema. Model-specific parameters go inside input.

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"model": "openai/gpt-4.1",
"input": {
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
],
"max_tokens": 512
}
}'

Call a Workers AI model

To call a Workers AI model, use the @cf/ prefix in the model name and include the cf-aig-gateway-id header to specify which gateway to route through.

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "cf-aig-gateway-id: my-gateway" \
--header "Content-Type: application/json" \
--data '{
"model": "@cf/moonshotai/kimi-k2.6",
"input": {
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}
}'

The existing Workers AI endpoint with the model ID in the URL path also continues to work:

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/moonshotai/kimi-k2.6" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}'

/ai/v1/chat/completions — OpenAI compatible

Uses the standard OpenAI chat completions format. The model field uses the same author/model naming. This endpoint is compatible with the OpenAI SDK and other OpenAI-compatible clients.

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"model": "openai/gpt-4.1",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "What is Cloudflare?"
}
],
"max_tokens": 512,
"temperature": 0.7,
"stream": true
}'

OpenAI SDK

Point the OpenAI SDK baseURL at the Cloudflare API:

JavaScript
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: CLOUDFLARE_API_TOKEN,
baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
});
const response = await openai.chat.completions.create({
model: "openai/gpt-4.1",
messages: [{ role: "user", content: "What is Cloudflare?" }],
});

/ai/v1/responses — OpenAI Responses API

Uses the OpenAI Responses API format for agentic workflows. Compatible with the OpenAI SDK.

JavaScript
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: CLOUDFLARE_API_TOKEN,
baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
});
const response = await openai.responses.create({
model: "openai/gpt-4.1",
input: "What is Cloudflare?",
});

/ai/v1/messages — Anthropic compatible

Uses the Anthropic Messages API format. Compatible with the Anthropic SDK.

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/messages" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"model": "anthropic/claude-sonnet-4-5",
"max_tokens": 512,
"messages": [
{
"role": "user",
"content": "What is Cloudflare?"
}
]
}'

Point the Anthropic SDK baseURL at the Cloudflare API:

JavaScript
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
apiKey: CLOUDFLARE_API_TOKEN,
baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
});
const message = await anthropic.messages.create({
model: "anthropic/claude-sonnet-4-5",
max_tokens: 512,
messages: [{ role: "user", content: "What is Cloudflare?" }],
});

Specify a gateway

By default, third-party model requests route through your account's default AI Gateway. To use a specific gateway, include the cf-aig-gateway-id header. Workers AI requests always require this header.

Terminal window
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
--header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
--header "cf-aig-gateway-id: my-gateway" \
--header "Content-Type: application/json" \
--data '{
"model": "anthropic/claude-sonnet-4",
"messages": [
{
"role": "user",
"content": "Hello"
}
]
}'

With the OpenAI SDK, set the header via defaultHeaders:

JavaScript
const openai = new OpenAI({
apiKey: CLOUDFLARE_API_TOKEN,
baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
defaultHeaders: {
"cf-aig-gateway-id": "my-gateway",
},
});

All AI Gateway features configured on that gateway — caching, rate limiting, guardrails, and logging — apply to the request.

Per-request configuration

Use cf-aig-* headers to control AI Gateway behavior on a per-request basis:

HeaderTypeDescription
cf-aig-skip-cachebooleanSkip the cache for this request.
cf-aig-cache-ttlnumberCache TTL in seconds.
cf-aig-cache-keystringCustom cache key.
cf-aig-collect-logbooleanTurn logging on or off for this request.
cf-aig-request-timeoutnumberRequest timeout in milliseconds.
cf-aig-max-attemptsnumberRetry attempts (max 5).
cf-aig-retry-delaynumberRetry delay in milliseconds (max 5000).
cf-aig-backoffstringBackoff method: constant, linear, or exponential.
cf-aig-metadataJSON stringCustom metadata to attach to the log entry.

For more details on these options, refer to Request handling and Caching.