Skip to content

AI applications

Build and deploy AI applications on Cloudflare's global network with inference at the edge, vector databases, and model gateways. Workers AI runs Large Language Models (LLMs), text embeddings, image generation, and other models with pay-per-use pricing. AI Gateway proxies requests to OpenAI, Anthropic, and other providers with caching and unified analytics. Vectorize stores embeddings for Retrieval Augmented Generation (RAG) workflows.

AI applications can present unique infrastructure challenges, such as unpredictable inference costs, latency-sensitive user experiences, and the need to work with multiple model providers. Cloudflare provides a complete platform for building AI applications that are fast, cost-effective, and globally distributed.

Architecture patterns

Retrieval Augmented Generation (RAG)

Combine vector search with Large Language Model (LLM) inference to ground responses in your own data:

  • Vectorize stores embeddings of your knowledge base
  • Workers receives user queries and searches for relevant context
  • Workers AI or AI Gateway generates responses using retrieved context

Multi-provider AI gateway

Use AI Gateway to route requests across providers while maintaining a single interface:

  • AI Gateway proxies requests to OpenAI, Anthropic, or Workers AI
  • Built-in caching reduces costs for repeated queries
  • Unified logging and analytics across all providers

Real-time AI features

Deploy low-latency AI features directly at the edge:

  • Workers handles requests at the nearest Cloudflare location and runs inference via the Workers AI binding — no round-trips to origin servers
  • KV caches frequent responses to reduce inference calls and latency
  • D1 stores session state and conversation history alongside the inference logic

Prerequisites

Create a new application

Use an existing application

  • A Cloudflare account.
  • AI Gateway does not require a domain added to Cloudflare. You can place it in front of any existing AI provider (OpenAI, Anthropic, and others) by updating your API endpoint to route through AI Gateway.
  • If you plan to add Workers AI inference or Vectorize to an existing application, you also need Node.js (version 16.17.0 or later) and Wrangler installed.

Workers AI models

Browse available models for text generation, embeddings, image generation, and more.

AI Gateway providers

Connect to OpenAI, Anthropic, Google AI, and other providers through AI Gateway.