---
title: Overview
description: Build scalable, fully-managed RAG applications with Cloudflare AI Search. Create retrieval-augmented generation pipelines to deliver accurate, context-aware AI without managing infrastructure.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Overview

Create AI-powered search for your data

 Available on all plans 

AI Search is Cloudflare’s managed search service. Connect your data, such as websites or unstructured content, and AI Search automatically creates a continuously updating index that you can query with natural language from your applications or AI agents. It integrates natively with Cloudflare’s developer platform tools such as Vectorize, AI Gateway, R2, Browser Rendering, and Workers AI, and also supports third-party providers and open standards.

It supports retrieval-augmented generation (RAG) patterns, enabling you to build enterprise search, natural language search, and AI-powered chat without managing infrastructure.

[ Get started ](https://developers.cloudflare.com/ai-search/get-started)[ Watch AI Search demo ](https://www.youtube.com/watch?v=JUFdbkiDN2U)

---

## Features

### Automated indexing

Automatically and continuously index your data source, keeping your content fresh without manual reprocessing.

[ View indexing ](https://developers.cloudflare.com/ai-search/configuration/indexing/) 

### Multitenancy support

Support multiple tenants by scoping search to each tenant’s data using folder-based metadata filters.

[ Add filters ](https://developers.cloudflare.com/ai-search/how-to/multitenancy/) 

### Workers Binding

Call your AI Search instance directly from a Cloudflare Worker using the native binding, for either plain search or search with AI.

[ Add to Worker ](https://developers.cloudflare.com/ai-search/usage/workers-binding/) 

### Similarity caching

Cache queries and their results so that repeated requests return faster and use less compute.

[ Use caching ](https://developers.cloudflare.com/ai-search/configuration/cache/) 

---

## Related products

**[Workers AI](https://developers.cloudflare.com/workers-ai/)** 

Run machine learning models, powered by serverless GPUs, on Cloudflare’s global network.

**[AI Gateway](https://developers.cloudflare.com/ai-gateway/)** 

Observe and control your AI applications with caching, rate limiting, request retries, model fallback, and more.

**[Vectorize](https://developers.cloudflare.com/vectorize/)** 

Build full-stack AI applications with Vectorize, Cloudflare’s vector database.

**[Workers](https://developers.cloudflare.com/workers/)** 

Build serverless applications and deploy instantly across the globe for exceptional performance, reliability, and scale.

**[R2](https://developers.cloudflare.com/r2/)** 

Store large amounts of unstructured data without the costly egress bandwidth fees associated with typical cloud storage services.

---

## More resources

[Get started](https://developers.cloudflare.com/workers-ai/get-started/workers-wrangler/) 

Build and deploy your first Workers AI application.

[Developer Discord](https://discord.cloudflare.com) 

Connect with the Workers community on Discord to ask questions, share what you are building, and discuss the platform with other developers.

[@CloudflareDev](https://x.com/cloudflaredev) 

Follow @CloudflareDev on Twitter to learn about product announcements and what is new in Cloudflare Workers.

---

---
title: Get started
description: Create fully-managed, retrieval-augmented generation pipelines with Cloudflare AI Search.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Get started

AI Search is Cloudflare's managed search service. Connect your data, such as websites or an R2 bucket, and AI Search automatically creates a continuously updating index that you can query with natural language from your applications or AI agents.

## Prerequisites

AI Search integrates with R2 for storing your data. You must have an active R2 subscription before creating your first AI Search instance.

[ Go to **R2 Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview) 

## Choose your setup method

[ Dashboard ](https://developers.cloudflare.com/ai-search/get-started/dashboard/) Create and configure AI Search using the Cloudflare dashboard. 

[ API ](https://developers.cloudflare.com/ai-search/get-started/api/) Create AI Search instances programmatically using the REST API. 

---

---
title: API
description: Create AI Search instances programmatically using the REST API.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# API

This guide walks you through creating an AI Search instance programmatically using the REST API. This requires setting up a [service API token](https://developers.cloudflare.com/ai-search/configuration/service-api-token/) for system-to-system authentication.

Already have a service token?

If you have created an AI Search instance via the dashboard at least once, your account already has a [service API token](https://developers.cloudflare.com/ai-search/configuration/service-api-token/) registered. In that case the `token_id` parameter is optional: make sure you have the API token from [Step 3: Create an AI Search API token](#3-create-an-ai-search-api-token), then skip to [Step 5: Create an AI Search instance](#5-create-an-ai-search-instance).

## Prerequisites

AI Search integrates with R2 for storing your data. You must have an active R2 subscription before creating your first AI Search instance.

[ Go to **R2 Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview) 

## 1\. Create an API token with token creation permissions

AI Search requires a service API token to access R2 and other resources on your behalf. To create this service token programmatically, you first need an [API token](https://developers.cloudflare.com/fundamentals/api/get-started/create-token/) with permission to create other tokens.

1. In the Cloudflare dashboard, go to **My Profile** \> **API Tokens**.
2. Select **Create Token**.
3. Select **Create Custom Token**.
4. Enter a **Token name**, for example `Token Creator`.
5. Under **Permissions**, select **User** \> **API Tokens** \> **Edit**.
6. Select **Continue to summary**, then select **Create Token**.
7. Copy and save the token value. This is your `API_TOKEN` for the next step.

Note

The steps above create a user-owned token. You can also create an account-owned token. Refer to [Create tokens via API](https://developers.cloudflare.com/fundamentals/api/how-to/create-via-api/) for more information.

## 2\. Create a service API token

Use the [Create token API](https://developers.cloudflare.com/api/resources/user/subresources/tokens/methods/create/) to create a [service API token](https://developers.cloudflare.com/ai-search/configuration/service-api-token/). This token allows AI Search to access resources in your account on your behalf, such as R2, Vectorize, and Workers AI.

1. Run the following request to create a service API token. Replace `<API_TOKEN>` with the token from step 1 and `<ACCOUNT_ID>` with your [account ID](https://developers.cloudflare.com/fundamentals/account/find-account-and-zone-ids/).
```
curl -X POST "https://api.cloudflare.com/client/v4/user/tokens" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -H "Content-Type: application/json" \
  --data '{
    "name": "AI Search Service API Token",
    "policies": [
      {
        "effect": "allow",
        "resources": {
          "com.cloudflare.api.account.<ACCOUNT_ID>": "*"
        },
        "permission_groups": [
          { "id": "9e9b428a0bcd46fd80e580b46a69963c" }
        ]
      }
    ]
  }'
```
This creates a token with the AI Search Index Engine permission (`9e9b428a0bcd46fd80e580b46a69963c`), which grants access to run the AI Search Index Engine.
2. Save the `id` (`<CF_API_ID>`) and `value` (`<CF_API_KEY>`) from the response. You will need these values in the next step.  
Example response:  
```  
{  
  "result": {  
    "id": "<CF_API_ID>",  
    "name": "AI Search Service API Token",  
    "status": "active",  
    "issued_on": "2025-12-24T22:14:16Z",  
    "modified_on": "2025-12-24T22:14:16Z",  
    "last_used_on": null,  
    "value": "<CF_API_KEY>",  
    "policies": [  
      {  
        "id": "f56e6d5054e147e09ebe5c514f8a0f93",  
        "effect": "allow",  
        "resources": { "com.cloudflare.api.account.<ACCOUNT_ID>": "*" },  
        "permission_groups": [  
          {  
            "id": "9e9b428a0bcd46fd80e580b46a69963c",  
            "name": "AI Search Index Engine"  
          }  
        ]  
      }  
    ]  
  },  
  "success": true,  
  "errors": [],  
  "messages": []  
}  
```
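The request in step 1 can also be wrapped in a small script so that the JSON body does not need to be edited by hand. The following is a minimal sketch, not an official tool: the function names `service_token_payload` and `create_service_token` are hypothetical, and `API_TOKEN` and `ACCOUNT_ID` are assumed to be set as environment variables. The endpoint and permission group ID are the ones shown above.

```shell
# Permission group ID for "AI Search Index Engine", as shown in the request above.
AI_SEARCH_PERMISSION_GROUP="9e9b428a0bcd46fd80e580b46a69963c"

# Hypothetical helper: prints the JSON body for the Create token API.
# $1 = account ID (assumed to contain no quotes or backslashes; no JSON escaping is done)
service_token_payload() {
  printf '{"name":"AI Search Service API Token","policies":[{"effect":"allow","resources":{"com.cloudflare.api.account.%s":"*"},"permission_groups":[{"id":"%s"}]}]}\n' \
    "$1" "$AI_SEARCH_PERMISSION_GROUP"
}

# Hypothetical helper: sends the request.
# Assumes API_TOKEN and ACCOUNT_ID are set in the environment and stops if they are not.
create_service_token() {
  curl -sS -X POST "https://api.cloudflare.com/client/v4/user/tokens" \
    -H "Authorization: Bearer ${API_TOKEN:?set API_TOKEN first}" \
    -H "Content-Type: application/json" \
    --data "$(service_token_payload "${ACCOUNT_ID:?set ACCOUNT_ID first}")"
}
```

After exporting both variables, run `create_service_token` and save the `id` and `value` fields from the response as described in step 2.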

## 3\. Create an AI Search API token

To register the service token and create AI Search instances, you need an API token with AI Search edit permissions.

1. In the Cloudflare dashboard, go to **My Profile** \> **API Tokens**.
2. Select **Create Token**.
3. Select **Create Custom Token**.
4. Enter a **Token name**, for example `AI Search Manager`.
5. Under **Permissions**, select **Account** \> **AI Search** \> **Edit**.
6. Select **Continue to summary**, then select **Create Token**.
7. Copy and save the token value. This is your `AI_SEARCH_API_TOKEN`.

## 4\. Register the service token with AI Search

Use the [Create token API for AI Search](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/tokens/methods/create/) to register the service token you created in step 2.

1. Run the following request to register the service token. Replace `<ACCOUNT_ID>` with your account ID, `<AI_SEARCH_API_TOKEN>` with the token from step 3, and `<CF_API_ID>` and `<CF_API_KEY>` with the values from step 2.
```
curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai-search/tokens" \
  -H "Authorization: Bearer <AI_SEARCH_API_TOKEN>" \
  -H "Content-Type: application/json" \
  --data '{
    "cf_api_id": "<CF_API_ID>",
    "cf_api_key": "<CF_API_KEY>",
    "name": "AI Search Service Token"
  }'
```
2. Save the `id` (`<TOKEN_ID>`) from the response. You will need this value to create instances.  
Example response:  
```  
{  
  "success": true,  
  "result": {  
    "id": "<TOKEN_ID>",  
    "name": "AI Search Service Token",  
    "cf_api_id": "<CF_API_ID>",  
    "created_at": "2025-12-25 01:52:28",  
    "modified_at": "2025-12-25 01:52:28",  
    "enabled": true  
  }  
}  
```

## 5\. Create an AI Search instance

Use the [Create instance API](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/create/) to create an AI Search instance. Replace `<ACCOUNT_ID>` with your [account ID](https://developers.cloudflare.com/fundamentals/account/find-account-and-zone-ids/) and `<AI_SEARCH_API_TOKEN>` with the token from [step 3](#3-create-an-ai-search-api-token).

1. Choose your data source type and run the corresponding request.
**[R2 bucket](https://developers.cloudflare.com/ai-search/configuration/data-source/r2/):**
```
curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai-search/instances" \
  -H "Authorization: Bearer <AI_SEARCH_API_TOKEN>" \
  -H "Content-Type: application/json" \
  --data '{
    "id": "my-r2-rag",
    "token_id": "<TOKEN_ID>",
    "type": "r2",
    "source": "<R2_BUCKET_NAME>"
  }'
```
**[Website](https://developers.cloudflare.com/ai-search/configuration/data-source/website/):**
```
curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai-search/instances" \
  -H "Authorization: Bearer <AI_SEARCH_API_TOKEN>" \
  -H "Content-Type: application/json" \
  --data '{
    "id": "my-web-rag",
    "token_id": "<TOKEN_ID>",
    "type": "web-crawler",
    "source": "<DOMAIN_IN_YOUR_ACCOUNT>"
  }'
```
2. Wait for indexing to complete. You can monitor progress in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/?to=/:account/ai/ai-search).

Note

The `token_id` field is optional if you have previously created an AI Search instance, either via the [dashboard](https://developers.cloudflare.com/ai-search/get-started/dashboard/) or via API with `token_id` included.
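Because `token_id` is optional in that case, a script can build the request body with or without it. The following is a minimal sketch under stated assumptions: `instance_payload` and `create_instance` are hypothetical helper names, and `ACCOUNT_ID` and `AI_SEARCH_API_TOKEN` are assumed to be set as environment variables. The endpoint and field names are the ones used in the requests above.

```shell
# Hypothetical helper: prints the JSON body for the Create instance API.
# $1 = instance ID, $2 = type ("r2" or "web-crawler"), $3 = source, $4 = optional token ID
# Values are assumed to contain no quotes or backslashes; no JSON escaping is done.
instance_payload() {
  if [ -n "${4:-}" ]; then
    printf '{"id":"%s","token_id":"%s","type":"%s","source":"%s"}\n' "$1" "$4" "$2" "$3"
  else
    # token_id omitted: relies on a service token already registered for the account
    printf '{"id":"%s","type":"%s","source":"%s"}\n' "$1" "$2" "$3"
  fi
}

# Hypothetical helper: sends the request with the same arguments as instance_payload.
# Assumes ACCOUNT_ID and AI_SEARCH_API_TOKEN are set in the environment.
create_instance() {
  curl -sS -X POST "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID:?set ACCOUNT_ID first}/ai-search/instances" \
    -H "Authorization: Bearer ${AI_SEARCH_API_TOKEN:?set AI_SEARCH_API_TOKEN first}" \
    -H "Content-Type: application/json" \
    --data "$(instance_payload "$@")"
}
```

For example, `create_instance my-r2-rag r2 <R2_BUCKET_NAME>` sends the request without `token_id`, while adding `<TOKEN_ID>` as a fourth argument includes it.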

## Try it out

Once indexing is complete, you can run your first query. You can check indexing status on the **Overview** tab of your instance.

1. Go to **Compute & AI** \> **AI Search**.
2. Select your instance.
3. Select the **Playground** tab.
4. Select **Search with AI** or **Search**.
5. Enter a query to test the response.

## Add to your application

There are multiple ways you can connect AI Search to your application:

[ Workers Binding ](https://developers.cloudflare.com/ai-search/usage/workers-binding/) Query AI Search directly from your Workers code. 

[ REST API ](https://developers.cloudflare.com/ai-search/usage/rest-api/) Query AI Search using HTTP requests. 

---

---
title: Dashboard
description: Create and configure AI Search using the Cloudflare dashboard.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Dashboard

This guide walks you through creating an AI Search instance using the Cloudflare dashboard.

## Prerequisites

AI Search integrates with R2 for storing your data. You must have an active R2 subscription before creating your first AI Search instance.

[ Go to **R2 Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview) 

## Create an AI Search instance

[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search) 

1. In the Cloudflare dashboard, go to **Compute & AI** \> **AI Search**.
2. Select **Create**.
3. Choose how you want to connect your [data source](https://developers.cloudflare.com/ai-search/configuration/data-source/).
4. Configure [chunking](https://developers.cloudflare.com/ai-search/configuration/chunking/) and [embedding](https://developers.cloudflare.com/ai-search/configuration/models/) settings for how your content is processed.
5. Configure [retrieval settings](https://developers.cloudflare.com/ai-search/configuration/retrieval-configuration/) for how search results are returned.
6. Name your AI Search instance.
7. Create a [service API token](https://developers.cloudflare.com/ai-search/configuration/service-api-token/).
8. Select **Create**.

## Try it out

Once indexing is complete, you can run your first query. You can check indexing status on the **Overview** tab of your instance.

1. Go to **Compute & AI** \> **AI Search**.
2. Select your instance.
3. Select the **Playground** tab.
4. Select **Search with AI** or **Search**.
5. Enter a query to test the response.

## Add to your application

There are multiple ways you can connect AI Search to your application:

[ Workers Binding ](https://developers.cloudflare.com/ai-search/usage/workers-binding/) Query AI Search directly from your Workers code. 

[ REST API ](https://developers.cloudflare.com/ai-search/usage/rest-api/) Query AI Search using HTTP requests. 

---

---
title: Wrangler commands
description: Manage AI Search instances from the command line using Wrangler.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Wrangler commands

## `ai-search list`

List all AI Search instances

**npm:**

```
npx wrangler ai-search list
```

**pnpm:**

```
pnpm wrangler ai-search list
```

**yarn:**

```
yarn wrangler ai-search list
```

* `--json` ` boolean ` default: false  
Return output as clean JSON
* `--page` ` number ` default: 1  
Page number of the results. Set the page size with `--per-page`.
* `--per-page` ` number `  
Number of instances to show per page
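
The paging flags above can be combined with `--json` for use in scripts. This is a minimal sketch: `list_instances_page` is a hypothetical helper name, and the default page size of 20 is an arbitrary choice, not a documented default.

```shell
# Hypothetical helper: fetches one page of AI Search instances as JSON.
# $1 = page number (defaults to 1), $2 = page size (defaults to 20, chosen arbitrarily)
list_instances_page() {
  npx wrangler ai-search list --json --page "${1:-1}" --per-page "${2:-20}"
}
```

For example, `list_instances_page 2 10` requests the second page with ten instances per page.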

Global flags

* `--v` ` boolean ` alias: --version  
Show version number
* `--cwd` ` string `  
Run as if Wrangler was started in the specified directory instead of the current working directory
* `--config` ` string ` alias: --c  
Path to Wrangler configuration file
* `--env` ` string ` alias: --e  
Environment to use for operations, and for selecting .env and .dev.vars files
* `--env-file` ` string `  
Path to an .env file to load - can be specified multiple times - values from earlier files are overridden by values in later files
* `--experimental-provision` ` boolean ` aliases: --x-provision default: true  
Experimental: Enable automatic resource provisioning
* `--experimental-auto-create` ` boolean ` alias: --x-auto-create default: true  
Automatically provision draft bindings with new resources

## `ai-search create`

Create a new AI Search instance

**npm:**

```
npx wrangler ai-search create [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search create [NAME]
```

**yarn:**

```
yarn wrangler ai-search create [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance to create (must be unique).
* `--source` ` string `  
Data source identifier (R2 bucket name or web URL).
* `--type` ` string `  
The source type for the instance.
* `--embedding-model` ` string `  
Embedding model to use.
* `--generation-model` ` string `  
LLM model for chat completions.
* `--chunk-size` ` number `  
Chunk size for document splitting (min: 64).
* `--chunk-overlap` ` number `  
Overlap between document chunks.
* `--max-num-results` ` number `  
Maximum search results per query.
* `--reranking` ` boolean `  
Enable reranking of search results.
* `--reranking-model` ` string `  
Model to use for reranking.
* `--hybrid-search` ` boolean `  
Enable hybrid (keyword + vector) search.
* `--cache` ` boolean `  
Enable response caching.
* `--score-threshold` ` number `  
Minimum relevance score threshold (0-1).
* `--prefix` ` string `  
R2 key prefix to scope indexing.
* `--include-items` ` array `  
Glob patterns for items to include.
* `--exclude-items` ` array `  
Glob patterns for items to exclude.
* `--json` ` boolean ` default: false  
Return output as clean JSON
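
As an illustration of how these flags combine, the sketch below creates an instance backed by an R2 bucket. It rests on assumptions: `create_r2_instance` is a hypothetical helper name, `--type` is assumed to accept the same `r2` value as the REST API's `type` field, and the chunking values are arbitrary examples rather than recommendations.

```shell
# Hypothetical helper: creates an AI Search instance from an R2 bucket.
# $1 = instance name (must be unique), $2 = R2 bucket name
create_r2_instance() {
  # Assumption: --type takes "r2", mirroring the REST API's "type" field.
  # The chunk size (256) and overlap (10) are illustrative values only.
  npx wrangler ai-search create "$1" \
    --type r2 \
    --source "$2" \
    --chunk-size 256 \
    --chunk-overlap 10 \
    --json
}
```

For example, `create_r2_instance my-docs my-bucket` creates an instance named `my-docs` and prints the result as JSON.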

Global flags

Same as for [`ai-search list`](#ai-search-list).

## `ai-search get`

Get details of an AI Search instance

**npm:**

```
npx wrangler ai-search get [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search get [NAME]
```

**yarn:**

```
yarn wrangler ai-search get [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance.
* `--json` ` boolean ` default: false  
Return output as clean JSON

Global flags

Same as for [`ai-search list`](#ai-search-list).

## `ai-search update`

Update an AI Search instance configuration

**npm:**

```
npx wrangler ai-search update [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search update [NAME]
```

**yarn:**

```
yarn wrangler ai-search update [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance to update.
* `--embedding-model` ` string `  
Update the embedding model.
* `--generation-model` ` string `  
Update the LLM model for chat completions.
* `--chunk-size` ` number `  
Update the chunk size.
* `--chunk-overlap` ` number `  
Update the chunk overlap.
* `--max-num-results` ` number `  
Update max search results per query.
* `--reranking` ` boolean `  
Enable or disable reranking.
* `--reranking-model` ` string `  
Update the reranking model.
* `--hybrid-search` ` boolean `  
Enable or disable hybrid search.
* `--cache` ` boolean `  
Enable or disable caching.
* `--score-threshold` ` number `  
Update the minimum relevance score threshold (0-1).
* `--paused` ` boolean `  
Pause or resume the instance.
* `--json` ` boolean ` default: false  
Return output as clean JSON
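
A common use of `update` is adjusting retrieval settings on an existing instance. The sketch below is one way to script that, with `tune_retrieval` as a hypothetical helper name. It only uses flags listed above and does not validate that the threshold falls between 0 and 1.

```shell
# Hypothetical helper: updates retrieval settings for an existing instance.
# $1 = instance name, $2 = maximum results per query, $3 = minimum relevance score (0-1)
tune_retrieval() {
  npx wrangler ai-search update "$1" \
    --max-num-results "$2" \
    --score-threshold "$3" \
    --json
}
```

For example, `tune_retrieval my-docs 20 0.4` sets the maximum number of results to 20 and the score threshold to 0.4.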

Global flags

Same as for [`ai-search list`](#ai-search-list).

## `ai-search delete`

Delete an AI Search instance

**npm:**

```
npx wrangler ai-search delete [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search delete [NAME]
```

**yarn:**

```
yarn wrangler ai-search delete [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance to delete.
* `--force` ` boolean ` alias: --y default: false  
Skip confirmation
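
In scripts or CI, where no one can answer the confirmation prompt, `--force` is typically needed. The following sketch wraps the command in a hypothetical helper, `delete_instance`, that refuses to run without a name so that an empty variable does not produce a malformed command.

```shell
# Hypothetical helper: deletes an AI Search instance without the interactive prompt.
# $1 = instance name
delete_instance() {
  if [ -z "${1:-}" ]; then
    echo "usage: delete_instance NAME" >&2
    return 1
  fi
  npx wrangler ai-search delete "$1" --force
}
```

Because `--force` skips confirmation, double-check the instance name before calling the helper.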

Global flags

Same as for [`ai-search list`](#ai-search-list).

## `ai-search stats`

Get usage statistics for an AI Search instance

**npm:**

```
npx wrangler ai-search stats [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search stats [NAME]
```

**yarn:**

```
yarn wrangler ai-search stats [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance.
* `--json` ` boolean ` default: false  
Return output as clean JSON

Global flags

Same as for [`ai-search list`](#ai-search-list).

## `ai-search search`

Execute a semantic search query against an AI Search instance

**npm:**

```
npx wrangler ai-search search [NAME]
```

**pnpm:**

```
pnpm wrangler ai-search search [NAME]
```

**yarn:**

```
yarn wrangler ai-search search [NAME]
```

* `[NAME]` ` string ` required  
The name of the AI Search instance.
* `--query` ` string ` required  
The search query text.
* `--max-num-results` ` number `  
Override maximum number of results.
* `--score-threshold` ` number `  
Override minimum relevance score (0-1).
* `--reranking` ` boolean `  
Override reranking setting.
* `--filter` ` array `  
Metadata filter as key=value (repeatable, e.g. --filter type=docs --filter lang=en).
* `--json` ` boolean ` default: false  
Return output as clean JSON
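
Because `--filter` is repeatable, a script can build the flag list from any number of `key=value` pairs. The sketch below shows one way to do that in POSIX shell. `search_with_filters` is a hypothetical helper name, and the result limit of 5 is an arbitrary example value.

```shell
# Hypothetical helper: runs a search with zero or more metadata filters.
# $1 = instance name, $2 = query text, remaining arguments = key=value filters
search_with_filters() {
  name="$1"
  query="$2"
  shift 2
  # Rewrite each key=value argument as a "--filter key=value" pair.
  for kv in "$@"; do
    set -- "$@" --filter "$kv"
    shift
  done
  npx wrangler ai-search search "$name" --query "$query" --max-num-results 5 --json "$@"
}
```

For example, `search_with_filters my-docs "how do I deploy" type=docs lang=en` passes `--filter type=docs --filter lang=en` to Wrangler.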

Global flags

Same as for [`ai-search list`](#ai-search-list).

---

---
title: Configuration
description: You can customize how your AI Search instance indexes your data, and retrieves and generates responses for queries. Some settings can be updated after the instance is created, while others are fixed at creation time.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Configuration

You can customize how your AI Search instance indexes your data, and retrieves and generates responses for queries. Some settings can be updated after the instance is created, while others are fixed at creation time.

The table below lists all available configuration options:

| Configuration                                                                                                   | Editable after creation | Description                                                                                               |
| --------------------------------------------------------------------------------------------------------------- | ----------------------- | --------------------------------------------------------------------------------------------------------- |
| [Data source](https://developers.cloudflare.com/ai-search/configuration/data-source/)                           | no                      | The source where your knowledge base is stored                                                            |
| [Custom metadata schema](https://developers.cloudflare.com/ai-search/configuration/metadata/#define-a-schema)   | no                      | Define custom metadata fields for filtering (max 5 fields)                                                |
| [Path filtering](https://developers.cloudflare.com/ai-search/configuration/path-filtering/)                     | yes                     | Include or exclude specific paths from indexing                                                           |
| [Chunk size](https://developers.cloudflare.com/ai-search/configuration/chunking/)                               | yes                     | Number of tokens per chunk                                                                                |
| [Chunk overlap](https://developers.cloudflare.com/ai-search/configuration/chunking/)                            | yes                     | Number of overlapping tokens between chunks                                                               |
| [Embedding model](https://developers.cloudflare.com/ai-search/configuration/models/)                            | no                      | Model used to generate vector embeddings                                                                  |
| [Query rewrite](https://developers.cloudflare.com/ai-search/configuration/query-rewriting/)                     | yes                     | Enable or disable query rewriting before retrieval                                                        |
| [Query rewrite model](https://developers.cloudflare.com/ai-search/configuration/models/)                        | yes                     | Model used for query rewriting                                                                            |
| [Query rewrite system prompt](https://developers.cloudflare.com/ai-search/configuration/system-prompt/)         | yes                     | Custom system prompt to guide query rewriting behavior                                                    |
| [Match threshold](https://developers.cloudflare.com/ai-search/configuration/retrieval-configuration/)           | yes                     | Minimum similarity score required for a vector match                                                      |
| [Maximum number of results](https://developers.cloudflare.com/ai-search/configuration/retrieval-configuration/) | yes                     | Maximum number of vector matches returned (top\_k)                                                        |
| [Reranking](https://developers.cloudflare.com/ai-search/configuration/reranking/)                               | yes                     | Reorder retrieved results by semantic relevance using a reranking model after initial retrieval           |
| [Generation model](https://developers.cloudflare.com/ai-search/configuration/models/)                           | yes                     | Model used to generate the final response                                                                 |
| [Generation system prompt](https://developers.cloudflare.com/ai-search/configuration/system-prompt/)            | yes                     | Custom system prompt to guide response generation                                                         |
| [Similarity caching](https://developers.cloudflare.com/ai-search/configuration/cache/)                          | yes                     | Enable or disable caching of responses for similar (not just exact) prompts                               |
| [Similarity caching threshold](https://developers.cloudflare.com/ai-search/configuration/cache/)                | yes                     | Controls how similar a new prompt must be to a previous one to reuse its cached response                  |
| [AI Gateway](https://developers.cloudflare.com/ai-gateway)                                                      | yes                     | AI Gateway for monitoring and controlling model usage                                                     |
| AI Search name                                                                                                  | no                      | Name of your AI Search instance                                                                           |
| [Service API token](https://developers.cloudflare.com/ai-search/configuration/service-api-token/)               | yes                     | API token that grants AI Search permission to configure resources on your account                         |
| [Public endpoint](https://developers.cloudflare.com/ai-search/configuration/public-endpoint/)                   | yes                     | Enable public access to search, chat, and MCP endpoints                                                   |
| [UI snippets](https://developers.cloudflare.com/ai-search/configuration/embed-search-snippets/)                 | yes                     | Embed pre-built search and chat components in your website                                                |


---

---
title: Similarity cache
description: Similarity-based caching in AI Search lets you serve responses from Cloudflare’s cache for queries that are similar to previous requests, rather than creating new, unique responses for every request. This speeds up response times and cuts costs by reusing answers for questions that are close in meaning.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Similarity cache

Similarity-based caching in AI Search lets you serve responses from Cloudflare’s cache for queries that are similar to previous requests, rather than creating new, unique responses for every request. This speeds up response times and cuts costs by reusing answers for questions that are close in meaning.

## How it works

Unlike exact-match caching, which generates a new response whenever a prompt differs at all from earlier ones, similarity-based caching handles each request as follows:

1. AI Search checks if a _similar_ prompt (based on your chosen threshold) has been answered before.
2. If a match is found, it returns the cached response instantly.
3. If no match is found, it generates a new response and caches it.

To see if a response came from the cache, check the `cf-aig-cache-status` header: `HIT` for cached and `MISS` for new.
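As a minimal sketch, the header check could look like the following. It assumes a Fetch-style response object whose `headers.get()` returns `null` when a header is absent; the helper name `cacheStatus` is illustrative and not part of any AI Search API.

```javascript
// Illustrative helper (an assumption, not an AI Search API): report the
// value of the `cf-aig-cache-status` header on a Fetch-style response.
function cacheStatus(response) {
  const value = response.headers.get("cf-aig-cache-status");
  // The header may be missing, for example if the response did not pass
  // through the cache layer at all.
  if (value === null || value === undefined) return "UNKNOWN";
  return value.toUpperCase(); // "HIT" for cached, "MISS" for newly generated
}
```

You might log this value alongside latency to see how often your chosen threshold produces cache hits.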

## What to consider when using similarity cache

Consider these behaviors when using similarity caching:

* **Volatile cache**: If two similar requests arrive at the same time, the first response might not be cached in time for the second request to use it, resulting in a `MISS`.
* **30-day cache**: Cached responses expire automatically after 30 days. Custom durations are not currently supported.
* **Data dependency**: Cached responses are tied to specific document chunks. If those chunks change or are deleted, the cache is cleared to keep answers fresh.

## How similarity matching works

AI Search’s similarity cache uses **MinHash and Locality-Sensitive Hashing (LSH)** to find and reuse responses for prompts that are worded similarly.

Here’s how it works when a new prompt comes in:

1. The prompt is split into small overlapping chunks of words (called shingles), like “what’s the” or “the weather.”
2. These shingles are turned into a “fingerprint” using MinHash. The more overlap two prompts have, the more similar their fingerprints will be.
3. Fingerprints are placed into LSH buckets, which help AI Search quickly find similar prompts without comparing every single one.
4. If a past prompt in the same bucket is similar enough (based on your configured threshold), AI Search reuses its cached response.
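The steps above can be sketched in a few lines of JavaScript. This is a simplified illustration of MinHash with banded LSH, not AI Search's actual implementation: the shingle size, the number of hash functions, the seeded FNV-1a hash, and the band count are all assumptions chosen for readability.

```javascript
// Step 1: split a prompt into overlapping word shingles of size k.
function shingles(text, k = 2) {
  const words = text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, "") // drop punctuation
    .split(/\s+/)
    .filter(Boolean);
  const out = new Set();
  for (let i = 0; i + k <= words.length; i++) {
    out.add(words.slice(i, i + k).join(" "));
  }
  return out;
}

// Seeded 32-bit FNV-1a; each seed acts as a different hash function.
function hash(str, seed) {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Step 2: the MinHash fingerprint keeps, per hash function, the smallest
// hash value seen across all shingles.
function minhash(shingleSet, numHashes = 64) {
  const sig = new Array(numHashes).fill(0xffffffff);
  for (const s of shingleSet) {
    for (let i = 0; i < numHashes; i++) {
      const h = hash(s, i);
      if (h < sig[i]) sig[i] = h;
    }
  }
  return sig;
}

// The fraction of matching positions estimates the overlap of two prompts.
function estimateSimilarity(sigA, sigB) {
  let same = 0;
  for (let i = 0; i < sigA.length; i++) if (sigA[i] === sigB[i]) same++;
  return same / sigA.length;
}

// Step 3: cut the fingerprint into bands; each band is one LSH bucket key.
// Assumes the fingerprint length is divisible by the number of bands.
function lshBuckets(sig, bands = 16) {
  const rows = sig.length / bands;
  const keys = [];
  for (let b = 0; b < bands; b++) {
    keys.push(b + ":" + sig.slice(b * rows, (b + 1) * rows).join(","));
  }
  return keys;
}

// Step 4 (candidate lookup): two prompts are compared only if they share
// at least one bucket.
function sharesBucket(sigA, sigB, bands = 16) {
  const keys = new Set(lshBuckets(sigA, bands));
  return lshBuckets(sigB, bands).some((k) => keys.has(k));
}
```

In this sketch, a prompt that shares a bucket with a cached prompt would then be accepted or rejected by comparing `estimateSimilarity` against the configured threshold.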

## Choosing a threshold

The similarity threshold decides how close two prompts need to be to reuse a cached response. Here are the available thresholds:

| Threshold        | Description                 | Example Match                                                                   |
| ---------------- | --------------------------- | ------------------------------------------------------------------------------- |
| Exact            | Near-identical matches only | "What’s the weather like today?" matches with "What is the weather like today?" |
| Strong (default) | High semantic similarity    | "What’s the weather like today?" matches with "How’s the weather today?"        |
| Broad            | Moderate match, more hits   | "What’s the weather like today?" matches with "Tell me today’s weather"         |
| Loose            | Low similarity, max reuse   | "What’s the weather like today?" matches with "Give me the forecast"            |

Test these values to see which works best with your [RAG application](https://developers.cloudflare.com/ai-search/).


---

---
title: Chunking
description: Chunking is the process of splitting large data into smaller segments before embedding them for search. AI Search uses recursive chunking, which breaks your content at natural boundaries (like paragraphs or sentences), and then further splits it if the chunks are too large.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Chunking

Chunking is the process of splitting large data into smaller segments before embedding them for search. AI Search uses **recursive chunking**, which breaks your content at natural boundaries (like paragraphs or sentences), and then further splits it if the chunks are too large.

## What is recursive chunking

Recursive chunking tries to keep chunks meaningful by:

* **Splitting at natural boundaries:** like paragraphs, then sentences.
* **Checking the size:** if a chunk is too long (based on token count), it’s split again into smaller parts.

This way, chunks are easy to embed and retrieve, without cutting off thoughts mid-sentence.
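A minimal sketch of the idea follows. It is not AI Search's chunker: it approximates tokens by counting whitespace-separated words, and the separator list (`"\n\n"` for paragraphs, `". "` for sentences, `" "` for words) is an assumption.

```javascript
// Rough token estimate (assumption): one token per whitespace-separated word.
const countTokens = (text) => text.split(/\s+/).filter(Boolean).length;

// Split at the coarsest separator first, merge neighbouring pieces while
// they fit, and recurse with finer separators on anything still too large.
function recursiveChunk(text, maxTokens, separators = ["\n\n", ". ", " "]) {
  const trimmed = text.trim();
  if (trimmed === "") return [];
  if (countTokens(trimmed) <= maxTokens) return [trimmed];
  if (separators.length === 0) return [trimmed]; // nothing finer to split on

  const [sep, ...finer] = separators;
  const pieces = trimmed.split(sep);
  if (pieces.length === 1) return recursiveChunk(trimmed, maxTokens, finer);

  const chunks = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current ? current + sep + piece : piece;
    if (countTokens(candidate) <= maxTokens) {
      current = candidate; // still fits: keep growing this chunk
    } else {
      // Flush what we have (splitting it further if needed) and start over.
      if (current) chunks.push(...recursiveChunk(current, maxTokens, finer));
      current = piece;
    }
  }
  if (current) chunks.push(...recursiveChunk(current, maxTokens, finer));
  return chunks;
}
```

Note that in this simplified version the separator at a chunk boundary (for example, the period between two sentences) is dropped from the output.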

## Chunking controls

AI Search exposes two parameters to help you control chunking behavior:

* **Chunk size**: The number of tokens per chunk. The option range may vary depending on the model.
* **Chunk overlap**: The percentage of overlapping tokens between adjacent chunks.  
   * Minimum: `0%`  
   * Maximum: `30%`

These settings apply during the indexing step, before your data is embedded and stored in Vectorize.
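To make the two parameters concrete, here is a small sketch of how a chunk size and an overlap percentage translate into token windows. It assumes the overlap is rounded down to a whole number of tokens and that chunks are fixed-size windows; the real indexer also respects natural boundaries, so treat this only as an illustration of the arithmetic.

```javascript
// Slide a window of `chunkSize` tokens over the input, stepping forward by
// chunkSize minus the overlap so adjacent chunks share tokens.
function chunkWithOverlap(tokens, chunkSize, overlapPercent) {
  if (overlapPercent < 0 || overlapPercent > 30) {
    throw new RangeError("overlap must be between 0% and 30%");
  }
  const overlap = Math.floor((chunkSize * overlapPercent) / 100);
  const step = chunkSize - overlap;
  const chunks = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) break; // last window reached the end
  }
  return chunks;
}
```

For example, with a chunk size of 5 and 20% overlap, each chunk repeats the final token of the previous one.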

## Choosing chunk size and overlap

Chunking affects both how your content is retrieved and how much context is passed into the generation model. Try out this external [chunk visualizer tool ↗](https://huggingface.co/spaces/m-ric/chunk%5Fvisualizer) to help understand how different chunk settings could look.

### Additional considerations

* **Vector index size:** Smaller chunk sizes produce more chunks and more total vectors. Refer to the [Vectorize limits](https://developers.cloudflare.com/vectorize/platform/limits/) to ensure your configuration stays within the maximum allowed vectors per index.
* **Generation model context window:** Generation models have a limited context window that must fit all retrieved chunks (`topK` × `chunk size`), the user query, and the model’s output. Be careful with large chunks or high topK values to avoid context overflows.
* **Cost and performance:** Larger chunks and higher topK settings result in more tokens passed to the model, which can increase latency and cost. You can monitor this usage in [AI Gateway](https://developers.cloudflare.com/ai-gateway/).
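The context-window consideration above is simple arithmetic, sketched here as a worst-case estimate. The function names and the example numbers are hypothetical; check your generation model's actual context window before relying on any figure.

```javascript
// Worst-case tokens sent to the generation model: every retrieved chunk is
// full-size, plus the user query and the space reserved for the answer.
function estimateContextTokens({ topK, chunkSize, queryTokens, maxOutputTokens }) {
  return topK * chunkSize + queryTokens + maxOutputTokens;
}

function fitsContextWindow(settings, contextWindow) {
  return estimateContextTokens(settings) <= contextWindow;
}

// Hypothetical settings: 10 chunks of 512 tokens, a 50-token query,
// and 1,024 tokens reserved for the response.
const settings = { topK: 10, chunkSize: 512, queryTokens: 50, maxOutputTokens: 1024 };
console.log(estimateContextTokens(settings)); // 6194
```

With these numbers the request fits an 8,192-token window, but doubling `topK` to 20 would not.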


---

---
title: Data source
description: AI Search can directly ingest data from the following sources:
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Data source

AI Search can directly ingest data from the following sources:

| Data Source                                                                               | Description                                               |
| ----------------------------------------------------------------------------------------- | --------------------------------------------------------- |
| [Website](https://developers.cloudflare.com/ai-search/configuration/data-source/website/) | Connect a domain you own to index website pages.          |
| [R2 Bucket](https://developers.cloudflare.com/ai-search/configuration/data-source/r2/)    | Connect a Cloudflare R2 bucket to index stored documents. |


---

---
title: R2
description: You can use Cloudflare R2 to store data for indexing. To get started, configure an R2 bucket containing your data.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# R2

You can use Cloudflare R2 to store data for indexing. To get started, [configure an R2 bucket](https://developers.cloudflare.com/r2/get-started/) containing your data.

AI Search will automatically scan and process supported files stored in that bucket. Files that are unsupported or exceed the size limit will be skipped during indexing and logged as errors.

## Path filtering

You can control which files get indexed by defining include and exclude rules for object paths. Use this to limit indexing to specific folders or to exclude files you do not want searchable.

For example, to index only documentation while excluding drafts:

* **Include:** `/docs/**`
* **Exclude:** `/docs/drafts/**`

Refer to [Path filtering](https://developers.cloudflare.com/ai-search/configuration/path-filtering/) for pattern syntax, filtering behavior, and more examples.

## File limits

AI Search indexes files of **up to 4 MB** each.

Files that exceed this limit are not indexed and appear in the error logs.

## File types

AI Search can ingest a variety of different file types to power your RAG. The following plain text files and rich format files are supported.

### Plain text file types

AI Search supports the following plain text file types:

| Format     | File extensions                                                  | MIME type                                                       |
| ---------- | ---------------------------------------------------------------- | --------------------------------------------------------------- |
| Text       | .txt, .rst                                                       | text/plain                                                      |
| Log        | .log                                                             | text/plain                                                      |
| Config     | .ini, .conf, .env, .properties, .gitignore, .editorconfig, .toml | text/plain, text/toml                                           |
| Markdown   | .markdown, .md, .mdx                                             | text/markdown                                                   |
| LaTeX      | .tex, .latex                                                     | application/x-tex, application/x-latex                          |
| Script     | .sh, .bat, .ps1                                                  | application/x-sh, application/x-msdos-batch, text/x-powershell  |
| SGML       | .sgml                                                            | text/sgml                                                       |
| JSON       | .json                                                            | application/json                                                |
| YAML       | .yaml, .yml                                                      | application/x-yaml                                              |
| CSS        | .css                                                             | text/css                                                        |
| JavaScript | .js                                                              | application/javascript                                          |
| PHP        | .php                                                             | application/x-httpd-php                                         |
| Python     | .py                                                              | text/x-python                                                   |
| Ruby       | .rb                                                              | text/x-ruby                                                     |
| Java       | .java                                                            | text/x-java-source                                              |
| C          | .c                                                               | text/x-c                                                        |
| C++        | .cpp, .cxx                                                       | text/x-c++                                                      |
| C Header   | .h, .hpp                                                         | text/x-c-header                                                 |
| Go         | .go                                                              | text/x-go                                                       |
| Rust       | .rs                                                              | text/rust                                                       |
| Swift      | .swift                                                           | text/swift                                                      |
| Dart       | .dart                                                            | text/dart                                                       |
| Emacs Lisp | .el                                                              | application/x-elisp, text/x-elisp, text/x-emacs-lisp            |

### Rich format file types

AI Search uses [Markdown Conversion](https://developers.cloudflare.com/workers-ai/features/markdown-conversion/) to convert rich format files to markdown. The following table lists the supported formats that will be converted to Markdown:

| Format                     | File extensions                       | MIME types |
| -------------------------- | ------------------------------------- | ---------- |
| PDF Documents              | .pdf                                  | application/pdf |
| Images¹                    | .jpeg, .jpg, .png, .webp, .svg        | image/jpeg, image/png, image/webp, image/svg+xml |
| HTML Documents             | .html, .htm                           | text/html |
| XML Documents              | .xml                                  | application/xml |
| Microsoft Office Documents | .xlsx, .xlsm, .xlsb, .xls, .et, .docx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-excel.sheet.macroenabled.12, application/vnd.ms-excel.sheet.binary.macroenabled.12, application/vnd.ms-excel, application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| Open Document Format       | .ods, .odt                            | application/vnd.oasis.opendocument.spreadsheet, application/vnd.oasis.opendocument.text |
| CSV                        | .csv                                  | text/csv |
| Apple Documents            | .numbers                              | application/vnd.apple.numbers |

¹ Image conversion uses two Workers AI models for object detection and summarization. See [Workers AI pricing](https://developers.cloudflare.com/workers-ai/features/markdown-conversion/#pricing) for more details.

## Custom metadata

You can attach custom metadata to R2 objects for filtering search results. AI Search reads metadata from S3-compatible custom headers (`x-amz-meta-*`).

Before metadata can be extracted, you must [define a schema](https://developers.cloudflare.com/ai-search/configuration/metadata/#define-a-schema) in your AI Search configuration.

### Set metadata when uploading


Use the `customMetadata` option when uploading objects with the [R2 Workers binding](https://developers.cloudflare.com/r2/api/workers/workers-api-usage/):

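```js
// Runs inside a Worker handler. `env.MY_BUCKET` is your R2 bucket binding and
// `fileContent` is a placeholder for the object body you are uploading.
await env.MY_BUCKET.put("docs/document.pdf", fileContent, {
  // Custom metadata values are strings; AI Search casts them using your schema.
  customMetadata: {
    category: "documentation",
    version: "2.5",
    is_public: "true",
  },
});
```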

Use the `Metadata` option with the [AWS SDK for JavaScript](https://developers.cloudflare.com/r2/api/s3/api/):

```js
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// `accountId`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`, and `fileContent`
// are placeholders; supply your own values.
const client = new S3Client({
  region: "auto",
  endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: R2_ACCESS_KEY_ID,
    secretAccessKey: R2_SECRET_ACCESS_KEY,
  },
});

await client.send(
  new PutObjectCommand({
    Bucket: "your-bucket",
    Key: "docs/document.pdf",
    Body: fileContent,
    // Each key is sent as an `x-amz-meta-<key>` header.
    Metadata: {
      category: "documentation",
      version: "2.5",
      is_public: "true",
    },
  }),
);
```

Use the `--header` flag with [Wrangler](https://developers.cloudflare.com/r2/reference/wrangler-commands/) to set `x-amz-meta-*` headers:

```sh
wrangler r2 object put your-bucket/docs/document.pdf \
  --file=./document.pdf \
  --header="x-amz-meta-category:documentation" \
  --header="x-amz-meta-version:2.5" \
  --header="x-amz-meta-is_public:true"
```

### How metadata extraction works

When a file is fetched from R2 during indexing:

1. All `x-amz-meta-*` headers are read from the object.
2. The `x-amz-meta-` prefix is stripped (for example, `x-amz-meta-category` becomes `category`).
3. Field names are matched against your schema (case-insensitive).
4. Values are cast to the configured data type.
5. Invalid values (for example, a non-numeric string for a `number` type) are silently ignored.
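The extraction steps above can be sketched as follows. This is an illustration, not AI Search's code: the schema shape (`{ fieldName: type }`) and the `text` and `boolean` type names are assumptions; only the `number` type is named in this page.

```javascript
const PREFIX = "x-amz-meta-";

// Cast a raw header string to the schema type; return undefined if invalid.
function castValue(raw, type) {
  switch (type) {
    case "number": {
      const n = Number(raw);
      return raw.trim() !== "" && Number.isFinite(n) ? n : undefined;
    }
    case "boolean": {
      const v = raw.trim().toLowerCase();
      return v === "true" ? true : v === "false" ? false : undefined;
    }
    default:
      return raw; // treat everything else as text
  }
}

// headers: plain object of header name -> string value.
// schema: hypothetical shape, e.g. { category: "text", version: "number" }.
function extractMetadata(headers, schema) {
  const fields = new Map(
    Object.entries(schema).map(([name, type]) => [name.toLowerCase(), { name, type }]),
  );
  const metadata = {};
  for (const [header, raw] of Object.entries(headers)) {
    const lower = header.toLowerCase();
    if (!lower.startsWith(PREFIX)) continue;            // step 1: only x-amz-meta-*
    const field = fields.get(lower.slice(PREFIX.length)); // steps 2-3: strip prefix, match schema
    if (!field) continue;
    const value = castValue(raw, field.type);           // step 4: cast to configured type
    if (value !== undefined) metadata[field.name] = value; // step 5: drop invalid values
  }
  return metadata;
}
```

In this sketch a header such as `x-amz-meta-pages: abc` for a `number` field is simply omitted from the result, mirroring the "silently ignored" behavior described above.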

### Unicode support

Metadata values support Unicode characters through MIME-Word encoding (RFC 2047). Most S3-compatible tools handle this encoding automatically.


---

---
title: Website
description: The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Website

The Website data source allows you to connect a domain you own so its pages can be crawled, stored, and indexed.

You can only crawl domains that you have onboarded onto the same Cloudflare account. Refer to [Onboard a domain](https://developers.cloudflare.com/fundamentals/manage-domains/add-site/) for more information on adding a domain to your Cloudflare account.

Bot protection may block crawling

If you use Cloudflare products that control or restrict bot traffic such as [Bot Management](https://developers.cloudflare.com/bots/), [Web Application Firewall (WAF)](https://developers.cloudflare.com/waf/), or [Turnstile](https://developers.cloudflare.com/turnstile/), the same rules will apply to the AI Search crawler. Make sure to configure an exception or an allow-list for the AI Search crawler in your settings.

## How website crawling works

When you connect a domain, the crawler looks for your website's sitemap to determine which pages to visit:

1. If you configure one or more custom sitemap URLs in the dashboard under **Parser options** \> **Specific sitemap**, AI Search crawls only those sitemap URLs.
2. Otherwise, the crawler checks `robots.txt` for listed sitemaps.
3. If no `robots.txt` is found, the crawler checks for a sitemap at `/sitemap.xml`.
4. If no sitemap is available, the domain cannot be crawled.
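The discovery order above can be expressed as a small decision function. This is a sketch under assumptions: the function and parameter names are hypothetical, and the page does not say what happens when `robots.txt` exists but lists no sitemaps, so this version falls back to `/sitemap.xml` in that case.

```javascript
// Collect `Sitemap:` entries from robots.txt, resolving partial URLs
// against the site origin.
function sitemapsFromRobots(robotsTxt, origin) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*sitemap:\s*(\S+)/i))
    .filter(Boolean)
    .map((m) => new URL(m[1], origin).href);
}

// Returns the sitemap URLs the crawler would try, in order of precedence.
// `robotsTxt` is null when no robots.txt was found.
function resolveSitemaps({ origin, customSitemaps = [], robotsTxt = null }) {
  if (customSitemaps.length > 0) return customSitemaps; // 1. custom sitemaps win
  if (robotsTxt !== null) {
    const listed = sitemapsFromRobots(robotsTxt, origin); // 2. robots.txt entries
    if (listed.length > 0) return listed;
  }
  // 3. Default location. Assumption: also used when robots.txt lists none.
  return [new URL("/sitemap.xml", origin).href];
}
```

If none of the returned URLs resolves to a valid sitemap, the domain cannot be crawled (step 4).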

### Indexing order

If your sitemaps include `<priority>` attributes, AI Search reads all sitemaps and indexes pages based on each page's priority value, regardless of which sitemap the page is in.

If no `<priority>` is specified, pages are indexed in the order the sitemaps are provided, either from the configured custom sitemap URLs or from `robots.txt` from top to bottom.
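A rough sketch of these ordering rules is shown below. It assumes pages without a `<priority>` get the sitemap protocol's default of 0.5 when other pages do specify one, and that ties keep their original order; AI Search's exact tie-breaking is not documented here.

```javascript
// sitemaps: array of sitemaps in the order provided; each sitemap is an
// array of { loc, priority? } entries in document order.
function indexingOrder(sitemaps) {
  const pages = sitemaps.flat();
  const anyPriority = pages.some((p) => typeof p.priority === "number");
  if (!anyPriority) return pages.map((p) => p.loc); // keep sitemap order
  return [...pages]
    .sort((a, b) => (b.priority ?? 0.5) - (a.priority ?? 0.5)) // stable sort, highest first
    .map((p) => p.loc);
}
```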

AI Search supports `.gz` compressed sitemaps. Both `robots.txt` and sitemaps can use partial URLs.

## Path filtering

You can control which pages get indexed by defining include and exclude rules for URL paths. Use this to limit indexing to specific sections of your site or to exclude content you do not want searchable.

Note

Path filtering matches against the full URL, including the scheme, hostname, and subdomains. For example, a page at `https://www.example.com/blog/post` requires a pattern like `**/blog/**` to match. Using `/blog/**` alone will not match because it does not account for the hostname.

For example, to index only blog posts while excluding drafts:

* **Include:** `**/blog/**`
* **Exclude:** `**/blog/drafts/**`

Refer to [Path filtering](https://developers.cloudflare.com/ai-search/configuration/path-filtering/) for pattern syntax, filtering behavior, and more examples.
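To see why the hostname matters, the following sketch approximates the matching with a tiny glob-to-regex converter. It assumes `**` matches any characters including `/`, `*` matches within a single path segment, and exclude rules take precedence over include rules; refer to the Path filtering page for the authoritative syntax.

```javascript
// Convert a glob pattern into an anchored regular expression.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex specials except *
  const pattern = escaped
    .replace(/\*\*/g, "\u0000")   // protect ** while handling single *
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp("^" + pattern + "$");
}

// A URL is indexed if it matches an include rule (or there are none)
// and matches no exclude rule.
function shouldIndex(url, include = [], exclude = []) {
  const matches = (patterns) => patterns.some((p) => globToRegExp(p).test(url));
  if (matches(exclude)) return false;
  return include.length === 0 || matches(include);
}

const include = ["**/blog/**"];
const exclude = ["**/blog/drafts/**"];
console.log(shouldIndex("https://www.example.com/blog/post", include, exclude));       // true
console.log(shouldIndex("https://www.example.com/blog/drafts/wip", include, exclude)); // false
console.log(shouldIndex("https://www.example.com/blog/post", ["/blog/**"], []));       // false
```

The last call returns `false` because `/blog/**` is anchored at the start of the full URL, which begins with the scheme and hostname.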

## Best practices for robots.txt and sitemap

Configure your `robots.txt` and sitemap to help AI Search crawl your site efficiently.

### robots.txt

The AI Search crawler uses the user agent `Cloudflare-AI-Search`. Your `robots.txt` file should reference your sitemap and allow the crawler:

robots.txt

```
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```

You can list multiple sitemaps or use a sitemap index file:

robots.txt

```
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog-sitemap.xml
Sitemap: https://example.com/sitemap.xml.gz
```

To block all other crawlers and allow only AI Search:

robots.txt

```
User-agent: *
Disallow: /

User-agent: Cloudflare-AI-Search
Allow: /

Sitemap: https://example.com/sitemap.xml
```

### Sitemap

Structure your sitemap to give AI Search the information it needs to crawl efficiently:

sitemap.xml

```
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/important-page</loc>
    <lastmod>2026-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/other-page</loc>
    <lastmod>2026-01-10</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
```

Use these attributes to control crawling behavior:

| Attribute      | Purpose                       | Recommendation                                                                                                        |
| -------------- | ----------------------------- | --------------------------------------------------------------------------------------------------------------------- |
| `<loc>`        | URL of the page               | Required. Use full or partial URLs.                                                                                   |
| `<lastmod>`    | Last modification date        | Include to enable change detection. AI Search re-crawls pages when this date changes.                                 |
| `<changefreq>` | Expected change frequency     | Use when `<lastmod>` is not available. Values: `always`, `hourly`, `daily`, `weekly`, `monthly`, `yearly`, `never`.   |
| `<priority>`   | Relative importance (0.0-1.0) | Set higher values for important pages. AI Search indexes pages in priority order.                                     |
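As an illustration of `<lastmod>`-based change detection, the sketch below compares the dates recorded at the previous sync with the current sitemap. The function name and data shapes are hypothetical, and it assumes new pages are always crawled.

```javascript
// previous: Map of page URL -> <lastmod> value seen at the last sync.
// current: array of { loc, lastmod } parsed from the sitemap now.
function pagesToRecrawl(previous, current) {
  return current
    .filter(({ loc, lastmod }) => !previous.has(loc) || previous.get(loc) !== lastmod)
    .map(({ loc }) => loc);
}
```

Pages whose `<lastmod>` is unchanged are skipped, which is what keeps syncs efficient on large sites.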

You can also use a sitemap index to bundle other, domain-specific sitemaps:

sitemap-index.xml

```
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.example.com/sitemap-blog.xml</loc>
    <lastmod>2024-08-15T10:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-docs.xml</loc>
    <lastmod>2024-08-10T12:00:00+00:00</lastmod>
  </sitemap>
</sitemapindex>
```

When parsing a Sitemap Index, AI Search collects all child sitemaps and then crawls them recursively, collecting all relevant URLs present in your sitemaps.

### Recommendations

* **Include `<lastmod>`** on all URLs to enable efficient change detection during syncs.
* **Set `<priority>`** to control indexing order. Pages with higher priority are indexed first.
* **Use `<changefreq>`** as a fallback when `<lastmod>` is not available.
* **Use sitemap index files** for large sites with multiple sitemaps.
* **Compress large sitemaps** using `.gz` format to reduce bandwidth.
* **Keep sitemaps under 50MB** and 50,000 URLs per file (standard sitemap limits).
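
The recommendations above can be applied when you generate a sitemap programmatically. The following is a minimal TypeScript sketch and not part of AI Search: the `SitemapPage` shape and the `escapeXml` and `buildSitemap` helpers are hypothetical names chosen for illustration. It emits `<lastmod>`, `<changefreq>`, and `<priority>` only when a value is provided, and it rejects inputs above the 50,000 URL limit.

```typescript
// Hypothetical page record used for illustration only.
interface SitemapPage {
  loc: string; // URL of the page
  lastmod?: string; // W3C date, for example "2026-01-15"
  changefreq?: "always" | "hourly" | "daily" | "weekly" | "monthly" | "yearly" | "never";
  priority?: number; // 0.0 to 1.0
}

// Escape the characters that must not appear unescaped in XML text.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemap(pages: SitemapPage[]): string {
  if (pages.length > 50000) {
    throw new Error("A single sitemap file is limited to 50,000 URLs");
  }
  const entries = pages.map((page) => {
    const lines = [`    <loc>${escapeXml(page.loc)}</loc>`];
    if (page.lastmod) lines.push(`    <lastmod>${page.lastmod}</lastmod>`);
    if (page.changefreq) lines.push(`    <changefreq>${page.changefreq}</changefreq>`);
    if (page.priority !== undefined) {
      lines.push(`    <priority>${page.priority.toFixed(1)}</priority>`);
    }
    return `  <url>\n${lines.join("\n")}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

For example, `buildSitemap([{ loc: "https://example.com/important-page", lastmod: "2026-01-15", priority: 1 }])` returns an XML string that you could serve at `/sitemap.xml`.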

## How to set WAF rules to allowlist the crawler

If you have Security rules configured to block bot activity, you can add a rule to allowlist the crawler bot.

1. In the Cloudflare dashboard, go to the **Security rules** page.  
[ Go to **Security rules** ](https://dash.cloudflare.com/?to=/:account/:zone/security/security-rules)
2. To create a new empty rule, select **Create rule** \> **Custom rules**.
3. Enter a descriptive name for the rule in **Rule name**, such as `Allow AI Search`.
4. Under **When incoming requests match**, use the **Field** drop-down list to choose _Bot Detection ID_. For **Operator**, select _equals_. For **Value**, enter `122933950`.
5. Under **Then take action**, in the **Choose action** dropdown, choose _Skip_.
6. Under **Place at**, select the order of the rule in the **Select order** dropdown to be _First_. Setting the order as _First_ allows this rule to be applied before subsequent rules.
7. To save and deploy your rule, select **Deploy**.

## Parsing options

You can configure parsing options during onboarding or in your instance settings under **Parser options**.

### Specific sitemap

By default, AI Search crawls all sitemaps listed in your `robots.txt` in the order they appear (top to bottom). If you do not want the crawler to index everything, or if your sitemap is hosted at a non-standard path, you can configure custom sitemap URLs in the dashboard under **Parser options** \> **Specific sitemap**.

When custom sitemap URLs are configured, AI Search uses those sitemap URLs instead of auto-discovering sitemaps from `robots.txt` or `/sitemap.xml`. You can add up to five sitemap URLs.

### Rendering mode

You can choose how pages are parsed during crawling:

* **Static sites**: Downloads the raw HTML for each page.
* **Rendered sites**: Loads pages with a headless browser and downloads the fully rendered version, including dynamic JavaScript content. Note that the [Browser Rendering](https://developers.cloudflare.com/browser-rendering/pricing/) limits and billing apply.

## Extra headers for access-protected content

If your website has pages that are behind authentication or only visible to logged-in users, you can configure custom HTTP headers to allow the AI Search crawler to access this protected content. You can add up to five custom HTTP headers to the requests AI Search sends when crawling your site.

### Providing access to sites protected by Cloudflare Access

To allow AI Search to crawl a site protected by [Cloudflare Access](https://developers.cloudflare.com/cloudflare-one/access-controls/), you need to create service token credentials and configure them as custom headers.

Service tokens bypass user authentication, so ensure your Access policies are configured appropriately for the content you want to index. The service token will allow the AI Search crawler to access all content covered by the Service Auth policy.

1. In [Cloudflare One ↗](https://one.dash.cloudflare.com/), [create a service token](https://developers.cloudflare.com/cloudflare-one/access-controls/service-credentials/service-tokens/#create-a-service-token). Once the Client ID and Client Secret are generated, save them for the next steps. For example, they look like this:  
```  
CF-Access-Client-Id: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.access  
CF-Access-Client-Secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  
```
2. [Create a policy](https://developers.cloudflare.com/cloudflare-one/access-controls/policies/policy-management/#create-a-policy) with the following configuration:  
   * Add an **Include** rule with **Selector** set to **Service token**.  
   * In **Value**, select the Service Token you created in step 1.
3. [Add your self-hosted application to Access](https://developers.cloudflare.com/cloudflare-one/access-controls/applications/http-apps/self-hosted-public-app/) with the following configuration:  
   * In Access policies, click **Select existing policies**.  
   * Select the policy that you have just created and select **Confirm**.
4. In the Cloudflare dashboard, go to the **AI Search** page.  
[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
5. Select **Create**.
6. Select **Website** as your data source.
7. Under **Parser options**, locate **Extra headers** and add the following two headers using your saved credentials:  
   * Header 1:  
         * **Key**: `CF-Access-Client-Id`  
         * **Value**: `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.access`  
   * Header 2:  
         * **Key**: `CF-Access-Client-Secret`  
         * **Value**: `xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
8. Complete the AI Search setup process to create your search instance.
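
Before adding the headers to AI Search, it can help to confirm that the service token actually grants access. The following TypeScript sketch is an illustration under stated assumptions: `buildAccessHeaders` and `canReachProtectedPage` are hypothetical helper names, and the check assumes that Access does not return a successful response to requests without a valid token.

```typescript
// Build the two headers that carry the service token credentials.
function buildAccessHeaders(clientId: string, clientSecret: string): Record<string, string> {
  return {
    "CF-Access-Client-Id": clientId,
    "CF-Access-Client-Secret": clientSecret,
  };
}

// Returns true when the protected page answers with a 2xx status.
// Redirects are not followed, so a redirect to a login page counts as a failure
// (assumption: rejected requests are redirected or denied rather than served).
async function canReachProtectedPage(
  url: string,
  clientId: string,
  clientSecret: string,
): Promise<boolean> {
  const response = await fetch(url, {
    headers: buildAccessHeaders(clientId, clientSecret),
    redirect: "manual",
  });
  return response.ok;
}
```

If `canReachProtectedPage` resolves to `false` for a page you expect to be indexed, review the Service Auth policy before starting the crawl.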

## Storage

During setup, AI Search creates a dedicated R2 bucket in your account to store the pages that have been crawled and downloaded as HTML files. This bucket is automatically managed and is used only for content discovered by the crawler. Any files or objects that you add directly to this bucket will not be indexed.

Note

We recommend not modifying the bucket as it may disrupt the indexing flow and cause content to not be updated properly.

## Sync and updates

During scheduled or manual [sync jobs](https://developers.cloudflare.com/ai-search/configuration/indexing/), the crawler checks the `<lastmod>` attribute of each URL in your sitemap. If the date is later than the last sync date, the page is crawled again, the updated version is stored in the R2 bucket, and the page is automatically reindexed so that your search results reflect the latest content.

If the `<lastmod>` attribute is not defined, AI Search uses the `<changefreq>` attribute to determine how often to re-crawl the URL. If neither `<lastmod>` nor `<changefreq>` is defined, AI Search automatically crawls each link once a day.
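
The decision order described above can be sketched as follows. This is an illustrative TypeScript model, not the crawler's actual implementation: `shouldRecrawl` is a hypothetical function, and the mapping from `<changefreq>` values to intervals is an assumption, because the exact intervals are not documented here.

```typescript
type ChangeFreq = "always" | "hourly" | "daily" | "weekly" | "monthly" | "yearly" | "never";

interface SitemapEntry {
  loc: string;
  lastmod?: string; // for example "2026-01-15"
  changefreq?: ChangeFreq;
}

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Assumed intervals for each <changefreq> value (not taken from the documentation).
const CHANGEFREQ_INTERVAL_MS: Record<ChangeFreq, number> = {
  always: 0,
  hourly: HOUR_MS,
  daily: DAY_MS,
  weekly: 7 * DAY_MS,
  monthly: 30 * DAY_MS,
  yearly: 365 * DAY_MS,
  never: Infinity,
};

function shouldRecrawl(entry: SitemapEntry, lastSync: Date, now: Date): boolean {
  if (entry.lastmod) {
    // <lastmod> wins: re-crawl only when it is later than the previous sync.
    return new Date(entry.lastmod).getTime() > lastSync.getTime();
  }
  // Fall back to <changefreq>, then to the once-a-day default.
  const interval = entry.changefreq ? CHANGEFREQ_INTERVAL_MS[entry.changefreq] : DAY_MS;
  return now.getTime() - lastSync.getTime() >= interval;
}
```

Under these assumptions, a page with `<lastmod>2026-01-15</lastmod>` is re-crawled after a sync on 2026-01-12, while a page with only `<changefreq>weekly</changefreq>` is skipped until seven days have passed.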

## Custom metadata

You can attach custom metadata to web pages using HTML `<meta>` tags. AI Search extracts metadata from the `<head>` section of each crawled page.

Before custom metadata can be extracted, you must [define a schema](https://developers.cloudflare.com/ai-search/configuration/metadata/#define-a-schema) in your AI Search configuration.

### Add metadata to web pages

Add `<meta>` tags using either the `name` or `property` attribute:

```
<!DOCTYPE html>
<html>
  <head>
    <meta name="title" content="Getting Started Guide" />
    <meta name="description" content="Learn how to set up the application" />
    <meta property="og:title" content="Getting Started Guide" />
    <meta property="og:image" content="https://example.com/og-image.png" />
    <meta name="category" content="documentation" />
    <meta name="version" content="2.5" />
    <meta name="is_public" content="true" />
  </head>
  <body>
    <!-- Page content -->
  </body>
</html>
```

### Recognized fields

For the following fields, AI Search knows which meta tags to extract from. You must still define these in your schema to enable extraction.

| Field       | Source                                                            |
| ----------- | ----------------------------------------------------------------- |
| title       | `<meta name="title">` or `<meta property="og:title">`             |
| description | `<meta name="description">` or `<meta property="og:description">` |
| image       | `<meta property="og:image">`                                      |

When both a standard meta tag and an Open Graph tag are present, the standard meta tag takes precedence.

### How metadata extraction works

When the crawler fetches a page:

1. All `<meta>` tags with `name` or `property` attributes are parsed from the `<head>` section.
2. Tag names are matched against your schema (case-insensitive).
3. The `content` attribute value is cast to the configured data type.
4. Extracted metadata is stored in R2 alongside the cached HTML.
5. On subsequent processing, metadata flows into Vectorize.

### Boolean value parsing

For `boolean` fields, the following values are accepted (case-insensitive):

| True values  | False values |
| ------------ | ------------ |
| true, 1, yes | false, 0, no |

Any other value is treated as invalid and the field is omitted.
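
The matching and casting steps can be modeled in a few lines. The following TypeScript sketch is illustrative only and does not reproduce the crawler's code: `castMetaContent` and `extractMetadata` are hypothetical names, the 500-character truncation for text follows the schema constraints for custom metadata, and omitting non-numeric values for `number` fields is an assumption that mirrors the boolean rule.

```typescript
type DataType = "text" | "number" | "boolean";
type MetadataValue = string | number | boolean;

const TRUE_VALUES = ["true", "1", "yes"];
const FALSE_VALUES = ["false", "0", "no"];

// Cast a <meta> content string to the configured type.
// Returns undefined when the value is invalid, so the field is omitted.
function castMetaContent(content: string, dataType: DataType): MetadataValue | undefined {
  switch (dataType) {
    case "text":
      return content.slice(0, 500);
    case "number": {
      const parsed = parseFloat(content);
      // Assumption: content that is not numeric is omitted.
      return isNaN(parsed) ? undefined : parsed;
    }
    case "boolean": {
      const normalized = content.toLowerCase();
      if (TRUE_VALUES.indexOf(normalized) !== -1) return true;
      if (FALSE_VALUES.indexOf(normalized) !== -1) return false;
      return undefined;
    }
  }
}

// Match tag names against the schema (case-insensitive) and cast each value.
function extractMetadata(
  tags: Record<string, string>,
  schema: Record<string, DataType>,
): Record<string, MetadataValue> {
  const result: Record<string, MetadataValue> = {};
  for (const name of Object.keys(tags)) {
    const field = name.toLowerCase();
    const dataType = schema[field];
    if (dataType === undefined) continue; // not in the schema, ignored
    const value = castMetaContent(tags[name], dataType);
    if (value !== undefined) result[field] = value;
  }
  return result;
}
```

With a schema of `{ category: "text", version: "number", is_public: "boolean" }`, the tags `category=documentation`, `version=2.5`, and `is_public=Yes` produce `category: "documentation"`, `version: 2.5`, and `is_public: true`, while `is_public=maybe` is dropped.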

## Limits

The regular AI Search [limits](https://developers.cloudflare.com/ai-search/platform/limits-pricing/) apply when using the Website data source.

The crawler will download and index pages only up to the maximum object limit supported for an AI Search instance, and it processes the first set of pages it visits until that limit is reached. In addition, any files that are downloaded but exceed the file size limit will not be indexed.


---

---
title: UI snippets
description: You can add AI Search easily into your website using the Cloudflare AI Search UI snippet library, which provides production-ready, customizable web components.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# UI snippets

You can easily add AI Search to your website using the [Cloudflare AI Search UI snippet library ↗](https://search.ai.cloudflare.com/), which provides production-ready, customizable web components.

The library is open source at [github.com/cloudflare/ai-search-snippet ↗](https://github.com/cloudflare/ai-search-snippet).

## Available components

The snippet library provides four web components. Each component connects to your AI Search instance using the `api-url` attribute, which should point to your public endpoint URL.

| Component                | Description                                                 |
| ------------------------ | ----------------------------------------------------------- |
| `<search-bar-snippet>`   | An inline search bar that displays results in a dropdown    |
| `<search-modal-snippet>` | A search modal that opens with Cmd/Ctrl+K keyboard shortcut |
| `<chat-bubble-snippet>`  | A floating chat bubble in the corner of the page            |
| `<chat-page-snippet>`    | A full-page chat interface with conversation history        |

For advanced styling and configuration, visit [search.ai.cloudflare.com ↗](https://search.ai.cloudflare.com/).

## Prerequisites

UI snippets connect to your AI Search instance through a public endpoint. You need to enable this endpoint before using the snippets.

1. Go to **AI Search** in the Cloudflare dashboard.  
[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select your AI Search instance.
3. Go to **Settings** \> **Public Endpoint**.
4. Turn on **Enable Public Endpoint**.
5. Copy the public endpoint URL. You will use this as the `api-url` attribute in your snippets.

## Use with HTML

1. Add the script tag to your HTML file (for example, `index.html`). Replace `<INSTANCE_ID>` with your AI Search instance's public endpoint ID, which you can find in your AI Search instance's **Settings** \> **Public Endpoint**.  
```  
<script  
  type="module"  
  src="https://<INSTANCE_ID>.search.ai.cloudflare.com/assets/v0.0.25/search-snippet.es.js"  
></script>  
```
2. Add a component with your `api-url`.  
```  
<search-bar-snippet  
  api-url="https://<INSTANCE_ID>.search.ai.cloudflare.com/"  
  placeholder="Search..."  
></search-bar-snippet>  
```
3. Before testing, [configure CORS](#configure-cors-for-local-testing) to allow your local origin. Then open the HTML file in your browser to test.

### Full HTML example

The following example shows a complete HTML page with a search bar. When a user types in the search bar, results appear in a dropdown below.

```
<!doctype html>
<html>
  <head>
    <script
      type="module"
      src="https://<INSTANCE_ID>.search.ai.cloudflare.com/assets/v0.0.25/search-snippet.es.js"
    ></script>
  </head>
  <body>
    <search-bar-snippet
      api-url="https://<INSTANCE_ID>.search.ai.cloudflare.com/"
      placeholder="Search..."
      max-results="10"
    ></search-bar-snippet>
  </body>
</html>
```

## Use with a framework

### React

1. Open your React project and install the package:  
Terminal window  
```  
npm install @cloudflare/ai-search-snippet  
```
2. In your component file (for example, `src/App.tsx`), import the package:  
```  
import "@cloudflare/ai-search-snippet";  
```
3. Add a component to your JSX:  
```  
export default function App() {  
  return (  
    <search-bar-snippet  
      api-url="https://<INSTANCE_ID>.search.ai.cloudflare.com/"  
      placeholder="Search..."  
    />  
  );  
}  
```
4. Before testing, [configure CORS](#configure-cors-for-local-testing) to allow your local origin. Then run your development server:  
Terminal window  
```  
npm run dev  
```

The package includes TypeScript types and works with React, Next.js, and other React frameworks.

### Vue

1. Open your Vue project and install the package:  
Terminal window  
```  
npm install @cloudflare/ai-search-snippet  
```
2. In your component file (for example, `src/App.vue`), import the package and add the component:  
```  
<template>  
  <search-bar-snippet :api-url="apiUrl" placeholder="Search..." />  
</template>  
<script setup>  
import "@cloudflare/ai-search-snippet";  
const apiUrl = "https://<INSTANCE_ID>.search.ai.cloudflare.com/";  
</script>  
```
3. Before testing, [configure CORS](#configure-cors-for-local-testing) to allow your local origin. Then run your development server:  
Terminal window  
```  
npm run dev  
```

## Configure CORS for local testing

When testing locally (for example, `http://localhost:3000`), you need to allow your local origin in the public endpoint settings.

1. Go to **AI Search** in the Cloudflare dashboard.  
[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select your AI Search instance.
3. Go to **Settings** \> **Public Endpoint**.
4. Under **Authorized hosts**, add your local URL (for example, `http://localhost:3000`) or `*` to allow all origins during testing.
5. Select **Save**.


---

---
title: Indexing
description: AI Search automatically indexes your data into vector embeddings optimized for semantic search. Once a data source is connected, indexing runs continuously in the background to keep your knowledge base fresh and queryable.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Indexing

AI Search automatically indexes your data into vector embeddings optimized for semantic search. Once a data source is connected, indexing runs continuously in the background to keep your knowledge base fresh and queryable.

## Jobs

AI Search automatically monitors your data source for updates and reindexes your content every **6 hours**. During each cycle, new or modified files are reprocessed to keep your Vectorize index up to date.

You can monitor the status and history of all indexing activity in the Jobs tab, including real-time logs for each job to help you troubleshoot and verify successful syncs.

## Controls

You can control indexing behavior through the following actions on the dashboard:

* **Sync Index**: Manually trigger AI Search to scan your data source for new, modified, or deleted files and initiate an indexing job to update the associated Vectorize index. A new indexing job can be initiated every 30 seconds.
* **Sync Individual File**: Trigger a sync for a specific file from the **Overview** page. Go to **Indexed Items** and select the sync icon next to the specific file you want to reindex.
* **Pause Indexing**: Temporarily stop all scheduled indexing checks and reprocessing. Useful for debugging or freezing your knowledge base.

## Performance

The total time to index depends on the number and type of files in your data source. Factors that affect performance include:

* Total number of files and their sizes
* File formats (for example, images take longer than plain text)
* Latency of Workers AI models used for embedding and image processing

## Best practices

To ensure smooth and reliable indexing:

* Make sure your files are within the [**size limit**](https://developers.cloudflare.com/ai-search/platform/limits-pricing/#limits) and in a supported format to avoid being skipped.
* Keep your Service API token valid to prevent indexing failures.
* Regularly clean up outdated or unnecessary content in your knowledge base to avoid hitting [Vectorize index limits](https://developers.cloudflare.com/vectorize/platform/limits/).


---

---
title: Metadata
description: Use metadata to filter documents before retrieval and provide context to guide AI responses. This page covers built-in metadata attributes, custom metadata schemas, filter syntax, and the optional context field.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Metadata

Use metadata to filter documents before retrieval and provide context to guide AI responses. This page covers built-in metadata attributes, custom metadata schemas, filter syntax, and the optional `context` field.

## Built-in metadata attributes

AI Search automatically extracts the following metadata attributes from your indexed documents:

| Attribute | Description                                                                                         | Example                                                         |
| --------- | --------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
| filename  | The name of the file.                                                                               | dog.png or animals/mammals/cat.png                              |
| folder    | The folder or prefix to the object.                                                                 | For animals/mammals/cat.png, the folder is animals/mammals/     |
| timestamp | Unix timestamp (milliseconds) when the object was last modified. Comparisons round down to seconds. | 1735689600000 rounds to 1735689600000 (2025-01-01 00:00:00 UTC) |
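
Because `timestamp` is stored in milliseconds but compared at second precision, it can be convenient to normalize values before building a filter. The following TypeScript sketch is an assumption-laden illustration: `toSeconds` and `timestampFilterFrom` are hypothetical helpers, and the filter bound is expressed in seconds to match the filter examples later on this page.

```typescript
// Convert a millisecond timestamp to whole seconds, rounding down.
function toSeconds(timestampMs: number): number {
  return Math.floor(timestampMs / 1000);
}

// Build a lower-bound timestamp filter from a JavaScript Date.
// Assumption: filter values are given in seconds, as in the examples on this page.
function timestampFilterFrom(date: Date): { timestamp: { $gte: number } } {
  return { timestamp: { $gte: toSeconds(date.getTime()) } };
}
```

For instance, `toSeconds(1735689600999)` yields `1735689600`, which corresponds to 2025-01-01 00:00:00 UTC.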

## Custom metadata

Custom metadata allows you to define additional fields for filtering search results. You can attach structured metadata to documents and filter queries by attributes such as category, version, or any custom field.

### Supported data types

| Type    | Description                        | Example values               |
| ------- | ---------------------------------- | ---------------------------- |
| text    | String values (max 500 characters) | "documentation", "blog-post" |
| number  | Numeric values (parsed as float)   | 2.5, 100, \-3.14             |
| boolean | Boolean values                     | true, false, 1, 0, yes, no   |

### Define a schema

Before custom metadata can be extracted, define a schema in your AI Search configuration using the `custom_metadata` field. The schema specifies which fields to extract and their data types.

TypeScript

```
custom_metadata: [
  { field_name: "category", data_type: "text" },
  { field_name: "version", data_type: "number" },
  { field_name: "is_public", data_type: "boolean" },
];
```

**Schema constraints:**

* Maximum of 5 custom metadata fields per AI Search instance
* Field names are case-insensitive and stored as lowercase
* Field names cannot use reserved names: `timestamp`, `folder`, `filename`
* Text values are truncated to 500 characters
* Changing the schema triggers a full re-index of all documents
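
Because a schema change triggers a full re-index, it may be worth checking a schema against these constraints before applying it. The following TypeScript sketch shows one way to do that locally. `normalizeSchema` is a hypothetical helper, not an AI Search API, and treating names that differ only by case as duplicates is an assumption derived from the lowercase storage rule.

```typescript
type DataType = "text" | "number" | "boolean";

interface CustomMetadataField {
  field_name: string;
  data_type: DataType;
}

const RESERVED_NAMES = ["timestamp", "folder", "filename"];
const MAX_FIELDS = 5;

// Returns the schema with lowercase field names, or throws on the first violated constraint.
function normalizeSchema(fields: CustomMetadataField[]): CustomMetadataField[] {
  if (fields.length > MAX_FIELDS) {
    throw new Error(`At most ${MAX_FIELDS} custom metadata fields are allowed`);
  }
  const seen: string[] = [];
  return fields.map((field) => {
    const name = field.field_name.toLowerCase();
    if (RESERVED_NAMES.indexOf(name) !== -1) {
      throw new Error(`"${name}" is a reserved field name`);
    }
    // Assumption: names that differ only by case collide once stored as lowercase.
    if (seen.indexOf(name) !== -1) {
      throw new Error(`Duplicate field name "${name}"`);
    }
    seen.push(name);
    return { field_name: name, data_type: field.data_type };
  });
}
```

Calling `normalizeSchema([{ field_name: "Category", data_type: "text" }])` returns the field as `category`, while a field named `Folder` is rejected as reserved.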

### Add custom metadata to documents

How you attach custom metadata depends on your data source:

* **R2 bucket**: Set metadata using S3-compatible custom headers (`x-amz-meta-*`). Refer to [R2 custom metadata](https://developers.cloudflare.com/ai-search/configuration/data-source/r2/#custom-metadata) for examples.
* **Website**: Add `<meta>` tags to your HTML pages. Refer to [Website custom metadata](https://developers.cloudflare.com/ai-search/configuration/data-source/website/#custom-metadata) for details.

## Metadata filtering

Metadata filtering narrows down search results based on metadata, so only relevant content is retrieved. The filter is applied before retrieval, so you only query the documents that matter.

Note

If you are using the legacy AutoRAG API, refer to [AutoRAG API filter format](https://developers.cloudflare.com/ai-search/autorag-filter-format/) for the filter syntax.

Here is an example of metadata filtering using the [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/):

Terminal window

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{NAME}/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "How do I train a llama to deliver coffee?",
        "role": "user"
      }
    ],
    "ai_search_options": {
      "retrieval": {
        "filters": {
          "folder": "llama/logistics/",
          "timestamp": { "$gte": 1735689600 }
        }
      }
    }
  }'
```
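
The same request can be issued from TypeScript with `fetch`. This is a minimal sketch that mirrors the curl command above: the endpoint path and body shape are copied from that command, while `buildSearchRequest` and `search` are hypothetical helper names and the error handling is an assumption.

```typescript
interface SearchRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Assemble the URL, headers, and JSON body for a filtered search.
function buildSearchRequest(
  accountId: string,
  instanceName: string,
  apiToken: string,
  query: string,
  filters: Record<string, unknown>,
): SearchRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai-search/instances/${instanceName}/search`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`,
      },
      body: JSON.stringify({
        messages: [{ role: "user", content: query }],
        ai_search_options: { retrieval: { filters } },
      }),
    },
  };
}

// Send the request and return the parsed JSON response.
async function search(request: SearchRequest): Promise<unknown> {
  const response = await fetch(request.url, request.init);
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes the filter payload easy to inspect or unit test before any request is sent.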

### Filter by custom metadata

| Attribute | Description                                               | Example                                                                |
| --------- | --------------------------------------------------------- | ---------------------------------------------------------------------- |
| filename  | The name of the file.                                     | dog.png or animals/mammals/cat.png                                     |
| folder    | The folder or prefix to the object.                       | For the object animals/mammals/cat.png, the folder is animals/mammals/ |
| timestamp | Unix timestamp (seconds) when the file was last modified. | 1735689600 for 2025-01-01 00:00:00 UTC                                 |

### Filter syntax

The REST API uses Vectorize-style metadata filtering. Filters are JSON objects where keys are metadata attribute names and values specify the filter condition.

#### Supported operators

| Operator | Description                       |
| -------- | --------------------------------- |
| $eq      | Equals                            |
| $ne      | Not equals                        |
| $in      | In (matches any value in array)   |
| $nin     | Not in (excludes values in array) |
| $lt      | Less than                         |
| $lte     | Less than or equal to             |
| $gt      | Greater than                      |
| $gte     | Greater than or equal to          |

#### Implicit `$eq`

When you provide a direct value without an operator, it is treated as an equality check:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": "customer-a/" }
    }
  }
}
```

This is equivalent to:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$eq": "customer-a/" } }
    }
  }
}
```

#### Range queries

Combine upper and lower bound operators to filter by ranges:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "timestamp": { "$gte": 1735689600, "$lt": 1735900000 } }
    }
  }
}
```

#### Multiple conditions (implicit AND)

When you specify multiple keys, all conditions must match:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": "llama/logistics/",
        "timestamp": { "$gte": 1735689600 }
      }
    }
  }
}
```

#### `$in` operator

Match any value in an array:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$in": ["customer-a/", "customer-b/"] } }
    }
  }
}
```
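
To reason about how these operators combine, it can help to evaluate a filter locally against sample attributes. The following TypeScript sketch models the semantics described in this section (implicit `$eq`, ranges, implicit AND, `$in`, and `$nin`). It is not the service's implementation: `matchesFilter` and its helpers are hypothetical, and treating a missing attribute as a non-match for operator conditions is an assumption.

```typescript
type Scalar = string | number | boolean;

interface Operators {
  $eq?: Scalar;
  $ne?: Scalar;
  $in?: Scalar[];
  $nin?: Scalar[];
  $lt?: Scalar;
  $lte?: Scalar;
  $gt?: Scalar;
  $gte?: Scalar;
}

type Condition = Scalar | Operators;
type Filter = Record<string, Condition>;

// Order two values of the same type; mixed or boolean operands are unordered here.
function compare(a: Scalar, b: Scalar): number | undefined {
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") return a < b ? -1 : a > b ? 1 : 0;
  return undefined;
}

function matchesCondition(value: Scalar | undefined, condition: Condition): boolean {
  // A bare value is shorthand for { $eq: value }.
  if (typeof condition !== "object") return value === condition;
  // Assumption: a missing attribute never satisfies an operator condition.
  if (value === undefined) return false;
  const actual: Scalar = value;
  const ordered = (bound: Scalar | undefined, accept: (order: number) => boolean): boolean => {
    if (bound === undefined) return true; // operator not used
    const order = compare(actual, bound);
    return order !== undefined && accept(order);
  };
  return (
    (condition.$eq === undefined || actual === condition.$eq) &&
    (condition.$ne === undefined || actual !== condition.$ne) &&
    (condition.$in === undefined || condition.$in.indexOf(actual) !== -1) &&
    (condition.$nin === undefined || condition.$nin.indexOf(actual) === -1) &&
    ordered(condition.$lt, (order) => order < 0) &&
    ordered(condition.$lte, (order) => order <= 0) &&
    ordered(condition.$gt, (order) => order > 0) &&
    ordered(condition.$gte, (order) => order >= 0)
  );
}

// Every key in the filter must match (implicit AND).
function matchesFilter(attributes: Record<string, Scalar>, filter: Filter): boolean {
  return Object.keys(filter).every((key) => matchesCondition(attributes[key], filter[key]));
}
```

A document with `folder` set to `llama/logistics/` and `timestamp` set to `1735700000` matches `{ "folder": "llama/logistics/", "timestamp": { "$gte": 1735689600 } }` but not `{ "folder": { "$in": ["customer-a/", "customer-b/"] } }`.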

### "Starts with" filter for folders

Use range queries to filter for all files within a folder and its subfolders.

For example, consider this file structure:

* customer-a/  
   * profile.md  
   * contracts/  
         * property/  
                  * contract-1.pdf

Using `{ "folder": "customer-a/" }` only matches files directly in that folder (like `profile.md`), not files in subfolders.

To match all files starting with `customer-a/`, use a range query:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$gte": "customer-a/", "$lt": "customer-a0" } }
    }
  }
}
```

This works because:

* `$gte` includes all paths starting with `customer-a/`
* `$lt` with `customer-a0` excludes paths that do not start with `customer-a/` (since `0` comes after `/` in ASCII)
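
The upper bound can be derived mechanically by replacing the last character of the prefix with the next character in code-point order. The following TypeScript sketch shows the idea. `folderPrefixFilter` is a hypothetical helper, and it assumes plain ASCII folder names.

```typescript
// Build a "starts with" range filter for a folder prefix such as "customer-a/".
// Assumes ASCII paths; the last character is bumped to its successor.
function folderPrefixFilter(prefix: string): { folder: { $gte: string; $lt: string } } {
  if (prefix.length === 0) {
    throw new Error("prefix must not be empty");
  }
  const lastCode = prefix.charCodeAt(prefix.length - 1);
  const upperBound = prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
  return { folder: { $gte: prefix, $lt: upperBound } };
}
```

For the prefix `customer-a/`, the helper produces the same bounds as the example above, because `/` (code 47) is followed by `0` (code 48).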

For more details on Vectorize metadata filtering, refer to [Vectorize metadata filtering](https://developers.cloudflare.com/vectorize/reference/metadata-filtering/).

## Add `context` field to guide AI Search

You can optionally include a custom metadata field named `context` when uploading an object to your R2 bucket.

The `context` field is attached to each chunk and passed to the LLM during an `/ai-search` query. It does not affect retrieval but helps the LLM interpret and frame the answer.

The field can be used for providing document summaries, source links, or custom instructions without modifying the file content.

You can add [custom metadata](https://developers.cloudflare.com/r2/api/workers/workers-api-reference/#r2putoptions) to an object in the `PUT` operation when uploading the object to your R2 bucket. For example, if you are using the [Workers binding with R2](https://developers.cloudflare.com/r2/api/workers/workers-api-usage/):

JavaScript

```
await env.MY_BUCKET.put("cat.png", file, {
  customMetadata: {
    context: "This is a picture of Joe's cat. His name is Max.",
  },
});
```

During `/ai-search`, this context appears in the response under `attributes.file.context`, and is included in the data passed to the LLM for generating a response.

## Response

You can see the metadata attributes of your retrieved data in the response under the property `attributes` for each retrieved chunk. For example:

```
{
  "data": [
    {
      "file_id": "llama001",
      "filename": "llama/logistics/llama-logistics.md",
      "score": 0.45,
      "attributes": {
        "timestamp": 1735689600000,
        "folder": "llama/logistics/",
        "file": {
          "url": "www.llamasarethebest.com/logistics",
          "context": "This file contains information about how llamas can logistically deliver coffee."
        }
      },
      "content": [
        {
          "id": "llama001",
          "type": "text",
          "text": "Llamas can carry 3 drinks max."
        }
      ]
    }
  ]
}
```

Custom metadata fields appear alongside built-in attributes in the `attributes.file` object.
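
When consuming this response in application code, typing the payload makes the attribute access explicit. The following TypeScript sketch derives its types from the example response above. The interface names and the `summarizeChunks` helper are hypothetical, and fields not shown in the example are treated as optional by assumption.

```typescript
interface RetrievedChunk {
  file_id: string;
  filename: string;
  score: number;
  attributes: {
    timestamp?: number;
    folder?: string;
    // Custom metadata fields are expected under `file` next to `url` and `context`.
    file?: { url?: string; context?: string; [field: string]: unknown };
  };
  content: { id: string; type: string; text: string }[];
}

interface SearchResponse {
  data: RetrievedChunk[];
}

interface ChunkSummary {
  filename: string;
  context: string | undefined;
  text: string;
}

// Collect the filename, optional context, and joined text for each retrieved chunk.
function summarizeChunks(response: SearchResponse): ChunkSummary[] {
  return response.data.map((chunk) => ({
    filename: chunk.filename,
    context: chunk.attributes.file ? chunk.attributes.file.context : undefined,
    text: chunk.content.map((part) => part.text).join("\n"),
  }));
}
```

Applied to the example response, `summarizeChunks` returns one entry whose `context` is the llama logistics description and whose `text` is `Llamas can carry 3 drinks max.`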

## Re-indexing behavior

When you modify the `custom_metadata` schema:

1. New fields are added to the Vectorize metadata index.
2. Removed fields are deleted from the Vectorize metadata index.
3. A full re-index is triggered for all documents.
4. Existing vectors are updated with the new metadata structure.

## Limitations

| Constraint                | Limit                       |
| ------------------------- | --------------------------- |
| Maximum custom fields     | 5 per AI Search instance    |
| Maximum text value length | 500 characters              |
| Reserved field names      | timestamp, folder, filename |
| Field name matching       | Case-insensitive            |

If R2 file metadata exceeds Vectorize size limits, the metadata is replaced with an error indicator:

```
{
  "file": { "error": "r2 metadata is too large" }
}
```

To avoid this, keep individual metadata values concise.


---

---
title: Models
description: AI Search uses models at multiple stages. You can configure which models are used, or let AI Search automatically select a smart default for you.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Models

AI Search uses models at multiple stages. You can configure which models are used, or let AI Search automatically select a smart default for you.

## Models usage

AI Search leverages Workers AI models in the following stages:

* Image to markdown conversion (if images are in data source): Converts image content to Markdown using object detection and captioning models.
* Embedding: Transforms your documents and queries into vector representations for semantic search.
* Query rewriting (optional): Reformulates the user’s query to improve retrieval accuracy.
* Generation: Produces the final response from retrieved context.

## Model providers

All AI Search instances support models from [Workers AI](https://developers.cloudflare.com/workers-ai). You can use other providers (such as OpenAI or Anthropic) in AI Search by adding their API keys to an [AI Gateway](https://developers.cloudflare.com/ai-gateway) and connecting that gateway to your AI Search.

To use AI Search with other model providers:

1. Add provider keys to [AI Gateway](https://developers.cloudflare.com/ai-gateway/configuration/bring-your-own-keys/).
2. Connect the gateway to AI Search:
   * When creating a new AI Search, select the AI Gateway with your provider keys.
   * For an existing AI Search, go to **Settings** and switch to a gateway that has your keys under **Resources**.
3. Select models:
   * Embedding model: Can only be chosen when creating a new AI Search.
   * Generation model: Can be selected when creating a new AI Search and can be changed at any time in **Settings**.

AI Search supports a subset of models that have been selected to provide the best experience. See the list of [supported models](https://developers.cloudflare.com/ai-search/configuration/models/supported-models/).

### Smart default

If you choose **Smart Default** in your model selection, AI Search selects a Cloudflare-recommended model and updates it automatically over time. You can switch to explicit model configuration at any time by visiting **Settings**.

### Per-request generation model override

While the generation model can be set globally at the AI Search instance level, you can also override it on a per-request basis in the [AI Search API](https://developers.cloudflare.com/ai-search/usage/rest-api/#chat-completions). This is useful if your [RAG application](https://developers.cloudflare.com/ai-search/) requires dynamic selection of generation models based on context or user preferences.
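
As a minimal sketch of dynamic model selection, the helper below builds a request object that sets `model` only when an override applies. It assumes the Workers binding call shape shown on the Reranking page (`env.AI.autorag(...).aiSearch({ query, model })`). The plan names, the `MODEL_BY_PLAN` mapping, the binding name `AI`, and the instance name `my-autorag` are hypothetical placeholders.

```javascript
// Hypothetical mapping from a user tier to a supported generation model alias.
const MODEL_BY_PLAN = new Map([
  ["free", "@cf/meta/llama-3.1-8b-instruct-fast"],
  ["pro", "@cf/meta/llama-3.3-70b-instruct-fp8-fast"],
]);

function buildAiSearchRequest(query, plan) {
  const request = { query };
  const model = MODEL_BY_PLAN.get(plan);
  // Set `model` only when overriding, so that other requests
  // fall back to the model configured on the instance.
  if (model !== undefined) {
    request.model = model;
  }
  return request;
}

// Assumes a Worker with an AI binding named `AI`.
async function answer(env, query, plan) {
  return env.AI.autorag("my-autorag").aiSearch(buildAiSearchRequest(query, plan));
}
```

Keeping the selection logic in a pure function makes it easy to test without calling a model.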

## Model deprecation

AI Search may deprecate support for a given model in order to provide support for better-performing models with improved capabilities. When a model is being deprecated, we announce the change and provide an end-of-life date after which the model will no longer be accessible. Applications that depend on AI Search may therefore require occasional updates to continue working reliably.

### Model lifecycle

AI Search models follow a defined lifecycle to ensure stability and predictable deprecation:

1. **Production:** The model is actively supported and recommended for use. It is included in Smart Defaults and receives ongoing updates and maintenance.
2. **Announcement & Transition:** The model remains available but has been marked for deprecation. An end-of-life date is communicated through documentation, release notes, and other official channels. During this phase, users are encouraged to migrate to the recommended replacement model.
3. **Automatic Upgrade (if applicable):** If you have selected the Smart Default option, AI Search will automatically upgrade requests to a recommended replacement.
4. **End of life:** The model is no longer available. Any requests to the retired model return a clear error message, and the model is removed from documentation and Smart Defaults.

See models and their lifecycle status in [supported models](https://developers.cloudflare.com/ai-search/configuration/models/supported-models/).

### Best practices

* Regularly check the [release notes](https://developers.cloudflare.com/ai-search/platform/release-note/) for updates.
* Plan migration efforts according to the communicated end-of-life date.
* Migrate and test the recommended replacement models before the end-of-life date.


---

---
title: Supported models
description: This page lists all models supported by AI Search and their lifecycle status.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Supported models

This page lists all models supported by AI Search and their lifecycle status.

Request model support

If you would like to use a model that is not currently supported, reach out to us on [Discord ↗](https://discord.gg/cloudflaredev) to request it.

## Production models

Production models are actively supported, recommended, stable, and fully available.

### Text generation

| Provider             | Alias                                       | Context window (tokens) |
| -------------------- | ------------------------------------------- | ----------------------- |
| **Anthropic**        | anthropic/claude-3-7-sonnet                 | 200,000                 |
|                      | anthropic/claude-sonnet-4                   | 200,000                 |
|                      | anthropic/claude-opus-4                     | 200,000                 |
|                      | anthropic/claude-3-5-haiku                  | 200,000                 |
| **Cerebras**         | cerebras/qwen-3-235b-a22b-instruct          | 64,000                  |
|                      | cerebras/qwen-3-235b-a22b-thinking          | 65,000                  |
|                      | cerebras/llama-3.3-70b                      | 65,000                  |
|                      | cerebras/llama-4-maverick-17b-128e-instruct | 8,000                   |
|                      | cerebras/llama-4-scout-17b-16e-instruct     | 8,000                   |
|                      | cerebras/gpt-oss-120b                       | 64,000                  |
| **Google AI Studio** | google-ai-studio/gemini-2.5-flash           | 1,048,576               |
|                      | google-ai-studio/gemini-2.5-pro             | 1,048,576               |
| **Grok (x.ai)**      | grok/grok-4                                 | 256,000                 |
| **Groq**             | groq/llama-3.3-70b-versatile                | 131,072                 |
|                      | groq/llama-3.1-8b-instant                   | 131,072                 |
| **OpenAI**           | openai/gpt-5                                | 400,000                 |
|                      | openai/gpt-5-mini                           | 400,000                 |
|                      | openai/gpt-5-nano                           | 400,000                 |
| **Workers AI**       | @cf/meta/llama-3.3-70b-instruct-fp8-fast    | 24,000                  |
|                      | @cf/meta/llama-3.1-8b-instruct-fast         | 60,000                  |
|                      | @cf/meta/llama-3.1-8b-instruct-fp8          | 32,000                  |
|                      | @cf/meta/llama-4-scout-17b-16e-instruct     | 131,000                 |

### Embedding

| Provider             | Alias                                 | Vector dims | Input tokens | Metric |
| -------------------- | ------------------------------------- | ----------- | ------------ | ------ |
| **Google AI Studio** | google-ai-studio/gemini-embedding-001 | 1,536       | 2,048        | cosine |
| **OpenAI**           | openai/text-embedding-3-small         | 1,536       | 8,192        | cosine |
|                      | openai/text-embedding-3-large         | 1,536       | 8,192        | cosine |
| **Workers AI**       | @cf/baai/bge-m3                       | 1,024       | 512          | cosine |
|                      | @cf/baai/bge-large-en-v1.5            | 1,024       | 512          | cosine |

### Reranking

| Provider       | Alias                      | Input tokens |
| -------------- | -------------------------- | ------------ |
| **Workers AI** | @cf/baai/bge-reranker-base | 512          |

## Transition models

There are currently no models marked for end-of-life.


---

---
title: Path filtering
description: Path filtering allows you to control which files or URLs are indexed by defining include and exclude patterns. Use this to limit indexing to specific content or to skip files you do not want searchable.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Path filtering

Path filtering allows you to control which files or URLs are indexed by defining include and exclude patterns. Use this to limit indexing to specific content or to skip files you do not want searchable.

Path filtering works with both [website](https://developers.cloudflare.com/ai-search/configuration/data-source/website/) and [R2](https://developers.cloudflare.com/ai-search/configuration/data-source/r2/) data sources.

## Configuration

You can configure path filters when creating or editing an AI Search instance. In the dashboard, open **Path Filters** and add your include or exclude rules. You can also update path filters at any time from the **Settings** page of your instance.

When using the REST API, specify `include_items` and `exclude_items` in the `source_params` of your configuration:

| Parameter      | Type       | Limit               | Description                                              |
| -------------- | ---------- | ------------------- | -------------------------------------------------------- |
| include\_items | string\[\] | Maximum 10 patterns | Only index items matching at least one of these patterns |
| exclude\_items | string\[\] | Maximum 10 patterns | Skip items matching any of these patterns                |

Both parameters are optional. If neither is specified, all items from the data source are indexed.

## Filtering behavior

### Wildcard rules

Exclude rules take precedence over include rules. Filtering is applied in this order:

1. **Exclude check**: If the item matches any exclude pattern, it is skipped.
2. **Include check**: If include patterns are defined and the item does not match any of them, it is skipped.
3. **Index**: The item proceeds to indexing.

| Scenario                    | Behavior                                                                               |
| --------------------------- | -------------------------------------------------------------------------------------- |
| No rules defined            | All items are indexed                                                                  |
| Only exclude\_items defined | All items except those matching exclude patterns are indexed                           |
| Only include\_items defined | Only items matching at least one include pattern are indexed                           |
| Both defined                | Exclude patterns are checked first, then remaining items must match an include pattern |

### Pattern syntax

Patterns use a case-sensitive wildcard syntax based on [micromatch ↗](https://github.com/micromatch/micromatch):

| Wildcard | Meaning                                              |
| -------- | ---------------------------------------------------- |
| \*       | Matches any characters except path separators (/)    |
| \*\*     | Matches any characters including path separators (/) |

Patterns can contain:

* Letters, numbers, and underscores (`a-z`, `A-Z`, `0-9`, `_`)
* Hyphens (`-`) and dots (`.`)
* Path separators (`/`)
* URL characters (`?`, `:`, `=`, `&`, `%`)
* Wildcards (`*`, `**`)

### Indexing job status

Items skipped by filtering rules are recorded in job logs with the reason:

* Exclude match: `Skipped by rule: {pattern}`
* No include match: `Skipped by Include Rules`

You can view these in the Jobs tab of your AI Search instance to verify your filters are working as expected.

### Important notes

* **Case sensitivity:** Pattern matching is case-sensitive. `/Blog/*` does not match `/blog/post.html`.
* **Full path matching:** Patterns match the entire path or URL. Use `**` at the beginning for partial matching. For example, `docs/*` matches `docs/file.pdf` but not `site/docs/file.pdf`, while `**/docs/*` matches both.
* **Single `*` does not cross directories:** Use `**` to match across path separators. For example, `docs/*` matches `docs/file.pdf` but not `docs/sub/file.pdf`, while `docs/**` matches both.
* **Trailing slashes matter:** URLs are matched as-is without normalization. `/blog/` does not match `/blog`.
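
The matching rules and filtering order described above can be sketched as follows. This is a simplified stand-in for micromatch that supports only `*` and `**`, not the implementation AI Search uses. The function names `wildcardToRegExp` and `decide` are hypothetical, and the reason strings are taken from the job log messages listed under Indexing job status. It assumes `**/` can match zero path segments, which is what the `**/docs/*` example above requires.

```javascript
// Simplified stand-in for micromatch: supports only `*` and `**`.
function wildcardToRegExp(pattern) {
  let source = "";
  let i = 0;
  while (i < pattern.length) {
    if (pattern.startsWith("**/", i)) {
      // `**/` may match zero or more whole path segments.
      source += "(?:.*/)?";
      i += 3;
    } else if (pattern.startsWith("**", i)) {
      // `**` matches any characters, including `/`.
      source += ".*";
      i += 2;
    } else if (pattern[i] === "*") {
      // `*` matches any characters except `/`.
      source += "[^/]*";
      i += 1;
    } else {
      // Everything else is literal; escape regex metacharacters.
      source += pattern[i].replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  // Anchor both ends: patterns match the entire path or URL, case-sensitively.
  return new RegExp(`^${source}$`);
}

function decide(item, { include_items = [], exclude_items = [] } = {}) {
  const matches = (pattern) => wildcardToRegExp(pattern).test(item);

  // 1. Exclude check: exclude rules take precedence.
  const excludedBy = exclude_items.find(matches);
  if (excludedBy !== undefined) {
    return { indexed: false, reason: `Skipped by rule: ${excludedBy}` };
  }
  // 2. Include check: applies only when include patterns are defined.
  if (include_items.length > 0 && !include_items.some(matches)) {
    return { indexed: false, reason: "Skipped by Include Rules" };
  }
  // 3. Index: the item proceeds to indexing.
  return { indexed: true, reason: null };
}
```

With `include_items: ["/docs/**"]` and `exclude_items: ["/docs/drafts/**"]`, `decide("/docs/drafts/wip.md", ...)` is skipped by the exclude rule even though it also matches the include pattern, while `decide("/docs/guide.md", ...)` is indexed.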

## Examples

### R2 data source

| Use case                        | Pattern                                         | Indexed                            | Skipped                           |
| ------------------------------- | ----------------------------------------------- | ---------------------------------- | --------------------------------- |
| Index only PDFs in docs         | Include: /docs/\*\*/\*.pdf                      | /docs/guide.pdf, /docs/api/ref.pdf | /docs/guide.md, /images/logo.png  |
| Exclude temp and backup files   | Exclude: \*\*/\*.tmp, \*\*/\*.bak               | /docs/guide.md                     | /data/cache.tmp, /old.bak         |
| Exclude temp and backup folders | Exclude: /temp/\*\*, /backup/\*\*               | /docs/guide.md                     | /temp/file.txt, /backup/data.json |
| Index docs but exclude drafts   | Include: /docs/\*\*, Exclude: /docs/drafts/\*\* | /docs/guide.md                     | /docs/drafts/wip.md               |

### Website data source

| Use case                      | Pattern                                                 | Indexed                                            | Skipped                                        |
| ----------------------------- | ------------------------------------------------------- | -------------------------------------------------- | ---------------------------------------------- |
| Index only blog pages         | Include: \*\*/blog/\*\*                                 | example.com/blog/post, example.com/en/blog/article | example.com/about                              |
| Exclude admin pages           | Exclude: \*\*/admin/\*\*                                | example.com/blog/post                              | example.com/admin/settings                     |
| Exclude login pages           | Exclude: \*\*/login\*                                   | example.com/blog/post                              | example.com/login, example.com/auth/login-form |
| Index docs but exclude drafts | Include: \*\*/docs/\*\*, Exclude: \*\*/docs/drafts/\*\* | example.com/docs/guide                             | example.com/docs/drafts/wip                    |

### API format

When using the API, specify patterns in `source_params`:

```json
{
  "source_params": {
    "include_items": ["<PATTERN_1>", "<PATTERN_2>"],
    "exclude_items": ["<PATTERN_1>", "<PATTERN_2>"]
  }
}
```
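
Before sending a configuration, you may want to check patterns against the documented limits. The helper below is a minimal sketch under stated assumptions: `buildSourceParams` is a hypothetical name, and the character check simply mirrors the list under Pattern syntax. The API may validate patterns differently.

```javascript
const MAX_PATTERNS = 10;
// Characters listed under "Pattern syntax":
// letters, digits, `_`, `-`, `.`, `/`, `?`, `:`, `=`, `&`, `%`, and `*`.
const ALLOWED_PATTERN = /^[A-Za-z0-9_\-.\/?:=&%*]+$/;

function buildSourceParams({ include = [], exclude = [] } = {}) {
  const lists = [
    ["include_items", include],
    ["exclude_items", exclude],
  ];
  for (const [name, patterns] of lists) {
    if (patterns.length > MAX_PATTERNS) {
      throw new Error(`${name} allows at most ${MAX_PATTERNS} patterns`);
    }
    for (const pattern of patterns) {
      if (!ALLOWED_PATTERN.test(pattern)) {
        throw new Error(`${name} contains unsupported characters: ${pattern}`);
      }
    }
  }

  const sourceParams = {};
  // Both parameters are optional, so empty lists are omitted.
  if (include.length > 0) sourceParams.include_items = include;
  if (exclude.length > 0) sourceParams.exclude_items = exclude;
  return { source_params: sourceParams };
}
```

For example, `buildSourceParams({ include: ["/docs/**"], exclude: ["/docs/drafts/**"] })` returns an object in the format shown above, and passing more than 10 patterns in either list throws.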


---

---
title: Public endpoint settings
description: Configure public endpoints to expose your AI Search instance directly to users without requiring authentication. This enables you to share your AI Search functionality with external users, or to integrate it into public-facing applications.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Public endpoint settings

Configure public endpoints to expose your AI Search instance directly to users without requiring authentication. This enables you to share your AI Search functionality with external users, or to integrate it into public-facing applications.

## Available endpoints

Each AI Search instance can expose three public endpoints:

| Endpoint          | Description                                   |
| ----------------- | --------------------------------------------- |
| /mcp              | Model Context Protocol endpoint for AI agents |
| /chat/completions | OpenAI-compatible chat completion endpoint    |
| /search           | Search endpoint that returns relevant chunks  |

For details on how to use these endpoints, refer to [Public endpoint usage](https://developers.cloudflare.com/ai-search/usage/public-endpoint/).

## Public URL format

When enabled, public endpoints are accessible at:

```
https://<hash>.search.ai.cloudflare.com/<endpoint>
```

The `<hash>` is your instance's unique public endpoint identifier.

For example:

* `https://abc123.search.ai.cloudflare.com/mcp`
* `https://abc123.search.ai.cloudflare.com/chat/completions`
* `https://abc123.search.ai.cloudflare.com/search`

## Enabling and disabling public endpoints

You can enable or disable each public endpoint independently:

1. Log in to your Cloudflare account, and go to **AI Search**. [Go to **AI Search**](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select your AI Search instance.
3. Go to **Settings** \> **Public Endpoints**.
4. Toggle on **Public Endpoints** to enable the feature, then toggle each individual endpoint on or off as needed.

Each endpoint has its own configuration panel for granular control.

## Rate limiting

Configure rate limits to control usage across all public endpoints:

| Setting             | Description                               | Default  |
| ------------------- | ----------------------------------------- | -------- |
| Requests per period | Maximum number of requests allowed        | 120      |
| Time period         | Time window for the rate limit            | 1 minute |
| Period type         | Rate limiting technique: fixed or sliding | fixed    |

Rate limits apply across all enabled public endpoints for the AI Search instance.
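
To illustrate how the two period types differ, here is a minimal sketch using the textbook definitions of fixed-window and sliding-window rate limiting. It is not Cloudflare's implementation. The function names are hypothetical, the sliding variant shown is a sliding-window log, and timestamps are passed in explicitly so the behavior is easy to follow.

```javascript
// Fixed window: count requests per aligned window; the counter resets at each boundary.
function createFixedWindowLimiter(limit, periodMs) {
  let windowIndex = -1;
  let count = 0;
  return function allow(nowMs) {
    const index = Math.floor(nowMs / periodMs);
    if (index !== windowIndex) {
      windowIndex = index;
      count = 0;
    }
    if (count >= limit) return false;
    count += 1;
    return true;
  };
}

// Sliding window (log variant): count accepted requests within the last `periodMs`.
function createSlidingWindowLimiter(limit, periodMs) {
  let accepted = [];
  return function allow(nowMs) {
    accepted = accepted.filter((t) => t > nowMs - periodMs);
    if (accepted.length >= limit) return false;
    accepted.push(nowMs);
    return true;
  };
}

// Four requests around a window boundary, with a limit of 2 per 1,000 ms.
const times = [900, 950, 1000, 1050];
const fixed = createFixedWindowLimiter(2, 1000);
const sliding = createSlidingWindowLimiter(2, 1000);
const fixedResults = times.map((t) => fixed(t));     // [true, true, true, true]
const slidingResults = times.map((t) => sliding(t)); // [true, true, false, false]
```

In this sketch the fixed window admits a burst that straddles the boundary, while the sliding window rejects the later requests because two were already accepted within the preceding period.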

## CORS configuration

Cross-Origin Resource Sharing (CORS) is enabled by default to support browser-based applications.

The default allowed origins depend on your data source type:

* **Website data sources**: The source domain is automatically added as an allowed origin.
* **Other data sources**: All origins (`*`) are allowed by default.

You can customize allowed origins in the **Public Endpoints** settings by adding specific hostnames to the CORS rules.

## Tool description

The **Tool Description** field allows you to customize how your AI Search instance is described to MCP clients. The default description is `Finds exactly what you're looking for`. This description helps AI agents understand what content is available, and when to use your search tool. A good tool description should explain what type of content is indexed, and what kinds of questions it can answer.

For example:

```
Search the Acme product documentation for information about
installation, configuration, API references, and troubleshooting
guides. Use this tool when users ask questions about how to set up
or use Acme products.
```

## Security considerations

* Public endpoints do not require authentication.
* Consider enabling rate limiting to prevent abuse.
* Use CORS rules to restrict access to specific domains.
* Monitor usage through your dashboard analytics.

## Related

* [UI snippets](https://developers.cloudflare.com/ai-search/configuration/embed-search-snippets/) \- Add pre-built search and chat components to your website using your public endpoints.


---

---
title: Query rewriting
description: Query rewriting is an optional step in the AI Search pipeline that improves retrieval quality by transforming the original user query into a more effective search query.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Query rewriting

Query rewriting is an optional step in the AI Search pipeline that improves retrieval quality by transforming the original user query into a more effective search query.

Instead of embedding the raw user input directly, AI Search can use a large language model (LLM) to rewrite the query based on a system prompt. The rewritten query is then used to perform the vector search.

## Why use query rewriting?

The wording of a user’s question may not match how your documents are written. Query rewriting helps bridge this gap by:

* Rephrasing informal or vague queries into precise, information-dense terms
* Adding synonyms or related keywords
* Removing filler words or irrelevant details
* Incorporating domain-specific terminology

This leads to more relevant vector matches, which improves the accuracy of the final generated response.

## Example

**Original query:** `how do i make this work when my api call keeps failing?`

**Rewritten query:** `API call failure troubleshooting authentication headers rate limiting network timeout 500 error`

In this example, the original query is conversational and vague. The rewritten version extracts the core problem (API call failure) and expands it with relevant technical terms and likely causes. These terms are much more likely to appear in documentation or logs, improving semantic matching during vector search.

## How it works

If query rewriting is enabled, AI Search performs the following:

1. Sends the **original user query** and the **query rewrite system prompt** to the configured LLM
2. Receives the **rewritten query** from the model
3. Embeds the rewritten query using the selected embedding model
4. Performs vector search in your AI Search's Vectorize index
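
The steps above can be sketched as a small pipeline. This is an illustration of the flow only, not AI Search's internal code: `retrieve` and the injected `rewriteQuery`, `embed`, and `vectorSearch` functions are hypothetical stand-ins for the LLM call, the embedding model, and the Vectorize query.

```javascript
// Each step is injected so the flow can be followed without real model calls.
async function retrieve(userQuery, steps, options = {}) {
  const { rewriteQuery, embed, vectorSearch } = steps;
  const { rewriteEnabled = true, rewriteSystemPrompt = "" } = options;

  // Steps 1-2: send the prompt and the original query to the LLM and
  // receive the rewritten query (skipped when rewriting is disabled).
  const searchQuery = rewriteEnabled
    ? await rewriteQuery(rewriteSystemPrompt, userQuery)
    : userQuery;

  // Step 3: embed the query that will actually be searched.
  const vector = await embed(searchQuery);

  // Step 4: run the vector search with that embedding.
  const matches = await vectorSearch(vector);
  return { searchQuery, matches };
}
```

Returning `searchQuery` alongside the matches makes it easy to compare the rewritten query with the original when debugging retrieval quality.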

For details on how to guide model behavior during this step, see the [system prompt](https://developers.cloudflare.com/ai-search/configuration/system-prompt/) documentation.

Note

All AI Search requests are routed through [AI Gateway](https://developers.cloudflare.com/ai-gateway/) and logged there. If you do not select an AI Gateway during setup, AI Search creates a default gateway for your instance. You can view query rewrites, embeddings, text generation, and other model calls in the AI Gateway logs for monitoring and debugging.


---

---
title: Reranking
description: Reranking can help improve the quality of AI Search results by reordering retrieved documents based on semantic relevance to the user’s query. It applies a secondary model after retrieval to "rerank" the top results before they are returned.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Reranking

Reranking can help improve the quality of AI Search results by reordering retrieved documents based on semantic relevance to the user’s query. It applies a secondary model after retrieval to "rerank" the top results before they are returned.

## How it works

By default, reranking is **disabled** for all AI Search instances. You can enable it during creation or later from the settings page.

When enabled, AI Search will:

1. Retrieve a set of relevant results from your index, constrained by your `max_num_results` and `score_threshold` parameters.
2. Pass those results through a [reranking model](https://developers.cloudflare.com/ai-search/configuration/models/supported-models/).
3. Return the reranked results, which the text generation model can use for answer generation.

Reranking helps improve accuracy, especially for large or noisy datasets where vector similarity alone may not produce the optimal ordering.
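
Conceptually, the reranking step rescores each retrieved result against the query and reorders by the new score. The sketch below illustrates that idea only: `rerank` is a hypothetical name, the `{ id, text, score }` result shape is assumed, and `toyOverlapScore` is a word-overlap toy standing in for a real reranking model such as `@cf/baai/bge-reranker-base`.

```javascript
// `scoreRelevance(query, text)` stands in for a reranking model call.
function rerank(query, results, scoreRelevance) {
  return results
    .map((result) => ({ ...result, rerank_score: scoreRelevance(query, result.text) }))
    .sort((a, b) => b.rerank_score - a.rerank_score);
}

// Toy scorer: counts how many query words appear in the text.
function toyOverlapScore(query, text) {
  const words = new Set(text.toLowerCase().split(/\W+/));
  return query
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word && words.has(word)).length;
}

// Vector similarity ranked "a" first; the reranker moves "b" to the top.
const retrieved = [
  { id: "a", text: "Llamas eat grass", score: 0.9 },
  { id: "b", text: "Train a llama to deliver coffee", score: 0.8 },
];
const reranked = rerank("train llama coffee", retrieved, toyOverlapScore);
```

The original similarity `score` is kept on each result, so both orderings remain available for comparison.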

## Configuration

You can configure reranking in several ways:

### Configure via API

When you make a `/search` or `/ai-search` request using the [Workers Binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/) or [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/), you can:

* Enable or disable reranking per request
* Specify the reranking model

For example:

```javascript
const answer = await env.AI.autorag("my-autorag").aiSearch({
  query: "How do I train a llama to deliver coffee?",
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  reranking: {
    enabled: true,
    model: "@cf/baai/bge-reranker-base"
  }
});
```

### Configure in dashboard for new AI Search

When creating a new AI Search instance in the dashboard:

1. Go to **AI Search** in the Cloudflare dashboard.  
[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select **Create** \> **Get started**.
3. In the **Retrieval configuration** step, open the **Reranking** dropdown.
4. Toggle **Reranking** on.
5. Select the reranking model.
6. Complete your setup.

### Configure in dashboard for existing AI Search

To update reranking for an existing instance:

1. Go to **AI Search** in the Cloudflare dashboard.  
[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select an existing AI Search instance.
3. Go to the **Settings** tab.
4. Under **Reranking**, toggle reranking on.
5. Select the reranking model.

### Considerations

Reranking adds a step to each query request, so request latency may increase.


---

---
title: Retrieval configuration
description: AI Search allows you to configure how content is retrieved from your vector index and used to generate a final response. The match threshold and the maximum number of results control this behavior.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Retrieval configuration

AI Search allows you to configure how content is retrieved from your vector index and used to generate a final response. Two options control this behavior:

* **Match threshold**: Minimum similarity score required for a vector match to be considered relevant.
* **Maximum number of results**: Maximum number of top-matching results to return (`topK`).

AI Search uses the [query()](https://developers.cloudflare.com/vectorize/best-practices/query-vectors/) method from [Vectorize](https://developers.cloudflare.com/vectorize/) to perform semantic search. This function compares the embedded query vector against the stored vectors in your index and returns the most similar results.

## Match threshold

The `match_threshold` sets the minimum similarity score (for example, cosine similarity) that a document chunk must meet to be included in the results. Threshold values range from `0` to `1`.

* A higher threshold means stricter filtering, returning only highly similar matches.
* A lower threshold allows broader matches, increasing recall but possibly reducing precision.

## Maximum number of results

This setting controls the number of top-matching chunks returned by Vectorize after filtering by similarity score. It corresponds to the `topK` parameter in `query()`. The maximum allowed value is 50.

* Use a higher value if you want to synthesize across multiple documents. However, providing more input to the model can increase latency and cost.
* Use a lower value if you prefer concise answers with minimal context.

## How they work together

AI Search's retrieval step follows this sequence:

1. Your query is embedded using the configured embedding model.
2. `query()` is called to search the Vectorize index, with `topK` set to the configured maximum number of results.
3. Results are filtered using the `match_threshold`.
4. The filtered results are passed into the generation step as context.

If no results meet the threshold, AI Search will not generate a response.
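The sequence above can be sketched on an in-memory list of matches. This is only an illustration, not AI Search's implementation: the `selectContext` helper and the sample scores are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption.

```javascript
// Illustrative only: mimics the retrieval sequence on an in-memory list.
// `matches` stands in for what a vector query might return: [{ id, score }, ...].
function selectContext(matches, { matchThreshold, maxNumResults }) {
  // Step 2: keep the `maxNumResults` highest-scoring matches (the top-K step).
  const topK = [...matches]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxNumResults);

  // Step 3: drop anything below the match threshold (assumed inclusive).
  return topK.filter((match) => match.score >= matchThreshold);
}

// Hypothetical sample data.
const matches = [
  { id: "chunk-a", score: 0.91 },
  { id: "chunk-b", score: 0.42 },
  { id: "chunk-c", score: 0.77 },
  { id: "chunk-d", score: 0.63 },
];

const context = selectContext(matches, { matchThreshold: 0.6, maxNumResults: 3 });
console.log(context.map((match) => match.id));
```

With a threshold higher than every score (for example, `0.95`), the helper returns an empty list, which corresponds to the case where no response is generated.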

## Configuration

These values can be configured at the AI Search instance level or overridden on a per-request basis using the [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/) or the [Workers Binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/).

Use the parameters `match_threshold` and `max_num_results` to customize retrieval behavior per request.
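As a rough sketch, per-request overrides could be validated before they are sent. The `buildRetrievalOverrides` helper is hypothetical, the lower bound of `1` for `max_num_results` is an assumption, and how these fields are placed in the request body should be checked against the REST API or Workers Binding reference.

```javascript
// Hypothetical helper: validates per-request retrieval overrides against the
// documented ranges (threshold 0 to 1, at most 50 results).
function buildRetrievalOverrides({ matchThreshold, maxNumResults }) {
  if (typeof matchThreshold !== "number" || matchThreshold < 0 || matchThreshold > 1) {
    throw new RangeError("match_threshold must be a number between 0 and 1");
  }
  // The minimum of 1 is an assumption; the maximum of 50 comes from this page.
  if (!Number.isInteger(maxNumResults) || maxNumResults < 1 || maxNumResults > 50) {
    throw new RangeError("max_num_results must be an integer between 1 and 50");
  }
  return { match_threshold: matchThreshold, max_num_results: maxNumResults };
}

console.log(buildRetrievalOverrides({ matchThreshold: 0.5, maxNumResults: 10 }));
```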

---

---
title: Service API token
description: A service API token grants AI Search permission to access and configure resources in your Cloudflare account. This token is different from API tokens you use to interact with your AI Search instance.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Service API token

A service API token grants AI Search permission to access and configure resources in your Cloudflare account. This token is different from API tokens you use to interact with your AI Search instance.

Beta

Service API tokens are required during the AI Search beta. This requirement may change in future releases.

## What is a service API token

When you create an AI Search instance, it needs to interact with other Cloudflare services on your behalf, such as [R2](https://developers.cloudflare.com/r2/), [Vectorize](https://developers.cloudflare.com/vectorize/), and [Workers AI](https://developers.cloudflare.com/workers-ai/). The service API token authorizes AI Search to perform these operations. Without it, AI Search cannot index your data or respond to queries.

This token requires the AI Search Index Engine permission (`9e9b428a0bcd46fd80e580b46a69963c`), which grants access to run the AI Search Index Engine.

## Service API token vs. AI Search API token

AI Search uses two types of API tokens for different purposes:

| Token type          | Purpose                                                                               | Who uses it          | When to create                                   |
| ------------------- | ------------------------------------------------------------------------------------- | -------------------- | ------------------------------------------------ |
| Service API token   | Grants AI Search permission to access R2, Vectorize, Browser Rendering and Workers AI | AI Search (internal) | Once per account, during first instance creation |
| AI Search API token | Authenticates your requests to query or manage AI Search instances                    | You (external)       | When calling the AI Search REST API              |

The **service API token** is used internally by AI Search to perform background operations like indexing your content and generating responses. You create it once and AI Search uses it automatically.

The **AI Search API token** is a standard [Cloudflare API token](https://developers.cloudflare.com/fundamentals/api/get-started/create-token/) that you create with AI Search permissions. You use this token to authenticate REST API requests, such as creating instances, updating configuration, or querying your AI Search.

## How it works

When you create an AI Search instance via the [dashboard](https://developers.cloudflare.com/ai-search/get-started/dashboard/), the service API token is created automatically as part of the setup flow.

When you create an instance via the [API](https://developers.cloudflare.com/ai-search/get-started/api/), you must create and register the service API token manually before creating your instance.

Once registered, the service API token is stored securely and reused across all AI Search instances in your account. You do not need to create a new token for each instance.

## Token lifecycle

The service API token remains active for as long as you have AI Search instances that depend on it.

Warning

Do not delete your service API token. If you revoke or delete the token, your AI Search instances will lose access to the underlying resources and stop functioning.

If you need a new service API token, you can create one via the dashboard or the API.

### Dashboard

1. Go to an existing AI Search instance in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/?to=/:account/ai/ai-search).
2. Select **Settings**.
3. Under **General**, find **Service API Token** and select the edit icon.
4. Select **Create a new token**.
5. Select **Save**.

### API

Follow steps 1-4 in the [API guide](https://developers.cloudflare.com/ai-search/get-started/api/) to create and register a new token programmatically.

## View registered tokens

You can view the service API tokens registered with AI Search in your account using the [List tokens API](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/tokens/methods/list/). Replace `<API_TOKEN>` with an API token that has AI Search read permissions.

```sh
curl https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai-search/tokens \
  -H "Authorization: Bearer <API_TOKEN>"
```
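The same request can be sketched in JavaScript. The `buildListTokensRequest` helper is hypothetical; it only assembles the URL and headers shown in the `curl` command and makes no assumption about the response shape.

```javascript
// Hypothetical helper: builds the URL and fetch options for the List tokens call.
function buildListTokensRequest(accountId, apiToken) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai-search/tokens`,
    init: {
      method: "GET",
      headers: { Authorization: `Bearer ${apiToken}` },
    },
  };
}

// Usage (requires network access and real credentials):
// const { url, init } = buildListTokensRequest("<ACCOUNT_ID>", "<API_TOKEN>");
// const response = await fetch(url, init);
// console.log(await response.json());
```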

---

---
title: System prompt
description: System prompts allow you to guide the behavior of the text-generation models used by AI Search at query time. AI Search supports system prompt configuration in two steps:
image: https://developers.cloudflare.com/dev-products-preview.png
---

# System prompt

System prompts allow you to guide the behavior of the text-generation models used by AI Search at query time. AI Search supports system prompt configuration in two steps:

* **Query rewriting**: Reformulates the original user query to improve semantic retrieval. A system prompt can guide how the model interprets and rewrites the query.
* **Generation**: Generates the final response from retrieved context. A system prompt can help define how the model should format, filter, or prioritize information when constructing the answer.

## What is a system prompt?

A system prompt is a special instruction sent to a large language model (LLM) that guides how it behaves during inference. The system prompt defines the model's role, context, or rules it should follow.

System prompts are particularly useful for:

* Enforcing specific response formats
* Constraining behavior (for example, responding only based on the provided content)
* Applying domain-specific tone or terminology
* Encouraging consistent, high-quality output

## System prompt configuration

### Default system prompt

When configuring your AI Search instance, you can provide your own system prompts. If you do not provide a system prompt, AI Search will use the **default system prompt** provided by Cloudflare.

You can view the effective system prompt used for any AI Search model call through AI Gateway logs, where model inputs and outputs are recorded.

Note

The default system prompt can change and evolve over time to improve performance and quality.

### Configure via API

When you make a `/ai-search` request using the [Workers Binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/) or [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/), you can set the system prompt programmatically.

For example:

```js
const answer = await env.AI.autorag("my-autorag").aiSearch({
  query: "How do I train a llama to deliver coffee?",
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  system_prompt: "You are a helpful assistant."
});
```

### Configure via Dashboard

The system prompt for your AI Search can be set after it has been created:

1. Go to [**AI Search**](https://dash.cloudflare.com/?to=/:account/ai/ai-search) in the Cloudflare dashboard.
2. Select an existing AI Search instance.
3. Go to the **Settings** tab.
4. Go to **Query rewrite** or **Generation**, and edit the **System prompt**.

## Generation system prompt

If you are using the AI Search API endpoint, you can use the system prompt to influence how the LLM responds to the final user query using the retrieved results. At this step, the model receives:

* The user's original query
* Retrieved document chunks (with metadata)
* The generation system prompt

The model uses these inputs to generate a context-aware response.

### Example

```
You are a helpful AI assistant specialized in answering questions using retrieved documents.
Your task is to provide accurate, relevant answers based on the matched content provided.
For each query, you will receive:
User's question/query
A set of matched documents, each containing:
  - File name
  - File content

You should:
1. Analyze the relevance of matched documents
2. Synthesize information from multiple sources when applicable
3. Acknowledge if the available documents don't fully answer the query
4. Format the response in a way that maximizes readability, in Markdown format

Answer only with direct reply to the user question, be concise, omit everything which is not directly relevant, focus on answering the question directly and do not redirect the user to read the content.

If the available documents don't contain enough information to fully answer the query, explicitly state this and provide an answer based on what is available.

Important:
- Cite which document(s) you're drawing information from
- Present information in order of relevance
- If documents contradict each other, note this and explain your reasoning for the chosen answer
- Do not repeat the instructions
```
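To make the three generation inputs concrete, here is a minimal sketch of how a system prompt, retrieved chunks, and the user query might be combined into a chat-style message list. The `buildGenerationMessages` helper and the `<file>` wrapper are assumptions for illustration; the exact format AI Search uses internally is not documented here.

```javascript
// Hypothetical helper: combines the generation inputs into a message array.
// `chunks` is assumed to look like [{ filename, text }, ...].
function buildGenerationMessages(systemPrompt, chunks, userQuery) {
  // Wrap each chunk with its file name so the model can cite its sources.
  const context = chunks
    .map((chunk) => `<file name="${chunk.filename}">${chunk.text}</file>`)
    .join("\n\n");

  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: context },
    { role: "user", content: userQuery },
  ];
}

const messages = buildGenerationMessages(
  "Answer using only the provided files and cite the file names.",
  [{ filename: "llama-care.md", text: "Llamas respond well to positive reinforcement." }],
  "How do I train a llama to deliver coffee?",
);
console.log(messages);
```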

## Query rewriting system prompt

If query rewriting is enabled, you can provide a custom system prompt to control how the model rewrites user queries. In this step, the model receives:

* The query rewrite system prompt
* The original user query

The model outputs a rewritten query optimized for semantic retrieval.

### Example

```
You are a search query optimizer for vector database searches. Your task is to reformulate user queries into more effective search terms.

Given a user's search query, you must:
1. Identify the core concepts and intent
2. Add relevant synonyms and related terms
3. Remove irrelevant filler words
4. Structure the query to emphasize key terms
5. Include technical or domain-specific terminology if applicable

Provide only the optimized search query without any explanations, greetings, or additional commentary.

Example input: "how to fix a bike tire that's gone flat"
Example output: "bicycle tire repair puncture fix patch inflate maintenance flat tire inner tube replacement"

Constraints:
- Output only the enhanced search terms
- Keep focus on searchable concepts
- Include both specific and general related terms
- Maintain all important meaning from original query
```

---

---
title: REST API
image: https://developers.cloudflare.com/dev-products-preview.png
---

# REST API

---

---
title: MCP server
image: https://developers.cloudflare.com/dev-products-preview.png
---

# MCP server

---

---
title: AutoRAG API filter format
description: This page documents the filter format used by the legacy AutoRAG REST API. For the new AI Search REST API filter syntax, refer to Metadata filtering.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# AutoRAG API filter format

This page documents the filter format used by the legacy AutoRAG REST API. For the new AI Search REST API filter syntax, refer to [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/).

## Comparison filter

Compare a metadata attribute (for example, `folder` or `timestamp`) with a target value:

```js
filters: {
  type: "eq",
  key: "folder",
  value: "customer-a/"
}
```

### Operators

| Operator | Description              |
| -------- | ------------------------ |
| eq       | Equals                   |
| ne       | Not equals               |
| gt       | Greater than             |
| gte      | Greater than or equal to |
| lt       | Less than                |
| lte      | Less than or equal to    |

## Compound filter

Combine multiple comparison filters with a logical operator:

```js
filters: {
  type: "and",
  filters: [
    { type: "eq", key: "folder", value: "customer-a/" },
    { type: "gte", key: "timestamp", value: "1735689600000" }
  ]
}
```

The available compound operators are `and` and `or`.

### Limitations

* No nested combinations of `and` and `or`. You can only use one compound operator at a time.
* When using `or`, only the `eq` operator is allowed and all conditions must filter on the same key.
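These limitations can be checked on the client before a request is sent. The `validateFilter` helper below is a hypothetical sketch that encodes only the rules listed on this page; the API may apply further validation.

```javascript
const COMPARISON_OPERATORS = ["eq", "ne", "gt", "gte", "lt", "lte"];
const COMPOUND_OPERATORS = ["and", "or"];

// Hypothetical helper: returns a list of problems found in a filter object.
// An empty list means the filter satisfies the rules described above.
function validateFilter(filter) {
  if (COMPARISON_OPERATORS.includes(filter.type)) return [];
  if (!COMPOUND_OPERATORS.includes(filter.type)) {
    return [`Unknown filter type: ${filter.type}`];
  }

  const errors = [];
  const children = filter.filters ?? [];

  // A compound filter may only contain comparison filters (no nesting).
  for (const child of children) {
    if (!COMPARISON_OPERATORS.includes(child.type)) {
      errors.push(`Nested or unknown filter type inside "${filter.type}": ${child.type}`);
    }
  }

  if (filter.type === "or") {
    // "or" only allows "eq", and every condition must use the same key.
    if (children.some((child) => child.type !== "eq")) {
      errors.push('"or" only allows the "eq" operator');
    }
    if (new Set(children.map((child) => child.key)).size > 1) {
      errors.push('"or" conditions must all filter on the same key');
    }
  }
  return errors;
}

console.log(validateFilter({
  type: "or",
  filters: [
    { type: "eq", key: "folder", value: "customer-a/" },
    { type: "eq", key: "folder", value: "customer-b/" },
  ],
}));
```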

## "Starts with" filter for folders

To filter for all files within a folder and its subfolders, use a compound filter with range operators.

For example, consider this file structure:

* customer-a/
   * profile.md
   * contracts/
      * property/
         * contract-1.pdf

Using `{ type: "eq", key: "folder", value: "customer-a/" }` only matches files directly in that folder (like `profile.md`), not files in subfolders.

To match all files starting with `customer-a/`, use a compound filter:

```js
filters: {
  type: "and",
  filters: [
    { type: "gt", key: "folder", value: "customer-a//" },
    { type: "lte", key: "folder", value: "customer-a/z" }
  ]
}
```

This filter matches all paths starting with `customer-a/` by using:

* `gt` with `customer-a//` to include paths greater than the `/` ASCII character
* `lte` with `customer-a/z` to include paths up to and including the lowercase `z` ASCII character
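If you build this filter for many tenants, a small helper keeps the two bounds consistent. `startsWithFolderFilter` is a hypothetical name; it only reproduces the pattern shown above for an arbitrary folder.

```javascript
// Hypothetical helper: builds the "starts with" compound filter for a folder.
function startsWithFolderFilter(folder) {
  // Normalize so the prefix always ends with a single trailing slash.
  const prefix = folder.endsWith("/") ? folder : `${folder}/`;
  return {
    type: "and",
    filters: [
      { type: "gt", key: "folder", value: `${prefix}/` },
      { type: "lte", key: "folder", value: `${prefix}z` },
    ],
  };
}

console.log(JSON.stringify(startsWithFolderFilter("customer-a/"), null, 2));
```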

## Related

* [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/) \- New AI Search REST API filter format
* [Migrate from AutoRAG Search API](https://developers.cloudflare.com/ai-search/how-to/migrate-from-autorag-api/) \- Migration guide with before/after examples

---

---
title: How AI Search works
description: AI Search is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# How AI Search works

AI Search is Cloudflare’s managed search service. You can connect your data such as websites or unstructured content, and it automatically creates a continuously updating index that you can query with natural language in your applications or AI agents.

AI Search consists of two core processes:

* **Indexing:** An asynchronous background process that monitors your data source for changes and converts your data into vectors for search.
* **Querying:** A synchronous process triggered by user queries. It retrieves the most relevant content and generates context-aware responses.

## How indexing works

Indexing begins automatically when you create an AI Search instance and connect a data source.

Here is what happens during indexing:

1. **Data ingestion:** AI Search reads from your connected data source.
2. **Markdown conversion:** AI Search uses [Workers AI’s Markdown Conversion](https://developers.cloudflare.com/workers-ai/features/markdown-conversion/) to convert [supported data types](https://developers.cloudflare.com/ai-search/configuration/data-source/) into structured Markdown. This ensures consistency across diverse file types. For images, Workers AI is used to perform object detection followed by vision-to-language transformation to convert images into Markdown text.
3. **Chunking:** The extracted text is [chunked](https://developers.cloudflare.com/ai-search/configuration/chunking/) into smaller pieces to improve retrieval granularity.
4. **Embedding:** Each chunk is embedded using Workers AI’s embedding model to transform the content into vectors.
5. **Vector storage:** The resulting vectors, along with metadata like file name, are stored in the [Vectorize](https://developers.cloudflare.com/vectorize/) database created on your Cloudflare account.

After the initial data set is indexed, AI Search regularly checks your data source for changes (for example, additions, updates, or deletions) and indexes them to keep your vector database up to date.
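As an illustration of the chunking step, the sketch below splits text into fixed-size pieces with an overlap. It is character-based for simplicity and is not AI Search's chunker; the `chunkText` helper and its parameters are hypothetical, and the real chunking options are described on the chunking configuration page.

```javascript
// Hypothetical, simplified chunker: fixed-size character windows with overlap.
function chunkText(text, chunkSize, overlap) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("chunkSize must be positive and overlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1));
```

Each chunk would then be embedded and stored as a vector, as described in steps 4 and 5.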

![Indexing](https://developers.cloudflare.com/_astro/indexing.CQ13F9Js_2Tvxs.webp) 

## How querying works

Once indexing is complete, AI Search is ready to respond to end-user queries in real time.

Here is how the querying pipeline works:

1. **Receive query from AI Search API:** The query workflow begins when you send a request to either the AI Search [Chat Completions](https://developers.cloudflare.com/ai-search/usage/rest-api/#chat-completions) or [Search](https://developers.cloudflare.com/ai-search/usage/rest-api/#search) endpoint.
2. **Query rewriting (optional):** AI Search provides the option to [rewrite the input query](https://developers.cloudflare.com/ai-search/configuration/query-rewriting/) using one of Workers AI’s LLMs to improve retrieval quality by transforming the original query into a more effective search query.
3. **Embedding the query:** The rewritten (or original) query is transformed into a vector via the same embedding model used to embed your data so that it can be compared against your vectorized data to find the most relevant matches.
4. **Querying Vectorize index:** The query vector is [queried](https://developers.cloudflare.com/vectorize/best-practices/query-vectors/) against stored vectors in the associated Vectorize database for your AI Search.
5. **Content retrieval:** Vectorize returns the metadata of the most relevant chunks, and the original content is retrieved from the R2 bucket. If you are using the Search endpoint, the content is returned at this point.
6. **Response generation:** If you are using the Chat Completions endpoint, a text-generation model from Workers AI generates a response using the retrieved content and the original user’s query, combined via a [system prompt](https://developers.cloudflare.com/ai-search/configuration/system-prompt/). The context-aware response from the model is returned.
![Querying](https://developers.cloudflare.com/_astro/querying.c_RrR1YL_1ECh9s.webp) 
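The querying steps above can be read as a small orchestration function. The sketch below is hypothetical: every function in `steps` is a caller-supplied placeholder rather than an AI Search API, and the stubs exist only so the example runs without network calls.

```javascript
// Hypothetical orchestration of the querying steps described above.
async function runQueryPipeline(query, steps, options = {}) {
  const { endpoint = "chat", rewriteQuery = false } = options;

  // Step 2 (optional): rewrite the query to improve retrieval.
  const searchQuery = rewriteQuery ? await steps.rewrite(query) : query;

  // Step 3: embed the rewritten (or original) query.
  const vector = await steps.embed(searchQuery);

  // Step 4: query the vector index for the most relevant chunks.
  const matches = await steps.queryIndex(vector);

  // Step 5: fetch the original content for each match.
  const chunks = await Promise.all(matches.map((match) => steps.fetchContent(match)));
  if (endpoint === "search") return { searchQuery, chunks };

  // Step 6: generate a response from the retrieved content and the ORIGINAL query.
  const response = await steps.generate(query, chunks);
  return { searchQuery, chunks, response };
}

// Stub steps (placeholders, not real services).
const demoSteps = {
  rewrite: async (q) => `${q} (rewritten)`,
  embed: async (q) => [q.length],
  queryIndex: async () => [{ id: "doc-1" }, { id: "doc-2" }],
  fetchContent: async (match) => `content of ${match.id}`,
  generate: async (q, chunks) => `Answer to "${q}" using ${chunks.length} chunks`,
};

runQueryPipeline("how do I reset my password", demoSteps, { rewriteQuery: true })
  .then((result) => console.log(result.response));
```

Note that the rewritten query is used only for retrieval, while generation receives the original query, which matches steps 3 and 6.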

---

---
title: What is RAG
description: Retrieval-Augmented Generation (RAG) is a way to use your own data with a large language model (LLM). Instead of relying only on what the model was trained on, RAG searches for relevant information from your data source and uses it to help answer questions.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# What is RAG

Retrieval-Augmented Generation (RAG) is a way to use your own data with a large language model (LLM). Instead of relying only on what the model was trained on, RAG searches for relevant information from your data source and uses it to help answer questions.

## How RAG works

Here’s a simplified overview of the RAG pipeline:

1. **Indexing:** Your content (e.g. docs, wikis, product information) is split into smaller chunks and converted into vectors using an embedding model. These vectors are stored in a vector database.
2. **Retrieval:** When a user asks a question, it’s also embedded into a vector and used to find the most relevant chunks from the vector database.
3. **Generation:** The retrieved content and the user’s original question are combined into a single prompt. An LLM uses that prompt to generate a response.

The resulting response should be accurate, relevant, and based on your own data.

![What is RAG](https://developers.cloudflare.com/_astro/RAG.Br2ehjiz_Z1L7k2s.webp) 

How does AI Search work

To learn more about how AI Search uses RAG under the hood, refer to [How AI Search works](https://developers.cloudflare.com/ai-search/concepts/how-ai-search-works/).
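The retrieval step of a RAG pipeline can be illustrated with cosine similarity over a tiny in-memory index. The three-dimensional vectors below are hand-made placeholders rather than real embeddings, and the `retrieve` helper is hypothetical; a real system would use an embedding model and a vector database.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: returns the k entries most similar to the query vector.
function retrieve(queryVector, index, k) {
  return index
    .map((entry) => ({ ...entry, score: cosineSimilarity(queryVector, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy index with placeholder vectors.
const index = [
  { text: "Reset your password from the login page.", vector: [0.9, 0.1, 0.0] },
  { text: "Invoices are emailed monthly.", vector: [0.0, 0.2, 0.9] },
  { text: "Passwords must be at least 12 characters.", vector: [0.8, 0.3, 0.1] },
];

const top = retrieve([1, 0, 0], index, 2);
console.log(top.map((entry) => entry.text));
```

The retrieved text would then be combined with the user's question into a single prompt for the generation step.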

## Why use RAG?

RAG lets you bring your own data into LLM generation without retraining or fine-tuning a model. It improves both accuracy and trust by retrieving relevant content at query time and using that as the basis for a response.

Benefits of using RAG:

* **Accurate and current answers:** Responses are based on your latest content, not outdated training data.
* **Control over information sources:** You define the knowledge base so answers come from content you trust.
* **Fewer hallucinations:** Responses are grounded in real, retrieved data, reducing made-up or misleading answers.
* **No model training required:** You can get high-quality results without building or fine-tuning your own LLM, which can be time-consuming and costly.

RAG is ideal for building AI-powered apps like:

* AI assistants for internal knowledge
* Support chatbots connected to your latest content
* Enterprise search across documentation and files

---

---
title: Bring your own generation model
description: When using AI Search, AI Search leverages a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can use AI Search for search while leveraging a model outside of Workers AI to generate responses.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Bring your own generation model

When you call `aiSearch()`, AI Search uses a Workers AI model to generate the response. If you want to use a model outside of Workers AI, you can call `search()` to retrieve relevant content and pass it to an external model to generate the response.

Here is an example of how you can use an OpenAI model to generate your responses. This example uses [Workers Binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/).

Note

AI Search now supports [bringing your own models natively](https://developers.cloudflare.com/ai-search/configuration/models/). You can attach provider keys through AI Gateway and select third-party models directly in your AI Search settings. The example below still works, but the recommended way is to configure your external model through AI Gateway.

JavaScript

```js
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

export default {
  async fetch(request, env) {
    // Parse incoming url
    const url = new URL(request.url);

    // Get the user query or default to a predefined one
    const userQuery =
      url.searchParams.get("query") ??
      "How do I train a llama to deliver coffee?";

    // Search for documents in AI Search
    const searchResult = await env.AI.autorag("my-rag").search({
      query: userQuery,
    });

    if (searchResult.data.length === 0) {
      // No matching documents
      return Response.json({ text: `No data found for query "${userQuery}"` });
    }

    // Join all document chunks into a single string
    const chunks = searchResult.data
      .map((item) => {
        const data = item.content
          .map((content) => {
            return content.text;
          })
          .join("\n\n");

        return `<file name="${item.filename}">${data}</file>`;
      })
      .join("\n\n");

    // Send the user query + matched documents to openai for answer
    const generateResult = await generateText({
      model: openai("gpt-4o-mini"),
      messages: [
        {
          role: "system",
          content:
            "You are a helpful assistant and your task is to answer the user question using the provided files.",
        },
        { role: "user", content: chunks },
        { role: "user", content: userQuery },
      ],
    });

    // Return the generated answer
    return Response.json({ text: generateResult.text });
  },
};
```

TypeScript

```ts
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

export interface Env {
  AI: Ai;
  OPENAI_API_KEY: string;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Parse incoming url
    const url = new URL(request.url);

    // Get the user query or default to a predefined one
    const userQuery =
      url.searchParams.get("query") ??
      "How do I train a llama to deliver coffee?";

    // Search for documents in AI Search
    const searchResult = await env.AI.autorag("my-rag").search({
      query: userQuery,
    });

    if (searchResult.data.length === 0) {
      // No matching documents
      return Response.json({ text: `No data found for query "${userQuery}"` });
    }

    // Join all document chunks into a single string
    const chunks = searchResult.data
      .map((item) => {
        const data = item.content
          .map((content) => {
            return content.text;
          })
          .join("\n\n");

        return `<file name="${item.filename}">${data}</file>`;
      })
      .join("\n\n");

    // Send the user query + matched documents to openai for answer
    const generateResult = await generateText({
      model: openai("gpt-4o-mini"),
      messages: [
        {
          role: "system",
          content:
            "You are a helpful assistant and your task is to answer the user question using the provided files.",
        },
        { role: "user", content: chunks },
        { role: "user", content: userQuery },
      ],
    });

    // Return the generated answer
    return Response.json({ text: generateResult.text });
  },
} satisfies ExportedHandler<Env>;
```

---

---
title: Migrate from AutoRAG Search API
description: This guide explains how to migrate from the previous AutoRAG API endpoints to the new AI Search API endpoints. The old /autorag/rags/ endpoints were named after the original product name (AutoRAG). The new /ai-search/instances/ endpoints reflect the product rename and include improvements like OpenAI-compatible formatting.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Migrate from AutoRAG Search API

This guide explains how to migrate from the previous AutoRAG API endpoints to the new AI Search API endpoints. The old `/autorag/rags/` endpoints were named after the original product name (AutoRAG). The new `/ai-search/instances/` endpoints reflect the product rename and include improvements like OpenAI-compatible formatting.

## Endpoint changes

| Old endpoint (AutoRAG)                                     | New endpoint (AI Search)                                                 |
| ---------------------------------------------------------- | ------------------------------------------------------------------------ |
| POST /accounts/{account\_id}/autorag/rags/{name}/ai-search | POST /accounts/{account\_id}/ai-search/instances/{name}/chat/completions |
| POST /accounts/{account\_id}/autorag/rags/{name}/search    | POST /accounts/{account\_id}/ai-search/instances/{name}/search           |

The `{name}` parameter refers to your AI Search instance name.

## Why migrate

The new AI Search API offers several advantages:

* **OpenAI-compatible format**: Use the familiar `messages` array structure that works with existing OpenAI SDKs and tools
* **New features**: Future enhancements will only be available on the new API

## Chat completions

### Before (AutoRAG API)

For all parameters and options, refer to the [AutoRAG /ai-search API reference](https://developers.cloudflare.com/api/resources/autorag/methods/ai%5Fsearch/).

**Request:**

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{NAME}/ai-search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "query": "How do I get started?"
  }'
```

**Response:**

```
{
  "success": true,
  "result": {
    "object": "vector_store.search_results.page",
    "search_query": "How do I get started?",
    "response": "To get started with AI Search...",
    "data": [
      {
        "file_id": "doc001",
        "filename": "getting-started.md",
        "score": 0.45,
        "content": [
          {
            "id": "doc001",
            "type": "text",
            "text": "Welcome to AI Search..."
          }
        ]
      }
    ]
  }
}
```

### After (AI Search API)

For all parameters and options, refer to the [AI Search /chat/completions API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/chat%5Fcompletions/).

**Request:**

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{NAME}/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "How do I get started?",
        "role": "user"
      }
    ]
  }'
```

**Response:**

```
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1771886959,
  "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To get started with AI Search..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6507,
    "completion_tokens": 137,
    "total_tokens": 6644
  },
  "chunks": [
    {
      "id": "chunk001",
      "type": "text",
      "score": 0.85,
      "text": "Welcome to AI Search...",
      "item": {
        "key": "getting-started.md",
        "timestamp": 1735689600
      },
      "scoring_details": {
        "vector_score": 0.85
      }
    }
  ]
}
```

### Request format changes

| Old format               | New format                                                     |
| ------------------------ | -------------------------------------------------------------- |
| "query": "your question" | "messages": \[{ "content": "your question", "role": "user" }\] |
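
If existing code already builds the old `query` body, the request change in the table above can be isolated in one small adapter. The following is a minimal sketch; the interface and function names are hypothetical, and only the `query` and `messages` fields shown in this guide are modeled.

```typescript
// Request body accepted by the old AutoRAG endpoints (only the field shown in this guide).
interface LegacyRequestBody {
  query: string;
}

// Request body used by the new AI Search endpoints.
interface AiSearchRequestBody {
  messages: { role: string; content: string }[];
}

// Hypothetical helper: wraps the old `query` string in a single user message.
function toAiSearchBody(legacy: LegacyRequestBody): AiSearchRequestBody {
  return { messages: [{ role: "user", content: legacy.query }] };
}
```

Calling `toAiSearchBody({ query: "How do I get started?" })` produces the same body as the "After" request shown earlier.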

### Response format changes

| Old format                | New format                   |
| ------------------------- | ---------------------------- |
| result.response           | choices\[0\].message.content |
| result.data               | chunks                       |
| data\[\].filename         | chunks\[\].item.key          |
| data\[\].content\[\].text | chunks\[\].text              |
| No scoring breakdown      | chunks\[\].scoring\_details  |
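
During a gradual migration, it can help to map the new response back into the old shape so downstream code keeps working. The sketch below follows the field mapping in the table above. The type and function names are hypothetical, and two choices are assumptions rather than documented behavior: chunks are grouped by `item.key`, and each file takes the highest score among its chunks. The old `file_id` field is omitted because the example responses show no counterpart for it.

```typescript
// Subset of the new chat completions response used by this sketch.
interface AiSearchChunk {
  id: string;
  type: string;
  score: number;
  text: string;
  item: { key: string; timestamp?: number };
}

interface AiSearchChatResponse {
  choices: { message: { role: string; content: string } }[];
  chunks: AiSearchChunk[];
}

// Subset of the old AutoRAG `result` object.
interface LegacyItem {
  filename: string;
  score: number;
  content: { id: string; type: string; text: string }[];
}

interface LegacyResult {
  response: string;
  data: LegacyItem[];
}

// Hypothetical adapter: groups flat chunks by file to rebuild the old `data` array.
function toLegacyResult(res: AiSearchChatResponse): LegacyResult {
  const byFile: Record<string, LegacyItem> = {};
  const order: string[] = []; // preserves first-seen file order

  for (const chunk of res.chunks) {
    const key = chunk.item.key;
    if (byFile[key] === undefined) {
      byFile[key] = { filename: key, score: chunk.score, content: [] };
      order.push(key);
    }
    // Assumption: a file's score is the best score among its chunks.
    byFile[key].score = Math.max(byFile[key].score, chunk.score);
    byFile[key].content.push({ id: chunk.id, type: chunk.type, text: chunk.text });
  }

  return {
    response: res.choices[0]?.message.content ?? "",
    data: order.map((key) => byFile[key]),
  };
}
```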

### Streaming behavior changes

In the old AutoRAG API, when `stream` was set to `true`, you would only receive the streamed response without the retrieved chunks.

Now, in the new AI Search API, streaming responses include the chunks. The retrieved chunks are sent first as a `chunks` event, followed by the streamed response data. This allows you to display the source chunks immediately while streaming the generated response to the user.
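
To act on the `chunks` event before the generated text arrives, a client needs to separate events by name. The following sketch parses standard server-sent event text into named events and picks out the first `chunks` event. It assumes the stream uses the usual `event:` and `data:` line layout and that the `chunks` payload is JSON. Neither detail is confirmed by this guide, so check the wire format of your own responses before relying on it. The function names are hypothetical.

```typescript
interface SseEvent {
  event: string; // event name; "message" when the block has no `event:` line
  data: string; // payload, with multiple `data:` lines joined by "\n"
}

// Parses a fully buffered SSE text into events (blocks are separated by a blank line).
function parseSseEvents(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.indexOf("event:") === 0) {
        event = line.slice(6).trim();
      } else if (line.indexOf("data:") === 0) {
        // The SSE format allows one optional space after the colon.
        data.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Returns the parsed payload of the first `chunks` event, or null if there is none.
// Assumption: the payload is JSON.
function extractSources(events: SseEvent[]): unknown {
  for (const e of events) {
    if (e.event === "chunks") return JSON.parse(e.data);
  }
  return null;
}
```

This version works on text that has already been received in full. A real streaming client would buffer incoming bytes and parse each block as soon as its terminating blank line arrives.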

## Search

### Before (AutoRAG API)

For all parameters and options, refer to the [AutoRAG /search API reference](https://developers.cloudflare.com/api/resources/autorag/methods/search/).

**Request:**

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/autorag/rags/{NAME}/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "query": "How do I get started?"
  }'
```

**Response:**

```
{
  "success": true,
  "result": {
    "object": "vector_store.search_results.page",
    "search_query": "How do I get started?",
    "data": [
      {
        "file_id": "doc001",
        "filename": "getting-started.md",
        "score": 0.45,
        "content": [
          {
            "id": "doc001",
            "type": "text",
            "text": "Welcome to AI Search..."
          }
        ]
      }
    ]
  }
}
```

### After (AI Search API)

For all parameters and options, refer to the [AI Search /search API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/search/).

**Request:**

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{NAME}/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "How do I get started?",
        "role": "user"
      }
    ]
  }'
```

**Response:**

```
{
  "success": true,
  "result": {
    "search_query": "How do I get started?",
    "chunks": [
      {
        "id": "chunk001",
        "type": "text",
        "score": 0.85,
        "text": "Welcome to AI Search...",
        "item": {
          "key": "getting-started.md",
          "timestamp": 1735689600
        },
        "scoring_details": {
          "vector_score": 0.85
        }
      }
    ]
  }
}
```

### Request format changes

| Old format               | New format                                                     |
| ------------------------ | -------------------------------------------------------------- |
| "query": "your question" | "messages": \[{ "content": "your question", "role": "user" }\] |

### Response format changes

| Old format                | New format                  |
| ------------------------- | --------------------------- |
| result.data               | result.chunks               |
| data\[\].filename         | chunks\[\].item.key         |
| data\[\].content\[\].text | chunks\[\].text             |
| No scoring breakdown      | chunks\[\].scoring\_details |

## Filter format

The new AI Search REST API uses Vectorize-style metadata filtering, which differs from the AutoRAG API format. Filters are now nested under `ai_search_options.retrieval.filters` in the request body. For full documentation of the old format, refer to [AutoRAG API filter format](https://developers.cloudflare.com/ai-search/autorag-filter-format/).

### Operator mapping

| AutoRAG API | AI Search API     |
| ----------- | ----------------- |
| eq          | $eq (or implicit) |
| ne          | $ne               |
| gt          | $gt               |
| gte         | $gte              |
| lt          | $lt               |
| lte         | $lte              |
| —           | $in (new)         |
| —           | $nin (new)        |
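
The operator mapping above can be sketched as a small converter for filters that are built in code. This is an illustrative helper, not part of either API: the type and function names are hypothetical, and it handles only single comparisons and `and` compounds of comparisons. Values are copied unchanged, so any value changes you need (for example, the different timestamp value in the compound example later in this section) must be handled separately.

```typescript
type FilterValue = string | number | boolean;

// Old AutoRAG filter shapes covered by this sketch.
type LegacyComparison = {
  type: "eq" | "ne" | "gt" | "gte" | "lt" | "lte";
  key: string;
  value: FilterValue;
};
type LegacyCompound = { type: "and"; filters: LegacyComparison[] };
type LegacyFilter = LegacyComparison | LegacyCompound;

// New format: field name mapped to a bare value (implicit equality) or an operator object.
type FieldCondition = FilterValue | Record<string, FilterValue>;
type AiSearchFilters = Record<string, FieldCondition>;

function convertFilter(legacy: LegacyFilter): AiSearchFilters {
  const comparisons = legacy.type === "and" ? legacy.filters : [legacy];
  const out: AiSearchFilters = {};

  for (const c of comparisons) {
    const existing = out[c.key];
    if (c.type === "eq" && existing === undefined) {
      out[c.key] = c.value; // implicit equality
      continue;
    }
    // Merge several operators on one field, promoting an implicit equality to $eq.
    const ops: Record<string, FilterValue> =
      existing === undefined ? {} : typeof existing === "object" ? existing : { $eq: existing };
    ops["$" + c.type] = c.value;
    out[c.key] = ops;
  }
  return out;
}
```

The result belongs under `ai_search_options.retrieval.filters` in the request body.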

### Examples

#### Simple filter

**Before (AutoRAG API):**

```
filters: {
  type: "eq",
  key: "folder",
  value: "customer-a/"
}
```

**After (AI Search API):**

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": "customer-a/" }
    }
  }
}
```

#### Compound filter (AND)

**Before (AutoRAG API):**

```
filters: {
  type: "and",
  filters: [
    { type: "eq", key: "folder", value: "customer-a/" },
    { type: "gte", key: "timestamp", value: "1735689600000" }
  ]
}
```

**After (AI Search API):**

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": "customer-a/",
        "timestamp": { "$gte": 1735689600 }
      }
    }
  }
}
```

#### "Starts with" filter

**Before (AutoRAG API):**

```
filters: {
  type: "and",
  filters: [
    { type: "gt", key: "folder", value: "customer-a//" },
    { type: "lte", key: "folder", value: "customer-a/z" }
  ]
}
```

**After (AI Search API):**

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$gte": "customer-a/", "$lt": "customer-a0" } }
    }
  }
}
```

## API references

* [REST API documentation](https://developers.cloudflare.com/ai-search/usage/rest-api/)
* [Chat Completions API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/chat%5Fcompletions/)
* [Search API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/search/)
* [Legacy AutoRAG API reference](https://developers.cloudflare.com/api/resources/autorag/)


---

---
title: Create multitenancy
description: AI Search supports multitenancy by letting you segment content by tenant, so each user, customer, or workspace can only access their own data. This is typically done by organizing documents into per-tenant folders and applying metadata filters at query time.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Create multitenancy

AI Search supports multitenancy by letting you segment content by tenant, so each user, customer, or workspace can only access their own data. This is typically done by organizing documents into per-tenant folders and applying [metadata filters](https://developers.cloudflare.com/ai-search/configuration/metadata/) at query time.

## 1\. Organize content by tenant

When uploading files to R2, structure your content by tenant using unique folder paths.

Example folder structure:

* customer-a/
  * logs/
    * …
  * contracts/
    * …
* customer-b/
  * contracts/
    * …

When indexing, AI Search automatically stores the folder path as metadata under the `folder` attribute. Enforce folder separation during upload or indexing to prevent accidental data access across tenants.

## 2\. Search using folder filters

To ensure a tenant only retrieves their own documents, apply a `folder` filter when performing a search.

```
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{NAME}/search \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "When did I sign my agreement contract?",
        "role": "user"
      }
    ],
    "ai_search_options": {
      "retrieval": {
        "filters": {
          "folder": { "$gte": "customer-a/", "$lt": "customer-a0" }
        }
      }
    }
  }'
```
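
In application code, it is safer to build this request in one place so the tenant filter can never be left out. The sketch below assembles the same request as the curl command above. The helper name and the tenant ID pattern are assumptions for illustration, and the tenant ID is expected to come from your own authenticated session rather than from user input.

```typescript
// Plain description of an HTTP request, ready to pass to fetch(url, init).
interface SearchRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds a search request that is always scoped to one tenant.
function buildTenantSearchRequest(
  accountId: string,
  instanceName: string,
  apiToken: string,
  tenantId: string,
  question: string,
): SearchRequest {
  // Assumed ID pattern: reject anything that could widen or escape the folder prefix.
  if (!/^[A-Za-z0-9_-]+$/.test(tenantId)) {
    throw new Error(`Invalid tenant ID: ${tenantId}`);
  }

  const body = {
    messages: [{ role: "user", content: question }],
    ai_search_options: {
      retrieval: {
        filters: {
          // "0" is the character right after "/", so this range covers "<tenant>/...".
          folder: { $gte: `${tenantId}/`, $lt: `${tenantId}0` },
        },
      },
    },
  };

  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai-search/instances/${instanceName}/search`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`,
      },
      body: JSON.stringify(body),
    },
  };
}
```

The returned object can be sent with `fetch(request.url, request.init)`.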

## Tip: Use "starts with" filter

While an equality filter targets files in a specific folder, you often want to retrieve all documents belonging to a tenant, including files in subfolders. For example, all files in `customer-a/` with a structure like:

* customer-a/
  * profile.md
  * contracts/
    * property/
      * contract-1.pdf

To achieve this ["starts with"](https://developers.cloudflare.com/ai-search/configuration/metadata/#starts-with-filter-for-folders) behavior, use a range filter:

```
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": { "$gte": "customer-a/", "$lt": "customer-a0" }
      }
    }
  }
}
```

This range filter matches every path that starts with `customer-a/`, because `0` is the character immediately after `/` in ASCII order. It therefore captures both `profile.md` and `contract-1.pdf`.
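
The same trick generalizes to any prefix: keep the prefix as the lower bound and increment its last character to get the exclusive upper bound. Here is a minimal sketch with a hypothetical helper name. It assumes folder values are compared in character-code order, as the ASCII explanation above implies.

```typescript
// Hypothetical helper: builds a "starts with" range for a folder prefix.
function prefixRange(prefix: string): { $gte: string; $lt: string } {
  if (prefix.length === 0) {
    throw new Error("prefix must not be empty");
  }
  const last = prefix.charCodeAt(prefix.length - 1);
  if (last === 0xffff) {
    throw new Error("last character cannot be incremented");
  }
  // Replace the last character with its successor, for example "/" (47) becomes "0" (48).
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { $gte: prefix, $lt: upper };
}

// Usage: { folder: prefixRange("customer-a/") }
```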


---

---
title: NLWeb
description: Enable conversational search on your website with NLWeb and Cloudflare AI Search. This template crawls your site, indexes the content, and deploys NLWeb-standard endpoints to serve both people and AI agents.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# NLWeb

Enable conversational search on your website with NLWeb and Cloudflare AI Search. This template crawls your site, indexes the content, and deploys NLWeb-standard endpoints to serve both people and AI agents.

Note

This is a public preview ideal for experimentation. If you're interested in running this in production workflows, please contact us at [nlweb@cloudflare.com](mailto:nlweb@cloudflare.com).

## What is NLWeb

[NLWeb ↗](https://github.com/nlweb-ai/NLWeb) is an open project developed by Microsoft that defines a standard protocol for natural language queries on websites. Its goal is to make every website as accessible and interactive as a conversational AI app, so both people and AI agents can reliably query site content. It does this by exposing two key endpoints:

* `/ask`: Conversational endpoint for user queries
* `/mcp`: Structured Model Context Protocol (MCP) endpoint for AI agents

## How to use it

You can deploy NLWeb on your website directly through the AI Search dashboard:

1. Log in to your [Cloudflare dashboard ↗](https://dash.cloudflare.com/).
2. Go to **Compute & AI** \> **AI Search**.
3. Select **Create**.
4. Select **Website** as a data source.
5. Follow the instructions to create an AI Search instance.
6. Go to the **Settings** for the instance.
7. Find **NLWeb Worker** and select "Enable AI Search for your website".

Once complete, AI Search will deploy an NLWeb Worker for you that enables you to use the NLWeb API Endpoints.

## What this template includes

Choosing the NLWeb Website option extends a standard AI Search instance by tailoring it for content‑heavy websites and giving you everything required to adopt NLWeb as the standard for conversational search on your site. Specifically, the template provides:

* **Website as a data source:** Uses the [Website](https://developers.cloudflare.com/ai-search/configuration/data-source/website/) data source to crawl and ingest pages with the Rendered Sites option.
* **Defaults for content-heavy websites:** Applies tuned embedding and retrieval configurations ideal for publishing and content‑rich websites.
* **NLWeb Worker deployment:** Automatically spins up a Cloudflare Worker from the [NLWeb Worker template ↗](https://github.com/cloudflare/templates).

## What the Worker includes

Your deployed Worker provides two endpoints:

* `/ask` — NLWeb’s standard conversational endpoint  
   * Powers the conversational UI at the root (`/`)  
   * Powers the embeddable preview widget (`/snippet.html`)
* `/mcp` — NLWeb’s MCP server endpoint for trusted AI agents

These endpoints give both people and agents structured access to your content.

## Using it on your website

To integrate NLWeb search directly into your site:

1. Find your deployed Worker in the [Cloudflare dashboard ↗](https://dash.cloudflare.com/):
   * Go to **Compute & AI** \> **AI Search**.
   * Select **Connect**, then go to the **NLWeb** tab.
   * Select **Go to Worker**.
2. Add a [custom domain](https://developers.cloudflare.com/workers/configuration/routing/custom-domains/) to your Worker (for example, `ask.example.com`).
3. Use the `/ask` endpoint on your custom domain to power the search (for example, `ask.example.com/ask`).
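
Once the custom domain is in place, a client needs a URL for each question. The sketch below builds one for the `/ask` endpoint. It assumes the endpoint reads the question from a `query` URL parameter, following the NLWeb project's REST convention. This page does not confirm the parameter name, so verify it against your deployed Worker. The helper name is hypothetical, and no response format is assumed.

```typescript
// Hypothetical helper: builds an /ask URL for a question.
// Assumption: the endpoint takes the question in a `query` URL parameter.
function buildAskUrl(workerOrigin: string, query: string): string {
  const origin = workerOrigin.replace(/\/+$/, ""); // drop trailing slashes
  return `${origin}/ask?query=${encodeURIComponent(query)}`;
}

// Usage: fetch(buildAskUrl("https://ask.example.com", "How do I get started?"))
```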

You can also use the embeddable snippet to add a search UI directly into your website. For example:

```
<!-- Add css on head -->
<link rel="stylesheet" href="https://ask.example.com/nlweb-dropdown-chat.css">
<link rel="stylesheet" href="https://ask.example.com/common-chat-styles.css">

<!-- Add container on body -->
<div id="docs-search-container"></div>

<!-- Include JavaScript -->
<script type="module">
  import { NLWebDropdownChat } from 'https://ask.example.com/nlweb-dropdown-chat.js';

  const chat = new NLWebDropdownChat({
    containerId: 'docs-search-container',
    site: 'https://ask.example.com',
    placeholder: 'Search for docs...',
    endpoint: 'https://ask.example.com'
  });
</script>
```

This lets you serve conversational AI search directly from your own domain, with control over how people and agents access your content.

## Modifying or updating the Worker

You may want to customize your Worker, for example, to adjust the UI for the embeddable snippet. In that case, we recommend calling the `/ask` endpoint for queries and building your own UI on top of it. Alternatively, you can modify the Worker's code for the embeddable UI.

If the NLWeb standard is updated, you can update your Worker to stay compatible and receive the latest updates.

The simplest way to apply changes or updates is to redeploy the Worker template:

[![Deploy to Cloudflare](https://deploy.workers.cloudflare.com/button)](https://deploy.workers.cloudflare.com/?url=https://github.com/cloudflare/templates/tree/main/nlweb-template)

To do so:

1. Select the **Deploy to Cloudflare** button from above to deploy the Worker template to your Cloudflare account.
2. Enter the name of your AI Search in the `RAG_ID` environment variable field.
3. Click **Deploy**.
4. Select the **GitHub/GitLab** icon on the Workers Dashboard.
5. Clone the repository that is created for your Worker.
6. Make your modifications, then commit and push changes to the repository to update your Worker.

Now you can use this Worker as the new NLWeb endpoint for your website.


---

---
title: Create a simple search engine
description: By using the search method, you can implement a simple but fast search engine. This example uses Workers Binding, but can be easily adapted to use the REST API instead.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Create a simple search engine

By using the `search` method, you can implement a simple but fast search engine. This example uses [Workers Binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/), but can be easily adapted to use the [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/) instead.

To replicate this example, remember to:

* Disable `rewrite_query`, because you want to match the original user query.
* Configure your AI Search instance to use small chunk sizes. 256 tokens is usually enough.

JavaScript

```
export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const userQuery =
      url.searchParams.get("query") ??
      "How do I train a llama to deliver coffee?";
    const searchResult = await env.AI.autorag("my-rag").search({
      query: userQuery,
      rewrite_query: false,
    });

    return Response.json({
      files: searchResult.data.map((obj) => obj.filename),
    });
  },
};
```

TypeScript

```
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const url = new URL(request.url);
    const userQuery =
      url.searchParams.get("query") ??
      "How do I train a llama to deliver coffee?";
    const searchResult = await env.AI.autorag("my-rag").search({
      query: userQuery,
      rewrite_query: false,
    });

    return Response.json({
      files: searchResult.data.map((obj) => obj.filename),
    });
  },
} satisfies ExportedHandler<Env>;
```
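
If you adapt this example to the REST API, the `/search` response returns a flat list of chunks rather than one entry per file, so the same file can appear several times. The sketch below reduces such a list to unique files ranked by their best chunk score. It assumes the chunk fields `score` and `item.key` shown in the REST response examples in the migration guide. The type and function names are hypothetical.

```typescript
// Subset of a search chunk used by this sketch.
interface SearchChunk {
  score: number;
  text: string;
  item: { key: string };
}

interface FileHit {
  file: string;
  score: number;
}

// Hypothetical helper: one entry per file, ordered by the best-scoring chunk.
function rankFiles(chunks: SearchChunk[]): FileHit[] {
  const best: Record<string, number> = {};
  for (const chunk of chunks) {
    const key = chunk.item.key;
    if (best[key] === undefined || chunk.score > best[key]) {
      best[key] = chunk.score;
    }
  }
  return Object.keys(best)
    .map((file) => ({ file, score: best[file] }))
    .sort((a, b) => b.score - a.score);
}
```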


---

---
title: Limits & pricing
description: During the open beta, AI Search is free to enable. When you create an AI Search instance, it provisions and runs on top of Cloudflare services in your account. These resources are billed as part of your Cloudflare usage and include R2, Vectorize, Workers AI, AI Gateway, and Browser Rendering.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Limits & pricing

## Pricing

During the open beta, AI Search is **free to enable**. When you create an AI Search instance, it provisions and runs on top of Cloudflare services in your account. These resources are **billed as part of your Cloudflare usage** and include:

| Service & Pricing                                                                     | Description                                                                                                                                                       |
| ------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**R2**](https://developers.cloudflare.com/r2/pricing/)                               | Stores your source data                                                                                                                                           |
| [**Vectorize**](https://developers.cloudflare.com/vectorize/platform/pricing/)        | Stores vector embeddings and powers semantic search                                                                                                               |
| [**Workers AI**](https://developers.cloudflare.com/workers-ai/platform/pricing/)      | Handles image-to-Markdown conversion, embedding, query rewriting, and response generation                                                                         |
| [**AI Gateway**](https://developers.cloudflare.com/ai-gateway/reference/pricing/)     | Monitors and controls model usage                                                                                                                                 |
| [**Browser Rendering**](https://developers.cloudflare.com/browser-rendering/pricing/) | Loads dynamic JavaScript content during [website](https://developers.cloudflare.com/ai-search/configuration/data-source/website/) crawling with the Render option |

For more information about how each resource is used within AI Search, refer to [How AI Search works](https://developers.cloudflare.com/ai-search/concepts/how-ai-search-works/).

## Limits

The following limits currently apply to AI Search during the open beta:

Need a higher limit?

To request an adjustment to a limit, complete the [Limit Increase Request Form ↗](https://forms.gle/wnizxrEUW33Y15CT8). If the limit can be increased, Cloudflare will contact you with next steps.

| Limit                               | Value                    |
| ----------------------------------- | ------------------------ |
| Max AI Search instances per account | 50                       |
| Max files per AI Search             | 1,000,000                |
| Max file size                       | 4 MB                     |
| Max custom metadata fields          | 5 per AI Search instance |
| Max text metadata value length      | 500 characters           |

These limits are subject to change as AI Search evolves beyond open beta.
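
A pre-upload check can catch files that would exceed these limits before they reach your data source. The sketch below encodes the file size and metadata limits from the table. Several details are assumptions: "4 MB" is treated as 4 × 1024 × 1024 bytes, string length stands in for "characters", and the per-instance field limit is applied to the metadata keys of a single file. The constant and function names are hypothetical.

```typescript
const MAX_FILE_SIZE_BYTES = 4 * 1024 * 1024; // assumption: "4 MB" means 4 MiB
const MAX_CUSTOM_METADATA_FIELDS = 5;
const MAX_TEXT_METADATA_LENGTH = 500;

type MetadataValue = string | number | boolean;

// Hypothetical helper: returns a list of problems, empty when the file is within limits.
function checkLimits(
  sizeBytes: number,
  metadata: Record<string, MetadataValue>,
): string[] {
  const problems: string[] = [];

  if (sizeBytes > MAX_FILE_SIZE_BYTES) {
    problems.push(`file is ${sizeBytes} bytes, limit is ${MAX_FILE_SIZE_BYTES}`);
  }

  const keys = Object.keys(metadata);
  if (keys.length > MAX_CUSTOM_METADATA_FIELDS) {
    problems.push(`${keys.length} metadata fields, limit is ${MAX_CUSTOM_METADATA_FIELDS}`);
  }

  for (const key of keys) {
    const value = metadata[key];
    if (typeof value === "string" && value.length > MAX_TEXT_METADATA_LENGTH) {
      problems.push(`metadata "${key}" has ${value.length} characters, limit is ${MAX_TEXT_METADATA_LENGTH}`);
    }
  }
  return problems;
}
```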


---

---
title: Release note
description: Review recent changes to Cloudflare AI Search.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Release note

This release notes section covers regular updates and minor fixes. For major feature releases or significant updates, see the [changelog](https://developers.cloudflare.com/changelog).

## 2026-04-01

**Wrangler CLI support for AI Search**

Manage AI Search instances from the command line with the `wrangler ai-search` command namespace. Create, list, update, delete, search, and get usage statistics for instances without leaving your terminal. All commands support `--json` for structured output that scripts and AI agents can parse directly. Refer to [Wrangler commands](https://developers.cloudflare.com/ai-search/wrangler-commands/) for full usage details.

## 2026-03-23

**Custom metadata filtering**

Define up to 5 custom metadata fields per AI Search instance and filter search results by category, version, or any custom attribute. Attach metadata via R2 custom headers or HTML meta tags.

## 2026-03-23

**Public endpoint, UI snippets, and MCP support**

AI Search now supports [public endpoints](https://developers.cloudflare.com/ai-search/configuration/public-endpoint/), [UI snippets](https://developers.cloudflare.com/ai-search/configuration/embed-search-snippets/), and [MCP](https://developers.cloudflare.com/ai-search/usage/mcp/), making it easy to add search to your website or connect AI agents.

## 2026-03-23

**New REST API endpoints**

AI Search introduces new [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/) endpoints for search that use an OpenAI-compatible format. You can use the familiar `messages` array structure that works with existing OpenAI SDKs and tools. The previous AutoRAG API endpoints will continue working as expected for the time being. New features will only be added to the new API. We will provide at least 90 days notice before any end of life. See the [migration guide](https://developers.cloudflare.com/ai-search/how-to/migrate-from-autorag-api/) for instructions.

## 2026-02-09

**Crawler user agent renamed**

The AI Search crawler user agent has been renamed from `Cloudflare-AutoRAG` to `Cloudflare-AI-Search`. You can continue using the previous user agent name, `Cloudflare-AutoRAG`, in your `robots.txt`. The Bot Detection ID for WAF rules, `122933950`, remains unchanged.

## 2026-02-09

**Specify a single sitemap for website crawling**

You can now specify a single sitemap URL in **Parser options** to limit which pages are crawled. By default, AI Search crawls all sitemaps listed in your `robots.txt` from top to bottom.

## 2026-02-09

**Sync individual files**

You can now trigger a sync for a specific file from the dashboard. Go to **Overview** \> **Indexed Items** and select the sync icon next to the file you want to reindex.

## 2026-01-22

**New file type support**

AI Search now supports Emacs Lisp (`.el`) files and the `.htm` extension for HTML documents.

## 2026-01-19

**Path filtering for website and R2 data sources**

You can now filter which paths to include or exclude from indexing for both website and R2 data sources.

## 2026-01-19

**Simplified API instance creation**

API instance creation is now simpler: the `token_id` and `model` fields are optional.

## 2026-01-16

**Website crawler improvements**

Website instances now respect sitemap `<priority>` for indexing order and `<changefreq>` for re-crawl frequency. Added support for `.gz` compressed sitemaps and partial URLs in robots.txt and sitemaps.

## 2026-01-16

**Improved indexing performance**

We have improved indexing performance for all AI Search instances. Support for more and larger files is coming.

## 2025-12-10

**Query rewrite visibility in AI Gateway logs**

Fixed a bug where query rewrites were not visible in the AI Gateway logs.

## 2025-11-19

**Custom HTTP headers for website crawling**

AI Search now supports custom HTTP headers for website crawling, allowing you to index content behind authentication or access controls.

## 2025-10-28

**Reranking and API-based system prompts**

You can now enable reranking to reorder retrieved documents by semantic relevance and set system prompts directly in API requests for per-query control.

## 2025-09-25

**AI Search (formerly AutoRAG) now supports more models**

Connect your provider keys through AI Gateway to use models from OpenAI, Anthropic, and other providers for both embeddings and inference.

## 2025-09-23

**Support document file types in AutoRAG**

Our [conversion utility](https://developers.cloudflare.com/workers-ai/features/markdown-conversion/) can now convert `.docx` and `.odt` files to Markdown, making these files available to index inside your AutoRAG instance.

## 2025-09-19

**Metrics view for AI Search**

AI Search now includes a Metrics tab to track file indexing, search activity, and top retrievals.

## 2025-08-28

**Website data source and NLWeb integration**

AI Search now supports websites as a data source. Connect your domain to automatically crawl and index your site content with continuous re-crawling. Also includes NLWeb integration for conversational search with `/ask` and `/mcp` endpoints.

## 2025-08-20

**Increased maximum query results to 50**

The maximum number of results returned from a query has been increased from **20** to **50**. This allows you to surface more relevant matches in a single request.

## 2025-07-16

**Deleted files now removed from index on next sync**

When a file is deleted from your R2 bucket, its corresponding chunks are now automatically removed from the Vectorize index linked to your AI Search instance during the next sync.

## 2025-07-08

**Faster indexing and new Jobs view**

Indexing is now 3-5x faster. A new Jobs view lets you monitor indexing progress, view job status, and inspect real-time logs.

## 2025-07-08

**Reduced cooldown between syncs**

The cooldown period between sync jobs has been reduced to 3 minutes, allowing you to trigger syncs more frequently.

## 2025-06-19

**Filter search by file name**

You can now filter AI Search queries by file name using the `filename` attribute for more control over which files are searched.

## 2025-06-19

**Custom metadata in search responses**

AI Search now returns custom metadata in search responses. You can also add a `context` field to guide AI-generated answers.

## 2025-06-16

**Rich format file size limit increased to 4 MB**

You can now index rich format files (e.g., PDF) up to 4 MB in size, up from the previous 1 MB limit.

## 2025-06-12

**Index processing status displayed on dashboard**

The dashboard now includes a new “Processing” step for the indexing pipeline that displays the files currently being processed.

## 2025-06-12

**Sync AI Search REST API published**

You can now trigger a sync job for an AI Search instance using the [Sync REST API](https://developers.cloudflare.com/api/resources/ai-search/subresources/rags/methods/sync/). This scans your data source for changes and queues updated or previously errored files for indexing.

## 2025-06-10

**Files modified in the data source will now be updated**

Files modified in your source R2 bucket will now be updated in the AI Search index during the next sync. For example, if you upload a new version of an existing file, the changes will be reflected in the index after the subsequent sync job. Please note that deleted files are not yet removed from the index. We are actively working on this functionality.

## 2025-05-31

**Errored files will now be retried in next sync**

Files that failed to index will now be automatically retried in the next indexing job. For instance, if a file initially failed because it was oversized but was then corrected (e.g. replaced with a file of the same name/key within the size limit), it will be re-attempted during the next scheduled sync.

## 2025-05-31

**Fixed character cutoff in recursive chunking**

Resolved an issue where certain characters (e.g. '#') were being cut off during the recursive chunking and embedding process. This fix ensures complete character processing in the indexing process.

## 2025-05-25

**EU jurisdiction R2 buckets now supported**

AI Search now supports R2 buckets configured with European Union (EU) jurisdiction restrictions. Previously, files in EU-restricted R2 buckets would not index when linked. This issue has been resolved, and all EU-restricted R2 buckets should now function as expected.

## 2025-04-23

**Metadata filtering and multitenancy support**

Filter search results by `folder` and `timestamp` to enable multitenancy and control the scope of retrieved results.
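The sketch below shows one way a per-tenant filter could be assembled. The filter shape (`type`, `key`, `value`, and a compound `and` with a nested `filters` array) is an assumption not confirmed on this page, and `tenantFilter` is a hypothetical helper. Check the Metadata filtering documentation for the exact syntax before relying on it.

```javascript
// Assumed filter shape: { type, key, value } comparisons that can be combined
// under { type: "and", filters: [...] }. Verify against the Metadata filtering docs.
function tenantFilter(tenantFolder, sinceMs) {
  const folderFilter = { type: "eq", key: "folder", value: tenantFolder };
  if (sinceMs === undefined) {
    return folderFilter;
  }
  return {
    type: "and",
    filters: [
      folderFilter,
      // Only include items with a timestamp at or after sinceMs.
      { type: "gte", key: "timestamp", value: sinceMs },
    ],
  };
}

// Scope a query to one tenant's folder and to content from 2025-01-01 onwards.
const filters = tenantFilter("customer-a/", 1735689600000);
```

Keeping the tenant folder as a function argument means the scope is derived from the authenticated tenant on the server rather than from user input.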

## 2025-04-23

**Response streaming in AI Search binding added**

AI Search now supports response streaming in the `aiSearch()` method of the [Workers binding](https://developers.cloudflare.com/ai-search/usage/workers-binding/), allowing you to stream results as they're retrieved by setting `stream: true`.

## 2025-04-07

**AI Search is now in open beta!**

AI Search lets developers create fully managed retrieval-augmented generation (RAG) pipelines powered by Cloudflare, so they can integrate context-aware AI into their applications without managing infrastructure. Get started today on the [Cloudflare Dashboard](https://dash.cloudflare.com/?to=/:account/ai/autorag).


---

---
title: MCP
description: The Model Context Protocol (MCP) endpoint allows AI agents to discover and interact with your AI Search content. This endpoint follows the MCP specification and provides tools for querying your indexed content.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# MCP

The Model Context Protocol (MCP) endpoint allows AI agents to discover and interact with your AI Search content. This endpoint follows the [MCP specification ↗](https://modelcontextprotocol.io/) and provides tools for querying your indexed content.

## Prerequisites

Enable public endpoints for your AI Search instance:

1. Go to **AI Search** in the Cloudflare dashboard.[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select your AI Search instance.
3. Go to **Settings** \> **Public Endpoint**.
4. Turn on **Enable Public Endpoint**.
5. Copy the public endpoint URL.

## Available tools

The AI Search MCP endpoint exposes a `search` tool that queries your indexed content.

| Tool   | Description                           |
| ------ | ------------------------------------- |
| search | Finds exactly what you're looking for |

You can customize this in your AI Search instance settings. For more details, refer to [Public endpoint configuration](https://developers.cloudflare.com/ai-search/configuration/public-endpoint/).

## Test the MCP endpoint

Send a request to the `/mcp` endpoint with the `Accept: application/json, text/event-stream` header:

```sh
curl https://<INSTANCE_ID>.search.ai.cloudflare.com/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "search",
      "arguments": {
        "query": "How do I configure AI Search?"
      }
    }
  }'
```
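The same request can be assembled in JavaScript. This is a minimal sketch that mirrors the curl command above. `buildMcpSearchRequest` is a hypothetical helper and `INSTANCE_ID` is a placeholder for your own public endpoint ID.

```javascript
// Builds the URL and fetch() options for a JSON-RPC tools/call request
// to the /mcp endpoint, using the same headers as the curl example.
function buildMcpSearchRequest(instanceId, query, id = 1) {
  return {
    url: `https://${instanceId}.search.ai.cloudflare.com/mcp`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id,
        method: "tools/call",
        params: { name: "search", arguments: { query } },
      }),
    },
  };
}

const mcpRequest = buildMcpSearchRequest(
  "INSTANCE_ID",
  "How do I configure AI Search?"
);
// To send it: const response = await fetch(mcpRequest.url, mcpRequest.init);
```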


---

---
title: Public endpoint
description: AI Search public endpoints allow you to expose AI Search capabilities without requiring authentication. This enables you to integrate AI Search into public-facing applications or share it with external users.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Public endpoint

AI Search public endpoints allow you to expose AI Search capabilities without requiring authentication. This enables you to integrate AI Search into public-facing applications or share it with external users.

For pre-built search and chat components you can embed on your website using the public endpoints, refer to [UI snippets](https://developers.cloudflare.com/ai-search/configuration/embed-search-snippets/).

## Prerequisites

Enable public endpoints for your AI Search instance:

1. Go to **AI Search** in the Cloudflare dashboard.[ Go to **AI Search** ](https://dash.cloudflare.com/?to=/:account/ai/ai-search)
2. Select your AI Search instance.
3. Go to **Settings** \> **Public Endpoint**.
4. Turn on **Enable Public Endpoint**.
5. Copy the public endpoint URL.

For configuration options like rate limiting and CORS, refer to [Public endpoint configuration](https://developers.cloudflare.com/ai-search/configuration/public-endpoint/).

## Chat completions

The `/chat/completions` endpoint searches your data source and generates a response using the model and retrieved context. It uses the same OpenAI-compatible format as the [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/#chat-completions).

```sh
curl https://<INSTANCE_ID>.search.ai.cloudflare.com/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": "How do I configure AI Search?",
        "role": "user"
      }
    ]
  }'
```

For the full list of options, refer to the [Chat Completions API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/chat%5Fcompletions/).

## Search

The `/search` endpoint returns relevant chunks from your data source without generating a response. It uses the same format as the [REST API](https://developers.cloudflare.com/ai-search/usage/rest-api/#search).

```sh
curl https://<INSTANCE_ID>.search.ai.cloudflare.com/search \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": "How do I configure AI Search?",
        "role": "user"
      }
    ]
  }'
```

For the full list of options, refer to the [Search API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/search/).
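Both public endpoints accept the same `messages` payload, so a single helper can target either path. The sketch below is illustrative only. `buildPublicRequest` is a hypothetical name and `INSTANCE_ID` is a placeholder. No `Authorization` header is set because public endpoints do not require authentication.

```javascript
// Builds fetch() arguments for a public endpoint path.
// Only the two paths documented on this page are accepted.
function buildPublicRequest(instanceId, path, question) {
  if (path !== "/search" && path !== "/chat/completions") {
    throw new Error(`Unsupported path: ${path}`);
  }
  return {
    url: `https://${instanceId}.search.ai.cloudflare.com${path}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        messages: [{ role: "user", content: question }],
      }),
    },
  };
}

const publicSearch = buildPublicRequest(
  "INSTANCE_ID",
  "/search",
  "How do I configure AI Search?"
);
// To send it: const response = await fetch(publicSearch.url, publicSearch.init);
```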

## Next steps

* [UI snippets](https://developers.cloudflare.com/ai-search/configuration/embed-search-snippets/) \- Add pre-built search and chat components to your website.
* [MCP](https://developers.cloudflare.com/ai-search/usage/mcp/) \- Connect AI agents using the Model Context Protocol.
* [Public endpoint configuration](https://developers.cloudflare.com/ai-search/configuration/public-endpoint/) \- Configure rate limiting, CORS, and security settings.


---

---
title: REST API
description: This guide explains how to use the AI Search REST API to query your AI Search instance.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# REST API

This guide explains how to use the AI Search REST API to query your AI Search instance.

Note

The previous [AutoRAG API endpoints](https://developers.cloudflare.com/api/resources/autorag/) are no longer recommended for use. Refer to [Migrate from AutoRAG Search API](https://developers.cloudflare.com/ai-search/how-to/migrate-from-autorag-api/) for details.

## Prerequisite: Get AI Search API token

You need an API token with `AI Search` `Run` permissions to use the REST API. To create a new token:

1. Log in to the Cloudflare dashboard, and go to API tokens for your profile.[ Go to **Account API tokens** ](https://dash.cloudflare.com/?to=/:account/api-tokens)
2. Select **Create Token**.
3. Select **Create Custom Token**.
4. Enter a name for your token.
5. Under **Permissions**, select **AI Search** and **Run**.
6. Under **Account Resources**, select the account you want to use.
7. Select **Continue to summary**, then select **Create Token**.
8. Copy and save your API token for future use.

## Chat Completions

This endpoint searches for relevant results from your data source and generates a response using the model and the retrieved context:

```sh
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{AI_SEARCH_NAME}/chat/completions \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "How do I train a llama to deliver coffee?",
        "role": "user"
      }
    ]
  }'
```

Note

* `ACCOUNT_ID`: Find this by going to [Workers & Pages](https://developers.cloudflare.com/fundamentals/account/find-account-and-zone-ids/#find-account-id-workers-and-pages) in the Cloudflare dashboard.
* `AI_SEARCH_NAME`: The name of your AI Search instance.
* `API_TOKEN`: The API token you created in the [prerequisite step](#prerequisite-get-ai-search-api-token).
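The three placeholders above map directly onto a `fetch()` call. The following is a minimal sketch, not an official client. `buildChatCompletionsRequest` is a hypothetical helper, and the argument values shown are placeholders you would replace with your own.

```javascript
// Builds the URL and fetch() options for the Chat Completions endpoint.
// Path segments are URL-encoded in case an instance name contains special characters.
function buildChatCompletionsRequest({ accountId, instanceName, apiToken, question }) {
  const base = "https://api.cloudflare.com/client/v4";
  const path =
    `/accounts/${encodeURIComponent(accountId)}` +
    `/ai-search/instances/${encodeURIComponent(instanceName)}/chat/completions`;
  return {
    url: base + path,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`,
      },
      body: JSON.stringify({
        messages: [{ role: "user", content: question }],
      }),
    },
  };
}

const chatRequest = buildChatCompletionsRequest({
  accountId: "ACCOUNT_ID",
  instanceName: "AI_SEARCH_NAME",
  apiToken: "API_TOKEN",
  question: "How do I train a llama to deliver coffee?",
});
// To send it: const response = await fetch(chatRequest.url, chatRequest.init);
```

In practice, load the token from a secret store or environment variable rather than hard-coding it.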

### Parameters

`messages` ` array ` required

An array of message objects. Each message has:

* `content` ` string ` \- The message content.
* `role` ` string ` \- The role: `user`, `system`, or `assistant`.

`stream` ` boolean ` optional

Set to `true` to return a stream of results as they are generated. Defaults to `false`.

`ai_search_options` ` object ` optional

Per-request overrides for retrieval and model behavior. Supports the following nested options:

* `retrieval.filters` ` object ` \- Narrow down search results based on metadata. Refer to [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/) for syntax and examples.
* `retrieval.max_num_results` ` number ` \- Maximum number of chunks to return. Defaults to `10`, maximum `50`.
* `retrieval.retrieval_type` ` string ` \- One of `vector`, `keyword`, or `hybrid`.
* `retrieval.match_threshold` ` number ` \- Minimum similarity score (0-1). Defaults to `0.4`.
* `cache.enabled` ` boolean ` \- Override the instance-level cache setting for this request.
* `reranking.enabled` ` boolean ` \- Override the instance-level reranking setting for this request.

---

For the full list of optional parameters, refer to the [Chat Completions API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/chat%5Fcompletions/).
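As an illustration, the sketch below validates a few of these overrides and nests them under `ai_search_options`. It reads the dotted names above (for example `retrieval.max_num_results`) as nested object keys, which is an interpretation rather than something this page states outright. `buildSearchOptions` is a hypothetical helper, and the lower bound of `1` for results is an assumption.

```javascript
// Validates per-request overrides and returns an ai_search_options object.
// Assumption: dotted parameter names map to nested objects.
function buildSearchOptions({ maxNumResults, retrievalType, matchThreshold } = {}) {
  const retrieval = {};
  if (maxNumResults !== undefined) {
    // The documented maximum is 50; the minimum of 1 is assumed.
    if (!Number.isInteger(maxNumResults) || maxNumResults < 1 || maxNumResults > 50) {
      throw new RangeError("max_num_results must be an integer between 1 and 50");
    }
    retrieval.max_num_results = maxNumResults;
  }
  if (retrievalType !== undefined) {
    if (!["vector", "keyword", "hybrid"].includes(retrievalType)) {
      throw new RangeError("retrieval_type must be vector, keyword, or hybrid");
    }
    retrieval.retrieval_type = retrievalType;
  }
  if (matchThreshold !== undefined) {
    if (typeof matchThreshold !== "number" || matchThreshold < 0 || matchThreshold > 1) {
      throw new RangeError("match_threshold must be between 0 and 1");
    }
    retrieval.match_threshold = matchThreshold;
  }
  return { retrieval };
}

const requestBody = {
  messages: [{ role: "user", content: "How do I train a llama to deliver coffee?" }],
  ai_search_options: buildSearchOptions({
    maxNumResults: 5,
    retrievalType: "hybrid",
    matchThreshold: 0.5,
  }),
};
```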

### Response

When `stream` is set to `false` (default), the response is returned as a single JSON object:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1771886959,
  "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "To train a llama to deliver coffee, start by building trust...",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 6507,
    "completion_tokens": 137,
    "total_tokens": 6644
  },
  "chunks": [
    {
      "id": "chunk001",
      "type": "text",
      "score": 0.85,
      "text": "Llamas can carry up to 3 drinks.",
      "item": {
        "key": "llama-logistics.md",
        "timestamp": 1735689600
      },
      "scoring_details": {
        "vector_score": 0.85
      }
    }
  ]
}
```
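A caller typically needs only the generated text and the source files behind it. The sketch below pulls both out of a response shaped like the example above. `summarizeCompletion` is a hypothetical helper, and the sample object is a trimmed copy of that example.

```javascript
// Extracts the answer text, the distinct source keys, and token usage
// from a non-streaming Chat Completions response.
function summarizeCompletion(response) {
  const answer = response.choices?.[0]?.message?.content ?? "";
  const keys = (response.chunks ?? [])
    .map((chunk) => chunk.item?.key)
    .filter(Boolean);
  return {
    answer,
    sources: [...new Set(keys)], // de-duplicate files that contributed several chunks
    totalTokens: response.usage?.total_tokens ?? null,
  };
}

const sampleCompletion = {
  choices: [
    {
      index: 0,
      message: {
        role: "assistant",
        content: "To train a llama to deliver coffee, start by building trust...",
      },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 6507, completion_tokens: 137, total_tokens: 6644 },
  chunks: [
    { id: "chunk001", score: 0.85, item: { key: "llama-logistics.md" } },
    { id: "chunk002", score: 0.8, item: { key: "llama-logistics.md" } },
  ],
};

const summary = summarizeCompletion(sampleCompletion);
```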

When `stream` is set to `true`, the response is returned as server-sent events (SSE). The retrieved chunks are sent first as a single `chunks` event, followed by multiple `data` events containing the generated response in incremental pieces:

```
event: chunks
data: [{"id":"chunk001","type":"text","score":0.85,"text":"...","item":{...},"scoring_details":{...}}]

data: {"id":"id-123","created":1771887723,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"To"}}]}

data: {"id":"id-123","created":1771887723,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" train a llama"}}]}

data: [DONE]
```

This allows you to display the source chunks immediately while streaming the generated response to the user.
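The sketch below parses an SSE payload of this shape into the retrieved chunks and the concatenated answer text. `parseSseText` is a hypothetical helper that works on a complete string for simplicity. A real client would read the response body incrementally and buffer partial events, which is not shown here.

```javascript
// Parses SSE text: events are separated by blank lines, a named "chunks" event
// carries the retrieved chunks, and unnamed data events carry answer deltas.
function parseSseText(text) {
  const result = { chunks: [], content: "" };
  for (const block of text.split(/\r?\n\r?\n/)) {
    let eventName = "message";
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        eventName = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    const data = dataLines.join("\n");
    if (!data || data === "[DONE]") continue; // skip empty blocks and the end marker
    const payload = JSON.parse(data);
    if (eventName === "chunks") {
      result.chunks = payload;
    } else {
      result.content += payload.choices?.[0]?.delta?.content ?? "";
    }
  }
  return result;
}

const sampleStream = [
  "event: chunks",
  'data: [{"id":"chunk001","type":"text","score":0.85,"text":"Llamas can carry up to 3 drinks."}]',
  "",
  'data: {"choices":[{"index":0,"delta":{"content":"To"}}]}',
  "",
  'data: {"choices":[{"index":0,"delta":{"content":" train a llama"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

const streamed = parseSseText(sampleStream);
```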

## Search

This endpoint searches for results from your data source and returns the relevant chunks without generating a response:

```sh
curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai-search/instances/{AI_SEARCH_NAME}/search \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer {API_TOKEN}" \
  -d '{
    "messages": [
      {
        "content": "How do I train a llama to deliver coffee?",
        "role": "user"
      }
    ]
  }'
```

Note

* `ACCOUNT_ID`: Find this by going to [Workers & Pages](https://developers.cloudflare.com/fundamentals/account/find-account-and-zone-ids/#find-account-id-workers-and-pages) in the Cloudflare dashboard.
* `AI_SEARCH_NAME`: The name of your AI Search instance.
* `API_TOKEN`: The API token you created in the [prerequisite step](#prerequisite-get-ai-search-api-token).

### Parameters

`messages` ` array ` required

An array of message objects. Each message has:

* `content` ` string ` \- The search query content.
* `role` ` string ` \- The role: `user`, `system`, or `assistant`.

`ai_search_options` ` object ` optional

Per-request overrides for retrieval and model behavior. Supports the following nested options:

* `retrieval.filters` ` object ` \- Narrow down search results based on metadata. Refer to [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/) for syntax and examples.
* `retrieval.max_num_results` ` number ` \- Maximum number of chunks to return. Defaults to `10`, maximum `50`.
* `retrieval.retrieval_type` ` string ` \- One of `vector`, `keyword`, or `hybrid`.
* `retrieval.match_threshold` ` number ` \- Minimum similarity score (0-1). Defaults to `0.4`.
* `cache.enabled` ` boolean ` \- Override the instance-level cache setting for this request.
* `reranking.enabled` ` boolean ` \- Override the instance-level reranking setting for this request.

---

For the full list of optional parameters, refer to the [Search API reference](https://developers.cloudflare.com/api/resources/ai%5Fsearch/subresources/instances/methods/search/).

### Response

```json
{
  "success": true,
  "result": {
    "search_query": "How do I train a llama to deliver coffee?",
    "chunks": [
      {
        "id": "chunk001",
        "type": "text",
        "score": 0.85,
        "text": "Llamas can carry up to 3 drinks.",
        "item": {
          "key": "llama-logistics.md",
          "timestamp": 1735689600
        },
        "scoring_details": {
          "vector_score": 0.85
        }
      }
    ]
  }
}
```
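When several chunks come from the same file, grouping them by `item.key` makes the results easier to display. The sketch below does this for a response shaped like the example above. `chunksByItem` and its `minScore` argument are hypothetical and not part of the API.

```javascript
// Groups chunk texts by source item key, dropping chunks below minScore.
function chunksByItem(searchResponse, minScore = 0) {
  if (!searchResponse.success) {
    throw new Error("Search request failed");
  }
  const grouped = new Map();
  for (const chunk of searchResponse.result.chunks) {
    if (chunk.score < minScore) continue;
    const key = chunk.item.key;
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(chunk.text);
  }
  return grouped;
}

const sampleSearch = {
  success: true,
  result: {
    search_query: "How do I train a llama to deliver coffee?",
    chunks: [
      {
        id: "chunk001",
        type: "text",
        score: 0.85,
        text: "Llamas can carry up to 3 drinks.",
        item: { key: "llama-logistics.md", timestamp: 1735689600 },
      },
    ],
  },
};

const groupedChunks = chunksByItem(sampleSearch, 0.5);
```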


---

---
title: Workers Binding
description: Cloudflare’s serverless platform allows you to run code at the edge to build full-stack applications with Workers. A binding enables your Worker or Pages Function to interact with resources on the Cloudflare Developer Platform.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Workers Binding

Cloudflare’s serverless platform allows you to run code at the edge to build full-stack applications with [Workers](https://developers.cloudflare.com/workers/). A [binding](https://developers.cloudflare.com/workers/runtime-apis/bindings/) enables your Worker or Pages Function to interact with resources on the Cloudflare Developer Platform.

To use AI Search with Workers or Pages, create an AI binding either in the Cloudflare dashboard (refer to [AI bindings](https://developers.cloudflare.com/pages/functions/bindings/#workers-ai) for instructions) or in your [Wrangler file](https://developers.cloudflare.com/workers/wrangler/configuration/). To bind AI Search to your Worker, add the following to your Wrangler file:

wrangler.jsonc

```jsonc
{
  "ai": {
    "binding": "AI", // i.e. available in your Worker on env.AI
  },
}
```

wrangler.toml

```toml
[ai]
binding = "AI"
```

AI Search is the new name for AutoRAG

API endpoints may still reference `autorag` for the time being. Functionality remains the same, and support for the new naming will be introduced gradually.

## `aiSearch()`

This method searches for relevant results from your data source and generates a response using your default model and the retrieved context, for an AI Search named `my-autorag`:

```javascript
const answer = await env.AI.autorag("my-autorag").aiSearch({
  query: "How do I train a llama to deliver coffee?",
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  rewrite_query: true,
  max_num_results: 2,
  ranking_options: {
    score_threshold: 0.3,
  },
  reranking: {
    enabled: true,
    model: "@cf/baai/bge-reranker-base",
  },
  stream: true,
});
```

### Parameters

`query` ` string ` required

The input query.

`model` ` string ` optional

The text-generation model that is used to generate the response for the query. For a list of valid options, check the AI Search Generation model Settings. Defaults to the generation model selected in the AI Search Settings.

`system_prompt` ` string ` optional

The system prompt for generating the answer.

`rewrite_query` ` boolean ` optional

Rewrites the original query into a search optimized query to improve retrieval accuracy. Defaults to `false`.

`max_num_results` ` number ` optional

The maximum number of results that can be returned from the Vectorize database. Defaults to `10`. Must be between `1` and `50`.

`ranking_options` ` object ` optional

Configurations for customizing result ranking. Defaults to `{}`.

* `score_threshold` ` number ` optional  
   * The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`.

`reranking` ` object ` optional

Configurations for customizing reranking. Defaults to `{}`.

* `enabled` ` boolean ` optional  
   * Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`.
* `model` ` string ` optional  
   * The reranking model to use when reranking is enabled.

`stream` ` boolean ` optional

Returns a stream of results as they are available. Defaults to `false`.

`filters` ` object ` optional

Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/).
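To show where the call fits inside a Worker, here is a minimal sketch of a fetch handler that forwards a `?q=` query string to `aiSearch()`. The `handleRequest` name, the `q` parameter, and the JSON shape returned to the client are all hypothetical. The `response` and `data[].filename` fields it reads are the ones shown in the response example on this page.

```javascript
// Sketch of a Worker fetch handler: reads ?q=..., calls aiSearch() on the
// instance named "my-autorag", and returns the answer with its source files.
async function handleRequest(request, env) {
  const question = new URL(request.url).searchParams.get("q");
  if (!question) {
    return new Response("Missing ?q= parameter", { status: 400 });
  }
  const result = await env.AI.autorag("my-autorag").aiSearch({
    query: question,
    rewrite_query: true,
    max_num_results: 5,
  });
  const body = {
    answer: result.response,
    sources: result.data.map((item) => item.filename),
  };
  return new Response(JSON.stringify(body), {
    headers: { "Content-Type": "application/json" },
  });
}

// In a Worker module, you would register it as the fetch handler:
// export default { fetch: handleRequest };
```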

### Response

This is the response structure without `stream` enabled.

```jsonc
{
  "object": "vector_store.search_results.page",
  "search_query": "How do I train a llama to deliver coffee?",
  "response": "To train a llama to deliver coffee:\n\n1. **Build trust** — Llamas appreciate patience (and decaf).\n2. **Know limits** — Max 3 cups per llama, per `llama-logistics.md`.\n3. **Use voice commands** — Start with \"Espresso Express!\"\n4.",
  "data": [
    {
      "file_id": "llama001",
      "filename": "llama/logistics/llama-logistics.md",
      "score": 0.45,
      "attributes": {
        "modified_date": 1735689600000, // Unix timestamp in milliseconds for 2025-01-01
        "folder": "llama/logistics/"
      },
      "content": [
        {
          "id": "llama001",
          "type": "text",
          "text": "Llamas can carry 3 drinks max."
        }
      ]
    },
    {
      "file_id": "llama042",
      "filename": "llama/llama-commands.md",
      "score": 0.4,
      "attributes": {
        "modified_date": 1735689600000, // Unix timestamp in milliseconds for 2025-01-01
        "folder": "llama/"
      },
      "content": [
        {
          "id": "llama042",
          "type": "text",
          "text": "Start with basic commands like 'Espresso Express!' Llamas love alliteration."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
```
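One common use of this structure is to append a numbered source list to the generated answer. The sketch below is illustrative only. `formatAnswer` is a hypothetical helper, and the sample object is a shortened version of the response above.

```javascript
// Appends a numbered list of source files (with scores) to the generated answer.
function formatAnswer(result) {
  const sources = result.data.map(
    (item, index) => `[${index + 1}] ${item.filename} (score ${item.score.toFixed(2)})`
  );
  return `${result.response}\n\nSources:\n${sources.join("\n")}`;
}

const sampleResult = {
  response: "To train a llama to deliver coffee, build trust first.",
  data: [
    { filename: "llama/logistics/llama-logistics.md", score: 0.45 },
    { filename: "llama/llama-commands.md", score: 0.4 },
  ],
};

const formatted = formatAnswer(sampleResult);
```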

## `search()`

This method searches for results from your corpus and returns the relevant results, for the AI Search instance named `my-autorag`:

```javascript
const answer = await env.AI.autorag("my-autorag").search({
  query: "How do I train a llama to deliver coffee?",
  rewrite_query: true,
  max_num_results: 2,
  ranking_options: {
    score_threshold: 0.3,
  },
  reranking: {
    enabled: true,
    model: "@cf/baai/bge-reranker-base",
  },
});
```

### Parameters

`query` ` string ` required

The input query.

`rewrite_query` ` boolean ` optional

Rewrites the original query into a search optimized query to improve retrieval accuracy. Defaults to `false`.

`max_num_results` ` number ` optional

The maximum number of results that can be returned from the Vectorize database. Defaults to `10`. Must be between `1` and `50`.

`ranking_options` ` object ` optional

Configurations for customizing result ranking. Defaults to `{}`.

* `score_threshold` ` number ` optional  
   * The minimum match score required for a result to be considered a match. Defaults to `0`. Must be between `0` and `1`.

`reranking` ` object ` optional

Configurations for customizing reranking. Defaults to `{}`.

* `enabled` ` boolean ` optional  
   * Enables or disables reranking, which reorders retrieved results based on semantic relevance using a reranking model. Defaults to `false`.
* `model` ` string ` optional  
   * The reranking model to use when reranking is enabled.

`filters` ` object ` optional

Narrow down search results based on metadata, like folder and date, so only relevant content is retrieved. For more details, refer to [Metadata filtering](https://developers.cloudflare.com/ai-search/configuration/metadata/).

### Response

```jsonc
{
  "object": "vector_store.search_results.page",
  "search_query": "How do I train a llama to deliver coffee?",
  "data": [
    {
      "file_id": "llama001",
      "filename": "llama/logistics/llama-logistics.md",
      "score": 0.45,
      "attributes": {
        "modified_date": 1735689600000, // Unix timestamp in milliseconds for 2025-01-01
        "folder": "llama/logistics/"
      },
      "content": [
        {
          "id": "llama001",
          "type": "text",
          "text": "Llamas can carry 3 drinks max."
        }
      ]
    },
    {
      "file_id": "llama042",
      "filename": "llama/llama-commands.md",
      "score": 0.4,
      "attributes": {
        "modified_date": 1735689600000, // Unix timestamp in milliseconds for 2025-01-01
        "folder": "llama/"
      },
      "content": [
        {
          "id": "llama042",
          "type": "text",
          "text": "Start with basic commands like 'Espresso Express!' Llamas love alliteration."
        }
      ]
    }
  ],
  "has_more": false,
  "next_page": null
}
```
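Because `search()` returns raw results without a generated answer, a typical next step is to flatten them into a context string for your own prompt. The sketch below is one way to do that under a character budget. `toContext` and the `maxChars` limit are hypothetical and not part of the binding.

```javascript
// Flattens search() results into "filename: text" lines, stopping before
// the combined length would exceed maxChars.
function toContext(results, maxChars = 2000) {
  const parts = [];
  let used = 0;
  for (const item of results.data) {
    for (const piece of item.content) {
      if (piece.type !== "text") continue; // only text content is included
      const entry = `${item.filename}: ${piece.text}`;
      if (used + entry.length > maxChars) {
        return parts.join("\n");
      }
      parts.push(entry);
      used += entry.length;
    }
  }
  return parts.join("\n");
}

const sampleResults = {
  data: [
    {
      filename: "llama/logistics/llama-logistics.md",
      content: [{ id: "llama001", type: "text", text: "Llamas can carry 3 drinks max." }],
    },
    {
      filename: "llama/llama-commands.md",
      content: [{ id: "llama042", type: "text", text: "Start with basic commands." }],
    },
  ],
};

const context = toContext(sampleResults);
```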

## Local development

Local development is supported by proxying requests to your deployed AI Search instance. When running in local mode, your application forwards queries to the configured remote AI Search instance and returns the generated responses as if they were served locally.
