---
title: R2 SQL
description: A distributed SQL engine for R2 Data Catalog
image: https://developers.cloudflare.com/dev-products-preview.png
---


# R2 SQL

Note

R2 SQL is in **open beta**, and any developer with an [R2 subscription](https://developers.cloudflare.com/r2/pricing/) can start using it. Currently, outside of standard R2 storage and operations, you will not be billed for your use of R2 SQL. We will update [the pricing page](https://developers.cloudflare.com/r2-sql/platform/pricing) and provide at least 30 days' notice before enabling billing.

Query Apache Iceberg tables managed by R2 Data Catalog using SQL.

R2 SQL is Cloudflare's serverless, distributed analytics query engine for querying [Apache Iceberg ↗](https://iceberg.apache.org/) tables stored in [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). R2 SQL is designed to efficiently query large amounts of data by automatically utilizing file pruning, Cloudflare's distributed compute, and R2 object storage.

Terminal window

```
❯ npx wrangler r2 sql query "3373912de3f5202317188ae01300bd6_data-catalog" \
"SELECT * FROM default.transactions LIMIT 10"

 ⛅️ wrangler 4.38.0
────────────────────────────────────────────────────────────────────────────
▲ [WARNING] 🚧 `wrangler r2 sql query` is an open-beta command. Please report any issues to https://github.com/cloudflare/workers-sdk/issues/new/choose

┌─────────────────────────────┬──────────────────────────────────────┬─────────┬──────────┬──────────────────────────────────┬───────────────┬───────────────────┬──────────┐
│ __ingest_ts                 │ transaction_id                       │ user_id │ amount   │ transaction_timestamp            │ location      │ merchant_category │ is_fraud │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.872554Z │ fdc1beed-157c-4d2d-90cf-630fdea58051 │ 1679    │ 13241.59 │ 2025-09-20T02:23:04.269988+00:00 │ NEW_YORK      │ RESTAURANT        │ false    │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.724378Z │ ea7ef106-8284-4d08-9348-ad33989b6381 │ 1279    │ 17615.79 │ 2025-09-20T02:23:04.271090+00:00 │ MIAMI         │ GAS_STATION       │ true     │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.724330Z │ afcdee4d-5c71-42be-97ec-e282b6937a8c │ 1843    │ 7311.65  │ 2025-09-20T06:23:04.267890+00:00 │ SEATTLE       │ GROCERY           │ true     │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.657007Z │ b99d14e0-dbe0-49bc-a417-0ee57f8bed99 │ 1976    │ 15228.21 │ 2025-09-16T23:23:04.269426+00:00 │ NEW_YORK      │ RETAIL            │ false    │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.656992Z │ 712cd094-ad4c-4d24-819a-0d3daaaceea1 │ 1184    │ 7570.89  │ 2025-09-20T00:23:04.269163+00:00 │ LOS_ANGELES   │ RESTAURANT        │ true     │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.656912Z │ b5a1aab3-676d-4492-92b8-aabcde6db261 │ 1196    │ 46611.25 │ 2025-09-20T16:23:04.268693+00:00 │ NEW_YORK      │ RETAIL            │ true     │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.613740Z │ 432d3976-8d89-4813-9099-ea2afa2c0e70 │ 1720    │ 21547.9  │ 2025-09-20T05:23:04.273681+00:00 │ SAN FRANCISCO │ GROCERY           │ true     │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.532068Z │ 25e0b851-3092-4ade-842f-e3189e07d4ee │ 1562    │ 29311.54 │ 2025-09-20T05:23:04.277405+00:00 │ NEW_YORK      │ RETAIL            │ false    │
├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤
│ 2025-09-20T22:30:11.526037Z │ 8001746d-05fe-42fe-a189-40caf81d7aa2 │ 1817    │ 15976.5  │ 2025-09-15T16:23:04.266632+00:00 │ SEATTLE       │ RESTAURANT        │ true     │
└─────────────────────────────┴──────────────────────────────────────┴─────────┴──────────┴──────────────────────────────────┴───────────────┴───────────────────┴──────────┘
Read 11.3 kB across 4 files from R2
On average, 3.36 kB / s
```

Create an end-to-end data pipeline by following [this step by step guide](https://developers.cloudflare.com/r2-sql/get-started/), which shows you how to stream events into an Apache Iceberg table and query it with R2 SQL.


---

---
title: Getting started
description: Create your first pipeline to ingest streaming data and write to R2 Data Catalog as an Apache Iceberg table.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Getting started

This guide will walk you through:

* Creating your first [R2 bucket](https://developers.cloudflare.com/r2/buckets/) and enabling its [data catalog](https://developers.cloudflare.com/r2/data-catalog/).
* Creating an [API token](https://developers.cloudflare.com/r2/api/tokens/) needed for pipelines to authenticate with your data catalog.
* Creating your first pipeline with a simple ecommerce schema that writes to an [Apache Iceberg ↗](https://iceberg.apache.org/) table managed by R2 Data Catalog.
* Sending sample ecommerce data via HTTP endpoint.
* Validating data in your bucket and querying it with R2 SQL.

## Prerequisites

1. Sign up for a [Cloudflare account ↗](https://dash.cloudflare.com/sign-up/workers-and-pages).
2. Install [Node.js ↗](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm).

Node.js version manager

Use a Node version manager like [Volta ↗](https://volta.sh/) or [nvm ↗](https://github.com/nvm-sh/nvm) to avoid permission issues and to easily switch between Node.js versions. [Wrangler](https://developers.cloudflare.com/workers/wrangler/install-and-update/), discussed later in this guide, requires Node.js version `16.17.0` or later.

## 1. Create an R2 bucket

**Wrangler CLI**

1. If not already logged in, run:  
```  
npx wrangler login  
```
2. Create an R2 bucket:  
```  
npx wrangler r2 bucket create pipelines-tutorial  
```

**Dashboard**

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Create bucket**.
3. Enter the bucket name: `pipelines-tutorial`.
4. Select **Create bucket**.

## 2. Enable R2 Data Catalog

**Wrangler CLI**

Enable the catalog on your R2 bucket:

```
npx wrangler r2 bucket catalog enable pipelines-tutorial
```

When you run this command, take note of the "Warehouse" and "Catalog URI". You will need these later.

**Dashboard**

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: `pipelines-tutorial`.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, and select **Enable**.
4. Once enabled, note the **Catalog URI** and **Warehouse name**.

## 3. Create an API token

Pipelines must authenticate to R2 Data Catalog with an [R2 API token](https://developers.cloudflare.com/r2/api/tokens/) that has catalog and R2 permissions.

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Manage API tokens**.
3. Select **Create Account API token**.
4. Give your API token a name.
5. Under **Permissions**, choose the **Admin Read & Write** permission.
6. Select **Create Account API Token**.
7. Note the **Token value**.

Note

This token also includes the R2 SQL Read permission, which allows you to query your data with R2 SQL.

## 4. Create a pipeline

**Wrangler CLI**

First, create a schema file that defines your ecommerce data structure:

**Create `schema.json`:**

```json
{
  "fields": [
    {
      "name": "user_id",
      "type": "string",
      "required": true
    },
    {
      "name": "event_type",
      "type": "string",
      "required": true
    },
    {
      "name": "product_id",
      "type": "string",
      "required": false
    },
    {
      "name": "amount",
      "type": "float64",
      "required": false
    }
  ]
}
```
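Events that omit a required field or carry the wrong type will not match this schema, so it can help to check them client-side before sending. Below is a minimal Python sketch; `validate_event` and its type mapping are illustrative helpers, not part of Pipelines:

```python
import json

# Illustrative mapping from pipeline schema types to Python types (an assumption
# for this sketch, not an official Pipelines type table).
TYPE_CHECKS = {"string": str, "float64": (int, float)}

SCHEMA = json.loads("""
{
  "fields": [
    {"name": "user_id", "type": "string", "required": true},
    {"name": "event_type", "type": "string", "required": true},
    {"name": "product_id", "type": "string", "required": false},
    {"name": "amount", "type": "float64", "required": false}
  ]
}
""")

def validate_event(event: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the event conforms."""
    problems = []
    for field in schema["fields"]:
        name, ftype, required = field["name"], field["type"], field["required"]
        if name not in event:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(event[name], TYPE_CHECKS[ftype]):
            problems.append(f"{name}: expected {ftype}, got {type(event[name]).__name__}")
    return problems

print(validate_event({"user_id": "user_12345", "event_type": "purchase", "amount": 29.99}, SCHEMA))  # → []
print(validate_event({"event_type": "purchase"}, SCHEMA))  # → ['missing required field: user_id']
```

A check like this catches malformed events before they reach the HTTP endpoint, where the schema is actually enforced.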

Use the interactive setup to create a pipeline that writes to R2 Data Catalog:

Terminal window

```
npx wrangler pipelines setup
```

Follow the prompts:

1. **Pipeline name**: Enter `ecommerce`
2. **Stream configuration**:  
   * Enable HTTP endpoint: `yes`  
   * Require authentication: `no` (for simplicity)  
   * Configure custom CORS origins: `no`  
   * Schema definition: `Load from file`  
   * Schema file path: `schema.json` (or your file path)
3. **Sink configuration**:  
   * Destination type: `Data Catalog Table`  
   * R2 bucket name: `pipelines-tutorial`  
   * Namespace: `default`  
   * Table name: `ecommerce`  
   * Catalog API token: Enter your token from step 3  
   * Compression: `zstd`  
   * Roll file when size reaches (MB): `100`  
   * Roll file when time reaches (seconds): `10` (for faster data visibility in this tutorial)
4. **SQL transformation**: Choose `Use simple ingestion query` to use:  
```  
INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream  
```

After setup completes, note the HTTP endpoint URL displayed in the final output.

**Dashboard**

1. In the Cloudflare dashboard, go to **Pipelines** > **Pipelines**.  
[ Go to **Pipelines** ](https://dash.cloudflare.com/?to=/:account/pipelines/overview)
2. Select **Create Pipeline**.
3. **Connect to a Stream**:  
   * Pipeline name: `ecommerce`  
   * Enable HTTP endpoint for sending data: Enabled  
   * HTTP authentication: Disabled (default)  
   * Select **Next**
4. **Define Input Schema**:  
   * Select **JSON editor**  
   * Copy in the schema:  
   ```  
   {  
     "fields": [  
       {  
         "name": "user_id",  
         "type": "string",  
         "required": true  
       },  
       {  
         "name": "event_type",  
         "type": "string",  
         "required": true  
       },  
       {  
         "name": "product_id",  
         "type": "string",  
         "required": false  
       },  
       {  
         "name": "amount",  
         "type": "f64",  
         "required": false  
       }  
     ]  
   }  
   ```  
   * Select **Next**
5. **Define Sink**:  
   * Select your R2 bucket: `pipelines-tutorial`  
   * Storage type: **R2 Data Catalog**  
   * Namespace: `default`  
   * Table name: `ecommerce`  
   * **Advanced Settings**: Change **Maximum Time Interval** to `10 seconds`  
   * Select **Next**
6. **Credentials**:  
   * Disable **Automatically create an Account API token for your sink**  
   * Enter **Catalog Token** from step 3  
   * Select **Next**
7. **Pipeline Definition**:  
   * Leave the default SQL query:  
   ```  
   INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream;  
   ```  
   * Select **Create Pipeline**
8. After pipeline creation, note the **Stream ID** for the next step.

## 5. Send sample data

Send ecommerce events to your pipeline's HTTP endpoint:

Terminal window

```
curl -X POST https://{stream-id}.ingest.cloudflare.com \
  -H "Content-Type: application/json" \
  -d '[
    {
      "user_id": "user_12345",
      "event_type": "purchase",
      "product_id": "widget-001",
      "amount": 29.99
    },
    {
      "user_id": "user_67890",
      "event_type": "view_product",
      "product_id": "widget-002"
    },
    {
      "user_id": "user_12345",
      "event_type": "add_to_cart",
      "product_id": "widget-003",
      "amount": 15.50
    }
  ]'
```

Replace `{stream-id}` with the stream ID from your pipeline setup.
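If you prefer to send events from code rather than curl, the same request can be built with Python's standard library alone. This is an illustrative sketch, not part of the Pipelines tooling; the endpoint placeholder still needs your stream ID, so the actual send is left commented out:

```python
import json
import urllib.request

# Placeholder: substitute your pipeline's stream ID from the setup output.
ENDPOINT = "https://{stream-id}.ingest.cloudflare.com"

# The same three sample ecommerce events as the curl example above.
events = [
    {"user_id": "user_12345", "event_type": "purchase", "product_id": "widget-001", "amount": 29.99},
    {"user_id": "user_67890", "event_type": "view_product", "product_id": "widget-002"},
    {"user_id": "user_12345", "event_type": "add_to_cart", "product_id": "widget-003", "amount": 15.50},
]

# The endpoint accepts a JSON array of events in the request body.
body = json.dumps(events).encode("utf-8")
request = urllib.request.Request(
    ENDPOINT,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to actually send once ENDPOINT is filled in:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```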

## 6. Validate data in your bucket

1. In the Cloudflare dashboard, go to the **R2 object storage** page.
2. Select your bucket: `pipelines-tutorial`.
3. You should see Iceberg metadata files and data files created by your pipeline. If you do not see any files in your bucket yet, wait a couple of minutes and check again.
4. The data is organized in the Apache Iceberg format with metadata tracking table versions.

## 7. Query your data using R2 SQL

Set up your environment to use R2 SQL:

Terminal window

```
export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Or create a `.env` file with:

```
WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Where `YOUR_API_TOKEN` is the token you created in step 3. For more information on setting environment variables, refer to [Wrangler system environment variables](https://developers.cloudflare.com/workers/wrangler/system-environment-variables/).

Query your data:

Terminal window

```
npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "
SELECT
    user_id,
    event_type,
    product_id,
    amount
FROM default.ecommerce
WHERE event_type = 'purchase'
LIMIT 10"
```

Replace `YOUR_WAREHOUSE_NAME` with the warehouse name from step 2.

You can also query this table with any engine that supports Apache Iceberg. To learn more about connecting other engines to R2 Data Catalog, refer to [Connect to Iceberg engines](https://developers.cloudflare.com/r2/data-catalog/config-examples/).

## Learn more

[ Managing R2 Data Catalogs ](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/) Enable or disable R2 Data Catalog on your bucket, retrieve configuration details, and authenticate your Iceberg engine. 

[ Try another example ](https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline) Detailed tutorial for setting up a simple fraud detection data pipeline, and generate events for it in Python. 

[ Pipelines ](https://developers.cloudflare.com/pipelines/) Understand SQL transformations and pipeline configuration. 


---

---
title: Query data
description: Understand how to query data with R2 SQL
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Query data

Query [Apache Iceberg ↗](https://iceberg.apache.org/) tables managed by [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). R2 SQL queries can be made via [Wrangler](https://developers.cloudflare.com/workers/wrangler/) or HTTP API.

## Get your warehouse name

To query data with R2 SQL, you need the warehouse name associated with your [catalog](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/). To retrieve it, run the [r2 bucket catalog get command](https://developers.cloudflare.com/workers/wrangler/commands/r2/#r2-bucket-catalog-get):

Terminal window

```
npx wrangler r2 bucket catalog get <BUCKET_NAME>
```

Alternatively, you can find it in the dashboard by going to the **R2 object storage** page, selecting the bucket, switching to the **Settings** tab, scrolling to **R2 Data Catalog**, and finding **Warehouse name**.

## Query via Wrangler

To begin, install [npm ↗](https://docs.npmjs.com/getting-started). Then [install Wrangler, the Developer Platform CLI](https://developers.cloudflare.com/workers/wrangler/install-and-update/).

Wrangler needs an API token with permissions to access R2 Data Catalog, R2 storage, and R2 SQL to execute queries. The `r2 sql query` command looks for the token in the `WRANGLER_R2_SQL_AUTH_TOKEN` environment variable.

Set up your environment:

Terminal window

```
export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Or create a `.env` file with:

```
WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Where `YOUR_API_TOKEN` is the token you created with the [required permissions](#authentication). For more information on setting environment variables, refer to [Wrangler system environment variables](https://developers.cloudflare.com/workers/wrangler/system-environment-variables/).

To run a SQL query, run the [r2 sql query command](https://developers.cloudflare.com/workers/wrangler/commands/r2/#r2-sql-query):

Terminal window

```
npx wrangler r2 sql query <WAREHOUSE> "SELECT * FROM namespace.table_name LIMIT 10;"
```

For a full list of supported SQL commands, refer to the [R2 SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

## Query via API

Below is an example of using R2 SQL via the REST endpoint:

Terminal window

```
curl -X POST \
  "https://api.sql.cloudflarestorage.com/api/v1/accounts/{ACCOUNT_ID}/r2-sql/query/{BUCKET_NAME}" \
  -H "Authorization: Bearer ${WRANGLER_R2_SQL_AUTH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT * FROM namespace.table_name LIMIT 10;"
  }'
```

The API requires an API token with the appropriate permissions in the Authorization header. Refer to [Authentication](#authentication) for details on creating a token.
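The same call can be made from Python using only the standard library. This sketch builds the request without sending it; `build_query_request` is an illustrative helper (not a Cloudflare API), and the account ID, bucket name, and token are placeholders:

```python
import json
import os
import urllib.request

def build_query_request(account_id: str, bucket: str, sql: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the R2 SQL REST endpoint."""
    url = (
        "https://api.sql.cloudflarestorage.com/api/v1/accounts/"
        f"{account_id}/r2-sql/query/{bucket}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"query": sql}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_query_request(
    account_id="YOUR_ACCOUNT_ID",  # placeholder
    bucket="pipelines-tutorial",   # placeholder bucket name
    sql="SELECT * FROM namespace.table_name LIMIT 10;",
    token=os.environ.get("WRANGLER_R2_SQL_AUTH_TOKEN", "YOUR_API_TOKEN"),
)
# Send with: urllib.request.urlopen(request)
```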

For a full list of supported SQL commands, refer to the [R2 SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

## Authentication

To query data with R2 SQL, you must provide a Cloudflare API token with R2 SQL, R2 Data Catalog, and R2 storage permissions. R2 SQL requires these permissions to access catalog metadata and read the underlying data files stored in R2.

### Create API token in the dashboard

Create an [R2 API token](https://developers.cloudflare.com/r2/api/tokens/#permissions) with the following permissions:

* Access to R2 Data Catalog (read-only)
* Access to R2 storage (Admin read/write)
* Access to R2 SQL (read-only)

Use this token value for the `WRANGLER_R2_SQL_AUTH_TOKEN` environment variable when querying with Wrangler, or in the Authorization header when using the REST API.

### Create API token via API

To create an API token programmatically for use with R2 SQL, you'll need to specify R2 SQL, R2 Data Catalog, and R2 storage permission groups in your [Access Policy](https://developers.cloudflare.com/r2/api/tokens/#access-policy).

#### Example Access Policy

```json
[
  {
    "id": "f267e341f3dd4697bd3b9f71dd96247f",
    "effect": "allow",
    "resources": {
      "com.cloudflare.edge.r2.bucket.4793d734c0b8e484dfc37ec392b5fa8a_default_my-bucket": "*",
      "com.cloudflare.edge.r2.bucket.4793d734c0b8e484dfc37ec392b5fa8a_eu_my-eu-bucket": "*"
    },
    "permission_groups": [
      {
        "id": "f45430d92e2b4a6cb9f94f2594c141b8",
        "name": "Workers R2 SQL Read"
      },
      {
        "id": "d229766a2f7f4d299f20eaa8c9b1fde9",
        "name": "Workers R2 Data Catalog Write"
      },
      {
        "id": "bf7481a1826f439697cb59a20b22293e",
        "name": "Workers R2 Storage Write"
      }
    ]
  }
]
```

To learn more about how to create API tokens for R2 SQL using the API, including required permission groups and usage examples, refer to the [Create API tokens via API documentation](https://developers.cloudflare.com/r2/api/tokens/#create-api-tokens-via-api).
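As an illustrative sketch of the programmatic flow, the following Python assembles a token-creation payload around an Access Policy like the one above. The payload shape and the account tokens endpoint are assumptions to verify against the linked token documentation:

```python
import json

# Access Policy from the example above: R2 SQL Read + Data Catalog Write +
# Storage Write, scoped to a specific bucket.
policy = {
    "effect": "allow",
    "resources": {
        "com.cloudflare.edge.r2.bucket.4793d734c0b8e484dfc37ec392b5fa8a_default_my-bucket": "*"
    },
    "permission_groups": [
        {"id": "f45430d92e2b4a6cb9f94f2594c141b8", "name": "Workers R2 SQL Read"},
        {"id": "d229766a2f7f4d299f20eaa8c9b1fde9", "name": "Workers R2 Data Catalog Write"},
        {"id": "bf7481a1826f439697cb59a20b22293e", "name": "Workers R2 Storage Write"},
    ],
}

# "r2-sql-token" is an arbitrary illustrative name.
payload = {"name": "r2-sql-token", "policies": [policy]}

# POST this payload to the account tokens endpoint described in the linked docs.
print(json.dumps(payload, indent=2))
```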

## Additional resources

[ Manage R2 Data Catalogs ](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/) Enable or disable R2 Data Catalog on your bucket, retrieve configuration details, and authenticate your Iceberg engine. 

[ Build an end to end data pipeline ](https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline) Detailed tutorial for setting up a simple fraud detection data pipeline, and generate events for it in Python. 


---

---
title: SQL reference
description: Comprehensive reference for SQL syntax, functions, and data types supported in R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# SQL reference

Note

R2 SQL is in open beta. Supported SQL grammar may change over time.

R2 SQL is Cloudflare's serverless, distributed analytics query engine for querying [Apache Iceberg ↗](https://iceberg.apache.org/) tables stored in [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). This page documents the supported SQL syntax.

---

## Query syntax

```
SELECT column_list | expression | aggregation_function
FROM namespace_name.table_name
[WHERE conditions]
[GROUP BY column_list]
[HAVING conditions]
[ORDER BY expression [ASC | DESC]]
[LIMIT number]
```

---

## Schema discovery commands

### SHOW DATABASES

Lists all available namespaces.

```
SHOW DATABASES;
```

### SHOW NAMESPACES

Alias for `SHOW DATABASES`. Lists all available namespaces.

```
SHOW NAMESPACES;
```

### SHOW TABLES

Lists all tables within a specific namespace.

```
SHOW TABLES IN namespace_name;
```

### DESCRIBE

Describes the structure of a table, showing column names and data types.

```
DESCRIBE namespace_name.table_name;
```

---

## SELECT clause

### Syntax

```
SELECT column_specification [, column_specification, ...]
```

### Column specification

* **Column name**: `column_name`
* **All columns**: `*`
* **Qualified wildcard**: `table_name.*`
* **Column alias**: `column_name AS alias`
* **Expressions**: arithmetic, function calls, CASE expressions, and casts

### Examples

```
SELECT * FROM my_namespace.sales_data LIMIT 10

SELECT customer_id, region, total_amount FROM my_namespace.sales_data LIMIT 10

SELECT region, total_amount * 1.1 AS total_with_tax FROM my_namespace.sales_data LIMIT 10
```

---

## Common table expressions (CTEs)

Common table expressions, defined with `WITH`, let you name temporary result sets and reference them in the main query. All CTEs must reference the same single table.

### Syntax

```
WITH cte_name AS (
    SELECT ...
    FROM namespace_name.table_name
    [WHERE ...]
)
SELECT ... FROM cte_name
```

### Chained CTEs

A CTE can reference a previously defined CTE. All CTEs in the chain must derive from the same underlying table.

```
WITH filtered AS (
    SELECT customer_id, department, total_amount
    FROM my_namespace.sales_data
    WHERE total_amount > 0
),
summary AS (
    SELECT department,
           COUNT(*) AS order_count,
           round(AVG(total_amount), 2) AS avg_amount
    FROM filtered
    GROUP BY department
)
SELECT *
FROM summary
WHERE order_count > 100
ORDER BY avg_amount DESC
```

Note

CTEs must reference a single table. Multi-table CTEs, JOINs within CTEs, and cross-table references are not supported.

---

## FROM clause

### Syntax

```
SELECT * FROM namespace_name.table_name
```

R2 SQL queries reference exactly one table, specified as `namespace_name.table_name`.

---

## WHERE clause

### Syntax

```
SELECT * FROM namespace_name.table_name WHERE condition [AND | OR condition ...]
```

### Conditions

#### Comparison operators

`=`, `!=`, `<>`, `<`, `>`, `<=`, `>=`

#### Null checks

* `column_name IS NULL`
* `column_name IS NOT NULL`

#### Boolean checks

* `IS TRUE`, `IS FALSE`, `IS NOT TRUE`, `IS NOT FALSE`
* `IS UNKNOWN`, `IS NOT UNKNOWN`

#### Range

* `column_name BETWEEN value1 AND value2`
* `column_name NOT BETWEEN value1 AND value2`

#### List membership

* `column_name IN ('value1', 'value2')`
* `column_name NOT IN ('value1', 'value2')`

#### Pattern matching

* `column_name LIKE 'pattern'`
* `column_name NOT LIKE 'pattern'`
* `column_name ILIKE 'pattern'` (case-insensitive)
* `column_name NOT ILIKE 'pattern'`
* `column_name SIMILAR TO 'regex_pattern'`

#### Logical operators

* `AND`
* `OR`
* `NOT`

### Examples

```
SELECT * FROM my_namespace.sales_data
WHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'

SELECT * FROM my_namespace.sales_data
WHERE status = 200 AND response_time > 1000

SELECT * FROM my_namespace.sales_data
WHERE (region = 'North' OR region = 'South')
  AND total_amount IS NOT NULL

SELECT * FROM my_namespace.sales_data
WHERE department ILIKE '%eng%'
```

---

## GROUP BY clause

### Syntax

```
SELECT column_list, aggregation_function(column)
FROM namespace_name.table_name
[WHERE conditions]
GROUP BY column_list
```

### Examples

```
SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department

SELECT department, category, SUM(total_amount) AS total
FROM my_namespace.sales_data
GROUP BY department, category
```

---

## HAVING clause

### Syntax

```
SELECT column_list, aggregation_function(column) AS alias
FROM namespace_name.table_name
GROUP BY column_list
HAVING aggregation_function(column) comparison_operator value
```

### Examples

```
SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department
HAVING COUNT(*) > 1000

SELECT region, SUM(total_amount) AS total
FROM my_namespace.sales_data
GROUP BY region
HAVING SUM(total_amount) > 1000000
```

---

## ORDER BY clause

### Syntax

```
ORDER BY expression [ASC | DESC] [, expression [ASC | DESC], ...]
```

* **ASC**: Ascending order (default)
* **DESC**: Descending order
* Multi-column ordering is supported

### Examples

```
SELECT customer_id, total_amount
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
ORDER BY total_amount DESC
LIMIT 50

SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
GROUP BY department
ORDER BY dept_count DESC, department ASC
```

---

## LIMIT clause

### Syntax

```
LIMIT number
```

* **Type**: Integer only
* **Default**: 500

### Examples

```
SELECT * FROM my_namespace.sales_data LIMIT 100
```

---

## EXPLAIN

Returns the execution plan for a query without running it.

```
EXPLAIN SELECT department, COUNT(*) AS dept_count
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
GROUP BY department;
```

---

## Expressions

Expressions can be used in `SELECT`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY` clauses.

### Literals

```
SELECT 42 AS int_val, 3.14 AS float_val, 'hello' AS str_val, TRUE AS bool_val, NULL AS null_val
FROM my_namespace.sales_data LIMIT 1
```

### Arithmetic operators

`+`, `-`, `*`, `/`, `%`

```
SELECT customer_id, total_amount * 1.1 AS total_with_tax, total_amount % 10 AS remainder
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
LIMIT 5
```

### String concatenation

```
SELECT customer_id || ' - ' || region AS label
FROM my_namespace.sales_data
LIMIT 5
```

### CASE expressions

Searched form:

```
SELECT customer_id,
    CASE
        WHEN total_amount > 1000 THEN 'high'
        WHEN total_amount > 100 THEN 'medium'
        ELSE 'low'
    END AS tier
FROM my_namespace.sales_data
LIMIT 10
```

Simple form:

```
SELECT customer_id,
    CASE region
        WHEN 'North' THEN 'N'
        WHEN 'South' THEN 'S'
        ELSE 'Other'
    END AS region_code
FROM my_namespace.sales_data
LIMIT 10
```

### Type casting

```
-- CAST
SELECT CAST(total_amount AS INT) AS amount_int FROM my_namespace.sales_data LIMIT 5

-- TRY_CAST (returns NULL on failure instead of error)
SELECT TRY_CAST(customer_id AS INT) AS id_int FROM my_namespace.sales_data LIMIT 5

-- Shorthand (::)
SELECT total_amount::INT AS amount_int FROM my_namespace.sales_data LIMIT 5
```

### EXTRACT

```
SELECT EXTRACT(YEAR FROM timestamp) AS yr,
       EXTRACT(MONTH FROM timestamp) AS mo,
       EXTRACT(DAY FROM timestamp) AS dy
FROM my_namespace.sales_data
LIMIT 1
```

---

## Data type reference

| Type      | Description          | Example Values               |
| --------- | -------------------- | ---------------------------- |
| integer   | Whole numbers        | 1, 42, -10, 0                |
| float     | Decimal numbers      | 1.5, 3.14, -2.7, 0.0         |
| string    | Text values          | 'hello', 'GET', '2024-01-01' |
| boolean   | Boolean values       | true, false                  |
| timestamp | RFC 3339 timestamps  | '2025-09-24T01:00:00Z'       |
| date      | Date values          | '2025-09-24'                 |
| struct    | Named fields         | struct_col['field_name']     |
| array     | Ordered list         | array_col[1] (1-indexed)     |
| map       | Key-value pairs      | map_keys(map_col)            |

---

## Operator precedence

1. **Comparison operators**: `=`, `!=`, `<`, `<=`, `>`, `>=`, `LIKE`, `BETWEEN`, `IS NULL`, `IS NOT NULL`
2. **AND** (higher precedence)
3. **OR** (lower precedence)

Use parentheses to override default precedence:

```
SELECT * FROM my_namespace.sales_data WHERE (status = 404 OR status = 500) AND region = 'North'
```

---

## Complete query examples

### Basic query

```
SELECT *
FROM my_namespace.sales_data
WHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'
LIMIT 100
```

### Filtered query with sorting

```
SELECT customer_id, timestamp, status, total_amount
FROM my_namespace.sales_data
WHERE status >= 400 AND total_amount > 5000
ORDER BY total_amount DESC
LIMIT 50
```

### Aggregation with HAVING

```
SELECT region, COUNT(*) AS region_count, AVG(total_amount) AS avg_amount
FROM my_namespace.sales_data
WHERE status = 'completed'
GROUP BY region
HAVING COUNT(*) > 1000
ORDER BY avg_amount DESC
LIMIT 20
```

### Conditional categorization

```
SELECT customer_id,
    CASE
        WHEN total_amount >= 1000 THEN 'Premium'
        WHEN total_amount >= 100 THEN 'Standard'
        ELSE 'Basic'
    END AS tier,
    total_amount
FROM my_namespace.sales_data
WHERE total_amount IS NOT NULL
ORDER BY total_amount DESC
LIMIT 20
```


---

---
title: Aggregate functions
description: Reference for all 33 aggregate functions supported in R2 SQL, organized by category.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Aggregate functions

Aggregate functions collapse multiple rows into a single result. They are used with `GROUP BY` to compute summaries per group, or without `GROUP BY` to compute a single result across all rows.

Note

The following aggregates are not supported: `PERCENTILE_CONT`, `MEDIAN`, `ARRAY_AGG`, `STRING_AGG`, and any `func(DISTINCT ...)`. Use the approximate alternatives where available.

---

## Basic aggregates

### COUNT

Counts rows. `COUNT(*)` counts all rows. `COUNT(column)` counts non-NULL values.

```

SELECT COUNT(*) AS total_rows

FROM my_namespace.sales_data


SELECT department, COUNT(*) AS dept_count

FROM my_namespace.sales_data

GROUP BY department

ORDER BY dept_count DESC


```

### SUM

Returns the sum of values in a column.

```

SELECT SUM(total_amount) AS grand_total

FROM my_namespace.sales_data


SELECT department, SUM(total_amount) AS dept_total

FROM my_namespace.sales_data

GROUP BY department

ORDER BY dept_total DESC


```

### AVG

Returns the average of values in a column. Alias: `mean`.

```

SELECT AVG(total_amount) AS avg_amount

FROM my_namespace.sales_data


SELECT department, AVG(total_amount) AS avg_amount

FROM my_namespace.sales_data

GROUP BY department

ORDER BY avg_amount DESC


```

### MIN

Returns the minimum value. Works on numeric and string columns.

```

SELECT MIN(total_amount) AS min_amount, MIN(customer_id) AS first_customer

FROM my_namespace.sales_data


SELECT department, MIN(total_amount) AS min_amount

FROM my_namespace.sales_data

GROUP BY department


```

### MAX

Returns the maximum value. Works on numeric and string columns.

```

SELECT MAX(total_amount) AS max_amount, MAX(customer_id) AS last_customer

FROM my_namespace.sales_data


SELECT department, MAX(total_amount) AS max_amount

FROM my_namespace.sales_data

GROUP BY department


```

---

## Approximate aggregates

Approximate aggregation functions produce statistically estimated results while using significantly less memory and compute than their exact counterparts. Use them when analyzing large datasets and an approximate result is acceptable.

### approx\_percentile\_cont

Returns the approximate value at a given percentile using the t-digest algorithm. The percentile parameter must be between `0.0` and `1.0` inclusive.

```

SELECT approx_percentile_cont(total_amount, 0.5) AS median,

       approx_percentile_cont(total_amount, 0.95) AS p95

FROM my_namespace.sales_data


SELECT department,

       approx_percentile_cont(total_amount, 0.5) AS median

FROM my_namespace.sales_data

GROUP BY department

ORDER BY median DESC


```
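
For intuition about what `approx_percentile_cont` estimates, the exact continuous percentile uses linear interpolation between neighboring sorted values. A minimal Python sketch on hypothetical data (this is the exact computation, not the t-digest approximation the engine uses):

```python
def percentile_cont(values, p):
    """Exact continuous percentile via linear interpolation (0.0 <= p <= 1.0)."""
    xs = sorted(values)
    k = (len(xs) - 1) * p          # fractional rank into the sorted list
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

amounts = [10, 20, 30, 40]
print(percentile_cont(amounts, 0.5))   # falls halfway between 20 and 30
```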

### approx\_percentile\_cont\_with\_weight

Returns the approximate weighted percentile. The second argument supplies each row's weight (in the example below, each `unit_price` is weighted by its `quantity`).

```

SELECT approx_percentile_cont_with_weight(unit_price, quantity, 0.5) AS weighted_median

FROM my_namespace.sales_data

WHERE unit_price IS NOT NULL AND quantity IS NOT NULL


```

### approx\_median

Returns the approximate median. Equivalent to `approx_percentile_cont(column, 0.5)`.

```

SELECT approx_median(total_amount) AS median_amount

FROM my_namespace.sales_data


SELECT department, approx_median(total_amount) AS median

FROM my_namespace.sales_data

GROUP BY department


```

### approx\_distinct

Returns the approximate count of distinct values using HyperLogLog.

```

SELECT approx_distinct(customer_id) AS unique_customers

FROM my_namespace.sales_data


SELECT department, approx_distinct(customer_id) AS unique_customers

FROM my_namespace.sales_data

GROUP BY department


```

### approx\_top\_k

Returns the _k_ most frequent values with their approximate counts.

```

SELECT approx_top_k(department, 5) AS top_departments

FROM my_namespace.sales_data


```

---

## Statistical aggregates

### var / var\_samp

Returns the sample variance.

```

SELECT var(total_amount) AS variance

FROM my_namespace.sales_data


SELECT department, var(total_amount) AS variance

FROM my_namespace.sales_data

GROUP BY department


```

### var\_pop

Returns the population variance.

```

SELECT var_pop(total_amount) AS pop_variance

FROM my_namespace.sales_data


```

### stddev / stddev\_samp

Returns the sample standard deviation.

```

SELECT stddev(total_amount) AS std_dev

FROM my_namespace.sales_data


SELECT department, stddev(total_amount) AS std_dev

FROM my_namespace.sales_data

GROUP BY department


```

### stddev\_pop

Returns the population standard deviation.

```

SELECT stddev_pop(total_amount) AS pop_std_dev

FROM my_namespace.sales_data


```
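
The difference between the sample and population variants is the denominator: sample statistics divide by n - 1, population statistics by n. Python's `statistics` module mirrors the pair, shown here on hypothetical values:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(data)   # divides by n - 1, like stddev / stddev_samp
pop_sd = statistics.pstdev(data)     # divides by n, like stddev_pop

print(sample_sd, pop_sd)
```

The sample estimate is always the larger of the two for the same data.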

### covar\_samp

Returns the sample covariance. Alias: `covar`.

```

SELECT covar_samp(total_amount, CAST(quantity AS DOUBLE)) AS covariance

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### covar\_pop

Returns the population covariance.

```

SELECT covar_pop(total_amount, CAST(quantity AS DOUBLE)) AS pop_covariance

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### corr

Returns the Pearson correlation coefficient between two columns.

```

SELECT corr(total_amount, CAST(quantity AS DOUBLE)) AS correlation

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_slope

Returns the slope of the linear regression line.

```

SELECT regr_slope(total_amount, CAST(quantity AS DOUBLE)) AS slope

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_intercept

Returns the y-intercept of the linear regression line.

```

SELECT regr_intercept(total_amount, CAST(quantity AS DOUBLE)) AS intercept

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_count

Returns the count of non-NULL pairs.

```

SELECT regr_count(total_amount, CAST(quantity AS DOUBLE)) AS pair_count

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_r2

Returns the coefficient of determination (R-squared).

```

SELECT regr_r2(total_amount, CAST(quantity AS DOUBLE)) AS r_squared

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_avgx

Returns the average of the independent variable (x) for non-NULL pairs.

```

SELECT regr_avgx(total_amount, CAST(quantity AS DOUBLE)) AS avg_qty

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_avgy

Returns the average of the dependent variable (y) for non-NULL pairs.

```

SELECT regr_avgy(total_amount, CAST(quantity AS DOUBLE)) AS avg_amount

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_sxx

Returns the sum of squares of the independent variable.

```

SELECT regr_sxx(total_amount, CAST(quantity AS DOUBLE)) AS sxx

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_syy

Returns the sum of squares of the dependent variable.

```

SELECT regr_syy(total_amount, CAST(quantity AS DOUBLE)) AS syy

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

### regr\_sxy

Returns the sum of products of the paired variables.

```

SELECT regr_sxy(total_amount, CAST(quantity AS DOUBLE)) AS sxy

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL AND quantity IS NOT NULL


```

---

## Bitwise aggregates

### bit\_and

Returns the bitwise AND of all values in a group.

```

SELECT department, bit_and(quantity) AS and_result

FROM my_namespace.sales_data

WHERE quantity IS NOT NULL

GROUP BY department


```

### bit\_or

Returns the bitwise OR of all values in a group.

```

SELECT department, bit_or(quantity) AS or_result

FROM my_namespace.sales_data

WHERE quantity IS NOT NULL

GROUP BY department


```

### bit\_xor

Returns the bitwise XOR of all values in a group.

```

SELECT department, bit_xor(quantity) AS xor_result

FROM my_namespace.sales_data

WHERE quantity IS NOT NULL

GROUP BY department


```
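
Conceptually, each bitwise aggregate folds its operator across every value in the group. A Python sketch of the same folding, using hypothetical per-row bitmasks:

```python
from functools import reduce
import operator

flags = [0b1100, 0b1010, 0b1110]  # hypothetical per-row bitmasks in one group

and_result = reduce(operator.and_, flags)  # bits set in every row (like bit_and)
or_result = reduce(operator.or_, flags)    # bits set in at least one row (like bit_or)
xor_result = reduce(operator.xor, flags)   # per-bit parity across rows (like bit_xor)

print(bin(and_result), bin(or_result), bin(xor_result))
```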

---

## Boolean aggregates

### bool\_and

Returns true if all values in a group are true.

```

SELECT department, bool_and(is_completed) AS all_completed

FROM my_namespace.sales_data

WHERE is_completed IS NOT NULL

GROUP BY department


```

### bool\_or

Returns true if any value in a group is true.

```

SELECT department, bool_or(is_completed) AS any_completed

FROM my_namespace.sales_data

WHERE is_completed IS NOT NULL

GROUP BY department


```
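
These behave like Python's built-in `all()` and `any()` applied to the group's values, as a quick sketch on hypothetical values shows:

```python
rows = [True, True, False]  # hypothetical is_completed values within one group

print(all(rows), any(rows))  # all() ~ bool_and, any() ~ bool_or
```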

---

## Positional aggregates

### first\_value

Returns the first value in a group according to the specified ordering.

```

SELECT department,

       first_value(customer_id ORDER BY total_amount ASC) AS lowest_spender

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

GROUP BY department


```

### last\_value

Returns the last value in a group according to the specified ordering.

```

SELECT department,

       last_value(customer_id ORDER BY total_amount ASC) AS highest_spender

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

GROUP BY department


```


---

---
title: Complex types
description: Reference for querying struct, array, and map column types in R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Complex types

R2 SQL supports querying struct, array, and map column types stored in Iceberg tables. This page covers access patterns, supported functions, and examples for each type.

---

## Structs

Struct columns contain named fields. Access fields using bracket notation or the `get_field()` function.

### Bracket notation

```

SELECT pricing['price'] AS price,

       pricing['discount_percent'] AS discount

FROM my_namespace.products

LIMIT 5


```

### get\_field function

```

SELECT get_field(pricing, 'price') AS price,

       get_field(pricing, 'discount_percent') AS discount

FROM my_namespace.products

LIMIT 5


```

### Struct fields in WHERE

```

SELECT customer_id, pricing['price'] AS price

FROM my_namespace.products

WHERE pricing['price'] > 50

LIMIT 10


```

### Struct fields in ORDER BY

```

SELECT customer_id, pricing['price'] AS price

FROM my_namespace.products

WHERE pricing['price'] IS NOT NULL

ORDER BY pricing['price'] DESC

LIMIT 10


```

### Struct fields in GROUP BY

```

SELECT platforms['windows'] AS windows_support,

       COUNT(*) AS product_count,

       AVG(pricing['price']) AS avg_price

FROM my_namespace.products

WHERE pricing['price'] IS NOT NULL

GROUP BY platforms['windows']


```

### Creating structs inline

```

-- named_struct creates a struct with named fields

SELECT named_struct('id', customer_id, 'amount', total_amount) AS info

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


-- struct creates a struct with positional fields

SELECT struct(customer_id, total_amount, region) AS info

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


```

---

## Arrays

Array columns contain ordered lists of values. Array indexing is **1-based**.

### Index access

```

SELECT customer_id, tags[1] AS first_tag, tags[2] AS second_tag

FROM my_namespace.products

LIMIT 5


```

### Create arrays

#### make\_array

Creates an array from a list of values.

```

SELECT make_array(1, 2, 3) AS nums

FROM my_namespace.sales_data

LIMIT 1


```

#### string\_to\_array

Splits a string into an array by a delimiter.

```

SELECT string_to_array(categories, ',') AS cat_array

FROM my_namespace.products

WHERE categories IS NOT NULL

LIMIT 5


```

#### range

Generates an array of integers from start (inclusive) to stop (exclusive).

```

SELECT range(0, 5) AS nums

FROM my_namespace.sales_data

LIMIT 1


```

#### generate\_series

Generates an array of integers from start to stop (inclusive).

```

SELECT generate_series(1, 5) AS nums

FROM my_namespace.sales_data

LIMIT 1


```
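
The exclusive-stop versus inclusive-stop distinction matches Python's built-in `range`, which is also stop-exclusive:

```python
# range(start, stop) excludes the stop value, like R2 SQL's range().
exclusive = list(range(0, 5))

# generate_series includes the stop value; mimic it by adding 1 to the stop.
inclusive = list(range(1, 5 + 1))

print(exclusive, inclusive)
```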

### Inspect arrays

#### array\_length

Returns the number of elements in an array.

```

SELECT customer_id, array_length(tags) AS tag_count

FROM my_namespace.products

LIMIT 5


```

#### cardinality

Returns the total number of elements in an array. Alias for `array_length`.

```

SELECT customer_id, cardinality(tags) AS tag_count

FROM my_namespace.products

LIMIT 5


```

#### empty

Returns true if an array has zero elements.

```

SELECT customer_id, empty(tags) AS has_no_tags

FROM my_namespace.products

LIMIT 5


```

#### array\_ndims

Returns the number of dimensions of an array.

```

SELECT array_ndims(make_array(1, 2, 3)) AS ndims

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_dims

Returns the dimensions of an array.

```

SELECT array_dims(make_array(1, 2, 3)) AS dims

FROM my_namespace.sales_data

LIMIT 1


```

### Search arrays

#### array\_has

Returns true if an array contains a value.

```

SELECT customer_id, array_has(tags, 'premium') AS is_premium

FROM my_namespace.products

LIMIT 5


```

#### array\_has\_all

Returns true if the first array contains all elements of the second.

```

SELECT array_has_all(make_array(1, 2, 3, 4), make_array(2, 3)) AS has_all

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_has\_any

Returns true if the first array contains any element of the second.

```

SELECT array_has_any(make_array(1, 2, 3), make_array(3, 4, 5)) AS has_any

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_position

Returns the position of the first occurrence of a value (1-indexed). Returns 0 if not found.

```

SELECT array_position(make_array('a', 'b', 'c', 'b'), 'b') AS pos

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_positions

Returns all positions of a value as an array.

```

SELECT array_positions(make_array(1, 2, 1, 3, 1), 1) AS positions

FROM my_namespace.sales_data

LIMIT 1


```

### Transform arrays

#### array\_sort

Sorts array elements.

```

SELECT array_sort(make_array(3, 1, 2)) AS sorted

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_reverse

Reverses the order of array elements.

```

SELECT array_reverse(make_array(1, 2, 3)) AS reversed

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_distinct

Removes duplicate elements from an array.

```

SELECT array_distinct(make_array(1, 2, 2, 3, 3, 3)) AS unique_vals

FROM my_namespace.sales_data

LIMIT 1


```

#### flatten

Flattens a nested array by one level.

```

SELECT flatten(make_array(make_array(1, 2), make_array(3, 4))) AS flat

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_slice

Returns a slice of an array from a start index to an end index (both inclusive, 1-indexed).

```

SELECT array_slice(make_array(10, 20, 30, 40, 50), 2, 4) AS sliced

FROM my_namespace.sales_data

LIMIT 1


```

### Modify arrays

#### array\_append

Appends a value to the end of an array.

```

SELECT array_append(make_array(1, 2, 3), 4) AS appended

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_prepend

Prepends a value to the beginning of an array.

```

SELECT array_prepend(0, make_array(1, 2, 3)) AS prepended

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_concat

Concatenates two or more arrays.

```

SELECT array_concat(make_array(1, 2), make_array(3, 4)) AS merged

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_remove

Removes the first occurrence of a value from an array.

```

SELECT array_remove(make_array(1, 2, 3, 2), 2) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_remove\_all

Removes all occurrences of a value from an array.

```

SELECT array_remove_all(make_array(1, 2, 3, 2, 2), 2) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_remove\_n

Removes the first _n_ occurrences of a value from an array.

```

SELECT array_remove_n(make_array(1, 2, 2, 2, 3), 2, 2) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_replace

Replaces the first occurrence of a value in an array.

```

SELECT array_replace(make_array(1, 2, 3), 2, 99) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_replace\_n

Replaces the first _n_ occurrences of a value in an array.

```

SELECT array_replace_n(make_array(1, 2, 2, 2, 3), 2, 99, 2) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_replace\_all

Replaces all occurrences of a value in an array.

```

SELECT array_replace_all(make_array(1, 2, 3, 2), 2, 99) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_pop\_back

Removes the last element from an array.

```

SELECT array_pop_back(make_array(1, 2, 3)) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_pop\_front

Removes the first element from an array.

```

SELECT array_pop_front(make_array(1, 2, 3)) AS result

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_repeat

Repeats a value a given number of times as an array.

```

SELECT array_repeat(region, 3) AS repeated

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_resize

Resizes an array to a given length, filling with a default value.

```

SELECT array_resize(make_array(1, 2), 5, 0) AS resized

FROM my_namespace.sales_data

LIMIT 1


```

### Set operations on arrays

#### array\_intersect

Returns elements common to both arrays.

```

SELECT array_intersect(make_array(1, 2, 3), make_array(2, 3, 4)) AS common

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_union

Returns all unique elements from both arrays.

```

SELECT array_union(make_array(1, 2, 3), make_array(3, 4, 5)) AS merged

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_except

Returns elements in the first array that are not in the second.

```

SELECT array_except(make_array(1, 2, 3, 4), make_array(2, 4)) AS diff

FROM my_namespace.sales_data

LIMIT 1


```
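
A Python sketch of the three set operations on lists, for intuition only (the exact ordering and duplicate handling of the SQL functions may differ; verify against your data):

```python
a = [1, 2, 3, 4]
b = [2, 4, 5]

intersect = [x for x in a if x in b]       # elements of a also present in b
union = a + [x for x in b if x not in a]   # all elements, b's duplicates dropped
except_ = [x for x in a if x not in b]     # elements of a absent from b

print(intersect, union, except_)
```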

### Aggregate array values

#### array\_max

Returns the maximum value in an array.

```

SELECT customer_id, array_max(scores) AS max_score

FROM my_namespace.products

LIMIT 5


```

#### array\_min

Returns the minimum value in an array.

```

SELECT customer_id, array_min(scores) AS min_score

FROM my_namespace.products

LIMIT 5


```

#### array\_any\_value

Returns the first non-NULL value in an array.

```

SELECT array_any_value(make_array(NULL, 42, NULL)) AS first_val

FROM my_namespace.sales_data

LIMIT 1


```

#### array\_element

Returns the element at a given index (1-indexed). Equivalent to bracket-notation access (`arr[idx]`).

```

SELECT array_element(make_array(10, 20, 30), 2) AS second_val

FROM my_namespace.sales_data

LIMIT 1


```

### Convert arrays

#### array\_to\_string

Joins array elements into a string with a separator.

```

SELECT customer_id, array_to_string(tags, ', ') AS tag_list

FROM my_namespace.products

LIMIT 5


```

---

## Maps

Map columns store key-value pairs. Use `map_keys`, `map_values`, and `map_extract` to query them.

### map\_keys

Returns all keys from a map as an array.

```

SELECT map_keys(metadata) AS keys

FROM my_namespace.products

LIMIT 5


```

### map\_values

Returns all values from a map as an array.

```

SELECT map_values(metadata) AS vals

FROM my_namespace.products

LIMIT 5


```

### map\_extract

Returns the value for a specific key.

```

SELECT map_extract(metadata, 'source') AS source,

       map_extract(metadata, 'store_name') AS store

FROM my_namespace.products

LIMIT 5


```

### Creating maps inline

```

SELECT map(make_array('a', 'b'), make_array(1, 2)) AS m

FROM my_namespace.sales_data

LIMIT 1


```

---

## Complete function index

### Struct functions

| Function                     | Description                          |
| ---------------------------- | ------------------------------------ |
| `struct_col['field']`        | Bracket notation field access        |
| `get_field(struct, 'field')` | Function-based field access          |
| `named_struct(k1, v1, ...)`  | Create struct with named fields      |
| `struct(v1, v2, ...)`        | Create struct with positional fields |

### Array functions

| Function                            | Description                              |
| ----------------------------------- | ---------------------------------------- |
| `make_array(v1, v2, ...)`           | Create array from values                 |
| `string_to_array(str, delim)`       | Split string into array                  |
| `range(start, stop)`                | Generate integer range (exclusive stop)  |
| `generate_series(start, stop)`      | Generate integer series (inclusive stop) |
| `array_length(arr)`                 | Number of elements                       |
| `cardinality(arr)`                  | Number of elements                       |
| `empty(arr)`                        | True if empty                            |
| `array_ndims(arr)`                  | Number of dimensions                     |
| `array_dims(arr)`                   | Dimension information                    |
| `array_has(arr, val)`               | Contains check                           |
| `array_has_all(arr, arr2)`          | Contains all check                       |
| `array_has_any(arr, arr2)`          | Contains any check                       |
| `array_position(arr, val)`          | First position of value                  |
| `array_positions(arr, val)`         | All positions of value                   |
| `array_sort(arr)`                   | Sort elements                            |
| `array_reverse(arr)`                | Reverse order                            |
| `array_distinct(arr)`               | Remove duplicates                        |
| `flatten(arr)`                      | Flatten one level                        |
| `array_slice(arr, start, end)`      | Extract sub-array                        |
| `array_append(arr, val)`            | Append to end                            |
| `array_prepend(val, arr)`           | Prepend to start                         |
| `array_concat(arr1, arr2)`          | Concatenate arrays                       |
| `array_remove(arr, val)`            | Remove first occurrence                  |
| `array_remove_all(arr, val)`        | Remove all occurrences                   |
| `array_remove_n(arr, val, n)`       | Remove first _n_ occurrences             |
| `array_replace(arr, old, new)`      | Replace first occurrence                 |
| `array_replace_n(arr, old, new, n)` | Replace first _n_ occurrences            |
| `array_replace_all(arr, old, new)`  | Replace all occurrences                  |
| `array_pop_back(arr)`               | Remove last element                      |
| `array_pop_front(arr)`              | Remove first element                     |
| `array_repeat(val, n)`              | Repeat value _n_ times                   |
| `array_resize(arr, size, default)`  | Resize with default fill                 |
| `array_intersect(arr1, arr2)`       | Common elements                          |
| `array_union(arr1, arr2)`           | Union of elements                        |
| `array_except(arr1, arr2)`          | Difference of elements                   |
| `array_max(arr)`                    | Maximum value                            |
| `array_min(arr)`                    | Minimum value                            |
| `array_any_value(arr)`              | First non-NULL value                     |
| `array_to_string(arr, delim)`       | Join elements as string                  |
| `array_element(arr, idx)`           | Element at index                         |

### Map functions

| Function                  | Description                          |
| ------------------------- | ------------------------------------ |
| `map(keys_arr, vals_arr)` | Create map from key and value arrays |
| `map_keys(map)`           | All keys as array                    |
| `map_values(map)`         | All values as array                  |
| `map_extract(map, key)`   | Value for a specific key             |


---

---
title: Scalar functions
description: Reference for all 163 scalar functions supported in R2 SQL, organized by category.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Scalar functions

Scalar functions transform individual values and can be used in `SELECT`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY` clauses.

---

## Core functions

### coalesce

Returns the first non-NULL argument.

```

SELECT coalesce(department, region, 'unknown') AS first_val

FROM my_namespace.sales_data

LIMIT 5


```
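
The semantics are simply "first non-NULL argument wins". A minimal Python sketch of the same rule (illustrative only, not the engine's implementation):

```python
def coalesce(*args):
    """Return the first argument that is not None, else None."""
    return next((a for a in args if a is not None), None)

print(coalesce(None, None, "unknown"))
```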

### nullif

Returns NULL if both arguments are equal, otherwise returns the first argument.

```

SELECT nullif(department, 'Unknown') AS dept

FROM my_namespace.sales_data

LIMIT 5


```

### nvl

Returns the second argument if the first is NULL. Alias: `ifnull`.

```

SELECT nvl(department, 'N/A') AS dept

FROM my_namespace.sales_data

LIMIT 5


```

### nvl2

Returns the second argument if the first is not NULL, otherwise returns the third.

```

SELECT nvl2(department, 'has_dept', 'no_dept') AS dept_status

FROM my_namespace.sales_data

LIMIT 5


```

### greatest

Returns the largest value from a list of arguments.

```

SELECT greatest(total_amount, unit_price, quantity) AS max_val

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### least

Returns the smallest value from a list of arguments.

```

SELECT least(total_amount, unit_price, quantity) AS min_val

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### arrow\_typeof

Returns the Arrow data type name of an expression.

```

SELECT arrow_typeof(total_amount) AS amount_type,

       arrow_typeof(customer_id) AS id_type

FROM my_namespace.sales_data

LIMIT 1


```

### arrow\_cast

Casts an expression to a specific Arrow data type by string name.

```

SELECT arrow_cast(total_amount, 'Float32') AS amount_f32

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


```

### named\_struct

Creates a struct with named fields from key-value pairs.

```

SELECT named_struct('customer', customer_id, 'amount', total_amount) AS info

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


```

### get\_field

Extracts a field from a struct by name.

```

SELECT get_field(named_struct('customer', customer_id, 'amount', total_amount), 'amount') AS amt

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


```

### struct

Creates a struct with positional fields. Alias: `row`.

```

SELECT struct(customer_id, total_amount, region) AS info

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 1


```

### overlay

Replaces a substring at a given position.

```

SELECT customer_id,

       overlay(customer_id PLACING 'XX' FROM 1 FOR 2) AS masked

FROM my_namespace.sales_data

LIMIT 3


```

---

## Datetime functions

### now

Returns the current timestamp. Alias: `current_timestamp`.

Precision is quantized to 10ms boundaries.

```

SELECT now() AS current_ts

FROM my_namespace.sales_data

LIMIT 1


```

### current\_date

Returns today's date. Alias: `today`.

```

SELECT current_date() AS today_date

FROM my_namespace.sales_data

LIMIT 1


```

### current\_time

Returns the current time. Precision is quantized to 10ms boundaries.

```

SELECT current_time() AS now_time

FROM my_namespace.sales_data

LIMIT 1


```

### date\_part

Extracts a component from a timestamp. Alias: `datepart`.

Supported fields: `year`, `month`, `day`, `hour`, `minute`, `second`, `millisecond`, `microsecond`, `week`, `dow`, `doy`, `quarter`, `epoch`.

```

SELECT date_part('hour', timestamp) AS hr,

       date_part('minute', timestamp) AS mn

FROM my_namespace.sales_data

LIMIT 1


```

### date\_trunc

Truncates a timestamp to a specified unit. Alias: `datetrunc`.

Supported units: `year`, `month`, `week`, `day`, `hour`, `minute`, `second`.

```

SELECT date_trunc('day', timestamp) AS day_trunc, COUNT(*) AS cnt

FROM my_namespace.sales_data

GROUP BY date_trunc('day', timestamp)

ORDER BY day_trunc

LIMIT 5


```

### date\_bin

Bins a timestamp into fixed-size intervals aligned to an origin.

```

SELECT date_bin(INTERVAL '1 hour', timestamp, '2025-01-01T00:00:00Z') AS hour_bin,

       COUNT(*) AS cnt

FROM my_namespace.sales_data

GROUP BY date_bin(INTERVAL '1 hour', timestamp, '2025-01-01T00:00:00Z')

ORDER BY hour_bin

LIMIT 5


```
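
The binning arithmetic is: subtract the origin, floor-divide by the interval width, then scale back up and re-add the origin. A Python sketch of that computation (illustrative, not the engine's implementation):

```python
from datetime import datetime, timedelta, timezone

def date_bin(width, ts, origin):
    """Snap ts down to the start of its width-sized bin aligned to origin."""
    return origin + ((ts - origin) // width) * width

origin = datetime(2025, 1, 1, tzinfo=timezone.utc)
ts = datetime(2025, 1, 1, 10, 37, 12, tzinfo=timezone.utc)

print(date_bin(timedelta(hours=1), ts, origin))  # start of the 10:00 hour bin
```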

### from\_unixtime

Converts a Unix epoch (seconds) to a timestamp.

```

SELECT from_unixtime(1770000000) AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### make\_date

Constructs a date from year, month, and day components.

```

SELECT make_date(2026, 3, 1) AS d

FROM my_namespace.sales_data

LIMIT 1


```

### make\_time

Constructs a time from hour, minute, and second components.

```

SELECT make_time(14, 30, 0) AS t

FROM my_namespace.sales_data

LIMIT 1


```

### to\_char

Formats a timestamp as a string using strftime format. Alias: `date_format`.

```

SELECT to_char(timestamp, '%Y-%m-%d %H:%M') AS formatted

FROM my_namespace.sales_data

LIMIT 1


```

### to\_date

Parses a date from a string using a format pattern.

```

SELECT to_date('2026-03-01', '%Y-%m-%d') AS d

FROM my_namespace.sales_data

LIMIT 1


```

### to\_timestamp

Parses a timestamp from a string using a format pattern.

```

SELECT to_timestamp('2026-03-01 12:00:00', '%Y-%m-%d %H:%M:%S') AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_timestamp\_seconds

Converts seconds since Unix epoch to a timestamp.

```

SELECT to_timestamp_seconds(1770000000) AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_timestamp\_millis

Converts milliseconds since Unix epoch to a timestamp.

```

SELECT to_timestamp_millis(1770000000000) AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_timestamp\_micros

Converts microseconds since Unix epoch to a timestamp.

```

SELECT to_timestamp_micros(1770000000000000) AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_timestamp\_nanos

Converts nanoseconds since Unix epoch to a timestamp. Large values may overflow.

```

SELECT to_timestamp_nanos(1770000000000000000) AS ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_unixtime

Converts a timestamp to a Unix epoch (seconds).

```

SELECT to_unixtime(timestamp) AS epoch

FROM my_namespace.sales_data

LIMIT 1


```

### to\_local\_time

Strips timezone information from a timestamp.

```

SELECT to_local_time(timestamp) AS local_ts

FROM my_namespace.sales_data

LIMIT 1


```

### to\_time

Parses a time from a string using a format pattern.

```

SELECT to_time('14:30:00', '%H:%M:%S') AS t

FROM my_namespace.sales_data

LIMIT 1


```

---

## Math functions

### abs

Returns the absolute value of a number.

```

SELECT abs(total_amount - 500) AS distance_from_500

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### ceil

Returns the smallest integer greater than or equal to a number.

```

SELECT ceil(total_amount) AS rounded_up

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### floor

Returns the largest integer less than or equal to a number.

```

SELECT floor(total_amount) AS rounded_down

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### round

Rounds a number to a specified number of decimal places.

```

SELECT round(total_amount, 2) AS rounded

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### trunc

Truncates a number to a specified number of decimal places.

```

SELECT trunc(total_amount, 0) AS truncated

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### sqrt

Returns the square root of a number.

```

SELECT sqrt(CAST(quantity AS DOUBLE)) AS sqrt_qty

FROM my_namespace.sales_data

WHERE quantity IS NOT NULL

LIMIT 5


```

### cbrt

Returns the cube root of a number.

```

SELECT cbrt(CAST(quantity AS DOUBLE)) AS cbrt_qty

FROM my_namespace.sales_data

WHERE quantity IS NOT NULL

LIMIT 5


```

### power

Raises a number to a power. Alias: `pow`.

```

SELECT power(total_amount, 2.0) AS amount_squared

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### exp

Returns _e_ raised to the given power.

```

SELECT exp(total_amount / 1000.0) AS exp_val

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### ln

Returns the natural logarithm.

```

SELECT ln(total_amount) AS ln_val

FROM my_namespace.sales_data

WHERE total_amount > 0

LIMIT 5


```

### log

Returns the logarithm of a value for a given base.

```

SELECT log(10.0, total_amount) AS log10_val

FROM my_namespace.sales_data

WHERE total_amount > 0

LIMIT 5


```

### log2

Returns the base-2 logarithm.

```

SELECT log2(total_amount) AS log2_val

FROM my_namespace.sales_data

WHERE total_amount > 0

LIMIT 5


```

### log10

Returns the base-10 logarithm.

```

SELECT log10(total_amount) AS log10_val

FROM my_namespace.sales_data

WHERE total_amount > 0

LIMIT 5


```

### Trigonometric functions

`sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `atan2`, `cot`

```

SELECT sin(1.0) AS s, cos(1.0) AS c, tan(1.0) AS t,
       asin(0.5) AS as_val, acos(0.5) AS ac_val, atan(1.0) AS at_val,
       atan2(1.0, 2.0) AS at2_val, cot(1.0) AS cot_val
FROM my_namespace.sales_data
LIMIT 1

```

### Hyperbolic functions

`sinh`, `cosh`, `tanh`, `asinh`, `acosh`, `atanh`

```

SELECT sinh(1.0) AS sh, cosh(1.0) AS ch, tanh(1.0) AS th

FROM my_namespace.sales_data

LIMIT 1


```

### degrees

Converts radians to degrees.

```

SELECT degrees(pi()) AS full_circle

FROM my_namespace.sales_data

LIMIT 1


```

### radians

Converts degrees to radians.

```

SELECT radians(180.0) AS pi_val

FROM my_namespace.sales_data

LIMIT 1


```

### pi

Returns the value of pi.

```

SELECT pi() AS pi_val

FROM my_namespace.sales_data

LIMIT 1


```

### random

Returns a random float between 0 and 1.

```

SELECT random() AS rnd

FROM my_namespace.sales_data

LIMIT 1


```

### factorial

Returns the factorial of a non-negative integer.

```

SELECT factorial(5) AS fact5

FROM my_namespace.sales_data

LIMIT 1


```

### gcd

Returns the greatest common divisor of two integers.

```

SELECT gcd(12, 8) AS gcd_val

FROM my_namespace.sales_data

LIMIT 1


```

### lcm

Returns the least common multiple of two integers.

```

SELECT lcm(4, 6) AS lcm_val

FROM my_namespace.sales_data

LIMIT 1


```

### signum

Returns the sign of a number: -1, 0, or 1.

```

SELECT signum(total_amount - 500) AS sign_val

FROM my_namespace.sales_data

WHERE total_amount IS NOT NULL

LIMIT 5


```

### isnan

Returns true if the value is NaN.

```

SELECT isnan(0.0 / 0.0) AS is_nan

FROM my_namespace.sales_data

LIMIT 1


```

### iszero

Returns true if the value is zero.

```

SELECT iszero(0.0) AS is_zero

FROM my_namespace.sales_data

LIMIT 1


```

### nanvl

Returns the first argument if it is not NaN, otherwise returns the second.

```

SELECT nanvl(0.0 / 0.0, -1.0) AS safe_val

FROM my_namespace.sales_data

LIMIT 1


```

---

## String functions

### ascii

Returns the ASCII code of the first character.

```

SELECT customer_id, ascii(customer_id) AS first_code

FROM my_namespace.sales_data

LIMIT 3


```

### bit\_length

Returns the length of a string in bits.

```

SELECT customer_id, bit_length(customer_id) AS bits

FROM my_namespace.sales_data

LIMIT 3


```

### octet\_length

Returns the length of a string in bytes.

```

SELECT customer_id, octet_length(customer_id) AS bytes

FROM my_namespace.sales_data

LIMIT 3


```

### lower

Converts a string to lowercase.

```

SELECT lower(department) AS dept_lower

FROM my_namespace.sales_data

LIMIT 5


```

### upper

Converts a string to uppercase.

```

SELECT upper(region) AS region_upper

FROM my_namespace.sales_data

LIMIT 5


```

### concat

Concatenates two or more strings.

```

SELECT concat(department, ' - ', region) AS label

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### concat\_ws

Concatenates strings with a separator.

```

SELECT concat_ws('/', region, department) AS path

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### contains

Returns true if a string contains a substring.

```

SELECT customer_id, contains(department, 'Sales') AS is_sales

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### starts\_with

Returns true if a string starts with a prefix.

```

SELECT customer_id, starts_with(department, 'Eng') AS is_eng

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### ends\_with

Returns true if a string ends with a suffix.

```

SELECT customer_id, ends_with(department, 'ing') AS ends_ing

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### btrim

Trims characters from both sides of a string. Alias: `trim`.

```

SELECT btrim('  hello  ') AS trimmed

FROM my_namespace.sales_data

LIMIT 1


```

### ltrim

Trims characters from the left side of a string.

```

SELECT ltrim('  hello') AS trimmed

FROM my_namespace.sales_data

LIMIT 1


```

### rtrim

Trims characters from the right side of a string.

```

SELECT rtrim('hello  ') AS trimmed

FROM my_namespace.sales_data

LIMIT 1


```

### replace

Replaces all occurrences of a substring.

```

SELECT department, replace(department, ' ', '_') AS underscored

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### repeat

Repeats a string a given number of times.

```

SELECT repeat(region, 2) AS doubled

FROM my_namespace.sales_data

LIMIT 3


```

### split\_part

Splits a string by a delimiter and returns the specified part (1-indexed).

```

SELECT customer_id, split_part(customer_id, '-', 1) AS first_part

FROM my_namespace.sales_data

WHERE customer_id IS NOT NULL

LIMIT 5


```

### levenshtein

Returns the Levenshtein edit distance between two strings.

```

SELECT department, levenshtein(department, 'Engineering') AS dist

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```
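Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal Python sketch of the same metric:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, computed one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```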

### chr

Returns the character for a given ASCII code.

```

SELECT chr(65) AS letter

FROM my_namespace.sales_data

LIMIT 1


```

### to\_hex

Converts an integer to a hexadecimal string.

```

SELECT to_hex(255) AS hex_ff

FROM my_namespace.sales_data

LIMIT 1


```

### uuid

Generates a random UUID.

```

SELECT uuid() AS new_id

FROM my_namespace.sales_data

LIMIT 1


```

---

## Unicode functions

### character\_length

Returns the number of characters in a string. Aliases: `length`, `char_length`.

```

SELECT department, character_length(department) AS len

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### left

Returns the leftmost _n_ characters of a string.

```

SELECT department, left(department, 5) AS prefix

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### right

Returns the rightmost _n_ characters of a string.

```

SELECT department, right(department, 3) AS suffix

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### substr

Returns a substring starting at a position for a given length. Alias: `substring`.

```

SELECT department, substr(department, 1, 8) AS first_eight

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### lpad

Left-pads a string to a specified length.

```

SELECT region, lpad(region, 15, '.') AS padded

FROM my_namespace.sales_data

LIMIT 5


```

### rpad

Right-pads a string to a specified length.

```

SELECT region, rpad(region, 15, '.') AS padded

FROM my_namespace.sales_data

LIMIT 5


```

### reverse

Reverses a string.

```

SELECT department, reverse(department) AS rev

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### strpos

Returns the position of a substring (1-indexed). Aliases: `instr`, `position`.

```

SELECT department, strpos(department, 'a') AS a_pos

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### initcap

Capitalizes the first letter of each word.

```

SELECT initcap('hello world') AS capped

FROM my_namespace.sales_data

LIMIT 1


```

### translate

Replaces characters in a string based on a mapping.

```

SELECT department, translate(department, 'aeiou', '12345') AS coded

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### find\_in\_set

Returns the position of a string within a comma-separated list.

```

SELECT find_in_set('North', 'South,North,East,West') AS pos

FROM my_namespace.sales_data

LIMIT 1


```
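The semantics (1-indexed position, 0 when the value is absent) mirror this Python sketch:

```python
def find_in_set(value: str, csv_list: str) -> int:
    # 1-indexed position of value in a comma-separated list; 0 if not found.
    items = csv_list.split(",")
    return items.index(value) + 1 if value in items else 0

print(find_in_set("North", "South,North,East,West"))  # 2
```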

### substr\_index

Returns the substring before the _n_-th occurrence of a delimiter. Alias: `substring_index`.

```

SELECT customer_id, substr_index(customer_id, '-', 1) AS first_segment

FROM my_namespace.sales_data

WHERE customer_id IS NOT NULL

LIMIT 5


```

---

## Regex functions

### regexp\_like

Returns true if a string matches a regular expression pattern.

```

SELECT department, regexp_like(department, '^[A-Z]{2}') AS starts_two_caps

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### regexp\_count

Returns the number of matches of a pattern in a string.

```

SELECT department, regexp_count(department, '[aeiou]') AS vowels

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### regexp\_replace

Replaces matches of a pattern with a replacement string.

```

SELECT department, regexp_replace(department, '[0-9]', '#') AS no_digits

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

### regexp\_match

Returns the first match of a pattern as an array.

```

SELECT department, regexp_match(department, '([A-Z][a-z]+)') AS first_word

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 3


```

### regexp\_instr

Returns the position of the first match of a pattern.

```

SELECT department, regexp_instr(department, '[0-9]') AS digit_pos

FROM my_namespace.sales_data

WHERE department IS NOT NULL

LIMIT 5


```

---

## Crypto functions

### md5

Returns the MD5 hash of a string.

```

SELECT customer_id, md5(customer_id) AS hash

FROM my_namespace.sales_data

LIMIT 1


```

### sha224

Returns the SHA-224 hash of a string.

```

SELECT sha224(customer_id) AS hash

FROM my_namespace.sales_data

LIMIT 1


```

### sha256

Returns the SHA-256 hash of a string.

```

SELECT customer_id, sha256(customer_id) AS hash

FROM my_namespace.sales_data

LIMIT 1


```
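The same digest can be reproduced with Python's `hashlib`. Whether R2 SQL returns the hash hex-encoded or as binary is not specified here, so the hex form below is an assumption, and `customer_id` is a hypothetical value:

```python
import hashlib

customer_id = "CUST-1001"  # hypothetical value
print(hashlib.sha256(customer_id.encode()).hexdigest())
```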

### sha384

Returns the SHA-384 hash of a string.

```

SELECT sha384(customer_id) AS hash

FROM my_namespace.sales_data

LIMIT 1


```

### sha512

Returns the SHA-512 hash of a string.

```

SELECT sha512(customer_id) AS hash

FROM my_namespace.sales_data

LIMIT 1


```

### digest

Returns a hash of a string using a specified algorithm. Supported algorithms: `md5`, `sha224`, `sha256`, `sha384`, `sha512`.

```

SELECT customer_id, digest(customer_id, 'sha256') AS hash

FROM my_namespace.sales_data

LIMIT 1


```

---

## Encoding functions

### encode

Encodes binary data to a string. Supported encoding: `base64`.

```

SELECT encode(CAST('hello' AS BYTEA), 'base64') AS b64

FROM my_namespace.sales_data

LIMIT 1


```

### decode

Decodes a string to binary data. Supported encoding: `base64`.

```

SELECT decode('aGVsbG8=', 'base64') AS raw

FROM my_namespace.sales_data

LIMIT 1


```


---

---
title: Troubleshooting guide
description: This guide covers potential errors and limitations you may encounter when using R2 SQL. R2 SQL is in open beta, and supported functionality will evolve and change over time.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Troubleshooting guide

This guide covers potential errors and limitations you may encounter when using R2 SQL. R2 SQL is in open beta, and supported functionality will evolve and change over time.

## Query structure errors

### Missing required clauses

**Error**: `expected exactly 1 table in FROM clause`

**Problem**: R2 SQL requires a `FROM` clause in your query.

```

-- Invalid - Missing FROM clause
SELECT user_id WHERE status = 200;

-- Valid
SELECT user_id
FROM my_namespace.http_requests
WHERE status = 200 AND timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z';

```

**Solution**: Always include `FROM` with a fully qualified table name (`namespace_name.table_name`).

---

## FROM clause issues

### Multiple tables

**Error**: `unsupported feature: JOIN operations are not supported`

**Problem**: R2 SQL queries reference exactly one table. JOINs and multiple tables are not supported.

```

-- Invalid - Multiple tables not supported
SELECT a.*, b.* FROM my_namespace.table1 a, my_namespace.table2 b WHERE a.id = b.id
SELECT * FROM my_namespace.events JOIN my_namespace.users ON events.user_id = users.id

-- Valid - Separate queries
SELECT * FROM my_namespace.table1 WHERE id IN ('id1', 'id2', 'id3') LIMIT 100
-- Then query the second table separately in your application
SELECT * FROM my_namespace.table2 WHERE id IN ('id1', 'id2', 'id3') LIMIT 100

```

**Solution**:

* Denormalize your data by including necessary fields in a single table.
* Perform multiple queries and join data in your application.

### Subqueries

**Error**: `unsupported feature: subqueries`

**Problem**: Subqueries in `FROM`, `WHERE`, and scalar positions are not supported.

```

-- Invalid - Subqueries not supported
SELECT * FROM (SELECT user_id FROM my_namespace.events WHERE status = 200)

-- Valid - Use direct query with appropriate filters
SELECT user_id FROM my_namespace.events WHERE status = 200 LIMIT 100

```

**Solution**: Flatten your query logic or use multiple sequential queries.

---

## WHERE clause issues

### JSON object filtering

**Error**: `unsupported binary operator` or `Error during planning: could not parse compound`

**Problem**: JSON functions are not yet implemented. You cannot filter on fields inside JSON objects using JSON path operators.

```

-- Invalid - JSON path operators not supported
SELECT * FROM my_namespace.logs WHERE json_data->>'level' = 'error'

-- Valid - Filter on the entire JSON column
SELECT * FROM my_namespace.logs WHERE json_data IS NOT NULL LIMIT 100

```

**Solution**:

* Denormalize frequently queried JSON fields into separate columns.
* Filter on the entire JSON field, and handle parsing in your application.

Note

Struct columns are supported and can be filtered using bracket notation. Refer to [Complex types](https://developers.cloudflare.com/r2-sql/sql-reference/complex-types/) for details.

```

SELECT * FROM my_namespace.products WHERE pricing['price'] > 50 LIMIT 100

```

---

## LIMIT clause issues

### Invalid limit values

**Error**: `maximum LIMIT is 10000`

**Problem**: LIMIT values must be between 1 and 10,000.

```

-- Invalid - Out of range
SELECT * FROM my_namespace.events LIMIT 50000

-- Valid
SELECT * FROM my_namespace.events LIMIT 10000

```

**Solution**: Use LIMIT values between 1 and 10,000.

### Pagination attempts

**Error**: `unsupported feature: OFFSET clause is not supported`

**Problem**: OFFSET is not supported.

```

-- Invalid - Pagination not supported
SELECT * FROM my_namespace.events LIMIT 100 OFFSET 200

-- Valid - Use cursor-based pagination with ORDER BY and WHERE
-- Page 1
SELECT * FROM my_namespace.events
WHERE timestamp >= '2024-01-01'
ORDER BY timestamp
LIMIT 100

-- Page 2 - Use the last timestamp from the previous page
SELECT * FROM my_namespace.events
WHERE timestamp > '2024-01-01T10:30:00Z'
ORDER BY timestamp
LIMIT 100

```

**Solution**: Implement cursor-based pagination using `ORDER BY` and `WHERE` conditions.
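Client-side, the cursor loop looks like this sketch; `run_query` is a placeholder for whatever client you use to execute R2 SQL queries (it is assumed to return rows as dicts), and the table and column names are illustrative:

```python
def fetch_all(run_query, start_ts, page_size=100):
    """Page through my_namespace.events using the last seen timestamp as a cursor."""
    rows, last_ts = [], start_ts
    while True:
        page = run_query(
            "SELECT * FROM my_namespace.events "
            f"WHERE timestamp > '{last_ts}' "
            f"ORDER BY timestamp LIMIT {page_size}"
        )
        if not page:
            return rows
        rows.extend(page)
        last_ts = page[-1]["timestamp"]  # advance the cursor past this page
```

Note that a strict `>` cursor assumes timestamps are unique; with duplicate timestamps, rows sharing the boundary value can be skipped.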

---

## Schema issues

### DDL and DML operations

**Error**: `only read-only queries are allowed`

**Problem**: R2 SQL is a read-only query engine. DDL and DML statements are not supported.

```

-- Invalid - Schema changes not supported
ALTER TABLE my_namespace.events ADD COLUMN new_field STRING
UPDATE my_namespace.events SET status = 200 WHERE user_id = '123'
CREATE TABLE my_namespace.test (id INT)
DROP TABLE my_namespace.events

```

**Solution**: Manage your schema through your data ingestion pipeline and R2 Data Catalog.

---

## Performance optimization

### Query performance issues

If your queries are running slowly:

1. **Always include partition (timestamp) filters**: This is the most important optimization.  
```  
-- Good - Narrows data scan to one day  
SELECT * FROM my_namespace.events  
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02'  
LIMIT 100  
```
2. **Use selective filtering**: Include specific conditions to reduce result sets.  
```  
-- Good - Multiple filters reduce scanned data  
SELECT * FROM my_namespace.events  
WHERE status = 200 AND region = 'US' AND timestamp > '2024-01-01'  
LIMIT 100  
```
3. **Select specific columns**: Avoid `SELECT *` when you only need a few fields.  
```  
-- Good - Only reads the columns you need  
SELECT user_id, status, timestamp  
FROM my_namespace.events  
WHERE timestamp > '2024-01-01'  
LIMIT 100  
```
4. **Use EXPLAIN to inspect the execution plan**: Verify that predicate pushdown and file pruning are working.  
```  
EXPLAIN SELECT user_id, status  
FROM my_namespace.events  
WHERE timestamp > '2024-01-01' AND status = 200  
```
5. **Enable compaction**: Enable compaction in R2 Data Catalog to reduce the number of small files scanned per query.


---

---
title: Tutorials
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Tutorials


---

---
title: Build an end to end data pipeline
description: This tutorial demonstrates how to build a complete data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Build an end to end data pipeline

**Last reviewed:**  6 months ago 

Learn how to create an end-to-end data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL for real-time transaction analysis.

In this tutorial, you will build a complete data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL. It includes a sample Python script that generates financial transaction data and sends it to your pipeline, where it can be queried by R2 SQL or any Apache Iceberg-compatible query engine.

This tutorial demonstrates how to:

* Set up R2 Data Catalog to store our transaction events in an Apache Iceberg table
* Set up a Cloudflare Pipeline
* Create transaction data with fraud patterns to send to your Pipeline
* Query your data using R2 SQL for fraud analysis

## Prerequisites

1. Sign up for a [Cloudflare account ↗](https://dash.cloudflare.com/sign-up).
2. Install [Node.js ↗](https://nodejs.org/en/).
3. Install [Python 3.8+ ↗](https://python.org) for the data generation script.

Node.js version manager

Use a Node version manager like [Volta ↗](https://volta.sh/) or [nvm ↗](https://github.com/nvm-sh/nvm) to avoid permission issues and change Node.js versions.

Wrangler requires a Node version of 16.17.0 or later.

## 1\. Set up authentication

You will need API tokens to interact with Cloudflare services.

1. In the Cloudflare dashboard, go to the **API tokens** page.  
[ Go to **Account API tokens** ](https://dash.cloudflare.com/?to=/:account/api-tokens)
2. Select **Create Token**.
3. Select **Get started** next to Create Custom Token.
4. Enter a name for your API token.
5. Under **Permissions**, choose:  
   * **Workers Pipelines** with Read, Send, and Edit permissions  
   * **Workers R2 Data Catalog** with Read and Edit permissions  
   * **Workers R2 SQL** with Read permissions  
   * **Workers R2 Storage** with Read and Edit permissions
6. Optionally, add a TTL to this token.
7. Select **Continue to summary**.
8. Select **Create Token**.
9. Note the **Token value**.

Export your new token as an environment variable:

Terminal window

```

export WRANGLER_R2_SQL_AUTH_TOKEN= # Paste your token here


```

If this is your first time using Wrangler, make sure to log in.

Terminal window

```

npx wrangler login


```

## 2\. Create an R2 bucket and enable R2 Data Catalog

* [ Wrangler CLI ](#tab-panel-5740)
* [ Dashboard ](#tab-panel-5741)

Create an R2 bucket:

Terminal window

```

npx wrangler r2 bucket create fraud-pipeline


```

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Create bucket**.
3. Enter the bucket name: `fraud-pipeline`
4. Select **Create bucket**.

Enable the catalog on your R2 bucket:

* [ Wrangler CLI ](#tab-panel-5742)
* [ Dashboard ](#tab-panel-5743)

Terminal window

```

npx wrangler r2 bucket catalog enable fraud-pipeline


```

When you run this command, take note of the "Warehouse" and "Catalog URI". You will need these later.

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: `fraud-pipeline`.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, and select **Enable**.
4. Once enabled, note the **Catalog URI** and **Warehouse name**.

Note

Copy the `warehouse` (ACCOUNTID\_BUCKETNAME) and paste it in the `export` below. We will use it later in the tutorial.

Terminal window

```

export WAREHOUSE= #Paste your warehouse here


```

### (Optional) Enable compaction on your R2 Data Catalog

R2 Data Catalog can automatically compact tables for you. In production event streaming use cases, it is common to end up with many small files, so it is recommended to enable compaction. Since the tutorial only demonstrates a sample use case, this step is optional.

* [ Wrangler CLI ](#tab-panel-5744)
* [ Dashboard ](#tab-panel-5745)

Terminal window

```

npx wrangler r2 bucket catalog compaction enable fraud-pipeline --token $WRANGLER_R2_SQL_AUTH_TOKEN


```

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: `fraud-pipeline`.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, select the edit icon, and select **Enable**.
4. Choose a target file size or keep the default, then select **Save**.

## 3\. Set up the pipeline infrastructure

### 3.1\. Create the Pipeline stream

* [ Wrangler CLI ](#tab-panel-5746)
* [ Dashboard ](#tab-panel-5747)

First, create a schema file called `raw_transactions_schema.json` with the following JSON schema:

```

{

  "fields": [

    { "name": "transaction_id", "type": "string", "required": true },

    { "name": "user_id", "type": "int64", "required": true },

    { "name": "amount", "type": "float64", "required": false },

    { "name": "transaction_timestamp", "type": "string", "required": false },

    { "name": "location", "type": "string", "required": false },

    { "name": "merchant_category", "type": "string", "required": false },

    { "name": "is_fraud", "type": "bool", "required": false }

  ]

}


```

Create a stream to receive incoming fraud detection events:

Terminal window

```

npx wrangler pipelines streams create raw_events_stream \

  --schema-file raw_transactions_schema.json \

  --http-enabled true \

  --http-auth false


```

Note

Note the **HTTP Ingest Endpoint URL** from the output. This is the endpoint you will use to send data to your pipeline.

Terminal window

```

export STREAM_ENDPOINT= # Paste the HTTP ingest endpoint from the output (see example below)

```

The output should look like this:

Terminal window

```

🌀 Creating stream 'raw_events_stream'...

✨ Successfully created stream 'raw_events_stream' with id 'stream_id'.


Creation Summary:

General:

  Name:  raw_events_stream


HTTP Ingest:

  Enabled:         Yes

  Authentication:  No

  Endpoint:        https://stream_id.ingest.cloudflare.com

  CORS Origins:    None


Input Schema:

┌───────────────────────┬────────┬────────────┬──────────┐

│ Field Name            │ Type   │ Unit/Items │ Required │

├───────────────────────┼────────┼────────────┼──────────┤

│ transaction_id        │ string │            │ Yes      │

├───────────────────────┼────────┼────────────┼──────────┤

│ user_id               │ int64  │            │ Yes      │

├───────────────────────┼────────┼────────────┼──────────┤

│ amount                │float64 │            │ No       │

├───────────────────────┼────────┼────────────┼──────────┤

│ transaction_timestamp │ string │            │ No       │

├───────────────────────┼────────┼────────────┼──────────┤

│ location              │ string │            │ No       │

├───────────────────────┼────────┼────────────┼──────────┤

│ merchant_category     │ string │            │ No       │

├───────────────────────┼────────┼────────────┼──────────┤

│ is_fraud              │ bool   │            │ No       │

└───────────────────────┴────────┴────────────┴──────────┘


```

### 3.2\. Create the data sink

Create a sink that writes data to your R2 bucket as Apache Iceberg tables:

Terminal window

```

npx wrangler pipelines sinks create raw_events_sink \

  --type "r2-data-catalog" \

  --bucket "fraud-pipeline" \

  --roll-interval 30 \

  --namespace "fraud_detection" \

  --table "transactions" \

  --catalog-token $WRANGLER_R2_SQL_AUTH_TOKEN


```

Note

This creates a `sink` configuration that will write to the Iceberg table `fraud_detection.transactions` in your R2 Data Catalog every 30 seconds. Pipelines automatically appends an `__ingest_ts` column that is used to partition the table by `DAY`.

### 3.3\. Create the pipeline

Connect your stream to your sink with SQL:

Terminal window

```

npx wrangler pipelines create raw_events_pipeline \

  --sql "INSERT INTO raw_events_sink SELECT * FROM raw_events_stream"


```

1. In the Cloudflare dashboard, go to **Pipelines** \> **Pipelines**.  
[ Go to **Pipelines** ](https://dash.cloudflare.com/?to=/:account/pipelines/overview)
2. Select **Create Pipeline**.
3. **Connect to a Stream**:  
   * Pipeline name: `raw_events`  
   * Enable HTTP endpoint for sending data: Enabled  
   * HTTP authentication: Disabled (default)  
   * Select **Next**
4. **Define Input Schema**:  
   * Select **JSON editor**  
   * Copy in the schema:  
   ```  
   {  
     "fields": [  
       { "name": "transaction_id", "type": "string", "required": true },  
       { "name": "user_id", "type": "int64", "required": true },  
       { "name": "amount", "type": "float64", "required": false },  
       {  
         "name": "transaction_timestamp",  
         "type": "string",  
         "required": false  
       },  
       { "name": "location", "type": "string", "required": false },  
       { "name": "merchant_category", "type": "string", "required": false },  
       { "name": "is_fraud", "type": "bool", "required": false }  
     ]  
   }  
   ```  
   * Select **Next**
5. **Define Sink**:  
   * Select your R2 bucket: `fraud-pipeline`  
   * Storage type: **R2 Data Catalog**  
   * Namespace: `fraud_detection`  
   * Table name: `transactions`  
   * **Advanced Settings**: Change **Maximum Time Interval** to `30 seconds`  
   * Select **Next**
6. **Credentials**:  
   * Disable **Automatically create an Account API token for your sink**  
   * Enter **Catalog Token** from step 1  
   * Select **Next**
7. **Pipeline Definition**:  
   * Leave the default SQL query:  
   ```  
   INSERT INTO raw_events_sink SELECT * FROM raw_events_stream;  
   ```  
   * Select **Create Pipeline**
8. After pipeline creation, note the **Stream ID** for the next step.

## 4\. Generate sample fraud detection data

Create a Python script to generate realistic transaction data with fraud patterns:

fraud_data_generator.py

```
import requests
import json
import uuid
import random
import time
import os
from datetime import datetime, timezone, timedelta

# Configuration - exported from the prior steps
STREAM_ENDPOINT = os.environ["STREAM_ENDPOINT"]  # From the stream you created
API_TOKEN = os.environ["WRANGLER_R2_SQL_AUTH_TOKEN"]  # The same token created earlier
EVENTS_TO_SEND = 1000  # Feel free to adjust this


def generate_transaction():
    """Generate a random transaction with occasional fraud patterns"""

    # User IDs
    high_risk_users = [1001, 1002, 1003, 1004, 1005]
    normal_users = list(range(1006, 2000))

    user_id = random.choice(high_risk_users + normal_users)
    is_high_risk_user = user_id in high_risk_users

    # Generate amounts
    if random.random() < 0.05:
        amount = round(random.uniform(5000, 50000), 2)
    elif random.random() < 0.03:
        amount = round(random.uniform(0.01, 1.00), 2)
    else:
        amount = round(random.uniform(10, 500), 2)

    # Locations
    normal_locations = ["NEW_YORK", "LOS_ANGELES", "CHICAGO", "MIAMI", "SEATTLE", "SAN FRANCISCO"]
    high_risk_locations = ["UNKNOWN_LOCATION", "VPN_EXIT", "MARS", "BAT_CAVE"]

    if is_high_risk_user and random.random() < 0.3:
        location = random.choice(high_risk_locations)
    else:
        location = random.choice(normal_locations)

    # Merchant categories
    normal_merchants = ["GROCERY", "GAS_STATION", "RESTAURANT", "RETAIL"]
    high_risk_merchants = ["GAMBLING", "CRYPTO", "MONEY_TRANSFER", "GIFT_CARDS"]

    if random.random() < 0.1:  # 10% high-risk merchants
        merchant_category = random.choice(high_risk_merchants)
    else:
        merchant_category = random.choice(normal_merchants)

    # Accumulate a fraud score from the risk signals above
    fraud_score = 0
    if amount > 2000: fraud_score += 0.4
    if amount < 1: fraud_score += 0.3
    if location in high_risk_locations: fraud_score += 0.5
    if merchant_category in high_risk_merchants: fraud_score += 0.3
    if is_high_risk_user: fraud_score += 0.2

    # Convert the fraud score into a fraud label, capped at 80% probability
    is_fraud = random.random() < min(fraud_score * 0.3, 0.8)

    # Generate timestamps (some fraud happens at unusual hours)
    base_time = datetime.now(timezone.utc)
    if is_fraud and random.random() < 0.4:  # 40% of fraud at night
        hour = random.randint(0, 5)  # Late night/early morning
        transaction_time = base_time.replace(hour=hour)
    else:
        transaction_time = base_time - timedelta(
            hours=random.randint(0, 168)  # Last week
        )

    return {
        "transaction_id": str(uuid.uuid4()),
        "user_id": user_id,
        "amount": amount,
        "transaction_timestamp": transaction_time.isoformat(),
        "location": location,
        "merchant_category": merchant_category,
        "is_fraud": is_fraud,
    }


def send_batch_to_stream(events, batch_size=100):
    """Send events to the Pipelines stream in batches"""

    headers = {
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    }

    total_sent = 0
    fraud_count = 0

    for i in range(0, len(events), batch_size):
        batch = events[i:i + batch_size]
        fraud_in_batch = sum(1 for event in batch if event["is_fraud"])

        try:
            response = requests.post(STREAM_ENDPOINT, headers=headers, json=batch)

            if response.status_code in [200, 201]:
                total_sent += len(batch)
                fraud_count += fraud_in_batch
                print(f"Sent batch of {len(batch)} events (Total: {total_sent})")
            else:
                print(f"Failed to send batch: {response.status_code} - {response.text}")

        except Exception as e:
            print(f"Error sending batch: {e}")

        time.sleep(0.1)

    return total_sent, fraud_count


def main():
    print("Generating fraud detection data...")

    # Generate events
    events = []
    for i in range(EVENTS_TO_SEND):
        events.append(generate_transaction())
        if (i + 1) % 100 == 0:
            print(f"Generated {i + 1} events...")

    fraud_events = sum(1 for event in events if event["is_fraud"])
    print(f"📊 Generated {len(events)} total events ({fraud_events} fraud, {fraud_events/len(events)*100:.1f}%)")

    # Send to stream
    print("Sending data to Pipeline stream...")
    sent, fraud_sent = send_batch_to_stream(events)

    print("\nComplete!")
    print(f"   Events sent: {sent:,}")
    print(f"   Fraud events: {fraud_sent:,} ({fraud_sent/sent*100:.1f}%)")
    print("   Data is now flowing through your pipeline!")


if __name__ == "__main__":
    main()
```

Install the required Python dependency and run the script:

Terminal window

```
pip install requests
python fraud_data_generator.py
```
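
The `send_batch_to_stream` helper above slices the event list into fixed-size chunks before POSTing, so the final batch may be smaller than the rest. The slicing pattern in isolation (a sketch; `make_batches` is a hypothetical helper name, not part of the script):

```python
def make_batches(items, batch_size=100):
    # items[i:i + batch_size] yields full batches plus one final partial batch
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

print([len(b) for b in make_batches(list(range(250)))])  # [100, 100, 50]
```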

## 5\. Query the data with R2 SQL

Now you can analyze your fraud detection data using R2 SQL. Here are some example queries:

### 5.1\. View recent transactions

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "
SELECT
    transaction_id,
    user_id,
    amount,
    location,
    merchant_category,
    is_fraud,
    transaction_timestamp
FROM fraud_detection.transactions
WHERE __ingest_ts > '2025-09-24T01:00:00Z'
AND is_fraud = true
LIMIT 10"
```

### 5.2\. Filter the raw transactions into a new table to highlight high-value transactions

Create a new sink that will write the filtered data to a new Apache Iceberg table in R2 Data Catalog:

Terminal window

```
npx wrangler pipelines sinks create fraud_filter_sink \
  --type "r2-data-catalog" \
  --bucket "fraud-pipeline" \
  --roll-interval 30 \
  --namespace "fraud_detection" \
  --table "fraud_transactions" \
  --catalog-token $WRANGLER_R2_SQL_AUTH_TOKEN
```

Now create a pipeline whose SQL reads from the original `raw_events_stream` stream and writes only transactions flagged as fraud with an `amount` over 1,000.

Terminal window

```
npx wrangler pipelines create fraud_events_pipeline \
  --sql "INSERT INTO fraud_filter_sink SELECT * FROM raw_events_stream WHERE is_fraud = true AND amount > 1000"
```
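
The pipeline's `WHERE` clause keeps only events that satisfy both conditions. The same predicate applied to a few sample events in Python (illustrative data only):

```python
events = [
    {"transaction_id": "a", "amount": 2500.00, "is_fraud": True},   # kept
    {"transaction_id": "b", "amount": 120.00,  "is_fraud": True},   # dropped: amount too low
    {"transaction_id": "c", "amount": 9000.00, "is_fraud": False},  # dropped: not fraud
]

# Equivalent of: WHERE is_fraud = true AND amount > 1000
flagged = [e for e in events if e["is_fraud"] and e["amount"] > 1000]
print([e["transaction_id"] for e in flagged])  # ['a']
```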

Note

It may take a few minutes for the new pipeline to fully initialize and start processing data. Also keep in mind the 30-second `roll-interval`.

Query the table and check the results:

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "
SELECT
    transaction_id,
    user_id,
    amount,
    location,
    merchant_category,
    is_fraud,
    transaction_timestamp
FROM fraud_detection.fraud_transactions
LIMIT 10"
```

Also verify that the non-fraudulent events are being filtered out:

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "
SELECT
    transaction_id,
    user_id,
    amount,
    location,
    merchant_category,
    is_fraud,
    transaction_timestamp
FROM fraud_detection.fraud_transactions
WHERE is_fraud = false
LIMIT 10"
```

You should see the following output:

```
Query executed successfully with no results
```

## Conclusion

You have successfully built an end-to-end data pipeline using Cloudflare's data platform. Through this tutorial, you have learned to:

1. **Use R2 Data Catalog**: Leverage Apache Iceberg tables for efficient data storage
2. **Set up Cloudflare Pipelines**: Create streams, sinks, and pipelines for data ingestion
3. **Generate sample data**: Create transaction data with basic fraud patterns
4. **Query your tables with R2 SQL**: Access raw and processed data tables stored in R2 Data Catalog


---

---
title: Platform
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Platform


---

---
title: Pricing
description: R2 SQL is in open beta and available to any developer with an R2 subscription.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Pricing

R2 SQL is in open beta and available to any developer with an [R2 subscription](https://developers.cloudflare.com/r2/pricing/).

We are not currently billing for R2 SQL during open beta. However, you will be billed for standard [R2 storage and operations](https://developers.cloudflare.com/r2/pricing/) for data accessed by queries.

We plan to bill based on the volume of data queried by R2 SQL. We'll provide at least 30 days notice before we make any changes or start charging for R2 SQL usage.


---

---
title: Reference
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Reference

```json
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/reference/","name":"Reference"}}]}
```

---

---
title: Limitations and best practices
description: This page summarizes supported features, limitations, and best practices.
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Limitations and best practices

Note

R2 SQL is in open beta. Limitations and best practices will change over time.

This page summarizes supported features, limitations, and best practices.

## Quick reference

| Feature                                                 | Supported | Notes                                                                      |
| ------------------------------------------------------- | --------- | -------------------------------------------------------------------------- |
| SELECT, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT        | Yes       |                                                                            |
| Column aliases (AS)                                     | Yes       |                                                                            |
| Expressions (CASE, CAST, LIKE, BETWEEN, IN, arithmetic) | Yes       | Full expression support                                                    |
| EXPLAIN                                                 | Yes       | Returns execution plan                                                     |
| 163 scalar functions                                    | Yes       | Math, string, datetime, regex, crypto, array, map, struct                  |
| 33 aggregate functions                                  | Yes       | Basic, approximate, statistical, bitwise, boolean, positional              |
| Approximate aggregates                                  | Yes       | approx\_distinct, approx\_median, approx\_percentile\_cont, approx\_top\_k |
| Struct / Array / Map column types                       | Yes       | Bracket notation, get\_field(), array functions, map functions             |
| CTEs (WITH ... AS)                                      | Yes       | Single-table only. No JOINs or cross-table references within CTEs.         |
| JOINs                                                   | No        | Single-table only                                                          |
| Subqueries                                              | No        |                                                                            |
| Window functions (OVER)                                 | No        |                                                                            |
| SELECT DISTINCT                                         | No        | Use approx\_distinct                                                       |
| OFFSET                                                  | No        |                                                                            |
| UNION / INTERSECT / EXCEPT                              | No        |                                                                            |
| INSERT / UPDATE / DELETE                                | No        | Read-only                                                                  |
| CREATE / DROP / ALTER                                   | No        | Read-only                                                                  |

For the full SQL syntax, refer to the [SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

---

## Unsupported SQL features

| Feature                                                            | Error                                                  |
| ------------------------------------------------------------------ | ------------------------------------------------------ |
| JOINs (any type)                                                   | unsupported feature: JOIN operations are not supported |
| Multi-table CTEs (JOINs or cross-table references within WITH)     | Single-table CTEs are supported                        |
| Subqueries (FROM, WHERE, scalar)                                   | unsupported feature: subqueries                        |
| SELECT DISTINCT                                                    | unsupported feature: SELECT DISTINCT is not supported  |
| OFFSET                                                             | unsupported feature: OFFSET clause is not supported    |
| UNION / INTERSECT / EXCEPT                                         | Set operations not supported                           |
| Window functions (OVER)                                            | unsupported feature: window functions (OVER clause)    |
| INSERT / UPDATE / DELETE                                           | only read-only queries are allowed                     |
| CREATE / DROP / ALTER                                              | only read-only queries are allowed                     |
| UNNEST / PIVOT / UNPIVOT                                           | Not supported                                          |
| Wildcard modifiers (ILIKE, EXCLUDE, EXCEPT, REPLACE, RENAME on \*) | Not supported                                          |
| LATERAL VIEW / QUALIFY                                             | Not supported                                          |

---

## Unsupported expression patterns

| Pattern                                               | Alternative                                                                                                                            |
| ----------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| func(DISTINCT ...) on any aggregate                   | Use approx\_distinct for counting                                                                                                      |
| PERCENTILE\_CONT / PERCENTILE\_DISC                   | Use [approx\_percentile\_cont](https://developers.cloudflare.com/r2-sql/sql-reference/aggregate-functions/#approx%5Fpercentile%5Fcont) |
| MEDIAN                                                | Use [approx\_median](https://developers.cloudflare.com/r2-sql/sql-reference/aggregate-functions/#approx%5Fmedian)                      |
| ARRAY\_AGG                                            | No alternative (unsupported for memory safety)                                                                                         |
| STRING\_AGG                                           | No alternative (unsupported for memory safety)                                                                                         |
| Scalar subqueries (SELECT ... WHERE x = (SELECT ...)) | Not supported                                                                                                                          |
| EXISTS (SELECT ...)                                   | Not supported                                                                                                                          |
| IN (SELECT ...)                                       | Use IN (value1, value2, ...) with a literal list                                                                                       |

---

## Runtime constraints

| Constraint                        | Details                                                                                               |
| --------------------------------- | ----------------------------------------------------------------------------------------------------- |
| Single table per query            | Queries must reference exactly one table. No JOINs, no subqueries. CTEs may reference a single table. |
| Partitioned Iceberg tables only   | Unpartitioned tables are not supported.                                                               |
| Parquet format only               | No CSV, JSON, or other formats.                                                                       |
| Read-only                         | R2 SQL is a query engine, not a database. No writes.                                                  |
| now() / current\_time() precision | Quantized to 10ms boundaries and forced to UTC.                                                       |
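
As a rough illustration of the 10 ms quantization (assuming truncation toward earlier times; the engine's exact rounding direction is not documented here):

```python
from datetime import datetime, timezone

def quantize_10ms(ts: datetime) -> datetime:
    # Truncate microseconds down to the nearest 10 ms (10,000 µs) boundary, in UTC
    ts = ts.astimezone(timezone.utc)
    return ts.replace(microsecond=ts.microsecond - ts.microsecond % 10_000)

print(quantize_10ms(datetime(2025, 9, 24, 1, 0, 0, 123_456, tzinfo=timezone.utc)))
# 2025-09-24 01:00:00.120000+00:00
```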

---

## Common error codes

| Code  | Meaning                                                            |
| ----- | ------------------------------------------------------------------ |
| 40003 | Invalid SQL syntax                                                 |
| 40004 | Invalid query (unsupported feature, unknown column, type mismatch) |
| 80001 | Edge service connection failure (retry)                            |

---

## Best practices

1. Always include time-range filters in `WHERE` to limit data scanned.
2. Use specific column names instead of `SELECT *` for better performance.
3. Use `LIMIT` to control result set size.
4. Use approximate aggregation functions (`approx_distinct`, `approx_median`, `approx_percentile_cont`) instead of exact alternatives on large datasets.
5. Enable compaction in R2 Data Catalog to reduce the number of files scanned per query.
6. Use `EXPLAIN` to inspect the execution plan and verify predicate pushdown.
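
Several of these practices combine naturally in one query (the table and column names below come from this documentation's fraud-detection tutorial; substitute your own, and run the statement via `npx wrangler r2 sql query "$WAREHOUSE" "..."`):

```sql
SELECT
    user_id,
    approx_distinct(transaction_id) AS tx_count
FROM fraud_detection.transactions
WHERE __ingest_ts > '2025-09-24T00:00:00Z'  -- time-range filter limits data scanned
GROUP BY user_id
LIMIT 10
```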


---

---
title: Wrangler commands
description: Execute a SQL query against R2 Data Catalog
image: https://developers.cloudflare.com/dev-products-preview.png
---


# Wrangler commands

Note

R2 SQL is currently in open beta. Report R2 SQL bugs in [GitHub ↗](https://github.com/cloudflare/workers-sdk/issues/new/choose). R2 SQL expects a [WRANGLER\_R2\_SQL\_AUTH\_TOKEN](https://developers.cloudflare.com/r2-sql/query-data/#authentication) environment variable to be set.

### `r2 sql query`

Execute a SQL query against R2 Data Catalog.

* npm
* pnpm
* yarn

Terminal window

```
npx wrangler r2 sql query [WAREHOUSE] [QUERY]
```

Terminal window

```
pnpm wrangler r2 sql query [WAREHOUSE] [QUERY]
```

Terminal window

```
yarn wrangler r2 sql query [WAREHOUSE] [QUERY]
```

* `[WAREHOUSE]` ` string ` required  
R2 Data Catalog warehouse name
* `[QUERY]` ` string ` required  
The SQL query to execute

Global flags

* `--v` ` boolean ` alias: --version  
Show version number
* `--cwd` ` string `  
Run as if Wrangler was started in the specified directory instead of the current working directory
* `--config` ` string ` alias: --c  
Path to Wrangler configuration file
* `--env` ` string ` alias: --e  
Environment to use for operations, and for selecting .env and .dev.vars files
* `--env-file` ` string `  
Path to an .env file to load - can be specified multiple times - values from earlier files are overridden by values in later files
* `--experimental-provision` ` boolean ` aliases: --x-provision default: true  
Experimental: Enable automatic resource provisioning
* `--experimental-auto-create` ` boolean ` alias: --x-auto-create default: true  
Automatically provision draft bindings with new resources

