Comprehensive Cloudflare platform skill covering Workers, D1, R2, KV, AI, Durable Objects, and security.
references/r2-data-catalog/README.md
# Cloudflare R2 Data Catalog

Managed Apache Iceberg REST catalog built into R2 buckets. No catalog servers to run.

## Documentation

This reference is a fast-start with verified connection details and code. For limits, maintenance settings, engine config examples, and pricing, **retrieve the live docs**: use the `cloudflare-docs` MCP/search tool if available, otherwise `webfetch` the URL. Docs are source of truth over this file.

| Topic | URL |
|-------|-----|
| Overview / get started | `https://developers.cloudflare.com/r2/data-catalog/get-started/` |
| Manage catalogs (enable, tokens) | `https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/` |
| Engine config examples | `https://developers.cloudflare.com/r2/data-catalog/config-examples/` (`pyiceberg/`, `spark-python/`, `spark-scala/`, `duckdb/`, `snowflake/`, `trino/`, `starrocks/`) |
| Table maintenance (compaction, snapshots) | `https://developers.cloudflare.com/r2/data-catalog/table-maintenance/` |
| Deleting data | `https://developers.cloudflare.com/r2/data-catalog/deleting-data/` |
| Metrics (GraphQL) | `https://developers.cloudflare.com/r2/data-catalog/observability/metrics/` |
| Pricing | `https://developers.cloudflare.com/r2/data-catalog/platform/pricing/` |
| Iceberg spec | `https://iceberg.apache.org/spec/` |

## Connection Values

Use the exact **Catalog URI** and **Warehouse** printed by `npx wrangler r2 bucket catalog enable <bucket>` (also shown in the dashboard). They follow these formats:

| Value | Format | Example |
|-------|--------|---------|
| Catalog URI | `https://catalog.cloudflarestorage.com/{ACCOUNT_ID}/{BUCKET}` | `https://catalog.cloudflarestorage.com/4482a1.../live-data` |
| Warehouse | `{ACCOUNT_ID}_{BUCKET}` (hyphens preserved) | `4482a1..._live-data` |
| Token | R2 API token (Admin R&W on Storage + R&W on Data Catalog) | `cfut_...` |

The Iceberg `/config` route needs `?warehouse={WAREHOUSE}`.

## Architecture

```
Engines (PyIceberg, PySpark, Trino, Snowflake, DuckDB, R2 SQL)
    │  Iceberg REST API (Bearer token)
    ▼
R2 Data Catalog ── namespace/table metadata, snapshots, txn coordination
    │  vended S3 credentials
    ▼
R2 Bucket ── Parquet data files + Iceberg metadata
```

- **Warehouse**: top-level catalog grouping (`{ACCOUNT_ID}_{BUCKET}`)
- **Namespace**: schema/database; nested namespaces supported
- **Table**: Iceberg table (schema, partition spec, snapshots)
- **Vended credentials**: temp S3 creds the catalog hands engines (`X-Iceberg-Access-Delegation: vended-credentials`)

## When to Use

**Use for:** log/analytics data lakes, BI pipelines, time-series/event data, multi-cloud or multi-engine analytics needing ACID + schema evolution on object storage.

**Don't use for:** OLTP (use D1/a database), sub-second point lookups, tiny datasets (<1 GB), or unstructured blobs (store directly in R2).

## Two APIs: Don't Confuse Them

| API | Base | Use for |
|-----|------|---------|
| **Iceberg REST catalog** | `https://catalog.cloudflarestorage.com/{ACCOUNT_ID}/{BUCKET}` | Table reads/writes via PyIceberg, PySpark, Trino, etc. |
| **Control-plane REST API** | `https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/r2-catalog/{BUCKET}` | Enable/disable, maintenance config, list namespaces/tables, get-table |

**Status:** Open beta. Available to all R2 subscribers; verify pricing/billing status in docs.

## Reading Order

1. [configuration.md](configuration.md): enable catalog, tokens, maintenance, client connection
2. [api.md](api.md): control-plane REST (incl. get-table), PyIceberg client, maintenance
3. [patterns.md](patterns.md): PyIceberg + PySpark templates, partitioning, external engines
4. [gotchas.md](gotchas.md): auth errors, maintenance behavior, troubleshooting

## See Also

- [pipelines](../pipelines/): stream events into Iceberg tables
- [r2-sql](../r2-sql/): serverless SQL over these tables
- [r2](../r2/): underlying object storage
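
## Sketch: Sanity-Checking Connection Values

The Catalog URI and Warehouse formats under Connection Values can be expressed as a small helper, which may be handy for checking that the values you copied from `wrangler` or the dashboard match the expected shape. This is a minimal sketch, not an official client: the function name `connection_values`, the `abc123` account ID, and the `/v1` path prefix on the config route (taken from the Iceberg REST spec rather than from this file) are all assumptions. Always prefer the exact values printed by `npx wrangler r2 bucket catalog enable <bucket>`.

```python
from urllib.parse import urlencode

# Host taken from the Catalog URI format documented above.
CATALOG_HOST = "https://catalog.cloudflarestorage.com"


def connection_values(account_id: str, bucket: str) -> dict[str, str]:
    """Build the expected connection values from an account ID and bucket name.

    Hypothetical helper for illustration; it only formats strings and makes
    no network calls.
    """
    catalog_uri = f"{CATALOG_HOST}/{account_id}/{bucket}"
    # Warehouse is {ACCOUNT_ID}_{BUCKET}; hyphens in the bucket name are kept.
    warehouse = f"{account_id}_{bucket}"
    # ASSUMPTION: the config route lives under /v1, as in the Iceberg REST spec.
    config_url = f"{catalog_uri}/v1/config?{urlencode({'warehouse': warehouse})}"
    return {
        "catalog_uri": catalog_uri,
        "warehouse": warehouse,
        "config_url": config_url,
    }


if __name__ == "__main__":
    # "abc123" is a placeholder account ID, not a real one.
    for name, value in connection_values("abc123", "live-data").items():
        print(f"{name}: {value}")
```

Comparing the output against the values `wrangler` prints is a quick way to catch a swapped account ID or a bucket name whose hyphens were mistakenly converted to underscores.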