R2 Data Catalog API Reference

Two APIs: the control-plane REST API (Cloudflare-specific) and the Iceberg REST catalog API (standard, used via PyIceberg/PySpark). For PyIceberg method details pull https://py.iceberg.apache.org/; for engine configs see https://developers.cloudflare.com/r2/data-catalog/config-examples/.

Control-Plane REST API

Base: https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/r2-catalog/{BUCKET} Auth: Authorization: Bearer $API_TOKEN

Operation	Method	Path
Get catalog details	GET	`/r2-catalog/{bucket}`
Enable / disable	POST	`/r2-catalog/{bucket}/enable` · `/disable`
Store compaction credential	POST	`/r2-catalog/{bucket}/credential`
List namespaces	GET	`/namespaces`
List tables	GET	`/namespaces/{ns}/tables`
Get table metadata	GET	`/namespaces/{ns}/tables/{table}`
Get/update maintenance config	GET/POST	`/maintenance-configs` and `/namespaces/{ns}/tables/{table}/maintenance-configs`

List endpoints accept ?return_uuids=true, ?return_details=true, ?parent={ns}, and pagination. Nested namespaces use %1F (Unit Separator), not / or .: /namespaces/parent%1Fchild/tables.

# Catalog details (status, maintenance_config, credential_status)
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET" \
  -H "Authorization: Bearer $API_TOKEN"

# Store token for compaction (pure-API setups)
curl -s -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/credential" \
  -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
  -d '{"token": "'$API_TOKEN'"}'

# Update maintenance config (all fields optional; table-level overrides catalog-level)
curl -s -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/maintenance-configs" \
  -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
  -d '{"compaction": {"state": "enabled", "target_size_mb": "256"},
       "snapshot_expiration": {"state": "enabled", "min_snapshots_to_keep": 10, "max_snapshot_age": "7d"}}'

Get Table (metadata introspection)

GET /namespaces/{ns}/tables/{table} returns schema, partition spec, sort order, and snapshot info — like Iceberg "load table" but on the control plane, with snapshots pruned to the most recent 10. (Newer than the published API docs.)

curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/namespaces/live/tables/earthquakes" \
  -H "Authorization: Bearer $API_TOKEN"

{"result": {
  "identifier": {"namespace": ["live"], "name": "earthquakes"},
  "table_uuid": "019edccf-3ac8-73e3-...",
  "metadata_location": "s3://live-data/__r2_data_catalog/.../metadata/01225-....metadata.json",
  "total_snapshots": 1225,
  "returned_snapshots": 10,
  "metadata": { /* standard Iceberg TableMetadata: schemas, partition-specs, sort-orders,
                   properties, current-snapshot-id, snapshots (≤10), snapshot-log, refs */ }
}, "success": true}

Field	Description
`identifier`	`{namespace: [...], name}`
`table_uuid`	Iceberg table UUID
`metadata_location`	R2 path to current metadata file
`total_snapshots`	Total before pruning
`returned_snapshots`	Count in `metadata.snapshots` (max 10)
`metadata`	Standard Iceberg TableMetadata, arrays pruned to 10

Error Format

{"success": false, "errors": [{"code": 10000, "message": "Authentication error"}]}

Standard HTTP codes (401 auth, 403 perms, 404 not enabled/found, 409 conflict).

Iceberg REST Catalog API (via PyIceberg)

Standard Iceberg REST Catalog. Base: https://catalog.cloudflarestorage.com/{ACCOUNT_ID}/{BUCKET}. The /config route needs ?warehouse={WAREHOUSE}.

from pyiceberg.catalog.rest import RestCatalog
catalog = RestCatalog(name="r2", warehouse=WAREHOUSE, uri=CATALOG_URI, token=TOKEN)

Common operations (see PyIceberg docs for full signatures):

catalog.create_namespace_if_not_exists("logs")
catalog.list_tables("logs")
table = catalog.create_table(("logs", "events"), schema=schema)   # pyiceberg.schema.Schema
table = catalog.load_table(("logs", "events"))
table.append(pyarrow_table)          # also .overwrite(...)
table.scan(row_filter="id > 100").to_pandas()

Schema evolution (add nullable columns; widen types only):

with table.update_schema() as u:
    u.add_column("user_id", LongType(), doc="User ID")
    u.rename_column("msg", "message")

Time-travel:

table.scan(snapshot_id=table.snapshots()[-2].snapshot_id)
table.scan(as_of_timestamp=ms_epoch)

Manual Maintenance (PySpark)

Prefer automatic maintenance (control-plane API/wrangler). For manual control or very large tables, use Spark procedures (rewrite_data_files, rewrite_manifests, expire_snapshots, remove_orphan_files). See https://developers.cloudflare.com/r2/data-catalog/table-maintenance/.

spark.sql("CALL r2dc.system.rewrite_data_files(table => 'ns.tbl')")
# Orphan removal REQUIRES S3 credentials (vended creds fail with NoAuthWithAWSException)
spark.sql("CALL r2dc.system.remove_orphan_files(table => 'ns.tbl', older_than => TIMESTAMP '2026-02-28 00:00:00')")

R2 Data Catalog API Reference

Control-Plane REST API

Base: https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/r2-catalog/{BUCKET} Auth: Authorization: Bearer $API_TOKEN

Operation	Method	Path
Get catalog details	GET	`/r2-catalog/{bucket}`
Enable / disable	POST	`/r2-catalog/{bucket}/enable` · `/disable`
Store compaction credential	POST	`/r2-catalog/{bucket}/credential`
List namespaces	GET	`/namespaces`
List tables	GET	`/namespaces/{ns}/tables`
Get table metadata	GET	`/namespaces/{ns}/tables/{table}`
Get/update maintenance config	GET/POST	`/maintenance-configs` and `/namespaces/{ns}/tables/{table}/maintenance-configs`

# Catalog details (status, maintenance_config, credential_status)
curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET" \
  -H "Authorization: Bearer $API_TOKEN"

# Store token for compaction (pure-API setups)
curl -s -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/credential" \
  -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
  -d '{"token": "'$API_TOKEN'"}'

# Update maintenance config (all fields optional; table-level overrides catalog-level)
curl -s -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/maintenance-configs" \
  -H "Authorization: Bearer $API_TOKEN" -H "Content-Type: application/json" \
  -d '{"compaction": {"state": "enabled", "target_size_mb": "256"},
       "snapshot_expiration": {"state": "enabled", "min_snapshots_to_keep": 10, "max_snapshot_age": "7d"}}'

Get Table (metadata introspection)

curl -s "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/r2-catalog/$BUCKET/namespaces/live/tables/earthquakes" \
  -H "Authorization: Bearer $API_TOKEN"

{"result": {
  "identifier": {"namespace": ["live"], "name": "earthquakes"},
  "table_uuid": "019edccf-3ac8-73e3-...",
  "metadata_location": "s3://live-data/__r2_data_catalog/.../metadata/01225-....metadata.json",
  "total_snapshots": 1225,
  "returned_snapshots": 10,
  "metadata": { /* standard Iceberg TableMetadata: schemas, partition-specs, sort-orders,
                   properties, current-snapshot-id, snapshots (≤10), snapshot-log, refs */ }
}, "success": true}

Field	Description
`identifier`	`{namespace: [...], name}`
`table_uuid`	Iceberg table UUID
`metadata_location`	R2 path to current metadata file
`total_snapshots`	Total before pruning
`returned_snapshots`	Count in `metadata.snapshots` (max 10)
`metadata`	Standard Iceberg TableMetadata, arrays pruned to 10

Error Format

{"success": false, "errors": [{"code": 10000, "message": "Authentication error"}]}

Standard HTTP codes (401 auth, 403 perms, 404 not enabled/found, 409 conflict).

Iceberg REST Catalog API (via PyIceberg)

Standard Iceberg REST Catalog. Base: https://catalog.cloudflarestorage.com/{ACCOUNT_ID}/{BUCKET}. The /config route needs ?warehouse={WAREHOUSE}.

from pyiceberg.catalog.rest import RestCatalog
catalog = RestCatalog(name="r2", warehouse=WAREHOUSE, uri=CATALOG_URI, token=TOKEN)

Common operations (see PyIceberg docs for full signatures):

catalog.create_namespace_if_not_exists("logs")
catalog.list_tables("logs")
table = catalog.create_table(("logs", "events"), schema=schema)   # pyiceberg.schema.Schema
table = catalog.load_table(("logs", "events"))
table.append(pyarrow_table)          # also .overwrite(...)
table.scan(row_filter="id > 100").to_pandas()

Schema evolution (add nullable columns; widen types only):

with table.update_schema() as u:
    u.add_column("user_id", LongType(), doc="User ID")
    u.rename_column("msg", "message")

Time-travel:

table.scan(snapshot_id=table.snapshots()[-2].snapshot_id)
table.scan(as_of_timestamp=ms_epoch)

Manual Maintenance (PySpark)

spark.sql("CALL r2dc.system.rewrite_data_files(table => 'ns.tbl')")
# Orphan removal REQUIRES S3 credentials (vended creds fail with NoAuthWithAWSException)
spark.sql("CALL r2dc.system.remove_orphan_files(table => 'ns.tbl', older_than => TIMESTAMP '2026-02-28 00:00:00')")

Cloudflare Platform Skill

references/r2-data-catalog/api.md

R2 Data Catalog API Reference

Control-Plane REST API

Get Table (metadata introspection)

Error Format

Iceberg REST Catalog API (via PyIceberg)

Manual Maintenance (PySpark)

See Also

Preparing the source view

Cloudflare Platform Skill

references/r2-data-catalog/api.md

R2 Data Catalog API Reference

Control-Plane REST API

Get Table (metadata introspection)

Error Format

Iceberg REST Catalog API (via PyIceberg)

Manual Maintenance (PySpark)

See Also