Author: xiaojing fang
Last Updated: 2026-01-26
Overview
In this tutorial, you will learn how to configure and use the Gravitino Iceberg REST catalog server. By the end of this guide, you'll have a fully functional Iceberg REST service that enables standard Iceberg clients to interact with Gravitino through HTTP APIs.
Apache Iceberg defines a REST catalog API that clients use to discover and manage Iceberg namespaces and tables. The Gravitino Iceberg REST service implements this API and acts as a proxy so that Iceberg clients can talk to Gravitino through a standard HTTP interface.
Key concepts:
- Iceberg REST catalog: A standard HTTP API for Iceberg operations
- Gravitino Iceberg REST service: Implements the Iceberg REST API and connects to a catalog backend (Hive or JDBC)
- Client flow: Spark or other Iceberg clients point to the REST endpoint and perform namespace/table operations
The REST endpoint base path is http://<host>:<port>/iceberg/, aligned with the Apache Iceberg REST catalog specification.
Why choose Gravitino Iceberg catalog server:
- Standards-compliant API for Iceberg clients without vendor-specific wiring
- Centralized governance for namespaces and tables through a single REST endpoint
- Backend flexibility with Hive or JDBC as the catalog store
- Multi-engine access so Spark, Trino, and other Iceberg clients can share the same catalog
- Security enhancements with credential vending and access control (ACL) when enabled
- Observability and operations via audit logs, metrics, and event listeners
- Performance improvements with table metadata cache and table scan planning cache
Prerequisites
Before starting this tutorial, you will need:
System Requirements:
- Linux or macOS operating system with outbound internet access for downloads
- JDK 17 or higher installed and properly configured
Required Components:
- Gravitino server installed and configured (see Setting up Apache Gravitino from Scratch)
Optional Components:
- Apache Spark with Iceberg runtime JARs for client verification (recommended for testing)
- Hive Metastore service if using Hive catalog backend
- MySQL or PostgreSQL server if using JDBC catalog backend
Before proceeding, verify your Java installation:
${JAVA_HOME}/bin/java -version
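A quick sanity check can parse the version string and confirm you are on JDK 17 or newer. The `ver_line` value below is a sample of typical `java -version` output for illustration; in practice capture it live as shown in the comment:

```shell
# Sample of typical `java -version` output; replace with the live capture:
#   ver_line=$("${JAVA_HOME}/bin/java" -version 2>&1 | head -n1)
ver_line='openjdk version "17.0.9" 2023-10-17'
# The major version is the second field when splitting on quotes/dots.
major=$(printf '%s\n' "$ver_line" | awk -F'["._]' '{print $2}')
if [ "$major" -ge 17 ]; then
  echo "JDK OK (major version $major)"
else
  echo "JDK $major is too old; Gravitino needs JDK 17+" >&2
fi
```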
Architecture overview: Iceberg clients (Spark, Trino, etc.) → Gravitino Iceberg REST service → catalog backend (Hive or JDBC) → table storage (HDFS or S3)
Setup
Step 1: Start a Gravitino server with Iceberg REST service
Use this approach if you want the Iceberg REST service embedded in a full Gravitino server (with Web UI, unified REST APIs, etc.).
Configure Iceberg REST as auxiliary service
1. Install Gravitino server distribution
Follow the previous tutorial 02-setup-guide/README.md to download or build the Gravitino server package.
2. Enable Iceberg REST as an auxiliary service
By default, Gravitino uses the memory catalog backend, which is the simplest option for trying things out; switch to hive or jdbc for production deployments. Configure this in conf/gravitino.conf:
# Enable Iceberg REST service
gravitino.auxService.names = iceberg-rest
gravitino.iceberg-rest.classpath = iceberg-rest-server/libs,iceberg-rest-server/conf
gravitino.iceberg-rest.catalog-backend = memory
gravitino.iceberg-rest.warehouse = /tmp/
3. Start the Gravitino server
./bin/gravitino.sh start
4. Check server logs (optional)
tail -f logs/gravitino-server.log
Step 2: Verify the Iceberg REST endpoint
Test the service endpoint
curl http://localhost:9001/iceberg/v1/config
On success, you should see a JSON response with catalog configuration details.
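The response shape follows the Iceberg REST specification: a JSON object containing `defaults` and `overrides` maps. As a sketch, a small validator for that body could look like this (the sample payload values are illustrative, not what your server will return):

```python
import json

def is_valid_config(payload: bytes) -> bool:
    """Check a /v1/config response body for the spec-defined top-level maps."""
    cfg = json.loads(payload)
    return isinstance(cfg.get("defaults"), dict) and isinstance(cfg.get("overrides"), dict)

# Sample body shaped like the spec's ConfigResponse (illustrative values):
sample = b'{"defaults": {"warehouse": "/tmp/"}, "overrides": {}}'
print(is_valid_config(sample))  # True
```

To probe a live server, fetch the body first, e.g. with `urllib.request.urlopen("http://localhost:9001/iceberg/v1/config")`.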
Step 3: Connect from a client and create a table
Configure your Iceberg client to use the REST catalog. The Spark example below uses the rest catalog type and the REST endpoint above.
Configure Spark with Iceberg REST catalog
1. Start Spark SQL with Iceberg runtime
spark-sql \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.10.1 \
--conf spark.sql.catalog.gravitino=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.gravitino.type=rest \
--conf spark.sql.catalog.gravitino.uri=http://localhost:9001/iceberg
Note: Replace 3.5 in the Iceberg runtime artifact with the Spark version in your environment.
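Since the artifact coordinate encodes both the Spark and Scala versions, it can help to assemble it from variables. `SPARK_MM`, `SCALA_BIN`, and `ICEBERG_VER` below are placeholders; set them to match your own Spark distribution:

```shell
# Placeholders: set these to match your Spark distribution.
SPARK_MM=3.5      # Spark major.minor version
SCALA_BIN=2.12    # Scala binary version of your Spark build
ICEBERG_VER=1.10.1
# Emit the --packages coordinate for the Iceberg Spark runtime.
echo "org.apache.iceberg:iceberg-spark-runtime-${SPARK_MM}_${SCALA_BIN}:${ICEBERG_VER}"
```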
Create and test tables
2. Execute sample SQL operations
In Spark SQL:
USE gravitino;
CREATE NAMESPACE IF NOT EXISTS demo;
CREATE TABLE demo.events (
id BIGINT,
event_type STRING,
ts TIMESTAMP
) USING iceberg;
INSERT INTO demo.events VALUES (1, 'click', TIMESTAMP '2024-01-01 10:00:00');
SELECT * FROM demo.events;
Catalog Backend Configuration Examples
These examples show how to configure different catalog backends with various storage options. Update the following configuration in conf/gravitino.conf.
Hive Catalog Backend with HDFS
For environments with existing Hive infrastructure:
# Hive backend configuration
gravitino.iceberg-rest.catalog-backend = hive
gravitino.iceberg-rest.uri = thrift://127.0.0.1:9083
gravitino.iceberg-rest.warehouse = hdfs://127.0.0.1:9000/user/hive/warehouse-hive
JDBC Catalog Backend with HDFS
For environments preferring direct database storage:
# JDBC backend configuration
gravitino.iceberg-rest.catalog-backend = jdbc
gravitino.iceberg-rest.jdbc-driver = org.postgresql.Driver
gravitino.iceberg-rest.uri = jdbc:postgresql://127.0.0.1:5432/postgres
gravitino.iceberg-rest.warehouse = hdfs://127.0.0.1:9000/user/hive/warehouse-jdbc
gravitino.iceberg-rest.jdbc-user = YOUR_DB_USER
gravitino.iceberg-rest.jdbc-password = YOUR_DB_PASSWORD
gravitino.iceberg-rest.jdbc-initialize = true
Configuration notes:
- Place the JDBC driver jar in iceberg-rest-server/libs so the Iceberg REST service can load it
- For MySQL, use com.mysql.cj.jdbc.Driver and update the JDBC URL, user, and password accordingly
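Under the MySQL driver noted above, the equivalent JDBC fragment might look like this sketch (the database name `iceberg_catalog` is a hypothetical example; use your own):

```properties
# JDBC backend configuration (MySQL variant)
gravitino.iceberg-rest.catalog-backend = jdbc
gravitino.iceberg-rest.jdbc-driver = com.mysql.cj.jdbc.Driver
gravitino.iceberg-rest.uri = jdbc:mysql://127.0.0.1:3306/iceberg_catalog
gravitino.iceberg-rest.warehouse = hdfs://127.0.0.1:9000/user/hive/warehouse-jdbc
gravitino.iceberg-rest.jdbc-user = YOUR_DB_USER
gravitino.iceberg-rest.jdbc-password = YOUR_DB_PASSWORD
gravitino.iceberg-rest.jdbc-initialize = true
```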
Hive Catalog Backend with S3
For cloud-native deployments with S3 storage:
# Hive backend with S3 storage
gravitino.iceberg-rest.catalog-backend = hive
gravitino.iceberg-rest.uri = thrift://127.0.0.1:9083
gravitino.iceberg-rest.warehouse = s3a://my-bucket/iceberg-warehouse
# S3 configuration
gravitino.iceberg-rest.io-impl = org.apache.iceberg.aws.s3.S3FileIO
gravitino.iceberg-rest.s3-access-key-id = YOUR_ACCESS_KEY
gravitino.iceberg-rest.s3-secret-access-key = YOUR_SECRET_KEY
gravitino.iceberg-rest.s3-region = us-west-2
gravitino.iceberg-rest.credential-providers = s3-secret-key
S3 configuration notes:
- Update your S3 access key ID, secret access key, and region code properly
JDBC Catalog Backend with S3
Combining JDBC metadata storage with S3 data storage:
# JDBC backend with S3 storage
gravitino.iceberg-rest.catalog-backend = jdbc
gravitino.iceberg-rest.jdbc-driver = org.postgresql.Driver
gravitino.iceberg-rest.uri = jdbc:postgresql://127.0.0.1:5432/postgres
gravitino.iceberg-rest.warehouse = s3://my-bucket/iceberg-warehouse
gravitino.iceberg-rest.jdbc-user = YOUR_DB_USER
gravitino.iceberg-rest.jdbc-password = YOUR_DB_PASSWORD
gravitino.iceberg-rest.jdbc-initialize = true
# S3 configuration
gravitino.iceberg-rest.io-impl = org.apache.iceberg.aws.s3.S3FileIO
gravitino.iceberg-rest.s3-access-key-id = YOUR_ACCESS_KEY
gravitino.iceberg-rest.s3-secret-access-key = YOUR_SECRET_KEY
gravitino.iceberg-rest.s3-region = us-west-2
gravitino.iceberg-rest.credential-providers = s3-secret-key
Additional S3 Setup Requirements
1. Install required dependencies
Besides any JDBC driver jar you need, place the gravitino-iceberg-aws-bundle jar in the Iceberg REST service classpath (iceberg-rest-server/libs).
Download it from: https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-iceberg-aws-bundle
2. Configure Spark for S3 access
spark-sql \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.10.1,org.apache.gravitino:gravitino-iceberg-aws-bundle:1.1.0 \
--conf spark.sql.catalog.gravitino=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.gravitino.type=rest \
--conf spark.sql.catalog.gravitino.uri=http://localhost:9001/iceberg \
--conf spark.sql.catalog.gravitino.header.X-Iceberg-Access-Delegation=vended-credentials
Troubleshooting
Common issues and their solutions:
Service connectivity issues:
- curl returns 404: Verify the Iceberg REST base path is /iceberg and the port matches gravitino.iceberg-rest.httpPort
- Service not running: Check logs/gravitino-server.log and logs/gravitino-server.out for startup errors
Backend connection issues:
- Catalog backend connection errors: Confirm JDBC URL, username/password, and JDBC driver jar availability
- Warehouse errors: Validate the warehouse path exists and the service user can access it
Client connection issues:
- Spark fails to connect: Ensure the REST URL is reachable from the Spark driver and spark.sql.catalog.<name>.type=rest is set
- Spark can't find Iceberg classes: Add the matching Iceberg Spark runtime jar via --packages or spark.jars
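To triage the connectivity symptoms above quickly, a small shell helper can map the HTTP status of the /v1/config probe to a likely cause. This is a sketch; the host and port in the commented probe are assumptions to adjust for your deployment:

```shell
# Map an HTTP status from the /v1/config probe to a likely cause.
triage() {
  case "$1" in
    200)     echo "endpoint OK" ;;
    404)     echo "wrong base path (expect /iceberg/v1/config) or REST service not enabled" ;;
    401|403) echo "authentication/authorization rejected the request" ;;
    000|"")  echo "server unreachable: check process status and logs" ;;
    *)       echo "unexpected HTTP status $1" ;;
  esac
}

# Live probe (adjust host/port to your deployment):
#   triage "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:9001/iceberg/v1/config)"
triage 404   # demonstrates the 404 hint
```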
Congratulations
You have successfully completed the Gravitino Iceberg REST catalog server configuration tutorial!
You now have a fully functional Iceberg REST service with:
- A configured Iceberg REST endpoint running on port 9001
- A catalog backend configured for your storage environment
- Verified client connectivity through Apache Spark
- Understanding of various backend and storage configuration options
Your Gravitino Iceberg REST service is ready to serve Iceberg clients across your data ecosystem.
Further Reading
For more advanced configurations and detailed documentation:
- Review the Iceberg REST service documentation for advanced options like cloud storage support, access control, and credential vending: Gravitino Iceberg REST Service
- Read the Gravitino Iceberg REST Catalog blog for background and deployment notes: Gravitino Iceberg REST Catalog Service Blog
- Iceberg REST catalog API specification: Apache Iceberg REST API
Next Steps
- Continue reading "Setup Lance Catalog" (in progress)
- Follow and star Apache Gravitino Repository
Apache Gravitino is rapidly evolving; this article is based on version 1.1.0, the latest at the time of writing. If you encounter issues, refer to the official documentation or file an issue on GitHub.
