Datastrato for Apache Gravitino

Setting up Apache Gravitino from Scratch

Author: Danhua Wang

Last Updated: [2026-01-12]

Overview

In this tutorial, you will learn how to install and configure Apache Gravitino from scratch. By the end of this guide, you'll have a fully functional Gravitino server running with your chosen storage backend.

What you'll accomplish:

  • Install Apache Gravitino from source or pre-built binaries and configure the basic server setup
  • Configure storage backends including H2 for development and MySQL/PostgreSQL for production environments
  • Configure Gravitino Server including web server, cache and access control configurations
  • Verify the installation by testing the server endpoints and Web UI to ensure everything is working correctly

Prerequisites

Before starting this tutorial, you will need:

System Requirements:

  • Linux or macOS operating system with outbound internet access for downloads
  • Minimum Production Environment: 4 CPU cores, 16GB RAM
  • Minimum Development Environment: 2 CPU cores, 8GB RAM

Java Development Kit:

  • JDK 17 or higher installed and properly configured

Optional Components:

  • MySQL or PostgreSQL server installed and properly configured, if you choose either as your storage backend

Before proceeding, verify your Java installation:

${JAVA_HOME}/bin/java -version
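To turn the manual check into something scriptable, the following is a minimal sketch that extracts the major version from the `java -version` output and compares it with the required minimum of 17. The function names `java_major_version` and `java_at_least` are hypothetical helpers, not part of Gravitino.

```shell
# Extract the major version from a Java version string.
# Handles both the legacy "1.8.0_392" scheme and the modern "17.0.9" scheme.
java_major_version() {
  local version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  # Keep only the leading digits (drops ".0.9", "-ea", "_392", ...)
  printf '%s\n' "${version%%[!0-9]*}"
}

# Succeed if the JDK under JAVA_HOME (or on PATH) is at least major version $1.
java_at_least() {
  local required="$1" java_bin="${JAVA_HOME:+$JAVA_HOME/bin/}java" raw major
  # `java -version` prints to stderr, e.g.: openjdk version "17.0.9" 2023-10-17
  raw="$("$java_bin" -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')" || return 1
  major="$(java_major_version "$raw")"
  [ -n "$major" ] && [ "$major" -ge "$required" ]
}
```

A typical call would be `java_at_least 17 || echo "JDK 17 or higher is required"`.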

Setup

Step 1: Obtain Gravitino Binary

You have two options for obtaining Apache Gravitino: downloading a pre-built release or building from source.

Option 1: Download Pre-built Release

1. Download the latest release

Navigate to the Apache Gravitino GitHub Releases page and download the latest release tarball.
For example, to download version 1.1.0, run:

wget https://github.com/apache/gravitino/releases/download/v1.1.0/gravitino-1.1.0-bin.tar.gz
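Before extracting the tarball, it is worth checking its integrity. The sketch below compares a file's SHA-512 digest with an expected value. It assumes you copy the expected digest by hand from wherever the release publishes its checksums; the helper names `sha512_of` and `verify_sha512` are hypothetical.

```shell
# Compute the SHA-512 digest of a file using whichever tool is available.
sha512_of() {
  if command -v sha512sum >/dev/null 2>&1; then
    sha512sum "$1" | awk '{print $1}'
  else
    shasum -a 512 "$1" | awk '{print $1}'   # macOS ships shasum instead
  fi
}

# Succeed only if the file's digest equals the expected hex string.
verify_sha512() {
  local file="$1" expected="$2" actual
  actual="$(sha512_of "$file")" || return 1
  # Compare case-insensitively, since published digests are sometimes upper-case
  [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]
}
```

For example: `verify_sha512 gravitino-1.1.0-bin.tar.gz "<expected_digest>" || echo "checksum mismatch"`.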

2. Extract the package

tar -xzf gravitino-1.1.0-bin.tar.gz
cd gravitino-1.1.0-bin

Option 2: Build from Source

If you prefer to build from source or need the latest development features, see How to Build Gravitino for detailed build instructions.

Understanding the Package Structure

After obtaining the binary, familiarize yourself with the directory layout:

gravitino-<version>-bin/
├── bin/                                    # Launch scripts
│   ├── gravitino.sh                        # Main server launcher
│   ├── gravitino-iceberg-rest-server.sh    # Iceberg REST server launcher
│   └── gravitino-lance-rest-server.sh      # Lance REST server launcher
├── conf/                                   # Configuration files
│   ├── gravitino.conf                      # Main server configuration
│   ├── gravitino-iceberg-rest-server.conf  # Iceberg REST configuration
│   ├── gravitino-lance-rest-server.conf    # Lance REST configuration
│   ├── gravitino-env.sh                    # Environment variables
│   └── log4j2.properties                   # Logging configuration
├── catalogs/                               # Catalog-specific configurations
├── libs/                                   # Server dependencies
├── iceberg-rest-server/                    # Iceberg REST server package
├── lance-rest-server/                      # Lance REST server package
├── logs/                                   # Log files (created at runtime)
├── data/                                   # Default data storage
├── scripts/                                # Database initialization scripts
└── web/                                    # Frontend package
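A quick way to confirm that the package was extracted completely is to check for a few of the entries listed above. This is a minimal sketch; `missing_layout_entries` is a hypothetical helper, and the list of entries it checks is a small subset chosen for illustration.

```shell
# Print each expected entry that is missing under the given Gravitino home.
# The list mirrors part of the layout shown above; adjust it for your version.
missing_layout_entries() {
  local home="$1" entry
  for entry in bin/gravitino.sh conf/gravitino.conf conf/gravitino-env.sh libs scripts; do
    [ -e "$home/$entry" ] || printf '%s\n' "$entry"
  done
}
```

Running `missing_layout_entries gravitino-1.1.0-bin` prints nothing when all checked entries are present.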

Step 2: Plan Your Storage Backend

Choose the appropriate storage backend for your deployment scenario.

Development/Testing: H2 Database (Default)

For development and testing environments, H2 provides a quick setup:

  • Pros: Embedded database, no external dependencies, works out-of-the-box
  • Cons: Not suitable for production, no data consistency guarantees
  • Configuration: No additional setup required

Production: MySQL

For production environments, MySQL is the recommended choice:

1. Install and configure MySQL server

2. Create database and user

CREATE DATABASE gravitino;
CREATE USER 'gravitino'@'%' IDENTIFIED BY 'your_password';
GRANT ALL PRIVILEGES ON gravitino.* TO 'gravitino'@'%';
FLUSH PRIVILEGES;

3. Initialize database schema

mysql -h <mysql_ip_address> -u gravitino -D gravitino -p < scripts/mysql/schema-1.1.0-mysql.sql

Production: PostgreSQL

As an alternative production option:

1. Install and configure PostgreSQL server

2. Create database and user

CREATE DATABASE gravitino;
CREATE USER gravitino WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE gravitino TO gravitino;

3. Initialize database schema

psql -h <postgres_ip_address> -U gravitino -d gravitino -f scripts/postgresql/schema-1.1.0-postgresql.sql
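Both initialization commands reference a schema script whose name embeds the backend and the Gravitino version. The sketch below builds that path from the two values. It assumes the naming pattern seen in the two commands above holds for other versions too, which you should confirm by listing the scripts/ directory; `schema_script_path` is a hypothetical helper.

```shell
# Build the schema script path for a backend ("mysql" or "postgresql") and version,
# following the naming pattern used in the commands above.
schema_script_path() {
  local backend="$1" version="$2"
  case "$backend" in
    mysql|postgresql)
      printf 'scripts/%s/schema-%s-%s.sql\n' "$backend" "$version" "$backend" ;;
    *)
      echo "unsupported backend: $backend" >&2
      return 1 ;;
  esac
}
```

For example, `[ -f "$(schema_script_path mysql 1.1.0)" ] || echo "schema script not found"` catches a version mismatch before you run the import.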

Step 3: Configure Gravitino Server

Configure the main server settings in the conf/gravitino.conf file.

Basic Server Configuration

1. Configure HTTP server settings

# HTTP Server Configuration
gravitino.server.webserver.host = 0.0.0.0
gravitino.server.webserver.httpPort = 8090
gravitino.server.webserver.minThreads = 24
gravitino.server.webserver.maxThreads = 200

2. Configure storage backend

For H2 (Development):

# Storage Backend Configuration
gravitino.entity.store = relational
gravitino.entity.store.relational = JDBCBackend
gravitino.entity.store.relational.jdbcUrl = jdbc:h2
gravitino.entity.store.relational.jdbcDriver = org.h2.Driver
gravitino.entity.store.relational.jdbcUser = gravitino
gravitino.entity.store.relational.jdbcPassword = gravitino

For MySQL (Production):

# Configure for MySQL
gravitino.entity.store.relational.jdbcUrl = jdbc:mysql://<mysql_ip_address>:3306/gravitino
gravitino.entity.store.relational.jdbcDriver = com.mysql.cj.jdbc.Driver
gravitino.entity.store.relational.jdbcUser = gravitino
gravitino.entity.store.relational.jdbcPassword = <your_password>
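If you script your deployment, editing conf/gravitino.conf by hand becomes error-prone. The following is a minimal sketch of an idempotent setter for `key = value` entries. `set_conf` is a hypothetical helper, not a Gravitino tool, and it assumes one setting per line with `#` marking comments.

```shell
# Set KEY to VALUE in a "key = value" style conf file.
# An existing uncommented entry is replaced in place; otherwise the pair is appended.
set_conf() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  # Pass key and value through the environment so awk does not interpret backslashes
  CONF_KEY="$key" CONF_VALUE="$value" awk '
    BEGIN { k = ENVIRON["CONF_KEY"]; v = ENVIRON["CONF_VALUE"]; done = 0 }
    {
      pos = index($0, "=")
      if (pos > 0 && $0 !~ /^[ \t]*#/) {
        name = substr($0, 1, pos - 1)
        gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim spaces around the key
        if (name == k) {
          if (!done) { print k " = " v; done = 1 }
          next                                  # drop duplicate entries
        }
      }
      print
    }
    END { if (!done) print k " = " v }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Overwrite the contents rather than moving, to keep the original file permissions
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example: `set_conf conf/gravitino.conf gravitino.entity.store.relational.jdbcDriver com.mysql.cj.jdbc.Driver`. Running the same call twice leaves the file unchanged after the first run.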

Optional Performance Configuration

1. Enable caching for better performance

Caching improves performance for authorization operations and metadata lookups in particular:

  • Authorization Performance: Reduces latency for permission checks by caching user roles, privileges, and access control decisions
  • Metadata Retrieval: Speeds up frequent catalog, schema, and table metadata queries by avoiding repeated database lookups

# Enable caching for better performance
gravitino.cache.enabled = true
gravitino.cache.implementation = caffeine
gravitino.cache.maxEntries = 10000
gravitino.cache.expireTimeInMs = 3600000

Optional Access Control Configuration

Configure authorization

Gravitino includes built-in metadata authorization that you can enable with the following configuration:

# Enable access control
gravitino.authorization.enable = true
gravitino.authorization.serviceAdmins = admin,gravitino

gravitino.authorization.serviceAdmins defines service administrators who are responsible for creating metalakes.

When a service admin creates a metalake, they automatically become the owner. As the owner, they have full control over the metalake, including the ability to drop it. Ownership can be transferred to another user if needed.
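Because only service administrators can create metalakes, it can help to confirm that the account you plan to use is actually listed before you start the server. This is a minimal sketch that reads the comma-separated list from the conf file; `is_service_admin` is a hypothetical helper and assumes the setting is on a single line.

```shell
# Succeed if USER appears in the comma-separated serviceAdmins list of the conf file.
is_service_admin() {
  local file="$1" user="$2" admins
  [ -n "$user" ] || return 1
  # Take the value of the last uncommented serviceAdmins line
  admins="$(awk -F'=' '
    /^[ \t]*gravitino\.authorization\.serviceAdmins[ \t]*=/ { v = $2 }
    END { print v }' "$file")"
  # Strip whitespace, then wrap in commas so "admin" cannot match "admin2"
  admins=",$(printf '%s' "$admins" | tr -d ' \t'),"
  case "$admins" in
    *",$user,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example: `is_service_admin conf/gravitino.conf admin && echo "admin can create metalakes"`.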

For comprehensive access control documentation, see Access Control.

Configure authentication

Apache Gravitino supports three authentication mechanisms: simple, OAuth, and Kerberos. Upon successful authentication, user identities from any of these methods are directly mapped to authorization principals to govern access control decisions.

  • Default Behavior: If authentication is not explicitly configured, Gravitino defaults to simple authentication mode.
  • Login Method: Use the service administrators specified in the gravitino.authorization.serviceAdmins configuration to log in.

See How to authenticate for detailed authentication setup.

Environment Configuration

Configure environment variables in conf/gravitino-env.sh:

# JVM Memory Settings
export GRAVITINO_MEM="-Xms4g -Xmx4g -XX:MaxMetaspaceSize=1g"

# Debug Options (uncomment for debugging)
# export GRAVITINO_DEBUG_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000 -Dlog4j2.debug=true"
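The HotSpot JVM refuses to start when it meets an unrecognized -XX option, so a typo in GRAVITINO_MEM can keep the server from launching. One way to catch this early is to let the JVM parse the options and exit immediately. The sketch below does that with `java <options> -version`; `check_jvm_opts` is a hypothetical helper, and note that the JVM also has to be able to reserve the requested heap for the check to pass.

```shell
# Ask the JVM to parse the given options and exit; a misspelled -XX flag makes it fail.
check_jvm_opts() {
  local opts="$1" java_bin="${JAVA_HOME:+$JAVA_HOME/bin/}java"
  # $opts is intentionally unquoted so that it splits into separate flags
  # shellcheck disable=SC2086
  "$java_bin" $opts -version >/dev/null 2>&1
}
```

For example, after sourcing conf/gravitino-env.sh: `check_jvm_opts "$GRAVITINO_MEM" || echo "JVM rejected GRAVITINO_MEM"`.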

See Apache Gravitino server configurations for detailed server configurations.

Step 4: Optional REST services for enhanced functionality

You can run the Iceberg REST and Lance REST services either as auxiliary services started together with the Gravitino server or as standalone services. Detailed guides for both follow in subsequent articles.

# Enable Iceberg REST/Lance REST as auxiliary service
gravitino.auxService.names = iceberg-rest,lance-rest

Step 5: Start and Verify Installation

Launch the Gravitino server and verify the installation.

Start Gravitino Server

1. Start the server in daemon mode

./bin/gravitino.sh start

2. Check server status

./bin/gravitino.sh status

3. View server logs

tail -f logs/gravitino-server.log
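Since the server starts in the background, scripts that continue with API calls may run before it is ready. The following is a minimal sketch of a readiness loop that polls an HTTP endpoint, such as the /api/version endpoint used in the verification step, until it responds or a timeout expires. `wait_for_http` is a hypothetical helper, and the default 60-second timeout is an arbitrary choice.

```shell
# Poll a URL until it answers with a successful HTTP status or the timeout
# (in seconds, counted as one-second sleeps between attempts) expires.
wait_for_http() {
  local url="$1" timeout="${2:-60}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # -f: fail on HTTP errors, -s: silent, --max-time: cap each attempt
    if curl -fs --max-time 2 -o /dev/null "$url"; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "timed out after ${timeout}s waiting for $url" >&2
  return 1
}
```

For example: `./bin/gravitino.sh start && wait_for_http http://localhost:8090/api/version 60`.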

Verify Installation

1. Check server health

curl -v -X GET \
  -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  http://localhost:8090/api/version

On success, the response looks like this:

{"code":0,"version":{"version":"1.1.0","compileDate":"12/12/2025 12:38:33","gitCommit":"5a6b5ae772d50aff98878ae3659fba3598a9027f"}}
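To compare the running version with the one you installed, you can pull the version string out of that response. This is a minimal sketch that uses sed and relies on the compact response shape shown above; `gravitino_version_from_json` is a hypothetical helper, and a real JSON parser such as jq is the safer choice if it is installed.

```shell
# Read the /api/version JSON response on stdin and print the inner "version" string.
# The pattern only matches a "version" key whose value is a quoted string,
# so the outer "version" object is skipped.
gravitino_version_from_json() {
  sed -n 's/.*"version":"\([^"]*\)".*/\1/p'
}
```

For example: `curl -s http://localhost:8090/api/version | gravitino_version_from_json`.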

2. Access Web UI

Open your browser and navigate to http://localhost:8090 to access the Gravitino Web UI.

With simple authentication and access control enabled, the Web UI opens on the default login page.

If access control is disabled, the Web UI goes directly to the metalake management page.

3. Verify auxiliary services (if enabled)

# Check Iceberg REST service
curl http://localhost:9001/iceberg/v1/config

# Check Lance REST service
curl http://localhost:9101/lance/v1/namespace/%24/list

Create Sample Metadata

Test your installation by creating sample metadata objects.

Create your first metalake

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -d '{"name": "my_metalake", "comment": "My first metalake"}' \
  http://localhost:8090/api/metalakes

Note: If you have enabled access control, you need to add the Authorization header to the command (using username 'admin' and password '123'):

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:123' | base64)" \
  -d '{"name": "my_metalake", "comment": "My first metalake"}' \
  http://localhost:8090/api/metalakes

Create a sample catalog

Note: This example creates a Hive catalog. Before proceeding, ensure you have a running Hive cluster with Hive Metastore service accessible. If you don't have a Hive cluster, you can use a different catalog type (such as MySQL catalog).

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "catalog_hive",
    "type": "relational",
    "provider": "hive",
    "comment": "My Hive catalog",
    "properties": {
      "metastore.uris": "thrift://<hive_metastore_host>:<port>"
    }
  }' \
  http://localhost:8090/api/metalakes/my_metalake/catalogs

Manage catalogs in the Web UI

Create a catalog:

View/Manage all catalogs:

Congratulations

You have successfully completed the Apache Gravitino setup tutorial!

You now have a fully functional Apache Gravitino installation with:

  • A configured metadata server running on port 8090
  • A storage backend configured for your environment
  • Optional auxiliary REST services for Iceberg and Lance integration
  • Sample metadata objects to verify functionality

Your Apache Gravitino server is ready to manage metadata across your data ecosystem.

Next Steps


Apache Gravitino is evolving rapidly, and this article is based on version 1.1.0, the latest release at the time of writing. If you run into problems, refer to the official documentation or open an issue on GitHub.
