1. Overview
1.1 Product Introduction
Apache Gravitino IRC (Iceberg REST Catalog) is an Iceberg REST catalog service built on Gravitino that provides unified Iceberg table management. Starting from v1.1.0, Gravitino IRC supports access control for Iceberg tables.
1.2 Key Features
- ✅ Table operation authorization
- ✅ Multi-tenancy support
- ✅ RESTful API interface
- ✅ Seamless integration with Spark
- ✅ Role-based access control (RBAC)
1.3 Current Status
Table-level operation authorization is supported today; finer-grained access control features are planned for future releases.
2. System Architecture
2.1 Architecture Diagram
(Diagram omitted. In short: Spark clients talk to the Iceberg REST Service on port 9002, which calls the Gravitino Server on port 8090 for catalog and permission information; both store metadata in MySQL, and table data lives in object storage.)
2.2 Component Description
- Gravitino Server: Core metadata service, primarily managing table permission information in this scenario; port 8090
- Iceberg REST Service: Iceberg REST catalog service that connects to Gravitino Server via API to retrieve permission information; port 9002
- MySQL: Metadata storage for both Gravitino and IRC
- Object Storage: Data file storage
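Once both services are running, a generic TCP probe is enough to confirm they are listening. A small Python sketch (the host name and ports are the defaults from the component list above; adjust for your deployment):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Default ports from the component list above.
for name, port in [("Gravitino Server", 8090), ("Iceberg REST Service", 9002)]:
    status = "up" if port_open("localhost", port) else "down"
    print(f"{name} (port {port}): {status}")
```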
3. Environment Requirements
3.1 System Requirements
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 4 cores | 8 cores |
| Memory | 8GB | 16GB |
| Disk | 100GB | 500GB |
| Network | Gigabit | 10 Gigabit |
3.2 Software Dependencies
| Software | Version | Notes |
|---|---|---|
| Java | JDK 17+ | Required |
| MySQL | 5.7+ | Metadata storage |
| Spark | 3.4+ | Optional, client |
4. Configuration
4.1 Core Configuration File
Create a gravitino.conf file under $GRAVITINO_HOME/conf:
```properties
# ============================================
# Gravitino Service Basic Configuration
# ============================================
# Service shutdown timeout (ms)
gravitino.server.shutdown.timeout = 3000

# ============================================
# Web Server Configuration
# ============================================
# Web server host address
gravitino.server.webserver.host = 0.0.0.0
# HTTP port
gravitino.server.webserver.httpPort = 8090
# Minimum threads
gravitino.server.webserver.minThreads = 24
# Maximum threads
gravitino.server.webserver.maxThreads = 200
# Stop timeout (ms)
gravitino.server.webserver.stopTimeout = 30000
# Idle timeout (ms)
gravitino.server.webserver.idleTimeout = 30000

# ============================================
# Entity Store Configuration (MySQL)
# ============================================
gravitino.entity.store = relational
gravitino.entity.store.relational = JDBCBackend
gravitino.entity.store.relational.jdbcUrl = jdbc:mysql://192.168.194.152:3306/gravitino
gravitino.entity.store.relational.jdbcDriver = com.mysql.cj.jdbc.Driver
gravitino.entity.store.relational.jdbcUser = gravitino
gravitino.entity.store.relational.jdbcPassword = gravitino

# ============================================
# Cache Configuration
# ============================================
gravitino.cache.enabled = true
gravitino.cache.maxEntries = 10000
gravitino.cache.expireTimeInMs = 3600000
gravitino.cache.enableWeigher = true
gravitino.cache.implementation = caffeine

# ============================================
# Authorization Configuration
# ============================================
gravitino.authorization.enable = true
gravitino.authorization.impl = org.apache.gravitino.server.authorization.jcasbin.JcasbinAuthorizer
# Admin account allowed to create metalakes
gravitino.authorization.serviceAdmins = admin
gravitino.authenticators = simple

# ============================================
# Iceberg REST Service Configuration
# ============================================
gravitino.auxService.names = iceberg-rest
gravitino.iceberg-rest.classpath = iceberg-rest-server/libs,iceberg-rest-server/conf
gravitino.iceberg-rest.host = 0.0.0.0
gravitino.iceberg-rest.httpPort = 9002
gravitino.iceberg-rest.catalog-config-provider = dynamic-config-provider
gravitino.iceberg-rest.gravitino-uri = http://localhost:8090/
# Metalake to serve
gravitino.iceberg-rest.gravitino-metalake = my_metalake
# User the IRC service authenticates as when fetching catalog info
gravitino.iceberg-rest.gravitino-simple.user-name = rest-catalog
gravitino.iceberg-rest.default-catalog-name = catalog_iceberg
```
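As a quick sanity check before starting the server, a config file in this simple `key = value` format can be parsed and validated with a few lines of Python. This is a sketch, not anything Gravitino ships; the required-key list is just a sample drawn from the configuration above:

```python
def parse_conf(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and # comment lines."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf

# A few keys this deployment depends on (illustrative, not exhaustive).
REQUIRED = [
    "gravitino.entity.store.relational.jdbcUrl",
    "gravitino.authorization.enable",
    "gravitino.iceberg-rest.gravitino-metalake",
]

sample = """
gravitino.entity.store = relational
gravitino.entity.store.relational.jdbcUrl = jdbc:mysql://192.168.194.152:3306/gravitino
gravitino.authorization.enable = true
gravitino.iceberg-rest.gravitino-metalake = my_metalake
"""

conf = parse_conf(sample)
missing = [k for k in REQUIRED if k not in conf]
print(missing)  # → []
```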
5. Deployment Process
5.1 Database Initialization
```shell
# Navigate to the bundled SQL scripts
cd distribution/package/scripts

# Execute the schema SQL for your database type (MySQL example)
mysql -h <host> -u <user> -p -D <database> < xxx.sql
```
5.2 Download Dependencies
```shell
# Download the MySQL driver
cd $GRAVITINO_HOME
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.27/mysql-connector-java-8.0.27.jar
cp mysql-connector-java-8.0.27.jar libs/
cp mysql-connector-java-8.0.27.jar catalogs/lakehouse-iceberg/libs
cp mysql-connector-java-8.0.27.jar iceberg-rest-server/libs

# Copy bundle jar files (these paths assume a source build of Gravitino)
cp bundles/aws-bundle/build/libs/*.jar distribution/package/catalogs/lakehouse-iceberg/libs
cp bundles/aws-bundle/build/libs/*.jar distribution/package/iceberg-rest-server/libs

# Download the Iceberg AWS bundle
wget https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-aws-bundle/1.9.2/iceberg-aws-bundle-1.9.2.jar
cp iceberg-aws-bundle-1.9.2.jar distribution/package/iceberg-rest-server/libs
cp iceberg-aws-bundle-1.9.2.jar distribution/package/catalogs/lakehouse-iceberg/libs
```
5.3 Start Services
```shell
# Start the Gravitino service (from $GRAVITINO_HOME)
./bin/gravitino.sh start

# Check service status
./bin/gravitino.sh status
```
5.4 Create Metalake
If you haven't created a metalake yet, use the API (or web UI) to create one named my_metalake:
```shell
# Create Metalake with admin privileges
curl -X POST -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{
    "name": "my_metalake",
    "comment": "",
    "properties": {}
  }' http://localhost:8090/api/metalakes
```
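The Authorization header in these calls is plain HTTP Basic auth: `Basic` followed by the Base64 of `user:password`, which is exactly what the `$(echo -n 'admin:password' | base64)` substitution produces. A small Python sketch of the same construction (the credentials are the placeholders from the curl example):

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the HTTP Authorization header value for Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

print(basic_auth_header("admin", "password"))  # → Basic YWRtaW46cGFzc3dvcmQ=
```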
5.5 Create Iceberg Catalog
Register an Iceberg catalog in Gravitino. It must point at the same catalog backend (such as HMS or JDBC) and warehouse as the running Iceberg REST Service, so both see the same tables:
```shell
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{
    "name": "catalog_iceberg",
    "type": "RELATIONAL",
    "provider": "lakehouse-iceberg",
    "comment": "Iceberg catalog",
    "properties": {
      "uri": "jdbc:mysql://mysql-host:3306/iceberg_db",
      "catalog-backend": "jdbc",
      "warehouse": "s3://bucket/iceberg/warehouse/",
      "jdbc-user": "mysql_user",
      "jdbc-password": "mysql_password",
      "jdbc-driver": "com.mysql.cj.jdbc.Driver",
      "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
      "s3-secret-access-key": "your_secret_key",
      "s3-access-key-id": "your_access_key",
      "s3-region": "ap-southeast-1",
      "authentication.type": "simple",
      "credential-providers": "s3-token",
      "s3-endpoint": "http://s3.ap-southeast-1.amazonaws.com",
      "jdbc-initialize": "true",
      "s3-role-arn": "arn:aws:iam::730335553010:role/sts_s3_access_role"
    }
  }' http://localhost:8090/api/metalakes/my_metalake/catalogs
```
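The properties block is easy to get subtly wrong: a JDBC backend needs the `jdbc-*` keys, and `S3FileIO` needs the `s3-*` keys. A hedged pre-flight check in Python; the required-key lists are assumptions drawn from this example, not an exhaustive rule from the Gravitino docs:

```python
def check_catalog_props(props: dict) -> list[str]:
    """Return keys that look missing for this catalog configuration."""
    missing = []
    if props.get("catalog-backend") == "jdbc":
        # JDBC backend needs connection details.
        for key in ("uri", "jdbc-user", "jdbc-password", "jdbc-driver"):
            if key not in props:
                missing.append(key)
    if props.get("io-impl", "").endswith("S3FileIO"):
        # S3 file IO needs a warehouse location and credentials.
        for key in ("warehouse", "s3-access-key-id", "s3-secret-access-key", "s3-region"):
            if key not in props:
                missing.append(key)
    return missing

props = {
    "uri": "jdbc:mysql://mysql-host:3306/iceberg_db",
    "catalog-backend": "jdbc",
    "warehouse": "s3://bucket/iceberg/warehouse/",
    "jdbc-user": "mysql_user",
    "jdbc-password": "mysql_password",
    "jdbc-driver": "com.mysql.cj.jdbc.Driver",
    "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    "s3-access-key-id": "your_access_key",
    "s3-secret-access-key": "your_secret_key",
    "s3-region": "ap-southeast-1",
}
print(check_catalog_props(props))  # → []
```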
6. Access Control Management
Next, we will use Gravitino's RBAC permission model to configure access control for the Iceberg Catalog.
6.1 Permission Model
Gravitino provides the following privileges related to catalog/schema/table:
| Privilege Type | Description | Applicable Objects |
|---|---|---|
| USE_CATALOG | Permission to use catalog | Catalog |
| USE_SCHEMA | Permission to use schema | Schema, Catalog |
| SELECT_TABLE | Permission to query table | Table, Schema, Catalog |
| MODIFY_TABLE | Permission to modify table | Table, Schema, Catalog |
| CREATE_TABLE | Permission to create table | Schema, Catalog |
| CREATE_SCHEMA | Permission to create schema | Catalog |
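The "Applicable Objects" column means a privilege granted on a broader object also covers the objects inside it: for example, SELECT_TABLE granted at the schema level applies to every table in that schema. A minimal Python sketch of that resolution logic, as an illustration only, not Gravitino's actual implementation:

```python
# Which object levels can satisfy each privilege, mirroring the
# "Applicable Objects" column in the table above.
APPLICABLE = {
    "SELECT_TABLE": ("TABLE", "SCHEMA", "CATALOG"),
    "MODIFY_TABLE": ("TABLE", "SCHEMA", "CATALOG"),
    "CREATE_TABLE": ("SCHEMA", "CATALOG"),
}

def allowed(grants: set, privilege: str, table_fqn: str) -> bool:
    """grants: set of (privilege, object_fullname) pairs.
    table_fqn: 'catalog.schema.table'. Checks the table itself,
    then its schema, then its catalog."""
    catalog, schema, _ = table_fqn.split(".")
    scopes = {
        "TABLE": table_fqn,
        "SCHEMA": f"{catalog}.{schema}",
        "CATALOG": catalog,
    }
    return any(
        (privilege, scopes[level]) in grants
        for level in APPLICABLE.get(privilege, ())
    )

grants = {("SELECT_TABLE", "catalog_iceberg.schema1")}  # schema-level grant
print(allowed(grants, "SELECT_TABLE", "catalog_iceberg.schema1.table1"))  # → True
print(allowed(grants, "MODIFY_TABLE", "catalog_iceberg.schema1.table1"))  # → False
```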
6.2 Create Roles and Permissions
Create a role named data_reader with privileges on the catalog and schema; adjust the catalog, schema, and table names to match your environment.
```shell
# Create schema
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -H "Content-Type: application/json" -d '{
    "name": "schema1",
    "comment": "comment",
    "properties": {
      "key1": "value1"
    }
  }' http://localhost:8090/api/metalakes/my_metalake/catalogs/catalog_iceberg/schemas
```
```shell
# Create role
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{
    "name": "data_reader",
    "properties": {"description": "data read"},
    "securableObjects": [
      {
        "fullName": "catalog_iceberg.schema1",
        "type": "SCHEMA",
        "privileges": [
          {"name": "CREATE_TABLE", "condition": "ALLOW"},
          {"name": "USE_SCHEMA", "condition": "ALLOW"}
        ]
      },
      {
        "fullName": "catalog_iceberg",
        "type": "CATALOG",
        "privileges": [{"name": "USE_CATALOG", "condition": "ALLOW"}]
      }
    ]
  }' http://localhost:8090/api/metalakes/my_metalake/roles
```
Create a role for the IRC service user (rest-catalog) so it can fetch catalog information:
```shell
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{
    "name": "catalog_reader",
    "properties": {"description": "load catalog infos"},
    "securableObjects": [
      {
        "fullName": "my_metalake",
        "type": "METALAKE",
        "privileges": [{"name": "USE_CATALOG", "condition": "ALLOW"}]
      }
    ]
  }' http://localhost:8090/api/metalakes/my_metalake/roles
```
6.3 Create Users and Grant Permissions
Create a user such as spark_user in Gravitino and grant it the data_reader role created above:
```shell
# Create user
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{"name": "spark_user"}' \
  http://localhost:8090/api/metalakes/my_metalake/users

# Grant permissions to user
curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{"roleNames": ["data_reader"]}' \
  http://localhost:8090/api/metalakes/my_metalake/permissions/users/spark_user/grant
```
Create the rest-catalog user in Gravitino and grant it the catalog_reader role so the IRC service can load catalogs:
```shell
# Create user
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{"name": "rest-catalog"}' \
  http://localhost:8090/api/metalakes/my_metalake/users

# Grant permissions to user
curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'admin:password' | base64)" \
  -d '{"roleNames": ["catalog_reader"]}' \
  http://localhost:8090/api/metalakes/my_metalake/permissions/users/rest-catalog/grant
```
7. Spark Integration
After configuring permissions in Gravitino, you can test and verify on the client side.
7.1 Spark Configuration
Using Spark as an example, configure the client-side username and point Spark's Iceberg REST catalog at the IRC service address.
```shell
spark-sql \
  --jars "/path/to/iceberg-aws-bundle-1.9.2.jar,/path/to/iceberg-spark-runtime-3.4_2.12-1.9.2.jar" \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.rest=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.rest.type=rest \
  --conf spark.sql.catalog.rest.uri=http://localhost:9002/iceberg/ \
  --conf spark.sql.catalog.rest.header.X-Iceberg-Access-Delegation=vended-credentials \
  --conf spark.sql.catalog.rest.rest.auth.type=basic \
  --conf spark.sql.catalog.rest.rest.auth.basic.username=spark_user \
  --conf spark.sql.catalog.rest.rest.auth.basic.password=user_password
```

Note the `header.` prefix: properties under `spark.sql.catalog.<name>.header.` are sent as HTTP headers on REST catalog requests, which is how X-Iceberg-Access-Delegation reaches the service.
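The `--conf` flags map one-to-one onto catalog properties under the `spark.sql.catalog.<name>.` prefix. A small Python sketch that renders such a property map into spark-sql flags; the property names are the ones from the command above:

```python
def to_conf_flags(catalog: str, props: dict) -> list:
    """Render Iceberg REST catalog properties as spark-sql --conf flags."""
    prefix = f"spark.sql.catalog.{catalog}"
    flags = [f"--conf {prefix}=org.apache.iceberg.spark.SparkCatalog"]
    for key, value in props.items():
        flags.append(f"--conf {prefix}.{key}={value}")
    return flags

props = {
    "type": "rest",
    "uri": "http://localhost:9002/iceberg/",
    # The 'header.' prefix turns the rest of the key into an HTTP header.
    "header.X-Iceberg-Access-Delegation": "vended-credentials",
    "rest.auth.type": "basic",
    "rest.auth.basic.username": "spark_user",
    "rest.auth.basic.password": "user_password",
}
for flag in to_conf_flags("rest", props):
    print(flag)
```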
7.2 Usage Examples
```sql
-- Show available tables
SHOW TABLES IN rest.schema1;

-- Query data
SELECT * FROM rest.schema1.table1;

-- Create a table
CREATE TABLE rest.schema1.table2 (
  id BIGINT,
  name STRING
) USING iceberg;
```
Summary
This guide covered:
- Complete deployment process - end-to-end steps from environment preparation and database initialization through dependency downloads and service startup
- Access control system - Gravitino's RBAC permission model: creating roles, assigning privileges, and managing users
- Real-world application - using IRC access control from Spark, the way production clients would
- Core configuration points - the key parameters for the Gravitino Server and the Iceberg REST Service
This setup gives your data lake enterprise-grade, table-level access control while keeping the Iceberg REST interface easy to use.
