DEV Community

Michael
Michael

Posted on • Originally published at gbase.cn

GBase 8a Application Connectivity Guide: JDBC and Python Best Practices

This guide is for developers integrating GBase 8a into their applications. It covers JDBC driver setup, connection pool tuning, batch write optimization, Python connectivity, and common troubleshooting tips.

1. JDBC Setup

Driver Dependency

The JDBC driver is located in the installation package at $GCLUSTER_BASE/server/connector/jdbc/gbase-connector-java-*.jar.

Maven:

<dependency>
    <groupId>com.gbase</groupId>
    <artifactId>gbase-connector-java</artifactId>
    <version>9.5.0</version>
</dependency>
Enter fullscreen mode Exit fullscreen mode

JDBC URL Parameters

String url = "jdbc:gbase://10.168.10.26:5258/sales_db"
    + "?characterEncoding=utf8"
    + "&useUnicode=true"
    + "&connectTimeout=10000"           // TCP connection timeout 10s
    + "&socketTimeout=3600000"          // SQL execution timeout 1h
    + "&rewriteBatchedStatements=true"  // Merge batch inserts
    + "&serverTimezone=Asia/Shanghai";
Connection conn = DriverManager.getConnection(url, "app_user", "password");
Enter fullscreen mode Exit fullscreen mode
  • socketTimeout must be long enough for analytical queries; otherwise the client disconnects while the query still runs on the gbase database.
  • rewriteBatchedStatements=true merges multiple INSERT rows into a single network packet, boosting throughput 5–20×. Combine with gbase_buffer_insert=1000 on the server side.

2. Connection Pool Configuration (Druid Example)

spring:
  datasource:
    druid:
      initial-size: 5
      min-idle: 5
      max-active: 20
      max-wait: 60000
      test-while-idle: true
      test-on-borrow: false
      validation-query: SELECT 1
      time-between-eviction-runs-millis: 60000
      min-evictable-idle-time-millis: 300000
Enter fullscreen mode Exit fullscreen mode

Key rules:

  • max-active should not exceed max_connections divided by the number of gcluster nodes.
  • Enable test-while-idle and set the eviction interval shorter than gcluster's Wait_timeout (default 3600s) to avoid dead connections.

3. Batch Insert Best Practices

conn.setAutoCommit(false);
PreparedStatement pstmt = conn.prepareStatement(
    "INSERT INTO orders VALUES (?, ?, ?, ?)");
int batchSize = 1000;
int count = 0;
for (Order order : orderList) {
    pstmt.setLong(1, order.getOrderId());
    // ... set other parameters
    pstmt.addBatch();
    if (++count % batchSize == 0) {
        pstmt.executeBatch();
        conn.commit();
        pstmt.clearBatch();
    }
}
if (count % batchSize != 0) {
    pstmt.executeBatch();
    conn.commit();
}
Enter fullscreen mode Exit fullscreen mode

For over 1 million rows, prefer LOAD DATA INFILE — it offers significantly higher throughput than batch INSERT.

4. Python Connectivity

PyMySQL

import pymysql
conn = pymysql.connect(
    host='10.168.10.26', port=5258,
    user='app_user', password='your_password',
    database='sales_db', charset='utf8',
    connect_timeout=10, read_timeout=3600,
    autocommit=True
)
with conn.cursor() as cursor:
    cursor.execute("SELECT dept_id, SUM(sale_amount) FROM sales ...")
    for row in cursor.fetchall():
        print(row)
conn.close()
Enter fullscreen mode Exit fullscreen mode

Batch Insert with executemany

data = [(10001, 1, 1500.00, '2024-06-01'), ...]
with conn.cursor() as cursor:
    cursor.executemany(
        "INSERT INTO orders VALUES (%s, %s, %s, %s)", data
    )
conn.commit()
Enter fullscreen mode Exit fullscreen mode

SQLAlchemy Integration

from sqlalchemy import create_engine
engine = create_engine(
    "mysql+pymysql://user:password@10.168.10.26:5258/sales_db?charset=utf8",
    pool_size=5, max_overflow=10, pool_recycle=1800,
    connect_args={"read_timeout": 3600, "write_timeout": 60}
)
df = pd.read_sql("SELECT * FROM orders WHERE order_date >= '2024-06-01'", engine)
Enter fullscreen mode Exit fullscreen mode

5. Common Connection Issues

Issue Cause Solution
Communications link failure Connection pool holds a dead connection Enable test-while-idle, set pool-recycle
Packet too large Single INSERT batch exceeds limit Increase max_allowed_packet or reduce batch size
Python garbled characters Missing charset declaration charset='utf8', use_unicode=True
SocketTimeoutException socketTimeout too short Use different timeouts for OLTP and analytical queries

6. Quick Reference

Scenario Recommendation
Max pool size ≤ max_connections / gcluster nodes
socketTimeout Analytical: 3600000ms; OLTP: 30000ms
Batch INSERT rewriteBatchedStatements=true, 1000 rows/batch
Prevent connection eviction testWhileIdle=true, poolRecycle=1800
Python batch writes executemany() or LOAD DATA

Whether you are building a Java service with connection pooling or a Python analytics pipeline, following these configuration patterns will help you keep your gbase database connections stable and performant.

Top comments (0)