This guide is for developers integrating GBase 8a into their applications. It covers JDBC driver setup, connection pool tuning, batch write optimization, Python connectivity, and common troubleshooting tips.
1. JDBC Setup
Driver Dependency
The JDBC driver is located in the installation package at $GCLUSTER_BASE/server/connector/jdbc/gbase-connector-java-*.jar.
Maven:
<dependency>
<groupId>com.gbase</groupId>
<artifactId>gbase-connector-java</artifactId>
<version>9.5.0</version>
</dependency>
JDBC URL Parameters
String url = "jdbc:gbase://10.168.10.26:5258/sales_db"
+ "?characterEncoding=utf8"
+ "&useUnicode=true"
+ "&connectTimeout=10000" // TCP connection timeout 10s
+ "&socketTimeout=3600000" // SQL execution timeout 1h
+ "&rewriteBatchedStatements=true" // Merge batch inserts
+ "&serverTimezone=Asia/Shanghai";
Connection conn = DriverManager.getConnection(url, "app_user", "password");
-
socketTimeoutmust be long enough for analytical queries; otherwise the client disconnects while the query still runs on the gbase database. -
rewriteBatchedStatements=truemerges multiple INSERT rows into a single network packet, boosting throughput 5–20×. Combine withgbase_buffer_insert=1000on the server side.
2. Connection Pool Configuration (Druid Example)
spring:
datasource:
druid:
initial-size: 5
min-idle: 5
max-active: 20
max-wait: 60000
test-while-idle: true
test-on-borrow: false
validation-query: SELECT 1
time-between-eviction-runs-millis: 60000
min-evictable-idle-time-millis: 300000
Key rules:
-
max-activeshould not exceedmax_connectionsdivided by the number of gcluster nodes. - Enable
test-while-idleand set the eviction interval shorter than gcluster'sWait_timeout(default 3600s) to avoid dead connections.
3. Batch Insert Best Practices
conn.setAutoCommit(false);
PreparedStatement pstmt = conn.prepareStatement(
"INSERT INTO orders VALUES (?, ?, ?, ?)");
int batchSize = 1000;
int count = 0;
for (Order order : orderList) {
pstmt.setLong(1, order.getOrderId());
// ... set other parameters
pstmt.addBatch();
if (++count % batchSize == 0) {
pstmt.executeBatch();
conn.commit();
pstmt.clearBatch();
}
}
if (count % batchSize != 0) {
pstmt.executeBatch();
conn.commit();
}
For over 1 million rows, prefer LOAD DATA INFILE — it offers significantly higher throughput than batch INSERT.
4. Python Connectivity
PyMySQL
import pymysql
conn = pymysql.connect(
host='10.168.10.26', port=5258,
user='app_user', password='your_password',
database='sales_db', charset='utf8',
connect_timeout=10, read_timeout=3600,
autocommit=True
)
with conn.cursor() as cursor:
cursor.execute("SELECT dept_id, SUM(sale_amount) FROM sales ...")
for row in cursor.fetchall():
print(row)
conn.close()
Batch Insert with executemany
data = [(10001, 1, 1500.00, '2024-06-01'), ...]
with conn.cursor() as cursor:
cursor.executemany(
"INSERT INTO orders VALUES (%s, %s, %s, %s)", data
)
conn.commit()
SQLAlchemy Integration
from sqlalchemy import create_engine
engine = create_engine(
"mysql+pymysql://user:password@10.168.10.26:5258/sales_db?charset=utf8",
pool_size=5, max_overflow=10, pool_recycle=1800,
connect_args={"read_timeout": 3600, "write_timeout": 60}
)
df = pd.read_sql("SELECT * FROM orders WHERE order_date >= '2024-06-01'", engine)
5. Common Connection Issues
| Issue | Cause | Solution |
|---|---|---|
| Communications link failure | Connection pool holds a dead connection | Enable test-while-idle, set pool-recycle |
| Packet too large | Single INSERT batch exceeds limit | Increase max_allowed_packet or reduce batch size |
| Python garbled characters | Missing charset declaration |
charset='utf8', use_unicode=True
|
| SocketTimeoutException |
socketTimeout too short |
Use different timeouts for OLTP and analytical queries |
6. Quick Reference
| Scenario | Recommendation |
|---|---|
| Max pool size | ≤ max_connections / gcluster nodes |
| socketTimeout | Analytical: 3600000ms; OLTP: 30000ms |
| Batch INSERT | rewriteBatchedStatements=true, 1000 rows/batch |
| Prevent connection eviction | testWhileIdle=true, poolRecycle=1800 |
| Python batch writes | executemany() or LOAD DATA |
Whether you are building a Java service with connection pooling or a Python analytics pipeline, following these configuration patterns will help you keep your gbase database connections stable and performant.
Top comments (0)