Cong Li

Posted on Sep 24, 2024

GBase 8a MPP Cluster Performance Tuning: GCluster Parameter Optimization

#database

Efficiency is one of the most important indicators for a database. To improve database performance, the following aspects should be well managed: database design, SQL query optimization, database parameter configuration, appropriate data resources, and the operating system.

Today, I will introduce the GCluster parameter optimization methods for GBase.

1. Query Optimization Parameters

`gcluster_hash_redistribute_groupby_optimize`

This parameter controls whether to enable the group by optimization mode. The default value is 1 (0 - Disable, 1 - Enable).

If this option is enabled, the temporary results will be redistributed to each compute node using a hash algorithm before performing group by operations. Since the data is hashed before being distributed to each node, the final results can be directly aggregated without the need for an additional grouping operation on the summary node.

`gcluster_hash_redistribute_join_optimize`

This parameter controls whether to enable the hash redistribution for the JOIN mode. The default value is 2 (0 – Copy table, 1 – Dynamic hash, 2 – Auto evaluate).

If enabled, when performing an equi-join operation on two distributed tables, the data from one of the tables will be dynamically hashed based on the join condition column values. This dynamic hash table will then be joined with the other table on each compute node. Thus, each node's operation result can be directly aggregated to get the final result. This strategy avoids creating a copy table on all compute nodes. If set to 2, hash redistribution JOIN is used when the data difference between the two tables is within 20%; otherwise, it is not used.

`gcluster_special_correlated_optimize`

This parameter controls whether to enable hash redistribution optimization for correlated subqueries. If the parent and child queries are correlated and there is an equi-join relationship, the parent and child queries will be hashed and redistributed based on the join column.

Parameter = 0: Disable
Parameter = 1: Enable
Default value is 1.

The hash redistribution for correlated subqueries needs to be used together with the parameter gcluster_crossjoin_use_hash_distribution. The rules are as follows:

When gcluster_special_correlated_optimize = 0, the optimization will not be enabled, regardless of the gcluster_crossjoin_use_hash_distribution value.
When gcluster_special_correlated_optimize = 1 and gcluster_crossjoin_use_hash_distribution = 0, the optimization will not be enabled.
When gcluster_special_correlated_optimize = 1 and gcluster_crossjoin_use_hash_distribution = 1, the optimization will be enabled.

Example:

SELECT COUNT(*) 
FROM x1 
WHERE EXISTS (SELECT 1 FROM x2 WHERE x1.id2 = x2.id2);

For hash-related subquery optimization, x1 and x2 are dynamically redistributed.

`gcluster_crossjoin_use_hash_distribution`

This parameter controls whether to enforce hash redistribution JOIN when neither side of the JOIN is a hash column.

Parameter = 0: Disable, hash redistribution JOIN is not used.
Parameter = 1: Enable, hash redistribution JOIN is used.
Default value is 1.

`gcluster_empty_result_set_optimize`

This parameter controls whether to enable empty result set optimization. The default value is 0 (0 - OFF, 1 - ON). If the optimizer determines that the result set is empty, it will return directly without requiring the executor to perform the execution.

`gcluster_single_hash_node_optimize`

Optimizes single-table hash conditions. When the single table contains hash column equality conditions, hash optimization is performed, and the SQL statement is sent to a single node. The default value is 1 (0 - OFF, 1 - ON).

`gcluster_hash_join_complex_optimize`

Optimizes subqueries and parent queries that meet the hash relationship by executing them as a single unit.

Parameter = 0: Disable, hash relationship optimization is not executed as a whole.
Parameter = 1: Enable, hash relationship optimization is executed as a whole.
Default value is 1.

`gcluster_union_optimize`

This parameter controls whether to use union optimization. When enabled, the union statement is executed on the node, avoiding pulling all tables that need to be unioned as copy tables.

Parameter = 0: Do not use union optimization.
Parameter = 1: Use union optimization.
Default value is 1.

`gcluster_starschema_join_estimate_optimize`

Sets the method for estimating the join result of two tables.

0: Estimate by multiplying the row counts of both tables.
1: Estimate by the larger table's row count.
Default value is 1.

`gcluster_delayed_group_by_optimize`

Determines whether group by should be pushed down to gnode for execution. If there is no significant reduction in the result set after pushing down group by to gnode, set this parameter to 1 (0 - OFF, 1 - ON).

`gcluster_count_optimize`

Determines if intermediate result tables are generated during count(*) operations.

0: Execute by generating intermediate tables.
1: Directly calculate the count value in gcluster. The count shortcut optimization sends the SQL to each gnode node for direct calculation, without generating a query plan or temporary table.
2: Based on 1, if the table is a hash distribution table and the SQL's where conditions meet the single hash optimization condition, the executor calculates the target gnode based on the hash value and sends the SQL to the target gnode for calculation, rather than to all gnodes, improving small query concurrency.

Optimization Details: The original strategy for count(*) was to create a temporary summary table on the initiating node's gnode, aggregate the results from each node into the temporary table, and then sum the results. This strategy created and deleted temporary summary tables frequently during high concurrency, reducing performance. The new approach avoids creating temporary summary tables, collects execution results from each node into gcluster, and calculates directly, thus improving performance.

Conditions for gcluster_count_optimize = 1 to optimize count SQL:

No dblink table is used.
No union, minus, intersect.
No non-correlated subqueries.
No scalar subqueries.
No from subqueries.
No correlated subqueries.
Not select...into.
Only one column in the projection.
No group by clause.
No order by.
No having.
No limit or offset.
No distinct.
No join.
No copy table.
Projection column must be count but not count distinct.

Conditions for gcluster_count_optimize = 2 to optimize count SQL:

Meets all conditions of gcluster_count_optimize = 1.
Uses hash distribution table, and where conditions meet single hash optimization conditions.

Example:

SELECT count(1) 
FROM t 
WHERE t.distr_col = constant;

With optimization disabled, throughput at 400 concurrent requests is 3905/s; with optimization enabled, throughput is 6055/s.

2. Concurrency Parameters

`gcluster_serial_exec_query`

Optimization Point: When gnode does not have automatic resource management capabilities, high concurrency causes gnode resources to be contested by concurrent SQL, reducing performance. gcluster can control the number of SQLs sent to gnode to indirectly control resource usage, preventing concurrent SQLs from competing for resources and improving performance. Configure gcluster_serial_exec_query = batch submission count (number of CPU cores per node) in the gcluster configuration file to control the number of SQLs submitted to gnode. Default value is 0 (no limit).

`gcluster_max_conn_in_pool`

Optimization Point: Without a thread pool, the concurrent access to gnode by gcluster is uncontrolled. Each access to gnode starts a new thread, and high concurrency results in many threads consuming system resources and increasing gnode pressure. With a thread pool, concurrent requests compete for threads in the same pool, so the maximum thread count can strictly control the concurrency level, reducing gnode pressure and thread resource consumption. Default value is 300.

`gcluster_use_conn_pool`

Optimization Point: Without a connection pool, the connection count to gnode is uncontrolled. Each access to gnode starts a new connection, increasing connection time and consuming system resources during high concurrency. With a connection pool, concurrent requests compete for connections in the same pool, so the maximum connection count can strictly control concurrency, reducing gnode pressure and connection resource consumption.

0: OFF
1: ON

`gcluster_insertselect_use_values_optimize`

Optimization Point: Multiple insert into t1 select * from t operations are serialized on gnode under high concurrency, affecting execution efficiency. However, insert into t1 values() allows concurrent execution.

Example:

INSERT INTO TB_SVC_SUBS_HIST_TMP1 
SELECT * 
FROM TB_SVC_SUBS_HIST 
WHERE MSISDN = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';

Applicable Scenario:
This parameter is suitable for high-concurrency insert select operations, especially when the result set of the select statement is not too large.

(0-OFF, 1-ON)

`gcluster_single_hash_node_optimize`

This parameter optimizes single-point hash conditions:

0: Disables single-point hash optimization.
1: Optimizes single-table equality hash queries. When a single table has equality conditions on hash columns, the SQL statement is only sent to a single node.
2: Optimizes single-table and from subquery equality hash queries. When a single table or a from subquery has equality conditions on hash columns, the SQL statement is only sent to a single node.
3: Supports single-table optimization and multi-table joins, including inner join, left join, and right join (not supporting full join). The SQL statement is only sent to a single node.

The default value is 1: Enabled.

Example

SELECT * 
FROM bas.clcinfdta 
WHERE 1=1 AND clt_nbr = '7319022720' 
LIMIT 30 OFFSET 0;

If no records are found, remove 1=1 to execute the query correctly. To temporarily disable this optimization, set gcluster_single_hash_node_optimize = 0.

Performance Comparison for single_hash=3:

20 Concurrent JMeter Threads:
- With optimization enabled: Throughput = 20.2/sec
- With optimization disabled: Throughput = 15.3/sec
50 Concurrent JMeter Threads:
- With optimization enabled: Throughput = 27.2/sec
- With optimization disabled: Throughput = 19.2/sec
100 Concurrent JMeter Threads:
- With optimization enabled: Throughput = 28.9/sec
- With optimization disabled: Throughput = 20.3/sec

`_gbase_enable_hashtree`

The default value is 1 (Enabled). This parameter is used when the join columns have high duplication, which can lead to inefficiency.

0: Disables hashtree usage during joins, switching to a linked list structure for joins.

`gcluster_ddl_parallel_execute`

This parameter controls whether DDL statements are executed in parallel or sequentially.

0: Sequential execution.
1: Parallel execution.

The default value is 0.

DEV Community

GBase 8a MPP Cluster Performance Tuning: GCluster Parameter Optimization

1. Query Optimization Parameters

`gcluster_hash_redistribute_groupby_optimize`

`gcluster_hash_redistribute_join_optimize`

`gcluster_special_correlated_optimize`

`gcluster_crossjoin_use_hash_distribution`

`gcluster_empty_result_set_optimize`

`gcluster_single_hash_node_optimize`

`gcluster_hash_join_complex_optimize`

`gcluster_union_optimize`

`gcluster_starschema_join_estimate_optimize`

`gcluster_delayed_group_by_optimize`

`gcluster_count_optimize`

2. Concurrency Parameters

`gcluster_serial_exec_query`

`gcluster_max_conn_in_pool`

`gcluster_use_conn_pool`

`gcluster_insertselect_use_values_optimize`

`gcluster_single_hash_node_optimize`

`_gbase_enable_hashtree`

`gcluster_ddl_parallel_execute`

Top comments (0)

Read next

Oracle Database Fragmentation - How to reorg Tables / Indexes

How to Manage Your SQL Server Database Effectively

Avoiding Connection Pinning in Lambda and RDS Proxy with NestJS and Proxy Splitting

Deploying a MongoDB Collection Generator on Kubernetes