DEV Community

Cong Li

GBase 8c Primary/Standby Performance Issue Analysis

Relational databases may encounter performance issues during operation, causing the system to become unstable. Common troubleshooting steps include identifying the issue, gathering information, pinpointing the cause, developing a solution, implementing the fix, testing the results, and documenting the process. This article focuses on analyzing potential performance issues in GBase 8c primary/standby scenarios, with emphasis on the following cases:

1. Analyzing SQL Statement Execution Status

Issue

Some SQL statements take too long to execute, requiring analysis of the query’s execution state.

Troubleshooting

1) Log into the database with an admin account (gbase):

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

2) Check the currently executing SQL statements:

postgres=# SELECT datname, usename, state, query FROM pg_stat_activity;

If the state is idle, it means the connection is idle, waiting for a new command. To only view non-idle queries, use:

postgres=# SELECT datname, usename, state, query FROM pg_stat_activity WHERE state != 'idle';

To check if a query is blocked, use:

postgres=# SELECT datname, usename, state, query FROM pg_stat_activity WHERE waiting = true;
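To see how long each blocked statement has been waiting, the waiting filter can be combined with the query_start column from the same view. A sketch using only columns shown in this article:

```sql
-- Blocked statements, longest wait first
SELECT datname, usename,
       current_timestamp - query_start AS waited,  -- time since the statement started
       query
FROM pg_stat_activity
WHERE waiting = true
ORDER BY waited DESC;
```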

2. Terminating a Specific Session

Issue

In some cases, you may need to forcibly terminate a session to allow the system to continue providing services.

Troubleshooting

1) Log into the database with an admin account (gbase):

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

2) Find the session ID of the problematic session:

postgres=# SELECT datid, pid, state, query FROM pg_stat_activity;

The pid value represents the thread ID of the session.

3) Terminate the session using the thread ID:

postgres=# SELECT pg_terminate_backend(139834762094352);
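If terminating the whole session is too drastic, pg_cancel_backend interrupts only the statement currently running and leaves the connection open. This is a standard PostgreSQL-family function; confirm it is available in your GBase 8c build:

```sql
-- Cancel the running statement only; the session stays connected
SELECT pg_cancel_backend(139834762094352);
```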

3. Finding Long-Running SQL Statements

Issue

SQL statements are running for an extended period or are unable to finish execution.

Cause

This may be due to the complexity of the SQL statement or because the query is being blocked.

Troubleshooting

1) Log into the database with an admin account (gbase):

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

2) Find long-running queries:

SELECT EXTRACT(DAY FROM (current_timestamp - query_start)) * 24 * 60
       + EXTRACT(HOUR FROM (current_timestamp - query_start)) * 60
       + EXTRACT(MINUTE FROM (current_timestamp - query_start)) AS runtime,
       datname, usename, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY 1 DESC;

This query will return a list of queries sorted by execution time, with the longest-running query at the top.
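The day/hour/minute arithmetic counts each unit separately. If your GBase 8c version supports EXTRACT(EPOCH ...) on intervals, as PostgreSQL does (an assumption worth verifying), a nearly equivalent form yields fractional minutes in one expression:

```sql
-- Runtime in minutes via epoch seconds
SELECT EXTRACT(EPOCH FROM (current_timestamp - query_start)) / 60 AS runtime,
       datname, usename, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY 1 DESC;
```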

4. SQL Optimization and Analysis

Optimize SQL by creating appropriate indexes or rewriting the queries. Analyze whether SQL statements are blocked.

Issue

In certain business scenarios, queries may be blocked, causing the execution time to be excessively long.

Cause

When SQL statements attempt to lock resources, they may be blocked if the resource is already locked by another session. The SQL statement must wait for the resource to be released.

Troubleshooting

1) Log into the database with an admin account (gbase):

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

2) Find the thread ID of the blocking session:

SELECT w.query AS waiting_query,
       w.pid AS w_pid,
       w.usename AS w_user,
       l.query AS locking_query,
       l.pid AS l_pid,
       l.usename AS l_user,
       t.schemaname || '.' || t.relname AS tablename
FROM pg_stat_activity w
JOIN pg_locks l1 ON w.pid = l1.pid AND NOT l1.granted
JOIN pg_locks l2 ON l1.relation = l2.relation AND l2.granted
JOIN pg_stat_activity l ON l2.pid = l.pid
JOIN pg_stat_user_tables t ON l1.relation = t.relid
WHERE w.waiting = true;

3) Analyze and terminate the problematic session:

SELECT pg_terminate_backend(139834762094352);
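Rather than terminating blockers one by one, sessions can also be cleaned up in bulk by a criterion, such as every session left idle inside an open transaction for over 30 minutes. This sketch assumes the state_change column is present, as in PostgreSQL's pg_stat_activity; adjust the threshold to your workload:

```sql
-- Terminate sessions stuck idle inside a transaction for over 30 minutes
SELECT pid, pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND current_timestamp - state_change > interval '30 minutes';
```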

5. Lock Wait Timeout Errors

Issue

During business operations, lock wait timeout errors may occur, such as:

ERROR:  Lock wait timeout: thread 140533638080272 waiting for ShareLock on relation 16409 of database 13218 after 1200000.122 ms 
ERROR:  Lock wait timeout: thread 140533638080272 waiting for AccessExclusiveLock on relation 16409 of database 13218 after 1200000.193 ms

Cause

When a SQL statement attempts to lock a resource that is already locked, it must wait for the lock to be released. If the wait exceeds the configured lock wait timeout (1,200,000 ms, i.e., 20 minutes, in the errors above), a lock wait timeout error is reported.
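The wait threshold is controlled by a server parameter; in openGauss-derived builds such as GBase 8c it is typically lockwait_timeout (an assumption — verify the parameter name and default in your version). It can be inspected and raised for a single session before running a long DDL statement:

```sql
SHOW lockwait_timeout;              -- current threshold
SET lockwait_timeout = '40min';     -- raise it for this session only
```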

Troubleshooting

This issue can be resolved by checking for long-running SQL queries or transactions and terminating problematic sessions, as described in the previous sections.

6. Log Overview

The operating state of the database is influenced by many factors. After an anomaly occurs, the logs are used to determine the cause. Below are the typical locations of log files in GBase 8c primary/standby setups.

(1) GBase Architecture and Composition

A single-node or primary/standby GBase 8c setup stores business data on a single physical node, with high availability achieved through synchronous or asynchronous replication between the primary and standby nodes.

  • HA Center: Responsible for high availability and managing failover between nodes.
  • GBase 8c Primary/Standby: Manages data storage, query execution, and returning results to clients.
  • Storage: Responsible for persisting data.

(2) Viewing Key Service Logs

System logs for the database service, such as:

/var/log/messages

To view active services:

[gbase@gbase01 ~]$ sudo systemctl list-units | grep gbase

To view HA Center logs:

[gbase@gbase01 ~]$ sudo systemctl status server_gbase_gha_server1
[gbase@gbase01 ~]$ sudo journalctl -u server_gbase_gha_server1

To view data node logs:

[gbase@gbase01 ~]$ sudo systemctl status datanode_gbase_dn1_1
[gbase@gbase01 ~]$ sudo journalctl -u datanode_gbase_dn1_1

To view data node operation logs:

[gbase@gbase01 dn1_1]$ cd $GAUSSLOG/pg_log
[gbase@gbase01 dn1_1]$ cd dn1_1

Here you will find logs prefixed with postgresql, which contain logs about instance operation, login errors, permission issues, and SQL anomalies.

In addition to performance issues, GBase databases may encounter other challenges such as configuration problems, deployment issues, and runtime errors. Thorough examination of error messages, system configuration, and logs is essential for resolving these issues.
