GBASE Database

Posted on Sep 24, 2024

Common Issues in GBase 8c Distributed Scenarios (1)

#database

During database usage, improper operations or maintenance can lead to abnormal operation of instances or clusters. The GBase multi-model database is no exception. This article focuses on the common issues and solutions related to GBase 8c V5 S3.0.0 database instances and users.

1. Database Instance Issues

This section mainly introduces common issue handling in the GBase 8c V5 S3.0.0 cluster. There are various reasons for instance anomalies, and identifying the cause requires checking the database operation logs.

1.1 Database Node in Unstable or Stop State

Issue Description:

When using gha_ctl monitor, the backup node of the cluster is in an unstable or stop state, as shown below:

Cause:

This issue can occur due to network failure, full disk, or database residues that cause automatic data restart anomalies, which require troubleshooting.

Solution:

1) Use gha_monitor to check the status of the abnormal node.

   [gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI

2) Start the abnormal node, here dn1_2 is started as an example. Wait for the result.

   [gbase@gbase8c ~]$ gha_ctl start datanode dn1 dn1_2 -l http://XXXX:2379 -c gbase

3) Recheck the node status.

   [gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI

1.2 Backup Node in Need Repair State

Issue Description:

When using gha_ctl monitor, the backup node of the cluster is in the "Stand by Need repair" state, which only occurs on backup nodes.

state
Stand by Need repair

Cause:

This issue is caused by network failure, full disk, etc., leading to a disconnect between primary and backup instances, and the backup logs not being synchronized, resulting in an exception when the backup starts.

Solution:

1) Use gha_ctl monitor to find the data directory (work_dir) of the node:

   [gbase@gbase8c ~]$ gha_ctl start all -l http://XXXX:2379 -c gbase

2) Log in to the server of the node in the need repair state and perform a database build operation:

   [gbase@gbase8c ~]$ gs_ctl build -D work_dir (directory obtained from the previous step)

2. User-Related Issues

This section mainly introduces troubleshooting and handling of user-related issues.

2.1 Incorrect Username or Password

Issue Description:

An error occurs when the user logs in, with the following message:

FATAL: Invalid username/password, login denied.

Cause:

The incorrect username or password is used during login.

Solution:

Check and use the correct username and password.

2.2 Client Does Not Have Login Permission

Issue Description:

An error occurs when the user logs in, indicating that the client does not have permission to log in to the database. The error message is as follows:

gsql: FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".
FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".

Cause:

The database has not authorized the client to log in. For security, GBase 8c verifies whether the client is trusted for every login attempt.

Solution:

Method 1: Use the gs_guc management tool to add a trusted client IP address online.

[gbase@gbase01 ~]$ gs_guc reload -N all -I all -h "host all all 0.0.0.0/0 sha256"

The parameters within the quotes consist of five parts:

host: Indicates login type, usually host. Other options include local (local login) and replication (replication slot).
all: Database name.
all: Username.
0.0.0.0/0: Specifies the client IP address. A specific IP range or address can also be specified.
sha256: Indicates the encryption algorithm, usually sha256. Other options include md5 and trust. Note: Do not use trust for remote users, as it allows clients to log in without a password.

NOTE: For primary and standby clusters, this command only needs to be executed once on the primary node.

Method 2: Modify the pg_hba.conf configuration file.

Add the client IP address to the configuration file and restart the cluster. For parameter explanations, refer to Method 1.

Finding the pg_hba.conf configuration file:

[gbase@gbase01 dn1_1]$ gha_ctl monitor all -c cluster_name -l http://XXXX:2379 -HI

The directory of work_dir is where the pg_hba.conf file is located.

NOTE: For primary and standby clusters, the configuration file needs to be modified on each node before restarting the cluster.

2.3 Unlocking a Locked User Account

Issue Description:

An error occurs when the user logs in, with the following message:

FATAL: The account has been locked.

Cause:

To ensure account security, if the user exceeds the maximum allowed login attempts (failed_login_attempts), the system will automatically lock the account. The default value is 10.

Solution:

1) Log in to the database environment with the super administrator user:

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

2) Unlock the user:

postgres=# alter user user_name account unlock;
ALTER ROLE

2.4 User Permission Issues

Issue Description:

An error occurs when using DML, DDL, or other SQL statements after logging into the database:

ERROR: permission denied for relation

Cause:

The user does not have the required permissions.

Solution:

Grant the necessary permissions according to the principle of least privilege. For more information on database authorization, refer to: Documentation.

2.5 Database Read-Only Mode

Issue Description:

The database allows read operations, but errors occur when executing non-read SQL commands (insert, update, delete, create, drop, etc.) on the table:

FATAL: Invalid username/password, login denied.

Cause:

1) In primary-standby mode, all modification operations must be performed on the primary database.
2) If the operation is confirmed to be on the primary database, the database is already in read-only mode.

Solution:

1) Check if the database disk usage has reached 85%. If it has, delete unnecessary files on the server (e.g., expired backup files, log files) to reduce usage below 85% (or the set value).
2) Change the cluster from read-only mode to read-write mode:

[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "default_transaction_read_only=off"

NOTE: Execute this command only once on the primary node.

2.6 Insufficient Connections in High Concurrency Scenarios

Issue Description:

An error occurs when the user logs in, with the following message:

Too many clients already, current/active XXX/X

Cause:

The number of logged-in users has reached the maximum limit (max_connections). Increase the limit if the server memory allows.

Solution:

1) Log in to the database to view the current maximum connections and the number of connections used:

[gbase@gbase01 ~]$ gsql -d postgres -p 15432

Check the current maximum connection limit:

postgres=# show max_connections;
max_connections
-----------------
5000
(1 row)

Check the number of connections currently in use:

postgres=# select count(*) from pg_stat_activity;
count
-------
548
(1 row)

2) If the maximum connection limit is close to the current number of connections, increase the maximum connection limit or check if connections are not being properly closed.

3) Increase the maximum connection limit:

The values of max_prepared_transactions and max_connections need to be the same and adjusted based on connection requirements.

[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_connections=xxx"
[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_prepared_transactions=xxx"

4) Restart the database:

[gbase@gbase01 dn1_1]$ gha_ctl stop all -c gbase -l http://XXXX:2379
[gbase@gbase01 dn1_1]$ gha_ctl start all -c gbase -l http://XXXX:2379

Note: If you notice that the current number of database connections is significantly lower than the maximum limit, but the error still occurs, you need to check the operating system's file handle limit. For detailed inspection and handling instructions, please refer to: Documentation

DEV Community

Common Issues in GBase 8c Distributed Scenarios (1)

1. Database Instance Issues

1.1 Database Node in Unstable or Stop State

1.2 Backup Node in Need Repair State

2. User-Related Issues

2.1 Incorrect Username or Password

2.2 Client Does Not Have Login Permission

2.3 Unlocking a Locked User Account

Solution:

2.4 User Permission Issues

2.5 Database Read-Only Mode

2.6 Insufficient Connections in High Concurrency Scenarios

Top comments (0)

Read next

New AI Method Cuts Image Learning Costs by 30% While Boosting Accuracy

HermesFlow: AI System Masters Both Understanding and Creating Visual Content

New AI Training Method Cuts Data Needs in Half While Boosting Performance by 20%

Yield in Python - beyond the data generation