DEV Community

Cong Li
Cong Li

Posted on

Common Issues in GBase 8c Distributed Scenarios (1)

During database usage, improper operations or maintenance can lead to abnormal operation of instances or clusters. The GBase multi-model database is no exception. This article focuses on the common issues and solutions related to GBase 8c V5 S3.0.0 database instances and users.

1. Database Instance Issues

This section mainly introduces common issue handling in the GBase 8c V5 S3.0.0 cluster. There are various reasons for instance anomalies, and identifying the cause requires checking the database operation logs.

1.1 Database Node in Unstable or Stop State

Issue Description:

When using gha_ctl monitor, the backup node of the cluster is in an unstable or stop state, as shown below:

Image description

Cause:

This issue can occur due to network failure, full disk, or database residues that cause automatic data restart anomalies, which require troubleshooting.

Solution:

1) Use gha_monitor to check the status of the abnormal node.

   [gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI
Enter fullscreen mode Exit fullscreen mode

2) Start the abnormal node, here dn1_2 is started as an example. Wait for the result.

   [gbase@gbase8c ~]$ gha_ctl start datanode dn1 dn1_2 -l http://XXXX:2379 -c gbase
Enter fullscreen mode Exit fullscreen mode

3) Recheck the node status.

   [gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI
Enter fullscreen mode Exit fullscreen mode

1.2 Backup Node in Need Repair State

Issue Description:

When using gha_ctl monitor, the backup node of the cluster is in the "Stand by Need repair" state, which only occurs on backup nodes.

state
Stand by Need repair
Enter fullscreen mode Exit fullscreen mode

Cause:

This issue is caused by network failure, full disk, etc., leading to a disconnect between primary and backup instances, and the backup logs not being synchronized, resulting in an exception when the backup starts.

Solution:

1) Use gha_ctl monitor to find the data directory (work_dir) of the node:

   [gbase@gbase8c ~]$ gha_ctl start all -l http://XXXX:2379 -c gbase
Enter fullscreen mode Exit fullscreen mode

2) Log in to the server of the node in the need repair state and perform a database build operation:

   [gbase@gbase8c ~]$ gs_ctl build -D work_dir (directory obtained from the previous step)
Enter fullscreen mode Exit fullscreen mode

2. User-Related Issues

This section mainly introduces troubleshooting and handling of user-related issues.

2.1 Incorrect Username or Password

Issue Description:

An error occurs when the user logs in, with the following message:

FATAL: Invalid username/password, login denied.
Enter fullscreen mode Exit fullscreen mode

Cause:

The incorrect username or password is used during login.

Solution:

Check and use the correct username and password.

2.2 Client Does Not Have Login Permission

Issue Description:

An error occurs when the user logs in, indicating that the client does not have permission to log in to the database. The error message is as follows:

gsql: FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".
FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".
Enter fullscreen mode Exit fullscreen mode

Cause:

The database has not authorized the client to log in. For security, GBase 8c verifies whether the client is trusted for every login attempt.

Solution:

Method 1: Use the gs_guc management tool to add a trusted client IP address online.

[gbase@gbase01 ~]$ gs_guc reload -N all -I all -h "host all all 0.0.0.0/0 sha256"
Enter fullscreen mode Exit fullscreen mode

The parameters within the quotes consist of five parts:

  • host: Indicates login type, usually host. Other options include local (local login) and replication (replication slot).
  • all: Database name.
  • all: Username.
  • 0.0.0.0/0: Specifies the client IP address. A specific IP range or address can also be specified.
  • sha256: Indicates the encryption algorithm, usually sha256. Other options include md5 and trust. Note: Do not use trust for remote users, as it allows clients to log in without a password.

NOTE: For primary and standby clusters, this command only needs to be executed once on the primary node.

Method 2: Modify the pg_hba.conf configuration file.

Add the client IP address to the configuration file and restart the cluster. For parameter explanations, refer to Method 1.

Finding the pg_hba.conf configuration file:

[gbase@gbase01 dn1_1]$ gha_ctl monitor all -c cluster_name -l http://XXXX:2379 -HI
Enter fullscreen mode Exit fullscreen mode

The directory of work_dir is where the pg_hba.conf file is located.

NOTE: For primary and standby clusters, the configuration file needs to be modified on each node before restarting the cluster.

2.3 Unlocking a Locked User Account

Issue Description:

An error occurs when the user logs in, with the following message:

FATAL: The account has been locked.
Enter fullscreen mode Exit fullscreen mode

Cause:

To ensure account security, if the user exceeds the maximum allowed login attempts (failed_login_attempts), the system will automatically lock the account. The default value is 10.

Solution:

1) Log in to the database environment with the super administrator user:

[gbase@gbase01 ~]$ gsql -d postgres -p 15432
Enter fullscreen mode Exit fullscreen mode

2) Unlock the user:

postgres=# alter user user_name account unlock;
ALTER ROLE
Enter fullscreen mode Exit fullscreen mode

2.4 User Permission Issues

Issue Description:

An error occurs when using DML, DDL, or other SQL statements after logging into the database:

ERROR: permission denied for relation
Enter fullscreen mode Exit fullscreen mode

Cause:

The user does not have the required permissions.

Solution:

Grant the necessary permissions according to the principle of least privilege. For more information on database authorization, refer to: Documentation.

2.5 Database Read-Only Mode

Issue Description:

The database allows read operations, but errors occur when executing non-read SQL commands (insert, update, delete, create, drop, etc.) on the table:

FATAL: Invalid username/password, login denied.
Enter fullscreen mode Exit fullscreen mode

Cause:

1) In primary-standby mode, all modification operations must be performed on the primary database.
2) If the operation is confirmed to be on the primary database, the database is already in read-only mode.

Solution:

1) Check if the database disk usage has reached 85%. If it has, delete unnecessary files on the server (e.g., expired backup files, log files) to reduce usage below 85% (or the set value).
2) Change the cluster from read-only mode to read-write mode:

[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "default_transaction_read_only=off"
Enter fullscreen mode Exit fullscreen mode

NOTE: Execute this command only once on the primary node.

2.6 Insufficient Connections in High Concurrency Scenarios

Issue Description:

An error occurs when the user logs in, with the following message:

Too many clients already, current/active XXX/X
Enter fullscreen mode Exit fullscreen mode

Cause:

The number of logged-in users has reached the maximum limit (max_connections). Increase the limit if the server memory allows.

Solution:

1) Log in to the database to view the current maximum connections and the number of connections used:

[gbase@gbase01 ~]$ gsql -d postgres -p 15432
Enter fullscreen mode Exit fullscreen mode

Check the current maximum connection limit:

postgres=# show max_connections;
max_connections
-----------------
5000
(1 row)
Enter fullscreen mode Exit fullscreen mode

Check the number of connections currently in use:

postgres=# select count(*) from pg_stat_activity;
count
-------
548
(1 row)
Enter fullscreen mode Exit fullscreen mode

2) If the maximum connection limit is close to the current number of connections, increase the maximum connection limit or check if connections are not being properly closed.

3) Increase the maximum connection limit:

The values of max_prepared_transactions and max_connections need to be the same and adjusted based on connection requirements.

[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_connections=xxx"
[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_prepared_transactions=xxx"
Enter fullscreen mode Exit fullscreen mode

4) Restart the database:

[gbase@gbase01 dn1_1]$ gha_ctl stop all -c gbase -l http://XXXX:2379
[gbase@gbase01 dn1_1]$ gha_ctl start all -c gbase -l http://XXXX:2379
Enter fullscreen mode Exit fullscreen mode

Note: If you notice that the current number of database connections is significantly lower than the maximum limit, but the error still occurs, you need to check the operating system's file handle limit. For detailed inspection and handling instructions, please refer to: Documentation

Top comments (0)