During database usage, improper operations or maintenance can lead to abnormal operation of instances or clusters. The GBase multi-model database is no exception. This article focuses on the common issues and solutions related to GBase 8c V5 S3.0.0 database instances and users.
1. Database Instance Issues
This section mainly introduces common issue handling in the GBase 8c V5 S3.0.0 cluster. There are various reasons for instance anomalies, and identifying the cause requires checking the database operation logs.
1.1 Database Node in Unstable or Stop State
Issue Description:
When using gha_ctl monitor
, the backup node of the cluster is in an unstable or stop state, as shown below:
Cause:
This issue can occur due to network failure, full disk, or database residues that cause automatic data restart anomalies, which require troubleshooting.
Solution:
1) Use gha_monitor
to check the status of the abnormal node.
[gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI
2) Start the abnormal node, here dn1_2
is started as an example. Wait for the result.
[gbase@gbase8c ~]$ gha_ctl start datanode dn1 dn1_2 -l http://XXXX:2379 -c gbase
3) Recheck the node status.
[gbase@gbase8c ~]$ gha_ctl monitor all -l http://XXXX:2379 -c gbase -HI
1.2 Backup Node in Need Repair State
Issue Description:
When using gha_ctl monitor
, the backup node of the cluster is in the "Stand by Need repair" state, which only occurs on backup nodes.
state
Stand by Need repair
Cause:
This issue is caused by network failure, full disk, etc., leading to a disconnect between primary and backup instances, and the backup logs not being synchronized, resulting in an exception when the backup starts.
Solution:
1) Use gha_ctl monitor
to find the data directory (work_dir
) of the node:
[gbase@gbase8c ~]$ gha_ctl start all -l http://XXXX:2379 -c gbase
2) Log in to the server of the node in the need repair state and perform a database build operation:
[gbase@gbase8c ~]$ gs_ctl build -D work_dir (directory obtained from the previous step)
2. User-Related Issues
This section mainly introduces troubleshooting and handling of user-related issues.
2.1 Incorrect Username or Password
Issue Description:
An error occurs when the user logs in, with the following message:
FATAL: Invalid username/password, login denied.
Cause:
The incorrect username or password is used during login.
Solution:
Check and use the correct username and password.
2.2 Client Does Not Have Login Permission
Issue Description:
An error occurs when the user logs in, indicating that the client does not have permission to log in to the database. The error message is as follows:
gsql: FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".
FATAL: no pg_hba.conf entry for host "XXX.XXX.XXX.XXX".
Cause:
The database has not authorized the client to log in. For security, GBase 8c verifies whether the client is trusted for every login attempt.
Solution:
Method 1: Use the gs_guc
management tool to add a trusted client IP address online.
[gbase@gbase01 ~]$ gs_guc reload -N all -I all -h "host all all 0.0.0.0/0 sha256"
The parameters within the quotes consist of five parts:
-
host
: Indicates login type, usuallyhost
. Other options includelocal
(local login) andreplication
(replication slot). -
all
: Database name. -
all
: Username. -
0.0.0.0/0
: Specifies the client IP address. A specific IP range or address can also be specified. -
sha256
: Indicates the encryption algorithm, usuallysha256
. Other options includemd5
andtrust
. Note: Do not usetrust
for remote users, as it allows clients to log in without a password.
NOTE: For primary and standby clusters, this command only needs to be executed once on the primary node.
Method 2: Modify the pg_hba.conf
configuration file.
Add the client IP address to the configuration file and restart the cluster. For parameter explanations, refer to Method 1.
Finding the pg_hba.conf
configuration file:
[gbase@gbase01 dn1_1]$ gha_ctl monitor all -c cluster_name -l http://XXXX:2379 -HI
The directory of work_dir
is where the pg_hba.conf
file is located.
NOTE: For primary and standby clusters, the configuration file needs to be modified on each node before restarting the cluster.
2.3 Unlocking a Locked User Account
Issue Description:
An error occurs when the user logs in, with the following message:
FATAL: The account has been locked.
Cause:
To ensure account security, if the user exceeds the maximum allowed login attempts (failed_login_attempts
), the system will automatically lock the account. The default value is 10.
Solution:
1) Log in to the database environment with the super administrator user:
[gbase@gbase01 ~]$ gsql -d postgres -p 15432
2) Unlock the user:
postgres=# alter user user_name account unlock;
ALTER ROLE
2.4 User Permission Issues
Issue Description:
An error occurs when using DML, DDL, or other SQL statements after logging into the database:
ERROR: permission denied for relation
Cause:
The user does not have the required permissions.
Solution:
Grant the necessary permissions according to the principle of least privilege. For more information on database authorization, refer to: Documentation.
2.5 Database Read-Only Mode
Issue Description:
The database allows read operations, but errors occur when executing non-read SQL commands (insert, update, delete, create, drop, etc.) on the table:
FATAL: Invalid username/password, login denied.
Cause:
1) In primary-standby mode, all modification operations must be performed on the primary database.
2) If the operation is confirmed to be on the primary database, the database is already in read-only mode.
Solution:
1) Check if the database disk usage has reached 85%. If it has, delete unnecessary files on the server (e.g., expired backup files, log files) to reduce usage below 85% (or the set value).
2) Change the cluster from read-only mode to read-write mode:
[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "default_transaction_read_only=off"
NOTE: Execute this command only once on the primary node.
2.6 Insufficient Connections in High Concurrency Scenarios
Issue Description:
An error occurs when the user logs in, with the following message:
Too many clients already, current/active XXX/X
Cause:
The number of logged-in users has reached the maximum limit (max_connections
). Increase the limit if the server memory allows.
Solution:
1) Log in to the database to view the current maximum connections and the number of connections used:
[gbase@gbase01 ~]$ gsql -d postgres -p 15432
Check the current maximum connection limit:
postgres=# show max_connections;
max_connections
-----------------
5000
(1 row)
Check the number of connections currently in use:
postgres=# select count(*) from pg_stat_activity;
count
-------
548
(1 row)
2) If the maximum connection limit is close to the current number of connections, increase the maximum connection limit or check if connections are not being properly closed.
3) Increase the maximum connection limit:
The values of max_prepared_transactions
and max_connections
need to be the same and adjusted based on connection requirements.
[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_connections=xxx"
[gbase@gbase01 dn1_1]$ gs_guc reload -N all -I all -c "max_prepared_transactions=xxx"
4) Restart the database:
[gbase@gbase01 dn1_1]$ gha_ctl stop all -c gbase -l http://XXXX:2379
[gbase@gbase01 dn1_1]$ gha_ctl start all -c gbase -l http://XXXX:2379
Note: If you notice that the current number of database connections is significantly lower than the maximum limit, but the error still occurs, you need to check the operating system's file handle limit. For detailed inspection and handling instructions, please refer to: Documentation
Top comments (0)