DEV Community

Michael
Michael

Posted on • Originally published at gbase.cn

GBase 8c Lock Wait and Deadlock Troubleshooting: From Blocking Chains to Global Transactions

In GBase 8c operations, many apparent performance problems are actually lock waits, blocking chains, or distributed deadlocks. This guide provides a systematic approach to diagnosing and resolving lock issues using session state, lock views, blocking chain analysis, and cross‑node global transaction tracing in a gbase database.

1. Classify Lock Problems Before Investigating

  1. Simple lock wait: A session holds a lock that other DML/DDL statements queue behind. Symptoms include long‑running UPDATE/DELETE, hanging TRUNCATE/ALTER, or frozen stored procedures.
  2. Blocking chain: Session A blocks B, B blocks C, forming a chain. The root is the head of the chain, not the last victim SQL.
  3. Global deadlock: Circular waits across CNs and DNs that cannot be fully seen from a single node. GBase 8c supports automatic global deadlock detection and resolution.

2. Check Session State First

SELECT
    pid, usename, application_name, client_addr,
    state, wait_event_type, wait_event,
    xact_start, query_start,
    now() - query_start AS running_time,
    query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY query_start;
Enter fullscreen mode Exit fullscreen mode

Focus on:

  • state: idle in transaction means the statement completed but the transaction was not committed — locks are still held.
  • wait_event_type: Lock confirms a lock wait.
  • xact_start: An old transaction start signals long‑transaction risk.

3. Start with Table‑Level Strong Locks

TRUNCATE, DDL, and batch scripts often trigger strong lock waits. Join pg_locks, pg_class, pg_namespace, and pg_stat_activity, and look for AccessExclusiveLock.

SELECT
    n.nspname AS schema_name,
    c.relname AS table_name,
    l.pid, l.locktype, l.mode, l.granted,
    a.usename, a.state,
    now() - a.state_change AS lock_duration,
    a.query
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_locks l ON l.relation = c.oid
LEFT JOIN pg_stat_activity a ON a.pid = l.pid
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY lock_duration DESC NULLS LAST;
Enter fullscreen mode Exit fullscreen mode

A session holding a lock for an unusually long time points the way.

4. Map the Blocking Chain: Find the Head Blocker

Link waiting sessions directly to the sessions that hold the locks they need:

SELECT
    w.pid                 AS waiting_pid,
    wa.usename            AS waiting_user,
    wa.query              AS waiting_sql,
    b.pid                 AS blocking_pid,
    ba.usename            AS blocking_user,
    ba.state              AS blocking_state,
    ba.query              AS blocking_sql
FROM pg_locks w
JOIN pg_stat_activity wa ON wa.pid = w.pid
JOIN pg_locks b
  ON w.locktype = b.locktype
 AND w.database IS NOT DISTINCT FROM b.database
 AND w.relation IS NOT DISTINCT FROM b.relation
 AND w.page IS NOT DISTINCT FROM b.page
 AND w.tuple IS NOT DISTINCT FROM b.tuple
 AND w.pid <> b.pid
JOIN pg_stat_activity ba ON ba.pid = b.pid
WHERE NOT w.granted
  AND b.granted;
Enter fullscreen mode Exit fullscreen mode

Priority when acting: identify a clear blocker → check if it's idle in transaction → table‑level or row‑level → DDL involved → cross‑node needed?

5. Trace Across Nodes with global_sessionid

On the CN, locate the business session, then get the global transaction ID:

SELECT pid, global_sessionid, locktype, mode, granted
FROM pg_locks WHERE pid = 123456;
Enter fullscreen mode Exit fullscreen mode

With an identifier like 0:818#0, continue on the DN:

SELECT pid, global_sessionid, mode, granted
FROM pg_locks WHERE global_sessionid LIKE '%818%';

SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity WHERE pid = 987654;
Enter fullscreen mode Exit fullscreen mode

This is essential for batch jobs, stored procedures, and cross‑node write conflicts.

6. Deadlock Patterns and Handling

The most common cause is inconsistent update order across transactions:

-- session 1
BEGIN;
UPDATE acct_balance SET amount = amount - 100 WHERE acct_id = 101;

-- session 2
BEGIN;
UPDATE acct_balance SET amount = amount - 50 WHERE acct_id = 202;

-- session 1
UPDATE acct_balance SET amount = amount + 100 WHERE acct_id = 202;

-- session 2
UPDATE acct_balance SET amount = amount + 50 WHERE acct_id = 101;
Enter fullscreen mode Exit fullscreen mode

Adjusting deadlock_timeout can tune detection timing, but root causes are typically: inconsistent update order, oversized transactions, or batch jobs competing with online traffic for the same hot rows.

7. Three‑Step Emergency Response: Cancel, Terminate, Review

-- 1. Try cancelling the query first
SELECT pg_cancel_backend(140508376790784);

-- 2. Terminate the session if cancellation fails
SELECT pg_terminate_backend(140508376790784);
Enter fullscreen mode Exit fullscreen mode

After the immediate fix, always review: the blocking SQL, session source, transaction duration, hot objects, and whether the issue spans nodes.

8. Recommended Troubleshooting Sequence

# 1. Non‑idle sessions
gsql -d appdb -p 15400 -r -c "
SELECT pid, usename, client_addr, state, wait_event_type, wait_event,
       now() - query_start AS running_time, query
FROM pg_stat_activity WHERE state <> 'idle' ORDER BY query_start;
"

# 2. Lock information with object names
gsql -d appdb -p 15400 -r -c "
SELECT n.nspname, c.relname, l.pid, l.mode, l.granted
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_locks l ON l.relation = c.oid
ORDER BY l.granted, l.pid;
"

# 3. For distributed or cross‑node issues, check global_sessionid
gsql -d appdb -p 15400 -r -c "
SELECT pid, global_sessionid, locktype, mode, granted
FROM pg_locks WHERE pid = 123456;
"
Enter fullscreen mode Exit fullscreen mode

Lock issues in a gbase database are fundamentally about concurrency order, not SQL flaws. Stop the bleeding by finding the blocker first, drill down to see if it's a cross‑node global transaction, then build the habits that prevent the same conflicts from recurring. Follow these three steps and the chaos of lock waits and deadlocks becomes manageable.

Top comments (0)