Replication channels are the cornerstone of MySQL HeatWave inbound replication, establishing the connection between source databases and replica DB systems. Understanding channel states, operations, and troubleshooting techniques is essential for maintaining reliable data replication. This guide covers channel management, lifecycle states, common errors, and proven troubleshooting strategies.
Understanding Replication Channels
A replication channel represents a path through which changes produced by transactions on a source MySQL server or DB system are replicated to a target DB system.
What Channels Do
Each channel identifies:
- A replication source (hostname, port, credentials)
- Configuration required to connect to that source
- How to copy transactions from source to replica
- Replication positioning (GTID or binary log)
Important: Channels have their own lifecycle and can be enabled, disabled, reset, and resumed independently.
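For reference, a channel definition pulls all of these pieces together. Below is a minimal sketch of creating one from the OCI CLI using the generic --from-json parameter; the field names follow the CreateChannel API as documented, and all OCIDs, hostnames, and credentials are placeholders you would replace with your own values:
# Write the channel definition to a file (placeholder values)
cat > channel.json <<'EOF'
{
  "compartmentId": "<COMPARTMENT_OCID>",
  "displayName": "inbound-replication-channel",
  "source": {
    "sourceType": "MYSQL",
    "hostname": "<SOURCE_HOSTNAME>",
    "port": 3306,
    "username": "<REPL_USER>",
    "password": "<REPL_PASSWORD>",
    "sslMode": "REQUIRED"
  },
  "target": {
    "targetType": "DBSYSTEM",
    "dbSystemId": "<DB_SYSTEM_OCID>",
    "applierUsername": "<APPLIER_USER>",
    "channelName": "replication_channel"
  }
}
EOF
# Create the channel from the JSON definition
oci mysql channel create --from-json file://channel.json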
Channel States
MySQL HeatWave channels exist in either final states (stable) or transient states (temporary during operations).
Final States
1. ACTIVE
Channel is replicating data successfully.
Characteristics:
- Both IO and SQL threads running
- Transactions flowing from source to replica
- May have slight replication lag (normal)
- No errors reported
Console Indicator: Green icon with "Active" text
2. INACTIVE
Channel is not replicating; replication has been stopped.
Common Reasons:
- Channel manually disabled by administrator
- Created with "Enable automatically" unchecked
- DB system is stopped
Console Indicator: Gray icon with "Inactive" text
To Resume: Enable the channel through console or CLI
3. NEEDS_ATTENTION
Channel is not replicating due to an error.
Common Causes:
- Source server unreachable
- Authentication failure
- Binary log files purged on source
- SSL/TLS configuration mismatch
- Incompatible transactions
- DB system is in INACTIVE state
Console Indicator: Red/orange icon with "Needs Attention" text
Action Required: Check channel details page for error messages, resolve issue, then resume channel
4. DELETED
Channel has been permanently removed.
Characteristics:
- Appears briefly before removal from list
- Cannot be recovered
- All channel configuration deleted
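These lifecycle states are also visible from the CLI, which is handy for scripting health checks. A sketch listing all channels in a compartment (the compartment OCID is a placeholder):
# Show each channel's name and current lifecycle state
oci mysql channel list \
  --compartment-id <COMPARTMENT_OCID> \
  --query 'data[].{name:"display-name", state:"lifecycle-state"}' \
  --output table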
Transient States
These are temporary states during channel operations:
1. CREATING
Channel is being created and configured.
Duration: Typically 1-3 minutes
Next State: ACTIVE (if auto-enabled) or INACTIVE
2. UPDATING
Channel is performing an update operation.
Occurs During:
- Enabling or disabling channel
- Modifying channel configuration
- Resuming after error resolution
- DB system operations affecting channel
Duration: Typically 30 seconds to 2 minutes
3. DELETING
Channel deletion is in progress.
Duration: Typically 1-2 minutes
Next State: DELETED
Channel Operations
Enabling a Channel
Starts replication from the source if no errors exist.
Console:
- Navigate to Databases → HeatWave MySQL → Channels
- Select channel → Actions → Enable
- Channel transitions: INACTIVE → UPDATING → ACTIVE
CLI:
oci mysql channel update \
--channel-id <CHANNEL_OCID> \
--is-enabled true
Disabling a Channel
Stops replication and puts channel into INACTIVE state.
When to Disable:
- Maintenance on source server
- Temporary halt of replication
- Troubleshooting issues
- Before configuration changes
Console: Channel → Actions → Disable
CLI:
oci mysql channel update \
--channel-id <CHANNEL_OCID> \
--is-enabled false
Resuming a Channel
Restarts replication after error is resolved. Used when channel is in NEEDS_ATTENTION state.
Console: Channel → Actions → Resume
CLI:
oci mysql channel resume \
--channel-id <CHANNEL_OCID>
Behavior: Channel transitions to UPDATING, then ACTIVE if error resolved
Resetting a Channel
Clears all replication position data while keeping the channel configuration. The channel must be in the INACTIVE state before it can be reset.
Warning: Because the replication position is reset, previously applied transactions may be replicated again when the channel is re-enabled.
Console: Channel → Actions → Reset
CLI:
oci mysql channel reset \
--channel-id <CHANNEL_OCID>
Use Cases:
- Recovering from corrupted replication position
- Starting replication from beginning
- After major source database changes
Deleting a Channel
Permanently removes channel and all associated configuration.
Console: Channel details → Delete
CLI:
oci mysql channel delete \
--channel-id <CHANNEL_OCID> \
--force
Troubleshooting Inbound Replication
Step 1: Identify Source of Failure
When channel enters NEEDS_ATTENTION state, determine the root cause:
Check Channel Details Page
- Navigate to channel in console
- Click channel name to view details
- Review Error section for messages
- Check Last Error timestamp
- Note error code and description
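The same details are available from the CLI; the lifecycle-details field carries the error message shown in the console. A sketch (channel OCID is a placeholder):
# Show the channel's state and its last reported error message
oci mysql channel get \
  --channel-id <CHANNEL_OCID> \
  --query 'data.{state:"lifecycle-state", detail:"lifecycle-details"}'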
Check Replication Status on Replica
Connect to replica DB system and query:
SHOW REPLICA STATUS\G
Key Fields:
- Replica_IO_Running: Should be "Yes" (checks source connection)
- Replica_SQL_Running: Should be "Yes" (checks transaction application)
- Last_IO_Errno: IO thread error code
- Last_IO_Error: IO thread error message
- Last_SQL_Errno: SQL thread error code
- Last_SQL_Error: SQL thread error message
- Seconds_Behind_Source: Replication lag
Performance Schema Query:
SELECT * FROM performance_schema.replication_connection_status\G
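When the SQL (applier) thread is the one failing, the per-worker Performance Schema table usually carries the most specific error. A query sketch:
-- Per-worker applier errors; an empty LAST_ERROR_MESSAGE means no error
SELECT CHANNEL_NAME, WORKER_ID, LAST_ERROR_NUMBER,
       LAST_ERROR_MESSAGE, LAST_ERROR_TIMESTAMP
FROM performance_schema.replication_applier_status_by_worker\G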
Step 2: Identify Communication Failures
Verify connectivity between replica and source:
Test Network Connectivity
From replica DB system or compute instance in same VCN:
# Test if source is reachable
telnet <source_hostname> 3306
# Or use netcat
nc -zv <source_hostname> 3306
# Check DNS resolution
nslookup <source_hostname>
Test Replication User Authentication
mysql -h <source_hostname> \
-P 3306 \
-u <repl_user> \
-p
If the connection succeeds, authentication works and the issue lies elsewhere.
Common Error Codes and Solutions
Error 2003: Can't Connect to MySQL Server
Error Message: "Can't connect to MySQL server on '' (error_code)"
Meaning: Replica cannot establish network connection to source
Common Causes:
- Source MySQL server not running
- Firewall blocking port 3306
- Network path doesn't exist
- Wrong hostname or IP address
- Source server not listening on expected port
Solutions:
Verify source is running:
-- On source server
SHOW GLOBAL STATUS LIKE 'Uptime';
Check firewall rules:
# On source server (Linux)
sudo iptables -L | grep 3306
# Or for firewalld
sudo firewall-cmd --list-all | grep 3306
Verify MySQL listening on port:
# On source server
sudo netstat -tlnp | grep 3306
# Or
sudo ss -tlnp | grep 3306
Check bind-address:
-- On source server
SHOW VARIABLES LIKE 'bind_address';
Should be 0.0.0.0 or specific IP (not 127.0.0.1 for remote connections)
Test from replica network:
telnet <source_ip> 3306
Error 1045: Access Denied
Error Message: "Access denied for user ''@'' (using password: YES/NO)"
Meaning: Authentication failed - wrong credentials or user doesn't have access from replica IP
Common Causes:
- Incorrect username or password
- User not granted access from replica IP
- User exists but lacks REPLICATION SLAVE privilege
- Blank users on source conflicting
Solutions:
Verify replication user exists:
-- On source server
SELECT User, Host FROM mysql.user WHERE User = 'repl_user';
Check user privileges:
SHOW GRANTS FOR 'repl_user'@'%';
Should include: GRANT REPLICATION SLAVE ON *.*
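If the user or grant is missing entirely, a minimal sketch for creating it on the source (username, password, and host pattern are placeholders; narrow the host to the replica's egress IP or subnet where possible):
-- On source server
CREATE USER 'repl_user'@'%' IDENTIFIED BY '<STRONG_PASSWORD>' REQUIRE SSL;
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';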
Verify replica IP is allowed:
-- On source
SELECT User, Host FROM mysql.user WHERE User = 'repl_user';
Host should be %, replica subnet (e.g., 10.0.1.%), or specific IP
Remove blank users (common issue):
-- On source
SELECT User, Host FROM mysql.user WHERE User = '';
DROP USER ''@'localhost';
DROP USER ''@'<hostname>';
FLUSH PRIVILEGES;
Test authentication:
mysql -h <source> -u repl_user -p
Error 3159: Insecure Connection
Error Message: "Connections using insecure transport are prohibited while --require_secure_transport=ON"
Meaning: Source requires SSL/TLS but channel not configured for it
Solutions:
Option 1: Configure SSL in channel (recommended)
- Edit channel configuration
- Set SSL mode to "REQUIRED" or higher
- Provide SSL certificates if using VERIFY_CA or VERIFY_IDENTITY
Option 2: Disable require_secure_transport on source (not recommended)
-- On source server
SET GLOBAL require_secure_transport = OFF;
Verify SSL configuration:
-- On source
SHOW VARIABLES LIKE 'require_secure_transport';
SHOW VARIABLES LIKE 'have_ssl';
Error 13114: Binary Log Files Purged
Error Message: "Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'" or "The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires"
Meaning: Binary log files needed by replica have been deleted from source
Common Causes:
- GTID not supported or configured
- Binary log retention too short
- Long replication outage
- Binary logs manually purged
Solution 1: Fix GTID-Purged (For GTID-based replication)
Step 1: Retrieve gtid_executed from logical dump
When exporting source data with MySQL Shell:
mysqlsh user@source -- util dump-instance /path/to/dump \
--ocimds=true
Capture the gtid_executed value from the dump metadata (MySQL Shell records it as gtidExecuted in the dump's @.json file).
Or query source directly:
-- On source
SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
Step 2: Set gtid_purged on replica
Connect to replica and execute:
-- On replica
CALL sys.SET_GTID_PURGED('<gtid_executed_value>');
Example:
CALL sys.SET_GTID_PURGED('3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5');
Step 3: Resume channel
After setting gtid_purged, resume replication:
oci mysql channel resume --channel-id <CHANNEL_OCID>
Solution 2: Rebuild Replication (If GTID fix fails)
For severe cases:
Step 1: Delete channel and replica DB system
oci mysql channel delete --channel-id <CHANNEL_OCID> --force
oci mysql db-system delete --db-system-id <DB_SYSTEM_OCID> --force
Step 2: Take new snapshot of source
mysqlsh user@source -- util dump-instance /path/to/new-dump \
--excludeSchemas=mysql,sys,performance_schema,information_schema \
--ocimds=true
Step 3: Create new DB system
Create DB system and import data during creation or after.
Step 4: Create and enable channel
Create new channel with proper GTID positioning from new dump.
Best Practices for Channel Management
1. Monitor Channel Health
Set up alerts:
- Channel state changes to NEEDS_ATTENTION
- Replication lag exceeds threshold (e.g., 60 seconds)
- IO or SQL thread stops
Regular checks:
SHOW REPLICA STATUS\G
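Until proper alerting is in place, a simple scheduled check can cover the basics. A bash sketch, assuming the mysql client can reach the replica and that 60 seconds is your lag threshold:
#!/bin/bash
# Warn when replication lag on the replica exceeds a threshold (seconds)
THRESHOLD=60
LAG=$(mysql -h <REPLICA_HOST> -u admin -p'<PASSWORD>' -e "SHOW REPLICA STATUS\G" \
      | awk '/Seconds_Behind_Source/ {print $2}')
if [ -z "$LAG" ] || [ "$LAG" = "NULL" ]; then
  echo "Replication is not running or lag is unknown"
elif [ "$LAG" -gt "$THRESHOLD" ]; then
  echo "Replication lag is ${LAG}s (threshold ${THRESHOLD}s)"
fi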
2. Maintain Adequate Binary Log Retention
On source server:
-- Set binary log expiration (seconds)
SET GLOBAL binlog_expire_logs_seconds = 172800; -- 48 hours
For AWS RDS:
CALL mysql.rds_set_configuration('binlog retention hours', 48);
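To confirm the retention actually in effect and how far back the binary logs currently reach:
-- On source server
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
SHOW BINARY LOGS;  -- the oldest listed file is the earliest point a replica can resume from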
3. Document Channel Configuration
Maintain records of:
- Source connection details
- Replication user credentials
- SSL configuration
- GTID settings
- Network paths used
4. Test Before Production
- Create test channel in non-production
- Verify connectivity
- Test failover scenarios
- Document lessons learned
5. Plan for Maintenance Windows
Before source maintenance:
- Verify replication lag is minimal
- Optionally disable channel
- Perform maintenance
- Re-enable channel
- Monitor for catch-up
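A bash sketch of that sequence using the CLI commands shown earlier; the channel OCID is a placeholder, and the loop simply polls until the channel reports ACTIVE again:
#!/bin/bash
CHANNEL_ID=<CHANNEL_OCID>

# Disable the channel before source maintenance
oci mysql channel update --channel-id "$CHANNEL_ID" --is-enabled false

# ... perform maintenance on the source ...

# Re-enable the channel and wait for it to report ACTIVE
oci mysql channel update --channel-id "$CHANNEL_ID" --is-enabled true
while true; do
  STATE=$(oci mysql channel get --channel-id "$CHANNEL_ID" \
          --query 'data."lifecycle-state"' --raw-output)
  [ "$STATE" = "ACTIVE" ] && break
  echo "Channel state: $STATE, waiting..."
  sleep 30
done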
6. Use GTID When Possible
Benefits:
- Auto-positioning simplifies management
- Easier recovery from failures
- Better failover capabilities
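To confirm the source is set up for GTID-based replication before creating the channel:
-- On source server
SHOW VARIABLES LIKE 'gtid_mode';                 -- should be ON
SHOW VARIABLES LIKE 'enforce_gtid_consistency';  -- should be ON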
Advanced Troubleshooting
Replication Lag Increasing
Symptoms: Seconds_Behind_Source continuously growing
Diagnose:
-- Check what's being executed
SHOW REPLICA STATUS\G
-- Check running queries on replica
SHOW PROCESSLIST;
-- Check for lock waits
SELECT * FROM sys.innodb_lock_waits;
Solutions:
- Increase replica resources (upgrade shape)
- Optimize slow queries on source
- Break large transactions into smaller batches
- Check for long-running transactions (see the query sketch below)
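One way to spot long-running transactions that can hold up the applier, sketched against information_schema (the 60-second cutoff is an arbitrary example):
-- Transactions that have been open for more than 60 seconds
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
FROM information_schema.innodb_trx
WHERE trx_started < NOW() - INTERVAL 60 SECOND
ORDER BY trx_started;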
Channel Constantly Failing
Symptoms: Channel repeatedly enters NEEDS_ATTENTION
Diagnose:
- Check error patterns (same error repeating?)
- Review source server logs
- Check network stability
- Verify source server load
Solutions:
- Fix root cause (network, authentication, binary logs)
- Consider increasing connection timeouts
- Upgrade source server resources if overloaded
MySQL Schema Changes Causing Failures
Remember: Changes to the mysql schema are not replicated, and transactions that modify it can cause replication to stop
Solution:
- Manage user accounts independently
- Skip the problematic transaction (applies to binary log position-based replication; with GTID auto-positioning, inject an empty transaction for the offending GTID instead):
STOP REPLICA;
SET GLOBAL sql_replica_skip_counter = 1;
START REPLICA;
- Better: Avoid mysql schema changes during active replication
Effective channel management and troubleshooting are crucial for reliable MySQL HeatWave inbound replication. By understanding channel states, recognizing common error codes, and following best practices, you can maintain healthy replication and quickly resolve issues.
Key takeaways:
- Monitor channel states: ACTIVE, INACTIVE, NEEDS_ATTENTION are key indicators
- Check channel details: Error messages provide crucial troubleshooting information
- Common errors have common fixes: Network (2003), authentication (1045), SSL (3159), binary logs (13114)
- Use GTID: Simplifies management and recovery
- Maintain binary log retention: Prevent purged log issues
- Test thoroughly: Validate connectivity and authentication before production