I see Data API less as an infrastructure improvement and more as a tool that changes how the team works. Giving your AI tools direct database access makes an order-of-magnitude difference in debugging and data-verification speed.
We enabled one AWS feature — Aurora Data API — and our AI coding tools could suddenly query the database. No bastion host, no port forwarding, no copy-pasting query results. Here's what that actually looks like in practice, and what you need to enable it.
TL;DR
- Data API is an HTTPS-based SQL execution API built into Aurora. Enabling it costs virtually nothing
- It complements SSM bastion hosts — use both
- Once enabled, Claude Code / Cursor can query your DB directly via shell commands or MCP
- Reduces team learning curve, simplifies bastion operations, and accelerates automation
- If you're running Aurora, there's no reason not to enable it
AI Coding Tool Integration: How Data API Changes Team Development
Let me start with the payoff — this is where Data API had the biggest impact for us.
Shell-Based Tool Use: Claude Code Runs AWS CLI Directly
With the traditional bastion host setup — a dedicated EC2 instance you SSH into (or tunnel through) to reach your private database — AI tools accessing the database was effectively impossible. SSH port forwarding + MySQL client connection is a workflow designed for humans operating manually.
With Data API, a single shell command — aws rds-data execute-statement — executes SQL over HTTPS. That means Claude Code can run this command through its built-in Bash tool (which lets it execute shell commands on your behalf) to query and modify your database directly.
# Example: what Claude Code actually runs
aws rds-data execute-statement \
--resource-arn "arn:aws:rds:ap-northeast-1:xxx:cluster:dev-cluster" \
--secret-arn "arn:aws:secretsmanager:ap-northeast-1:xxx:secret:dbsecret" \
--database "my_database" \
--sql "SHOW TABLES" \
--profile dev
This is the AI agent using a shell tool — it constructs and executes CLI commands, then parses the text output. It works, and it unlocks workflows like:
Debugging — a multi-step agentic loop:
- "Check the staging `users` table for records where `status` is null"
- Claude Code queries via Data API → finds 12 orphaned records
- It reasons about the cause, issues a follow-up query on the `user_sessions` table
- Identifies a race condition in the cleanup job → suggests a fix with a migration SQL
Migration authoring:
"Look at the current schema in dev and write a migration SQL for this spec" → auto-fetches current table definitions → generates diff SQL.
Data checks:
"How many records are in the production offices table, and when was the last update?" → instant answer.
Previously, a developer had to manually query through the bastion, copy-paste the results back to the AI, wait for analysis... Data API eliminates the human as the bottleneck in this loop.
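For a feel of what the agent actually parses in that loop: the Data API returns each field as a single-key typed dict (`longValue`, `stringValue`, `isNull`, ...). Here's a minimal sketch, with a made-up helper name and sample data, of flattening that response into plain rows:

```python
def flatten_records(response):
    """Convert Data API typed records ([{'longValue': 1}, ...]) to plain Python rows."""
    rows = []
    for record in response.get("records", []):
        row = []
        for field in record:
            if field.get("isNull"):
                row.append(None)
            else:
                # Each field is a single-key dict like {'longValue': 1} or {'stringValue': 'x'}
                row.append(next(iter(field.values())))
        rows.append(row)
    return rows

# Shaped like the output of: SELECT id, email FROM users LIMIT 2
resp = {"records": [
    [{"longValue": 1}, {"stringValue": "a@example.com"}],
    [{"longValue": 2}, {"isNull": True}],
]}
print(flatten_records(resp))  # → [[1, 'a@example.com'], [2, None]]
```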
MCP-Based Structured Tool Use: Natural Language DB Access
The second integration pattern is more powerful. MCP (Model Context Protocol) lets AI tools like Cursor and Claude Code connect to external data sources through typed, schema-defined interfaces. Instead of parsing free-form CLI output, the agent receives structured data with column names and types — making its actions more reliable and predictable.
The official MySQL MCP Server from AWS Labs uses Data API internally. Here's how to wire it up:
{
  "mcpServers": {
    "awslabs.mysql-mcp-server": {
      "command": "uvx",
      "args": [
        "awslabs.mysql-mcp-server@latest",
        "--resource_arn", "<cluster-arn>",
        "--secret_arn", "<secret-arn>",
        "--database", "<db-name>",
        "--region", "ap-northeast-1",
        "--readonly", "True"
      ],
      "env": {
        "AWS_PROFILE": "dev",
        "AWS_REGION": "ap-northeast-1"
      }
    }
  }
}
Change `ap-northeast-1` to your AWS region.
Once configured, you can query from Cursor or Claude Code using natural language:
- "How many support tickets came in this month?"
- "Show me all active users created in the last 30 days"
- "Which records were updated in the last 7 days?"
Guardrails for Production Use
With AI tools querying your database, safety matters. Here's our approach:
- `--readonly True` in MCP config: Restricts the MCP server to SELECT-only queries. This is enforced at the MCP server level, but should not be your only line of defense.
- Read-only DB user: Create a dedicated database user with SELECT-only privileges and store those credentials in a separate Secrets Manager secret. Use this secret for AI tool access — not the application's read-write credentials.
- IAM policy scoping: Restrict the IAM role to `rds-data:ExecuteStatement` and `secretsmanager:GetSecretValue` on the specific read-only secret ARN.
- CloudTrail audit trail: Every Data API call is logged in CloudTrail, including the SQL statement. This gives you post-hoc observability of everything the agent executed.
- Human-in-the-loop for writes: Removing the human from read queries is a productivity win. For write operations, the human-as-bottleneck is actually the safety mechanism. Keep approval workflows for any non-read access.
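The IAM scoping above can be sketched as a policy like the following. The account ID, cluster name, and secret name are placeholders; substitute your own ARNs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["rds-data:ExecuteStatement"],
      "Resource": "arn:aws:rds:ap-northeast-1:111111111111:cluster:dev-cluster"
    },
    {
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "arn:aws:secretsmanager:ap-northeast-1:111111111111:secret:readonly-db-secret-*"
    }
  ]
}
```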
Note on prompt injection: If your database contains user-generated content, be aware that query results can include text designed to manipulate the agent's next action; treat returned data as untrusted input. Separately, use the Data API's parameterized queries rather than string-concatenated SQL to mitigate classic SQL injection.
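As a concrete sketch of parameterized execution: the `build_params` helper below and its type mapping are my own simplification, while `execute_statement` and its `parameters` shape are the actual boto3 `rds-data` interface.

```python
def build_params(**kwargs):
    """Map plain Python values to Data API parameter entries.
    Simplified sketch: handles str, int, float, bool, and None only."""
    type_map = {bool: "booleanValue", int: "longValue",
                float: "doubleValue", str: "stringValue"}
    params = []
    for name, value in kwargs.items():
        if value is None:
            params.append({"name": name, "value": {"isNull": True}})
        else:
            params.append({"name": name, "value": {type_map[type(value)]: value}})
    return params

# Usage (requires real ARNs and AWS credentials):
# import boto3
# client = boto3.client("rds-data", region_name="ap-northeast-1")
# client.execute_statement(
#     resourceArn="<cluster-arn>",
#     secretArn="<secret-arn>",
#     database="my_database",
#     sql="SELECT id FROM users WHERE status = :status",
#     parameters=build_params(status="active"),
# )
print(build_params(status="active"))
# → [{'name': 'status', 'value': {'stringValue': 'active'}}]
```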
Team Learning Curve Drops Dramatically
This one matters if you're leading a team.
What team members previously had to learn for DB access:
- AWS SSO login workflow
- AWS Systems Manager (SSM) Session Manager installation and configuration
- Port forwarding concepts and execution
- MySQL client installation and connection setup
- Retrieving passwords from Secrets Manager
With Data API + AI tools:
- AWS CLI profile setup (most engineers already know this)
- "Show me the data for X" → ask the AI
Five steps become effectively one. For onboarding, it's now "Log in with AWS SSO, then just ask Claude" — and that's it.
The barrier for non-infrastructure engineers and frontend developers who just want to "quickly check some data" drops dramatically.
Reduced Operational Burden
For small teams, the potential to eliminate bastion host maintenance is a significant win.
Complete bastion elimination depends on your team's needs — bulk data exports and complex investigations still benefit from SSM. But the majority of day-to-day "let me quickly check this data" or "update a few records" use cases can be handled by Data API.
As bastion usage drops, you can justify switching from always-on to on-demand — further reducing costs and maintenance.
What Is RDS Data API?
RDS Data API lets you execute SQL against Aurora via HTTPS REST calls.
Traditional database connections rely on the MySQL wire protocol over TCP/IP. Data API replaces that with standard HTTP requests through the AWS SDK or CLI. No VPC required — which also means faster Lambda cold starts, simpler networking, and no ENI provisioning delays.
Supported Engines and Limitations
Data API is Aurora-only. Standard RDS is not supported.
| Engine | Data API Support |
|---|---|
| Aurora MySQL (v3.07+) | Supported |
| Aurora PostgreSQL (v13.12+, 14.9+, 15.4+) | Supported |
| Standard RDS MySQL / PostgreSQL | Not supported |
Key limitations:
| Constraint | Value |
|---|---|
| Response size | 1 MB max per request |
| Row size | 64 KB max per row |
| Timeout | 45 seconds max per request |
| Multi-statement | Not supported on MySQL |
| Target | Writer instance only |
The 1 MB response limit means Data API isn't suited for bulk SELECTs or data exports. That's where the SSM bastion still earns its keep.
Note on multi-statement: If your CI/CD migration tool (Flyway, Liquibase, etc.) uses multi-statement SQL files, you'll need to split statements or use a wrapper for MySQL.
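A deliberately naive sketch of that splitting approach (not production-ready: it ignores semicolons inside string literals and comments, which a real migration wrapper must handle):

```python
def split_statements(sql_script: str) -> list[str]:
    """Break a multi-statement migration file into single statements
    for one-at-a-time execution via the Data API.
    Naive: does not handle semicolons inside strings or comments."""
    return [stmt.strip() for stmt in sql_script.split(";") if stmt.strip()]

migration = """
CREATE TABLE offices (id INT PRIMARY KEY);
INSERT INTO offices VALUES (1);
"""
for stmt in split_statements(migration):
    print(stmt)
# → CREATE TABLE offices (id INT PRIMARY KEY)
# → INSERT INTO offices VALUES (1)
```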
Pricing
Enabling is free. Usage costs $0.35 per million requests.
1,000 SQL executions per month = $0.00035. Not exactly a budget consideration.
Data API vs SSM Session Manager
This was the decision point I wrestled with most. The answer: it's not either/or — it's both.
Use Case Breakdown
| Use Case | Data API | SSM Bastion |
|---|---|---|
| Lambda → DB operations | Ideal (no VPC needed) | Not possible |
| CI/CD migrations | Good fit | Possible but complex |
| Scheduled batch scripts | Good fit (IAM auth) | Requires session management |
| Application integration | Ideal (pure HTTP) | Poor fit |
| Manual data investigation | Limited | Ideal (GUI tools) |
| Bulk data export | Poor fit (1 MB limit) | Ideal |
| Emergency manual UPDATEs | Possible but clunky | Ideal |
Operational Cost Comparison
| Factor | Data API | SSM Bastion |
|---|---|---|
| Initial setup | Near-zero (one console click) | EC2 + SSM + security group config |
| Monthly cost | ~$0 | $3-8/mo (t4g.nano always-on) |
| Maintenance | None (fully managed) | OS patches, SSM Agent updates |
| Audit logging | CloudTrail auto-records all queries | Dual management: CloudTrail + DB audit logs |
| Connection management | None (no persistent connections) | max_connections tuning required |
The annual cost of a bastion host ($36-96) looks small on paper, but factor in the human cost of OS patching, SSM Agent updates, and recovery when the instance goes down — it adds up fast, especially for small teams.
Security Comparison
| Factor | Data API | SSM Bastion |
|---|---|---|
| Attack surface | HTTPS (IAM-protected) | SSM only (no inbound ports) |
| Credential management | Secrets Manager (auto-rotation) | DB password stored locally |
| Access control | IAM policies control API calls | Dual management: SSM permissions + DB user permissions |
Data API delegates credentials entirely to Secrets Manager, meaning developers never need to know the DB password. That's a significant security win.
How to Enable It
Prerequisites
| Requirement | Details |
|---|---|
| DB engine | Aurora MySQL v3.07+ or Aurora PostgreSQL v13.12+ |
| Instance class | Any class except T instances (for provisioned; Serverless v2 is unaffected) |
| Secrets Manager | DB credentials must be stored |
If you're on standard RDS, Data API isn't available. Whether to migrate to Aurora for Data API alone depends on the other benefits (Serverless v2 autoscaling, etc.).
CLI
# For Serverless v2 / Provisioned clusters
aws rds enable-http-endpoint \
--resource-arn <cluster-arn> \
--profile <profile>
Important: For Serverless v2 / Provisioned clusters, use `enable-http-endpoint`. The `modify-db-cluster --enable-http-endpoint` command is for Serverless v1 only (now end-of-life). The docs are confusing on this — watch out.
Verification
aws rds-data execute-statement \
--resource-arn "<cluster-arn>" \
--secret-arn "<secret-arn>" \
--database "<db-name>" \
--sql "SELECT 1 AS test" \
--profile <profile>
# → {"records": [[{"longValue": 1}]], "numberOfRecordsUpdated": 0}
Preventing CDK Drift
If you enable via CLI without updating your CDK code, the next deploy might reset it to false.
const cluster = new rds.DatabaseCluster(this, 'Cluster', {
engine: rds.DatabaseClusterEngine.auroraMysql({
version: rds.AuroraMysqlEngineVersion.VER_3_08_0,
}),
// ... existing config ...
enableDataApi: true, // ← Add this
});
Always update your CDK code alongside the CLI enablement. If CDK drift reverts the setting, any MCP servers or automation scripts that depend on Data API will break simultaneously.
Risk Assessment
Enabling Data API carries virtually zero risk.
- No impact on existing applications or database connections
- Even when enabled, access requires IAM permissions
- Can be disabled instantly with a single command
- Cost is effectively zero
The only caveat: prevent CDK drift. If you forget to add `enableDataApi: true` to your code, the next deploy reverts the setting.
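If you do need to roll back, disabling mirrors enabling. A sketch with placeholder ARN and profile:

```shell
# Disable the Data API HTTP endpoint on a Serverless v2 / Provisioned cluster
aws rds disable-http-endpoint \
  --resource-arn <cluster-arn> \
  --profile <profile>
```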
Should You Enable It?
If you're running Aurora, enabling Data API is one of those "no reason not to" improvements.
- Near-zero risk and cost to enable
- Complements SSM bastion for full use-case coverage
- AI coding tool integration accelerates the entire team
- Directly reduces learning curve and operational overhead
"You don't need to know the DB password. You don't need to set up port forwarding. You just ask Claude." — That's the new onboarding experience.
If you have Aurora clusters that haven't enabled it yet, start with your dev environment. One command, five seconds.
Are you already using Data API with AI tools? Or still running bastion hosts for everything? I'd love to hear what your team's setup looks like — drop a comment below.