The Surprising Cost of Athena Queries
We started by investigating our Athena queries, which we run through the @aws-sdk/client-athena package. Athena bills by the amount of data each query scans, and our queries were scanning entire tables even when they needed only a small subset of the data. The cause was large, unpartitioned tables: with no partitions to prune, every query reads the whole table, which can be a billing disaster.
import { AthenaClient, StartQueryExecutionCommand } from '@aws-sdk/client-athena';

const athenaClient = new AthenaClient({ region: 'us-west-2' });
// SELECT * with no WHERE clause reads every column of every row
const query = 'SELECT * FROM my_table';
const params = {
  QueryString: query,
  QueryExecutionContext: {
    Database: 'my_database',
  },
  ResultConfiguration: {
    OutputLocation: 's3://my-bucket/athena-results/',
  },
};
const command = new StartQueryExecutionCommand(params);
// The response contains the QueryExecutionId, not the query results
athenaClient.send(command).then((data) => {
  console.log(data);
}).catch((err) => {
  console.error(err);
});
An error reporting that your query has exceeded the maximum allowed scan size should be a red flag. Athena only raises it when a per-query data limit is configured on the workgroup, and it means the query is scanning too much data, which leads to high costs.
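To make the partitioning point concrete, here is a minimal sketch of a query builder that names its columns and filters on a partition column, plus a rough cost estimate from bytes scanned. Everything specific in it is an assumption for illustration: that my_table is partitioned by a dt column in YYYY-MM-DD format, the price of $5 per TB (it varies by region), and treating a TB as 2^40 bytes. The helper names are hypothetical.

```typescript
// Assumed list price per TB scanned; check the pricing for your region.
const ATHENA_USD_PER_TB = 5;
// Athena bills a minimum of 10 MB per query.
const MIN_BILLED_BYTES = 10 * 1024 * 1024;
// Assumption: 1 TB is treated as 2^40 bytes here.
const BYTES_PER_TB = 1024 ** 4;

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Builds a query that selects only the named columns and restricts the scan
// to a date range on the (hypothetical) partition column `dt`.
function buildPartitionedQuery(columns: string[], startDate: string, endDate: string): string {
  if (columns.length === 0 || !columns.every((c) => IDENTIFIER.test(c))) {
    throw new Error('Columns must be a non-empty list of plain identifiers');
  }
  if (!ISO_DATE.test(startDate) || !ISO_DATE.test(endDate)) {
    throw new Error('Dates must be in YYYY-MM-DD format');
  }
  return `SELECT ${columns.join(', ')} FROM my_table WHERE dt BETWEEN '${startDate}' AND '${endDate}'`;
}

// Rough cost estimate for a query, given the bytes it scanned.
function estimateQueryCostUsd(bytesScanned: number, usdPerTb: number = ATHENA_USD_PER_TB): number {
  const billedBytes = Math.max(bytesScanned, MIN_BILLED_BYTES);
  return (billedBytes / BYTES_PER_TB) * usdPerTb;
}
```

The bytes-scanned figure for a finished query is available from GetQueryExecution under Statistics.DataScannedInBytes, so the estimate can be logged per query to spot full-table scans early.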
Logging and Optimization: Where We Went Wrong
We realized that our logging and optimization strategies were also contributing to the high costs. We were using the @aws-sdk/client-cloudwatch-logs package to log our queries, but we had not configured a retention period on the log group. CloudWatch Logs keeps log events indefinitely by default, so the logs accumulated and the storage costs grew with them.
// PutLogEvents belongs to the CloudWatch Logs client, not the CloudWatch metrics client
import { CloudWatchLogsClient, PutLogEventsCommand } from '@aws-sdk/client-cloudwatch-logs';

const logsClient = new CloudWatchLogsClient({ region: 'us-west-2' });
// The log group and log stream must already exist
const logGroupName = 'my-log-group';
const logStreamName = 'my-log-stream';
// CloudWatch Logs request fields are lowerCamelCase in SDK v3
const logEvents = [
  {
    message: 'This is a log message',
    timestamp: Date.now(), // milliseconds since the epoch
  },
];
const command = new PutLogEventsCommand({
  logGroupName,
  logStreamName,
  logEvents,
});
logsClient.send(command).then((data) => {
  console.log(data);
}).catch((err) => {
  console.error(err);
});
Be careful with metric resolution in CloudWatch. 1-second metrics can cost 3x more than 1-minute metrics. Make sure to choose the right resolution for your use case.
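For custom metrics, resolution is chosen per data point through the StorageResolution field of PutMetricData, where 60 is standard resolution and 1 is high resolution. The sketch below builds a datum that defaults to standard resolution unless the caller opts in. The interface is a trimmed local stand-in for the SDK's MetricDatum type, and the function name and the metric name in the usage note are hypothetical.

```typescript
// Trimmed local stand-in for the fields of the SDK's MetricDatum used here.
interface MetricDatumSketch {
  MetricName: string;
  Value: number;
  Unit: 'Count' | 'Milliseconds' | 'Bytes';
  Timestamp: Date;
  StorageResolution: 1 | 60; // 1 = high resolution, 60 = standard resolution
}

// Builds a datum at standard (1-minute) resolution unless high resolution
// is requested explicitly, so the costlier option is always a deliberate choice.
function buildMetricDatum(
  name: string,
  value: number,
  highResolution: boolean = false,
  timestamp: Date = new Date(),
): MetricDatumSketch {
  return {
    MetricName: name,
    Value: value,
    Unit: 'Count',
    Timestamp: timestamp,
    StorageResolution: highResolution ? 1 : 60,
  };
}
```

A call such as buildMetricDatum('QueriesStarted', 1) produces a standard-resolution datum that could be passed in the MetricData array of a PutMetricDataCommand.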
Implementing the Fix with Node.js and TypeScript
To fix the issues, we started by rewriting our Athena calls with async/await syntax and using the satisfies operator to type-check the request parameters. We also implemented a proper logging strategy using CloudWatch Logs, with a configured log retention period.
import {
  AthenaClient,
  StartQueryExecutionCommand,
  type StartQueryExecutionCommandInput,
} from '@aws-sdk/client-athena';
import { CloudWatchLogsClient, PutLogEventsCommand } from '@aws-sdk/client-cloudwatch-logs';

// Create each client once and reuse it across calls
const athenaClient = new AthenaClient({ region: 'us-west-2' });
const logsClient = new CloudWatchLogsClient({ region: 'us-west-2' });

async function executeQuery(query: string): Promise<string | undefined> {
  // satisfies checks the object against the SDK input type without widening it
  const params = {
    QueryString: query,
    QueryExecutionContext: {
      Database: 'my_database',
    },
    ResultConfiguration: {
      OutputLocation: 's3://my-bucket/athena-results/',
    },
  } satisfies StartQueryExecutionCommandInput;
  // StartQueryExecution only starts the query; it returns before the query finishes
  const data = await athenaClient.send(new StartQueryExecutionCommand(params));
  return data.QueryExecutionId;
}

async function logMessage(message: string): Promise<void> {
  // CloudWatch Logs request fields are lowerCamelCase in SDK v3
  await logsClient.send(new PutLogEventsCommand({
    logGroupName: 'my-log-group',
    logStreamName: 'my-log-stream',
    logEvents: [{ message, timestamp: Date.now() }],
  }));
}

async function main(): Promise<void> {
  try {
    const queryExecutionId = await executeQuery('SELECT * FROM my_table');
    await logMessage(`Query started: ${queryExecutionId}`);
  } catch (err) {
    // Errors from either call land here, so a failure is never logged as a success
    console.error(err);
  }
}

main();
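The satisfies operator mentioned above can be illustrated in isolation. This is a small sketch, assuming TypeScript 4.9 or later; QueryInputSketch is a hypothetical, trimmed stand-in for the SDK's StartQueryExecutionCommandInput so the example runs without the SDK installed. The point it shows is that satisfies validates the object against the type while keeping the narrower inferred type.

```typescript
// Hypothetical, trimmed stand-in for StartQueryExecutionCommandInput.
interface QueryInputSketch {
  QueryString: string;
  QueryExecutionContext?: { Database?: string };
  ResultConfiguration?: { OutputLocation?: string };
}

// The object is checked against QueryInputSketch: a misspelled key or a
// wrongly typed value here would be a compile-time error.
const params = {
  QueryString: 'SELECT id FROM my_table LIMIT 10',
  QueryExecutionContext: { Database: 'my_database' },
  ResultConfiguration: { OutputLocation: 's3://my-bucket/athena-results/' },
} satisfies QueryInputSketch;

// Because satisfies keeps the inferred type, these nested fields are known
// to be present strings. With a plain `: QueryInputSketch` annotation they
// would be possibly undefined and this assignment would not compile.
const outputLocation: string = params.ResultConfiguration.OutputLocation;
```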
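When using async/await, make sure to handle errors properly: wrap awaited calls in try/catch, or the rejected promise goes unhandled. A different failure, TypeError: Cannot read properties of undefined (reading 'send'), means the client variable itself is undefined, typically because the client class was never imported or instantiated.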
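One error-handling detail is easy to miss: StartQueryExecution returns as soon as the query is accepted, so a query that later fails never throws from that call. Below is a minimal polling sketch that turns a failed or cancelled query into a thrown error. The getState callback is a hypothetical injection point; in practice it would wrap GetQueryExecutionCommand and read QueryExecution.Status.State. The interval and attempt limits are arbitrary defaults.

```typescript
// The states Athena reports for a query execution.
type QueryState = 'QUEUED' | 'RUNNING' | 'SUCCEEDED' | 'FAILED' | 'CANCELLED';

// Polls until the query reaches a terminal state. Resolves on SUCCEEDED and
// throws on FAILED, CANCELLED, or when the attempt budget is used up.
async function waitForQuery(
  getState: () => Promise<QueryState>, // hypothetical: supplied by the caller
  intervalMs: number = 1000,
  maxAttempts: number = 60,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const state = await getState();
    if (state === 'SUCCEEDED') {
      return;
    }
    if (state === 'FAILED' || state === 'CANCELLED') {
      throw new Error(`Athena query ended in state ${state}`);
    }
    // Still QUEUED or RUNNING: wait before checking again
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Query did not finish after ${maxAttempts} checks`);
}
```

Awaiting waitForQuery inside the same try/catch as the start call means a failed query surfaces as an error instead of being logged as a success.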
CloudWatch Logs: The Hidden Cost We Overlooked
We also discovered that our CloudWatch log groups had no retention policy, so their logs were kept indefinitely, leading to a significant accumulation of logs and high costs. We implemented a log retention period of 30 days to mitigate this issue.
import {
  CloudWatchLogsClient,
  CreateLogGroupCommand,
  PutRetentionPolicyCommand,
} from '@aws-sdk/client-cloudwatch-logs';

const logsClient = new CloudWatchLogsClient({ region: 'us-west-2' });
const logGroupName = 'my-log-group';
const retentionInDays = 30;

async function createLogGroupWithRetention() {
  try {
    // Create the group first; this throws ResourceAlreadyExistsException if it already exists
    await logsClient.send(new CreateLogGroupCommand({ logGroupName }));
    // Set the retention policy only after the group exists
    await logsClient.send(new PutRetentionPolicyCommand({ logGroupName, retentionInDays }));
    console.log(`Retention for ${logGroupName} set to ${retentionInDays} days`);
  } catch (err) {
    console.error(err);
  }
}

createLogGroupWithRetention();
A ResourceNotFoundException reporting that the log group (for example '/aws/lambda/my-function') does not exist can occur if you try to put a retention policy on a log group that has not been created yet.
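A second retention pitfall is that PutRetentionPolicy accepts only specific values for the number of days, so an arbitrary number such as 45 is rejected. The sketch below rounds a requested period up to the next accepted value. The list reflects the values documented at the time of writing as best I recall them and should be treated as an assumption to verify against the current API reference; the function name is hypothetical.

```typescript
// Retention periods accepted by PutRetentionPolicy (assumption: verify
// against the current CloudWatch Logs API reference before relying on it).
const ALLOWED_RETENTION_DAYS: readonly number[] = [
  1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731,
  1096, 1827, 2192, 2557, 2922, 3288, 3653,
];
const MAX_RETENTION_DAYS = 3653;

// Rounds a requested retention period up to the nearest accepted value,
// capping at the largest one, so logs are never kept for less time than asked.
function toAllowedRetentionDays(requestedDays: number): number {
  if (!Number.isFinite(requestedDays) || requestedDays < 1) {
    throw new Error('Retention must be at least 1 day');
  }
  const match = ALLOWED_RETENTION_DAYS.find((days) => days >= requestedDays);
  return match ?? MAX_RETENTION_DAYS;
}
```

Rounding up rather than down is a deliberate choice here: it costs slightly more storage but avoids deleting logs earlier than the requested period.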
The Takeaway
Here are the key takeaways from our experience:
- Optimize your Athena queries to scan only the necessary data.
- Implement a proper logging strategy using CloudWatch, with a configured log retention period.
- Use async/await syntax and the satisfies operator for type safety when executing Athena queries.
- Be mindful of metric resolution in CloudWatch and choose the right resolution for your use case.
- Handle errors properly when using async/await to avoid unexpected behavior.
- Implement a log retention period to limit log accumulation and the costs that come with it.
Remember that the numbers depend on the specifics of your use case: costs can vary by 20-50% based on region, query complexity, and data size, and 1-second metrics can cost 3x more than 1-minute metrics.
Transparency notice
This article was generated by Me (Dinesh).
The topic was scouted from live AWS and Node.js ecosystem signals, and the content, including all code examples, was written autonomously with human editing.
Published: 2026-05-20 · Primary focus: Athena
All code blocks are intended to be correct and runnable, but please verify them against the official AWS SDK v3 docs before using in production.
Find an error? Drop a comment. Corrections are always welcome.