Team Timescale for Timescale

Originally published at timescale.com


How to Optimize Your PostgreSQL Ingest Rate (5 tips)

TL;DR: Practical tips to improve ingest rates in PostgreSQL and get more row inserts per second, including databases enhanced with TimescaleDB for time-series workloads.



The Challenge with High-Volume Data Ingestion

Time-series data workloads are relentless, and optimizing your ingest rate is a necessity. Since we enhance PostgreSQL to handle time-series data and other demanding workloads, almost every Timescale customer is interested in getting as many row inserts per second as possible into their TimescaleDB service.

How can you ensure you’re getting the best write performance? First, take a look at this blog post we published a while ago: 13 Tips to Improve PostgreSQL Insert Performance. Timescale is built on PostgreSQL, so the general best practices for maximizing your inserts into Postgres will also apply to Timescale.

However, we often find that what’s holding our Timescale customers back isn’t database tuning but a lack of cloud-native best practices. When dealing with a Timescale service, it’s essential to optimize not only at the database level but also at the cloud infrastructure level.

5 Practical Tips to Improve Your PostgreSQL Ingest Rate

1. Set realistic expectations

First, it's good to have a performance benchmark in mind. Based on experience, aim for ingesting 50-100k rows per second per ingest process. If your requirements exceed this, you'll need to implement multiple ingest processes.
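As a quick sizing sketch (using the 50-100k rows/second guideline above; the target rate below is just an illustrative assumption), you can estimate how many ingest processes you'll need:

import math

# Illustrative numbers only: pick your own target throughput, and use the
# 50-100k rows/second-per-process guideline to set the per-process rate.
target_rows_per_sec = 500_000
rows_per_sec_per_process = 75_000  # assumed mid-range per-process rate

processes_needed = math.ceil(target_rows_per_sec / rows_per_sec_per_process)
print(f"Plan for roughly {processes_needed} ingest processes")  # -> 7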

2. Ensure sufficient client-side resources

Your client machine's resources significantly impact write performance. If you're setting up multiple ingest processes, make sure you have enough CPU power.

For example, if you're writing to PostgreSQL from an AWS EC2 instance:

  • For a single ingest process, a 1-CPU instance might be adequate

  • For multiple processes, match your CPU count to at least the number of processes (e.g., four processes = four CPUs); the sketch below shows one way to structure this
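To make this concrete, here is a minimal sketch of parallel ingest using Python's multiprocessing with psycopg2, one worker per CPU. The connection string, the mytable schema, and the batch contents are placeholders, not something from the original post:

import multiprocessing as mp

import psycopg2
from psycopg2.extras import execute_values

DSN = "postgresql://user:password@host:5432/tsdb"  # placeholder connection string
NUM_PROCESSES = 4  # roughly match the number of client CPUs

def ingest_worker(batches):
    """Each worker keeps its own connection and writes one multi-row INSERT per batch."""
    conn = psycopg2.connect(DSN)
    with conn, conn.cursor() as cur:
        for batch in batches:
            execute_values(
                cur,
                "INSERT INTO mytable (time, device_id, cpu, disk_io) VALUES %s",
                batch,
            )
    conn.close()

if __name__ == "__main__":
    # work_queues comes from your own partitioning logic; each entry is a list of row batches
    work_queues = [[] for _ in range(NUM_PROCESSES)]
    workers = [mp.Process(target=ingest_worker, args=(queue,)) for queue in work_queues]
    for w in workers:
        w.start()
    for w in workers:
        w.join()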

3. Make sure you have enough database-side resources

Just like on your client side, your Postgres server needs sufficient resources to handle the incoming writes. We recommend having as many CPUs in the Timescale service as in your client.

If you’re not sure that you have enough resources in your Timescale service, you can use forks to make a copy of your database with more CPU and test if your ingest performance improves. Our customers find it very useful (and also cost-effective).

4. Minimize network latency

Is your data crossing the world? Try to put your client and server closer together.

Network throughput is a major factor in insert performance. To maximize throughput:

  • Host your client and database in the same region

  • If using cloud services, keep both in the same availability zone when possible

For example, if your PostgreSQL database is hosted in AWS us-east-1, your client application should also run in us-east-1 to minimize latency. This seemingly simple adjustment can yield substantial performance improvements.
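A quick way to sanity-check latency from your client is to time a trivial round trip; this is a rough sketch (the connection string is a placeholder), not a rigorous benchmark:

import time

import psycopg2

DSN = "postgresql://user:password@host:5432/tsdb"  # placeholder connection string

conn = psycopg2.connect(DSN)
with conn.cursor() as cur:
    samples = []
    for _ in range(20):
        start = time.perf_counter()
        cur.execute("SELECT 1")  # trivial query, so the time is mostly network round trip
        cur.fetchone()
        samples.append(time.perf_counter() - start)
conn.close()

print(f"median round trip: {sorted(samples)[len(samples) // 2] * 1000:.2f} ms")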

5. Batch your inserts (most important!)

This is perhaps the most crucial optimization: insert your data with many rows per INSERT statement. Don't insert row-by-row. Instead, try to write hundreds or thousands of rows in a single INSERT statement:

INSERT INTO mytable VALUES
    ('2022-06-01 12:00:00', 1, 1.11, 1.12),
    ('2022-06-01 13:00:00', 2, 2.21, 2.22);

For more PostgreSQL insert optimization techniques, check out this comprehensive guide: 13 Tips to Improve PostgreSQL Insert Performance.

You might also be interested in a benchmark comparing different PostgreSQL ingest methods: INSERT vs. Batch INSERT vs. COPY, which shows the dramatic performance differences between these approaches.
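If you want to try the COPY path from a client, here is a hedged sketch using psycopg2's copy_expert; the table and column names are placeholders, matching the INSERT example above:

import io

import psycopg2

DSN = "postgresql://user:password@host:5432/tsdb"  # placeholder connection string

# Build a small CSV payload in memory; in practice you would stream much larger batches.
rows = [
    ("2022-06-01 12:00:00", 1, 1.11, 1.12),
    ("2022-06-01 13:00:00", 2, 2.21, 2.22),
]
buf = io.StringIO("\n".join(",".join(map(str, r)) for r in rows) + "\n")

conn = psycopg2.connect(DSN)
with conn, conn.cursor() as cur:
    # COPY ... FROM STDIN is typically the fastest bulk-load path in PostgreSQL
    cur.copy_expert(
        "COPY mytable (time, device_id, cpu, disk_io) FROM STDIN WITH (FORMAT csv)",
        buf,
    )
conn.close()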

PostgreSQL for Time-Series Data

If you're working specifically with time-series data, TimescaleDB extends PostgreSQL with specialized capabilities while maintaining 100% compatibility with the PostgreSQL ecosystem. It's designed to handle the particular challenges of time-series workloads while preserving all the features you love about PostgreSQL.
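For context, a minimal sketch of what that looks like in practice (the conditions table and its columns are hypothetical; on a Timescale service the extension is already enabled for you):

import psycopg2

DSN = "postgresql://user:password@host:5432/tsdb"  # placeholder connection string

conn = psycopg2.connect(DSN)
with conn, conn.cursor() as cur:
    # Ordinary SQL through an ordinary PostgreSQL driver: create a regular table...
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS conditions (
            time        timestamptz NOT NULL,
            device_id   integer,
            temperature double precision
        )
        """
    )
    # ...then turn it into a hypertable partitioned on the time column
    cur.execute("SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE)")
conn.close()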

Conclusion

Remember that the perfect database configuration depends on your specific use case—there's no one-size-fits-all solution in the database world. These tips provide a starting point, but you'll want to test and adjust based on your particular workload characteristics.


If you still haven’t tried Timescale, you can use it for free for 30 days. It is the most convenient way to use TimescaleDB in production. You’ll get automatic backups and recovery, high availability via replicas in multiple availability zones, flexible resizing with autoscaling, and one-click database forking for testing and creating developer environments. And, of course, the advice of the Timescale Team. 😎
