DEV Community

Chasing down modest database scaling issues when you're not sure what's going on

Ben Halpern on July 19, 2018

This is a brief retro on some downtime we experienced at dev.to in the past day. I posted this in the process of investigating the issue: ...
Collapse
 
samcj profile image
Sam Johnson

If you haven't already it would probably make sense to add connection pooling. I believe Heroku allows you to funnel all of your connections through pgbouncer. Postgres doesn't like having a lot of connections open so it really only scales vertically to a specific limit (connection wise).

In any case it is better to have less connections open since there is a fairly high overheard per postgres connection.

Additionally, if you have workers connecting as well as web servers you can pool them separately in pgbouncer to make sure one doesn't starve the other.

Good luck!

Collapse
 
ben profile image
Ben Halpern

Yeah I think you're right. I feel like I wrongly assumed we weren't as close to this bottleneck as we were. These seem like the obvious next steps.

Collapse
 
dangolant profile image
Daniel Golant

Were the connections that were pushing y'all over the limit strictly necessary? Meaning, is there an underlying issue with connections staying open for longer than they "need" to? I think I was seeing something similar recently and would love some insight.

Collapse
 
ben profile image
Ben Halpern

It’s possible. I’d be happy to chat it out, either via #help or slip into my /connect to talk more.

Either way I might have some relevant thoughts for you.

Collapse
 
dangolant profile image
Daniel Golant

DMing :)

Collapse
 
nebojsac profile image
Nick Cinger

I love these real-world scenario articles - thanks for sharing!

What do you use for monitoring? We use NewRelic, but the price doesn't scale well, so are looking at setting up Nagios.

Collapse
 
ben profile image
Ben Halpern

We use Airbrake and Skylight and Timber as our main tools in this arena. In addition to pricing scale like you've mentioned, I ran into some issues with the script New Relic was injecting early on in the project when I was trying to optimize performance. There may have been other ways to include the service but I wound up just going in a different direction at the time and it's worked out well.

Collapse
 
rhymes profile image
rhymes • Edited

Glad you caught it!

A simple idea to improve the monitoring could be to create an alarm on logged database metrics.

With Standard tiers Heroku periodically logs a line with the statuts of the DB, in this line you have a key named: sample#active-connections=N where N is the number of opened connections.

I think that with Papertrail (probably also other logging addons) you can set an alert on that string that sends the event every 10 minutes or 1 hour to your preferred destination (like a low key "metrics" channel on Slack).

I think I'm going to try it myself tomorrow :D

Collapse
 
ben profile image
Ben Halpern

Yeah, we use Timber and do this sort of stuff but at the moment we've mostly just configured alerts around relatively known issues. We have the tools but we need to step up and really configure these things to be more useful.

Collapse
 
buinauskas profile image
Evaldas Buinauskas

Don't be too sad about it! It literally means that community grows. 😉

Collapse
 
yaser profile image
Yaser Al-Najjar

I'm recently diving into scaling apps... and yeah, most probably what you're looking for is a messaging system like RabbitMQ and SOA.

youtu.be/RhfyP8pEEc4