DEV Community

Discussion on: Apply KISS to Infrastructure

Tizi

The idea is fine, but the lack of knowledge about ECS and downtime bothers me. You should have NO downtime if you're doing it right.

Gregory Ledray Author • Edited

You're right, this assumes you are OK with prod downtime during deployment. I am OK with temporary prod downtime because my front-end code has a request wrapper that implements both retries and a fallback call to the next-best environment if prod is down. For example, if I try to reach example.com/api/a and it is unreachable, the code then tries staging.example.com/api/a, which must be working or else I wouldn't have deployed to prod. Obviously, this requires additional setup that I didn't touch on in this post, and it isn't always practical.
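The wrapper itself isn't shown in the post, so the following is only a minimal sketch of the idea, assuming two environments ordered by preference. The names `requestWithFallback`, `Fetcher`, and `BASE_URLS` are hypothetical, as is the choice to treat network errors and 5xx responses (but not 4xx) as "environment is down". It retries immediately; a real wrapper would probably add a backoff delay between attempts.

```typescript
// Hypothetical sketch: none of these names come from the original post.
// A Fetcher is anything that resolves to an object with an HTTP status,
// so the browser's fetch can be passed in as (url) => fetch(url).
type Fetcher = (url: string) => Promise<{ status: number }>;

// Ordered from most to least preferred environment (assumed URLs).
const BASE_URLS: string[] = [
  "https://example.com",
  "https://staging.example.com",
];

async function requestWithFallback(
  path: string,
  fetcher: Fetcher,
  retriesPerHost: number = 2,
  baseUrls: string[] = BASE_URLS,
): Promise<{ status: number }> {
  let lastError: unknown = new Error("no base URLs configured");
  for (const base of baseUrls) {
    // One initial attempt plus `retriesPerHost` retries per environment.
    for (let attempt = 0; attempt <= retriesPerHost; attempt++) {
      try {
        const res = await fetcher(base + path);
        // Assumption: a 4xx is a real answer, not an outage, so return it.
        if (res.status < 500) return res;
        lastError = new Error(`HTTP ${res.status} from ${base}${path}`);
      } catch (err) {
        // Network-level failure: host unreachable, DNS error, and so on.
        lastError = err;
      }
    }
  }
  // Every environment failed; surface the most recent failure.
  throw lastError;
}
```

In a browser this would be called as `requestWithFallback("/api/a", (url) => fetch(url))`. Injecting the fetcher keeps the retry and fallback logic testable without a network.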

I wish I knew a way to implement this easily in AWS on the networking side (perhaps API Gateway has a way to call endpoint B if the call to endpoint A fails?) but as you point out, I don't know how.

Tizi

You can allow ECS to run an additional server while deploying, so it creates a new instance, drains the connections to the old one, then kills it: stackoverflow.com/questions/407311...
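For rolling updates, ECS controls this through the service's deployment configuration, specifically `minimumHealthyPercent` and `maximumPercent`. The sketch below is not taken from the linked answer; it only illustrates how those two percentages translate into task counts during a deployment. The helper `taskBounds` is hypothetical, and the rounding (minimum rounded up, maximum rounded down) follows my reading of the ECS documentation, so treat it as an assumption.

```typescript
// Field names mirror the ECS service deployment configuration;
// the helper and the example values are illustrative only.
interface DeploymentConfiguration {
  minimumHealthyPercent: number; // lower bound on running tasks, as % of desired count
  maximumPercent: number;        // upper bound on running tasks, as % of desired count
}

// Hypothetical helper: how many tasks may run while a deployment is in progress.
function taskBounds(
  desiredCount: number,
  cfg: DeploymentConfiguration,
): { minRunning: number; maxRunning: number } {
  return {
    // Assumed rounding: the minimum is rounded up, the maximum is rounded down.
    minRunning: Math.ceil((desiredCount * cfg.minimumHealthyPercent) / 100),
    maxRunning: Math.floor((desiredCount * cfg.maximumPercent) / 100),
  };
}

// With one desired task, 100/200 lets ECS start the replacement task
// before it stops the old one.
const noDowntime: DeploymentConfiguration = {
  minimumHealthyPercent: 100,
  maximumPercent: 200,
};
const bounds = taskBounds(1, noDowntime);
console.log(bounds.minRunning, bounds.maxRunning);
```

With a minimum of 100% the old task is never stopped before a new one is healthy, and a maximum of 200% gives ECS room to run both at once. How long old connections get to drain is a separate setting on the load balancer side.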

Gregory Ledray Author

This is good to know. I have no doubt that if I understood ECS better I could do deployments with zero downtime. But after spending dozens of hours debugging ECS only to realize the problem wasn't with ECS, it was with my VPC not having DNS set up properly, I've basically lost confidence in AWS documentation and debuggability and I'm trying out simpler solutions like the one in this post.