Hi Dev.to!
Some questions for all you performance aficionados and AWS / Cloud experts out there.
I'm looking for a cheap (as close to free as possible) service for:
1. Hosting AND serving images. These images will be used on a website, in emails, etc. I want to plan for:
- 100GB of added storage / month
- 100M image views (GET requests) / month
- 100K new image uploads (PUT / POST requests) / month
2. CDN / Edge caching - so as to serve requests as fast as possible. Here I am looking to reduce response times and website load times that end-users will experience.
AWS has an amazing suite of products, but it is also very difficult to get started with. S3's pricing model is confusing. I did play a bit with their calculator, but it's hard to tell whether I'm entering the numbers correctly.
Q1: In the AWS ecosystem:
- For S3: What is "Storage pricing" vs "Request Pricing"?
- What is S3 Select and how is it different from S3?
- What is S3 Intelligent-Tiering?
- What is S3 Glacier?
- And what about Amazon CloudFront?
Q2: Is AWS the best (and cheapest) available option? What about services like:
- Cloudflare
- Cloudinary
- Photon by Jetpack etc?
- Versus using my Linode server itself for hosting and serving images?
- versus the 1000+ other options out there?
Thoughts on what service I should be using? Looking for advice from folks that are knowledgeable on the matter.
Top comments (10)
Storage pricing is the cost of actually storing the data. It's measured in GB of space usage per month. Pretty much, this is the cost of disk-space.
Request pricing is how much you have to pay for each API request made. If you just upload your data and then let it sit there, this works out to essentially zero. This is paying for usage of the API itself. It's typically billed per thousand or million requests by request type.
Note that bandwidth is billed separately from both storage and requests. Inbound transfer is usually free, but outbound data transfer is metered much like storage: you pay per GB transferred out each month.
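To make the three line items concrete, here is a rough sketch that adds up storage, request, and bandwidth charges for the workload in the question (100 GB stored, 100M GETs, 100K PUTs). Every rate below is a made-up placeholder, not a quote from any provider, and the 100 KB average image size is also an assumption; plug in the current numbers from the provider's pricing page before trusting the result.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical per-unit prices in USD. Replace with real, current rates."""
    storage_per_gb_month: float
    per_1000_get: float
    per_1000_put: float
    egress_per_gb: float


def monthly_cost(rates, stored_gb, get_requests, put_requests, egress_gb):
    """Return the three billing line items plus their total for one month."""
    storage = stored_gb * rates.storage_per_gb_month
    requests = (get_requests / 1000) * rates.per_1000_get \
        + (put_requests / 1000) * rates.per_1000_put
    bandwidth = egress_gb * rates.egress_per_gb
    return {
        "storage": storage,
        "requests": requests,
        "bandwidth": bandwidth,
        "total": storage + requests + bandwidth,
    }


# Placeholder rates, chosen only for illustration.
example_rates = Rates(
    storage_per_gb_month=0.02,
    per_1000_get=0.0004,
    per_1000_put=0.005,
    egress_per_gb=0.09,
)

# Assumption: images average 100 KB, so outbound traffic = views * size.
avg_image_kb = 100
egress_gb = 100_000_000 * avg_image_kb / 1_000_000  # KB -> GB (decimal units)

estimate = monthly_cost(
    example_rates,
    stored_gb=100,
    get_requests=100_000_000,
    put_requests=100_000,
    egress_gb=egress_gb,
)

for item, usd in estimate.items():
    print(f"{item}: {usd:.2f} USD")
```

With these placeholder numbers the bandwidth line is by far the largest and storage the smallest, which is why a cache or CDN in front of the storage tends to matter more for a workload like this than the per-GB storage price.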
I can't comment on this one, as I've never looked into S3 select.
The general concept is pretty simple. S3 offers a couple of different storage options optimized for different access patterns. The two big ones are the Standard tier (better optimized for frequent access to smaller amounts of data) and the Infrequent Access (or IA) tier (better optimized for infrequent access to larger amounts of data; it costs less per GB stored, but more per request to access it).
Intelligent Tiering is a special storage option that will look at your access patterns, and automatically use the tier that will save you the most money. It costs a bit extra, but will usually end up saving you money for cases where you don't know ahead of time what your access patterns will look like.
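The trade-off between the two tiers can be sketched as a simple comparison. The prices here are placeholders I made up, and the model is deliberately simplified (it ignores things like per-GB retrieval fees, minimum storage durations, and the Intelligent-Tiering monitoring fee), so treat it as an illustration of the idea rather than a real calculator.

```python
def tier_cost(stored_gb, requests, price_per_gb, price_per_1000_requests):
    """Monthly cost of one tier: storage charge plus request charge."""
    return stored_gb * price_per_gb + (requests / 1000) * price_per_1000_requests


def cheaper_tier(stored_gb, requests, tiers):
    """Return the name of the tier with the lowest monthly cost."""
    return min(
        tiers,
        key=lambda name: tier_cost(stored_gb, requests, *tiers[name]),
    )


# Hypothetical (price per GB-month, price per 1000 requests) pairs in USD.
tiers = {
    "standard": (0.02, 0.0004),
    "infrequent": (0.01, 0.001),
}

# Lots of reads on a small data set vs. few reads on a large one.
print(cheaper_tier(stored_gb=100, requests=100_000_000, tiers=tiers))
print(cheaper_tier(stored_gb=1000, requests=10_000, tiers=tiers))
```

Intelligent Tiering is, roughly, the service making this kind of decision for you as your access pattern changes.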
Near-line archival data storage. Storing data in Glacier costs almost nothing compared to other S3 storage options (most of the US regions cost less than a third of the standard tier for actual storage), but you can't immediately retrieve it, and there's a minimum retention period you're always billed for. Pretty much, it's for stuff like off-site backups where you need to store a very large amount of data, but almost never need to access it. In practice, it's similar to Backblaze's 'B2 Cloud Storage' offering (though Backblaze is a bit cheaper per request and doesn't have the minimum retention requirements or retrieval delays).
What makes Glacier so cheap is that they save the data, and then remove the physical media it's stored on from the computer and store it in a warehouse somewhere (so they're not paying for power, and the chance of media failure is far lower). The upshot is that to actually access the data, they have to retrieve the media and load it into a system (though this is probably automated, otherwise their expedited retrieval would not be realistically possible).
A CDN that you can use to serve data from other AWS services. Think of it like Cloudflare, and you'll get the idea.
As far as pricing, I'd look at pricing out:
Those are the most statistically likely to be inexpensive while still meeting your actual usage needs. Note that you will need to provide your own upstream storage with any CDN like Cloudflare.
I personally am fond of AWS for stuff like you seem to be talking about, but part of that is that I can do everything in one place, including DNS hosting (Route 53, excellent pricing compared to most other DNS providers), VPS hosting (Lightsail, which works like Linode/DigitalOcean/Vultr, or EC2, which is more expensive, but you pay for exactly what you use), container management (multiple options), and more.
I wish I could give each of these replies multiple likes, because they're so, so good. Thank you @ahferroin7 for the phenomenal explanation.
Some follow on questions to your comment:
Given that I'm on a VPS (Linode), does it make sense to be using that, and AWS for:
It depends on how you configure it, but usually it's used for entire sites. The big thing for them (and AWS CloudFront) though is geo-caching. That is, they have dozens of endpoints around the world, and requests through them get routed through the geographically closest endpoint, so you have minimal latency for stuff that's already cached.
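As a toy illustration of the "already cached" part, here is a sketch of a single edge endpoint that only goes back to the origin on a cache miss. The class and names are hypothetical; real CDNs add expiry, invalidation, and routing to the nearest endpoint, none of which is modeled here.

```python
class Edge:
    """A single, hypothetical CDN edge with an unbounded in-memory cache."""

    def __init__(self, name, fetch_from_origin):
        self.name = name
        self._fetch = fetch_from_origin
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        # Serve from the local cache when possible (fast path).
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        # Otherwise fetch from the origin once and remember the result.
        self.misses += 1
        body = self._fetch(path)
        self._cache[path] = body
        return body


origin_calls = []


def origin(path):
    """Stand-in for the upstream storage (for example, a bucket or a VPS)."""
    origin_calls.append(path)
    return f"bytes of {path}".encode()


edge = Edge("hypothetical-edge", origin)
for _ in range(3):
    edge.get("/img/logo.png")

print(edge.hits, edge.misses, len(origin_calls))
```

Only the first request reaches the origin; the repeats are answered by the edge, which is also why a CDN can cut both latency and origin bandwidth for frequently viewed images.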
If, for example, you wanted to use your own domain name to serve data from. AWS also provides domain registrar services through Route 53, so you can use them to register your own domain as well (and it's essentially at cost for that, they have very little markup). Some VPS offerings like Linode provide really basic free DNS, but you have to go through their specific naming scheme. S3 has a similar thing where you don't need a domain to serve data from it, but you have to live with their domain layout.
The big thing with EC2 is that unlike all-in-one offerings like Linode (or AWS Lightsail), billing is itemized per resource. IOW, you pay separately for the storage for the VM (note that this is different from the pricing for S3), the network usage for the VM (which is the same as for S3), and the compute time plus memory for the VM. EC2 has four major benefits over the all-in-one offerings:
Note, however, that if your needs closely match up with a specific offering from an all-in-one provider, EC2 will usually be more expensive than going with the all-in-one offering.
Probably not. SES and SNS are really designed for very large scale messaging management, and the interfaces for working with them really reflect that. Unless you're talking about dealing with hundreds of thousands of emails to send, they're probably overkill.
Probably, but I'd suggest looking very closely at pricing. There's a nice calculator web app provided by AWS that can help you figure out what your costs would look like. Most likely, it will be worth it for your usage, but keep in mind that your monthly bill will fluctuate based on actual usage.
Possibly. Even if you're using S3 for storage, I'd generally recommend Cloudflare over AWS CloudFront in most cases. Keep in mind that you will probably need your own domain for either of them, though. Route 53 is what I'd recommend for that, as it will effectively only cost a few USD a month plus the registration fees for whatever domain you pick (which are billed yearly, and typically only cost 10-20 USD depending on what TLD you're using).
@ahferroin7 Austin, phenomenal explanation, thank you so much for this.
I finally understand a lot about what these services do, thanks to you. Also, thanks for the tip about Cloudflare vs Amazon CloudFront. It shows you've got a ton of experience in this domain, and you're exactly the type of person whose opinion I needed.
I read your explanation twice.
Route 53: I already have a domain (renewal costs $15/yr), and DNS set up with Linode (costs $10/mo). The domain (and all its subdomains) resolve fine. So, I don't see the need for Route 53... unless I am missing something.
Is there a downside to using SES + SNS for emails, if the number of emails is in the hundreds only (small numbers)? I ask because I already have it configured and set up (which was quite painful if I'm being honest).
Thank you, again!
Ignoring the transfer fees, Route 53 is probably less expensive unless you're seeing many millions of queries. It's 0.50 USD per hosted domain per month, and 0.10 USD per million standard queries per month. The registration costs are probably the same (though worth looking into, because they might not be). I think there's a one-time transfer cost if you want to switch registration to AWS (which would make other Route 53 setup easier), but I'm not sure.
So, unless you're really busy, you'd probably end up paying less than 1 USD a month through Route 53 (provided you don't use any of the fancy features like monitoring and failover). It may also be marginally faster for people who aren't located close to Linode datacenters (AWS has a lot more datacenters, and they automatically route the queries through the closest datacenter).
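For a quick sanity check of that estimate, here is the arithmetic using only the two figures quoted above (0.50 USD per hosted domain and 0.10 USD per million standard queries). Those rates are taken from this comment, not from a current price list, so verify them before relying on the result; the function name is just for illustration.

```python
# Rates as quoted in the comment above; check current pricing before relying on them.
HOSTED_ZONE_USD = 0.50
PER_MILLION_QUERIES_USD = 0.10


def route53_monthly_usd(hosted_zones, queries):
    """Monthly DNS cost: flat fee per hosted domain plus a per-query charge."""
    return hosted_zones * HOSTED_ZONE_USD \
        + (queries / 1_000_000) * PER_MILLION_QUERIES_USD


# One domain with two million standard queries in a month.
cost = route53_monthly_usd(hosted_zones=1, queries=2_000_000)
print(f"{cost:.2f} USD")
```

At the quoted rates, a single domain stays under 1 USD a month until it passes roughly five million queries.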
If you've already got it set up and it works for you, there's probably not much of a downside unless you can find a less expensive alternative. The biggest issue with it is how much effort it takes to set up.
Thanks @ahferroin7! It definitely looks like I should give Route 53 a go at some point in the future when I have more time. It's low on the priority list, though.
All said and done, your tips have been a great help. Following you on DEV, and I hope you don't mind the odd question on AWS every now and then.
Always glad to share my knowledge!
Thank you!
Hi KP,
My thoughts:
Based on your requirements, I can recommend AWS for its cost-effectiveness and security capabilities.
General S3 FAQs
@geektalz thank you so much for the detailed reply. Wow - AWS is capable of a lot! You really explained it like I'm 5, which is very helpful and I appreciate it. It also gives me confidence in using AWS going forward!