Been using UNIX since the late '80s, Linux since the mid-'90s, and virtualization since the early 2000s; spent the past few years working in the cloud space.
Location
Alexandria, VA, USA
Education
B.S. Psychology from Pennsylvania State University
Migrate rarely accessed S3 data to AWS Glacier. Retrieval is slower, but much less expensive.
One of my gripes with AWS is that they have a very nice array of storage tiers, but they're mostly inflexible when it comes to creating progressive lifecycle policies. Unlike in my previous life working in the storage and backup world, I usually end up with single-level lifecycles: policies that go straight from Standard to Glacier. While the intervening IA or RR/SZ tiers would be attractive for a more complete lifecycle process, they're not usable until 30 days have elapsed. Most of the data I want or need to lifecycle has greatly reduced value to my customers (beyond compliance with local policies or legislative prescriptions) after the 7-14 day threshold. At that point, I can either go straight to Glacier and hope no one has an urgent restore need, or keep the data at the full-cost Standard tier until the intermediate tiers become available. Sub-optimal.
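The single-level policy itself is trivial to express. A minimal boto3 sketch, assuming a hypothetical bucket name and a 14-day straight-to-Glacier rule (the bucket name and rule ID are made up for illustration):

    import boto3

    s3 = boto3.client("s3")

    # Single-level lifecycle: everything in the bucket goes straight
    # from Standard to Glacier after 14 days -- no intermediate tier.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-customer-bucket",  # hypothetical name
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "straight-to-glacier",  # illustrative rule ID
                    "Filter": {"Prefix": ""},     # apply to all objects
                    "Status": "Enabled",
                    "Transitions": [
                        {"Days": 14, "StorageClass": "GLACIER"}
                    ],
                }
            ]
        },
    )

Swap GLACIER for STANDARD_IA or ONEZONE_IA with Days under 30 and, as I understand it, S3 rejects the rule; that 30-day floor is the whole complaint.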
...And after-the-fact tiering of non-current data is somewhere in the neighborhood of "hatefully slow". While the s3api tool makes it doable, it's kludgey on top of that slowness. Always a joy when a customer you've gone hands-off with comes back panicking over not-designed-for S3 sticker-shock. "You've got tens of millions of unanticipated objects in a bucket that's accruing TiBs worth of unexpected storage costs? Alright, lemme add new bucket policies for you while this script runs to force-migrate your stuff to a less-dear tier." Said customers tend to virtually foot-tap while the job runs.
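That force-migration script is, at heart, a paginated list-and-copy-in-place; one CopyObject round-trip per key is exactly why tens of millions of objects take so long. A rough sketch of the shape of it for current object versions (bucket name and target tier are hypothetical; non-current versions would need NoncurrentVersionTransitions lifecycle rules instead):

    import boto3

    s3 = boto3.client("s3")
    bucket = "example-customer-bucket"  # hypothetical

    # Walk every object and copy it onto itself with a cheaper
    # storage class. One CopyObject per key: correct, but slow.
    # (Objects over 5 GB would need a multipart copy instead.)
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            s3.copy_object(
                Bucket=bucket,
                Key=obj["Key"],
                CopySource={"Bucket": bucket, "Key": obj["Key"]},
                StorageClass="STANDARD_IA",  # hypothetical target tier
                MetadataDirective="COPY",    # keep existing metadata
            )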
One of the groups I work with does cloud-enablement for our customers. Part of that is cost-control measures, so I'm quite familiar with the rest of the points you make. When the group first took on this role, one of the earliest tools we wrote was a service that read instance tags to schedule power-off/on (and send notification of same). Amazing the difference it makes, especially for dev environments.
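The core of that sort of scheduler is only a few boto3 calls. A minimal sketch, assuming a hypothetical PowerSchedule tag (the real service parsed richer schedules and handled the notifications):

    import boto3

    ec2 = boto3.client("ec2")

    # Find running instances tagged for off-hours shutdown. The tag
    # name and value are hypothetical, purely for illustration.
    resp = ec2.describe_instances(
        Filters=[
            {"Name": "tag:PowerSchedule", "Values": ["office-hours"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )

    instance_ids = [
        inst["InstanceId"]
        for reservation in resp["Reservations"]
        for inst in reservation["Instances"]
    ]

    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
        # ...followed by an SNS publish or similar to notify owners.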
We'd probably do a lot more in the way of automating some of the cost-control tools/methods ...except AWS hasn't seen fit to bring Lambda (and other tools that can be leveraged for automated cost-control) to all the regions our customers occupy. :(
Really good extra background. Thank you for your insights.