Discussion on: THE ROAD TO AWS RE:INVENT 2018 – WEEKLY PREDICTIONS, PART 2: DATA 2.0

One of my pet peeves with S3 lifecycle management is that moving from the Standard to the Infrequent Access storage class has nothing to do with how frequently the file is actually accessed. While I would imagine that the underlying design of an object store makes it very difficult to track this, real access frequency would provide a much-needed metric for making storage decisions.
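To make the complaint concrete, here is a minimal sketch of a lifecycle rule, written as the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the bucket name in the comment, and the day thresholds are made up for illustration. Note that the only trigger available in the transitions is object age in days; nothing in the rule refers to how often an object is read.

```python
def age_based_lifecycle(ia_after_days=30, glacier_after_days=90):
    """Build a lifecycle configuration whose transitions depend only on object age."""
    return {
        "Rules": [
            {
                "ID": "age-based-tiering",   # hypothetical rule name
                "Filter": {"Prefix": ""},    # empty prefix: applies to every object
                "Status": "Enabled",
                "Transitions": [
                    # Both transitions fire on days since creation, not on access.
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = age_based_lifecycle()
# With boto3 installed and credentials configured, this would be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket",            # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```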

S3's lifecycle management generally leaves much to be desired. I mean, it's great that you can have a multi-stage lifecycle for data. But the fact that your only choice for sub-30-day policies is to go straight to Glacier is kind of dreadful. S3 is potentially great as a repository for nearline/offline storage (i.e., backups), but it currently lacks the useful lifecycle capabilities you get used to in legacy products like NetBackup. And, even aside from the whole loss of POSIX attributes if you want to simply sync a filesystem to disk, the performance of such a sync is dreadful due to the whole common-key issue. Both the POSIX-attribute and common-key problems are solvable, but it's painful to sort the programmatic logic out.
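As a rough idea of what that programmatic logic can look like, here is a minimal sketch. It assumes the "common-key issue" means many keys sharing the same leading prefix, and works around it by prepending a short hash to each key. It also assumes the POSIX attributes are carried as S3 user metadata, which is stored as string key/value pairs. The function names and metadata key names are hypothetical, not part of any AWS tool.

```python
import hashlib
import os
import stat


def hashed_key(relative_path, prefix_len=4):
    """Prepend a short hash so keys from one directory tree don't share a prefix."""
    digest = hashlib.sha256(relative_path.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{relative_path}"


def posix_metadata(path):
    """Collect basic POSIX attributes as strings, suitable for S3 user metadata."""
    st = os.lstat(path)
    return {
        "mode": oct(stat.S_IMODE(st.st_mode)),  # permission bits, e.g. '0o644'
        "uid": str(st.st_uid),
        "gid": str(st.st_gid),
        "mtime": str(int(st.st_mtime)),         # whole seconds since the epoch
    }
```

With boto3, the result could be attached at upload time via `put_object(..., Metadata=posix_metadata(path))`. The painful part the sketch leaves out is the reverse direction: listing by original path once the keys are hashed, and reapplying ownership and modes on restore.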

Overall, it has the feel of "you guys have been pestering us, here's something to shut you up for a while" rather than a fully realized HSM.

Maybe what AWS will introduce is an actual HSM-style interface to S3 or a service-overlay?