S3 Glacier and retrieval

#aws

1. What happens when you move data to Amazon S3 Glacier

When you use an S3 Lifecycle rule to transition .csv files from S3 Standard → S3 Glacier (Flexible Retrieval), the objects are archived for long-term, low-cost storage.
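
As a rough illustration, the lifecycle rule described above could be built like this in Python. This is only a sketch: the `training-data/` prefix, the 30-day delay, and the rule ID are assumptions, not values from the original question. Lifecycle filters match on prefix, tags, or object size rather than file extension, so the sketch assumes the .csv files sit under a common prefix.

```python
def glacier_lifecycle_config(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to S3 Glacier Flexible Retrieval after `days` days."""
    return {
        "Rules": [
            {
                "ID": "archive-csv-to-glacier",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # "GLACIER" is the API name for Glacier Flexible Retrieval
                    {"Days": days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration with boto3 (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Hypothetical values, for illustration only
config = glacier_lifecycle_config(prefix="training-data/", days=30)
```

Calling `apply_lifecycle("my-bucket", config)` would push the rule to a bucket; the bucket name here is a placeholder.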

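They’re still in S3, but you can’t read them directly until you restore them.
A restore does not move the object out of Glacier. It creates a temporary copy that can be read like a normal S3 object for the number of days you specify, while the archived original stays in the Glacier Flexible Retrieval storage class. While the copy exists, you pay for it at the S3 Standard rate on top of the Glacier storage cost.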

2. Glacier retrieval options (for S3 Glacier Flexible Retrieval)

Retrieval Type	Typical Time to Access	Cost	Notes
Expedited	1–5 minutes	Highest	Good for urgent, small retrievals
Standard	3–5 hours	Moderate	Default, good balance for planned jobs
Bulk	5–12 hours	Lowest	Best for large data restores

The question says:

“ML trainings and audits are planned weeks in advance”
So you can schedule a Standard retrieval several hours ahead, or a Bulk retrieval the day before training starts, and keep retrieval costs low.
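
To make the scheduling concrete, here is a small sketch that works out the latest time a restore should be requested for a given training start. It uses the upper end of the typical times from the table above; those are typical figures, not guarantees, so the 2-hour safety margin and the training date are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Upper end of the typical retrieval times from the table above, in hours
WORST_CASE_HOURS = {"Expedited": 5 / 60, "Standard": 5, "Bulk": 12}


def latest_restore_start(training_start: datetime, tier: str,
                         safety_margin_hours: float = 2.0) -> datetime:
    """Return the latest moment to request a restore so the data should
    be readable by `training_start`, with an extra safety margin."""
    hours = WORST_CASE_HOURS[tier] + safety_margin_hours
    return training_start - timedelta(hours=hours)


# Hypothetical training start time
training_start = datetime(2025, 6, 1, 9, 0)
bulk_deadline = latest_restore_start(training_start, "Bulk")
standard_deadline = latest_restore_start(training_start, "Standard")
print("Request Bulk restore by:    ", bulk_deadline)
print("Request Standard restore by:", standard_deadline)
```

Since the jobs are planned weeks ahead, requesting the restore a full day early costs little extra and removes the timing risk.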

3. Can the ML jobs still read the data?

✅ Yes — absolutely.
You just need to initiate a restore from Glacier before training and wait for it to complete.
Once restored, a temporary copy of each .csv object is available for normal access for the number of days you set in the restore request (e.g., 7 days).
When that period ends, S3 deletes the temporary copy. The object itself never left Glacier, so storage costs stay low.
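
A minimal sketch of the restore step with boto3, assuming one restore request per object. The function names and the 7-day duration are made up for illustration. `restore_object` and `head_object` are real boto3 S3 client calls, and `head_object` reports progress in a `Restore` field shaped like the sample string at the bottom.

```python
import re
from email.utils import parsedate_to_datetime


def build_restore_request(days: int, tier: str = "Standard") -> dict:
    """Build the RestoreRequest payload for an archived object."""
    if tier not in {"Expedited", "Standard", "Bulk"}:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def parse_restore_header(header):
    """Parse the Restore field returned by head_object.

    Returns None if no restore was requested, otherwise a tuple
    (in_progress, expiry) where expiry is a datetime or None.
    """
    if not header:
        return None
    in_progress = 'ongoing-request="true"' in header
    match = re.search(r'expiry-date="([^"]+)"', header)
    expiry = parsedate_to_datetime(match.group(1)) if match else None
    return in_progress, expiry


def request_restore(bucket: str, key: str, days: int, tier: str = "Standard") -> None:
    """Start a restore (needs boto3 and AWS credentials)."""
    import boto3

    boto3.client("s3").restore_object(
        Bucket=bucket, Key=key, RestoreRequest=build_restore_request(days, tier)
    )


def restore_status(bucket: str, key: str):
    """Check whether the temporary copy is ready (needs boto3)."""
    import boto3

    response = boto3.client("s3").head_object(Bucket=bucket, Key=key)
    return parse_restore_header(response.get("Restore"))


# Offline illustration using a sample header string
request = build_restore_request(days=7, tier="Bulk")
status = parse_restore_header(
    'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
)
```

The training job can start once `in_progress` is false and an expiry date is present. Until then, reads of the object fail because it is still archived.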

4. Why this works well for the question

Training only happens twice a year, so the .csv files spend most of their time cold in Glacier.
A retrieval delay of a few hours is acceptable because the ML runs are pre-scheduled.
Glacier cost per GB/month is much lower than S3 Standard or One Zone-IA, so total storage cost stays low even after adding the short-lived restored copies twice a year.
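
The cost argument can be sketched as simple arithmetic. The prices and dataset size below are placeholders, not quoted from AWS, so substitute the current figures from the S3 pricing page. Retrieval and request fees are left as an optional parameter, and a month is approximated as 30 days.

```python
def yearly_archive_cost(size_gb: float,
                        archive_price: float,
                        standard_price: float,
                        restores_per_year: int,
                        restore_days: int,
                        retrieval_price_per_gb: float = 0.0) -> float:
    """Estimate yearly cost of keeping data in Glacier and restoring it
    a few times a year. Prices are per GB-month unless noted."""
    # Archived original is billed all year
    archive = size_gb * archive_price * 12
    # Temporary restored copy is billed at the Standard rate while it exists
    temp_copy = size_gb * standard_price * (restore_days / 30) * restores_per_year
    # Optional per-GB retrieval fee for each restore
    retrieval = size_gb * retrieval_price_per_gb * restores_per_year
    return archive + temp_copy + retrieval


# Placeholder inputs, for illustration only
SIZE_GB = 1000
STANDARD_PRICE = 0.02   # placeholder $/GB-month
ARCHIVE_PRICE = 0.004   # placeholder $/GB-month

always_standard = SIZE_GB * STANDARD_PRICE * 12
glacier_total = yearly_archive_cost(
    SIZE_GB, ARCHIVE_PRICE, STANDARD_PRICE, restores_per_year=2, restore_days=7
)
print(f"Always in Standard:   ${always_standard:.2f}/year")
print(f"Glacier + 2 restores: ${glacier_total:.2f}/year")
```

With only two short restores a year, the temporary copies add little to the total, which is why the archive price dominates.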

✅ In short:

With Glacier, your data is still retrievable, just not immediately.
Typical retrieval delay is 3–5 hours (Standard retrieval) or 5–12 hours (Bulk retrieval), which suits planned ML jobs.