DEV Community

Discussion on: Tool for fast deletion and emptying of S3 buckets (versioning supported)

karlforster

This tool is amazing. Thanks so much.
Regarding concurrency numbers, is there any indication of how high we can go before AWS starts rejecting requests?

At the moment I'm running with a concurrency of 5 to remove 28 million objects from one bucket.

But I have another bucket with 50 million objects and 1.8 TB worth of data.

Kenta Goto (AWS Community Builders)

Thank you @karlforster !

Actually, the concurrency number in cls3 controls how many buckets are deleted in parallel — it doesn't control parallelism for the objects within a single bucket.

That's because cls3 deletes objects using the following loop, which alternates between a synchronous List and an asynchronous Delete:

  • 1000-object ListObjectVersions (sync) → 1000-object DeleteObjects (async, immediately moves on) → next 1000-object List (sync) → next 1000-object DeleteObjects (async) → ...

You can't delete an object without listing it first, so this design gives the best performance. Both List and DeleteObjects handle at most 1,000 items per call, and List has to fetch the current page before it can move on to the next one. Within a single bucket, each DeleteObjects call is therefore fired asynchronously while the next List runs, rather than many deletes running in parallel.
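The pattern above can be sketched roughly like this. This is a Python illustration, not cls3's actual implementation, and `drain_bucket`, `fake_list`, and `fake_delete` are hypothetical stand-ins for the real S3 SDK calls:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 1000  # both ListObjectVersions and DeleteObjects cap at 1,000 items

def drain_bucket(list_page, delete_batch):
    """List pages synchronously; fire each batch delete asynchronously
    so it runs in the background while the next List is in flight."""
    pending = []
    with ThreadPoolExecutor() as pool:
        token = None
        while True:
            page, token = list_page(token)  # sync: must finish before the next page
            if page:
                # async: submit the delete and immediately move on to the next List
                pending.append(pool.submit(delete_batch, page))
            if token is None:
                break
        for f in pending:  # wait for any still-in-flight deletes
            f.result()

# --- demo with fake S3 calls (stand-ins for ListObjectVersions / DeleteObjects) ---
objects = [f"key-{i}" for i in range(2500)]
deleted = []
lock = threading.Lock()

def fake_list(token):
    start = token or 0
    page = objects[start:start + PAGE_SIZE]
    nxt = start + PAGE_SIZE if start + PAGE_SIZE < len(objects) else None
    return page, nxt

def fake_delete(batch):
    with lock:
        deleted.extend(batch)

drain_bucket(fake_list, fake_delete)
print(len(deleted))  # 2500
```

The key point is that only the listing is on the critical path; the deletes overlap with it instead of serializing after it.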

NOTE: S3 is designed to throttle once you exceed roughly 3,500 deletions per second per prefix. However, each DeleteObjects call typically finishes during the time the next List is running, so cls3 doesn't put a cap on the number of in-flight async deletes.