DEV Community

Tool for fast deletion and emptying of S3 buckets (versioning supported)

Kenta Goto on June 28, 2023

I have released an OSS tool that solves the following problems. The button to empty an S3 bucket is present in the S3 console but not in the CLI ...
 
mmuller88 profile image
Martin Muller 🇩🇪🇧🇷🇵🇹 AWS Community Builders

Nice. Thank you so much :)!

karlforster profile image
karlforster

This tool is amazing. Thanks so much.
Regarding concurrency numbers, is there any indication of how high we can go before AWS starts rejecting requests?

I am currently running with a concurrency of 5 to remove 28 million items from one bucket, but I have another bucket with 50 million items and 1.8 TB of data.

k_goto profile image
Kenta Goto AWS Community Builders

Thank you @karlforster !

Actually, the concurrency number in cls3 controls how many buckets are deleted in parallel — it doesn't control parallelism for the objects within a single bucket.

That's because cls3 deletes objects using the following loop, which alternates between a synchronous List and an asynchronous Delete:

  • 1000-object ListObjectVersions (sync) → 1000-object DeleteObjects (async, immediately moves on) → next 1000-object List (sync) → next 1000-object DeleteObjects (async) → ...

You can't delete an object without listing it first, so this design gives the best performance. Both List and DeleteObjects can handle only up to 1000 items per call, and List has to fetch the current page before it can move on to the next one — so within a single bucket, deletes are fired asynchronously in the background of the next List, rather than running in parallel.

NOTE: S3's DeleteObjects API is designed to throttle once you exceed roughly 3,500 deletions per second. However, each DeleteObjects call typically finishes during the time the next List is running, so we don't put a cap on the number of in-flight async deletes.
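To make the loop above concrete, here is a minimal sketch in Python. It is not cls3's actual code: `FakeBucket`, `empty_bucket`, and `empty_buckets` are hypothetical names, and the bucket is an in-memory stand-in instead of real S3 calls. It only illustrates the shape of the design: buckets run in parallel up to the concurrency number, while inside each bucket a synchronous list hands each page of up to 1000 items to an asynchronous delete and moves straight on to the next list.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait

PAGE_SIZE = 1000  # both List and DeleteObjects handle at most 1000 items per call


class FakeBucket:
    """In-memory stand-in for a bucket (hypothetical, not an AWS API)."""

    def __init__(self, keys):
        self.keys = list(keys)
        self.deleted = set()
        self._lock = threading.Lock()

    def list_page(self, start):
        # Stand-in for ListObjectVersions: one page plus the next marker (or None).
        page = self.keys[start:start + PAGE_SIZE]
        next_start = start + len(page)
        return page, (next_start if next_start < len(self.keys) else None)

    def delete_batch(self, batch):
        # Stand-in for DeleteObjects: deletes up to 1000 keys in one call.
        with self._lock:
            self.deleted.update(batch)
        return len(batch)


def empty_bucket(bucket, delete_pool):
    """List synchronously, fire each delete asynchronously, then wait at the end."""
    in_flight = []
    marker = 0
    while marker is not None:
        page, marker = bucket.list_page(marker)  # sync: needed before the next page
        if page:
            # async: submit the delete and immediately go back to listing
            in_flight.append(delete_pool.submit(bucket.delete_batch, page))
    wait(in_flight)  # no cap on in-flight deletes; just drain them at the end
    return sum(f.result() for f in in_flight)


def empty_buckets(buckets, concurrency):
    """The concurrency number only limits how many buckets are processed at once."""
    with ThreadPoolExecutor(max_workers=concurrency) as bucket_pool, \
            ThreadPoolExecutor() as delete_pool:
        futures = [bucket_pool.submit(empty_bucket, b, delete_pool) for b in buckets]
        return [f.result() for f in futures]


if __name__ == "__main__":
    buckets = [
        FakeBucket(f"key-{i}" for i in range(2500)),
        FakeBucket(f"obj-{i}" for i in range(10)),
    ]
    print(empty_buckets(buckets, concurrency=2))  # prints [2500, 10]
```

In this sketch, raising `concurrency` does nothing for a single large bucket, because the listing loop is the sequential part; it only helps when several buckets are passed in.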

rahmantheman profile image
Rahman Badru

love this

k_goto profile image
Kenta Goto AWS Community Builders

Thank you! So happy!

andrew_61cd5f1f140a profile image
Andrew

Would it be possible to use this with custom endpoint URLs?

k_goto profile image
Kenta Goto AWS Community Builders

@andrew_61cd5f1f140a
Hi, custom endpoint URLs are supported as of v0.28.0. You can set one with the -e|--endpointUrl option or with the CLS3_ENDPOINT_URL environment variable.

github.com/go-to-k/cls3/releases/t...

k_goto profile image
Kenta Goto AWS Community Builders

Awesome. I will try to do that in the near future!

github.com/go-to-k/cls3/issues/363

k_goto profile image
Kenta Goto AWS Community Builders

@mmuller88
Thank you too! Please use it!