DEV Community

Asanka Boteju

In-place Serverless Querying of AWS S3 Data

We often need to query structured or semi-structured data directly where it sits in S3 buckets, in formats such as CSV, JSON, Avro, ORC, and Parquet, either for ad-hoc analysis or as part of a more comprehensive data solution.

Below are two AWS serverless services that you can use to query your S3 data directly.

1. Amazon Athena
Suitable for ad-hoc data discovery and SQL querying. Athena charges you based on the amount of data each query scans, so queries that read less data cost less.
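To make the "pay per data scanned" model concrete, here is a minimal Python sketch that runs a query and turns the reported bytes scanned into a rough cost. It assumes the `client` you pass in behaves like `boto3.client("athena")`. The price of 5 USD per TB and the use of 1024^4 bytes per TB are assumptions for illustration, so check the Athena pricing page for your region. Per-query minimums and rounding are ignored.

```python
import time

# Assumption: list price in USD per TB scanned. Verify for your region.
ASSUMED_PRICE_PER_TB_USD = 5.0
# Assumption: 1 TB treated as 1024^4 bytes for this rough estimate.
BYTES_PER_TB = 1024 ** 4


def estimate_cost_usd(bytes_scanned, price_per_tb=ASSUMED_PRICE_PER_TB_USD):
    """Rough cost of one query from the bytes it scanned (ignores minimums)."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Start a query, wait until it finishes, and return (state, bytes_scanned).

    `client` is expected to behave like boto3.client("athena").
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    bytes_scanned = execution.get("Statistics", {}).get("DataScannedInBytes", 0)
    return state, bytes_scanned
```

With boto3 installed and credentials configured, you could call `run_athena_query(boto3.client("athena"), "SELECT * FROM sales LIMIT 10", "my_database", "s3://my-results-bucket/athena/")` and pass the returned byte count to `estimate_cost_usd`. The table, database, and bucket names here are hypothetical.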

2. Amazon Redshift Spectrum
Suitable when you need to run more complex queries or support a large number of users.

Redshift Spectrum is recommended in these cases for the following reasons.

  • Uses Redshift data warehouse SQL syntax, so a single query can span Redshift tables and S3 data lakes.

  • Provides sophisticated query optimization.

  • Distributes queries across multiple nodes for parallel processing.

  • Works with your existing BI tools.
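As an illustration of the first point, the sketch below builds the SQL for exposing a data catalog database as an external schema and then joining a local Redshift table with an S3-backed external table. It also shows one way to submit the SQL, assuming the `client` behaves like `boto3.client("redshift-data")`. All schema, table, column, role, and cluster names are hypothetical, and the helper functions are illustrative, not part of any AWS library.

```python
def external_schema_sql(schema, catalog_database, iam_role_arn):
    """DDL that exposes a data catalog database (backed by S3) as a Redshift schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema} "
        f"FROM DATA CATALOG DATABASE '{catalog_database}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "CREATE EXTERNAL DATABASE IF NOT EXISTS"
    )


def join_sql(local_table, external_table):
    """One query that spans a local Redshift table and an S3-backed external table.

    The columns (customer_id, name, amount) are hypothetical.
    """
    return (
        "SELECT c.customer_id, c.name, SUM(s.amount) AS total_amount "
        f"FROM {local_table} AS c "
        f"JOIN {external_table} AS s ON s.customer_id = c.customer_id "
        "GROUP BY c.customer_id, c.name"
    )


def submit_statement(client, sql, cluster_id, database, db_user):
    """Submit SQL through a Redshift Data API style client and return the statement id."""
    response = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )
    return response["Id"]
```

For example, `join_sql("public.customers", "spectrum.sales")` produces a query where `public.customers` lives in Redshift and `spectrum.sales` is read from S3. Because the result is ordinary Redshift SQL, the same query can be issued from a BI tool connected to the cluster.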

Thank you for your time...
