Asanka Boteju

In-place Serverless Querying of AWS S3 Data

We often need to query data stored in S3 buckets directly, in formats such as CSV, JSON, Avro, ORC, and Parquet, either for ad-hoc analysis or as part of a larger data solution.

Below are two AWS serverless services that you can use to query your S3 data in place.

1. Amazon Athena
Suitable for ad-hoc data discovery and SQL querying. With this service you are charged based on the amount of data each query scans.
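
As a rough illustration, here is a minimal sketch of running one Athena query from Python and reading back how much data it scanned, since that is what you are billed on. It assumes `client` behaves like `boto3.client("athena")`; the function name, database name, and S3 output location are hypothetical and not from any specific setup.

```python
import time

# States in which Athena has stopped working on a query.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run one SQL statement on Athena and return (rows, bytes_scanned).

    `client` is assumed to behave like boto3.client("athena").
    `database` and `output_location` are placeholders you supply.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        reason = execution["Status"].get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Athena query {state}: {reason}")

    # Athena bills on data scanned, so return that number with the results.
    bytes_scanned = execution.get("Statistics", {}).get("DataScannedInBytes", 0)

    # First page of results only; for a SELECT, the first row holds column names.
    result = client.get_query_results(QueryExecutionId=query_id)
    rows = [
        [col.get("VarCharValue") for col in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
    return rows, bytes_scanned
```

With real credentials, a call might look like `run_athena_query(boto3.client("athena"), "SELECT * FROM logs LIMIT 10", "my_database", "s3://my-bucket/athena-results/")`, where the table, database, and bucket are made-up names. Columnar formats such as Parquet and ORC, along with partitioning, generally reduce the bytes scanned per query.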


2. Amazon Redshift Spectrum
Suitable when you need to run more complex queries or support a large user base.


Redshift Spectrum is recommended in these cases for the following reasons.

  • Uses Redshift data warehouse SQL syntax, so a single query can span Redshift tables and S3 data lakes.

  • Provides sophisticated query optimization.

  • Distributes queries across multiple nodes for parallel processing.

  • Works with your existing BI tools.
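
To make the first point concrete, here is a minimal sketch of a query that joins a local Redshift table with an external table over S3, submitted through the Redshift Data API. It assumes `client` behaves like `boto3.client("redshift-data")` and a provisioned cluster; the schema, table, and column names (`spectrum.sales`, `customers`, and so on) and all identifiers passed in are hypothetical.

```python
import time


def external_schema_sql(schema, catalog_database, iam_role_arn):
    """Build DDL that exposes a Data Catalog database as a Redshift external schema.

    Values are interpolated directly, so pass trusted identifiers only.
    """
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema} "
        f"FROM DATA CATALOG DATABASE '{catalog_database}' "
        f"IAM_ROLE '{iam_role_arn}'"
    )


# Hypothetical tables: `customers` lives in Redshift, `spectrum.sales` lives on S3.
JOIN_SQL = """
SELECT c.customer_name, SUM(s.amount) AS total_amount
FROM spectrum.sales AS s
JOIN customers AS c ON c.customer_id = s.customer_id
GROUP BY c.customer_name
ORDER BY total_amount DESC
LIMIT 10
"""


def run_statement(client, sql, cluster_id, database, db_user, poll_seconds=1.0):
    """Submit one statement via the Redshift Data API and return its records.

    `client` is assumed to behave like boto3.client("redshift-data").
    Only the first page of results is returned.
    """
    started = client.execute_statement(
        ClusterIdentifier=cluster_id, Database=database, DbUser=db_user, Sql=sql
    )
    statement_id = started["Id"]

    # The Data API is asynchronous, so poll until the statement finishes.
    while True:
        desc = client.describe_statement(Id=statement_id)
        if desc["Status"] in {"FINISHED", "FAILED", "ABORTED"}:
            break
        time.sleep(poll_seconds)

    if desc["Status"] != "FINISHED":
        raise RuntimeError(f"Statement {desc['Status']}: {desc.get('Error', 'no error text')}")

    # DDL such as CREATE EXTERNAL SCHEMA produces no result set.
    if not desc.get("HasResultSet"):
        return []
    return client.get_statement_result(Id=statement_id)["Records"]
```

In this sketch you would first run the output of `external_schema_sql(...)` once to register the external schema, then run `JOIN_SQL`. Each returned record is a list of typed cells such as `{"stringValue": "..."}` or `{"longValue": 1}`.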

Thank you for your time...