DEV Community

Alex Spinov

DuckDB Has a Free API — Analytics Inside Your Process

DuckDB is an in-process OLAP database. Think SQLite for analytics: there is no server to run, so you import the library and query CSV, Parquet, and JSON files directly.

What Is DuckDB?

DuckDB runs inside your application process and uses columnar storage with vectorized execution.

Features:

  • Zero-dependency, in-process
  • Columnar vectorized engine
  • Query Parquet, CSV, JSON directly
  • PostgreSQL-compatible SQL dialect
  • Free and open source (MIT)

Python Example

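import duckdb

# File paths are SQL string literals, so they need single quotes.
result = duckdb.sql("SELECT * FROM read_csv('data.csv') WHERE amount > 100")
print(result.fetchdf())  # fetchdf() returns a pandas DataFrame

# Reading from S3 relies on the httpfs extension and configured credentials.
duckdb.sql("SELECT year, SUM(sales) FROM read_parquet('s3://bucket/*.parquet') GROUP BY year").show()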

Use Cases

  1. Data analysis — SQL on local files
  2. Jupyter notebooks — fast analytics
  3. Embedded analytics — inside your app
  4. ETL — transform with SQL
  5. Local BI — query without servers

Need web data at scale? Check out my scraping tools on Apify or email spinov001@gmail.com for custom solutions.
