Hey Devs,
If you're exploring modern data engineering stacks or want to try out ClickHouse with Go and Python, this post is for you!
I wanted to experiment with something lightweight but real:
Generating a Parquet file using Python and loading it into ClickHouse using Go.
Here's what I built, how it works, and what I learned.
GitHub Repo: https://github.com/mohhddhassan/go-clickhouse-parquet
What This Project Does
This is a beginner-friendly, containerized mini-project that:
- Generates sample data using a Python script
- Converts it into a Parquet file
- Loads the data into a ClickHouse table using a Go app
- Runs locally using Docker Compose
Tech Stack
- Python: to generate Parquet files
- Go: to read Parquet and insert into ClickHouse
- ClickHouse: lightning-fast OLAP DB
- Docker Compose: to simplify ClickHouse setup
- Parquet: for efficient columnar storage
How To Run It Locally
Step 1. Clone the repo
git clone https://github.com/mohhddhassan/go-clickhouse-parquet.git
cd go-clickhouse-parquet
Step 2. Generate sample Parquet data
cd python
python3 generate_parquet.py
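If you're curious what a script like generate_parquet.py can look like, here's a minimal sketch. It is not the repo's actual script: the column names (id, name, score), the row count, and the choice of pyarrow are all assumptions for illustration. Only the output location mirrors the project layout.

```python
import random
from pathlib import Path


def make_rows(n, seed=42):
    """Build n sample rows. The columns here are hypothetical."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    return [
        {"id": i, "name": f"user_{i}", "score": round(rng.uniform(0, 100), 2)}
        for i in range(1, n + 1)
    ]


def write_parquet(rows, path):
    """Write rows to a Parquet file (assumes `pip install pyarrow`)."""
    import pyarrow as pa
    import pyarrow.parquet as pq

    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    table = pa.Table.from_pylist(rows)  # column types are inferred from the values
    pq.write_table(table, str(path))
```

Calling write_parquet(make_rows(1000), "../parquet-files/sample.parquet") from inside python/ would drop the file where the project structure expects it, assuming that is the path the Go app reads from.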
Step 3. Start ClickHouse using Docker Compose (from the repo root)
cd ..
docker-compose up -d
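ClickHouse can take a few seconds to accept connections after the container starts. Here's a small sketch that polls its HTTP /ping endpoint before you move on. It assumes docker-compose.yml publishes the HTTP port 8123 on localhost, which I haven't shown here. The helper names are made up for this example.

```python
import time
import urllib.error
import urllib.request


def ping_url(host="localhost", port=8123):
    """URL of ClickHouse's HTTP health endpoint (port 8123 is an assumption)."""
    return f"http://{host}:{port}/ping"


def wait_for_clickhouse(url, timeout_s=30.0, interval_s=1.0):
    """Poll the URL until it answers HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        time.sleep(interval_s)
    return False
```

Something like wait_for_clickhouse(ping_url()) returns True once the server is up, and False if it never answers within the timeout.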
Step 4. Run the Go app to ingest data
cd go
go run main.go
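Once the Go app finishes, a quick row count is an easy sanity check. This sketch sends a query through ClickHouse's HTTP interface using only the standard library. The table name sample_data is hypothetical (use whatever main.go creates). It also assumes port 8123 is reachable and that the default user can connect without a password.

```python
import urllib.parse
import urllib.request


def query_url(sql, host="localhost", port=8123):
    """Build a ClickHouse HTTP-interface URL carrying the SQL in `query`."""
    params = urllib.parse.urlencode({"query": sql}, quote_via=urllib.parse.quote)
    return f"http://{host}:{port}/?{params}"


def count_rows(table, host="localhost", port=8123):
    """Return the number of rows in `table` (needs a running ClickHouse)."""
    url = query_url(f"SELECT count() FROM {table}", host, port)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return int(resp.read().decode().strip())
```

For example, count_rows("sample_data") should match the number of rows the Python script generated, if the ingest went through.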
Project Structure
go-clickhouse-parquet/
├── docker-compose.yml       # ClickHouse setup
├── parquet-files/
│   └── sample.parquet       # Auto-generated test file
├── python/
│   └── generate_parquet.py  # Script to create data
└── go/
    ├── go.mod
    ├── go.sum
    └── main.go              # Ingests Parquet into ClickHouse
What I Learned
- How to programmatically create Parquet files
- Connecting Go with ClickHouse and executing inserts
- Using Docker Compose to deploy ClickHouse quickly
- Structuring a mini ETL workflow with multiple languages
Why You Should Try This
If you're learning data engineering or systems programming:
- Try combining Python + Go for real-world data movement
- Practice building and using Parquet files; they're everywhere in analytics
- Explore ClickHouse and see how blazing fast OLAP can be
- Get used to wiring up different components in a real pipeline
What's Next?
- Build a ClickHouse dashboard on top of this data
- Try streaming Parquet data into ClickHouse
- Expand schema complexity for more realistic ingestion
- Benchmark Go vs Python for loading speed into ClickHouse
About Me
Mohamed Hussain S
Associate Data Engineer
LinkedIn | GitHub
Building one mini project at a time to become a better data engineer.