Muhammad Zubair

Integrating ClickHouse with AWS S3

To integrate ClickHouse with an S3 bucket for fetching data, performing operations, and putting data back, follow these steps:

1. Setting Up ClickHouse

Install ClickHouse:

  • On a Debian-based system (this assumes the ClickHouse apt repository is already configured):
sudo apt-get install clickhouse-server clickhouse-client
  • Start the ClickHouse server:
sudo service clickhouse-server start
# or
sudo clickhouse start
  • Connect with clickhouse-client (the --password flag prompts for the password):
clickhouse-client --password

2. Fetching Data from S3 and Loading into ClickHouse

Create a Table in ClickHouse:

CREATE TABLE s3_data (
    id UInt32,
    name String,
    value Float32
) ENGINE = MergeTree()
ORDER BY id;

Load Data from S3:
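Use the s3 table function to read the file directly from the bucket and insert its rows into the table. Replace the URL and the two credential placeholders with your own values: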

INSERT INTO s3_data
SELECT *
FROM s3('https://s3.amazonaws.com/your-bucket/path/to/data.csv', 'YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY', 'CSVWithNames');
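Before running the insert, it can help to check what ClickHouse reads from the file, since INSERT ... SELECT maps columns by position rather than by name. The sketch below reuses the placeholder URL and credentials from above and assumes the CSV header holds id, name, and value. The explicit structure string is an optional argument of the s3 function, added here for illustration. Type inference from the file depends on the ClickHouse version, so on older releases the structure argument may be required.

```sql
-- Show the column names and types ClickHouse infers from the file.
DESCRIBE TABLE s3('https://s3.amazonaws.com/your-bucket/path/to/data.csv', 'YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY', 'CSVWithNames');

-- Preview a few rows without loading anything.
SELECT *
FROM s3('https://s3.amazonaws.com/your-bucket/path/to/data.csv', 'YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY', 'CSVWithNames')
LIMIT 5;

-- Load with an explicit structure and column list (assumed to match the file header),
-- so the insert does not depend on inferred types or on column order.
INSERT INTO s3_data (id, name, value)
SELECT id, name, value
FROM s3('https://s3.amazonaws.com/your-bucket/path/to/data.csv', 'YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY', 'CSVWithNames', 'id UInt32, name String, value Float32');
```

The last statement is an alternative to the INSERT shown above, not an extra step: running both would load the same rows twice.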

3. Performing Operations on Data in ClickHouse

Once the data is loaded, it can be analyzed with ordinary SQL. For example, to compute the average value per name:

SELECT name, AVG(value) AS avg_value
FROM s3_data
GROUP BY name;
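The introduction also mentions putting data back into S3. One way to do that is to insert into the s3 table function. This is a minimal sketch, not necessarily the approach the author had in mind: the results.csv path is a hypothetical placeholder, the structure string is written out by hand to match the query's two output columns, and the credentials are assumed to have write access to the bucket.

```sql
-- Write the aggregated result to a new object in the bucket.
-- 'path/to/results.csv' is a hypothetical placeholder key.
INSERT INTO FUNCTION s3('https://s3.amazonaws.com/your-bucket/path/to/results.csv', 'YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY', 'CSVWithNames', 'name String, avg_value Float64')
SELECT name, AVG(value) AS avg_value
FROM s3_data
GROUP BY name;
```

If an object already exists at that key, the insert is expected to fail by default. The s3_truncate_on_insert setting controls whether the existing object is overwritten instead.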
