DEV Community

Abhiraj Adhikary


Leveraging Airbyte 🪼 and Motherduck 🦆 for Sentiment Analysis

This blog is part of the Airbyte + Motherduck Hackathon. In it, I'll demonstrate how to connect Google Sheets with Motherduck using Airbyte. This setup forms the backbone of my Dropbox Sentiment Analysis Dashboard, enabling seamless data integration and storage for analysis. The post should make it easy to create your first Airbyte connection between a source and a destination; I recommend going through the official documentation afterwards. Let's dive in! 🤿🌊


Overview of the Project 🗺️

The goal is to analyze user reviews of the Dropbox app using sentiment analysis techniques. Here's a breakdown of the workflow:

  1. Dataset Source: A CSV dataset of Dropbox app user reviews, downloaded from Kaggle.
  2. Preprocessing: Uploaded the CSV to Google Sheets for basic formatting (e.g., converting ratings from text to integers).
  3. Airbyte Integration: Used Airbyte to connect Google Sheets (source) with Motherduck (destination).
  4. Destination Setup: Motherduck stores the data in a DuckDB database, which you query with SQL.
  5. Analysis: Built a sentiment analysis dashboard using Python and Streamlit.
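
The preprocessing in step 2 was done by hand in Google Sheets, but the same rating cleanup can be sketched in Python. This is only an illustration, not the code used in the project: the column names mirror the query later in this post, the sample rows are made up, and the 1 to 5 rating range is an assumption about the dataset.

```python
import csv
import io


def normalize_score(raw):
    """Convert a rating cell such as '5', ' 4 ' or '3.0' to an int, or None if unusable."""
    text = (raw or "").strip()
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # Assumption: ratings are whole numbers from 1 to 5.
    if not value.is_integer() or not 1 <= value <= 5:
        return None
    return int(value)


def clean_reviews(csv_text):
    """Yield review rows with an integer score, skipping rows whose score is unusable."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        score = normalize_score(row.get("score"))
        if score is not None:
            yield {**row, "score": score}


# Made-up sample rows, only to show the behaviour.
sample = "reviewId,content,score\nr1,Great app,5\nr2,Keeps crashing, 1 \nr3,No rating,n/a\n"
cleaned = list(clean_reviews(sample))
```

Here `cleaned` keeps the first two rows with integer scores and drops the row whose rating cannot be parsed.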

Let me walk you through the setup process for Airbyte and Motherduck. 🎮


What is Airbyte? 🧐

Airbyte is an open-source data integration platform that helps synchronize data between different sources and destinations. It provides a wide range of connectors and a user-friendly interface to automate data workflows.

What is Motherduck? 🤔

Motherduck is a cloud-based platform built on DuckDB, a fast and lightweight analytical SQL engine. It lets you store and query data in the cloud with SQL, which suits analysis workloads like this one.

Airbyte + Motherduck


Setting Up Airbyte 🪼

Step 1: Go to Airbyte and log in.

You'll land in the Airbyte workspace. Follow these steps:

Create a New Connection

  1. Click on New Connection and choose Google Sheets as your source.
  2. Share your dataset on Google Sheets and copy the link.
  3. Paste the shared link into the placeholder in Airbyte.
  4. Authenticate your Google account (ensure it's the same account linked to the Google Sheet).
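
Before pasting the link, it can help to confirm that it really is a Google Sheets link and not, say, a Drive folder link. The sketch below is a hypothetical helper, not part of Airbyte; it relies on the usual `https://docs.google.com/spreadsheets/d/<id>/...` link shape, and the ID in the usage line is a placeholder.

```python
import re

# Matches the usual shape of a shared Google Sheets link and captures the spreadsheet ID.
SHEETS_URL = re.compile(r"^https://docs\.google\.com/spreadsheets/d/([A-Za-z0-9_-]+)")


def spreadsheet_id(link):
    """Return the spreadsheet ID from a Google Sheets link, or None if it does not look like one."""
    match = SHEETS_URL.match(link.strip())
    return match.group(1) if match else None


# Placeholder link, only for illustration.
example_id = spreadsheet_id("https://docs.google.com/spreadsheets/d/abc123_-XYZ/edit?usp=sharing")
```

If the helper returns `None`, the link is probably not the one Airbyte expects in the source form.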

Airbyte Source Setup

Select Destination

  1. Under the Marketplace, search for and select Motherduck.
  2. Authenticate Motherduck as the destination (the process is described in the next section).

Configuring Motherduck 🦆

Step 2: Head over to Motherduck and sign up.

  1. After signing up, delete the sample workspace (not needed for this setup).
  2. Navigate to Settings under your profile.
  3. In the General tab, generate a Motherduck token (API Key).
  4. Copy the token and paste it into Airbyte when prompted.
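
The same token also lets you reach Motherduck from Python later on, for example from the dashboard code. The following is a minimal sketch, assuming the `duckdb` Python package is installed and that the token is stored in an environment variable named `motherduck_token`. The helper names are hypothetical, and `my_db` is the database name used later in this post.

```python
import os


def motherduck_uri(database, token=None):
    """Build a DuckDB connection string for a Motherduck database."""
    if token:
        return f"md:{database}?motherduck_token={token}"
    return f"md:{database}"


def connect(database="my_db"):
    """Open a connection using a token read from the environment, never hard-coded."""
    import duckdb  # third-party; imported lazily so the helper above works without it

    token = os.environ.get("motherduck_token")
    if not token:
        raise RuntimeError("Set the motherduck_token environment variable first")
    return duckdb.connect(motherduck_uri(database, token))
```

Keeping the token in the environment rather than in the source file means it never ends up in a public repository.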

Airbyte-Motherduck Connection

Schedule the Sync 🎗️

  • Configure the sync schedule to keep your Motherduck database updated with any changes in the Google Sheet.
  • Click Next to finalize the connection.

Validating the Connection 🔄

After completing the setup, check if the source data has successfully transferred to the destination:

  1. On the left panel of your Motherduck page, find Attached Databases.
  2. Under my_db, navigate to main, where you'll see your dataset (e.g., dropbox_reviews).
  3. Start a new notebook and run queries to confirm the data transfer.

Example query:

-- DuckDB accepts FROM-first queries; this previews the first 100 synced rows.
from my_db.main.dropbox_reviews
select
    score,
    content,
    reviewId,
    _airbyte_raw_id,
    _airbyte_extracted_at
limit 100
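
Beyond eyeballing the result, the check can be automated. The sketch below is a hypothetical helper that assumes the query result has already been fetched into a list of dictionaries keyed by column name; how you fetch the rows is up to you. The column names are taken from the query above.

```python
# Columns selected in the example query, including the metadata columns Airbyte adds.
REQUIRED_COLUMNS = {"score", "content", "reviewId", "_airbyte_raw_id", "_airbyte_extracted_at"}


def check_rows(rows):
    """Return a list of human-readable problems found in the synced rows."""
    if not rows:
        return ["no rows returned; the sync may not have run yet"]
    problems = []
    for index, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            problems.append(f"row {index}: missing columns {sorted(missing)}")
        elif row["_airbyte_raw_id"] is None:
            problems.append(f"row {index}: empty _airbyte_raw_id")
    return problems
```

An empty list means every row carries both the review fields and the Airbyte metadata, which is a reasonable sign that the sync worked.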

Google Sheet - Airbyte - Motherduck Connection


What's Next? 🚞

This blog covers the setup of Airbyte and Motherduck for seamless data integration. In my next post, I'll dive into:

  • Project Structure: A detailed walkthrough of the Dropbox Sentiment Analysis project.
  • Coding Logic: Explanation of Python libraries used for sentiment analysis.
  • Dashboard Deployment: How to deploy the application on Streamlit.

PROJECT 📊 : https://airbyte-motherduck-hack-dropbox-sentiment-analysis.streamlit.app

Stay tuned for an exciting journey into sentiment analysis of Dropbox User Reviews! 🚀🌕🪼🦆

AirbyteHQ #Motherduck #HappyConnecting
