Sunil Kumar Dash for Composio


I built an AI tool to handle my mom's invoices and saved her 20 hours of work! 😲

TL;DR

My mom has been running a small business for a while, and she gets a ton of invoices from clients, suppliers, and dealers. When I was younger, I always wondered why she seemed mad at me. It wasn’t long before I realized she was constantly fumbling with these invoices.

So, I thought, why not create an AI bot that automatically retrieves emails, processes them, and organizes the necessary details into a spreadsheet?

Here’s how I built the bot that saved my mom 20 hours of hair-pulling and screaming at everyone.

  • Use Composio's Gmail integration to retrieve invoice emails from the inbox.
  • Use an LLM to extract the relevant data points.
  • Add the data points to a Google Sheet.



Composio - Open-source platform for AI tools & Integrations

Here’s a quick introduction about us.

Composio is an open-source tooling infrastructure for building robust and reliable AI applications. We provide more than 100 tools and integrations across industry verticals, from CRM, HRM, and sales to productivity, dev, and social media.

Composio handles user authentication and authorization for all these applications, making connecting API endpoints with various AI models and frameworks simple.


Please help us with a star. 🥹

It would help us to create more articles like this 💖

Star the Composio repository ⭐


How does it work?

This project simplifies retrieving invoice emails from Gmail, downloading the invoice attachments, and extracting the key fields into a Google Sheet.

Gmail agent workflow

  1. Enter the keywords you want to search for in your Gmail.
  2. Enter the Google Sheet ID and the attributes to extract from the invoices.
  3. The Gmail tool finds emails and attachments in your inbox that match the keyword criteria.
  4. Relevant information from the attachments is extracted and stored in your linked Google Sheet.
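The keyword filter in step 3 can be sketched in a few lines. This is a minimal illustration; the function name and email shape below are assumptions, not the project's actual API:

```python
def matches_keywords(email: dict, keywords: list[str]) -> bool:
    """Return True when any keyword appears in the email's subject or body."""
    text = (email.get("subject", "") + " " + email.get("body", "")).lower()
    return any(kw.lower() in text for kw in keywords)

# Only matching emails move on to attachment download and extraction
invoice_email = {"subject": "Invoice #42 from Acme", "body": "Please find attached."}
```

Emails that pass this filter move on to attachment download, extraction, and the Google Sheet update.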

Technical Description

Under the hood, the AI tool divides the task into multiple steps and executes them:

  1. Retrieves emails from Gmail that match the keyword/phrase criteria.
  2. Downloads the relevant attachments.
  3. Extracts valuable attributes from the attachments using Nanonets.
  4. Stores the extracted data in the linked Google Sheet.

Tech Stack

  • Frontend: React, Vite, and TailwindCSS.
  • Backend: Python, FastAPI.
  • AI Agent: CrewAI, Composio, Gemini.

Quick Description

  • Composio: A toolkit for integrating apps with AI agents.
  • CrewAI: An open-source framework for building collaborative multi-agent AI systems.
  • React + Vite: React for building UIs and Vite for fast development and build tooling.
  • FastAPI: A Python framework for quickly building REST APIs.
  • Gemini: A family of LLMs from Google.

Let’s Get Started 💥

To start quickly, fork and clone this repository and cd into the gmailgenius-attachment-extract-store folder.

The project has two parts: the back end and the front end. The back end consists of the AI tool built using CrewAI, Composio, and Gemini, and the front end has an interactive UI.

Setting Up the Backend

To set up the development environment, make setup.sh executable and run it:

cd GmailGenius/backend
chmod +x setup.sh
./setup.sh

For reference, this is the setup.sh file.


#!/bin/bash

# Create a virtual environment
echo "Creating virtual environment..."
python3 -m venv ~/.venvs/gmail_agent

# Activate the virtual environment
echo "Activating virtual environment..."
source ~/.venvs/gmail_agent/bin/activate

# Install libraries from requirements.txt 
echo "Installing libraries from requirements.txt..."
pip install -r requirements.txt

# Log in to your Composio account
echo "Log in to your Composio account"
composio login

# Add Gmail and Google Sheets tools
echo "Add Gmail tools. Finish the flow"
composio add gmail
composio add googlesheets 

#Enable Gmail trigger
composio triggers enable gmail_new_gmail_message

# Copy env backup to .env file
if [ -f ".env.example" ]; then
    echo "Copying .env.example to .env..."
    cp .env.example .env
else
    echo "No .env.example file found. Creating a new .env file..."
    touch .env
fi

# Prompt user to fill the .env file
echo "Please fill in the .env file with the necessary environment variables."

echo "Setup completed successfully!"

This will create a Python virtual environment and install libraries from requirements.txt. You will also be prompted to log in to Composio. This will redirect you to the Composio login page.

Create an account on Composio and paste the displayed key into the terminal to log in to your Composio account.

Composio login key

You will then be redirected to the Google Authentication page to add the Gmail and Google Sheet integrations.

Composio Authentication successful

Once you are done with the integrations, you can visit the Composio dashboard to monitor them.

Composio Dashboard


Building the backend

Now that we are finished with the integrations, let's build the backend.

Prerequisites

You will need API keys for Nanonets and Google’s Gemini to complete the project.

Nanonets

This will help extract relevant data from the invoice PDFs. So, create an account with Nanonets and get a free API key.

Nanonet dashboard

Copy the key and add it to the .env file.

Also, set the Nanonets URL https://app.nanonets.com/api/v2/OCR/FullText in the .env file.

Also, go to the Google AI studio and create an API key.

Gemini AI studio

Save the key to the .env file as well.
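At this point, your .env file should contain at least the following entries. The key names match the ones read by the code later in this article; the values are placeholders:

```env
NANO_API_KEY=your_nanonets_api_key
NANO_URL=https://app.nanonets.com/api/v2/OCR/FullText
GEMINI_API_KEY=your_gemini_api_key
```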


Building AI Bot 🤖

Let’s start by creating the AI bot responsible for retrieving invoices from the Gmail inbox, processing the PDF, and writing it into the Google sheet.

Here is a quick overview of this section:

  • Set up an event listener with a Gmail trigger to poll emails from the inbox.
  • Build an extractor tool that lets the AI bot auto-retrieve invoice attributes using Nanonets.
  • Create a CrewAI agent to perform attribute extraction and update the sheet.
  • Write a few helper functions.

Import the required modules and load environment variables inside the agent.py file.

import os
import re
import glob
import json
from composio.client.collections import TriggerEventData
from composio_crewai import Action, ComposioToolSet
from crewai import Agent, Crew, Task, Process
from crewai_tools.tools.base_tool import BaseTool
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
from typing import Any, Dict
import requests

load_dotenv()

Create an instance for Gemini.

from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", 
                             verbose=True, temperature=0.5, 
                             google_api_key=os.environ.get("GEMINI_API_KEY"))

Defining Increment Counter Tool

We will also want a tool to keep track of inserted rows in Google Sheets.

#Get the current counter value
def read_counter():
    try:
        with open("counter.json", "r") as file:
            data = json.load(file)
            return data["counter"]
    except FileNotFoundError:
        initial_data = {"counter": 0}
        with open("counter.json", "w") as file:
            json.dump(initial_data, file)
        return 0

#Tool to increment counter value
class incrementCounter(BaseTool):
    name: str = "Increment Counter"
    description: str = "This tool increments the counter value"

    def _run(self):
        current_value = read_counter()
        new_value = current_value + 1
        data = {"counter": new_value}
        with open("counter.json", "w") as file:
            json.dump(data, file)

It tracks the row position in Google Sheets to add data accurately. Since we need to specify the exact cell (e.g., A1) for data entry, the counter helps determine the next available row, especially if some are already filled. The counter updates only when data is successfully added to ensure it reflects the correct position, preventing unnecessary updates when no data is added.
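As a concrete illustration of that convention, the persisted counter maps to a 1-indexed Sheets row, so the next target cell can be computed like this (`next_cell` is a hypothetical helper for illustration, not part of the project):

```python
def next_cell(counter: int, column: str = "A") -> str:
    """Map the persisted counter to the next free cell, e.g. counter 0 -> 'A1'."""
    # Google Sheets rows are 1-indexed, so the target row is counter + 1
    return f"{column}{counter + 1}"
```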

Defining Extractor tool + Google tools

Define a tool to extract data from the email attachments using Nanonets.

#Get the attachment that was most recently downloaded
def get_recent_attachment() -> str:
    # Composio saves downloaded attachments under ~/.composio/output by default; modify as needed
    pdf_files = glob.glob(os.path.join(os.path.expanduser("~/.composio/output"), "*.pdf"))
    if not pdf_files:
        return None

    most_recent_pdf = max(pdf_files, key=os.path.getctime)
    return most_recent_pdf

#Extract useful attributes from the attachment
class extractorTool(BaseTool):
    name: str = "ExtractorTool"
    description: str = "This tool extracts useful attributes from pdf documents/attachments"

    def _run(self) -> Dict[str, Any]:
        attachment = get_recent_attachment()
        url = os.environ.get("NANO_URL")
        with open(attachment, "rb") as f:  # close the upload handle when done
            response = requests.post(
                url,
                auth=requests.auth.HTTPBasicAuth(os.environ.get("NANO_API_KEY"), ""),
                files={"file": f},
            )
        return json.loads(response.text)["result"][0]["prediction"]

In the above code block,

  • We define a CrewAI tool, extractorTool, that parses PDFs and extracts information from them.
  • The get_recent_attachment() function retrieves the most recently downloaded PDF.
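To make the return value of _run concrete: the FullText OCR endpoint returns a list of labeled predictions, which can be flattened into an ordered row of cell values. The sketch below assumes that shape; the field names in the sample payload are illustrative, not a real API response:

```python
def predictions_to_row(response: dict, attributes: list[str]) -> list[str]:
    """Flatten a Nanonets-style prediction list into values ordered by `attributes`."""
    predictions = response["result"][0]["prediction"]  # same access path as extractorTool
    by_label = {p["label"]: p["ocr_text"] for p in predictions}
    return [by_label.get(attr, "NA") for attr in attributes]

# Illustrative payload mimicking the OCR response structure (made-up field values)
sample = {"result": [{"prediction": [
    {"label": "invoice_number", "ocr_text": "INV-0042"},
    {"label": "total_amount", "ocr_text": "199.00"},
]}]}
```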

Next, initialize the Composio tools for Gmail and Google Sheets, along with the extractor tool we just defined.

#Tools
IncrementCounter = incrementCounter()
Extractor_tool=extractorTool()
composio_toolset = ComposioToolSet()
google_tools = composio_toolset.get_actions(
    actions=[
        # Action.GMAIL_FETCH_MESSAGE_BY_THREAD_ID, 
        Action.GMAIL_GET_ATTACHMENT,
        Action.GMAIL_FETCH_EMAILS,
        Action.GOOGLESHEETS_BATCH_UPDATE
    ]
)
tools = google_tools + [Extractor_tool, IncrementCounter]

We defined the extractor tool earlier; the Google tools bundle three actions:

  • Action.GMAIL_GET_ATTACHMENT: Retrieves attachments from Gmail emails.
  • Action.GMAIL_FETCH_EMAILS: Fetches emails from Gmail based on specific criteria.
  • Action.GOOGLESHEETS_BATCH_UPDATE: Updates data in Google Sheets in bulk.

Defining CrewAI Agent

Next, define a CrewAI agent.

google_assistant = Agent(
    role="Gmail Assistant",
    goal="""Get five recent emails/messages and check if the thread ID matches, download attachments & extract attributes/information from it & store attributes in Google sheet (Store just the values & not the attribute names)""",
    backstory=f"""You're an AI assistant that makes use of google tools/APIs and does work on behalf of user. You can extract attributes/information from attachments & Store them in Google sheet""",
    verbose=True,
    llm=llm,
    tools=tools,
    allow_delegation=False,
)

The agent is responsible for carrying out the tasks. Create an instance of Agent with:

  • Role, goal, and backstory: These give the LLM additional context to complete the job.
  • verbose: Logs execution traces.
  • llm: The LLM instance, here Gemini.
  • tools: All the tools we defined earlier, i.e., the extractor tool and the Google tools.
  • allow_delegation: Set to False so the agent will not pass control flow to other agents (if any are available).

Event Listener

Next, define the event listener.

The event listener continuously monitors the Gmail inbox and retrieves emails as they arrive. You can define an event listener with trigger filters.

Here, we use the Gmail trigger we enabled during the Composio setup, which fetches new emails from the Gmail inbox.

Each event listener is paired with a callback function that runs when an event is received via the trigger.

#Get keywords, attributes & sheet ID
def readData():
    with open("taskData.json", "r") as file:
        data = json.load(file)
        return [data["keywords"], data["attributes"], data["sheetId"]]

def formatData(payload):
    try:
        threadId = payload.get("threadId", "NA")
        attachmentDetails = payload.get("attachmentList", [])

        if not attachmentDetails or not isinstance(attachmentDetails, list):
            return [threadId, "NA", "NA"]

        attachmentId = attachmentDetails[0].get("attachmentId", "NA")
        filename = attachmentDetails[0].get("filename", "NA")

        return [threadId, attachmentId, filename]
    except (IndexError, AttributeError, TypeError) as e:
        # Log the error if needed
        return ["NA", "NA", "NA"]

#Create a listener for the Gmail trigger we enabled during setup
listener = composio_toolset.create_trigger_listener()

@listener.callback(filters={"trigger_name": "GMAIL_NEW_GMAIL_MESSAGE"})
def callback_new_message(event: TriggerEventData) -> None:
    print("Received email")
    payload = event.payload
    formattedPayload = formatData(payload)
    message_id = formattedPayload[0]
    attachment_id = formattedPayload[1]
    file_name = formattedPayload[2]

    res = readData()
    keywords = res[0]
    attributes = res[1]
    sheetId = res[2]

    find_invoice_from_gmail = Task(
        description=f"""
        Check if the email subject or body contains keywords like {keywords}; if so, download the attachment & store the following attributes: {attributes} in the google sheet with id {sheetId}, sheet name sheet1, and cell A{read_counter() + 1},
        Email: {payload}
        """,
        agent=google_assistant,
        expected_output=f"If the email matches the criteria ({keywords}) then download the attachment, store the attributes in the google sheet, and increment the counter; otherwise indicate the email isn't related",
    )

    gmail_processing_crew = Crew(
        agents=[google_assistant],
        tasks=[find_invoice_from_gmail],
        verbose=1,
        process=Process.sequential,
    )
    result = gmail_processing_crew.kickoff()
    return result

print("Subscription created!")
listener.listen()

In the callback function callback_new_message:

  • We first format the event payload from Gmail and extract the relevant data: the thread ID, the attachment ID, and the filename.
  • We also read the keywords used to find invoices, the attributes to save in the Google Sheet, and the sheet ID from the JSON file saved by the front end.
  • We define a CrewAI Task for the google_assistant agent with a clear description and expected output.
  • Finally, we define the Crew with the agent and the task and kick it off.

Running the Event Listener

Finally, run the event listener using the following command.

python agent.py

This will start the event listener, which polls the Gmail inbox at a regular interval (the default is 10 minutes).

When a new email is received, it will look for relevant keywords that you specified on the front end and trigger the Crew if an appropriate match is found.

The Agent will perform the tasks and update the Google Sheets with relevant invoice attributes.


Building the API backend

Next, we will build an API endpoint to receive information from the front end, using FastAPI and Pydantic.

Import the required modules and set up logging.

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from agent import run_crew
import json
import logging

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

Create a FastAPI app and set up CORS using CORSMiddleware.

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"], 
    allow_credentials=True,
    allow_methods=["*"], 
    allow_headers=["*"], 
)
  • allow_origins=["*"]: Allows requests from any origin. This is useful for development but should be restricted in production.
  • allow_credentials=True: Allows cookies and HTTP authentication to be included in the requests.
  • allow_methods=["*"]: Allows all HTTP methods (GET, POST, PUT, DELETE, etc.).
  • allow_headers=["*"]: Allows all headers in the requests.
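For production, a common pattern is to drive allow_origins from an environment variable instead of "*". A minimal sketch (the ALLOWED_ORIGINS variable name is an assumption; the localhost fallback matches the dev server port used later in this article):

```python
import os

def allowed_origins() -> list[str]:
    """Comma-separated ALLOWED_ORIGINS env var, falling back to the local dev server."""
    raw = os.environ.get("ALLOWED_ORIGINS", "")
    if raw:
        return [origin.strip() for origin in raw.split(",") if origin.strip()]
    return ["http://localhost:5345"]
```

Pass the result to allow_origins in app.add_middleware(...) instead of the wildcard.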

Now, define a Pydantic class for Message.

class Message(BaseModel):
    emailKeywords: str
    attributes: str
    sheetId: str

Also, define a writeData function to save the information to a JSON file.

def writeData(keywords: str, attributes: str, sheetId: str):
    data = {
        "keywords": keywords,
        "attributes": attributes,
        "sheetId": sheetId
    }

    with open("taskData.json", "w") as file:
        json.dump(data, file, indent=4)

Finally, define the POST endpoint.

@app.post("/configparameters")
async def handle_request(message: Message):
    try:
        logger.info(f"Received request with emailKeywords: {message.emailKeywords} and attributes: {message.attributes}")
        writeData(message.emailKeywords, message.attributes, message.sheetId)
        logger.info(f"Data written successfully to taskData.json")
        return {"message": "Data written successfully"}
    except Exception as e:
        logger.error(f"Error occurred: {str(e)}")
        raise HTTPException(status_code=500, detail=str(e))

The endpoint receives the user inputs from the front end and saves them to a JSON file.
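Because the agent's readData reads the same taskData.json this endpoint writes, the two processes share one small file contract. A round-trip sketch (using a temp directory so it doesn't touch a real taskData.json):

```python
import json
import os
import tempfile

def write_task_data(path: str, keywords: str, attributes: str, sheet_id: str) -> None:
    """Mirror of the endpoint's writeData: persist the front-end config as JSON."""
    with open(path, "w") as f:
        json.dump({"keywords": keywords, "attributes": attributes, "sheetId": sheet_id}, f, indent=4)

def read_task_data(path: str) -> list[str]:
    """Mirror of the agent's readData: load the config back in the same order."""
    with open(path) as f:
        data = json.load(f)
    return [data["keywords"], data["attributes"], data["sheetId"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "taskData.json")
    write_task_data(path, "invoice, receipt", "invoice_number, total", "sheet-id-123")
    round_trip = read_task_data(path)
```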


Building the Frontend

The application's front end is built with React and Vite.

Go to the src directory and install the dependencies.

npm install

Create Pages for the App

For this project, we have two pages.

  1. Home: The Home page lists FAQs, etc.
  2. Dashboard: The main app user interface.

Designing the Home page

Let’s now design the Home page.

The home page contains a few FAQs and information regarding the project, so feel free to skip this section.


import Hero from "../components/Hero";
import Benefits from "../components/Benefits";
import FAQ from "../components/FAQ";
import Working from "../components/Working";
import ActionButton from "../components/ActionButton";
const Home = () => {
    return <section className="bg-white dark:bg-gray-900 mt-12">
        <div className="py-8 px-4 mx-auto max-w-screen-xl text-center lg:py-16 lg:px-12">
            <Hero />
            <Benefits />
            <Working />
            <FAQ />
            <div className="mt-20">
                <ActionButton displayName={"Get started"} link={"#"} />
            </div>
        </div>
    </section>
}

export default Home;


This will create a simple Home page like the following picture.

home page


Designing the Dashboard

The dashboard contains two input text boxes that accept keywords that will be used to search emails and attributes you want to store in the spreadsheet.

import ConfigParameters from "../components/ConfigParameters";
const Dashboard = () => {
    return <section className="bg-white dark:bg-gray-900 mt-12">
        <div className="py-8 px-4 mx-auto max-w-screen-xl text-center lg:py-16 lg:px-12">
            <span className="font-semibold text-3xl text-gray-900">Enter Keywords (Crisp & Concise)</span>
            <ConfigParameters />
        </div>
    </section>
}

export default Dashboard;

This will create a nice, simple dashboard for accepting user information. The fetch button will trigger the backend to spring into action.

dashboard


Defining the Main App layout

In the App.jsx file, we set up the primary component that defines the app's routes.

import { BrowserRouter, Routes, Route } from "react-router-dom";
import Navbar from "./components/Navbar";
import Home from "./pages/Home";
import Footer from "./components/Footer";
import Dashboard from "./pages/Dashboard";
import ScrollToTop from "./components/ScrollToTop";

const App = () => {
  return <>
    <BrowserRouter>
      <Navbar />
      <ScrollToTop />
      <Routes>
        <Route path="/dashboard" element={<Dashboard />} />
        <Route path="/" element={<Home />}>
        </Route>
      </Routes>
      <Footer />
    </BrowserRouter>
  </>
}

export default App;

This is what is happening in the above component:

  • Imports: The code imports the Home, Dashboard, and Footer components, among others.
  • Router setup: The Dashboard component is rendered when the URL path is “/dashboard” and the Home component when the path is “/”. The navbar and footer are rendered throughout the app.

Define the Entrypoint

Finally, define the main.jsx file as the entry point for the application.

import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App.jsx'
import './index.css'

createRoot(document.getElementById('root')).render(
  <StrictMode>
    <App />
  </StrictMode>,
)

This gets executed when the application is run.


Running the App

Finally, run the application using the following npm command.

npm run dev

This will start the front-end server on localhost:5345.

You can now visit the app and enter the necessary details; when you click the configure button, the details are saved to a JSON file for the agent to pick up.

See the entire workflow in action below.

Gmail AI bot workflow


Thank you for reading.


Next Steps

In this article, you built a complete AI tool that fetches invoices from Gmail, processes them, and records the extracted data in a Google Sheet.

If you liked the article, explore and star the Composio repository for more AI use cases.

 
 


Star the Composio repository ⭐
 
 
