Building an AI Email Assistant That Prioritizes, Sorts, and Summarizes with LLMs

If you're a developer who wants to build a productivity tool on top of Large Language Models (LLMs), this guide walks you through an AI-powered email assistant, step by step.

We'll cover:

🧠 How to classify and prioritize incoming emails using LLMs
🔍 Summarizing long email threads in seconds
📌 Integrating with Gmail and scheduling background jobs with Celery and Redis

This is a technical deep dive, with code examples, aimed at developers building intelligent tools with OpenAI, Pinecone, and more.

🏗️ The High-Level Architecture

Here's what we're building:

Email Integration Layer: Gmail OAuth + IMAP sync
LLM-Powered Inference Pipeline
- Classification (e.g. Important / Ignore / Personal / Work)
- Smart Prioritization using carefully tuned prompts
- TL;DR Summarization
Memory Layer: Vector DB using Pinecone or Weaviate
Scheduler & Orchestration: Celery + Redis
Frontend Layer: React dashboard (optional, out of scope here)

🔐 Step 1: Gmail OAuth & IMAP

To pull emails from a Gmail inbox, connect over IMAP and authenticate with either Google's OAuth 2.0 or an app password. The snippet below uses an app password for brevity.

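import imaplib
import email

# Connect over SSL and log in (an app password is used here for brevity).
mail = imaplib.IMAP4_SSL("imap.gmail.com")
mail.login("your_email@gmail.com", "app_password")
mail.select("inbox")

# search() returns a status and a list holding one bytes string of message numbers.
status, data = mail.search(None, "ALL")
message_ids = data[0].split()
for num in message_ids[-10:]:  # fetch and parse the ten most recent messages
    status, msg_data = mail.fetch(num, "(RFC822)")
    msg = email.message_from_bytes(msg_data[0][1])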
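If you would rather use OAuth 2.0 than an app password, Gmail's IMAP server accepts the XOAUTH2 SASL mechanism. The sketch below assumes you have already obtained an access token with the IMAP scope through Google's OAuth flow, which is not shown here. The helper names `build_xoauth2_string` and `connect_with_oauth` are hypothetical.

```python
import imaplib


def build_xoauth2_string(user: str, access_token: str) -> bytes:
    """Build the SASL XOAUTH2 initial response for the given user and token."""
    # Fields are separated by the control character \x01 and the string ends with two of them.
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode()


def connect_with_oauth(user: str, access_token: str) -> imaplib.IMAP4_SSL:
    """Open an IMAP session authenticated with an OAuth 2.0 access token."""
    mail = imaplib.IMAP4_SSL("imap.gmail.com")
    # imaplib base64-encodes whatever bytes the callback returns.
    mail.authenticate(
        "XOAUTH2",
        lambda _challenge: build_xoauth2_string(user, access_token),
    )
    mail.select("inbox")
    return mail
```

Access tokens expire, so a long-running worker would need to refresh the token before reconnecting.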

Store and preprocess emails into a standard JSON format (date, sender, subject, body).
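One way to produce that record is with Python's standard `email` package. This is a minimal sketch: the function name `to_record` and the sample message are made up for illustration, and real inboxes will need extra handling for attachments and malformed headers.

```python
import email
import json
from email import policy


def to_record(raw_bytes: bytes) -> dict:
    """Convert one raw RFC 822 message into a flat dict (date, sender, subject, body)."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # Prefer the plain-text part; fall back to HTML if that is all there is.
    part = msg.get_body(preferencelist=("plain", "html"))
    body = part.get_content() if part is not None else ""
    return {
        "date": str(msg["Date"] or ""),
        "sender": str(msg["From"] or ""),
        "subject": str(msg["Subject"] or ""),
        "body": body.strip(),
    }


# A hand-written sample message, standing in for bytes returned by an IMAP fetch.
raw = (
    b"From: jane@example.com\r\n"
    b"Subject: Quarterly report\r\n"
    b"Date: Mon, 01 Jan 2024 09:00:00 +0000\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Please review the attached report.\r\n"
)
print(json.dumps(to_record(raw), indent=2))
```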

🔎 Step 2: Email Classification with LLM

Use an OpenAI GPT model to classify each email by type:

prompt = f"""
Classify the following email as one of: Work, Personal, Spam, Newsletter, Important.

Email:
{email_body}

Classification:
"""

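# Requires openai>=1.0 (older releases used openai.ChatCompletion.create).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # keep labels as repeatable as possible
)
label = response.choices[0].message.content.strip()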
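Models do not always reply with a bare label. You may get "Work." or "Classification: Spam" instead. A small normalization step keeps downstream code predictable. This is a sketch; the `normalize_label` helper and the "Unclassified" fallback are assumptions, not part of any library.

```python
ALLOWED_LABELS = ("Work", "Personal", "Spam", "Newsletter", "Important")


def normalize_label(raw_label: str, fallback: str = "Unclassified") -> str:
    """Map a free-form model reply onto one of the allowed labels."""
    cleaned = raw_label.strip().strip(".\"'").lower()
    for label in ALLOWED_LABELS:
        if cleaned == label.lower():
            return label
    # Handle replies such as "Classification: Work", but only if exactly one label appears.
    matches = [label for label in ALLOWED_LABELS if label.lower() in cleaned]
    return matches[0] if len(matches) == 1 else fallback
```

Ambiguous replies that mention two labels fall through to the fallback, so you can route them to manual review rather than guess.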

📊 Step 3: Prioritize Using Metadata + Content

Beyond classification, rank emails by priority using both metadata (time received, sender) and LLM-driven sentiment and urgency analysis.

priority_prompt = f"""
Rate the urgency of this email from 1 (low) to 5 (very high). Just return the number.

Email:
{email_body}
"""

Parse the returned number and use it to sort or tag inbox items.
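Here is one way the urgency rating could be blended with metadata. Treat it as a sketch: the `ScoredEmail` type, the VIP sender list, and the weights (+2 for a VIP, +1 for mail under 24 hours old) are illustrative assumptions you would tune for your own inbox.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ScoredEmail:
    subject: str
    sender: str
    received_at: datetime
    urgency: int  # 1-5, parsed from the model's reply


def parse_urgency(reply: str, default: int = 3) -> int:
    """Extract a standalone digit 1-5 from the model's reply, or fall back to a default."""
    match = re.search(r"\b([1-5])\b", reply)
    return int(match.group(1)) if match else default


def priority_score(item: ScoredEmail, vip_senders: set[str], now: datetime) -> float:
    """Blend model urgency with sender and recency signals."""
    score = float(item.urgency)
    if item.sender in vip_senders:
        score += 2.0
    age_hours = (now - item.received_at).total_seconds() / 3600
    if age_hours < 24:
        score += 1.0
    return score


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
inbox = [
    ScoredEmail("Lunch?", "friend@example.com", now - timedelta(hours=2), parse_urgency("2")),
    ScoredEmail("Server down", "boss@example.com", now - timedelta(hours=30), parse_urgency("Urgency: 5")),
    ScoredEmail("Invoice due", "billing@example.com", now - timedelta(hours=1), parse_urgency("4")),
]
vips = {"boss@example.com"}
ranked = sorted(inbox, key=lambda item: priority_score(item, vips, now), reverse=True)
```

Keeping the metadata weights outside the prompt means you can adjust the ranking without paying for another model call.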

📝 Step 4: TL;DR Summarization with GPT

summary_prompt = f"""
Summarize the following email thread in 2 sentences max.

Thread:
{email_thread}

Summary:
"""

LLMs are effective at summarizing long email chains, especially when you split threads that exceed the model's context window into chunks and summarize each chunk before combining the results.
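Chunking can be sketched as follows. The code groups whole messages under a character budget, summarizes each chunk, then summarizes the partial summaries. The `summarize` argument is any function that takes text and returns a summary, for example a wrapper around the prompt above. The 6000-character default is an arbitrary placeholder, not a model limit.

```python
from typing import Callable


def chunk_messages(messages: list[str], max_chars: int = 6000) -> list[str]:
    """Group whole messages into chunks that stay under a rough character budget."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for message in messages:
        # Start a new chunk when adding this message would exceed the budget.
        if current and size + len(message) > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(message)
        size += len(message)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def summarize_thread(
    messages: list[str],
    summarize: Callable[[str], str],
    max_chars: int = 6000,
) -> str:
    """Summarize each chunk, then summarize the combined partial summaries."""
    chunks = chunk_messages(messages, max_chars)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partials))
```

A single message longer than the budget still ends up in its own oversized chunk, so very long emails would need to be split further before this step.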

🧠 Step 5: Memory Using Pinecone or Weaviate

Store previous emails and summaries as vector embeddings for fast semantic search:

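from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

# Assumes an existing index named "emails" with dimension 384 (the output size of all-MiniLM-L6-v2).
pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("emails")

model = SentenceTransformer("all-MiniLM-L6-v2")
embedding = model.encode(summary)
index.upsert(vectors=[(email_id, embedding.tolist())])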

Later, you can embed a query such as "What did John say about the proposal?" with the same model and retrieve the closest matching emails.
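To make "semantic retrieval" concrete, here is roughly what a vector database does under the hood: rank stored vectors by cosine similarity to the query vector. This in-memory sketch uses hand-made 3-dimensional toy vectors and hypothetical email IDs. In practice the vectors would come from the embedding model and the search would run inside Pinecone or Weaviate.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k stored items most similar to the query, best match first."""
    scored = [(email_id, cosine_similarity(query_vec, vec)) for email_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors standing in for real embeddings.
store = {
    "john-proposal": [0.9, 0.1, 0.0],
    "newsletter": [0.0, 0.2, 0.9],
    "john-lunch": [0.5, 0.8, 0.1],
}
query = [1.0, 0.0, 0.0]
results = top_k(query, store, k=2)
```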

🔁 Step 6: Scheduling and Notifications with Celery

Use Celery for:

Checking for new emails every 15 minutes
Running classification + summarization jobs
Sending digest notifications

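from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

# Run the task every 15 minutes via Celery beat (task name assumes this module is tasks.py).
app.conf.beat_schedule = {
    "check-inbox": {"task": "tasks.check_and_classify", "schedule": 15 * 60.0},
}

@app.task
def check_and_classify():
    """Pull emails, classify, summarize, send alerts."""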
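The body of that task could be organized as a plain function that takes each stage as an argument. This makes it easy to test without Gmail, OpenAI, or Celery running. This is only a sketch: `run_inbox_cycle` and its parameters are hypothetical names, and skipping Spam and Newsletter in the digest is an assumption.

```python
from typing import Callable, Iterable


def run_inbox_cycle(
    fetch_new: Callable[[], Iterable[dict]],
    classify: Callable[[str], str],
    summarize: Callable[[str], str],
    notify: Callable[[list[dict]], None],
    skip_labels: frozenset[str] = frozenset({"Spam", "Newsletter"}),
) -> list[dict]:
    """Fetch new emails, classify and summarize them, and send one digest."""
    digest: list[dict] = []
    for record in fetch_new():
        label = classify(record["body"])
        if label in skip_labels:
            continue  # leave low-value mail out of the digest
        digest.append(
            {
                "subject": record["subject"],
                "sender": record["sender"],
                "label": label,
                "summary": summarize(record["body"]),
            }
        )
    # Only notify when there is something worth reading.
    if digest:
        notify(digest)
    return digest
```

The Celery task then becomes a thin wrapper that calls this function with the real IMAP, LLM, and notification helpers.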