What if you could talk to your inbox directly from Discord?
Not just ask an AI to summarize emails, but actually trigger Gmail actions from a chat conversation:
- Read unread emails
- Reply to an existing email
- Send a brand-new email
- Handle Google authentication
- Format the result back into a clean Discord message
That is exactly what I built in this video tutorial.
I use Hexabot, a self-hosted AI chatbot and workflow automation platform, to build a Discord AI assistant that communicates with Gmail.
The goal is not just to connect an LLM to Gmail.
The goal is to show a more controlled way to build AI automation workflows: conversation, intent detection, conditional logic, tool execution, authentication handling, and response formatting.
Why not just use an MCP Gmail server?
Technically, we could have built this faster.
For example, we could configure an AI agent with an MCP connection to a Gmail MCP server and let the agent decide when to call Gmail tools.
That approach is useful, especially for quick prototypes.
But this tutorial takes a different route on purpose.
Instead of giving the agent full freedom over the execution flow, we build the workflow visually step by step. This gives us more control over what happens, when it happens, and how each decision is handled.
In other words, the AI does not directly “do everything.”
The workflow separates responsibilities:
- The AI detects the user intent.
- The workflow validates the intent.
- The Gmail action performs the actual operation.
- A conditional branch handles authentication.
- Another AI step formats the final response for Discord.
This makes the assistant easier to debug, safer to extend, and more understandable for developers who want to build production-ready AI automations.
What the assistant can do
The Discord Gmail assistant supports three main use cases.
1. Read emails
Example prompt:
Show me the top two unread emails.
The assistant understands that the user wants to read emails, extracts the limit, calls Gmail, and returns a clean Discord-friendly summary.
2. Reply to an email
Example prompt:
Reply to this email saying: Thank you, well noted.
The assistant extracts the action as reply, identifies the target email, captures the reply text, and sends the response through Gmail.
3. Send a new email
Example prompt:
Send an email to someone@example.com about AI evolution during the past year.
The assistant detects that this is a new email, extracts the recipient, prepares the subject and content, sends the email, and confirms the result in Discord.
The workflow architecture
The full workflow follows this pattern:
Discord message
      ↓
AI Infer Intent
      ↓
Valid action?
├── No  → Ask the user what they want to do
└── Yes → Gmail Action
              ↓
          Gmail status?
          ├── 401 → Send Google sign-in button
          └── 200 → AI Formatter → Send Discord message
This is the most important part of the tutorial.
We are not building a black-box AI agent that decides on its own what to do. We are building a controlled execution flow where every important step is visible.
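To make the branching in the diagram above concrete, here is a minimal Python sketch of the same control flow. It is not Hexabot code: in the tutorial the flow is built visually in the editor. The function name `run_workflow`, the three callables, and the reply dictionaries are all hypothetical, and the final catch-all branch for other statuses is my own addition.

```python
from typing import Callable

# Actions the workflow is willing to execute; "empty" is deliberately excluded.
VALID_ACTIONS = {"read", "reply", "new"}


def run_workflow(
    message: str,
    infer_intent: Callable[[str], dict],
    gmail_action: Callable[[dict], dict],
    format_for_discord: Callable[[dict], str],
) -> dict:
    """Route one Discord message through the branches of the diagram."""
    intent = infer_intent(message)

    # Branch 1: no valid action detected, so ask instead of guessing.
    if intent.get("action") not in VALID_ACTIONS:
        return {"type": "text", "text": "Hello, how can I help you today?"}

    result = gmail_action(intent)

    # Branch 2: Gmail reports that the user is not authenticated yet.
    if result.get("status") == 401:
        return {"type": "button", "label": "Sign in with Google"}

    # Branch 3: success, so hand the raw result to the formatter.
    if result.get("status") == 200:
        return {"type": "text", "text": format_for_discord(result)}

    # Assumption: any other status is reported as a generic failure.
    return {"type": "text", "text": "Something went wrong while contacting Gmail."}
```

Because each step is passed in as a plain function, every branch can be exercised with stubs before any real Gmail call is involved.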
Step 1: Detect the user intent
The first step is an AI Infer Object action.
Instead of asking the LLM to return a free-form text answer, we ask it to return a structured object.
The schema includes fields like:
action
limit
targetMail
mailText
subject
The action field can be:
read
reply
new
empty
This acts as a contract between the AI and the workflow.
For example:
Show me the top two unread emails.
Becomes something like:
{
  "action": "read",
  "limit": "2"
}
And:
Reply to the last email saying thank you.
Becomes:
{
  "action": "reply",
  "mailText": "Thank you"
}
This is a powerful pattern because the AI is not responsible for executing Gmail operations directly. It only extracts the intent and the required fields.
The workflow decides what happens next.
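If you want to picture that contract in code, the sketch below parses the model's JSON output into a typed object. The field names come from the schema above; the `Intent` dataclass, the `parse_intent` helper, and the choice to fall back to `empty` on malformed output are my own assumptions, not how the AI Infer Object action is implemented.

```python
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = ("read", "reply", "new", "empty")


@dataclass
class Intent:
    action: str = "empty"
    limit: Optional[int] = None
    targetMail: Optional[str] = None
    mailText: Optional[str] = None
    subject: Optional[str] = None


def parse_intent(raw: str) -> Intent:
    """Parse the model output; anything unexpected becomes an 'empty' intent."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return Intent()
    if not isinstance(data, dict) or data.get("action") not in ALLOWED_ACTIONS:
        return Intent()

    # The examples above return the limit as a string, so coerce it to an int.
    limit = data.get("limit")
    try:
        limit = int(limit) if limit is not None else None
    except (TypeError, ValueError):
        limit = None

    return Intent(
        action=data["action"],
        limit=limit,
        targetMail=data.get("targetMail"),
        mailText=data.get("mailText"),
        subject=data.get("subject"),
    )
```

The point of the fallback is that a malformed or unexpected answer from the model can never reach the Gmail step as a valid action.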
Step 2: Add a conditional branch
After intent detection, the workflow checks whether the AI detected a valid action.
If the action exists, the workflow continues to the Gmail step.
If the action is empty, the assistant sends a fallback message:
Hello, how can I help you today?
This prevents the assistant from guessing when the user request is unclear.
That matters a lot when the assistant has access to real actions like sending emails.
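The condition in the tutorial only checks whether an action was detected. As a possible extension, the sketch below also checks that the fields each action needs are present. The `REQUIRED_FIELDS` table and the `check_intent` helper are hypothetical, and which fields are required per action is my assumption.

```python
from typing import List, Tuple

# Assumption: the minimum fields each action needs before Gmail is called.
REQUIRED_FIELDS = {
    "read": [],
    "reply": ["mailText"],
    "new": ["targetMail", "mailText"],
}


def check_intent(intent: dict) -> Tuple[bool, List[str]]:
    """Return (is_valid, missing_field_names) for a detected intent."""
    action = intent.get("action")
    # "empty" and unknown actions are not in the table, so they are rejected.
    if not isinstance(action, str) or action not in REQUIRED_FIELDS:
        return False, []
    missing = [name for name in REQUIRED_FIELDS[action] if not intent.get(name)]
    return len(missing) == 0, missing
```

With the list of missing fields available, the fallback message can ask for exactly what is missing instead of a generic greeting.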
Step 3: Execute the Gmail action
The next step is the Gmail action.
This is where the workflow maps the structured AI output into the Gmail operation:
mailText → email body
subject → email subject
targetMail → recipient or target email
action → read, reply, or new
limit → number of emails to fetch
The key idea is this:
The AI detects the intent.
The Gmail action executes the operation.
That separation is what makes the workflow more predictable.
You can inspect the values, debug the action, add conditions, add validation, or insert an approval step before sending emails.
For a production setup, I would strongly recommend adding a confirmation step before sending or replying to emails, especially if the assistant is used by a team.
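The tutorial does not show the internals of the Gmail custom action, so the following is only a sketch of how that mapping could look if you called the Gmail REST API yourself. The `q` and `maxResults` parameters and the base64url-encoded `raw` field are part of the public Gmail API; the `build_gmail_request` helper, the `operation` key, and the default limit of 5 are hypothetical. Replies are left out because they also need thread information that is not covered here.

```python
import base64
from email.message import EmailMessage


def build_gmail_request(intent: dict) -> dict:
    """Map the structured intent to the payload of a Gmail API call."""
    action = intent["action"]

    if action == "read":
        # limit -> number of emails to fetch (assumed default: 5)
        return {
            "operation": "list",
            "params": {"q": "is:unread", "maxResults": int(intent.get("limit") or 5)},
        }

    if action == "new":
        msg = EmailMessage()
        msg["To"] = intent["targetMail"]                        # targetMail -> recipient
        msg["Subject"] = intent.get("subject") or "(no subject)"  # subject -> email subject
        msg.set_content(intent["mailText"])                     # mailText -> email body
        # Gmail expects the full RFC 2822 message, base64url-encoded.
        raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
        return {"operation": "send", "body": {"raw": raw}}

    raise ValueError(f"Unsupported action in this sketch: {action!r}")
```

Since the function only builds the payload, it is also a natural place to show the user a preview and ask for confirmation before anything is sent.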
Step 4: Handle Google authentication
The Gmail action can return different statuses.
If the status is 200, the Gmail operation worked.
If the status is 401, the user still needs to authenticate with Google.
So the workflow handles the authentication case separately.
When authentication is required, the assistant sends a Discord button:
Sign in with Google
The button links to the Google OAuth flow.
This makes the user experience much smoother. The user can start from Discord, receive the authentication button, sign in with Google, and continue the conversation.
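For readers curious about what sits behind such a button, here is a rough sketch of building a Google OAuth 2.0 authorization URL. The endpoint, the query parameters, and the two Gmail scopes are standard Google OAuth values, but the `build_sign_in_url` helper and the scope selection are my assumptions; the tutorial does not show how Hexabot generates the link.

```python
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

# Assumption: read and send access are enough for this assistant.
GMAIL_SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/gmail.send",
]


def build_sign_in_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the URL that the 'Sign in with Google' button points to."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # authorization code flow
        "scope": " ".join(GMAIL_SCOPES),
        "access_type": "offline",  # request a refresh token as well
        "state": state,            # ties the callback to the Discord user
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


def needs_sign_in(gmail_status: int) -> bool:
    """Mirror the workflow condition: 401 means the user must authenticate."""
    return gmail_status == 401
```

The `state` value is what lets the callback know which Discord conversation to resume after the user signs in.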
Step 5: Format the response for Discord
Once the Gmail action succeeds, the raw Gmail result still needs to be formatted.
For that, the workflow uses an AI Agent step called AI Formatter.
Its job is simple:
- Take the Gmail result
- Keep the useful information
- Format it as a Discord-friendly message
- Avoid unnecessary explanations
This is another good design choice.
The Gmail action handles the tool call.
The AI Formatter handles presentation.
The final message is sent back to Discord.
Again, each step has a clear responsibility.
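In the tutorial this step is an AI Agent, not code. Still, a deterministic stand-in helps show the kind of output the formatter is asked to produce. In the sketch below, the `format_emails` helper and the email fields `subject`, `from`, and `snippet` are assumptions about the Gmail result; the 2000-character cap reflects Discord's limit on message content.

```python
from typing import List

DISCORD_MAX_CHARS = 2000  # Discord rejects longer message content


def format_emails(emails: List[dict]) -> str:
    """Turn a list of email dicts into a compact Discord markdown message."""
    if not emails:
        return "No unread emails."

    lines = []
    for index, email in enumerate(emails, start=1):
        lines.append(f"**{index}. {email['subject']}**")  # bold subject line
        lines.append(f"From: {email['from']}")
        lines.append(f"> {email['snippet']}")              # block quote preview

    text = "\n".join(lines)
    # Keep the message within Discord's limit, marking the cut with an ellipsis.
    if len(text) > DISCORD_MAX_CHARS:
        text = text[: DISCORD_MAX_CHARS - 1] + "…"
    return text
```

Whichever way the formatting is done, enforcing the length limit in the workflow avoids a failed send when the inbox returns long emails.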
Step 6: Connect the workflow to Discord
After the workflow is ready, we connect it to the Discord channel.
In Hexabot, the Discord channel is configured with:
- The target workflow
- The Discord bot token
- The application ID
Once connected, messages sent to the Discord bot trigger the Gmail assistant workflow.
At that point, the assistant is ready to test directly inside Discord.
Demo: speaking with your inbox
In the demo, I test three real use cases.
First, I ask the assistant to access Gmail. Since the user is not authenticated yet, the workflow sends the Google sign-in button.
Then I test reading unread emails:
Show me the top two unread emails.
The assistant fetches the emails and returns a clean summary inside Discord.
Next, I test replying to an email:
Reply to this email saying: Thank you, well noted.
The assistant sends the reply and confirms the action.
Finally, I test sending a brand-new email.
The assistant prepares the email, sends it through Gmail, and returns the result in Discord.
At this point, the assistant is no longer just answering questions. It is taking real actions through Gmail.
Why this pattern matters
The interesting part of this tutorial is not only Gmail.
The real value is the workflow pattern:
Conversation → Intent Detection → Tool Calling → Authentication → Formatting → Response
You can reuse the same pattern for many other automations:
- CRM updates
- Support inbox triage
- Calendar scheduling
- Internal admin tools
- Lead qualification
- Notification workflows
- Reporting assistants
This is where AI automation becomes more practical.
Instead of building a fully autonomous agent and hoping it behaves correctly, you can build a workflow where the AI is used where it makes sense, while the business logic stays explicit and visible.
A note about custom actions
This tutorial focuses on building the workflow visually from the Hexabot editor.
It does not cover how to develop the Gmail custom action itself.
The source code for the project is linked in the YouTube description, and custom action development will be covered in a separate video.
So if your goal is to understand how to build the workflow, this tutorial is for you.
If your goal is to create new custom actions from scratch, stay tuned for the follow-up.
Security notes
Because this workflow connects to Gmail, security matters.
Before publishing or deploying something similar, make sure to:
- Never commit API keys or OAuth client secrets
- Use environment variables or secure credential storage
- Limit OAuth scopes to what your assistant actually needs
- Add a confirmation step before sending emails in production
- Log important actions for auditability
- Handle errors and expired sessions properly
AI agents are powerful, but production workflows need guardrails.
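A few of these guardrails are easy to sketch in code. The example below loads secrets from environment variables, flags the actions that should be confirmed, and writes an audit log entry. The variable names, the `load_secrets`, `needs_confirmation`, and `audit` helpers, and the log format are all hypothetical; adapt them to however your deployment stores credentials.

```python
import logging
import os
from typing import Mapping

# Assumption: these are the secrets the assistant needs at startup.
REQUIRED_ENV = ("DISCORD_BOT_TOKEN", "GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET")

# Actions that change something in the mailbox and deserve a confirmation.
CONFIRM_ACTIONS = ("reply", "new")

logger = logging.getLogger("gmail_assistant.audit")


def load_secrets(env: Mapping[str, str] = os.environ) -> dict:
    """Read secrets from the environment and fail fast if any is missing."""
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_ENV}


def needs_confirmation(action: str) -> bool:
    """Reading is safe to run directly; sending and replying are not."""
    return action in CONFIRM_ACTIONS


def audit(user_id: str, action: str, status: int) -> None:
    """Record who triggered which Gmail action and how it ended."""
    # Log identifiers and outcomes only, never email bodies or tokens.
    logger.info("gmail_action user=%s action=%s status=%s", user_id, action, status)
```

Failing at startup when a secret is missing is usually easier to diagnose than a 401 that only shows up in the middle of a conversation.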
Watch the full tutorial
You can watch the full step-by-step tutorial here:
https://www.youtube.com/watch?v=FKrzVK1fqK4
Useful links:
- Hexabot website: https://hexabot.ai
- Hexabot documentation: https://docs.hexabot.ai
- Hexabot GitHub: https://github.com/hexabot-ai/hexabot
- Video tutorial: https://www.youtube.com/watch?v=FKrzVK1fqK4
Final thoughts
Building an AI assistant is easy when everything is a prompt.
Building a useful AI assistant requires more structure.
This Discord Gmail assistant is a simple example, but it shows an important idea: AI agents become much more reliable when they are combined with workflows, conditions, typed outputs, authentication handling, and controlled tool execution.
That is the difference between a cool demo and something you can actually build on top of.
If you are building AI automations, try thinking less in terms of “one big agent” and more in terms of “controlled execution flow.”
That is where things get interesting.