DEV Community: Ayomide olofinsawe

How to Build an AI Dev Assistant with GitHub and Gmail APIs Using Nango

Ayomide olofinsawe — Mon, 04 May 2026 14:47:11 +0000

Introduction

Every morning as a developer, I was doing the same thing: opening GitHub to check notifications, switching to Gmail to scan for anything urgent, trying to mentally piece together what needed my attention and in what order. It was not a big problem, but it was a constant one. Small context switches that added up.

I wanted something simple: a tool that pulls all of that together, tells me what I have, flags what is urgent, and suggests what to tackle first. No dashboard to maintain, no browser tab to open. Just run it and get your day's context.

That is what I built: a command-line dev assistant that connects to GitHub and Gmail, fetches real data from both, and uses an LLM to return a plain-text summary with priorities and a suggested daily plan.

What I built: An AI Dev Assistant

Dev Assistant is a command-line tool I built to reduce the mental overhead of context switching. Instead of jumping between GitHub and Gmail trying to piece together what needs my attention, I run one command and get everything surfaced in order of priority with a plain-text summary of what to act on first.

The honest version: it runs locally, so I have to run it manually whenever I need context. But that is actually fine for how I use it. Before I start my day, before I pick up a new task, or when I want to clear my head, I run it, get my briefing, and get back to work. No background process, no notifications, no distraction. Just information on demand.

Here is what it does under the hood:

Fetches unread GitHub notifications (PRs, CI failures, review requests, mentions) paginated and filtered by a lookback window
Fetches Gmail inbox messages with sender, subject, date, and snippet
Scores both by priority using custom logic before anything touches the LLM
Passes the cleaned, prioritized data to Groq's llama-3.3-70b-versatile model
Returns a plain-text summary with urgent items flagged and a suggested order for the day

Stack: Node.js, TypeScript, Nango, OpenAI SDK pointed at Groq, dotenv.

What You’ll Get (Example Output)

Before we dive into the build, here’s what the assistant actually produces when you run it:

=============================
 ASSISTANT
=============================

QUICK SUMMARY
- 2 urgent GitHub items need immediate review, including a failed CI workflow on main
- 3 new Gmail messages require attention, including a security alert and an interview update

GITHUB (ACT ON FIRST)
- Review PR in your-repo — changes are blocking deployment and require approval
- Investigate failed CI workflow in your-repo — deployment pipeline is currently broken

GMAIL (ACT ON FIRST)
- Respond to security alert from Google — suspicious login attempt detected
- Reply to interview email — time-sensitive scheduling required

TODAY'S PLAN
- Start with GitHub blockers affecting deployment
- Handle urgent emails next
- Then move to lower-priority updates

This is the entire goal of the tool: one command, one clear plan for your day.

How It Works (Architecture Overview of What I Built)

Here’s a high-level view of how the system flows from data collection to output:

At a high level:

GitHub and Gmail act as data sources
Nango handles authentication and API access
The system processes and scores incoming data
Groq (LLM) converts it into a structured daily plan
The CLI outputs a clean, actionable summary

Why Nango

To build this, I needed to connect to two APIs that both use OAuth: GitHub and Gmail. I could have written the OAuth flow myself. It is doable. But token storage, refresh logic, scope management across two providers — that overhead adds up fast and it is not the interesting part of the build.

I had been looking at Nango already for a separate reason, so it was already on my radar. I decided to use it here. Once I set it up, the auth layer essentially disappeared. I connected both integrations through the Nango dashboard, got a providerConfigKey and connectionId for each, and from that point every API call looked like this:

const response = await nango.get<GmailMessageListResponse>({
  endpoint: '/gmail/v1/users/me/messages?maxResults=5',
  providerConfigKey: gmailProviderConfigKey,
  connectionId,
});

No token handling. No refresh logic. Nango injects the credentials, manages token expiry, and returns the response. The generic type parameter <GmailMessageListResponse> is just TypeScript telling the compiler what shape to expect back from response.data. Same pattern works for GitHub: different providerConfigKey, different connectionId, same method.

That consistency is what makes adding a third provider later a small task instead of a big one.

Prerequisites

Before you start, make sure you have the following:

Node.js v18 or higher (this project was built on v24)
A Nango account (free to sign up)
A Groq account (free API key available)
A GitHub account with notifications enabled
A Gmail account

Project Setup

Create the project folder and initialize it.

mkdir dev-assistant && cd dev-assistant
npm init -y

Install the required dependencies.

npm install @nangohq/node openai dotenv
npm install -D typescript ts-node @types/node

Create a tsconfig.json in the root of your project.

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "rootDir": "src",
    "outDir": "dist",
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "strict": true,
    "skipLibCheck": true
  }
}

Create the project structure.

mkdir src && touch src/index.ts src/github.ts src/gmail.ts src/summarize.ts src/types.ts src/utils.ts

Here is what each file is responsible for:

src/
  index.ts      — entry point, wires everything together
  github.ts     — GitHub types, priority scoring, fetch function
  gmail.ts      — Gmail types, priority scoring, fetch function
  summarize.ts  — Groq call, prompt, input preparation
  types.ts      — shared types: DigestState, DigestDelta
  utils.ts      — requireEnv, clip, formatAssistantResponse, printSection

Create a .env file in the root of your project.

NANGO_SECRET_KEY=
GROQ_API_KEY=
NANGO_GITHUB_CONNECTION_ID=
NANGO_GMAIL_CONNECTION_ID=
NANGO_GMAIL_PROVIDER_CONFIG_KEY=
NANGO_GITHUB_PROVIDER_CONFIG_KEY=
DEBUG=false
GITHUB_NOTIFICATIONS_LOOKBACK_DAYS=30

You will fill these in as you set up Nango and Groq in the next section. Make sure to add .env to your .gitignore so you do not accidentally commit your API keys.

Connecting GitHub and Gmail via Nango

Before writing any code, you need to set up your integrations on the Nango dashboard and get a test connection for each. This is where your providerConfigKey, connectionId, and NANGO_SECRET_KEY come from.

Setting up Nango

Go to app.nango.dev and sign up for a free account.
Head to Environment Settings and copy your NANGO_SECRET_KEY. Add it to your .env file.

Setting up the GitHub integration

In the sidebar click Integrations, then Add New Integration. Search for GitHub and select GitHub OAuth.
Click Custom Developer App. By default Nango provides a test app you can use to get started quickly, but for this build we are using a custom developer app.
Go to github.com/settings/developers and click New OAuth App.

* Fill in the application name. Anything works, for example `dev-assistant`
* Set the **Homepage URL** to `http://localhost:3000`
* Set the **Authorization callback URL** to `https://api.nango.dev/oauth/callback`. Do not change this value

Click Register application. On the next page, copy your Client ID.
Click Generate a new client secret and copy the secret immediately. GitHub only shows it once.
Back in the Nango dashboard, paste your client ID and client secret into the custom developer app fields. For scopes, add:
```
notifications read:user
```
Give your integration an ID. This becomes your NANGO_GITHUB_PROVIDER_CONFIG_KEY. Something like github-dev-assistant works. Add it to your .env.
Go to the Connections tab and click Add Test Connection. Select your GitHub integration, click Authorize, and log in with your GitHub account.
The ID you assigned to that connection is your NANGO_GITHUB_CONNECTION_ID. Add it to your .env.

Setting up the Gmail integration

Gmail requires a Google Cloud project with the Gmail API enabled and OAuth credentials configured.

Go to console.cloud.google.com and sign in. Click the project selector at the top of the page, select New Project, give it a name (for example dev-assistant) and click Create.
With your new project selected, go to APIs & Services → Library. Search for Gmail API, click on it, and click Enable.
Go to APIs & Services → OAuth consent screen. Select External as the user type and click Create. Fill in the required fields: app name, support email, and developer contact email. Click Save and Continue through the remaining steps.
Go to the Audience tab, scroll to Test users, click Add users, and add the Gmail address you want to use with the tool.

Note: Your app will be in Testing mode by default. Only users you explicitly add as test users can authorize the app. If you skip this step, the authorization will fail when you try to connect in Nango.
Go to Clients in the left sidebar and click Create Client. Select Web application as the Application type and give it a name.
Under Authorized redirect URIs, click Add URI and enter:
```
https://api.nango.dev/oauth/callback
```
Click Create. Google will display your Client ID and Client Secret. Copy both immediately. Google only shows the client secret once.
Back in the Nango dashboard, go to Integrations → Add New Integration → Google Mail → Custom Developer App. Paste your client ID and client secret. For the scope add:
```
https://www.googleapis.com/auth/gmail.readonly
```
Give the integration an ID. This becomes your NANGO_GMAIL_PROVIDER_CONFIG_KEY. Add it to your .env.
Go to Connections → Add Test Connection, select your Gmail integration, and authorize with the Gmail account you added as a test user. The connection ID you set becomes your NANGO_GMAIL_CONNECTION_ID. Add it to your .env.

Getting your Groq API key

Go to console.groq.com and sign in. In the left sidebar click API Keys.
If you already have a key, copy it and add it to your .env. If not, click Create API Key, give it a name (for example dev-assistant) and click Submit. Copy the key immediately. Groq only shows it once.

Add the key to GROQ_API_KEY in your .env. Your .env file should now look like this:

NANGO_SECRET_KEY=your_secret_key
GROQ_API_KEY=your_groq_api_key
NANGO_GITHUB_CONNECTION_ID=your_github_connection_id
NANGO_GMAIL_CONNECTION_ID=your_gmail_connection_id
NANGO_GMAIL_PROVIDER_CONFIG_KEY=your_gmail_integration_id
NANGO_GITHUB_PROVIDER_CONFIG_KEY=your_github_integration_id
DEBUG=false
GITHUB_NOTIFICATIONS_LOOKBACK_DAYS=30

Fetching Data from the GitHub API

Create src/github.ts. This file handles everything GitHub related: the types, the priority scoring logic, and the fetch function.

Add the imports and types at the top of github.ts.

import { Nango } from '@nangohq/node';
import { requireEnv } from './utils';

export type GithubNotification = {
  id: string;
  unread: boolean;
  reason: string;
  updated_at: string;
  last_read_at?: string;
  subject?: {
    title?: string;
    url?: string | null;
    latest_comment_url?: string | null;
    type?: string;
  };
  repository?: {
    full_name?: string;
    html_url?: string;
  };
};

export type CleanGithubNotification = {
  id: string;
  unread: boolean;
  reason: string;
  updatedAt: string;
  lastReadAt: string | null;
  title: string;
  subjectType: string;
  repository: string;
  url: string | null;
  priority: number;
};

GithubNotification maps directly to what GitHub's API returns. CleanGithubNotification is the flattened version we actually work with. It has sensible defaults for missing fields, and a priority score we calculate ourselves before anything goes to Groq.

Add the priority scoring function below the types.

export function getGithubNotificationPriority(notification: GithubNotification): number {
  const reasonPriority: Record<string, number> = {
    review_requested: 100,
    mention: 95,
    author: 90,
    comment: 85,
    ci_activity: 80,
    state_change: 70,
    assign: 65,
    subscribed: 40,
    manual: 30,
    security_alert: 100,
  };

  const basePriority = reasonPriority[notification.reason] || 50;
  const unreadBoost = notification.unread ? 10 : 0;
  const subjectTypeBoost = notification.subject?.type === 'PullRequest' ? 5 : 0;

  const title = (notification.subject?.title || '').toLowerCase();
  const titleKeywordBoost = /failed|security|vulnerability|incident|urgent/.test(title)
    ? 15
    : 0;

  const securityReasonBoost = notification.reason === 'security_alert' ? 20 : 0;

  return (
    basePriority +
    unreadBoost +
    subjectTypeBoost +
    titleKeywordBoost +
    securityReasonBoost
  );
}

The score starts with a base value tied to the notification reason. GitHub tells you why you were notified, and that reason carries a lot of signal. A review_requested scores 100 because someone is actively waiting on you. A subscribed notification scores 40 because you opted in but nothing is demanding your attention.

Four boosts can push a notification higher:

* **Unread** adds 10. If you haven't seen it yet, it ranks higher
* **PullRequest** subject type adds 5. PRs tend to be more time-sensitive than issues
* **Title keywords** like `failed`, `security`, or `urgent` add 15. The title is a strong signal
* **Security alerts** get an extra 20 on top of their already high base score

Add the main fetch function.

export async function getGithubNotifications(
  nango: Nango,
  githubProviderConfigKey: string,
  githubNotificationsLookbackDays: number,
  githubNotificationsPerPage: number,
  githubNotificationsMaxPages: number
) {
  const connectionId = requireEnv('NANGO_GITHUB_CONNECTION_ID');
  const since = new Date(
    Date.now() - githubNotificationsLookbackDays * 24 * 60 * 60 * 1000
  ).toISOString();
  const notifications: GithubNotification[] = [];

  for (let page = 1; page <= githubNotificationsMaxPages; page += 1) {
    const response = await nango.get<GithubNotification[]>({
      endpoint: `/notifications?all=true&participating=false&since=${encodeURIComponent(
        since
      )}&per_page=${githubNotificationsPerPage}&page=${page}`,
      providerConfigKey: githubProviderConfigKey,
      connectionId,
    });

    const pageItems = response.data || [];
    notifications.push(...pageItems);

    if (pageItems.length < githubNotificationsPerPage) {
      break;
    }
  }

  const cleanedNotifications: CleanGithubNotification[] = notifications
    .map((notification) => ({
      id: notification.id,
      unread: notification.unread,
      reason: notification.reason,
      updatedAt: notification.updated_at,
      lastReadAt: notification.last_read_at || null,
      title: notification.subject?.title || '(No title)',
      subjectType: notification.subject?.type || 'Unknown',
      repository: notification.repository?.full_name || 'Unknown repository',
      url:
        notification.subject?.url ||
        notification.subject?.latest_comment_url ||
        notification.repository?.html_url ||
        null,
      priority: getGithubNotificationPriority(notification),
    }))
    .sort((left, right) => {
      if (right.priority !== left.priority) {
        return right.priority - left.priority;
      }
      return (
        new Date(right.updatedAt).getTime() - new Date(left.updatedAt).getTime()
      );
    });

  return {
    lookbackDays: githubNotificationsLookbackDays,
    totalCount: cleanedNotifications.length,
    unreadCount: cleanedNotifications.filter((n) => n.unread).length,
    urgentCount: cleanedNotifications.filter((n) => n.priority >= 80).length,
    notifications: cleanedNotifications,
  };
}

A few things worth noting here:

* **The Nango call** is the simplest part. One `nango.get()` with the endpoint, `providerConfigKey`, and `connectionId`. Nango handles the token. You get data back.
* **The lookback window** filters notifications to the last 30 days by default, controlled by `GITHUB_NOTIFICATIONS_LOOKBACK_DAYS` in your `.env`. Without this, GitHub returns everything going back potentially months — stale context you don't need the LLM reasoning about.
* **The pagination loop** fetches up to 5 pages of 50 notifications each. If a page returns fewer items than the page size, we've reached the end and break early. This prevents silently dropping notifications beyond the first page, which would defeat the whole point of the tool.
* **The sort** orders by priority first, then recency as a tiebreaker. The highest-urgency, most recent notifications surface at the top.

Fetching Data from the Gmail API

Create src/gmail.ts. Same pattern as github.ts: types, priority scoring, fetch function.

Add the imports and types at the top of gmail.ts.

import { Nango } from '@nangohq/node';
import { requireEnv } from './utils';

export type GmailMessageListResponse = {
  messages?: Array<{
    id: string;
    threadId: string;
  }>;
};

export type GmailMessageDetailResponse = {
  id: string;
  threadId: string;
  snippet?: string;
  internalDate?: string;
  labelIds?: string[];
  payload?: {
    headers?: Array<{
      name?: string;
      value?: string;
    }>;
  };
};

export type CleanGmailMessage = {
  id: string;
  threadId: string;
  from: string;
  subject: string;
  date: string | null;
  snippet: string;
  labelIds: string[];
  priority: number;
};

Two response types here instead of one. GmailMessageListResponse handles the initial list of message IDs, and GmailMessageDetailResponse handles the full message data. That split exists because Gmail's API works in two steps, which we get to in the fetch function.

Add the priority scoring function.

export function getGmailMessagePriority(message: {
  subject: string;
  snippet: string;
  labelIds: string[];
  from: string;
}): number {
  const labels = new Set(message.labelIds.map((label) => label.toUpperCase()));
  let score = 20;

  if (labels.has('UNREAD')) score += 10;
  if (labels.has('IMPORTANT')) score += 15;

  const text = `${message.subject} ${message.snippet}`.toLowerCase();
  const from = message.from.toLowerCase();

  if (/login|password|security|verify|verification|suspicious|alert/.test(text)) score += 35;
  if (/failed|down|error|incident/.test(text)) score += 25;
  if (/deadline|interview|offer|application|action required/.test(text)) score += 20;
  if (/invoice|payment|receipt|due/.test(text)) score += 18;
  if (/digest|newsletter|promotions|weekly|updates/.test(text)) score -= 10;
  if (/no-reply|noreply/.test(from)) score -= 5;

  return score;
}

Gmail doesn't have a reason field like GitHub does, so the scoring relies on signals from the message itself: labels, subject line, snippet, and sender. Security and authentication keywords score highest. Newsletters and no-reply senders get penalized because they rarely need action. Gmail's own IMPORTANT label adds weight. It's not perfect but it's a useful signal.

Add the main fetch function.

export async function getGmailMessages(
  nango: Nango,
  gmailProviderConfigKey: string
) {
  const connectionId = requireEnv('NANGO_GMAIL_CONNECTION_ID');

  // Step 1: Get a list of message IDs
  const response = await nango.get<GmailMessageListResponse>({
    endpoint: '/gmail/v1/users/me/messages?maxResults=5',
    providerConfigKey: gmailProviderConfigKey,
    connectionId,
  });

  const messages = response.data.messages || [];

  // Step 2: Fetch details for each message
  const detailedMessages: CleanGmailMessage[] = await Promise.all(
    messages.map(async ({ id }) => {
      const detailResponse = await nango.get<GmailMessageDetailResponse>({
        endpoint: `/gmail/v1/users/me/messages/${id}?format=metadata&metadataHeaders=From&metadataHeaders=Subject&metadataHeaders=Date`,
        providerConfigKey: gmailProviderConfigKey,
        connectionId,
      });

      const headers = detailResponse.data.payload?.headers || [];
      const getHeader = (name: string) =>
        headers.find(
          (header) => header.name?.toLowerCase() === name.toLowerCase()
        )?.value;

      const labelIds = detailResponse.data.labelIds || [];
      const subject = getHeader('Subject') || '(No subject)';
      const snippet = detailResponse.data.snippet || '';
      const from = getHeader('From') || 'Unknown sender';

      return {
        id: detailResponse.data.id,
        threadId: detailResponse.data.threadId,
        from,
        subject,
        date: getHeader('Date') || null,
        snippet,
        labelIds,
        priority: getGmailMessagePriority({ subject, snippet, labelIds, from }),
      };
    })
  );

  const sortedMessages = detailedMessages.sort((left, right) => {
    if (right.priority !== left.priority) {
      return right.priority - left.priority;
    }
    const rightDate = right.date ? new Date(right.date).getTime() : 0;
    const leftDate = left.date ? new Date(left.date).getTime() : 0;
    return rightDate - leftDate;
  });

  return {
    resultSizeEstimate: response.data.messages?.length || 0,
    unreadCount: sortedMessages.filter((m) =>
      m.labelIds.map((l) => l.toUpperCase()).includes('UNREAD')
    ).length,
    urgentCount: sortedMessages.filter((m) => m.priority >= 55).length,
    messages: sortedMessages,
  };
}

The Gmail fetch works in two round trips by design. The list endpoint returns message IDs only, with no subject, no sender, and no content. To get the actual message data you need a second request per message. This is Gmail's API design, not a Nango limitation. We use Promise.all to run all the detail fetches in parallel so it stays fast.

The format=metadata parameter tells Gmail to return only headers rather than the full message body. We only need sender, subject, date, and snippet. Pulling the full body would be wasteful and would hit token limits faster when passing data to Groq.

Nango handles auth for both calls, the list fetch and every detail fetch, using the same providerConfigKey and connectionId. You write the same nango.get() pattern twice and Nango takes care of the rest.

Summarizing Data with an LLM (Groq + OpenAI SDK)

Create src/summarize.ts. This file handles three things: preparing the data before it goes to the LLM, making the Groq API call, and defining the prompt that shapes the output.

Add the input preparation function.

import OpenAI from 'openai';
import { DigestDelta } from './types';
import { clip } from './utils';

export function prepareAssistantInput(data: {
  notifications: unknown;
  emails: unknown;
  digestDelta: DigestDelta;
}): string {
  const notifications = clip(JSON.stringify(data.notifications, null, 2), 10000);
  const emails = clip(JSON.stringify(data.emails, null, 2), 6000);
  const digest = JSON.stringify(data.digestDelta, null, 2);

  return [
    'GitHub notifications data:',
    notifications,
    '',
    'Gmail messages data:',
    emails,
    '',
    'Digest delta (new items since previous run):',
    digest,
  ].join('\n');
}

Before anything goes to Groq, prepareAssistantInput serializes the GitHub and Gmail data into a single string. The clip() utility truncates each block if it exceeds a character limit: 10,000 for GitHub notifications and 6,000 for Gmail messages. Without this, large inboxes or notification backlogs could push the input past the model's context window and cause the request to fail.

Add the Groq call function.

export async function askAssistant(groq: OpenAI, data: string, question: string) {
  const response = await groq.chat.completions.create({
    model: 'llama-3.3-70b-versatile',
    messages: [
      {
        role: 'system',
        content:
          'You are a friendly and sharp personal developer assistant. Analyze both the GitHub notifications and Gmail data provided. Respond in plain text only, no markdown. Use this exact structure and headings: QUICK SUMMARY, GITHUB (ACT ON FIRST), GITHUB (CAN WAIT), GMAIL (ACT ON FIRST), GMAIL (CAN WAIT), TODAY\'S PLAN. Put each item on its own line starting with "- ". Keep each bullet specific and concrete (about 12 to 28 words), mentioning exact repo names, PR/workflow titles, senders, and subjects where relevant. Avoid generic wording like "check this" or "review that". Use clear action language and include why each urgent item matters now. Keep the tone warm, practical, and supportive without sounding robotic. Prioritize items flagged as new since last run and never repeat the same item in multiple sections.',
      },
      {
        role: 'user',
        content: `${data}\n\nQuestion: ${question}`,
      },
    ],
  });

  return response.choices[0].message.content;
}

A few deliberate decisions here worth explaining:

* **Plain text only, no markdown.** The output is printed directly to the terminal. Markdown formatting like asterisks and hashes renders as literal characters in a CLI context. Telling the model to avoid markdown keeps the output clean.
* **A fixed structure.** The system prompt defines six required sections that appear in the same order every run. Without this, LLMs tend to produce free-form responses that vary in structure. A fixed structure makes the output predictable and easy to scan.
* **Specific and concrete bullets.** The prompt discourages vague language and asks the model to mention exact repo names, PR titles, senders, and subjects. Vague summaries are not useful when you are trying to decide what to do next.
* **Groq via OpenAI SDK.** The `groq` client is an OpenAI instance pointed at Groq's base URL. From the SDK's perspective nothing changes. Same method, same response shape. The model string `llama-3.3-70b-versatile` is the only Groq-specific detail.

Wiring It Together

This section covers the remaining three files: types.ts, utils.ts, and index.ts.

Add the shared types to src/types.ts.

export type DigestState = {
  lastRunAt: string;
  githubNotificationIds: string[];
  gmailMessageIds: string[];
};

export type DigestDelta = {
  hasPreviousRun: boolean;
  previousRunAt: string | null;
  newGithubNotifications: number;
  newGmailMessages: number;
  newGithubIds: string[];
  newGmailIds: string[];
};

DigestState is what gets saved to disk after each run: a timestamp and the IDs of everything that was seen. DigestDelta is what gets computed at runtime by comparing the current fetch against the previous state.

Add the shared utilities to src/utils.ts.

import * as dotenv from 'dotenv';

dotenv.config();

export function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

export function clip(value: string, maxLength: number): string {
  if (value.length <= maxLength) {
    return value;
  }
  return `${value.slice(0, maxLength)}\n...truncated...`;
}

export function printSection(title: string, content: string) {
  console.log('\n=============================');
  console.log(` ${title}`);
  console.log('=============================');
  console.log(content);
}

* `requireEnv()` throws immediately at startup if a required environment variable is missing. You find out before any API calls are made, not halfway through a fetch.
* `clip()` truncates a string to a maximum length and appends a truncation notice. Used in `summarize.ts` to keep the LLM input within safe bounds.
* `printSection()` formats terminal output with a consistent header style. Every section of the CLI output goes through this.

Add the entry point to src/index.ts.

import { Nango } from '@nangohq/node';
import OpenAI from 'openai';
import { readFile, writeFile } from 'fs/promises';
import { join } from 'path';

import { requireEnv, formatAssistantResponse, printSection, getSafeErrorMessage } from './utils';
import { getGithubNotifications, CleanGithubNotification } from './github';
import { getGmailMessages, CleanGmailMessage } from './gmail';
import { prepareAssistantInput, askAssistant } from './summarize';
import { DigestState, DigestDelta } from './types';

const nango = new Nango({ secretKey: requireEnv('NANGO_SECRET_KEY') });

const groq = new OpenAI({
  apiKey: requireEnv('GROQ_API_KEY'),
  baseURL: 'https://api.groq.com/openai/v1',
});

const githubProviderConfigKey = requireEnv('NANGO_GITHUB_PROVIDER_CONFIG_KEY');
const gmailProviderConfigKey = requireEnv('NANGO_GMAIL_PROVIDER_CONFIG_KEY');
const githubNotificationsLookbackDays = Number(
  process.env.GITHUB_NOTIFICATIONS_LOOKBACK_DAYS || '30'
);
const githubNotificationsPerPage = 50;
const githubNotificationsMaxPages = 5;
const digestStateFilePath = join(process.cwd(), '.digest-state.json');

The top of index.ts initializes the two clients (nango and groq) and reads all config values from the environment. Nothing runs yet. This is just setup.

Add the digest state functions below the config.

async function loadDigestState(): Promise<DigestState | null> {
  try {
    const raw = await readFile(digestStateFilePath, 'utf-8');
    const parsed = JSON.parse(raw) as Partial<DigestState>;

    if (
      typeof parsed.lastRunAt === 'string' &&
      Array.isArray(parsed.githubNotificationIds) &&
      Array.isArray(parsed.gmailMessageIds)
    ) {
      return {
        lastRunAt: parsed.lastRunAt,
        githubNotificationIds: parsed.githubNotificationIds,
        gmailMessageIds: parsed.gmailMessageIds,
      };
    }

    return null;
  } catch (error) {
    const maybeNodeError = error as NodeJS.ErrnoException;
    if (maybeNodeError.code === 'ENOENT') return null;
    throw error;
  }
}

async function saveDigestState(state: DigestState): Promise<void> {
  await writeFile(digestStateFilePath, JSON.stringify(state, null, 2), 'utf-8');
}

function getDigestDelta(
  previousState: DigestState | null,
  githubNotifications: CleanGithubNotification[],
  gmailMessages: CleanGmailMessage[]
): DigestDelta {
  if (!previousState) {
    return {
      hasPreviousRun: false,
      previousRunAt: null,
      newGithubNotifications: githubNotifications.length,
      newGmailMessages: gmailMessages.length,
      newGithubIds: githubNotifications.map((item) => item.id),
      newGmailIds: gmailMessages.map((item) => item.id),
    };
  }

  const previousGithub = new Set(previousState.githubNotificationIds);
  const previousGmail = new Set(previousState.gmailMessageIds);

  const newGithubIds = githubNotifications
    .map((item) => item.id)
    .filter((id) => !previousGithub.has(id));
  const newGmailIds = gmailMessages
    .map((item) => item.id)
    .filter((id) => !previousGmail.has(id));

  return {
    hasPreviousRun: true,
    previousRunAt: previousState.lastRunAt,
    newGithubNotifications: newGithubIds.length,
    newGmailMessages: newGmailIds.length,
    newGithubIds,
    newGmailIds,
  };
}

loadDigestState reads the previous run's state from .digest-state.json. If the file doesn't exist yet it returns null gracefully. saveDigestState writes the current run's state to disk after everything completes. getDigestDelta compares the two to figure out what is new since the last run.

Add the main function.

async function main() {
  const previousDigestState = await loadDigestState();

  console.log('Fetching GitHub notifications...');
  const notifications = await getGithubNotifications(
    nango,
    githubProviderConfigKey,
    githubNotificationsLookbackDays,
    githubNotificationsPerPage,
    githubNotificationsMaxPages
  );

  console.log('Fetching Gmail messages...');
  const emails = await getGmailMessages(nango, gmailProviderConfigKey);

  const digestDelta = getDigestDelta(
    previousDigestState,
    notifications.notifications,
    emails.messages
  );

  const combinedData = prepareAssistantInput({ notifications, emails, digestDelta });

  const answer = await askAssistant(
    groq,
    combinedData,
    'Give me a clear and friendly update. Prioritize what is new since my previous run, explain what needs attention first and why it matters, then give a short plan for today.'
  );

  if (process.env.DEBUG === 'true') {
    printSection('GITHUB NOTIFICATIONS', JSON.stringify(notifications, null, 2));
    printSection('GMAIL MESSAGES', JSON.stringify(emails, null, 2));
  }

  const formattedAnswer = formatAssistantResponse(answer ?? '');
  printSection('ASSISTANT', formattedAnswer);

  await saveDigestState({
    lastRunAt: new Date().toISOString(),
    githubNotificationIds: notifications.notifications.map((item) => item.id),
    gmailMessageIds: emails.messages.map((item) => item.id),
  });
}

main().catch((error) => {
  console.error(getSafeErrorMessage(error));
  process.exitCode = 1;
});

The flow is linear and easy to follow:

Load the previous digest state from disk
Fetch GitHub notifications and Gmail messages
Compute what is new since the last run
Prepare and send everything to Groq
Print the formatted summary
Save the current state to disk for next time

Run the tool.
```
npx ts-node src/index.ts
```
On the first run there is no previous state so everything is treated as new. On subsequent runs the tool compares against the saved state and the LLM focuses on what has changed since you last checked.

Sample Output

Here is what the tool prints when you run it:

Fetching GitHub notifications...
Fetching Gmail messages...

=============================
 DIGEST SNAPSHOT
=============================
GH  total=3 unread=1 urgent=1 new=0
MAIL total=5 unread=5 urgent=2 new=5
Compared with previous run at 2026-04-16T21:31:13.959Z

Top GitHub now:
- [GH 1] your-username/your-repo | ci_activity | Deploy workflow run failed for main branch
- [GH 2] your-username/your-repo | state_change | Add new feature to portfolio section

Top Gmail now:
- [MAIL 1] Learning Platform <hello@platform.com> | Course ready: Foundations of Cybersecurity
- [MAIL 2] Financial Service <updates@finance.com> | Market Update: Average Yield Falls 3bps

=============================
 ASSISTANT
=============================
QUICK SUMMARY
- There are no new GitHub notifications since the last run, but there are unread items to review
- There are 5 new Gmail messages, with 2 marked as urgent, requiring attention

GITHUB (ACT ON FIRST)
- Review the unread PR in your-username/your-repo — a teammate is waiting and it has been open
  since yesterday

GITHUB (CAN WAIT)
- Check the failed Deploy workflow run on main branch in your-username/your-repo to prevent
  future failures
- Look at the state change notification in your-username/your-repo for potential updates

GMAIL (ACT ON FIRST)
- Respond to the course email from Learning Platform — it contains a time-sensitive offer
- Read the market update from Financial Service — it contains important information requiring
  a decision

GMAIL (CAN WAIT)
- Browse the design inspiration email from your newsletter for later
- Read the founder strategy article from The AI Journal for learning
- Review the weekly stock recommendation for investment insights

TODAY'S PLAN
- First, review the unread GitHub item and respond to the two urgent Gmail messages
- Then prioritize preventing the workflow failure on main — a broken pipeline blocks future work
- Finally, allocate time to the non-urgent Gmail messages for learning and market awareness

A few things to notice in this output:

The digest snapshot comes first. Before the LLM summary, you get a quick count of total, unread, urgent, and new items since the last run. This gives you the shape of your day in seconds without reading anything else.
"New since last run" is the key signal. In this run, GitHub shows new=0, meaning nothing has changed since the previous run. Gmail shows new=5, meaning five emails arrived since you last checked. The LLM picks this up and leads with the Gmail items.
The structure is consistent every run. Six sections, always in the same order. You know where to look without reading everything else.
The bullets are specific. The LLM mentions exact repositories, workflow names, senders, and subjects rather than generic phrases. That specificity comes directly from the prompt instructions.

Bonus: What Else Is In the Code

The core of this build is the Nango integration, the priority scoring, and the Groq summarization. But there are three additional pieces in the codebase worth knowing about.

The digest state system

After every run, the tool saves a .digest-state.json file to the project root:

{
  "lastRunAt": "2026-04-16T21:31:13.959Z",
  "githubNotificationIds": ["abc123", "def456"],
  "gmailMessageIds": ["msg001", "msg002", "msg003"]
}

On the next run it loads this file, compares the current fetch against the saved IDs, and flags anything new. The system prompt explicitly tells the LLM to prioritize those new items, which is why the summary leads with Gmail when five new messages arrive but no new GitHub notifications.

Debug mode

Setting DEBUG=true in your .env dumps the full raw JSON from both fetches before the assistant summary. Useful when the LLM output looks off and you want to see exactly what data it received.

Output formatting

The formatAssistantResponse() function in utils.ts normalizes the LLM output and deduplicates bullets. LLMs sometimes return the same item in multiple sections despite the prompt telling them not to. The function tracks seen bullet content and silently drops duplicates. It also wraps long lines at 96 characters so the output stays readable regardless of terminal width.

What I Learned

Nango is genuinely easy to set up. I came into this build with a specific reason for using Nango that had nothing to do with evaluating it as a tool. But once I was inside it, the setup surprised me. All I needed was my client ID and client secret for each integration, and the dashboard walked me through the rest. No custom OAuth logic, no token management code, no refresh handling. It was smooth in a way I did not expect.

Prompt engineering is where the real work is. The Nango integration and the API fetching came together relatively quickly. The part that took the most iteration was finetuning the prompt and shaping the data before it reached the LLM. Getting the model to be specific rather than generic, to lead with what was new, to avoid repeating items across sections — that required deliberate prompt design and several passes at the input structure. If you are building something similar, budget more time for this than you think you need.

The tool taught me what I actually want next. Building something you use yourself is a good way to find out what is missing. A few things became clear while using it:

The priority scoring is static. It does not learn from what I actually pay attention to. A smarter version would be more sensitive to context and avoid surfacing things I have already dismissed
Being able to perform actions from the same place, like replying to an email or marking a notification as read, would close the loop the tool currently leaves open
A do-not-disturb mode where you can mute certain types of notifications for a set period would make it more respectful of focus time
Running inside the IDE rather than a separate terminal would fit the developer workflow better
Real-time data fetching rather than on-demand runs would make it more useful throughout the day

None of these are blockers for the current version. But they are the natural next layer for anyone who wants to take this further.

What's Next

This version is a working CLI tool that fetches real data, scores it, and returns a useful summary. But there are clear directions it could grow in.

A UI layer. The CLI output works but it is text in a terminal. A simple web UI would let you see your GitHub notifications and Gmail messages as cards, with action buttons attached: mark as read, archive, flag for follow-up. That closes the loop the current version leaves open.

Action execution via Nango. The same nango.get() pattern that fetches data works in reverse for writing. nango.patch() and nango.post() can mark GitHub notifications as read, archive Gmail messages, or reply to threads, all without touching the OAuth layer. Adding actions is a natural extension of what is already there.

More integrations. Nango supports 700+ APIs. Adding Slack, Linear, or Jira would follow the same pattern as GitHub and Gmail: a new provider config key, a new connection ID, and a fetch function that fits into the existing flow. The architecture is already set up for it.

An IDE extension. Developers live in their editors. A VS Code or Cursor extension that surfaces the same digest inside the IDE without switching context would be a better fit for the workflow this tool is trying to support.

A do-not-disturb mode. Sometimes you need to focus without any interruptions. A configurable mute window, similar to DND on your phone, would let you suppress certain notification types for a set period. The tool should be useful without being another source of noise.

The full source is available at github.com/techsplot/dev-assistant.

Your AI has no memory. Here's How to Add One with Node.js and Mem0

Ayomide olofinsawe — Thu, 19 Mar 2026 22:01:59 +0000

Every chat interface you've ever used feels like the AI remembers you. It doesn't. There's no memory, no session, no awareness of previous conversations. What you're seeing is your application feeding the entire conversation history back into every request. The model just processes whatever is in front of it and forgets everything the moment it responds.

That approach works, but it has real ceilings. The message array grows with every exchange, so longer conversations mean larger payloads and higher token costs. You'll hit context window limits eventually. And across sessions, everything is lost start a new conversation, and the user has to repeat themselves from scratch. There's also no intelligence to it: a throwaway message like "ok thanks" carries the same weight as "I'm building a fintech app in Node.js, and I hate ORMs."

What you actually want is a system that extracts the facts that matter, stores them, and retrieves only the relevant ones when needed without stuffing the entire conversation history into every request.

That's what Mem0 does. It sits between your application and your LLM as a dedicated memory layer: automatically pulling meaningful facts out of conversations, persisting them across sessions, and injecting the right context into future requests.

In this tutorial, we'll build a memory-aware REST API from scratch using:

Express and TypeScript for the API layer
Groq as the LLM provider (fast inference, generous free tier)
Mem0 as the memory layer that ties it all together By the end, you'll have a working API where a user can send a message, have relevant facts automatically extracted and stored, and receive responses informed by everything the system has learned about them across previous conversations.

Let's get into it.

What is Mem0 and How Does It Work

Mem0 is a memory layer for AI applications. The core idea is simple: instead of your application managing conversation history and replaying it on every request, Mem0 handles the memory side of things for you extracting what matters, storing it, and making it available when it's relevant.

How it differs from storing chat history

When you store chat history, you're storing everything every message, in order, as it happened. Nothing is filtered, nothing is prioritized, and the whole thing gets sent back to the model on the next request regardless of whether any of it is actually useful.
Mem0 works differently. After each exchange, it processes the conversation and pulls out meaningful facts things like user preferences, stated goals, technical context, or any detail worth remembering long term. Those facts are stored as discrete memory objects, not as raw messages. When a new request comes in, Mem0 retrieves only the memories that are relevant to it and surfaces them to your LLM as context.
The practical difference is significant. Instead of a growing message array that the model has to wade through, the model gets a compact, relevant summary of what it actually needs to know about the user.

Async processing and automatic fact extraction

Memory extraction in Mem0 happens asynchronously it doesn't block your API response. After your LLM replies, Mem0 processes the conversation in the background, extracts facts, and updates the memory store. It also handles deduplication, so if a user mentions the same detail across multiple conversations, Mem0 won't create redundant memory entries it updates the existing one instead.
This means your API stays fast, and the memory store stays clean over time without any extra work on your end.

Project Setup

In this section, we'll get the project initialized, install everything we need, and configure our environment variables. By the end of this section you should have a working foundation to start building on.

Prerequisites

Before getting started, make sure you have the following:

Node.js v18 or higher
A Groq account for the LLM API key
A Mem0 account for the memory layer API key

Installing dependencies

First, create your project folder and initialize it:

mkdir mem0-express-api && cd mem0-express-api
npm init -y

Then install the runtime dependencies:

npm install express mem0ai openai dotenv

And the dev dependencies:

npm install -D typescript ts-node nodemon @types/express @types/node

Finally, initialize TypeScript:

npx tsc --init

A quick note on the openai package: even though we're using Groq as our LLM provider, Groq's API is OpenAI-compatible, so the openai SDK works against it directly you just point it at Groq's base URL.

Add your scripts

Update your package.json scripts section to look like this:

"scripts": {
  "dev": "nodemon --exec ts-node src/index.ts",
  "build": "tsc",
  "start": "node dist/index.js"
}

Environment variables

Create a .env file at the root of your project:

GROQ_API_KEY=your_groq_api_key
MEM0_API_KEY=your_mem0_api_key
PORT=3000

You can grab your Groq API key from the Groq console and your Mem0 API key from the Mem0 dashboard. Never commit this file add it to your .gitignore.

Building the API

With the project set up, let's walk through the actual code. Here's how everything is structured:

mem0-express-api/
├── src/
│   ├── lib/
│   │   └── mem0.ts
│   ├── routes/
│   │   └── chat.ts
│   └── index.ts
├── .env
├── .gitignore
├── package.json
└── tsconfig.json

Go ahead and create the folders and files to match this structure before we start filling them in.

Setting up the entry point

src/index.ts is where the Express app is initialized. It loads environment variables, registers the chat router under the /chat path, and starts the server:

import "dotenv/config";
import express from "express";
import chatRouter from "./routes/chat";

const app = express();
const PORT = process.env.PORT || 3000;

app.use(express.json());
app.use("/chat", chatRouter);

app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
});

Initializing the Mem0 client

src/lib/mem0.ts initializes the Mem0 client once and exports it for use across the app:

import { MemoryClient } from "mem0ai";

const client = new MemoryClient({
  apiKey: process.env.MEM0_API_KEY!,
});

export default client;

Keeping this in its own file means you initialize the client once and import it wherever you need it, rather than recreating it on every request.

The chat routes

All the route logic lives in src/routes/chat.ts. Here's the full file:

import { Router, Request, Response } from "express";
import OpenAI from "openai";
import mem0 from "../lib/mem0";

const router = Router();

const openai = new OpenAI({
  apiKey: process.env.GROQ_API_KEY!,
  baseURL: "https://api.groq.com/openai/v1",
});

// POST /chat/:userId
router.post("/:userId", async (req: Request, res: Response) => {
  const { userId } = req.params;
  const { message, sessionId } = req.body;

  if (!message) {
    res.status(400).json({ error: "Message is required" });
    return;
  }

  try {
    const memories = await mem0.search(message, { user_id: userId } as any);
    const memoryContext = memories.map((m: any) => m.memory).join("\n");

    const systemPrompt = `You are a helpful assistant with memory of past interactions.${
      memoryContext ? `\nWhat you remember about this user:\n${memoryContext}` : ""
    }`;

    const completion = await openai.chat.completions.create({
      model: "llama-3.3-70b-versatile",
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: message },
      ],
    });

    const reply = completion.choices[0].message.content;

    const addOptions: any = { user_id: userId };
    if (sessionId) addOptions.run_id = sessionId;

    await mem0.add(
      [
        { role: "user", content: message },
        { role: "assistant", content: reply! },
      ],
      addOptions
    );

    res.json({ reply, userId, sessionId: sessionId || null });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// GET /chat/:userId/memories
router.get("/:userId/memories", async (req: Request, res: Response) => {
  const { userId } = req.params;

  try {
    const memories = await mem0.getAll({ user_id: userId } as any);
    res.json({ userId, memories });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// DELETE /chat/:userId/memories
router.delete("/:userId/memories", async (req: Request, res: Response) => {
  const { userId } = req.params;

  try {
    await mem0.deleteAll({ user_id: userId } as any);
    res.json({ message: `All memories cleared for user ${userId}` });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

// POST /chat/:userId/no-memory
router.post("/:userId/no-memory", async (req: Request, res: Response) => {
  const { userId } = req.params;
  const { message } = req.body;

  if (!message) {
    res.status(400).json({ error: "Message is required" });
    return;
  }

  try {
    const completion = await openai.chat.completions.create({
      model: "llama-3.3-70b-versatile",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: message },
      ],
    });

    const reply = completion.choices[0].message.content;
    res.json({ reply, userId, memoryUsed: false });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

export default router;

Now let's break down what each route does.

POST /chat/:userId is the core route. When a message comes in, it first searches Mem0 for any memories relevant to that message and injects them into the system prompt. The LLM then responds with that context already in place. After the response is generated, the full exchange is passed to mem0.add() which processes it asynchronously and extracts facts worth remembering. The optional sessionId in the request body scopes the memory to a specific conversation thread via run_id useful if one user has multiple distinct conversation contexts.

GET /chat/:userId/memories returns everything Mem0 has stored for a given user. We'll use this in Section 6 to inspect what was actually extracted after a conversation.

DELETE /chat/:userId/memories wipes all stored memories for a user. Useful for resetting state during testing or giving users a clean slate.

POST /chat/:userId/no-memory is a control route that calls the same LLM with the same model but with no memory context at all just a plain system prompt. We'll use this in the next section to show the difference between a memory-aware response and a stateless one side by side.

How Mem0 Stores Memories

When you call mem0.add() after an exchange, Mem0 doesn't just save the raw messages. It processes the conversation, pulls out the facts that matter, and stores them as structured memory objects. Here's what that actually looks like.

After a few conversations as user123, calling GET /chat/user123/memories returns something like this:

{
  "userId": "user123",
  "memories": [
    {
      "id": "a9a14f57-e0ae-4533-82a1-c5e93364fb7b",
      "memory": "Ayomide loves table tennis and enjoys trying different foods",
      "user_id": "user123",
      "categories": ["sports", "food"],
      "created_at": "2026-03-19T05:38:37-07:00",
      "updated_at": "2026-03-19T05:38:41.614883-07:00",
      "structured_attributes": {
        "day": 19,
        "hour": 12,
        "year": 2026,
        "month": 3,
        "day_of_week": "thursday"
      }
    },
    {
      "id": "f0c175d5-1b9a-462a-bf17-88d287271df0",
      "memory": "Ayomide is building an API platform called APIblok",
      "user_id": "user123",
      "categories": ["professional_details", "technology"],
      "created_at": "2026-03-18T11:19:19-07:00",
      "updated_at": "2026-03-18T11:19:17.557120-07:00",
      "structured_attributes": {
        "day": 18,
        "hour": 18,
        "year": 2026,
        "month": 3,
        "day_of_week": "wednesday"
      }
    },
    {
      "id": "ea262dae-72af-48d9-a8e8-4b18cdefa9a5",
      "memory": "Ayomide has 3 years of experience with Node.js and prefers using TypeScript",
      "user_id": "user123",
      "categories": ["professional_details", "technology"],
      "created_at": "2026-03-18T10:55:20-07:00",
      "updated_at": "2026-03-18T10:56:45.972173-07:00",
      "structured_attributes": {
        "day": 18,
        "hour": 17,
        "year": 2026,
        "month": 3,
        "day_of_week": "wednesday"
      }
    }
  ]
}

A few things worth noticing here.

The memory field is a clean, distilled fact not a raw message. Mem0 didn't store "hey I love table tennis and I also like trying new foods", it stored "Ayomide loves table tennis and enjoys trying different foods". The extraction is deliberate and concise.

Categories are automatically assigned. Each memory object comes with a categories array ["sports", "food"], ["professional_details", "technology"] and so on. Mem0 figures these out on its own without any configuration from you. This is what makes retrieval intelligent — when a new message comes in, Mem0 knows which memories are actually relevant to surface.

Deduplication happens automatically. If the user mentions the same detail across multiple conversations, Mem0 won't create a new memory entry every time it updates the existing one instead. You can see this in the updated_at field, which can differ from created_at when a memory has been refined over time.

structured_attributes adds temporal context — recording when the memory was created down to the hour and day of week. This gives Mem0 the ability to reason about recency when deciding what to retrieve.

You can also view and manage all of this visually from the Mem0 dashboard:

The dashboard gives you a real-time view of what's been extracted for each user, which is particularly useful when you're debugging or want to verify that the right facts are being stored.

Testing the API

With the project set up, you'll need two terminal windows open simultaneously for this section. In the first terminal, start the dev server:

npm run dev

You should see Server running on http://localhost:3000. Leave this terminal running — the server needs to stay alive to handle requests. Open a second terminal window to send the curl commands. If you're on Windows, use Command Prompt (cmd) for both terminals rather than PowerShell. PowerShell maps curl to its own Invoke-WebRequest command, which won't work with these examples. If you prefer Git Bash, that works too — it ships with Git for Windows and handles curl correctly.

Windows cmd users: The multiline curl commands below use \ for line continuation, which is a macOS/Linux convention. Windows cmd doesn't support it if you run them as-is, you'll get curl: (3) URL rejected: Bad hostname errors after each line. The request itself will still go through, but to avoid the noise, collapse each command onto a single line. For example:

  curl -X POST http://localhost:3000/chat/user123 -H "Content-Type: application/json" -d "{\"message\": \"Your message here\"}"

Sending your first message

Start with a message that gives the model something worth remembering — some personal or technical context:

curl -X POST http://localhost:3000/chat/user123 \
  -H "Content-Type: application/json" \
  -d "{\"message\": \"Hey, I'm Ayomide. I have 3 years of experience with Node.js and I prefer TypeScript over plain JavaScript.\"}"

You should get back something like:

{
  "reply": "Hi Ayomide! That's a solid background — 3 years with Node.js and a preference for TypeScript is a great combo. TypeScript's type safety really pays off in larger codebases. What are you working on?",
  "userId": "user123",
  "sessionId": null
}

The response itself is unremarkable the model is working with just what you sent it. What matters is what's happening in the background: Mem0 is processing this exchange and extracting facts from it asynchronously.
Wait a few seconds, then send another message to give Mem0 time to finish processing:

curl -X POST http://localhost:3000/chat/user123 \
  -H "Content-Type: application/json" \
  -d "{\"message\": \"I'm currently building an API platform called APIblok.\"}"

Now check what Mem0 has stored:

curl http://localhost:3000/chat/user123/memories

You should see something like:

{
  "userId": "user123",
  "memories": [
    {
      "id": "ec30b7df-38f5-411d-af13-4d42befe4a3e",
      "memory": "Ayomide is currently building an API platform called APIblok.",
      "user_id": "user123",
      "categories": null,
      "created_at": "2026-03-19T14:08:07-07:00",
      "updated_at": "2026-03-19T14:08:07-07:00",
      "structured_attributes": {
        "day": 19,
        "hour": 21,
        "year": 2026,
        "month": 3,
        "day_of_week": "thursday"
      }
    },
    {
      "id": "b0217d62-f54d-4b7e-bca1-e6cf528ee09a",
      "memory": "Ayomide has 3 years of experience with Node.js and prefers TypeScript over plain JavaScript.",
      "user_id": "user123",
      "categories": ["professional_details", "technology"],
      "created_at": "2026-03-19T14:07:18-07:00",
      "updated_at": "2026-03-19T14:08:26.222857-07:00",
      "structured_attributes": {
        "day": 19,
        "hour": 21,
        "year": 2026,
        "month": 3,
        "day_of_week": "thursday"
      }
    }
  ]
}

Mem0 didn't store your raw messages. It extracted the facts that actually matter your experience level, your tech preferences, what you're building and stored them as clean, discrete memory objects. That's the extraction step working exactly as intended.

The before/after comparison

To make the difference concrete, the API has a /no-memory route that hits the same LLM with the same model but with zero memory context. Let's use it.

First, send a follow-up question through the memory-aware route:

curl -X POST http://localhost:3000/chat/user123 \
  -H "Content-Type: application/json" \
  -d "{\"message\": \"What framework would you recommend for my project?\"}"

Sample Response:

{
  "reply": "Given that you're building APIblok an API platform  and you're comfortable with TypeScript, I'd recommend sticking with Express if you want something minimal and flexible, or looking at Fastify if performance is a priority. Both have strong TypeScript support. Hono is also worth a look if you're targeting edge runtimes.",
  "userId": "user123"
}

The response is specific. It knows what you're building, knows you're a TypeScript person, and gives a recommendation that fits your actual context.

Now send the exact same message through the stateless route:

curl -X POST http://localhost:3000/chat/user123/no-memory \
  -H "Content-Type: application/json" \
  -d "{\"message\": \"What framework would you recommend for my project?\"}"

Sample Response:

{
  "reply": "To provide a framework recommendation that fits your needs, I'd love to know more about your project. Here are a few questions to get started:\n\n1. What type of project is it?\n2. What programming languages are you planning to use?\n3. What are the main features and functionalities?\n4. Do you have any specific requirements or constraints?\n5. Are there any particular technologies you're interested in integrating with?",
  "userId": "user123",
  "memoryUsed": false
}

Same model, same question and instead of an answer, you get an interrogation. The stateless route has no idea who you are, what you're building, or what language you use, so it fires back five clarifying questions asking you to repeat everything you've already told it. The memory-aware route answered directly because it already had the context it needed.
This is the ceiling that plain conversation history hits across sessions. Memory context closes that gap without any extra work from the user.

What the memory objects look like

If you've been following along, the memory objects from GET /chat/:userId/memories should look familiar — we walked through the full structure in the previous section. But now that you've seen them generated live from real conversations, the structure makes more sense in context.
A few things worth reinforcing:
The memory field is a distilled fact, not a raw message. Mem0 didn't store "hey I have 3 years with Node and I prefer TypeScript" it stored "Ayomide has 3 years of experience with Node.js and prefers using TypeScript". The extraction is deliberate and concise.

Categories are assigned automatically. Mem0 decided ["professional_details", "technology"] without any configuration from you. This is what makes retrieval intelligent when a new message comes in, Mem0 knows which memories are actually relevant to surface rather than dumping everything into the context.
Deduplication happens in the background. Try sending the same kind of detail twice across two separate requests:

curl -X POST http://localhost:3000/chat/user123 \
  -H "Content-Type: application/json" \
  -d "{\"message\": \"Just a reminder — I work primarily in TypeScript.\"}"

{
  "reply": "I recall that you prefer working with TypeScript over plain JavaScript, and that you have around 3 years of experience with Node.js. You're also currently building an API platform called APIblok. How can I assist you with your TypeScript or APIblok project today?",
  "userId": "user123",
  "sessionId": null
}

The model already knows. Now check the memories endpoint you won't find a new entry for TypeScript. Mem0 recognized it as information it already had and updated the updated_at timestamp on the existing memory rather than creating a redundant one. The store stays clean automatically, with no deduplication logic on your end.
To reset your state between test runs, hit the DELETE endpoint:

curl -X DELETE http://localhost:3000/chat/user123/memories

This wipes all stored memories for user123 and gives you a clean slate to test from scratch.

Conclusion

Most AI applications treat memory as an afterthought they replay conversation history on every request and call it context. That works until it doesn't: context windows fill up, costs climb, and the moment a user starts a new session, everything they've told the system is gone. The problem isn't the LLM. It's the architecture around it.

What we built here takes a different approach. By adding Mem0 as a dedicated memory layer, the API extracts what matters from each conversation, stores it as structured facts, and retrieves only what's relevant on the next request. The result is a system that gets more useful over time — not one that forgets everything the moment a session ends.

To recap what we covered: setting up an Express and TypeScript API with Groq as the LLM provider, initializing Mem0 and wiring it into the request lifecycle, building routes for memory-aware chat, memory retrieval, memory deletion, and a stateless control route for comparison, and testing the whole thing end to end to see extraction, retrieval, and deduplication working in practice.

From here, a few natural directions to explore: scope memories to specific conversation threads using run_id, swap the Mem0 cloud client for a self-hosted instance if you need full data control, or wire this API into a frontend to build a chat interface that actually remembers its users.
The full source code for this tutorial is available on GitHub: [https://github.com/techsplot/mem0-express-api]

Building Reliable Systems in Elixir: The "Let It Crash" Philosophy

Ayomide olofinsawe — Wed, 18 Feb 2026 13:16:49 +0000

Software systems fail. A background job crashes, a request throws an error, or a small bug causes part of an application to stop working. Often, a single failure can affect the entire system.

Most programming languages try to prevent crashes by handling errors everywhere using checks, conditionals, and exceptions to keep the program running.

Elixir takes a different approach. Instead of trying to stop every failure, Elixir assumes that crashes will happen and focuses on ensuring the system can recover quickly when they do.

Elixir runs on the BEAM, a runtime originally designed for systems that must stay online even when parts fail. Rather than letting one error bring everything down, BEAM isolates problems and keeps the rest of the system running.

In this article, we’ll explore how Elixir handles errors, what “let it crash” really means, and how you can use these ideas to build reliable applications.

Why Traditional Error Handling Breaks Down

In many programming languages, handling errors often means trying to prevent failures from happening in the first place. Techniques include:

Input validation
Conditional checks
Exception handling

While this works for small programs, it becomes difficult to manage as systems grow larger and more complex.

Common Challenges

Shared state and dependencies

Components often rely on each other. A failed database call might block a request handler, slowing down the whole system. In worst-case scenarios, a single unhandled error can crash everything.
Accumulation of defensive code

Functions filled with checks and special cases become harder to read, maintain, and reason about. Ironically, this extra complexity can introduce new bugs.

As applications scale, trying to prevent every crash becomes less practical. Instead, some systems focus on containing failures, ensuring that when something goes wrong, the damage is limited.

Elixir embraces this containment-focused approach. Crashes are treated as isolated incidents that can be managed and recovered from.

This leads to one of the most well-known concepts in Elixir: “Let it crash.”

The Elixir Philosophy: “Let It Crash”

At first, “let it crash” may sound reckless. Most developers are taught to prevent crashes at all costs. In Elixir, it means:

Don’t hide serious problems.

If a process encounters an error it cannot safely recover from, let it stop completely.
Use supervisors to recover.

A separate supervisor can restart the failed process in a clean state.

Simply put:

If something is badly broken, don’t keep using it restart it.

Real-World Analogy

Think about your phone. If an app freezes, what do you do?

Force close it
Reopen it

You don’t try to debug it while it’s stuck. Elixir builds systems that work in the same way automatically: crashes are contained, the rest of the system keeps running, and recovery happens cleanly.

Understanding Elixir Processes: Why Failures Stay Isolated

One of the key reasons Elixir can safely “let it crash” is how it runs tasks: each task runs in its own lightweight process.

Characteristics of Elixir Processes

Each process has its own workspace.
Processes don’t share memory directly with others.
Each process handles a specific job independently.

If a process crashes, it does not affect other processes. The rest of the system continues running as if nothing happened.

Analogy: Office Workers

Imagine an office where each worker sits in their own cubicle:

One worker makes a mistake on a task.
That mistake doesn’t spread to others because workspaces are separate.
A manager (the supervisor) notices the error and assigns a new worker.

Why It Matters

Crashes are contained and predictable.
Systems are more reliable.
Developers can focus on building features rather than defensive error handling everywhere.

In short:

Isolated processes + supervision = fault-tolerant systems

Supervisors: How Elixir Recovers Automatically

In Elixir, supervisors watch over processes. Think of a supervisor as a monitoring system for background jobs or microservices.

Supervisors don’t do the work themselves.
Their job is to ensure each process runs correctly.
If a process fails, the supervisor restarts it automatically, keeping the application running smoothly.

Programming Analogy: Job Queue Workers

Imagine a web app with multiple background workers:

Sending emails
Generating reports
Processing user uploads

Each worker runs in its own process:

One worker crashes due to a corrupted file.
The supervisor restarts a fresh worker.
Other workers continue without interruption.

This is similar to job queues like Sidekiq or Celery — but in Elixir, the restart mechanism is built-in.

Key Points About Supervisors

They manage failures, not prevent them.
Can be arranged in hierarchies to watch multiple processes.
Different restart strategies exist depending on the process’s importance.

Takeaway:

Processes crash → supervisors restart them → the app continues running automatically. This is the backbone of Elixir’s “let it crash” philosophy.

Why This Approach Works in Practice

The combination of isolated processes and supervisors makes Elixir applications truly resilient.

Benefits

Crashes are contained

One process crashing doesn’t take down the whole application.
Automatic recovery

Supervisors detect failures and restart processes without developer intervention.
Simpler code

Developers can write straightforward code without defensive clutter.
Scalability and concurrency

Thousands of independent tasks can run simultaneously, thanks to lightweight processes.

Example: Web Application Workers

Worker A: Image uploads
Worker B: Email notifications
Worker C: Report generation

If Worker B fails due to a temporary network issue:

Worker B stops.
Supervisor restarts a fresh Worker B.
Workers A and C continue uninterrupted.

✅ Key takeaway:

Elixir doesn’t try to prevent all failures — it manages them intelligently for reliability and maintainability.

Common Misconceptions and Beginner Mistakes

Overusing try/rescue

Catching every error defeats the purpose of supervisors.
Ignoring supervision trees

Skipping supervisors or using ad-hoc processes leads to fragile systems.
Trying to prevent every failure

Elixir assumes failures are inevitable. Handle them at the system level, not everywhere in code.
Confusing “let it crash” with sloppy coding

It doesn’t mean ignoring logic errors or poor design. It means isolating failures safely.

✅ Key takeaway:

Understand these pitfalls to use Elixir’s fault-tolerant features effectively.

Example Applications Where “Let It Crash” Shines

Messaging Platforms

Thousands of simultaneous messages; one process failing doesn’t crash the system.
Real-Time Analytics and Event Processing

Single faulty events don’t stop the entire pipeline; supervisors restart failed workers.
Background Job Processing

Jobs like email sending or image resizing run independently; failures are restarted automatically.
IoT or Embedded Systems

Each sensor/device runs independently; crashes don’t compromise the rest of the system.

Key Insight:

Elixir’s approach is practical, especially for concurrent, fault-tolerant, high-reliability systems.

Conclusion + Next Steps

Elixir’s approach to error handling — isolated processes, supervisors, and “let it crash” philosophy — provides a new way to build reliable applications.

Key Takeaways

Isolated processes: Crashes don’t affect the whole system.
Supervisors: Automatically monitor and restart failed processes.
Let it crash: Recovering cleanly is often better than over-handling errors.
Real-world impact: Messaging platforms, analytics pipelines, background jobs, and IoT applications all benefit.

Next Steps for Developers

Learn about OTP (Open Telecom Platform)

Provides core abstractions for fault-tolerant Elixir apps.
Experiment with Supervision Trees

Build small apps where supervisors manage multiple processes.
Study real applications

Explore open-source Elixir projects like Phoenix or Nerves to see these concepts in action.

By embracing these ideas, developers can build highly concurrent, reliable, and maintainable systems without writing overly defensive or complex code.

Why Learning Elixir (and programming) Still Matters in the AI Era

Ayomide olofinsawe — Mon, 09 Feb 2026 15:14:05 +0000

For a while now, it feels like nobody talks about programming languages anymore.

Everywhere you look, it’s AI agents, workflows, and tools that can write code almost instantly. The language underneath the code feels invisible almost irrelevant.

So it’s fair to ask:

Why learn a language like Elixir now?

Especially when AI can autocomplete functions, generate examples, and even explain errors.

What Confused Me When I First Met Elixir

Coming from a JavaScript background, diving into Elixir felt like stepping into a whole new world. It wasn’t just the syntax it was the way of thinking.

Concepts like immutability, pattern matching, and functional programming weren’t just unfamiliar; they actively challenged how I was used to writing code.

In JavaScript, I could change a variable whenever I wanted. In Elixir, that same idea felt almost wrong. I had to retrain my brain to think in terms of transforming data instead of changing state.

Functions became first-class citizens, recursion replaced loops, and I realized that side effects were something to be carefully managed, not ignored.

Even simple things like updating a map or handling a conditional made me pause and ask:

“Am I approaching this the Elixir way, or am I forcing JavaScript habits onto it?”

It was confusing. Sometimes frustrating. But it was also the kind of confusion that forces you to build better mental models for programming.

What Made Me Doubt Learning Elixir in the AI Era

What made me doubt learning Elixir wasn’t even Elixir itself it was programming languages in general.

With AI being able to write code, autocomplete logic, and suggest solutions, I started wondering if spending time learning a new language still made sense.

I kept asking myself:

Should I focus on learning a programming language, or should I focus on other things AI can’t easily replace?

From that mindset, learning Elixir felt risky. It’s not as mainstream as JavaScript, and it doesn’t dominate AI conversations.

But that doubt revealed something important.

The hard part wasn’t typing code it was understanding what the code is doing and why it’s structured that way.

AI could generate solutions, but it couldn’t give me intuition. It couldn’t replace the mental shift required to reason about state, failure, and concurrency.

What AI Doesn’t Fully Replace

When I tried to figure out what AI didn’t help me with while learning Elixir, I realized something uncomfortable I couldn’t point to a clear example yet.

Not because AI was doing everything, but because I was still early in the learning process.

AI could explain syntax, generate examples, and suggest solutions. What it couldn’t do was make things feel intuitive immediately.

I still had to sit with concepts like immutability and functional thinking until they slowly made sense.

That gap between seeing an answer and actually getting it is where learning still happens.

AI can shorten the distance, but it can’t skip it.

Learn the Right Way

If there’s one thing I’d tell someone unsure about learning a programming language today, it’s this:

Learn the right way. Learn the basics syntax, core concepts, and mental models then move on to implementation.

Don’t skip the foundation.

AI can fill in code snippets and autocomplete logic, but it can’t replace understanding. That understanding knowing why things work is what makes code reliable and systems easier to reason about.

Learning Elixir isn’t just about writing functions. It’s about training your brain to think clearly about data, state, and failure skills that will still matter no matter how advanced AI becomes.

I am still kinda new to the elixir space looking forward to writing more about elixir and my learning generally

Jess

Ayomide olofinsawe — Wed, 21 Jan 2026 16:36:53 +0000

From Local Model to Production API: A Practical CI/CD Workflow for Machine Learning

Ayomide olofinsawe — Thu, 15 Jan 2026 18:03:41 +0000

Machine learning projects rarely fail because the model is bad they fail when it hits production. A model might run perfectly on your laptop, but moving it into a stable, repeatable environment is a different story. Ad-hoc Docker commands, manual server setup, repeated SSH sessions, and fragile deployment processes turn every code change into a risky, time-consuming operation slowing iteration and making infrastructure management overshadow actual model improvement.

Most ML deployment guides assume access to GPUs, Kubernetes, or specialized MLOps platforms. They often gloss over the real-world headaches of shipping a model that just works. This tutorial takes a different approach: you’ll deploy a CPU-optimized ML API on a single cloud server, bake the model into a Docker container at build time, and automate updates with GitHub Actions. The result is a repeatable, reliable, low-cost workflow that works in real production not just in a notebook or local environment.

In this guide, you’ll build a production-ready deployment pipeline for a lightweight sentiment analysis API using FastAPI and Hugging Face, provision a Vultr Cloud Compute instance, and set up a CI/CD workflow that automatically updates your deployment on every push. By the end, you’ll have a workflow capable of taking any ML model from local development to a live, continuously deployed API.

Prerequisites

To follow along with this tutorial, ensure you have the following in place:

Access to an Ubuntu 24.04 server as a non-root user with sudo privileges
Docker installed on both your local machine and the server
A Docker Hub account for storing container images
A GitHub account with GitHub Actions enabled
Basic familiarity with FastAPI, Docker, and Git-based workflows

This guide focuses on automating machine learning deployments rather than covering server provisioning or Docker installation. Refer to the Vultr documentation if you need assistance preparing your environment.

NOTE
This tutorial focuses on CPU-based deployment. The FastAPI service and Hugging Face model are configured to run on CPU only, making it suitable for cost-effective Vultr Cloud Compute instances without GPU support.

Build a Sample ML Model API

Before automating deployment, you need a machine learning service that behaves predictably in production. In this section, you’ll build a CPU-only sentiment analysis API using FastAPI and Hugging Face Transformers, designed specifically for containerized deployment on Vultr.

The application performs inference only. The model is downloaded once, stored locally, and loaded at startup to ensure fast and reliable requests.

Project Structure

Use the following project layout:

deployment_ml/
├─ app/
│  ├─ main.py
│  ├─ model.py
├─ download_model.py
├─ requirements.txt
└─ models/

This structure keeps the API logic, model artifacts, and dependency management clearly separated.

Define Python Dependencies

Create a requirements.txt file with the following contents:

--extra-index-url https://download.pytorch.org/whl/cpu

fastapi==0.115.6
uvicorn[standard]==0.30.6

torch==2.9.1
transformers>=4.46.3

pydantic>=2.7,<3
python-dotenv==1.0.1

This configuration ensures:

CPU-only PyTorch wheels are used
Compatibility with FastAPI and Pydantic v2
Predictable dependency resolution during Docker builds

Download the Model Locally

Instead of downloading the model at runtime, the application uses locally stored model artifacts. This improves startup time and avoids network dependencies during deployment.

Create a script named download_model.py:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import os

MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
LOCAL_PATH = "./models/distilbert-sst2"

os.makedirs(LOCAL_PATH, exist_ok=True)

model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

model.save_pretrained(LOCAL_PATH)
tokenizer.save_pretrained(LOCAL_PATH)

print(f"Model downloaded and saved to {LOCAL_PATH}")

Run the script once to populate the models/ directory.

Implement the Model Wrapper

Create app/model.py to load the model and handle inference:

from transformers import pipeline
import os
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

DEVICE = -1  # Force CPU usage

LOCAL_MODEL_PATH = os.path.join(
    os.path.dirname(__file__),
    "../models/distilbert-sst2"
)

class SentimentModel:
    def __init__(self):
        logger.info(f"Loading sentiment model from {LOCAL_MODEL_PATH} ...")
        self.pipeline = pipeline(
            task="sentiment-analysis",
            model=LOCAL_MODEL_PATH,
            tokenizer=LOCAL_MODEL_PATH,
            device=DEVICE
        )
        logger.info("Sentiment model loaded successfully.")

    def predict(self, text: str):
        result = self.pipeline(text)[0]
        return {
            "sentiment": result["label"].lower(),
            "confidence": round(result["score"], 4)
        }

sentiment_model = SentimentModel()

def predict_sentiment(text: str):
    return sentiment_model.predict(text)

The model is initialized once at application startup, ensuring consistent performance under load.

Create the FastAPI Application

Create app/main.py:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from app.model import predict_sentiment
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI(
    title="ML Sentiment Analysis API",
    description="CPU-only ML API deployed with CI/CD on Vultr",
    version="1.0.0"
)

class PredictionRequest(BaseModel):
    text: str = Field(..., min_length=1, example="I love this product")

class PredictionResponse(BaseModel):
    sentiment: str
    confidence: float

@app.get("/")
def health_check():
    return {"status": "Hello from the automated ML deployment!"}

@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    try:
        return predict_sentiment(request.text)
    except Exception as e:
        logger.error(f"Prediction failed: {e}")
        raise HTTPException(
            status_code=500,
            detail="Sentiment prediction failed."
        )

Install Dependencies and Run the API

Create a virtual environment

$ python -m venv .venv 
$ source .venv/bin/activate

install all dependencies and update pip

 pip install --upgrade pip
 pip install -r requirements.txt

Download the model:

 python download_model.py

This will create a ./models/distilbert-sst2 folder containing the pretrained Hugging Face model and tokenizer.

Start the API:

 uvicorn app.main:app --host 0.0.0.0 --port 8000

Test the service:

Health check: Confirm the API is running

curl http://localhost:8000/

Output:

{
  "status": "Hello from the automated ML deployment!"
}

Sentiment prediction: Send a test request

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "This deployment workflow is impressive"}'

Expected outcome: The API should return a JSON object with the predicted sentiment and confidence score, for example:
A successful response confirms the API is ready for containerization.

{
  "sentiment": "positive",
  "confidence": 0.9987
}

Outcome

At this point, you have:

A CPU-optimized ML inference API
Locally packaged Hugging Face model artifacts
Deterministic dependency management
A service ready to be containerized and deployed automatically

Dockerize the ML API

To deploy your ML service reliably on Vultr, we need to containerize the API. Docker ensures that the environment is consistent, reproducible, and isolated, which is essential for CI/CD pipelines and automated deployments.

This section will show you how to:

Create a Docker image that includes your API and model
Bake in the Hugging Face model to avoid runtime downloads
Expose the service port for external access

Create the Dockerfile

In the root of your project directory, create a file named Dockerfile and paste the following content:

# ---- Base image ----
FROM python:3.14-slim

# ---- Environment variables ----
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV TRANSFORMERS_CACHE=/app/.hf_cache

# ---- Set working directory ----
WORKDIR /app

# ---- System dependencies ----
RUN apt-get update && apt-get install -y \
    git \
    && rm -rf /var/lib/apt/lists/*

# ---- Install Python dependencies ----
COPY requirements.txt ./
RUN pip install --upgrade pip \
    && pip install --no-cache-dir -r requirements.txt

# ---- Copy application code ----
COPY app ./app
COPY download_model.py ./download_model.py

# ---- Download model at build time ----
RUN python download_model.py

# ---- Expose port ----
EXPOSE 8000

# ---- Run the app ----
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Key Notes:

TRANSFORMERS_CACHE points to a local folder for Hugging Face caching.
download_model.py runs during build, so the container already includes the model.
Port 8000 is exposed for API access.
--no-cache-dir ensures Python packages don’t increase image size unnecessarily.

Create the .dockerignore File

In the root of your project directory, create a file named .dockerignore and paste the following content:

.venv
__pycache__
*.pyc
*.pyo
*.pyd
.git
.gitignore
.env
.cache
models/

[!NOTE]
models/ is ignored because download_model.py ensures the model is downloaded inside the container at build time.

Build the Docker Image

Run the following command in your project root:

 docker build -t sentiment-api .

[!NOTE]
The Docker image name sentiment-api is used throughout this tutorial for simplicity. You can rename it to anything you like, ust make sure to use the same name consistently in subsequent commands, including GitHub Actions workflows or Docker Hub pushes.

Output

The image is created with your Python dependencies and ML model
Logs during build show Loading… and Model downloaded and saved to /app/models/distilbert-sst2
The image size is optimized due to .dockerignore and --no-cache-dir

Run the Docker Container

Start the API in a container:

 docker run -p 8000:8000 sentiment-api

Output

The API runs inside the container
Health check:

 curl http://localhost:8000/

Output:

{"status": "Hello from the automated ML deployment!"}

Test prediction:

 curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Dockerized deployment is smooth!"}'

Output:

{
  "sentiment": "positive",
  "confidence": 0.9981
}

Provision a Vultr Instance

Before deploying your Dockerized ML API, ensure you have access to an Ubuntu 24.04 server as a non-root user with sudo privileges (as noted in the prerequisites).

This section shows how to add your SSH key to Vultr when creating or managing an instance and how to install Docker on your server.

Create an SSH Key (If You Don’t Have One)

If you don’t already have an SSH key, generate one on your local machine:

 ssh-keygen -t ed25519 -C "your_email@example.com"

Press Enter to accept the default file location.
Optionally, set a passphrase for added security.

Your public key will be saved as:

~/.ssh/id_ed25519.pub

Copy Your Public Key

Run:

cat ~/.ssh/id_ed25519.pub

The public key will appear in the terminal.
Select and copy the entire line (starts with ssh-ed25519 and ends with your email or usename).
This is the key you will add to your Vultr server for secure SSH access.

Add SSH Key to Your Vultr Instance

When creating a new Vultr Cloud Compute instance or managing an existing one:

Log in to your Vultr dashboard.
Navigate to Account.
Under SSH Keys, click Add SSH Key.
Paste your public key from Step 2.

Once the server is deployed, note the IP address — you’ll use it to connect via SSH.

Connect via SSH

Use your existing sudo-enabled user to log in:

 ssh username@YOUR_VULTR_IP

Replace username with your sudo-enabled user.
You are now ready to install Docker and deploy your ML API.

Install Docker

Run each command line by line as your sudo user:

Update package lists:

 sudo apt update

Install Docker and Docker Compose:

 sudo apt install -y docker.io docker-compose

Enable Docker to start on boot:

 sudo systemctl enable docker

Start Docker:

 sudo systemctl start docker

Verify installation:

 docker --version

Output:

Docker version 29.1.4, build 0e6fee6

Set up firewall rules for port 8000:

 ufw allow 8000/tcp
 ufw enable

Optional — Test Docker

Run a quick test container:

 docker run hello-world

You should see a confirmation message that Docker is installed and running correctly.

Set Up GitHub Actions Workflow

We’ll automate deployment using GitHub Actions. The workflow will:

Build the Docker image
Push it to Docker Hub
SSH into your Vultr server and deploy the container

Before setting up the workflow, make sure you can push code to GitHub via SSH.

First-Time SSH Setup for GitHub

To push your project over SSH, GitHub needs your SSH public key added to your account (only required once per machine).

Copy your public key:

 cat ~/.ssh/id_ed25519.pub

Copy the entire output (starts with ssh-ed25519 and ends with your email).
Add the SSH key to GitHub:

Log in to GitHub → click your profile → Settings → SSH and GPG keys → New SSH key.
Give it a descriptive title (e.g., “Laptop key”).
Paste your public key in the Key field and click Add SSH key.

Test the connection:

 ssh -T git@github.com

Output:

Hi your-username! You've successfully authenticated, but GitHub does not provide shell access.

You’re now ready to push code to GitHub securely.

Create a GitHub Repository

Go to GitHub and create a new repository for your project.
In your local project folder, initialize Git (if not already done):

 git init

Push Your Project to GitHub

Add your files, commit, and push to the new repository:

 git add .
 git commit -m "Initial commit"
 git branch -M main
 git remote add origin git@github.com:your-username/ml-api.git
 git push -u origin main

Replace your-username and ml-api with your GitHub username and repository name.

Add Repository Secrets

GitHub Actions needs secrets for Docker Hub and your Vultr server:

Secret	Description
`DOCKER_USERNAME`	Your Docker Hub username
`DOCKER_PASSWORD`	Your Docker Hub password
`VULTR_HOST`	Vultr server IP
`VULTR_USER`	SSH username (sudo-enabled user)
`VULTR_SSH_KEY`	Private SSH key corresponding to the public key on the server
`VULTR_PORT`	SSH port (default 22)

Add them via Settings → Secrets and Variables → Actions → New repository secret.

Create the Workflow File

In your project root, create the directory and file:

.github/workflows/deploy.yml

Paste the following workflow code:

name: CI/CD Deploy to Vultr

on:
  push:
    branches:
      - main

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Verify Docker username
        run: |
          if [ -z "${{ secrets.DOCKER_USERNAME }}" ]; then
            echo "Error: DOCKER_USERNAME secret is empty!"
            exit 1
          fi
          echo "Docker username is set ✅"

      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Build and Push Docker Image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: |
            ${{ secrets.DOCKER_USERNAME }}/sentiment-api:latest
            ${{ secrets.DOCKER_USERNAME }}/sentiment-api:${{ github.run_number }}

      - name: Deploy to Vultr
        uses: appleboy/ssh-action@v0.1.6
        with:
          host: ${{ secrets.VULTR_HOST }}
          username: ${{ secrets.VULTR_USER }}
          key: ${{ secrets.VULTR_SSH_KEY }}
          port: ${{ secrets.VULTR_PORT }}
          script: |
            docker pull ${{ secrets.DOCKER_USERNAME }}/sentiment-api:latest
            docker stop sentiment-api || true
            docker rm sentiment-api || true
            docker run -d \
              --name sentiment-api \
              -p 8000:8000 \
              --restart always \
              ${{ secrets.DOCKER_USERNAME }}/sentiment-api:latest

Push Workflow to Trigger Deployment

 git add .
 git commit -m "Add CI/CD workflow"
 git push origin main

GitHub Actions will automatically run the workflow.
Your Dockerized ML API will be built, pushed, and deployed to your Vultr server.

Test Automatic Deployment

With the GitHub Actions workflow in place, you can now verify that your Dockerized ML API deploys automatically to your Vultr server whenever you push changes.

Make a Code Change

For example, you can update the health check message in app/main.py:

@app.get("/", tags=["Health"])
def health_check():
    return {"status": "Hello from the automated ML deployment! tested and trusted ✅"}

Save your changes locally.
This minor update is enough to trigger the CI/CD workflow.

Commit and Push

 git add .
 git commit -m "Update health check message"
 git push origin main

GitHub Actions will automatically detect the push.
The workflow will build the Docker image, push it to Docker Hub, and deploy it to your Vultr server.

Monitor Deployment

Go to your repository → Actions tab.
Click on the latest workflow run.
You’ll see step-by-step logs:
Checkout repository ✅
Set up Docker Buildx ✅
Log in to Docker Hub ✅
Build and push Docker image ✅
Deploy to Vultr ✅

Any errors will appear in the logs, making debugging straightforward.

Verify on Vultr

Once the workflow completes, check your API:

 curl http://YOUR_VULTR_IP:8000/

Output

{
  "status": "Hello from the automated ML deployment! ✅"
}

Next, test the prediction endpoint:

 curl -X POST http://YOUR_VULTR_IP:8000/predict \
-H "Content-Type: application/json" \
-d '{"text": "I love this product!"}'

Output:

{
  "sentiment": "positive",
  "confidence": 0.9987
}

This confirms the latest code changes are live on your Vultr server.
Each subsequent push to main will automatically deploy updated containers.

Pro Tips:

If the workflow fails, check the Actions logs for errors in Docker build, push, or SSH deployment.
You can rename your container in the workflow if you want multiple APIs on the same server.
For safety, consider adding a rollback strategy (optional) if a deployment introduces a bug.

Add a Rollback Strategy (Optional)

In case a deployment introduces an issue, you can quickly roll back to a previous version of the API using Docker image tags.

Each deployment pushes two tags to Docker Hub:

latest – the most recent deployment
A numbered tag (for example, sentiment-api:42) representing a specific build

If the latest deployment fails, you can redeploy a previous version directly on your Vultr server.

SSH into your server
Stop and remove the current container
Run a previous image tag

Stop and remove the current container:

docker stop sentiment-api
docker rm sentiment-api

Run the previous stable image:

docker run -d \
  --name sentiment-api \
  -p 8000:8000 \
  --restart always \
  your-docker-username/sentiment-api:42

This immediately restores the last stable version without rebuilding or modifying the CI/CD pipeline.

Conclusion

Automating machine learning deployments is often the missing link between experimentation and production. By combining Docker, GitHub Actions, and Vultr Cloud Compute, you can turn a local ML model into a continuously deployed, production-ready API with minimal operational overhead.

In this tutorial, you:

Built a CPU-efficient ML API using FastAPI and Hugging Face
Containerized the application for consistent deployments
Prepared a Vultr server for hosting Dockerized workloads
Implemented a CI/CD pipeline that deploys automatically on every push
Verified live updates and added a lightweight rollback option

Vultr’s straightforward infrastructure and predictable pricing make it an excellent platform for deploying ML services—whether you’re shipping a prototype, running internal tools, or serving real users in production.

With this foundation in place, you can confidently extend the workflow to support additional models, environments, or scaling strategies as your MLOps needs grow.

🛡️ AgentGuard: Secure and Govern Your AI Agents with Auth0 for AI

Ayomide olofinsawe — Mon, 27 Oct 2025 03:54:47 +0000

This is a submission for the Auth0 for AI Agents Challenge

What I Built

AgentGuard is a security and governance platform for AI agents — designed to protect how enterprise AI agents interact with external APIs, tools, and sensitive data.

Today, organizations build powerful AI assistants that can access Slack, Stripe, HR systems, or internal APIs. But what happens when one of these agents tries to access a resource it shouldn’t?
That’s where AgentGuard steps in.

The Problem:
AI agents often operate without proper guardrails — no authentication for users prompting them, no authorization boundaries for tools they access, and no audit logs for compliance teams.

The Solution:
AgentGuard acts as a control layer between humans, AI agents, and external APIs. It authenticates users, authorizes agent actions, and provides real-time governance through Auth0’s identity and policy management system.

With AgentGuard, every agent action is:

Authenticated (user identity verified with Auth0)

Authorized (checked against a policy engine)

Audited (logged for compliance visibility)

Demo

🔗 GitHub Repository: https://github.com/techsplot/Agentguard_new

🖼️ Screenshots

Agent Dashboard: Register and monitor agents securely.

Chat Interface: Interact with AI agents safely, backed by Auth0 authentication.

Logs View: Real-time audit of agent activities and policy checks.

(Attach screenshots or screen recordings of the UI — dashboard, logs, chat, etc.)

How I Used Auth0 for AI Agents

This project was purpose-built to showcase Auth0 for AI’s three core capabilities:
Authentication, Authorization, and Token Control — all integrated seamlessly into an AI-agent-driven architecture.

🧱 1. User Authentication

Users sign in via Auth0’s Universal Login to access the AgentGuard dashboard.
Protected routes ensure only authenticated users can register or interact with agents.

🧩 2. Token Vault Simulation

Before an agent can call an external API (like Slack or Stripe), AgentGuard generates a short-lived access token through a mock Auth0 Token Vault.
Each token:

Is tied to a specific agent and action.

Expires after a short period.

Is logged for traceability.

This prevents agents from misusing permanent credentials or accessing unauthorized resources.

🛡️ 3. Fine-Grained Authorization (FGA)

A Policy Engine checks if each action is allowed for the given agent:

{
"agent": "Stripe Bot",
"allowedTools": ["Stripe"],
"restrictedActions": ["refundOver500USD"]
}

If an agent tries something beyond its defined scope, the action is denied — and the attempt is logged for review.

🧠 4. AI Interaction via Gemini

AgentGuard uses Gemini 2.5 Flash to power intelligent agent responses.
Every response is mediated through Auth0’s guardrails — ensuring only authorized actions trigger model calls.

Lessons Learned and Takeaways

AgentGuard embodies a vision where AI agents operate responsibly — always within defined limits, fully observable, and entirely secure.

As AI becomes more autonomous, tools like Auth0 for AI Agents will be the backbone of trustworthy AI ecosystems.

devchallenge for googleaichallenge and it is an ai powered app that allows you to get summary fom video and questions to answer based on that video also there is the amazing feature for writer and also generate blog ideas from the video

Ayomide olofinsawe — Thu, 02 Oct 2025 01:00:47 +0000

Ayomide olofinsawe

Sep 15 '25

Lecture lab AI: Transform Lectures into Summaries, Questions, and Blog Ideas

#devchallenge #googleaichallenge #ai #gemini

220

2 min read

🌍AI-Powered Disaster Relief Dashboard with KendoReact, Twilio, and Gemini

Ayomide olofinsawe — Mon, 29 Sep 2025 05:17:54 +0000

This is a submission for the KendoReact Free Components Challenge.

What I Built

I built a Disaster Relief Resource Dashboard – a React web app designed to help NGOs, volunteers, and aid coordinators manage disaster response operations more effectively.

The app solves three key problems during disaster response:

Resource Tracking → Manage critical resources like food, water, medicine, and shelter, and keep stock levels up-to-date.
Volunteer Scheduling → Register and assign volunteers to shifts and locations, with an integrated notification system (via Twilio SMS).
Impact Visualization → Use interactive charts to visualize resource distribution, aid progress, and overall relief impact.

🔮 Bonus Feature: A contextual AI Assistant Widget, powered by Google Gemini + KendoReact Conversational UI. The AI assistant can answer questions like:

“How many food packs do we still have left?”
“Which volunteers are free tomorrow?”
“What’s the biggest gap in our current resource stock?”

The assistant uses contextual information (selected disaster type + available resources) to provide more accurate and actionable answers.

This project demonstrates how KendoReact free components + AI integration can be used to solve a real-world problem in disaster management.

Demo

github link
Live url

here are some ui flow of the app

this is the dahboard that showing all the pages we have and also how you can see resources available

this ui is responsible for adding diaster

this is the voulunteer page wheree voulunteer can be added and a notification is sent to them via sms in real time using twillo

here is the proof that the sms is been sent to a real number

The analysis dahboard help in analysing and seeing chart of voulteers and also resources

the ai ui which is powered by gemini enable context based answers and can update pages based on prompt from user e.g i can tell it to increase the number of resources in a disater or all disaster and i can tell it otherwise

KendoReact Components Used

I used a wide range of KendoReact components (well above the 10 required):

Grid → Resource management table.
DropDownList → Filter resources by category.
NumericTextBox → Update stock quantities.
Upload → Upload delivery proof documents.
Button → Save actions and trigger updates.
Scheduler → Assign volunteers to shifts/locations.
Form → Register new volunteers.
DatePicker → Assign shifts/dates.
Chart (bar/line) → Distribution by region.
PieChart → Aid breakdown by category.
Card → Quick relief stats.ProgressBar → Show progress toward aid targets.

[Optional: Code Smarter, Not Harder prize category] AI Coding Assistant Usage

I used GitHub Copilot and the KendoReact AI extension to speed up development.

Copilot generated starter code for each page (Resources, Volunteers, Analytics).
I provided prompts to contextualize the AI assistant, ensuring it always reads the selected disaster type and available resources.
The AI assistant was built as a floating widget using Copilot-generated boilerplate, refined with Kendo UI components.

This saved significant time, allowing me to focus on the logic and integration instead of writing every component from scratch.

[Optional: RAGs to Riches prize category] Nuclia Integration

❌ Not integrated (I explored Nuclia RAG but couldn’t proceed due to account setup limitations).
✅ Instead, I used Google Gemini as the AI backend, making the AI Assistant widget functional and contextual without RAG for now.

Thanks dor reading

Lecture lab AI: Transform Lectures into Summaries, Questions, and Blog Ideas

Ayomide olofinsawe — Mon, 15 Sep 2025 03:49:46 +0000

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built an AI-powered lecture assistant that transforms lengthy lecture transcripts into concise summaries, generates questions for self-assessment, and even crafts content ideas for blogs or learning journals on platforms like Medium, Dev.to, or personal blogs.

This web app operates in two phases:

Lecture Analysis & Summarization:

Accepts raw lecture text input
Sends it to a custom AI workflow built in Google AI Studio
Instantly returns a clean, easy-to-digest summary

Content & Learning Assistance:

Generates self-assessment questions based on the lecture
Provides article and content ideas so learners can document their study journey or share insights online It’s a lightweight, accessible tool designed for students and content creators who want to learn, review, and share knowledge efficiently.

Demo

Originally, I planned to deploy the app using Google Cloud Run, but due to card verification issues, I switched to Netlify for hosting so reviewers can still access the live demo seamlessly.

Live Demo (Hosted on Netlify): Netlify
GitHub Repository: Github repo
AI Studio Workflow File: View Saved File

And also here is the link to the full demo video on Youtube.
Video link

Screenshots:
here are the screen shot of some of the ui:

this is the landing page:

here is the search box and the result:

Selected video :

How I Used Google AI Studio

I used Google AI Studio to:

Design and test the AI prompt for summarizing lecture transcripts
Fine-tune the response formatting for clarity and conciseness
Export the workflow so it could be integrated into a simple web application

The saved AI Studio file above verifies the use of the platform and allows others to see exactly how the workflow was built.

Multimodal Features

Our system transforms lecture content into interactive, multimodal learning experiences through the following stages:

Audio/Video → Text → Structured Modules
Gemini-2.5-Flash converts lectures into accurate transcripts, then organizes them into learning modules with quizzes, flashcards, and AI-generated practice questions.
Text → Image Storytelling
Imagen-4.0-Generate-001 brings story-driven learning segments to life as engaging visuals, creating strong memory cues for students.
Text ↔ Image Matching Challenges
Learners complete interactive matching exercises before lessons begin—pairing terms with images to build active recall and conceptual understanding.
Text → Speech Narration
Using the Web Speech API, summaries and story scenes are turned into clear, natural-sounding audio to support auditory learners and improve accessibility.

Conclusion

This project demonstrates how Google AI Studio can be combined with a simple frontend to deliver practical, student-focused solutions in a short amount of time.

All source code, demo links, and workflow files are available above for anyone to explore or extend.

How to Connect Your Name.com Domain to Netlify: A Beginner’s Guide

Ayomide olofinsawe — Sun, 14 Sep 2025 13:29:58 +0000

I have to admit something upfront: before this, I had zero experience with domain configuration. I always thought connecting a domain to a hosting platform was just a matter of a few clicks. Turns out, it’s not always that simple—especially if you make a mistake like I did.
Recently, I bought a domain on Name.com and deployed a project on Netlify. My goal was simple: connect my shiny new domain to my Netlify site so it would look professional. But the process wasn’t as smooth as I expected.
In this article, I’ll share exactly what happened, the mistakes I made, and how I finally got it right. Hopefully, this will save you from the headaches I had.

Step 1: Buying My Domain on Name.com

This part was straightforward. I searched for my desired domain name, saw it was available, and bought it. Within minutes, I had access to the Name.com dashboard where all my domain settings live.

"Here’s what my Name.com dashboard looked like after buying the domain. This is where all the DNS settings live."

Step 2: Deploying My Site on Netlify

I already had my site deployed on Netlify (via GitHub integration). Netlify automatically gives you a temporary URL like your-site.netlify.app.
The goal was to replace that with my custom domain: mydomain.com.

Step 3: Adding the Custom Domain in Netlify

Inside the Netlify dashboard:
I clicked on Domain Settings → Custom domains.

Added my new domain name (e.g., mydomain.com).

"In Netlify, I went to Domain Settings → Custom domains and typed in my new domain name here."
Netlify then provided me with DNS records to add on Name.com—CNAME, A records, and sometimes TXT records for verification.

"After adding the domain, Netlify provided these DNS records that I needed to add on Name.com."

Step 4: The Big Mistake—Changing Name Servers Instead of DNS Records

This is where things went wrong.
Instead of adding only the DNS records Netlify gave me, I went ahead and changed the name servers on Name.com to point to Netlify.
What happened?
My email stopped working because changing name servers moved all domain control to Netlify.

My domain wasn’t resolving properly because I didn’t set up email or DNS records correctly on Netlify.

I learned the hard way:
If you only want to connect your website but keep your email working, don’t change name servers. Only add the records Netlify asks for under DNS management on Name.com.

"This was my mistake: instead of just adding DNS records, I changed the name servers to point to Netlify—and my email stopped working."

Step 5: Fixing the DNS on Name.com

After some research (and frustration), I realized what I needed to do:
Revert the name servers back to the default Name.com ones.

Add the CNAME and A records provided by Netlify in the DNS records section.

"Here’s the correct way: I reverted to Name.com’s default name servers and added the CNAME and A records provided by Netlify."
Once saved, it took about 30–60 minutes for everything to propagate.

Step 6: The “It Finally Worked” Moment

After refreshing (and waiting a bit), my site finally showed up on my custom domain. Email started working again, and I learned a lot in the process.

"After waiting for DNS propagation, my site finally loaded on my custom domain—mission accomplished!"

Optional: Confirming DNS Propagation

To be extra sure, I used a tool like WhatsMyDNS to confirm that my DNS changes had propagated globally before celebrating.

"I used a tool like WhatsMyDNS to confirm my domain changes were fully propagated before celebrating."
Lessons I Learned
Don’t change name servers unless you know exactly what you’re doing.

Stick to DNS record updates when pointing your domain to hosting providers.

Propagation takes time—sometimes up to 24 hours. Don’t panic if things don’t work instantly.

Conclusion

What started as a simple task turned into a learning experience. But in the end, I now have a professional domain connected to my Netlify site—and my email works too.
If you’re setting this up for the first time, follow the steps carefully, avoid changing name servers unnecessarily, and be patient during DNS propagation.

What I learned building a GraphQL API with Prisma, SQLite & Node.js (Part 1)

Ayomide olofinsawe — Wed, 14 May 2025 10:37:37 +0000

Introduction

Have you ever wanted to build your own backend API but felt overwhelmed by terms like GraphQL, resolvers, and databases? Good news — you’re in the right place!
In the world of APIs, REST has been the traditional choice for a long time. However, GraphQL is becoming increasingly popular because it allows clients to request exactly the data they need — no more, no less. Instead of multiple endpoints like in REST, GraphQL gives you one smart endpoint that does exactly what you ask.

Why Choose GraphQL Over REST?

Traditional REST APIs can become bulky with multiple endpoints and overfetching data. GraphQL solves this by providing a single flexible endpoint where clients can ask for exactly what they need, making APIs more efficient and developer-friendly. Companies like GitHub, Shopify, and Facebook use GraphQL in production for this very reason.
In this article, I'll walk you through what I learned while building my first GraphQL API using Node.js, Prisma ORM, and SQLite. Whether you're new to GraphQL or just curious about a different way to build APIs, this guide is for you.
By the end, you’ll have a running GraphQL server that can create and fetch task data — and you'll truly understand the core concepts behind it. Even better, you’ll walk away with a working backend that you can connect to a frontend project later!
Let's dive in and make backend development fun and simple!

Why This Project Matters

Learning GraphQL and Prisma together is a superpower for modern developers. Many real-world applications — from simple todo apps to complex booking systems — rely on APIs to manage and share data.
By mastering how to build a GraphQL API with Prisma, you're opening the door to:

Building fullstack applications faster
Writing cleaner and more efficient backend code
Creating projects you can showcase in your portfolio

We’ll keep it practical by building a simple "Task Manager" API. You’ll be able to:

Create new tasks
Mark them as completed
Fetch all tasks easily

What I Learned

Building this project was not just about writing code; it was about understanding the underlying architecture and how modern APIs function.

Here are some of the key lessons I took away:

GraphQL is powerful because it gives clients control. Instead of relying on multiple endpoints like REST, I could get exactly the data I needed.
Prisma makes database operations simpler. Defining models and generating a client in just a few commands felt incredibly efficient.
SQLite is perfect for quick prototyping. I didn’t need a heavy setup, and yet I had a real database running within minutes.
Error messages like Unexpected end of input are common when a script is incomplete — in my case, I had missed a closing bracket in server.js. Debugging is part of the process!
This hands-on experience helped me appreciate the modern tools that simplify backend development.

Prerequisites

Before we dive in, make sure you have the following:

Node.js installed (v16 or higher) ➔ Download Node.js
Basic command-line knowledge
Code editor (like VS Code)
Postman or GraphQL Playground installed (or VS Code extension)
(Optional) SQLite Browser ➔ Download SQLite Browser Don't worry — I’ll explain every step so you can follow along easily!

Project Setup

Create a new project folder

mkdir graphql-prisma-api
cd graphql-prisma-api

Initialize a Node.js project

npm init -y

Install dependencies

npm install express graphql express-graphql prisma @prisma/client sqlite3 dotenv

Initialize Prisma

npx prisma init

This will create a new prisma folder and a .env file.
Your project folder should now look like:
graphql-prisma-api/
├── node_modules/
├── prisma/
│ └── schema.prisma
├── .env
├── package.json
└── package-lock.json

Setting Up the Database with Prisma and SQLite

Configure your .env file Replace the value of DATABASE_URL with:

DATABASE_URL="file:./dev.db"

Define your Prisma model Replace the contents of prisma/schema.prisma with:

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "sqlite"
  url      = env("DATABASE_URL")
}

model Task {
  id          Int      @id @default(autoincrement())
  title       String
  description String
  completed   Boolean  @default(false)
  createdAt   DateTime @default(now())
}

Migrate the database and generate the client

npx prisma migrate dev --name init

This creates a dev.db SQLite file and generates the Prisma client for database access.
You now have a real database ready to store tasks!

Building the GraphQL Server

Create a server.js file

touch server.js

Add the following code to server.js

const express = require('express');
const { graphqlHTTP } = require('express-graphql');
const { buildSchema } = require('graphql');
const { PrismaClient } = require('@prisma/client');

const prisma = new PrismaClient();
const app = express();

const schema = buildSchema(`
  type Task {
    id: Int!
    title: String!
    description: String!
    completed: Boolean!
    createdAt: String!
  }

  type Query {
    getTasks: [Task!]!
  }

  type Mutation {
    createTask(title: String!, description: String!): Task!
    completeTask(id: Int!): Task!
  }
`);

const root = {
  getTasks: async () => await prisma.task.findMany(),
  createTask: async ({ title, description }) => {
    return await prisma.task.create({
      data: { title, description },
    });
  },
  completeTask: async ({ id }) => {
    return await prisma.task.update({
      where: { id },
      data: { completed: true },
    });
  },
};

app.use('/graphql', graphqlHTTP({
  schema,
  rootValue: root,
  graphiql: true,
}));

const PORT = process.env.PORT || 4000;
app.listen(PORT, () => {
  console.log(`🚀 Server running at http://localhost:${PORT}/graphql`);
});

Run the server

node server.js

Visit: http://localhost:4000/graphql

Testing the API

You can test your API using GraphiQL in your browser.

1.Fetch all tasks

{
  getTasks {
    id
    title
    description
    completed
    createdAt
  }
}

2.Create a new task

mutation {
  createTask(title: "Write article", description: "Finish GraphQL tutorial") {
    id
    title
    completed
  }
}

3.Complete a task

mutation {
  completeTask(id: 1) {
    id
    title
    completed
  }
}

Why GraphQL Is So Efficient

One of the coolest things I learned while building this project is how GraphQL optimizes data fetching. Unlike REST APIs that use multiple endpoints, GraphQL uses a single smart endpoint that serves as the parent node. From there, it dynamically moves to fetch only the specific child nodes (fields) you request.
GraphQL operates kind of like a breadth-first search algorithm, scanning through the structure of your schema tree level by level. It doesn’t fetch unnecessary data — just what you ask for. This makes responses faster, reduces payload size, and improves performance overall.

Conclusion

Congratulations! You've just built your first GraphQL API using Node.js, Prisma, and SQLite. You now have:

A working GraphQL server
A real SQLite database
The ability to create and fetch tasks

This is a great foundation for fullstack development. In the next part, we’ll improve this setup with TypeScript, Prisma validation, and real-time subscriptions.

Coming Next:
Level Up Your GraphQL Server: TypeScript, Prisma, and Real-Time Features
Thanks for reading! 🚀

Learn More

Here are some helpful resources if you'd like to dive deeper:

GraphQL Official Docs
Prisma Documentation
SQLite Documentation
Node.js Docs
Express GraphQL