Nakul Dev
I Built a Bot That Keeps My Resume Always Up to Date on GitHub

I write my resume in LaTeX on Overleaf. Every time I edited it, I had to manually compile it, download the PDF, and push it to my GitHub repo so my portfolio website could link to it. After doing this one too many times, I decided to automate the whole thing.

Here's the full story - the scraper, the GitHub Actions workflow, the bugs I hit, and how I eventually wired it to my portfolio site.


The Problem

My portfolio at nakuldev.vercel.app links directly to my resume PDF. For that link to always point to the latest version, I'd have to:

  1. Open Overleaf, compile, download
  2. Replace the old PDF in my repo
  3. Commit and push

Boring. Repetitive. Easy to forget. So I automated it.


The Plan

  1. Write a Node.js script that opens my Overleaf share link in a headless browser
  2. Find the PDF download link in the DOM
  3. Download the PDF and save it locally
  4. Run this on a schedule via GitHub Actions
  5. Auto-commit and push the new PDF back to the repo

Step 1 - Scraping Overleaf with Playwright

Overleaf is a React app - you can't just fetch() it and parse the HTML. The download link only appears after the project compiles, which happens client-side. So I needed a real browser.

I went with Playwright for this.

npm install playwright
npx playwright install chromium

The download button in Overleaf's DOM looks like this:

<a href="/download/project/69b65297.../output/output.pdf?..." 
   aria-label="Download PDF">

So the selector I needed was:

a[href^="/download/project"]

Here's the core scraping logic:

// Launch a headless browser and open the shared project
import { chromium } from "playwright";

const browser = await chromium.launch();
const context = await browser.newContext();
const page = await context.newPage();

await page.goto("https://www.overleaf.com/read/YOUR_SHARE_LINK", {
  waitUntil: "networkidle",
  timeout: 60_000
});

await page.waitForSelector('a[href^="/download/project"]', {
  state: "attached",  // important - element lives inside a closed dropdown
  timeout: 90_000,
});

const relativeHref = await page.evaluate(() => {
  const links = Array.from(
    document.querySelectorAll('a[href^="/download/project"]')
  );
  const pdfLink = links.find((el) => el.href.includes("output.pdf"));
  return pdfLink
    ? new URL(pdfLink.href).pathname + new URL(pdfLink.href).search
    : links[0]?.getAttribute("href") ?? null;
});

const fullUrl = `https://www.overleaf.com${relativeHref}`;

Then download it using the browser's session cookies (so Overleaf accepts the request):

const cookies = await context.cookies();
await downloadFile(fullUrl, destPath, cookies);

Bug I hit: waitForSelector timing out

My first attempt used the default visible state:

await page.waitForSelector('a[href^="/download/project"]'); // ❌ timed out

The error log showed:

180 × locator resolved to 2 elements. Proceeding with the first one

The element existed but was hidden inside a closed dropdown menu. It was never going to become visible. The fix was switching to state: "attached" - which only requires the element to be in the DOM, not visible on screen.


Step 2 - Saving to Two Places

I wanted two things from each run:

  • scrapped/resume01.pdf - a numbered archive (auto-deleted after 7 days)
  • Nakul_Dev_M_V_Resume.pdf in the root - always the latest, fixed filename, safe to link to

// Numbered archive
const pdfName = nextPdfName(); // resume01.pdf, resume02.pdf...
await downloadFile(fullUrl, path.join(SCRAPPED_DIR, pdfName), cookies);

// Fixed root copy
const rootPdfPath = path.join(ROOT_DIR, "Nakul_Dev_M_V_Resume.pdf");
if (fs.existsSync(rootPdfPath)) fs.unlinkSync(rootPdfPath);
fs.copyFileSync(path.join(SCRAPPED_DIR, pdfName), rootPdfPath);

Cleanup for files older than 7 days:

function cleanOldScrapped() {
  const now = Date.now();
  const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;

  fs.readdirSync(SCRAPPED_DIR)
    .filter((f) => /^resume\d+\.pdf$/i.test(f))
    .forEach((f) => {
      const filePath = path.join(SCRAPPED_DIR, f);
      if (now - fs.statSync(filePath).mtimeMs > ONE_WEEK_MS) {
        fs.unlinkSync(filePath);
      }
    });
}

Step 3 - GitHub Actions Workflow

The local script worked great. Now I needed to run it automatically in CI.

name: Overleaf PDF Scraper

on:
  schedule:
    - cron: "0 6 * * *"     # every day at 06:00 UTC
  workflow_dispatch:         # manual trigger via Actions tab
  repository_dispatch:
    types: [visitor-trigger] # triggered via API (more on this below)

permissions:
  contents: write

jobs:
  scrape-and-push:
    runs-on: ubuntu-latest
    timeout-minutes: 10

    steps:
      - uses: actions/checkout@v4.2.2
        with:
          fetch-depth: 0

      - uses: actions/setup-node@v4.4.0
        with:
          node-version: "20"
          cache: "npm"

      - run: npm ci
      - run: npx playwright install --with-deps chromium
      - run: node scraper.js

      - name: Commit and push
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add Nakul_Dev_M_V_Resume.pdf scrapped/ History.json
          if git diff --cached --quiet; then
            echo "Nothing new to commit."
          else
            git commit -m "feat: update resume [$(date -u '+%Y-%m-%d %H:%M UTC')]"
            git push
          fi

Important: enable write permissions

By default GitHub Actions can't push to your repo. Go to:
Settings → Actions → General → Workflow permissions → Read and write permissions


Step 4 - Triggering From My Portfolio

The daily cron covers most cases, but I also wanted to trigger a fresh scrape whenever someone visits my portfolio - so they always get the absolute latest version.

The challenge: you can't call the GitHub API directly from frontend JavaScript because you'd have to expose your token. The solution: a Next.js API route that acts as a secure middleman.

Visitor loads portfolio
  → calls /api/trigger (server-side)
  → API route calls GitHub with the secret token
  → GitHub Actions fires

The API route at app/api/trigger/route.js:

const COOLDOWN_HOURS = 6;
let lastTriggeredAt = null;

export async function POST() {
  const now = Date.now();
  const cooldownMs = COOLDOWN_HOURS * 60 * 60 * 1000;

  // Don't trigger more than once every 6 hours
  if (lastTriggeredAt && now - lastTriggeredAt < cooldownMs) {
    return Response.json({ ok: false, reason: "cooldown" }, { status: 429 });
  }

  await fetch("https://api.github.com/repos/nakuldevmv/Resume/dispatches", {
    method: "POST",
    headers: {
      Authorization: `token ${process.env.GITHUB_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ event_type: "visitor-trigger" }),
  });

  lastTriggeredAt = now;
  return Response.json({ ok: true });
}
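One caveat worth knowing: on serverless hosts like Vercel, a module-level variable such as `lastTriggeredAt` only lives as long as that serverless instance, so this cooldown is best-effort rather than a hard guarantee. Factoring the check into a pure helper at least makes the logic easy to test (the function name is my own, not from the original route):

```javascript
// Pure cooldown check: should a new scrape be triggered?
// lastMs may be null (never triggered before in this instance).
function shouldTrigger(lastMs, nowMs, cooldownHours) {
  if (lastMs === null) return true;
  return nowMs - lastMs >= cooldownHours * 60 * 60 * 1000;
}
```

For a hard cross-instance limit you'd persist the timestamp in an external store instead of module state, but for rate-limiting a daily resume scrape, best-effort is plenty.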

Called from the portfolio page:

useEffect(() => {
  fetch("/api/trigger", { method: "POST" });
}, []);

The token lives in .env.local and never touches the browser:

GITHUB_TOKEN=ghp_xxxxxxxxxxxx

The repository_dispatch trigger in the workflow listens for visitor-trigger - the type in the workflow must exactly match the event_type sent by the API route.


Step 5 - Failure Notifications

If Overleaf goes down or the compile fails and the download link isn't found, I wanted to know immediately. The simplest option: GitHub's built-in email notifications.

Go to: GitHub profile → Settings → Notifications → Actions → Failed workflows

That's it. Zero config, instant email on any workflow failure.

For something more immediate (like Telegram), you can add this step at the end of the workflow:

- name: Notify on failure
  if: failure()
  run: |
    curl -s -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_TOKEN }}/sendMessage" \
      -d chat_id="${{ secrets.TELEGRAM_CHAT_ID }}" \
      -d text="❌ Resume scraper failed - check the Actions tab."

The Final Repo Structure

Resume/
├── .github/workflows/scrape.yml   # the workflow
├── scrapped/                      # daily archive (7-day TTL)
│   ├── resume01.pdf
│   └── resume02.pdf
├── scraper.js                     # the scraper
├── package.json
├── History.json                   # download log
└── Nakul_Dev_M_V_Resume.pdf       # ← always the latest

History.json logs every run; each entry looks like this:

{
  "timestamp": "2026-03-15T06:02:41.000Z",
  "scrappedFile": "scrapped/resume01.pdf",
  "rootFile": "Nakul_Dev_M_V_Resume.pdf",
  "link": "https://www.overleaf.com/download/project/..."
}

Result

  • ✅ Resume auto-updates every day at 06:00 UTC
  • ✅ Portfolio visitors can trigger a fresh scrape via the API route (with cooldown)
  • ✅ Root PDF always has a fixed, linkable filename
  • ✅ Daily archive kept for 7 days, then cleaned up
  • ✅ Email alert if anything breaks

Direct link to my latest resume:
👉 nakuldevmv.github.io/Resume/Nakul_Dev_M_V_Resume.pdf

Full source code:
👉 github.com/nakuldevmv/Resume

Portfolio:
👉 nakuldev.vercel.app


If you also write your resume in LaTeX on Overleaf, you can fork the repo, swap out the share URL and PDF filename in scraper.js, and have this running for yourself in under 10 minutes.
