I was spending 30+ minutes every morning trawling job boards for AI engineering and automation work. So I built an agent that does it for me — runs on a schedule, filters by relevance, scores each listing with Claude, and pushes everything into a clean Notion database. Here's exactly how I built it.
## The Problem
As a freelance AI engineer targeting UAE and Saudi Arabia clients, I need to stay on top of new opportunities — but manually searching Upwork, LinkedIn, and niche job boards every day is exactly the kind of repetitive, pattern-matching task AI is built for.
I wanted a system that ran overnight, so my morning started with a curated inbox rather than a raw firehose.
Requirements:
- Scrape listings from multiple sources
- Filter out irrelevant ones automatically
- Score the best matches using an LLM
- Surface results somewhere I'd actually look — Notion
## Architecture

The agent is a three-stage pipeline: fetch → score → store. GitHub Actions provides the scheduler and runtime, so there's nothing to host or maintain.

```
GitHub Actions (cron: 0 2 * * *)
        ↓
Python scraper (Upwork RSS + Apify)
        ↓
Claude API (claude-sonnet-4)
        ↓
Notion Database (scored + tagged)
```
## Step 1: Scraping Job Listings
I'm pulling from two sources: Upwork's RSS feed (yes, it still exists) and a lightweight scrape of relevant LinkedIn searches. The Upwork RSS endpoint is the easiest win — no auth, returns structured XML, and covers most of the categories I care about.
```python
import re
from dataclasses import dataclass
from typing import List

import feedparser

UPWORK_RSS = "https://www.upwork.com/ab/feed/jobs/rss?q=AI+automation+agent&sort=recency"


@dataclass
class JobListing:
    title: str
    description: str
    budget: str
    url: str
    posted_at: str


def fetch_upwork_listings() -> List[JobListing]:
    feed = feedparser.parse(UPWORK_RSS)
    listings = []
    for entry in feed.entries[:20]:  # latest 20
        listings.append(JobListing(
            title=entry.title,
            description=entry.summary[:800],
            budget=extract_budget(entry.summary),
            url=entry.link,
            posted_at=entry.published,
        ))
    return listings


def extract_budget(text: str) -> str:
    # Budget appears in Upwork RSS as "Budget: $X" or "Hourly Range"
    match = re.search(r'(Budget|Hourly Range)[:\s]+([^\n<]+)', text)
    return match.group(2).strip() if match else "Not specified"
```
**Note on scraping:** Upwork's RSS is public and rate-limit friendly. For LinkedIn I use Apify's LinkedIn Jobs Scraper actor — it handles anti-bot measures and costs about $2/month at my usage volume. Totally worth it over fighting Playwright against their detection.
## Step 2: Scoring with Claude API
This is where the agent gets its intelligence. I pass each listing to Claude with a structured prompt that asks it to score relevance (1–10), extract key details, and give a one-line reason for the score. The output is JSON so I can parse it directly into Notion fields.
```python
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

MY_PROFILE = """
Skills: Claude API, n8n, Make.com, Python, FastAPI, Supabase RAG
Experience: 16 years — fintech, maritime, fashion e-commerce, SaaS
Target: AI automation, chatbots, agentic workflows
Location: Dubai, UAE (remote preferred)
Rate: $50–120/hr
"""


def score_listing(listing: JobListing) -> dict:
    prompt = f"""You are a job relevance scorer. Given a freelancer profile and a job listing,
return a JSON object only — no preamble, no markdown fences.

Freelancer profile:
{MY_PROFILE}

Job listing:
Title: {listing.title}
Description: {listing.description}
Budget: {listing.budget}

Return this exact JSON shape:
{{
  "score": <1-10>,
  "match_reason": "<one sentence>",
  "required_skills": ["<skill1>", "<skill2>"],
  "budget_fit": "<low|good|high>",
  "action": "<apply|skip|maybe>"
}}"""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.content[0].text)
```
A few things I learned tuning this prompt:

- Ask Claude to return "JSON only — no preamble, no markdown fences" — otherwise you'll get code blocks with backticks that break `json.loads()`.
- Include your actual rate range so Claude can judge `budget_fit` meaningfully.
- Keep `max_tokens` low (300 here) — it forces concise output and cuts API cost significantly.
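Even with that instruction, models occasionally wrap output in fences anyway. A small defensive parser (my own addition, not part of the original script) strips them before handing the text to `json.loads()`:

```python
import json
import re


def parse_json_response(text: str) -> dict:
    """Parse a model response that may be wrapped in ```json ... ``` fences."""
    cleaned = text.strip()
    # Strip a leading ```json (or bare ```) fence, then a trailing ``` fence
    cleaned = re.sub(r'^```(?:json)?\s*', '', cleaned)
    cleaned = re.sub(r'\s*```$', '', cleaned)
    return json.loads(cleaned)
```

Swapping this in for the bare `json.loads()` call in `score_listing` makes the pipeline tolerant of both output styles.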
## Step 3: Pushing to Notion
I created a Notion database with properties that map directly to Claude's output fields: Score (number), Action (select), Match Reason (text), Budget (text), and a URL column.
```python
import os

from notion_client import Client

notion = Client(auth=os.environ["NOTION_TOKEN"])
DB_ID = os.environ["NOTION_DATABASE_ID"]


def push_to_notion(listing: JobListing, scored: dict):
    # Skip low-relevance listings
    if scored["score"] < 5:
        return
    notion.pages.create(
        parent={"database_id": DB_ID},
        properties={
            "Name": {"title": [{"text": {"content": listing.title}}]},
            "Score": {"number": scored["score"]},
            "Action": {"select": {"name": scored["action"].title()}},
            "Budget": {"rich_text": [{"text": {"content": listing.budget}}]},
            "Match Reason": {"rich_text": [{"text": {"content": scored["match_reason"]}}]},
            "URL": {"url": listing.url},
            "Posted": {"rich_text": [{"text": {"content": listing.posted_at}}]},
        },
    )


def run_agent():
    listings = fetch_upwork_listings()
    for listing in listings:
        scored = score_listing(listing)
        push_to_notion(listing, scored)
    print(f"Done. {len(listings)} listings processed.")


if __name__ == "__main__":
    run_agent()
```
## Step 4: Scheduling with GitHub Actions
Store the script in a GitHub repo and use Actions to run it every morning at 6am UAE time (2am UTC). Secrets — the Anthropic API key, Notion token, and database ID — live in repo Actions secrets, never in code.
```yaml
# .github/workflows/job-agent.yml
name: Job Finder Agent

on:
  schedule:
    - cron: '0 2 * * *'   # 6am UAE (UTC+4)
  workflow_dispatch:       # manual trigger for testing

jobs:
  run-agent:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install anthropic notion-client feedparser requests

      - name: Run Job Agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          NOTION_TOKEN: ${{ secrets.NOTION_TOKEN }}
          NOTION_DATABASE_ID: ${{ secrets.NOTION_DATABASE_ID }}
        run: python job_agent.py
```
**Tip:** Always add `workflow_dispatch` alongside your cron trigger. It lets you run the agent manually from the GitHub UI without pushing a dummy commit — essential when testing prompt changes.
## Results After Two Weeks
| Metric | Number |
|---|---|
| Listings processed per week | ~120 |
| Listings pushed to Notion (score ≥ 5) | 18–25 |
| Claude API cost per run | ~$0.04 |
| Time saved per morning | ~30 minutes |
Claude's scoring accuracy was the biggest surprise. It correctly identifies listings that need skills I don't have (LangChain-heavy, blockchain, etc.) and filters them out without me maintaining a keyword blocklist. Qualitative reasoning beats regex every time.
## What I'd Do Differently
### 1. Add deduplication
The same listing can appear across two consecutive runs on Upwork. A quick check against existing URLs using `notion.databases.query` before creating each page fixes this.
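A sketch of what that check could look like, with the Notion client passed in so the logic is testable. The filter shape follows the Notion API's database query endpoint, and `URL` is the property name used in Step 3:

```python
def build_url_filter(url: str) -> dict:
    """Notion database query filter matching pages whose URL property equals `url`."""
    return {"property": "URL", "url": {"equals": url}}


def already_in_notion(notion, db_id: str, url: str) -> bool:
    # `notion` is a notion_client.Client; one matching page is enough to dedupe
    results = notion.databases.query(
        database_id=db_id,
        filter=build_url_filter(url),
        page_size=1,
    )
    return len(results["results"]) > 0
```

Call `already_in_notion(notion, DB_ID, listing.url)` at the top of `push_to_notion` and return early on a hit.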
### 2. Add memory via Supabase
Storing scored listings in a vector DB would let Claude compare new listings against historically successful ones — "this looks like the Branch.io project you won." That's a proper agentic feedback loop.
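The retrieval half of that feedback loop is just nearest-neighbour search over embeddings. A toy illustration in plain Python, with made-up three-dimensional vectors standing in for real embeddings (in practice Supabase's pgvector and an embedding model would handle this):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def most_similar_won_job(new_vec: list[float], history: list[tuple]) -> tuple:
    # history: (job_title, embedding) pairs for listings that led to won projects
    return max(history, key=lambda item: cosine_similarity(new_vec, item[1]))


# Toy data: a new listing's vector compared against two past wins
history = [
    ("Branch.io-style chatbot build", [0.9, 0.1, 0.0]),
    ("Shopify theme tweaks", [0.0, 0.2, 0.9]),
]
title, _ = most_similar_won_job([0.8, 0.2, 0.1], history)
```

The title of the nearest past win would then be injected into the scoring prompt as extra context.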
### 3. Surface top picks via WhatsApp
Notion is fine but a push notification is better. A simple Make.com scenario — trigger on new Notion page with score ≥ 8, send WhatsApp message — would make this feel truly agentic.
### 4. Migrate to Claude's `tool_use`

Defining a `score_listing` tool schema gives you validated, structured output with zero parsing risk. No more praying `json.loads()` doesn't throw. I'll migrate this in v2.
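For reference, the v2 shape would look roughly like this. The `input_schema` mirrors the JSON shape from Step 2; the field names follow Anthropic's tool-use API, but treat the exact schema as a sketch rather than the final implementation:

```python
SCORE_TOOL = {
    "name": "score_listing",
    "description": "Score a job listing's relevance for the freelancer profile.",
    "input_schema": {
        "type": "object",
        "properties": {
            "score": {"type": "integer", "minimum": 1, "maximum": 10},
            "match_reason": {"type": "string"},
            "required_skills": {"type": "array", "items": {"type": "string"}},
            "budget_fit": {"type": "string", "enum": ["low", "good", "high"]},
            "action": {"type": "string", "enum": ["apply", "skip", "maybe"]},
        },
        "required": ["score", "match_reason", "budget_fit", "action"],
    },
}

# Usage (sketch): pass tools=[SCORE_TOOL] and
# tool_choice={"type": "tool", "name": "score_listing"} to client.messages.create,
# then read the already-validated dict from the tool_use content block:
#   scored = next(b.input for b in response.content if b.type == "tool_use")
```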
## Final Thoughts
The whole agent is under 150 lines of Python. It runs for free on GitHub's hosted runners. The API cost is negligible. And it saves 30 minutes every morning.
The bigger lesson: you don't need LangChain, CrewAI, or a complex framework to build a useful agent. A loop, an LLM call, and a reliable output store is often enough. Start simple, then add complexity only when you hit a real limitation.
I'm a Dubai-based AI engineer and automation consultant. I build agentic workflows, RAG pipelines, and AI-powered tools for clients across UAE and Saudi Arabia. Follow me for more practical AI engineering content.