Sam Chen

Posted on Jun 30

Building Your Own AI News Digest: A Developer’s Tutorial

#ai #tutorial #python #automation

You’re a developer. You know AI is moving fast—new models, frameworks, papers, and tools drop daily. Trying to keep up by scanning Twitter, Reddit, and a dozen newsletters is a full-time job. What if you could build a personalized, automated AI news digest that pulls the stories you actually care about, summarizes them, and delivers them to your inbox or Slack?

In this tutorial, I’ll show you how to build exactly that using Python, a few free APIs, and an LLM (like GPT or a local model). No fluff, just code and a system you can deploy in an afternoon.

Why Build Your Own?

Curated, not noisy – Filter by topics you care about (e.g., “LLMs”, “computer vision”, “AI safety”).
Summarized – Skip the clickbait; get the core insight in a sentence or two.
Automated – Runs daily on GitHub Actions (or your server) and sends the digest to you.
Extensible – Want to add sentiment analysis, source scoring, or local summarization? Go ahead.

What You’ll Need

Python 3.9+
A NewsAPI key (free tier: 100 requests/day) – or you can use RSS feeds.
An OpenAI API key (or any LLM endpoint; we’ll use GPT‑3.5‑turbo as an example).
Optional: a Telegram bot token or SMTP credentials for delivery.

Step 1: Fetching the Latest AI News

We’ll use NewsAPI to query for recent articles that mention artificial intelligence, machine learning, or LLMs. You can also pull from arXiv, Hacker News, or RSS, but the API is the simplest place to start.

import os
import requests
from datetime import datetime, timedelta

# Read the key from the environment rather than hardcoding it
NEWS_API_KEY = os.environ.get("NEWS_API_KEY", "your_key_here")

def fetch_ai_news():
    url = "https://newsapi.org/v2/everything"
    params = {
        "q": "artificial intelligence OR machine learning OR LLM",
        # Only articles published since yesterday
        "from": (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d"),
        "sortBy": "popularity",
        "language": "en",
        "pageSize": 20,
        "apiKey": NEWS_API_KEY
    }
    # A timeout keeps a scheduled run from hanging on a stalled connection
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    articles = response.json().get("articles", [])
    # Deduplicate by title (simple); skip entries with no title
    seen = set()
    unique = []
    for art in articles:
        title = art.get("title")
        if title and title not in seen:
            seen.add(title)
            unique.append(art)
    return unique

Output: a list of dicts with title, description, url, source, and publishedAt.
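If you would rather skip NewsAPI, the RSS route can produce the same list-of-dicts shape. Here is a minimal sketch using only the standard library, assuming a plain RSS 2.0 feed. The function name parse_rss_items and the sample feed are made up for illustration, and real feeds (Atom, namespaced extensions) would need more handling.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(xml_text, source_name="rss"):
    """Convert RSS 2.0 <item> elements into NewsAPI-style article dicts."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        articles.append({
            "title": (item.findtext("title") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            "url": (item.findtext("link") or "").strip(),
            "source": {"name": source_name},
            "publishedAt": (item.findtext("pubDate") or "").strip(),
        })
    return articles

# Made-up feed content, only here so the example runs offline
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example feed</title>
<item><title>New open-weights LLM released</title>
<link>https://example.com/llm</link>
<description>A lab publishes model weights.</description>
<pubDate>Mon, 01 Jan 2024 08:00:00 GMT</pubDate></item>
</channel></rss>"""

articles = parse_rss_items(SAMPLE_FEED, source_name="example")
print(articles[0]["title"])
```

In practice you would download the feed text with requests.get(feed_url, timeout=10).text and pass it in; the rest of the pipeline does not need to know where the dicts came from.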

Step 2: (Optional) Score and Filter by Relevance

Not every article is worth your time. Let’s rank them with a simple keyword count or, better, embedding similarity to your interests.

For a lightweight filter, compute a “relevance score” by counting how many of your keywords appear in the title + description:

KEYWORDS = ["transformer", "GPT", "PyTorch", "fine-tuning", "RAG", "diffusion", "agent"]

def relevance_score(article):
    # description can be None in API responses, so fall back to ""
    text = f"{article['title']} {article.get('description') or ''}".lower()
    # Lowercase each keyword too; otherwise "GPT" never matches the lowercased text
    return sum(1 for kw in KEYWORDS if kw.lower() in text)

# Keep only the top 10 most relevant
def filter_articles(articles, top_k=10):
    scored = sorted(articles, key=relevance_score, reverse=True)
    return [a for a in scored if relevance_score(a) > 0][:top_k]

Pro tip: For a more sophisticated filter, use sentence-transformers to compare article embeddings with your own interest vector. But that’s a whole separate post – keep it simple first.
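To give a feel for the similarity-based approach without pulling in a model, here is a rough sketch that swaps the embeddings for plain word-count vectors and compares them with cosine similarity. It is a stand-in, not sentence-transformers: INTEREST_PROFILE, tokenize, cosine, and similarity_score are hypothetical names, and a real embedding model would capture meaning rather than word overlap.

```python
import math
import re
from collections import Counter

# Hypothetical description of your interests; edit to taste
INTEREST_PROFILE = "fine-tuning open LLM agents retrieval RAG transformer inference"

def tokenize(text):
    """Lowercase and split into alphanumeric tokens (hyphens kept inside words)."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def cosine(a, b):
    """Cosine similarity between two Counter word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_score(article, profile=INTEREST_PROFILE):
    """Score an article by word overlap with the interest profile (0.0 to 1.0)."""
    text = f"{article.get('title') or ''} {article.get('description') or ''}"
    return cosine(Counter(tokenize(text)), Counter(tokenize(profile)))

on_topic = {"title": "Fine-tuning a transformer for RAG", "description": "LLM agents"}
off_topic = {"title": "Stock markets rally", "description": "Investors cheer earnings"}
print(similarity_score(on_topic) > similarity_score(off_topic))
```

Because this matches whole tokens, a keyword like “rag” no longer fires on words such as “storage”, which plain substring checks would count. You could pass similarity_score as the sort key in place of the keyword count.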

Step 3: Summarize with an LLM

Now the fun part. We’ll send each article’s title and description to an LLM and ask for a one‑sentence summary. This makes your digest dense and scannable.

from openai import OpenAI

# Requires openai>=1.0; the client reads OPENAI_API_KEY from the environment
client = OpenAI()

def summarize_article(article):
    prompt = f"""
Summarize the following AI news article in one clear, factual sentence.
Focus on the key insight for a developer.

Title: {article['title']}
Description: {article.get('description') or '(No description)'}

Summary:
"""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
        max_tokens=60
    )
    return response.choices[0].message.content.strip()

Why this works: The LLM condenses the noise into the signal. Even an inexpensive model like gpt-3.5-turbo-0125 does a decent job for under $0.001 per article.

Warning: For 10 articles, you’re looking at ~$0.01 per run. Use a local model (e.g., Phi-3-mini via Ollama) if you want zero cost and total privacy.
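For the local route, a sketch along these lines might work. It assumes Ollama is running on its default local port and that a model tagged phi3:mini has already been pulled; both are assumptions about your setup. The names post_json and summarize_locally are hypothetical. The send parameter is there so the function can be tried without a server.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port
OLLAMA_URL = "http://localhost:11434/api/generate"

def post_json(url, payload, timeout=120):
    """POST a JSON body and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)

def summarize_locally(article, model="phi3:mini", send=post_json):
    """Ask a local model for a one-sentence summary.

    `send` is injectable so the function can be exercised without a server.
    """
    prompt = (
        "Summarize the following AI news article in one clear, factual sentence.\n"
        f"Title: {article['title']}\n"
        f"Description: {article.get('description') or '(No description)'}\n"
        "Summary:"
    )
    # stream=False asks for a single JSON object instead of a token stream
    payload = {"model": model, "prompt": prompt, "stream": False}
    return send(OLLAMA_URL, payload).get("response", "").strip()
```

With the server up, summarize_locally(article) can stand in for the hosted call; expect it to be slower per article on a laptop.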

Step 4: Format the Digest

I like Markdown – it’s clean, and I can drop it into an email or a GitHub Issue. Here’s a simple template:

def build_digest_md(articles_summaries):
    lines = ["# 🤖 AI News Digest", f"**{datetime.now().strftime('%A, %B %d, %Y')}**\n"]
    for title, summary, url in articles_summaries:
        lines.append(f"- **{title}**")
        lines.append(f"  {summary}")
        lines.append(f"  [[Read more]]({url})\n")
    return "\n".join(lines)

Example output:

- OpenAI Launches GPT-4o Mini
  OpenAI releases a smaller, cheaper model for developers, with vision and improved latency.
  [Read more]
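build_digest_md expects (title, summary, url) tuples, which the earlier steps do not produce directly. One way to bridge the gap, as a sketch: summarize_all is a hypothetical helper, and falling back to the article’s own description when a summary fails is my assumption about reasonable behavior, not something the steps above prescribe.

```python
def summarize_all(articles, summarize):
    """Build (title, summary, url) tuples, one per article.

    `summarize` is any callable that maps an article dict to a string,
    for example the summarize_article function from Step 3.
    """
    rows = []
    for article in articles:
        try:
            summary = summarize(article)
        except Exception as exc:  # an API hiccup should not sink the whole digest
            print(f"Summary failed for {article.get('title')!r}: {exc}")
            summary = article.get("description") or "(No summary available)"
        rows.append((article["title"], summary, article["url"]))
    return rows

# Demo with a stub summarizer so the example runs without an API key
demo_articles = [
    {"title": "Example headline", "description": "Example text.", "url": "https://example.com/a"}
]
rows = summarize_all(demo_articles, lambda a: a["description"].upper())
print(rows)
```

With the real functions, the whole pipeline becomes build_digest_md(summarize_all(filter_articles(fetch_ai_news()), summarize_article)).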

Step 5: Deliver the Digest

You have several options:

Print to console – for local testing.
Email – use smtplib with Gmail or SendGrid.
Slack webhook – just POST a JSON payload.
GitHub Issue – create an issue daily in your private repo (great for free hosting!).

I’ll show a simple email version:

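import os
import smtplib
from email.mime.text import MIMEText

def send_email(subject, body, to_email="you@example.com"):
    # Send as text/plain: most mail clients do not render a text/markdown part
    msg = MIMEText(body, "plain", "utf-8")
    msg["Subject"] = subject
    msg["From"] = "digest-bot@example.com"
    msg["To"] = to_email

    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        # Credentials come from the environment (SMTP_USER / SMTP_PASSWORD are placeholder names)
        server.login(os.environ["SMTP_USER"], os.environ["SMTP_PASSWORD"])
        server.send_message(msg)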

(Use an app‑specific password for Gmail; never hardcode secrets – use environment variables.)
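The Slack option from the list above is about as short. This is a hedged sketch, assuming you have created an incoming webhook and stored its URL in an environment variable. SLACK_WEBHOOK_URL is a placeholder name, as are the two function names. Slack’s message text uses its own mrkdwn flavor, so bold is written with single asterisks; the converter below handles only that one case.

```python
import json
import os
import re
import urllib.request

def to_slack_payload(digest_md):
    """Wrap the digest in the JSON body an incoming webhook expects."""
    # Slack's mrkdwn marks bold with *single* asterisks, not **double**
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", digest_md)
    return {"text": text}

def send_to_slack(digest_md, webhook_url=None):
    """POST the digest to a Slack incoming webhook."""
    webhook_url = webhook_url or os.environ["SLACK_WEBHOOK_URL"]
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(to_slack_payload(digest_md)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # urlopen raises an HTTPError on a non-2xx reply
    with urllib.request.urlopen(request, timeout=10) as reply:
        return reply.status

print(to_slack_payload("- **Example headline**\n  One-line summary."))
```

Markdown links would also need converting to Slack’s <url|text> form if you want them clickable with custom labels; bare URLs are linked automatically.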

Step 6: Automate with GitHub Actions

Create .github/workflows/digest.yml:


name: Daily AI Digest
on:

DEV Community