Over the past few months, I have been building a small AI news brief called DeepSignal.
The idea started from a simple personal frustration:
I was reading X, Hacker News, arXiv, OpenAI and Anthropic blogs, product launch pages, newsletters, and company updates every day, but still felt like I was either missing important AI news or wasting time on low-signal updates.
So I built a small system that does three things:
- Collects AI-related updates from multiple sources
- Scores each story with a transparent 0–100 signal score
- Publishes a daily and weekly brief
The product is not technically complex, but the workflow taught me a lot about building AI-assisted content products, SEO for dynamic sites, and the difference between summarizing information and filtering information.
This is a breakdown of the stack, architecture, and lessons learned.
The stack
The current stack is intentionally simple:
Frontend: Next.js 15
Database: Supabase
Hosting: Vercel
AI processing: GPT-4o-mini
Content model: Articles, sources, tags, guides, weekly briefs
SEO: sitemap, canonical URLs, RSS, structured pages
I wanted to keep the system cheap and easy to maintain because this is a solo project.
The rough monthly cost is still low. Vercel handles deployment and hosting, Supabase handles the database, and GPT-4o-mini is used for scoring and classification rather than heavy generation.
The main goal was not to build a complicated AI pipeline.
The goal was to build a reliable workflow that could turn noisy inputs into useful outputs.
The basic architecture
The system has a simple flow:
Sources
↓
Fetch / import
↓
Normalize article data
↓
AI relevance check
↓
Signal scoring
↓
Tagging and categorization
↓
Publish article pages
↓
Generate daily / weekly briefs
↓
Expose guides, RSS, sitemap
At a high level, each story becomes a structured object:
type Article = {
  id: string;
  title: string;
  url: string;
  source: string;
  summary: string;
  publishedAt: string;
  aiRelevanceScore: number;
  signalScore: number;
  tags: string[];
  category: string;
  canonicalUrl: string;
  isIndexable: boolean;
};
The most important field is not the summary.
It is isIndexable.
That one field ended up being more important than I expected.
Why filtering matters more than summarizing
At first, I thought the main problem was summarization.
Take a long article, summarize it, and users save time.
But after building the first version, I realized summarization alone does not solve the real problem.
A summary tells you:
What does this article say?
But users usually need to know:
Should I care?
Why does this matter?
Is this actually about AI?
Is this a durable signal or just a temporary headline?
Is this more important than the other 50 updates today?
That changed the product direction.
Instead of only generating summaries, the system needed to decide what should be included, ranked, grouped, and excluded.
For an AI news product, filtering is not a minor feature.
Filtering is the product.
The signal score
Each story gets a 0–100 signal score.
The score is not meant to be perfect. It is a transparent ranking system that helps explain why a story may matter.
A story can score higher based on signals like:
- source quality
- AI relevance
- novelty
- technical depth
- business impact
- research importance
- company importance
- cross-source confirmation
- relevance to builders, researchers, or operators
A simplified scoring idea looks like this:
// Each factor is assumed to be on a 0–100 scale.
type ScoreInput = {
  sourceWeight: number;
  aiRelevance: number;
  novelty: number;
  technicalDepth: number;
  marketImpact: number;
  researchValue: number;
  companyImportance: number;
};

// Weighted sum of the factors. The weights add up to 1.0,
// so the result stays on the same scale as the inputs.
function calculateSignalScore(input: ScoreInput): number {
  const score =
    input.sourceWeight * 0.15 +
    input.aiRelevance * 0.25 +
    input.novelty * 0.15 +
    input.technicalDepth * 0.15 +
    input.marketImpact * 0.1 +
    input.researchValue * 0.1 +
    input.companyImportance * 0.1;
  // Clamp to 0–100 in case an input falls outside the expected range.
  return Math.round(Math.min(100, Math.max(0, score)));
}
The exact formula can change, but the principle matters:
I wanted users to feel that the ranking had a visible logic, not just a black-box AI label.
That was one of the biggest lessons:
A simple transparent scoring system can be more trustworthy than a more complex but invisible AI ranking.
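One way to make that logic visible is to return the per-factor contributions along with the final number. The sketch below is an illustration, not the production code: it reuses the weights from the simplified formula, and the names `WEIGHTS`, `FactorScores`, and `explainSignalScore` are hypothetical.

```typescript
// Weights copied from the simplified formula; they sum to 1.
const WEIGHTS = {
  sourceWeight: 0.15,
  aiRelevance: 0.25,
  novelty: 0.15,
  technicalDepth: 0.15,
  marketImpact: 0.1,
  researchValue: 0.1,
  companyImportance: 0.1,
} as const;

type Factor = keyof typeof WEIGHTS;

// Assumption: every factor is scored on a 0-100 scale.
type FactorScores = Record<Factor, number>;

type Contribution = { factor: Factor; points: number };

// Returns the clamped, rounded score plus each factor's weighted
// contribution, sorted from largest to smallest.
function explainSignalScore(input: FactorScores): {
  score: number;
  contributions: Contribution[];
} {
  const contributions = (Object.keys(WEIGHTS) as Factor[])
    .map((factor) => ({ factor, points: input[factor] * WEIGHTS[factor] }))
    .sort((a, b) => b.points - a.points);

  const total = contributions.reduce((sum, c) => sum + c.points, 0);
  const score = Math.round(Math.min(100, Math.max(0, total)));
  return { score, contributions };
}
```

The sorted `contributions` array could feed a short "why this score" line on each article page, for example by showing the top two or three factors.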
Using GPT-4o-mini
I use GPT-4o-mini mostly for classification, scoring support, and short summaries.
The AI tasks are intentionally narrow:
- Is this article actually AI-related?
- What category does it belong to?
- What are the key takeaways?
- Is the story relevant to models, agents, research, hardware, infrastructure, regulation, or adoption?
- What tags should it receive?
- What score explanation should be shown?
I try not to use AI as a generic content generator.
Instead, I use it as a structured processing layer.
A simplified prompt pattern looks like this:
You are classifying an AI industry news article.
Return JSON only.
Evaluate:
1. AI relevance from 0 to 100
2. Signal strength from 0 to 100
3. Primary category
4. 3 to 5 tags
5. One-sentence reason why this story matters
6. Whether this story should be indexable for search
Article:
Title: ...
Source: ...
Excerpt: ...
URL: ...
The important part is forcing structured output.
For this kind of workflow, predictable JSON is more useful than beautifully written prose.
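Structured output is only predictable if it is checked before it reaches the database. Below is a minimal validation sketch using no external libraries. The JSON field names (`aiRelevance`, `signalStrength`, `category`, `tags`, `reason`, `indexable`) are assumptions chosen to match the six items in the prompt; the prompt above does not fix them.

```typescript
// Hypothetical shape of the model's JSON reply.
type Classification = {
  aiRelevance: number;
  signalStrength: number;
  category: string;
  tags: string[];
  reason: string;
  indexable: boolean;
};

// True only for finite numbers in the 0-100 range.
function isScore(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isFinite(value) &&
    value >= 0 &&
    value <= 100
  );
}

// Parses the raw model reply and returns null if it is not valid JSON
// or does not match the expected shape.
function parseClassification(raw: string): Classification | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { aiRelevance, signalStrength, category, tags, reason, indexable } =
    data as Record<string, unknown>;

  if (
    !isScore(aiRelevance) ||
    !isScore(signalStrength) ||
    typeof category !== "string" ||
    !Array.isArray(tags) ||
    tags.length < 3 ||
    tags.length > 5 ||
    !tags.every((t) => typeof t === "string") ||
    typeof reason !== "string" ||
    typeof indexable !== "boolean"
  ) {
    return null;
  }

  return {
    aiRelevance,
    signalStrength,
    category,
    tags: tags as string[],
    reason,
    indexable,
  };
}
```

A reply that fails validation returns `null`, so the caller can retry the request or skip the story instead of storing a half-formed record.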
Supabase data model
The database is simple.
Core tables:
articles
sources
tags
article_tags
daily_briefs
weekly_briefs
guides
guide_articles
The articles table stores the normalized content.
The sources table stores source metadata and source quality.
The tags table keeps topic structure clean.
The guides table is for evergreen topic pages, such as:
AI agents
AI coding tools
AI research papers
OpenAI updates
Anthropic Claude updates
NVIDIA AI chips
AI hardware
This guide layer became important later for SEO.
A chronological feed is useful for freshness, but guide pages are better for long-term search and topic authority.
Next.js page structure
The site uses a few main page types:
/                  Homepage
/articles/[slug]   Individual article pages
/guides            Guide index
/guides/[slug]     Evergreen topic pages
/weekly            Weekly AI brief
/tags/[slug]       Core topic pages
/sources/[slug]    Selected source pages
Not every page deserves to be indexed.
That became one of the most important SEO decisions.
SEO lesson: not every page should be in the sitemap
Early on, I made the mistake of thinking that more indexed pages would be better.
They were not.
When a site has too many low-quality, thin, duplicate, or off-topic pages, search engines can get confused about what the site is actually about.
For an AI news site, this matters a lot because source feeds can easily include AI-adjacent but irrelevant content.
So I added stricter sitemap rules.
The sitemap should include:
- homepage
- about page
- guides index page
- high-quality guide pages
- weekly brief
- selected high-quality article pages
- selected core tag pages
The sitemap should not include:
- saved pages
- subscribe pages
- internal API routes
- search result pages
- parameter URLs
- low-quality tag pages
- non-AI articles
- thin source pages
- duplicate daily feed pages
The rule I use now is simple:
Only put a URL in the sitemap if it is:
- canonical
- indexable
- useful as a search landing page
- relevant to the core AI topic
- not thin or duplicated
This helped clean up the site’s search profile.
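The inclusion rule can be expressed as a small filter over candidate pages. This is a sketch under assumptions: the field names follow the `Article` type shown earlier, while the two thresholds and the function names are hypothetical and chosen only for illustration.

```typescript
// Fields borrowed from the Article type shown earlier.
type SitemapCandidate = {
  canonicalUrl: string;
  isIndexable: boolean;
  aiRelevanceScore: number;
  signalScore: number;
};

// Assumed cut-offs; the real values would need tuning.
const MIN_AI_RELEVANCE = 70;
const MIN_SIGNAL_SCORE = 60;

function shouldIncludeInSitemap(page: SitemapCandidate): boolean {
  // new URL() throws on relative or malformed URLs, so canonicalUrl
  // is assumed to be an absolute URL.
  const url = new URL(page.canonicalUrl);
  // Parameter URLs and fragment URLs are never treated as canonical here.
  const hasCleanUrl = url.search === "" && url.hash === "";
  return (
    page.isIndexable &&
    hasCleanUrl &&
    page.aiRelevanceScore >= MIN_AI_RELEVANCE &&
    page.signalScore >= MIN_SIGNAL_SCORE
  );
}

// Keeps only qualifying pages and drops repeated canonical URLs.
function buildSitemapUrls(pages: SitemapCandidate[]): string[] {
  const seen = new Set<string>();
  const urls: string[] = [];
  for (const page of pages) {
    if (!shouldIncludeInSitemap(page) || seen.has(page.canonicalUrl)) continue;
    seen.add(page.canonicalUrl);
    urls.push(page.canonicalUrl);
  }
  return urls;
}
```

Keeping the decision in one function means the sitemap, the robots meta tag, and any internal audit script can all share the same rule.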
Canonical URLs and UTM links
For promotion, I use UTM links like:
https://ai-deep-signal.com/weekly?utm_source=x&utm_medium=social&utm_campaign=weekly
or:
https://ai-deep-signal.com/?utm_source=reddit&utm_medium=social&utm_campaign=launch
But the canonical URL must always point to the clean version:
https://ai-deep-signal.com/weekly
https://ai-deep-signal.com/
That avoids turning campaign URLs into duplicate SEO pages.
For a dynamic site, this is easy to overlook.
Tracking URLs are for analytics.
Canonical URLs are for search engines.
They should not be mixed.
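A small helper can derive the clean URL from whatever URL a visitor arrived on. This sketch uses the standard `URL` API and assumes that only `utm_*` parameters and the fragment should be dropped; a site with other tracking parameters would need to extend the check. The function name `toCanonicalUrl` is hypothetical.

```typescript
// Strips utm_* query parameters and the fragment from an absolute URL.
// Other query parameters are left untouched (an assumption of this sketch).
function toCanonicalUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of Array.from(url.searchParams.keys())) {
    if (key.toLowerCase().startsWith("utm_")) {
      url.searchParams.delete(key);
    }
  }
  url.hash = "";
  return url.toString();
}
```

In the Next.js App Router, the result could be passed to the `alternates.canonical` field of the page metadata so that every campaign variant declares the same canonical URL.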
Why I added guides and weekly briefs
The first version of the site was mostly a feed.
That worked, but it had a problem:
Feeds are good for browsing.
Guides are better for understanding.
So I added topic-based guides and a weekly brief.
The weekly page is for people who want a quick summary of what mattered this week.
The guide pages are for evergreen themes that should grow over time.
For example:
/guides/what-are-ai-agents
/guides/best-ai-coding-agents
/guides/ai-research-papers-this-week
/guides/nvidia-ai-chip-news
/guides/openai-news
This gives the site a more stable structure:
Homepage
↓
Guides
↓
Topic pages
↓
Related articles
That structure is much better than only having a reverse-chronological feed.
Deployment on Vercel
Vercel is a good fit for this kind of project because most of the site is content-oriented.
The project benefits from:
- fast deployments
- preview deployments
- automatic HTTPS
- good Next.js support
- serverless functions for lightweight API work
- ISR / caching options
But I avoid using Vercel for heavy background work.
If the project grows, I would move heavier jobs to a separate worker or queue system.
For now, Vercel + Supabase is enough.
What I would improve next
There are still many things I would improve.
Better deduplication
AI news often appears in multiple places. The same story can show up as a company blog post, a tweet thread, a newsletter item, and a Hacker News discussion.
Better clustering would make the brief cleaner.
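As a starting point, stories could be grouped by title similarity. The sketch below is one possible approach, not what the site currently does: it uses greedy clustering with Jaccard similarity over title words, and the `Story` shape, the 0.6 threshold, and the function names are all assumptions.

```typescript
// Hypothetical minimal story shape for clustering.
type Story = { title: string; url: string };

// Lowercases the title, strips punctuation, and drops very short words.
function titleTokens(title: string): Set<string> {
  const words = title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 2);
  return new Set(words);
}

// Jaccard similarity: shared words divided by total distinct words.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((word) => {
    if (b.has(word)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Greedy clustering: each story joins the first cluster whose first
// story has a similar enough title, otherwise it starts a new cluster.
function clusterStories(stories: Story[], threshold = 0.6): Story[][] {
  const clusters: { tokens: Set<string>; items: Story[] }[] = [];
  for (const story of stories) {
    const tokens = titleTokens(story.title);
    const match = clusters.find((c) => jaccard(c.tokens, tokens) >= threshold);
    if (match) {
      match.items.push(story);
    } else {
      clusters.push({ tokens, items: [story] });
    }
  }
  return clusters.map((c) => c.items);
}
```

Title overlap misses stories that are worded very differently, so a fuller version would probably also compare linked URLs or embeddings. The size of each cluster could double as a cross-source confirmation signal.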
Better source weighting
Not all sources should have equal authority. A research paper, company announcement, social post, and rewritten news article should be weighted differently.
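One simple way to express this is a base weight per source type plus a small bonus for cross-source confirmation. Every number below is a placeholder picked for illustration, and the `SourceKind` names and `sourceWeightFor` function are hypothetical.

```typescript
type SourceKind =
  | "research_paper"
  | "company_announcement"
  | "rewritten_news"
  | "social_post";

// Placeholder base weights on a 0-100 scale; real values would need tuning.
const BASE_SOURCE_WEIGHT: Record<SourceKind, number> = {
  research_paper: 90,
  company_announcement: 80,
  rewritten_news: 50,
  social_post: 40,
};

// Assumed bonus per additional source that reports the same story.
const CONFIRMATION_BONUS = 5;

// Returns a 0-100 source weight that could feed the sourceWeight
// factor of the signal score.
function sourceWeightFor(kind: SourceKind, otherSourcesConfirming: number): number {
  const bonus = Math.max(0, otherSourcesConfirming) * CONFIRMATION_BONUS;
  return Math.min(100, BASE_SOURCE_WEIGHT[kind] + bonus);
}
```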
Better guide pages
The guide pages should become more like living topic trackers, not just lists of related articles.
Each guide should eventually include:
- topic explanation
- latest updates
- important companies
- relevant research
- key risks
- related stories
- last updated date
Better scoring explanations
A score is only useful if users understand it.
I want each article to explain not just the score, but the reason behind the score.
What I learned
A few lessons stood out.
1. Filtering is harder than summarizing
Summarization is relatively easy now. Deciding what deserves attention is much harder.
2. SEO quality matters more than SEO volume
More pages are not always better. Cleaner, more relevant pages are better.
3. Topic pages are more durable than feeds
Feeds create freshness. Guides create long-term value.
4. Transparent AI systems feel more trustworthy
Users do not need a perfect score, but they do need to understand why a score exists.
5. The workflow around the model is the real product
The AI model is only one part. The source selection, scoring rules, publishing flow, SEO structure, and user experience matter just as much.
Final thoughts
This project started as a small personal tool because I was tired of reading too many AI sources every morning.
But it turned into a useful lesson:
AI products do not always need to generate more content.
Sometimes the better product is the one that helps people ignore more content.
That is what I am trying to build with DeepSignal: a cleaner way to follow AI news, research, agents, models, and infrastructure without the daily noise.
The site is here:
https://ai-deep-signal.com/?utm_source=devto&utm_medium=article&utm_campaign=build_log
The weekly brief is here:
https://ai-deep-signal.com/weekly?utm_source=devto&utm_medium=article&utm_campaign=build_log
I would love feedback from other developers:
Would you trust a transparent signal score for news ranking?
Or would you rather see a purely editorial brief without scoring?
