harrisgnr

How I Built an Autonomous AI News Platform That Publishes 20 Articles/Day

I'm a solo developer who built AgentNews.gr — a fully autonomous news platform covering the Greek economy. It publishes ~20 articles per day with zero human intervention. No editors, no manual curation, no daily maintenance.

In this post I'll walk through the architecture, the agent pipeline, and the hard lessons from building it.

The Problem

Greece has massive public economic data — 16,000 government decisions published daily on Diavgeia (the national transparency portal), tax authority announcements on AADE, statistical releases from ELSTAT, stock exchange data from ATHEX. All of it is public. None of it is readable.

These datasets exist in raw bureaucratic form. Nobody aggregates them. Nobody explains them in plain language. Traditional Greek media covers the big headlines but ignores the long tail of government spending, regulatory changes, and economic data.

I wanted to build a system that could monitor all of this and turn it into readable news articles — autonomously, 24/7.

The Architecture

The system is a pipeline of specialized AI agents, each handling one step. Think of it as a virtual newsroom where every role is an agent.

Stack: Express + React + PostgreSQL + Google Gemini, running on a single $7/month server.

The Scout Agent

The Scout monitors 30+ RSS feeds from Greek financial media, plus direct connections to government databases. Every 2 hours it fetches new content, deduplicates against existing articles, and scores each story for newsworthiness on a 1-10 scale. Only stories scoring 7+ pass through.

The key design decision: the Scout doesn't write anything. It just produces "leads" — structured objects with the source URL, a topic summary, and a relevance score. This separation means I can swap out scoring logic without touching the rest of the pipeline.
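A lead can be as simple as a plain object plus a threshold filter. This is a minimal sketch of that separation, with illustrative field names rather than the actual schema:

```javascript
// A "lead" is a structured object -- the Scout never writes prose.
// Field names here are illustrative, not the real schema.
function makeLead(item, score) {
  return {
    url: item.url,                       // source URL
    summary: item.summary,               // topic summary
    score,                               // 1-10 newsworthiness score
    fetchedAt: new Date().toISOString(),
  };
}

// Only stories scoring 7+ continue down the pipeline.
const MIN_SCORE = 7;
function filterLeads(leads) {
  return leads.filter((lead) => lead.score >= MIN_SCORE);
}
```

Because the Writer only ever sees objects of this shape, the scoring logic behind `score` can change freely without touching downstream agents.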

The Social Signal Scouts

Beyond RSS, three signal detectors monitor social platforms for trending economic topics:

  • Reddit — monitors Greek finance subreddits via their free RSS feeds
  • Google Trends Greece — catches search spikes in real time
  • YouTube — monitors Greek finance channels and transcribes videos

The critical rule: social posts are signals, never sources. When the Reddit scout detects a trending topic, it finds the actual primary source — the government announcement, the press release, the data release — and writes the article from that. The Reddit post is never mentioned or linked.

For YouTube, the system fetches auto-generated transcripts and uses them to identify topics. For English-language channels, it translates the key insights into Greek articles with clear attribution to the original creator.
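The "signals, never sources" rule can be enforced structurally: the lead that reaches the Writer simply never carries the social post's URL. A hedged sketch, with assumed object shapes:

```javascript
// Social posts are signals, never sources: the lead that continues down the
// pipeline references only the primary source. Shapes are illustrative.
function leadFromSignal(signal, primarySource) {
  if (!primarySource) return null; // no primary source found -> no article
  return {
    url: primarySource.url,           // the announcement / press release / data release
    topic: signal.topic,              // what trended
    signalPlatform: signal.platform,  // kept for internal analytics only, never published
  };
}
```

If resolution of a primary source fails, the signal is dropped rather than written up from the social post itself.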

The Writer Agent

Takes a lead and produces a full Greek article — 400-800 words, journalistic tone, structured with proper paragraphs. The system prompt is specific: write like a senior Greek financial journalist, cite sources by name, don't editorialize, don't hallucinate.

Each article gets a transliterated Greek slug, category assignment, and source attribution.

The Fact-Checker Agent

Reviews the draft against the original source material. Flags any claims that don't appear in the source. This doesn't catch everything, but it catches the obvious hallucinations — made-up statistics, wrong company names, fabricated quotes.
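The real check is LLM-based, but a crude regex pass illustrates the idea of grounding claims in the source: extract the numbers a draft asserts and flag any that never appear in the source text.

```javascript
// Toy version of the source-grounding check: pull numeric claims out of the
// draft and flag any that are absent from the source. The real agent uses an
// LLM; this only shows the shape of the problem.
function flagUnsupportedNumbers(draft, sourceText) {
  const numbers = draft.match(/\d[\d.,]*/g) || [];
  return numbers.filter((n) => !sourceText.includes(n));
}
```

A flagged draft can then be sent back for revision or dropped entirely.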

The SEO Agent

Generates meta titles, descriptions, keywords, and Open Graph tags. For articles triggered by Google Trends, it optimizes specifically for the trending search term — this is how a new site with low domain authority can compete for timely queries.
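A minimal sketch of the trend-aware part, assuming an article object with `title` and `summary` fields: when a trending term triggered the article, that term leads the meta title so the page can rank on the timely query.

```javascript
// Illustrative meta-tag builder. For trend-triggered articles the trending
// search term leads the title; limits follow common SERP truncation lengths.
function buildMeta(article, trendTerm) {
  const title = trendTerm
    ? `${trendTerm}: ${article.title}`.slice(0, 60)
    : article.title.slice(0, 60);
  return {
    title,
    description: article.summary.slice(0, 155),
    ogTitle: title, // Open Graph mirrors the meta title
  };
}
```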

The Image Agent

Three-tier system: (1) try to extract the OG image from the source article, (2) search Pexels for a relevant photo, (3) generate with AI. Falls back to a branded gradient if all three fail.
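The tier logic is just an ordered fallback chain. A sketch with placeholder fetcher functions (the real tiers would be async network calls; they are synchronous here for clarity):

```javascript
// Try each image tier in order; a tier that returns nothing or throws simply
// falls through to the next one. Fetcher functions are placeholders.
function resolveImage(article, fetchers) {
  for (const tryFetch of fetchers) {
    try {
      const url = tryFetch(article);
      if (url) return url;
    } catch {
      // a failed tier (e.g. Pexels down) falls through to the next one
    }
  }
  return "/images/branded-gradient.png"; // final branded fallback
}
```

Ordering the tiers cheapest-first (OG extraction, then stock search, then generation) keeps per-article cost low.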

The Publisher Agent

Takes the finished article, transliterates the title into a URL-friendly Greek slug, saves everything to PostgreSQL, and it's live. No human approval step.
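Greek-to-Latin slug transliteration can be done with a character map plus Unicode normalization to strip accents. A simplified sketch (the mapping here has no digraph handling such as "μπ", so assume the real one is richer):

```javascript
// Simplified Greek-to-Latin transliteration for URL slugs.
const GREEK_TO_LATIN = {
  "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
  "η": "i", "θ": "th", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
  "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
  "ς": "s", "τ": "t", "υ": "y", "φ": "f", "χ": "ch", "ψ": "ps", "ω": "o",
};

function slugify(title) {
  return title
    .toLowerCase()
    .normalize("NFD")                    // split accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, "")     // drop the combining accent marks
    .split("")
    .map((ch) => GREEK_TO_LATIN[ch] ?? ch)
    .join("")
    .replace(/[^a-z0-9]+/g, "-")         // everything else becomes a hyphen
    .replace(/^-+|-+$/g, "");            // trim leading/trailing hyphens
}
```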

The Translator Agent

Every 3 hours, picks the top articles and translates them to English at /en/. This turned out to be more valuable than I expected — the English section gets more organic traffic than Greek because there's less competition for "Greek economy news in English."

The Deep Research Agent

Every Sunday, a separate agent synthesizes a full week of government spending data from Diavgeia into a long-form analysis piece with charts. This is the content that's hardest to replicate and most valuable for SEO — nobody else covers weekly patterns in Greek government spending.

The Data Pipeline

The most unique part of the system is the Diavgeia pipeline. Diavgeia is Greece's government transparency portal — every public spending decision, contract, and regulatory act gets published there. That's ~16,000 decisions per day.

My agent fetches these decisions, scores them for public interest, and produces plain-language Greek articles explaining what happened. "The Ministry of Health awarded a €2.3M contract to Company X for IT systems" is boring as a raw database entry. As a news article with context about the ministry's spending patterns, it becomes interesting.

This is the data moat. The raw data is public, but the structured intelligence layer on top of it is unique.
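Triaging ~16,000 decisions a day means most of the work is scoring, not writing. The real scoring is LLM-based, but a toy heuristic shows the shape of it, with assumed field names on the decision object:

```javascript
// Toy public-interest score for a Diavgeia spending decision.
// Field names are assumptions; the real agent scores with an LLM.
function scoreDecision(decision) {
  let score = 1;
  if (decision.amountEur >= 1_000_000) score += 5;      // large contracts
  else if (decision.amountEur >= 100_000) score += 3;   // mid-size spending
  if (["MINISTRY", "REGION"].includes(decision.orgType)) score += 2; // high-profile bodies
  return Math.min(score, 10);
}
```

Only the top-scoring handful of decisions per day become articles; the rest are kept as context for the weekly analysis.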

Prerendering for SEO

The frontend is a React SPA. When I launched, Google was seeing empty `&lt;div id="root"&gt;&lt;/div&gt;` pages and barely indexing anything — 57 pages after a month.

I added a prerender middleware that detects crawler user agents (Googlebot, Bingbot, social media crawlers) and serves fully-rendered HTML with all meta tags, JSON-LD structured data, and article content. Regular users still get the SPA.
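In Express terms, this is one middleware that branches on the User-Agent. A sketch, assuming a `renderPage(url)` helper that returns the fully rendered HTML for a route:

```javascript
// User agents that should receive prerendered HTML (list is illustrative).
const CRAWLER_RE = /googlebot|bingbot|facebookexternalhit|twitterbot|linkedinbot/i;

function isCrawler(userAgent) {
  return CRAWLER_RE.test(userAgent || "");
}

// Express middleware: crawlers get server-rendered HTML, humans get the SPA.
// `renderPage` is an assumed helper that produces the full page markup.
function prerenderMiddleware(renderPage) {
  return async (req, res, next) => {
    if (!isCrawler(req.get("user-agent"))) return next(); // regular users: SPA
    const html = await renderPage(req.originalUrl);       // bots: full HTML + meta + JSON-LD
    res.set("Content-Type", "text/html").send(html);
  };
}
```

Mounting it before the static SPA handler means nothing else in the app has to know crawlers exist.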

After deploying the prerender, indexed pages jumped from 57 to 1,000+ within weeks.

What I Learned

Distribution is harder than building. The platform runs itself. Getting humans to discover it is the actual hard problem. Zero backlinks for the first two months meant Google had no reason to rank me above established competitors.

English content grows faster than Greek. Counter-intuitive for a Greek news site, but the English section gets more organic traffic because there are fewer competitors for "Greek economy news in English" than for "ελληνικά οικονομικά νέα."

Quality threshold matters. I initially set the Scout's minimum quality score too low and was publishing 35+ thin articles per day. Raising the threshold so that only high-scoring stories pass improved both the average quality and Google's willingness to index.

The agent separation principle pays off. Every agent does one job. When image generation was causing server hangs, I fixed the Image Agent without touching anything else. When SEO needed improvement, I updated the SEO Agent's prompts. This modularity is the main reason a solo developer can maintain a system this complex.

Current Numbers

  • ~1,000 pages indexed by Google
  • ~20 articles/day published autonomously
  • 30+ RSS sources + Diavgeia + ELSTAT + ATHEX + ECB
  • 3 social signal sources (Reddit, Google Trends, YouTube)
  • Greek + English, fully automated
  • Running on a single server, $7/month
  • Zero revenue (yet)

What's Next

Newsletter with weekly digest. B2B licensing of the agent platform for other publishers. More signal sources. And honestly — just letting SEO compound while the system runs itself.

If you're building with AI agents, I'd love to hear about your architecture. The multi-agent pipeline pattern has worked well for this use case but I'm curious what others are doing differently.

Live site: agentnews.gr (Greek) | agentnews.gr/en (English)
