DEV Community

Shashi Kiran
ContentCrew: A 7-Agent AI Pipeline That Researches, Writes, and Fact-Checks Articles Using Google ADK

Education Track: Build Multi-Agent Systems with ADK

This post is my submission for DEV Education Track: Build Multi-Agent Systems with ADK.

What I Built

Content creation is time-consuming. Researching a topic, drafting an article, editing it, fact-checking claims, formatting for SEO, and then repurposing it for social media can take hours. I built ContentCrew — a multi-agent pipeline that automates this entire workflow end to end.

You give it a topic. Within 3–5 minutes it produces:

  • A fully researched, written, and edited blog post
  • An SEO-optimized article with YAML front-matter
  • A fact-check report verifying key claims with sources
  • A Twitter/X thread + LinkedIn post ready to publish
  • The article translated into any language you choose

The system is built entirely on Google ADK (Agent Development Kit) with Gemini 2.0 Flash, deployed with a Streamlit web UI on Google Cloud Shell.

Cloud Run Deployment

ContentCrew runs on Cloud Run. To build and deploy your own instance:
gcloud builds submit --tag gcr.io/call-centre-513111/contentcrew
gcloud run deploy contentcrew \
--image gcr.io/call-centre-513111/contentcrew \
--platform managed \
--region asia-south1 \
--allow-unauthenticated \
--set-env-vars GOOGLE_CLOUD_PROJECT=call-centre-513111,GOOGLE_CLOUD_LOCATION=us-central1,GOOGLE_GENAI_USE_VERTEXAI=true

Enter any topic and watch all 7 agents work in real time — researching, writing, editing, fact-checking, and generating social media posts.

Your Agents

ContentCrew uses 7 specialized agents, each with a single well-defined responsibility. They run in two stages — a main orchestrated pipeline followed by a set of standalone agents.

Stage 1 — Orchestrated Pipeline

🔍 ResearchAgent
Receives the topic and performs 2–3 web searches using a custom web_search tool built on the DuckDuckGo API. Produces a structured research brief with key facts, trends, and source URLs. This feeds directly into the WriterAgent.

✍️ WriterAgent
Takes the research brief and drafts an 800–1200 word blog post. Structured with a hook introduction, 3–4 main sections with subheadings, and a conclusion with a call to action. Uses a word_count tool to verify length.

🧹 EditorAgent
Reviews the draft for clarity, flow, grammar, and consistency. Removes filler words, strengthens the hook and conclusion, and ensures consistent tone throughout.

📦 FormatterAgent
Adds YAML front-matter (title, description, keywords, tags, reading time), ensures correct Markdown heading hierarchy, and saves the final article to output/final_article.md.
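As a sketch of what the FormatterAgent produces, a front-matter block with those fields could be assembled like this (layout and the ~200 words-per-minute reading-time heuristic are my assumptions):

```python
from datetime import date


def build_front_matter(title: str, description: str, keywords: list[str],
                       tags: list[str], word_total: int) -> str:
    """Assemble the YAML front-matter block prepended to the final article."""
    minutes = max(1, round(word_total / 200))  # assume ~200 wpm reading speed
    lines = [
        "---",
        f"title: {title}",
        f"description: {description}",
        f"keywords: [{', '.join(keywords)}]",
        f"tags: [{', '.join(tags)}]",
        f"date: {date.today().isoformat()}",
        f"reading_time: {minutes} min",
        "---",
    ]
    return "\n".join(lines)
```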

🎯 ContentOrchestrator
Coordinates the four agents above in sequence, passing each agent's output to the next. Built with ADK's sub_agents pattern.
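Stripped of the ADK machinery, the orchestrator's job is just a sequential hand-off: each agent's output becomes the next agent's input. A deliberately simplified illustration (plain functions standing in for the LLM sub-agents — not the actual ADK sub_agents API):

```python
from typing import Callable


def run_pipeline(topic: str, stages: list[Callable[[str], str]]) -> str:
    """Feed each stage's output into the next, as the ContentOrchestrator does."""
    artifact = topic
    for stage in stages:
        artifact = stage(artifact)
    return artifact


# Hypothetical stand-ins for the four sub-agents:
research = lambda topic: f"brief({topic})"
write = lambda brief: f"draft({brief})"
edit = lambda draft: f"polished({draft})"
fmt = lambda article: f"formatted({article})"

final = run_pipeline("AI agents", [research, write, edit, fmt])
```

In the real system ADK manages the LLM calls and shared session state, but the data flow is exactly this chain.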

Stage 2 — Standalone Agents

✅ FactCheckAgent
Runs independently with the finished article as input. Identifies up to 6 factual claims, searches each one using web_search, and produces a structured fact-check report with verdicts: ✅ VERIFIED, ⚠️ UNVERIFIED, or ❌ DISPUTED. Saved separately to output/factcheck_report.md.
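The exact report layout isn't shown in the post; a plausible formatter for those three verdicts (the structure here is my assumption) might be:

```python
VERDICT_ICONS = {"VERIFIED": "✅", "UNVERIFIED": "⚠️", "DISPUTED": "❌"}


def format_factcheck_report(claims: list[dict]) -> str:
    """Render fact-check results as the Markdown saved to output/factcheck_report.md."""
    lines = ["# Fact-Check Report", ""]
    for i, claim in enumerate(claims[:6], 1):  # cap at 6 claims, as the agent does
        icon = VERDICT_ICONS.get(claim["verdict"], "⚠️")
        lines.append(f"{i}. {icon} {claim['verdict']}: {claim['text']}")
        if claim.get("source"):
            lines.append(f"   - Source: {claim['source']}")
    return "\n".join(lines)
```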

🐦 SocialMediaAgent
Takes the finished article and generates a 7-tweet Twitter/X thread and a LinkedIn post. Each tweet covers one key insight. The LinkedIn post ends with a question to drive engagement.
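One detail worth enforcing in code rather than in the prompt is Twitter/X's 280-character limit. A hypothetical post-processing step (not from the actual implementation) that numbers the thread and truncates overlong tweets:

```python
TWEET_LIMIT = 280


def number_thread(tweets: list[str]) -> list[str]:
    """Append n/total markers and enforce the 280-character limit on each tweet."""
    total = len(tweets)
    numbered = []
    for i, tweet in enumerate(tweets, 1):
        marker = f" ({i}/{total})"
        body = tweet[: TWEET_LIMIT - len(marker)]  # leave room for the marker
        numbered.append(body + marker)
    return numbered
```

Validating hard limits like this deterministically is cheaper and more reliable than asking the model to count characters.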

🌍 TranslateAgent
Translates the complete article into any target language (Hindi, Spanish, Telugu, French, German, Arabic, Japanese, Chinese) while preserving all Markdown formatting, YAML front-matter, and heading structure.

How They Work Together

User Input (Topic)
       │
       ▼
ContentOrchestrator
  ├── ResearchAgent   → Research Brief
  ├── WriterAgent     → Draft Article
  ├── EditorAgent     → Polished Article
  └── FormatterAgent  → final_article.md
       │
       ├── FactCheckAgent   → factcheck_report.md
       ├── SocialMediaAgent → social_media.md
       └── TranslateAgent   → translated_<lang>.md

The key architectural decision was to separate FactCheckAgent from the orchestrator. When it was a sub-agent inside the orchestrator, its output was absorbed into the orchestrator's context and never surfaced as a separate event. Running it as a standalone Runner with its own session solved this completely.

Key Learnings

Vertex AI does not allow mixing tool types in a single agent.
The biggest early blocker was discovering that Vertex AI throws a 400 INVALID_ARGUMENT: Multiple tools are supported only when they are all search tools error when you mix google_search (a built-in ADK tool) with custom Python callables in the same agent. The fix was to replace the built-in google_search with a custom web_search function built on DuckDuckGo's free API — making all tools plain Python callables.

Sub-agent outputs are absorbed by the orchestrator.
I expected each sub-agent's output to surface as separate events I could intercept. In practice, the orchestrator wraps all sub-agent activity — their outputs never appear with the sub-agent as event.author. The solution was to move agents that needed their output captured separately (FactCheck, Social Media, Translate) out of the orchestrator entirely and run them as independent Runner instances.

Cloud Shell requires specific Streamlit config to work.
Streamlit's WebSocket connections fail silently in Cloud Shell without the right settings. The fix was a .streamlit/config.toml with enableCORS = false, enableXsrfProtection = false, enableWebsocketCompression = false, and port = 8080. Without this, the UI renders as a blank page.

ADK emits function_call warning noise that can't be easily suppressed.
The warning non-text parts in the response: ['function_call'] appears whenever an agent makes a tool call. It's cosmetic and doesn't affect functionality, but it clutters output. The clean fix was to patch logging.Logger.warning at startup to filter the specific message.

LLMs don't reliably follow URL format instructions.
The ImageAgent was instructed to use a specific image URL format but consistently generated Unsplash URLs from training memory instead. The lesson: never rely on an LLM to generate specific URL formats. Either generate the URLs in code based on keywords the LLM provides, or use a different approach entirely.
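Generating the URLs in code looks something like this — the endpoint here is entirely hypothetical, but the point is that the model only supplies keywords while the format is enforced deterministically:

```python
import urllib.parse


def image_url(keywords: list[str], width: int = 1200, height: int = 630) -> str:
    """Build an image URL from LLM-supplied keywords.

    The endpoint is a placeholder — substitute whatever image service you use.
    The URL *format* is fixed in code, never left to the model.
    """
    query = urllib.parse.quote(",".join(k.strip().lower() for k in keywords))
    return f"https://images.example.com/{width}x{height}?tags={query}"
```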
