DEV Community

Meir

Posted on

Building FactFlux: A Multi-Agent System for Social Media Fact-Checking

In an era where misinformation spreads across social platforms faster than it can be corrected, I built FactFlux, an intelligent multi-agent system that automatically fact-checks social media posts using AI agents working in coordination.

🚨 The Problem

Social media has become the primary source of news for millions, but it's also a breeding ground for misinformation. Manual fact-checking is slow and doesn't scale. We need automated solutions that can:

  • Extract content from multiple social platforms
  • Identify verifiable claims within posts
  • Cross-reference against authoritative sources
  • Provide transparent, evidence-based verdicts

🎯 The Solution: Multi-Agent Architecture

FactFlux uses a team of specialized AI agents working together, each with a specific role in the fact-checking pipeline. Here's how it works:

🎥 Demo

The scraping layer behind the demo is Bright Data's MCP server, covered in detail in the Tool Integration section below.

🏗️ Agent Team Structure

1. Content Extractor Agent

# Assumed imports, matching the Agno-style agent API used throughout this post
from agno.agent import Agent
from agno.models.google import Gemini


def create_content_extractor_agent_sync(mcp_tools):
    # Scrapes the raw post (text, media, metadata) via Bright Data MCP tools
    return Agent(
        name="Content Extractor",
        role="Extract comprehensive data from social media posts",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[mcp_tools],
        instructions=[
            "You MUST use the available tools to extract data from social media posts",
            "For TikTok URLs: Use the web_data_tiktok_posts tool to get structured data",
            "For other platforms or if structured data fails: Use scrape_as_markdown tool",
            "Extract: post text, media URLs, user info, engagement metrics, timestamps",
        ]
    )

Responsibilities:

  • Extracts post content from 6+ platforms (TikTok, Instagram, Twitter/X, Facebook, YouTube, LinkedIn)
  • Uses Bright Data tools for reliable scraping
  • Captures text, media, metadata, and engagement metrics

2. Claim Identifier Agent

from agno.tools.reasoning import ReasoningTools  # assumed Agno import


def create_claim_identifier_agent():
    # Filters extracted content down to concrete, checkable claims
    return Agent(
        name="Claim Identifier",
        role="Identify factual claims that can be verified",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[ReasoningTools(add_instructions=True)],
        instructions=[
            "Parse extracted content to find specific, verifiable factual claims",
            "Ignore opinions, jokes, satire, and subjective statements",
            "Extract key facts: statistics, events, quotes, dates, locations",
            "Focus on claims that could potentially mislead people if false",
        ]
    )

Responsibilities:

  • Separates factual claims from opinions and satire
  • Prioritizes verifiable statements
  • Rates claim significance and potential for harm

3. Cross-Reference Agent

def create_cross_reference_agent(mcp_tools):
    return Agent(
        name="Cross-Reference Verifier",
        role="Verify claims against multiple authoritative sources",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[mcp_tools, ReasoningTools(add_instructions=True)],
        instructions=[
            "For each claim, search multiple authoritative sources automatically",
            "Check news sites, fact-checkers, government sources, academic sources",
            "For media content, perform reverse image/video searches",
            "Document all sources and their credibility levels",
        ]
    )

Responsibilities:

  • Searches authoritative sources (news sites, fact-checkers, government sources)
  • Performs reverse media searches for images/videos
  • Documents source credibility and consensus patterns

4. Verdict Agent

# Assumed Agno imports for the reasoning/thinking toolkits
from agno.tools.reasoning import ReasoningTools
from agno.tools.thinking import ThinkingTools


def create_verdict_agent():
    # Weighs all upstream evidence and produces the structured final report
    return Agent(
        name="Verdict Synthesizer",
        role="Analyze all evidence and deliver final fact-check verdict",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[ThinkingTools(add_instructions=True), ReasoningTools(add_instructions=True)],
        instructions=[
            "Structure your response with these sections:",
            "## Post Summary - Describe what the post contained",
            "## Claims Identified - List each claim found",
            "## Verification Results - Detail findings for each claim with sources",
            "## Final Verdict - Clear verdict with confidence score (0-100%)",
        ]
    )

Responsibilities:

  • Synthesizes all evidence from previous agents
  • Provides structured analysis with clear sections
  • Delivers final verdict: TRUE/FALSE/MISLEADING/INSUFFICIENT_EVIDENCE
  • Includes confidence scoring and detailed reasoning
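The verdict categories and confidence score described above can be modeled as a small data structure. This is an illustrative sketch, not code from the FactFlux repository; the class and field names are my own:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class VerdictLabel(str, Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    MISLEADING = "MISLEADING"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"


@dataclass
class FactCheckVerdict:
    label: VerdictLabel
    confidence: int  # 0-100, as in the agent's instructions
    reasoning: str
    sources: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Keep the confidence score inside the 0-100% range the agent reports
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


verdict = FactCheckVerdict(
    label=VerdictLabel.MISLEADING,
    confidence=85,
    reasoning="Core claim supported, but the specific statistics are inaccurate.",
)
```

Typing the verdict like this makes downstream consumers (a UI, an API) reject malformed outputs early instead of parsing free-form text.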

🔄 The Workflow Process

URL Input → Content Extraction → Claim Identification → Cross-Reference → Final Verdict

Step-by-Step Breakdown:

  1. Input: User provides a social media URL
  2. Extraction: Content Extractor Agent scrapes the post using Bright Data tools
  3. Analysis: Claim Identifier Agent parses content for verifiable facts
  4. Verification: Cross-Reference Agent searches authoritative sources
  5. Synthesis: Verdict Agent combines all evidence into a final assessment
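The five steps above amount to sequential function composition over a shared context. Here is a minimal sketch of that flow with plain Python functions standing in for the real agents (all names are illustrative):

```python
from typing import Callable, Dict, List

# Each stage takes the accumulated context and returns an updated copy,
# mirroring how context is passed from agent to agent in the pipeline.
def extract_content(ctx: Dict) -> Dict:
    return {**ctx, "content": f"post text from {ctx['url']}"}

def identify_claims(ctx: Dict) -> Dict:
    return {**ctx, "claims": [f"claim about {ctx['content']}"]}

def cross_reference(ctx: Dict) -> Dict:
    return {**ctx, "evidence": [f"source for {c}" for c in ctx["claims"]]}

def synthesize_verdict(ctx: Dict) -> Dict:
    label = "MISLEADING" if ctx["evidence"] else "INSUFFICIENT_EVIDENCE"
    return {**ctx, "verdict": label}

PIPELINE: List[Callable[[Dict], Dict]] = [
    extract_content, identify_claims, cross_reference, synthesize_verdict,
]

def fact_check(url: str) -> Dict:
    ctx: Dict = {"url": url}
    for stage in PIPELINE:
        ctx = stage(ctx)  # context accumulates; nothing is lost between steps
    return ctx

result = fact_check("https://example.com/post/123")
```

The real system delegates each stage to an LLM agent, but the data-flow shape is the same: every stage sees everything produced before it.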

🛠️ Technical Implementation

Core Technologies Used:

  • Python 3.8+
  • Google Gemini (gemini-2.0-flash) as the reasoning model for every agent
  • Bright Data MCP server for social media data extraction
  • An agent framework providing the Agent and Team abstractions

Key Features:

Multi-Platform Support

# Supports major social platforms
platforms = [
    "TikTok", "Instagram", "Twitter/X", 
    "Facebook", "YouTube", "LinkedIn"
]

Intelligent Tool Selection

The system automatically chooses the optimal scraping method based on the platform:

  • Structured data extraction for TikTok
  • Markdown scraping for other platforms
  • Fallback mechanisms for reliability
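A hypothetical sketch of that selection logic, using the Bright Data tool names shown earlier in the post (the function and mapping are illustrative, not from the FactFlux codebase):

```python
from urllib.parse import urlparse

# Platforms with a dedicated structured-data tool; everything else
# falls back to generic markdown scraping.
STRUCTURED_TOOLS = {
    "tiktok.com": "web_data_tiktok_posts",
}

def choose_scraping_tool(url: str) -> str:
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return STRUCTURED_TOOLS.get(host, "scrape_as_markdown")
```

Keeping the mapping in a dict makes adding a new platform a one-line change rather than another branch in the extraction agent.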

Comprehensive Error Handling

# Built-in resilience for:
error_scenarios = [
    "Invalid URLs",
    "Network failures", 
    "API rate limits",
    "Malformed social media posts",
    "Missing content"
]
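For transient failures like network errors and rate limits, a retry wrapper with backoff is the usual pattern. This is a generic sketch under my own names (`ExtractionError`, `with_retries`), not the project's actual error-handling code:

```python
import time

class ExtractionError(Exception):
    """Hypothetical error raised when a post cannot be scraped."""

def with_retries(fn, attempts=3, delay=0.1):
    """Retry a flaky call, e.g. on network failures or API rate limits."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except ExtractionError as exc:
            last_exc = exc
            time.sleep(delay * (2 ** attempt))  # exponential backoff
    raise last_exc

# Simulate a scrape that succeeds on the third attempt
calls = {"n": 0}
def flaky_scrape():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ExtractionError("rate limited")
    return {"text": "post content"}

result = with_retries(flaky_scrape)
```

Invalid URLs and permanently missing content, by contrast, should fail fast rather than retry, so they would be raised as a different exception type.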

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • Google Gemini API key
  • Bright Data API key

Installation

# Clone the repository
git clone https://github.com/MeirKaD/FactFlux.git
cd FactFlux

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys

Configuration

Create a .env file:

GOOGLE_API_KEY=your_google_gemini_api_key_here
BRIGHT_DATA_API_KEY=your_bright_data_api_key_here

Usage

# Run the playground interface
python playground_fact_check.py

The system launches a web interface at http://localhost:7777 where you can:

  • Input social media URLs
  • Watch agents work in real-time
  • See the complete fact-checking process
  • Review structured reports with sources

📊 Sample Output

When you submit a social media URL, FactFlux produces a comprehensive report:

## Post Summary
TikTok video claiming "95% of ocean plastic comes from 10 rivers in Asia"

## Claims Identified
1. 95% of ocean plastic originates from rivers
2. These rivers are specifically located in Asia
3. Only 10 rivers are responsible

## Verification Results
- Claim 1: PARTIALLY TRUE - Studies show 80-90% from rivers (sources: Nature, Science journals)
- Claim 2: MISLEADING - 8 rivers in Asia, 2 in Africa (source: Environmental Science & Technology)
- Claim 3: TRUE - Confirmed by multiple peer-reviewed studies

## Citations
[1] Lebreton et al. (2017) Nature Communications - doi:10.1038/ncomms15611
[2] Schmidt et al. (2017) Environmental Science & Technology - doi:10.1021/acs.est.7b02368

## Final Verdict: MISLEADING (Confidence: 85%)
The core claim about river pollution is scientifically supported, but the specific statistics are inaccurate.

💡 Why Multi-Agent Architecture?

Specialization Benefits:

  • Each agent focuses on what it does best
  • Reduces complexity and improves accuracy
  • Easier to debug and improve individual components

Scalability:

  • Can add new agents for specific platforms or verification types
  • Parallel processing capabilities
  • Modular design allows independent updates

Transparency:

  • Shows the complete reasoning chain
  • Users can see how each agent contributed
  • Builds trust through explainable AI

🔮 Future Enhancements

Planned Features:

  • Real-time monitoring of trending posts
  • Browser extension for instant fact-checking
  • API endpoints for integration with other platforms
  • Machine learning to improve claim detection accuracy
  • Collaborative filtering with human fact-checkers

Technical Improvements:

  • Support for more social platforms (Reddit, Snapchat, Discord)
  • Enhanced media analysis (deepfake detection)
  • Multi-language support
  • Performance optimizations for high-volume processing

🤝 Contributing

FactFlux is open-source and welcomes contributions:

# Contributing workflow
1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Areas for contribution:

  • New platform integrations
  • Improved claim detection algorithms
  • Additional verification sources
  • UI/UX improvements
  • Documentation and tutorials

🔍 Technical Deep Dive

Agent Coordination

from agno.team import Team  # assumed Agno import

# The four agents built earlier, assembled into a coordinated team
fact_check_team = Team(
    name="Social Media Fact Check Team",
    mode="coordinate",
    model=Gemini(id="gemini-2.0-flash"),
    members=[content_extractor, claim_identifier, cross_reference, verdict_agent],
    show_members_responses=True,
    enable_agentic_context=True,
)

The team uses coordinate mode, meaning agents work sequentially, passing context between each step. This ensures:

  • Each agent has access to previous findings
  • No information is lost in the pipeline
  • The final verdict considers all evidence

Tool Integration

import os

from mcp import StdioServerParameters  # MCP Python SDK

# Launch Bright Data's MCP server as a subprocess over stdio
server_params = StdioServerParameters(
    command="npx",
    args=["@brightdata/mcp"],
    env={
        "API_TOKEN": os.getenv("BRIGHT_DATA_API_KEY"),
        "WEB_UNLOCKER_ZONE": "unblocker",
        "BROWSER_ZONE": "scraping_browser"
    }
)

Bright Data integration provides:

  • Reliable scraping of geo-blocked content
  • Bypassing of anti-bot detection mechanisms
  • High success rates for social media extraction
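For reference, the StdioServerParameters configuration above is roughly equivalent to launching the server manually from a shell (a sketch, assuming the same environment variable names):

```shell
# Same subprocess the MCP client spawns over stdio
export API_TOKEN="$BRIGHT_DATA_API_KEY"
export WEB_UNLOCKER_ZONE="unblocker"
export BROWSER_ZONE="scraping_browser"
npx @brightdata/mcp
```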

📈 Impact and Use Cases

Educational Institutions

  • Teaching media literacy
  • Research on misinformation patterns
  • Academic fact-checking projects

News Organizations

  • Automated preliminary fact-checking
  • Source verification assistance
  • Social media monitoring

Individual Users

  • Personal fact-checking tool
  • Browser extension for real-time verification
  • Educational resource for critical thinking

🔒 Ethical Considerations

Responsible AI Principles:

  • Transparency: All sources and reasoning are shown
  • Accuracy: Conservative approach when evidence is weak
  • Privacy: No personal data storage or tracking
  • Bias Mitigation: Multiple source cross-referencing

Limitations:

  • Cannot replace human judgment entirely
  • Dependent on source availability and quality
  • May struggle with highly contextual or cultural claims
  • Requires ongoing updates as misinformation tactics evolve

🎯 Conclusion

FactFlux represents a significant step forward in automated fact-checking technology. By leveraging multi-agent architecture, we've created a system that's both powerful and transparent, capable of handling the complexity of modern misinformation while remaining explainable to users.

The project demonstrates how AI agents can work together to solve complex problems that require multiple specialized skills. As misinformation continues to evolve, tools like FactFlux will play a crucial role in maintaining information integrity across social platforms.

Try FactFlux today: https://github.com/MeirKaD/FactFlux


🏷️ Tags

#ai #factchecking #misinformation #python #multiagent #socialmedia #automation #gemini #brightdata #opensource


Have you built similar AI agent systems? What challenges did you face with multi-agent coordination? Share your experiences in the comments below!
