DEV Community

Meir

Posted on

Building FactFlux: A Multi-Agent System for Social Media Fact-Checking

In an era where misinformation spreads across social platforms faster than it can be corrected, I built FactFlux, an intelligent multi-agent system that automatically fact-checks social media posts using AI agents working in coordination.

🚨 The Problem

Social media has become the primary source of news for millions, but it's also a breeding ground for misinformation. Manual fact-checking is slow and doesn't scale. We need automated solutions that can:

  • Extract content from multiple social platforms
  • Identify verifiable claims within posts
  • Cross-reference against authoritative sources
  • Provide transparent, evidence-based verdicts

🎯 The Solution: Multi-Agent Architecture

FactFlux uses a team of specialized AI agents working together, each with a specific role in the fact-checking pipeline. Here's how it works:

🎥 Demo

The scraping layer behind the demo is Bright Data's MCP server, covered in detail in the Tool Integration section below.

🏗️ Agent Team Structure

1. Content Extractor Agent

# Assumed imports, matching the Agno-style agent API used throughout this post
from agno.agent import Agent
from agno.models.google import Gemini


def create_content_extractor_agent_sync(mcp_tools):
    # Scrapes the raw post (text, media, metadata) via Bright Data MCP tools
    return Agent(
        name="Content Extractor",
        role="Extract comprehensive data from social media posts",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[mcp_tools],
        instructions=[
            "You MUST use the available tools to extract data from social media posts",
            "For TikTok URLs: Use the web_data_tiktok_posts tool to get structured data",
            "For other platforms or if structured data fails: Use scrape_as_markdown tool",
            "Extract: post text, media URLs, user info, engagement metrics, timestamps",
        ]
    )

Responsibilities:

  • Extracts post content from 6+ platforms (TikTok, Instagram, Twitter/X, Facebook, YouTube, LinkedIn)
  • Uses Bright Data tools for reliable scraping
  • Captures text, media, metadata, and engagement metrics

2. Claim Identifier Agent

from agno.tools.reasoning import ReasoningTools  # assumed Agno import


def create_claim_identifier_agent():
    # Filters extracted content down to concrete, checkable claims
    return Agent(
        name="Claim Identifier",
        role="Identify factual claims that can be verified",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[ReasoningTools(add_instructions=True)],
        instructions=[
            "Parse extracted content to find specific, verifiable factual claims",
            "Ignore opinions, jokes, satire, and subjective statements",
            "Extract key facts: statistics, events, quotes, dates, locations",
            "Focus on claims that could potentially mislead people if false",
        ]
    )

Responsibilities:

  • Separates factual claims from opinions and satire
  • Prioritizes verifiable statements
  • Rates claim significance and potential for harm

3. Cross-Reference Agent

def create_cross_reference_agent(mcp_tools):
    return Agent(
        name="Cross-Reference Verifier",
        role="Verify claims against multiple authoritative sources",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[mcp_tools, ReasoningTools(add_instructions=True)],
        instructions=[
            "For each claim, search multiple authoritative sources automatically",
            "Check news sites, fact-checkers, government sources, academic sources",
            "For media content, perform reverse image/video searches",
            "Document all sources and their credibility levels",
        ]
    )

Responsibilities:

  • Searches authoritative sources (news sites, fact-checkers, government sources)
  • Performs reverse media searches for images/videos
  • Documents source credibility and consensus patterns

4. Verdict Agent

# Assumed Agno imports for the reasoning/thinking toolkits
from agno.tools.reasoning import ReasoningTools
from agno.tools.thinking import ThinkingTools


def create_verdict_agent():
    # Weighs all upstream evidence and produces the structured final report
    return Agent(
        name="Verdict Synthesizer",
        role="Analyze all evidence and deliver final fact-check verdict",
        model=Gemini(id="gemini-2.0-flash"),
        tools=[ThinkingTools(add_instructions=True), ReasoningTools(add_instructions=True)],
        instructions=[
            "Structure your response with these sections:",
            "## Post Summary - Describe what the post contained",
            "## Claims Identified - List each claim found",
            "## Verification Results - Detail findings for each claim with sources",
            "## Final Verdict - Clear verdict with confidence score (0-100%)",
        ]
    )

Responsibilities:

  • Synthesizes all evidence from previous agents
  • Provides structured analysis with clear sections
  • Delivers final verdict: TRUE/FALSE/MISLEADING/INSUFFICIENT_EVIDENCE
  • Includes confidence scoring and detailed reasoning
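The verdict categories and confidence score described above can be modeled as a small data structure. This is an illustrative sketch, not code from the FactFlux repository; the class and field names are my own:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class VerdictLabel(str, Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    MISLEADING = "MISLEADING"
    INSUFFICIENT_EVIDENCE = "INSUFFICIENT_EVIDENCE"


@dataclass
class FactCheckVerdict:
    label: VerdictLabel
    confidence: int  # 0-100, as in the agent's instructions
    reasoning: str
    sources: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Keep the confidence score inside the 0-100% range the agent reports
        if not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


verdict = FactCheckVerdict(
    label=VerdictLabel.MISLEADING,
    confidence=85,
    reasoning="Core claim supported, but the specific statistics are inaccurate.",
)
```

Typing the verdict like this makes downstream consumers (a UI, an API) reject malformed outputs early instead of parsing free-form text.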

🔄 The Workflow Process

URL Input → Content Extraction → Claim Identification → Cross-Reference → Final Verdict

Step-by-Step Breakdown:

  1. Input: User provides a social media URL
  2. Extraction: Content Extractor Agent scrapes the post using Bright Data tools
  3. Analysis: Claim Identifier Agent parses content for verifiable facts
  4. Verification: Cross-Reference Agent searches authoritative sources
  5. Synthesis: Verdict Agent combines all evidence into a final assessment
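The five steps above amount to sequential function composition over a shared context. Here is a minimal sketch of that flow with plain Python functions standing in for the real agents (all names are illustrative):

```python
from typing import Callable, Dict, List

# Each stage takes the accumulated context and returns an updated copy,
# mirroring how context is passed from agent to agent in the pipeline.
def extract_content(ctx: Dict) -> Dict:
    return {**ctx, "content": f"post text from {ctx['url']}"}

def identify_claims(ctx: Dict) -> Dict:
    return {**ctx, "claims": [f"claim about {ctx['content']}"]}

def cross_reference(ctx: Dict) -> Dict:
    return {**ctx, "evidence": [f"source for {c}" for c in ctx["claims"]]}

def synthesize_verdict(ctx: Dict) -> Dict:
    label = "MISLEADING" if ctx["evidence"] else "INSUFFICIENT_EVIDENCE"
    return {**ctx, "verdict": label}

PIPELINE: List[Callable[[Dict], Dict]] = [
    extract_content, identify_claims, cross_reference, synthesize_verdict,
]

def fact_check(url: str) -> Dict:
    ctx: Dict = {"url": url}
    for stage in PIPELINE:
        ctx = stage(ctx)  # context accumulates; nothing is lost between steps
    return ctx

result = fact_check("https://example.com/post/123")
```

The real system delegates each stage to an LLM agent, but the data-flow shape is the same: every stage sees everything produced before it.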

🛠️ Technical Implementation

Core Technologies Used:

  • Python 3.8+
  • Google Gemini (gemini-2.0-flash) as the reasoning model for every agent
  • Bright Data MCP server for social media data extraction
  • An agent framework providing the Agent and Team abstractions

Key Features:

Multi-Platform Support

# Supports major social platforms
platforms = [
    "TikTok", "Instagram", "Twitter/X", 
    "Facebook", "YouTube", "LinkedIn"
]

Intelligent Tool Selection

The system automatically chooses the optimal scraping method based on the platform:

  • Structured data extraction for TikTok
  • Markdown scraping for other platforms
  • Fallback mechanisms for reliability
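A hypothetical sketch of that selection logic, using the Bright Data tool names shown earlier in the post (the function and mapping are illustrative, not from the FactFlux codebase):

```python
from urllib.parse import urlparse

# Platforms with a dedicated structured-data tool; everything else
# falls back to generic markdown scraping.
STRUCTURED_TOOLS = {
    "tiktok.com": "web_data_tiktok_posts",
}

def choose_scraping_tool(url: str) -> str:
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return STRUCTURED_TOOLS.get(host, "scrape_as_markdown")
```

Keeping the mapping in a dict makes adding a new platform a one-line change rather than another branch in the extraction agent.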

Comprehensive Error Handling

# Built-in resilience for:
error_scenarios = [
    "Invalid URLs",
    "Network failures", 
    "API rate limits",
    "Malformed social media posts",
    "Missing content"
]
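For transient failures like network errors and rate limits, a retry wrapper with backoff is the usual pattern. This is a generic sketch under my own names (`ExtractionError`, `with_retries`), not the project's actual error-handling code:

```python
import time

class ExtractionError(Exception):
    """Hypothetical error raised when a post cannot be scraped."""

def with_retries(fn, attempts=3, delay=0.1):
    """Retry a flaky call, e.g. on network failures or API rate limits."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except ExtractionError as exc:
            last_exc = exc
            time.sleep(delay * (2 ** attempt))  # exponential backoff
    raise last_exc

# Simulate a scrape that succeeds on the third attempt
calls = {"n": 0}
def flaky_scrape():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ExtractionError("rate limited")
    return {"text": "post content"}

result = with_retries(flaky_scrape)
```

Invalid URLs and permanently missing content, by contrast, should fail fast rather than retry, so they would be raised as a different exception type.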

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • Google Gemini API key
  • Bright Data API key

Installation

# Clone the repository
git clone https://github.com/MeirKaD/FactFlux.git
cd FactFlux

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment variables
cp .env.example .env
# Edit .env with your API keys

Configuration

Create a .env file:

GOOGLE_API_KEY=your_google_gemini_api_key_here
BRIGHT_DATA_API_KEY=your_bright_data_api_key_here

Usage

# Run the playground interface
python playground_fact_check.py

The system launches a web interface at http://localhost:7777 where you can:

  • Input social media URLs
  • Watch agents work in real-time
  • See the complete fact-checking process
  • Review structured reports with sources

📊 Sample Output

When you submit a social media URL, FactFlux produces a comprehensive report:

## Post Summary
TikTok video claiming "95% of ocean plastic comes from 10 rivers in Asia"

## Claims Identified
1. 95% of ocean plastic originates from rivers
2. These rivers are specifically located in Asia
3. Only 10 rivers are responsible

## Verification Results
- Claim 1: PARTIALLY TRUE - Studies show 80-90% from rivers (sources: Nature, Science journals)
- Claim 2: MISLEADING - 8 rivers in Asia, 2 in Africa (source: Environmental Science & Technology)
- Claim 3: TRUE - Confirmed by multiple peer-reviewed studies

## Citations
[1] Lebreton et al. (2017) Nature Communications - doi:10.1038/ncomms15611
[2] Schmidt et al. (2017) Environmental Science & Technology - doi:10.1021/acs.est.7b02368

## Final Verdict: MISLEADING (Confidence: 85%)
The core claim about river pollution is scientifically supported, but the specific statistics are inaccurate.

💡 Why Multi-Agent Architecture?

Specialization Benefits:

  • Each agent focuses on what it does best
  • Reduces complexity and improves accuracy
  • Easier to debug and improve individual components

Scalability:

  • Can add new agents for specific platforms or verification types
  • Parallel processing capabilities
  • Modular design allows independent updates

Transparency:

  • Shows the complete reasoning chain
  • Users can see how each agent contributed
  • Builds trust through explainable AI

🔮 Future Enhancements

Planned Features:

  • Real-time monitoring of trending posts
  • Browser extension for instant fact-checking
  • API endpoints for integration with other platforms
  • Machine learning to improve claim detection accuracy
  • Collaborative filtering with human fact-checkers

Technical Improvements:

  • Support for more social platforms (Reddit, Snapchat, Discord)
  • Enhanced media analysis (deepfake detection)
  • Multi-language support
  • Performance optimizations for high-volume processing

🤝 Contributing

FactFlux is open-source and welcomes contributions:

# Contributing workflow
1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

Areas for contribution:

  • New platform integrations
  • Improved claim detection algorithms
  • Additional verification sources
  • UI/UX improvements
  • Documentation and tutorials

🔍 Technical Deep Dive

Agent Coordination

from agno.team import Team  # assumed Agno import

# The four agents built earlier, assembled into a coordinated team
fact_check_team = Team(
    name="Social Media Fact Check Team",
    mode="coordinate",
    model=Gemini(id="gemini-2.0-flash"),
    members=[content_extractor, claim_identifier, cross_reference, verdict_agent],
    show_members_responses=True,
    enable_agentic_context=True,
)

The team uses coordinate mode, meaning agents work sequentially, passing context between each step. This ensures:

  • Each agent has access to previous findings
  • No information is lost in the pipeline
  • The final verdict considers all evidence

Tool Integration

import os

from mcp import StdioServerParameters  # MCP Python SDK

# Launch Bright Data's MCP server as a subprocess over stdio
server_params = StdioServerParameters(
    command="npx",
    args=["@brightdata/mcp"],
    env={
        "API_TOKEN": os.getenv("BRIGHT_DATA_API_KEY"),
        "WEB_UNLOCKER_ZONE": "unblocker",
        "BROWSER_ZONE": "scraping_browser"
    }
)

Bright Data integration provides:

  • Reliable scraping of geo-blocked content
  • Bypassing of anti-bot detection mechanisms
  • High success rates for social media extraction
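For reference, the StdioServerParameters configuration above is roughly equivalent to launching the server manually from a shell (a sketch, assuming the same environment variable names):

```shell
# Same subprocess the MCP client spawns over stdio
export API_TOKEN="$BRIGHT_DATA_API_KEY"
export WEB_UNLOCKER_ZONE="unblocker"
export BROWSER_ZONE="scraping_browser"
npx @brightdata/mcp
```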

📈 Impact and Use Cases

Educational Institutions

  • Teaching media literacy
  • Research on misinformation patterns
  • Academic fact-checking projects

News Organizations

  • Automated preliminary fact-checking
  • Source verification assistance
  • Social media monitoring

Individual Users

  • Personal fact-checking tool
  • Browser extension for real-time verification
  • Educational resource for critical thinking

🔒 Ethical Considerations

Responsible AI Principles:

  • Transparency: All sources and reasoning are shown
  • Accuracy: Conservative approach when evidence is weak
  • Privacy: No personal data storage or tracking
  • Bias Mitigation: Multiple source cross-referencing

Limitations:

  • Cannot replace human judgment entirely
  • Dependent on source availability and quality
  • May struggle with highly contextual or cultural claims
  • Requires ongoing updates as misinformation tactics evolve

🎯 Conclusion

FactFlux represents a significant step forward in automated fact-checking technology. By leveraging multi-agent architecture, we've created a system that's both powerful and transparent, capable of handling the complexity of modern misinformation while remaining explainable to users.

The project demonstrates how AI agents can work together to solve complex problems that require multiple specialized skills. As misinformation continues to evolve, tools like FactFlux will play a crucial role in maintaining information integrity across social platforms.

Try FactFlux today: https://github.com/MeirKaD/FactFlux


🏷️ Tags

#ai #factchecking #misinformation #python #multiagent #socialmedia #automation #gemini #brightdata #opensource


Have you built similar AI agent systems? What challenges did you face with multi-agent coordination? Share your experiences in the comments below!
