Can people tell AI-written news from human-written journalism? As large language models grow more capable, the answer is becoming increasingly uncomfortable. This is the question at the heart of two open-source research platforms: JudgeGPT and RogueGPT.
Both are licensed under GPLv3 and have companion papers accepted at The Web Conference 2026 (WWW '26).
The Problem: Industrialized Deception
Generative AI has created an asymmetric arms race. Producing convincing synthetic news now costs almost nothing. Detecting it reliably does not. Two papers at WWW '26 address this:
- "Industrialized Deception: The Collateral Effects of LLM-Generated Misinformation on Digital Ecosystems" (arXiv:2601.21963) -- systemic effects of LLM-generated misinformation on trust networks.
- "Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild" (arXiv:2601.22871) -- key finding: the human truth-default is being measurably eroded by LLM-generated content.
RogueGPT: Controlled Stimulus Generation
RogueGPT is a Python framework for generating controlled news stimuli. The current corpus contains 2,663 multilingual news fragments: 37 model configurations across 10 providers (OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, Microsoft, Zhipu, Moonshot, Qwen, MiniMax), 4 languages, 3 formats, 5 journalistic styles per language, and 222 human-sourced fragments as experimental anchors.
Three interfaces over a shared data layer: Streamlit app, CLI, and an MCP server exposing tools for AI agent integration.
```shell
git clone https://github.com/aloth/RogueGPT
cd RogueGPT
pip install -r requirements.txt

python cli.py ingest --text "..." --model "gpt-4o" --language en --style nyt --format article
python cli.py retrieve --model "gpt-4o" --language en --limit 10
```
JudgeGPT: Human Evaluation at Scale
JudgeGPT is a live Streamlit platform collecting human judgments on news authenticity. Participants evaluate fragments on three 7-point scales: source attribution (human vs. machine), veracity (legitimate vs. fake), and topic familiarity.
After each submission, participants see the ground truth and the specific model that generated the content. A shareable score card is generated every 5 responses.
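To turn 7-point ratings into a detection result, the source-attribution scale has to be collapsed against ground truth. Here is one illustrative way to score it, assuming a 1 = "human", 7 = "machine" convention with 4 as undecided; this is a sketch, not JudgeGPT's actual scoring pipeline:

```python
def rating_to_label(rating: int) -> str:
    """Collapse a 7-point source-attribution rating to a binary guess."""
    if rating < 4:
        return "human"
    if rating > 4:
        return "machine"
    return "undecided"

def detection_accuracy(responses: list[tuple[int, str]]) -> float:
    """responses: (rating, ground_truth) pairs; undecided counts as wrong."""
    correct = sum(rating_to_label(r) == truth for r, truth in responses)
    return correct / len(responses)

# Two confident correct answers, one fence-sitter on a machine-written piece.
acc = detection_accuracy([(6, "machine"), (2, "human"), (4, "machine")])
# 2 of 3 correct
```

Treating the midpoint as wrong is a deliberately conservative choice; an analysis could equally drop undecided responses or keep the full 7-point signal.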
Live survey: judgegpt.streamlit.app
Why It Matters for Developers
Every fragment carries full provenance: model, parameters, and seed. That enables questions beyond "can humans detect AI-generated news?": Which models are hardest to detect? In which languages? By which demographic groups?
Corpus on Zenodo (academic access): DOI: 10.5281/zenodo.18703138
Both repos are GPLv3. Contributions welcome.
- GitHub: aloth/JudgeGPT | aloth/RogueGPT
- arXiv: 2601.21963 | 2601.22871