Ask Claude, GPT, and Gemini the same question and you get three different answers — and
you're left guessing which to trust.
So I built PolyHelper: send one prompt, and it runs across many models at once, then shows
where they agree, where they disagree, and a consensus answer with an audit trail behind it.
Three modes:
- Convergent — one answer + how strongly the models agreed
- Divergent — the perspectives side by side
- Artifact council — models critique and rank each other
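
To make the Convergent idea concrete, here is a minimal sketch of "one answer + how strongly the models agreed" using a plain majority vote over normalized answers. This is not PolyHelper's actual implementation: the function names (`normalize`, `convergent`), the result keys, and the exact-match comparison are all assumptions for illustration, and the model answers are hard-coded rather than fetched from any API.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Collapse case and whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def convergent(answers: dict[str, str]) -> dict:
    """Pick the most common answer and report how many models backed it.

    `answers` maps a model name to its raw text answer (hypothetical shape).
    """
    if not answers:
        raise ValueError("need at least one model answer")

    normalized = {model: normalize(text) for model, text in answers.items()}
    counts = Counter(normalized.values())
    winner, votes = counts.most_common(1)[0]

    return {
        "answer": winner,
        # Share of models that gave the winning answer, between 0 and 1.
        "agreement": votes / len(answers),
        # Models that disagreed, kept so the result can be audited later.
        "dissenters": sorted(m for m, a in normalized.items() if a != winner),
    }


result = convergent({"claude": "Paris", "gpt": "paris ", "gemini": "Lyon"})
print(result)
```

With these three sample answers, two of three models land on the same normalized string, so the agreement score is 2/3 and `gemini` is listed as the dissenter. Exact string matching only works for short factual answers; free-form text would need some similarity measure instead.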
Open source, Apache-2.0: https://github.com/PolyHelper/polyhelper
What consensus method would you actually trust? Majority vote feels naive, and LLM-as-judge
has its own biases. How would you approach it?