DEV Community

Edwin Lisowski

ContextCheck: An open-source framework for testing and evaluating LLMs, RAGs, and chatbots

#ai

Hey devs!

We just open-sourced ContextCheck, a framework for testing and evaluating LLMs, RAGs, and chatbots 🚀

What it does:

  • Generates queries and handles completions
  • Detects regressions and hallucinations
  • Runs penetration tests
  • Works in CI pipelines (YAML-configurable)
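
To give a feel for what "detecting regressions" can look like in practice, here's a tiny sketch in plain Python. Heads up: this is not ContextCheck's API or config format. `Case`, `run_regression`, and `fake_model` are hypothetical names made up for illustration, and the model is a hard-coded stub standing in for a real LLM or RAG call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One regression case: a query plus phrases the answer must contain."""
    query: str
    must_contain: List[str]


def run_regression(model: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Run every case through the model and return a list of failure messages."""
    failures = []
    for case in cases:
        answer = model(case.query).lower()
        missing = [p for p in case.must_contain if p.lower() not in answer]
        if missing:
            failures.append(f"{case.query!r}: missing {missing}")
    return failures


def fake_model(query: str) -> str:
    """Hypothetical stand-in for a real LLM/RAG endpoint (assumption, not a real API)."""
    answers = {"What is the refund window?": "Refunds are accepted within 30 days."}
    return answers.get(query, "I don't know.")


# Hypothetical test cases: one grounded answer, one where the model should admit ignorance.
CASES = [
    Case("What is the refund window?", ["30 days"]),
    Case("Who founded the company?", ["don't know"]),
]

failures = run_regression(fake_model, CASES)
print("failures:", failures)
```

In a CI job you could fail the build whenever `failures` is non-empty. Simple substring checks like this are only a starting point and won't catch subtle hallucinations.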

We built it while developing our AI Knowledge Base Assistant to solve real headaches with testing and validating LLMs. Now it’s out there for you to use, break, and improve.

Try it out and let us know what you think! ➡️ GitHub repo
