
Edwin Lisowski

ContextCheck: LLM & RAG Evaluation Framework

Hi all! We've open-sourced a framework for testing LLMs, RAG systems, and chatbots. The tool automates query generation, completion requests, regression detection, penetration testing, and hallucination assessment. It's designed for developers, researchers, and businesses, and we're looking for contributors! Feel free to try it out and share your feedback!

Repo on GitHub
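To make the hallucination-assessment part concrete, here is a minimal sketch of the LLM-as-a-judge pattern such a framework automates. The names (`EvalCase`, `judge_hallucination`) are illustrative assumptions for this post, not ContextCheck's actual API, and the judge itself is stubbed with a trivial keyword-overlap check.

```python
# Illustrative sketch only -- names are hypothetical, not ContextCheck's real API.
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    context: str  # retrieved passages the answer must be grounded in
    answer: str   # the RAG/chatbot completion under test


def judge_hallucination(case: EvalCase) -> bool:
    """Flag an answer that is not supported by its retrieved context.

    Stubbed with a naive keyword-overlap check; an LLM-as-a-judge setup would
    instead send a grading prompt to a separate model and parse its verdict.
    """
    answer_tokens = set(case.answer.lower().split())
    context_tokens = set(case.context.lower().split())
    return not (answer_tokens & context_tokens)


cases = [
    EvalCase(
        question="What does the evaluation framework automate?",
        context="The framework automates query generation and hallucination assessment.",
        answer="It automates query generation and hallucination assessment.",
    ),
]

for case in cases:
    print(f"{case.question!r} -> hallucination flagged: {judge_hallucination(case)}")
```

In a real run the judge call would go to a separate grading model, and the cases would come from the automated query generation the post mentions.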

Top comments (1)

Mayank Laddha

Hi, nice work! I'd love to know why most frameworks use only "LLM as a judge" for hallucination detection. Why not perplexity and semantic entropy? dev.to/mayank_laddha_21ef3e061ff/d...
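For readers unfamiliar with the two metrics the comment mentions, here is a rough, self-contained sketch of how they can be computed. The inputs (token log-probabilities and answer cluster labels) are made-up example values, not output from any particular model, and the upstream clustering of sampled answers is assumed to have happened elsewhere.

```python
# Toy computation of perplexity and semantic entropy from made-up model outputs.
import math
from collections import Counter


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp(-mean log-probability of the generated tokens)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def semantic_entropy(cluster_labels: list[str]) -> float:
    """Entropy over clusters of semantically equivalent sampled answers.

    Clustering the samples (e.g. with an entailment model) is assumed to have
    happened upstream; this only aggregates the cluster frequencies.
    """
    counts = Counter(cluster_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Log-probs of one completion's tokens, and cluster labels of five sampled answers.
print(perplexity([-0.10, -0.30, -2.20, -0.05]))                        # ~1.94, fairly confident
print(semantic_entropy(["paris", "paris", "paris", "lyon", "paris"]))  # ~0.50, some disagreement
```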


