<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Giovanna</title>
    <description>The latest articles on DEV Community by Giovanna (@giovanna_8c3492d882250ac6).</description>
    <link>https://dev.to/giovanna_8c3492d882250ac6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2815653%2Ffe0fd230-55fc-4a18-b4eb-5c1f5c700e13.png</url>
      <title>DEV Community: Giovanna</title>
      <link>https://dev.to/giovanna_8c3492d882250ac6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/giovanna_8c3492d882250ac6"/>
    <language>en</language>
    <item>
      <title>OS tool to debug LLM reasoning patterns with entropy analysis</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Sun, 09 Feb 2025 12:27:23 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/os-tool-to-debug-llm-reasoning-patterns-with-entropy-analysis-5hcf</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/os-tool-to-debug-llm-reasoning-patterns-with-entropy-analysis-5hcf</guid>
      <description>&lt;p&gt;After struggling to understand why our reasoning models would sometimes produce flawless reasoning or go completely off track - we updated Klarity to get instant insights into reasoning uncertainty and concrete suggestions for dataset and prompt optimization. Just point it at your model to save testing time.&lt;br&gt;
Key new features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify where your model's reasoning goes off track with step-by-step entropy analysis&lt;/li&gt;
&lt;li&gt;Get actionable scores for coherence and confidence at each reasoning step&lt;/li&gt;
&lt;li&gt;Training data insights: identify which reasoning data lead to high-quality outputs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Structured JSON output with step-by-step analysis:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;steps: array of {step_number, content, entropy_score, semantic_score, top_tokens[]}&lt;/li&gt;
&lt;li&gt;quality_metrics: array of {step, coherence, relevance, confidence}&lt;/li&gt;
&lt;li&gt;reasoning_insights: array of {step, type, pattern, suggestions[]}&lt;/li&gt;
&lt;li&gt;training_targets: array of {aspect, current_issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
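&lt;p&gt;As a rough illustration of how that output could be consumed (the field names come from the list above; the values and the 0.5 threshold are made up for this sketch, not real Klarity output):&lt;/p&gt;

```python
import json

# Payload shaped like the fields listed above; values are illustrative only.
report = json.loads("""
{
  "steps": [
    {"step_number": 1, "content": "Compute the sum", "entropy_score": 0.12,
     "semantic_score": 0.91, "top_tokens": ["add", "sum"]},
    {"step_number": 2, "content": "Divide the total", "entropy_score": 0.87,
     "semantic_score": 0.34, "top_tokens": ["divide", "split", "halve"]}
  ],
  "quality_metrics": [
    {"step": 1, "coherence": 0.9, "relevance": 0.95, "confidence": 0.92},
    {"step": 2, "coherence": 0.4, "relevance": 0.5, "confidence": 0.3}
  ]
}
""")

# Flag steps whose entropy exceeds a chosen threshold (0.5 is an arbitrary cutoff).
THRESHOLD = 0.5
flagged = [s["step_number"] for s in report["steps"] if s["entropy_score"] > THRESHOLD]
print(flagged)  # [2]
```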

&lt;p&gt;Example use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Debug why your model's reasoning fails on edge cases&lt;/li&gt;
&lt;li&gt;Identify which types of reasoning steps contribute to better outcomes&lt;/li&gt;
&lt;li&gt;Optimize your RL datasets by focusing on high-quality reasoning patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Klarity currently supports Hugging Face transformers and the Together AI API. We tested the library with the DeepSeek R1 distilled series (Qwen-1.5b, Qwen-7b, etc.).&lt;/p&gt;

&lt;p&gt;Installation: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;
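&lt;p&gt;For intuition, the entropy signal this kind of analysis is built on can be sketched in a few lines of plain Python. This shows the general technique (Shannon entropy of the next-token distribution), not Klarity's actual API:&lt;/p&gt;

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one logit vector."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs)

# A peaked distribution (confident model) vs. a flat one (uncertain model).
print(token_entropy([10.0, 0.0, 0.0, 0.0]))  # close to 0
print(token_entropy([1.0, 1.0, 1.0, 1.0]))   # ln(4), about 1.386
```

Low entropy means the model concentrated its probability mass on one token; high entropy marks the positions where it was genuinely unsure.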

&lt;p&gt;We are building open-source interpretability/explainability tools to debug generative model behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Discord: &lt;a href="https://discord.gg/wCnTRzBE" rel="noopener noreferrer"&gt;https://discord.gg/wCnTRzBE&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>deepseek</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Open-source tool to analyze uncertainty/entropy in LLM output (github.com/klara-research)</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Tue, 04 Feb 2025 16:24:18 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-42kp</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-42kp</guid>
      <description>&lt;p&gt;We've open-sourced Klarity - a tool for analyzing uncertainty and decision-making in LLM token generation. It provides structured insights into how models choose tokens and where they show uncertainty.&lt;br&gt;
What Klarity does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time analysis of model uncertainty during generation&lt;/li&gt;
&lt;li&gt;Dual analysis combining log probabilities and semantic understanding&lt;/li&gt;
&lt;li&gt;Structured JSON output with actionable insights&lt;/li&gt;
&lt;li&gt;Fully self-hostable with customizable analysis models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tool works by analyzing each step of text generation and returns a structured JSON:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uncertainty_points: array of {step, entropy, options[], type}&lt;/li&gt;
&lt;li&gt;high_confidence: array of {step, probability, token, context}&lt;/li&gt;
&lt;li&gt;risk_areas: array of {type, steps[], motivation}&lt;/li&gt;
&lt;li&gt;suggestions: array of {issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
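&lt;p&gt;For intuition, here is how a per-step entropy like the one in uncertainty_points could be derived from top-token log-probabilities. The step data, field names beyond those listed above, and the renormalization over the top candidates are illustrative assumptions, not Klarity's actual API:&lt;/p&gt;

```python
import math

# Hypothetical per-step top-token log-probabilities (values are made up).
steps = [
    {"step": 0, "top_logprobs": {"Paris": -0.05, "Lyon": -3.2, "Nice": -4.1}},
    {"step": 1, "top_logprobs": {"is": -0.9, "was": -1.1, "remains": -1.4}},
]

def entropy_from_logprobs(logprobs):
    """Entropy (nats) over the listed candidates, renormalized so they sum to 1."""
    probs = [math.exp(lp) for lp in logprobs.values()]
    z = sum(probs)
    probs = [p / z for p in probs]
    return -sum(p * math.log(p) for p in probs)

uncertainty_points = [
    {"step": s["step"],
     "entropy": round(entropy_from_logprobs(s["top_logprobs"]), 3),
     "options": sorted(s["top_logprobs"], key=s["top_logprobs"].get, reverse=True)}
    for s in steps
]
print(uncertainty_points)
```

Step 1, where the three candidates have similar log-probabilities, scores a much higher entropy than step 0, where one token dominates.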

&lt;p&gt;Klarity currently supports Hugging Face transformers (more frameworks are coming). We tested extensively with Qwen2.5 (0.5B-7B) models, but it should work with most HF LLMs.&lt;/p&gt;

&lt;p&gt;Installation is simple: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We are building open-source interpretability/explainability tools to visualize &amp;amp; analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>explainability</category>
      <category>llm</category>
      <category>github</category>
    </item>
    <item>
      <title>Klarity – Open-source tool to analyze uncertainty/entropy in LLM output (github.com/klara-research)</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Tue, 04 Feb 2025 16:16:00 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/klarity-open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-24nl</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/klarity-open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-24nl</guid>
      <description>&lt;p&gt;We've open-sourced Klarity - a tool for analyzing uncertainty and decision-making in LLM token generation. It provides structured insights into how models choose tokens and where they show uncertainty.&lt;br&gt;
What Klarity does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time analysis of model uncertainty during generation&lt;/li&gt;
&lt;li&gt;Dual analysis combining log probabilities and semantic understanding&lt;/li&gt;
&lt;li&gt;Structured JSON output with actionable insights&lt;/li&gt;
&lt;li&gt;Fully self-hostable with customizable analysis models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tool works by analyzing each step of text generation and returns a structured JSON:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uncertainty_points: array of {step, entropy, options[], type}&lt;/li&gt;
&lt;li&gt;high_confidence: array of {step, probability, token, context}&lt;/li&gt;
&lt;li&gt;risk_areas: array of {type, steps[], motivation}&lt;/li&gt;
&lt;li&gt;suggestions: array of {issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
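&lt;p&gt;As a toy illustration of how steps might be split into the high_confidence and uncertainty_points buckets above (the per-step entropies and the 0.3-nat cutoff are made up for this sketch, not real Klarity output):&lt;/p&gt;

```python
# Per-step entropies in nats, keyed by step index (illustrative numbers).
step_entropies = {0: 0.05, 1: 0.72, 2: 0.11, 3: 1.40}
CUTOFF = 0.3  # arbitrary threshold chosen for this sketch

# Steps at or above the cutoff are flagged; the rest are treated as decisive.
uncertainty_points = [s for s, h in step_entropies.items() if h >= CUTOFF]
high_confidence = [s for s in step_entropies if s not in uncertainty_points]
print(uncertainty_points)  # [1, 3] - steps worth inspecting
print(high_confidence)     # [0, 2] - steps where the model was decisive
```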

&lt;p&gt;Klarity currently supports Hugging Face transformers (more frameworks are coming). We tested extensively with Qwen2.5 (0.5B-7B) models, but it should work with most HF LLMs.&lt;/p&gt;

&lt;p&gt;Installation is simple: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We are building open-source interpretability/explainability tools to visualize &amp;amp; analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me know in the comments if you find it useful, and share any other feedback!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>deepseek</category>
      <category>interpretability</category>
    </item>
  </channel>
</rss>
