<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Giovanna</title>
    <description>The latest articles on DEV Community by Giovanna (@giovanna_8c3492d882250ac6).</description>
    <link>https://dev.to/giovanna_8c3492d882250ac6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2815653%2Ffe0fd230-55fc-4a18-b4eb-5c1f5c700e13.png</url>
      <title>DEV Community: Giovanna</title>
      <link>https://dev.to/giovanna_8c3492d882250ac6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/giovanna_8c3492d882250ac6"/>
    <language>en</language>
    <item>
      <title>OS tool to debug LLM reasoning patterns with entropy analysis</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Sun, 09 Feb 2025 12:27:23 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/os-tool-to-debug-llm-reasoning-patterns-with-entropy-analysis-5hcf</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/os-tool-to-debug-llm-reasoning-patterns-with-entropy-analysis-5hcf</guid>
      <description>&lt;p&gt;After struggling to understand why our reasoning models would sometimes produce flawless reasoning or go completely off track - we updated Klarity to get instant insights into reasoning uncertainty and concrete suggestions for dataset and prompt optimization. Just point it at your model to save testing time.&lt;br&gt;
Key new features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify where your model's reasoning goes off track with step-by-step entropy analysis&lt;/li&gt;
&lt;li&gt;Get actionable scores for coherence and confidence at each reasoning step&lt;/li&gt;
&lt;li&gt;Training data insights: identify which reasoning data lead to high-quality outputs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Structured JSON output with step-by-step analysis:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;steps: array of {step_number, content, entropy_score, semantic_score, top_tokens[]}&lt;/li&gt;
&lt;li&gt;quality_metrics: array of {step, coherence, relevance, confidence}&lt;/li&gt;
&lt;li&gt;reasoning_insights: array of {step, type, pattern, suggestions[]}&lt;/li&gt;
&lt;li&gt;training_targets: array of {aspect, current_issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
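&lt;p&gt;As a rough illustration of how that output could be consumed (the field names come from the list above; the values and the 0.5 threshold are made up for this sketch, not real Klarity output):&lt;/p&gt;

```python
import json

# Payload shaped like the fields listed above; values are illustrative only.
report = json.loads("""
{
  "steps": [
    {"step_number": 1, "content": "Compute the sum", "entropy_score": 0.12,
     "semantic_score": 0.91, "top_tokens": ["add", "sum"]},
    {"step_number": 2, "content": "Divide the total", "entropy_score": 0.87,
     "semantic_score": 0.34, "top_tokens": ["divide", "split", "halve"]}
  ],
  "quality_metrics": [
    {"step": 1, "coherence": 0.9, "relevance": 0.95, "confidence": 0.92},
    {"step": 2, "coherence": 0.4, "relevance": 0.5, "confidence": 0.3}
  ]
}
""")

# Flag steps whose entropy exceeds a chosen threshold (0.5 is an arbitrary cutoff).
THRESHOLD = 0.5
flagged = [s["step_number"] for s in report["steps"] if s["entropy_score"] > THRESHOLD]
print(flagged)  # [2]
```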

&lt;p&gt;Example use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Debug why your model's reasoning fails on edge cases&lt;/li&gt;
&lt;li&gt;Identify which types of reasoning steps contribute to better outcomes&lt;/li&gt;
&lt;li&gt;Optimize your RL datasets by focusing on high-quality reasoning patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Klarity currently supports Hugging Face transformers and the Together AI API. We tested the library with the DeepSeek R1 distilled series (Qwen-1.5b, Qwen-7b, etc.).&lt;/p&gt;

&lt;p&gt;Installation: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;
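&lt;p&gt;For intuition, the entropy signal this kind of analysis is built on can be sketched in a few lines of plain Python. This shows the general technique (Shannon entropy of the next-token distribution), not Klarity's actual API:&lt;/p&gt;

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one logit vector."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs)

# A peaked distribution (confident model) vs. a flat one (uncertain model).
print(token_entropy([10.0, 0.0, 0.0, 0.0]))  # close to 0
print(token_entropy([1.0, 1.0, 1.0, 1.0]))   # ln(4), about 1.386
```

Low entropy means the model concentrated its probability mass on one token; high entropy marks the positions where it was genuinely unsure.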

&lt;p&gt;We are building open-source interpretability/explainability tools to debug generative model behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Discord: &lt;a href="https://discord.gg/wCnTRzBE" rel="noopener noreferrer"&gt;https://discord.gg/wCnTRzBE&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>deepseek</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Open-source tool to analyze uncertainty/entropy in LLM output (github.com/klara-research)</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Tue, 04 Feb 2025 16:24:18 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-42kp</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-42kp</guid>
      <description>&lt;p&gt;We've open-sourced Klarity - a tool for analyzing uncertainty and decision-making in LLM token generation. It provides structured insights into how models choose tokens and where they show uncertainty.&lt;br&gt;
What Klarity does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time analysis of model uncertainty during generation&lt;/li&gt;
&lt;li&gt;Dual analysis combining log probabilities and semantic understanding&lt;/li&gt;
&lt;li&gt;Structured JSON output with actionable insights&lt;/li&gt;
&lt;li&gt;Fully self-hostable with customizable analysis models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tool works by analyzing each step of text generation and returns a structured JSON:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uncertainty_points: array of {step, entropy, options[], type}&lt;/li&gt;
&lt;li&gt;high_confidence: array of {step, probability, token, context}&lt;/li&gt;
&lt;li&gt;risk_areas: array of {type, steps[], motivation}&lt;/li&gt;
&lt;li&gt;suggestions: array of {issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
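&lt;p&gt;For intuition, here is how a per-step entropy like the one in uncertainty_points could be derived from top-token log-probabilities. The step data, field names beyond those listed above, and the renormalization over the top candidates are illustrative assumptions, not Klarity's actual API:&lt;/p&gt;

```python
import math

# Hypothetical per-step top-token log-probabilities (values are made up).
steps = [
    {"step": 0, "top_logprobs": {"Paris": -0.05, "Lyon": -3.2, "Nice": -4.1}},
    {"step": 1, "top_logprobs": {"is": -0.9, "was": -1.1, "remains": -1.4}},
]

def entropy_from_logprobs(logprobs):
    """Entropy (nats) over the listed candidates, renormalized so they sum to 1."""
    probs = [math.exp(lp) for lp in logprobs.values()]
    z = sum(probs)
    probs = [p / z for p in probs]
    return -sum(p * math.log(p) for p in probs)

uncertainty_points = [
    {"step": s["step"],
     "entropy": round(entropy_from_logprobs(s["top_logprobs"]), 3),
     "options": sorted(s["top_logprobs"], key=s["top_logprobs"].get, reverse=True)}
    for s in steps
]
print(uncertainty_points)
```

Step 1, where the three candidates have similar log-probabilities, scores a much higher entropy than step 0, where one token dominates.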

&lt;p&gt;Klarity currently supports Hugging Face transformers (more frameworks are coming). We tested extensively with Qwen2.5 (0.5B-7B) models, but it should work with most HF LLMs.&lt;/p&gt;

&lt;p&gt;Installation is simple: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We are building open-source interpretability/explainability tools to visualize &amp;amp; analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>opensource</category>
      <category>explainability</category>
      <category>llm</category>
      <category>github</category>
    </item>
    <item>
      <title>Klarity – Open-source tool to analyze uncertainty/entropy in LLM output (github.com/klara-research)</title>
      <dc:creator>Giovanna</dc:creator>
      <pubDate>Tue, 04 Feb 2025 16:16:00 +0000</pubDate>
      <link>https://dev.to/giovanna_8c3492d882250ac6/klarity-open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-24nl</link>
      <guid>https://dev.to/giovanna_8c3492d882250ac6/klarity-open-source-tool-to-analyze-uncertaintyentropy-in-llm-output-githubcomklara-research-24nl</guid>
      <description>&lt;p&gt;We've open-sourced Klarity - a tool for analyzing uncertainty and decision-making in LLM token generation. It provides structured insights into how models choose tokens and where they show uncertainty.&lt;br&gt;
What Klarity does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time analysis of model uncertainty during generation&lt;/li&gt;
&lt;li&gt;Dual analysis combining log probabilities and semantic understanding&lt;/li&gt;
&lt;li&gt;Structured JSON output with actionable insights&lt;/li&gt;
&lt;li&gt;Fully self-hostable with customizable analysis models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tool works by analyzing each step of text generation and returns a structured JSON:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uncertainty_points: array of {step, entropy, options[], type}&lt;/li&gt;
&lt;li&gt;high_confidence: array of {step, probability, token, context}&lt;/li&gt;
&lt;li&gt;risk_areas: array of {type, steps[], motivation}&lt;/li&gt;
&lt;li&gt;suggestions: array of {issue, improvement}&lt;/li&gt;
&lt;/ul&gt;
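&lt;p&gt;As a toy illustration of how steps might be split into the high_confidence and uncertainty_points buckets above (the per-step entropies and the 0.3-nat cutoff are made up for this sketch, not real Klarity output):&lt;/p&gt;

```python
# Per-step entropies in nats, keyed by step index (illustrative numbers).
step_entropies = {0: 0.05, 1: 0.72, 2: 0.11, 3: 1.40}
CUTOFF = 0.3  # arbitrary threshold chosen for this sketch

# Steps at or above the cutoff are flagged; the rest are treated as decisive.
uncertainty_points = [s for s, h in step_entropies.items() if h >= CUTOFF]
high_confidence = [s for s in step_entropies if s not in uncertainty_points]
print(uncertainty_points)  # [1, 3] - steps worth inspecting
print(high_confidence)     # [0, 2] - steps where the model was decisive
```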

&lt;p&gt;Klarity currently supports Hugging Face transformers (more frameworks are coming). We tested extensively with Qwen2.5 (0.5B-7B) models, but it should work with most HF LLMs.&lt;/p&gt;

&lt;p&gt;Installation is simple: &lt;code&gt;pip install git+https://github.com/klara-research/klarity.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We are building open-source interpretability/explainability tools to visualize &amp;amp; analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behaviors. What insights would actually help you debug these black-box systems?&lt;/p&gt;

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/klara-research/klarity" rel="noopener noreferrer"&gt;https://github.com/klara-research/klarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Our website: &lt;a href="https://klaralabs.com/" rel="noopener noreferrer"&gt;https://klaralabs.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me know in the comments if you find it useful, and share any other feedback!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>ai</category>
      <category>deepseek</category>
      <category>interpretability</category>
    </item>
  </channel>
</rss>
