<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nicolai Bohn</title>
    <description>The latest articles on DEV Community by Nicolai Bohn (@nicolai_bohn_rhesis).</description>
    <link>https://dev.to/nicolai_bohn_rhesis</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3842808%2F5b6f6897-8d90-43e4-8a4a-1585c2787923.png</url>
      <title>DEV Community: Nicolai Bohn</title>
      <link>https://dev.to/nicolai_bohn_rhesis</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nicolai_bohn_rhesis"/>
    <language>en</language>
    <item>
      <title>Launch week day 1: drive the full AI testing workflow from inside any AI tool</title>
      <dc:creator>Nicolai Bohn</dc:creator>
      <pubDate>Mon, 04 May 2026 17:17:03 +0000</pubDate>
      <link>https://dev.to/nicolai_bohn_rhesis/launch-week-day-1-drive-the-full-ai-testing-workflow-from-inside-any-ai-tool-1b6b</link>
      <guid>https://dev.to/nicolai_bohn_rhesis/launch-week-day-1-drive-the-full-ai-testing-workflow-from-inside-any-ai-tool-1b6b</guid>
      <description>&lt;p&gt;Building an AI agent uses one tool. Testing it uses another. Every iteration cycle ends with you switching to your test platform UI, running tests, inspecting results, then back to your editor to fix what broke. The context switch is small, but it adds up.&lt;/p&gt;

&lt;p&gt;Today we shipped the fix: the &lt;strong&gt;Rhesis Agent Skill&lt;/strong&gt;, day 1 of Rhesis Launch Week.&lt;/p&gt;

&lt;p&gt;If you build LLM agents and use Claude Code, Cursor, Codex, Gemini CLI, or any of 40+ other AI tools, you can now drive the full Rhesis testing workflow from inside the chat where you write the code.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/_YYj98Lu5rU"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;What it does&lt;/h2&gt;

&lt;p&gt;The Agent Skill packages our domain knowledge into a portable skill file that any compatible AI tool can load. Once installed, your AI assistant gains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Endpoint discovery&lt;/strong&gt; in Quick or Comprehensive mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test suite design&lt;/strong&gt; with behaviors, test sets, and metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confirmation guards&lt;/strong&gt; that wait for approval before anything is created&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test execution&lt;/strong&gt; against your endpoints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure analysis&lt;/strong&gt; with pass/fail summaries and links back to runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this is powered by the Rhesis MCP server (27 tools covering test sets, behaviors, metrics, runs, and OData queries) and driven entirely in natural language.&lt;/p&gt;

&lt;h2&gt;Install&lt;/h2&gt;

&lt;p&gt;A single command installs the skill across all your AI tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add rhesis-ai/rhesis &lt;span class="nt"&gt;-g&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI detects which AI tools you have installed and configures the skill for each one. Then set your API token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;RHESIS_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;rhs_your_token_here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Get a token at &lt;a href="https://app.rhesis.ai/tokens" rel="noopener noreferrer"&gt;app.rhesis.ai/tokens&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Claude Code&lt;/h3&gt;

&lt;p&gt;Claude Code uses a plugin system that bundles the skill and MCP server config together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/plugin marketplace add rhesis-ai/rhesis
/plugin &lt;span class="nb"&gt;install &lt;/span&gt;rhesis@rhesis-ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Cursor&lt;/h3&gt;

&lt;p&gt;Add the MCP server to your &lt;code&gt;.cursor/mcp.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rhesis"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://api.rhesis.ai/mcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"headers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Authorization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bearer YOUR_RHESIS_API_KEY"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For self-hosted backends, swap &lt;code&gt;https://api.rhesis.ai/mcp&lt;/code&gt; for &lt;code&gt;http://localhost:8080/mcp&lt;/code&gt;.&lt;/p&gt;
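
&lt;p&gt;If you switch between the hosted and self-hosted backends often, you could generate that file rather than edit it by hand. The Python sketch below is illustrative only and not part of the Rhesis tooling: the helper names &lt;code&gt;build_cursor_config&lt;/code&gt; and &lt;code&gt;render_cursor_config&lt;/code&gt; are hypothetical, and it assumes the two URLs and the &lt;code&gt;RHESIS_API_KEY&lt;/code&gt; variable shown above.&lt;/p&gt;

```python
import json
import os

# Both URLs are taken from the post; everything else here is a hypothetical helper.
HOSTED_URL = "https://api.rhesis.ai/mcp"
SELF_HOSTED_URL = "http://localhost:8080/mcp"


def build_cursor_config(api_key, self_hosted=False):
    """Return the mcpServers mapping from the Cursor example as a plain dict."""
    if not api_key:
        raise ValueError("API key is empty; export RHESIS_API_KEY first")
    url = SELF_HOSTED_URL if self_hosted else HOSTED_URL
    return {
        "mcpServers": {
            "rhesis": {
                "url": url,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def render_cursor_config(api_key, self_hosted=False):
    """Serialize the config as indented JSON, ready to save as .cursor/mcp.json."""
    return json.dumps(build_cursor_config(api_key, self_hosted), indent=2)


if __name__ == "__main__":
    # Falls back to the placeholder token from the post when the variable is unset.
    token = os.environ.get("RHESIS_API_KEY", "rhs_your_token_here")
    print(render_cursor_config(token, self_hosted=False))
```

&lt;p&gt;Running the script prints the JSON; save the output as &lt;code&gt;.cursor/mcp.json&lt;/code&gt; once it looks right. Keep the generated file out of version control, since it contains your token.&lt;/p&gt;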

&lt;h2&gt;Use it&lt;/h2&gt;

&lt;p&gt;Type something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Test my support agent on billing scenarios, run it, and rank the failures by severity."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The skill walks the conversation through a 6-step loop:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discover&lt;/strong&gt;: explores what your endpoint can do&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan&lt;/strong&gt;: proposes a test suite with behaviors and metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review&lt;/strong&gt;: waits for your approval before creating anything&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt;: builds entities on the platform following the approved plan&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute&lt;/strong&gt;: runs tests once you confirm&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze&lt;/strong&gt;: surfaces a pass/fail summary, failure patterns, and links back to results&lt;/li&gt;
&lt;/ol&gt;
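
&lt;p&gt;To make the two approval points concrete, here is a toy Python model of the loop. It is a sketch of the control flow only, not the skill's implementation: &lt;code&gt;run_loop&lt;/code&gt; and the &lt;code&gt;approve&lt;/code&gt; callback are hypothetical names, and gating exactly Create and Execute is an assumption read off the list above.&lt;/p&gt;

```python
# Step names come from the list above; the gating logic is an illustrative assumption.
STEPS = ["Discover", "Plan", "Review", "Create", "Execute", "Analyze"]

# Steps that must not start until the user has signed off.
NEEDS_APPROVAL = {"Create", "Execute"}


def run_loop(approve):
    """Walk the steps in order and stop at the first gated step the user declines.

    approve is a callable that takes a step name and returns True or False.
    Returns the list of steps that actually ran.
    """
    completed = []
    for step in STEPS:
        if step in NEEDS_APPROVAL and not approve(step):
            break  # nothing is created or executed without sign-off
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Approve the plan but decline the run: the loop halts before Execute.
    print(run_loop(lambda step: step == "Create"))
```

&lt;p&gt;Declining at the first gate leaves the loop at Review, which mirrors the idea that a proposed plan costs nothing until you approve it.&lt;/p&gt;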

&lt;p&gt;For ad-hoc operations:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"List my existing test sets."&lt;br&gt;
"Improve the Safety Compliance metric. Make the threshold stricter."&lt;br&gt;
"Compare my last two test runs."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The skill relies on the host agent's native confirmation prompts as its safety guard, so destructive actions never run without your sign-off.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>testing</category>
      <category>llmops</category>
    </item>
    <item>
      <title>Issues in conversational AI apps are so obvious right now that even John Oliver felt the need to dedicate a whole episode to the topic: https://www.youtube.com/watch?v=Ykvf3MunGf8

#AISafety #AIEvals #Testing #QA</title>
      <dc:creator>Nicolai Bohn</dc:creator>
      <pubDate>Tue, 28 Apr 2026 09:24:37 +0000</pubDate>
      <link>https://dev.to/nicolai_bohn_rhesis/issues-in-conversational-ai-apps-are-so-obvious-right-now-that-even-john-oliver-felt-the-need-to-3oj8</link>
      <guid>https://dev.to/nicolai_bohn_rhesis/issues-in-conversational-ai-apps-are-so-obvious-right-now-that-even-john-oliver-felt-the-need-to-3oj8</guid>
      <description>&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
      &lt;div class="c-embed__body flex items-center justify-between"&gt;
        &lt;a href="https://www.youtube.com/watch?v=Ykvf3MunGf8" rel="noopener noreferrer" class="c-link fw-bold flex items-center"&gt;
          &lt;span class="mr-2"&gt;youtube.com&lt;/span&gt;
          

        &lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
    </item>
  </channel>
</rss>
