<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: bob lee</title>
    <description>The latest articles on DEV Community by bob lee (@bob_lee_2d29b22cee96eb65f).</description>
    <link>https://dev.to/bob_lee_2d29b22cee96eb65f</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3976852%2F87e1f8a7-af0e-4d25-9b61-67341a2d9a5e.png</url>
      <title>DEV Community: bob lee</title>
      <link>https://dev.to/bob_lee_2d29b22cee96eb65f</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bob_lee_2d29b22cee96eb65f"/>
    <language>en</language>
    <item>
      <title>Building a Self-Verifying FTIR Agent with Qwen Function Calling</title>
      <dc:creator>bob lee</dc:creator>
      <pubDate>Fri, 26 Jun 2026 04:40:16 +0000</pubDate>
      <link>https://dev.to/bob_lee_2d29b22cee96eb65f/building-a-self-verifying-ftir-agent-with-qwen-function-calling-5he9</link>
      <guid>https://dev.to/bob_lee_2d29b22cee96eb65f/building-a-self-verifying-ftir-agent-with-qwen-function-calling-5he9</guid>
      <description>&lt;p&gt;&lt;em&gt;Built for Track 4: Autopilot Agent — #QwenCloudHackathon&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Most AI "agents" are API wrappers with a system prompt. Upload data, call one endpoint, return the result. No verification, no reasoning about what went wrong, no ability to self-correct.&lt;/p&gt;

&lt;p&gt;For the Qwen Cloud Hackathon, I built &lt;strong&gt;ChemSpectra Agent&lt;/strong&gt; — an FTIR spectral analysis system where Qwen-3.7-Max autonomously selects tools, cross-validates evidence across multiple results, and triggers self-verification when confidence is low. The key insight: an agent that checks its own work catches errors that single-pass analysis misses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Function Calling Changes Everything
&lt;/h3&gt;

&lt;p&gt;The agent has access to 5 analysis tools, each hitting a different endpoint of the FTIR.fun spectral library (130,000+ reference spectra):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;identify_material&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Match spectrum against reference library, return ranked candidates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;explain_peaks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Explain what chemical bond vibration each peak represents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;assign_functional_groups&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Map peaks to functional groups (C=O, O-H, N-H, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;match_library_topk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rapid top-K screening without deep analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;search_public_results&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Search publicly shared analysis cases (via MCP)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Instead of hardcoding which tools to call, I define these as Qwen Function Calling schemas and let the model decide:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;AGENT_TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;identify_material&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Match spectrum against 130,000+ reference spectra...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parameters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;properties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;top_k&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;integer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sample_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;# ... 4 more tools
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Generation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DASHSCOPE_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qwen3.7-max&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AGENT_TOOLS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Qwen decides which to call
&lt;/span&gt;    &lt;span class="n"&gt;result_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result: different questions trigger different tool combinations. "What is this material?" → &lt;code&gt;identify_material&lt;/code&gt; + &lt;code&gt;explain_peaks&lt;/code&gt;. "Deformulate this sample" → all three analytical tools. "Quick screening" → just &lt;code&gt;match_library_topk&lt;/code&gt;. The LLM decides, not the developer.&lt;/p&gt;

&lt;h3&gt;
  
  
  The ReAct Loop
&lt;/h3&gt;

&lt;p&gt;The agent runs a Think → Act → Observe loop, up to 6 iterations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Qwen receives the user request + tool schemas&lt;/li&gt;
&lt;li&gt;Qwen returns &lt;code&gt;tool_calls&lt;/code&gt; — which tools to invoke and with what parameters&lt;/li&gt;
&lt;li&gt;Agent executes the tools against FTIR.fun API&lt;/li&gt;
&lt;li&gt;Results are formatted and sent back to Qwen&lt;/li&gt;
&lt;li&gt;Qwen either calls more tools or produces a final synthesis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In practice, most analyses complete in 2-3 iterations. Qwen's &lt;code&gt;enable_thinking=True&lt;/code&gt; mode shows the full chain-of-thought reasoning, so you can see &lt;em&gt;why&lt;/em&gt; it chose each tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cross-Validation: Where It Gets Interesting
&lt;/h3&gt;

&lt;p&gt;After the ReAct loop, the agent doesn't just return results. It runs two automated checks:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confidence estimation&lt;/strong&gt; — calculated from match scores, candidate score gaps, and functional group coverage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_estimate_confidence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;id_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;identify_material&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;id_result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;top_sim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;similarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;top_sim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;gap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;top_sim&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;id_result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;matches&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;similarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gap&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# larger gap = more confident
&lt;/span&gt;    &lt;span class="c1"&gt;# ... more signals from other tools
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Evidence conflict detection&lt;/strong&gt; — compares outputs across tools. If &lt;code&gt;identify_material&lt;/code&gt; says "PET" but &lt;code&gt;assign_functional_groups&lt;/code&gt; found no ester groups, that's a contradiction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;expected_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ester&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c=o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aromatic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nylon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n-h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c=o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;polyethylene&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c-h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ch2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;methylene&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;silicone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;si-o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;si-c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;siloxane&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# If 2+ expected groups are missing → conflict
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Self-Verification Round
&lt;/h3&gt;

&lt;p&gt;When confidence &amp;lt; 0.75 or conflicts are detected, the agent automatically triggers a verification round. Qwen is told exactly what went wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ISSUES DETECTED:
- functional_group_mismatch: material="pet", missing=["ester", "aromatic"]
- low_confidence: 0.62 (threshold: 0.75)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Qwen then autonomously calls additional tools to investigate. After verification, confidence is recalculated. In testing, I've seen confidence traces like &lt;code&gt;[0.62, 0.84]&lt;/code&gt; — a 35% improvement from one verification round.&lt;/p&gt;

&lt;h3&gt;
  
  
  Self-Repair
&lt;/h3&gt;

&lt;p&gt;When Qwen's structured JSON output fails to parse (it happens — LLMs sometimes wrap JSON in markdown code blocks), the error and original output are sent back to Qwen with context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;repair_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Parse error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;!r}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Return ONLY valid JSON.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;raw_retry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_call_qwen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repair_messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Near-100% recovery rate. No silent failures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters for Real Applications
&lt;/h3&gt;

&lt;p&gt;In regulated industries — pharmaceutical QC under FDA 21 CFR Part 11, forensic substance identification, environmental contaminant detection — an AI that returns wrong results without flagging uncertainty is dangerous. ChemSpectra Agent's self-verification turns "AI that gives answers" into "AI that checks its work." The confidence trace provides an audit trail that fits existing compliance frameworks.&lt;/p&gt;

&lt;p&gt;All LLM reasoning — tool selection, synthesis, verification, self-repair, follow-up chat, report generation — runs through Alibaba Cloud's &lt;code&gt;dashscope&lt;/code&gt; SDK with &lt;code&gt;qwen3.7-max&lt;/code&gt;. Six distinct call sites, one provider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it&lt;/strong&gt;: &lt;a href="https://github.com/jxbaoxiaodong/chemspectra-agent" rel="noopener noreferrer"&gt;github.com/jxbaoxiaodong/chemspectra-agent&lt;/a&gt;&lt;/p&gt;

</description>
      <category>qwencloudhackathon</category>
      <category>qwen</category>
      <category>alibabacloud</category>
      <category>ftir</category>
    </item>
    <item>
      <title>How I Built a Self-Verifying AI Agent with DynamoDB and ReAct Reasoning</title>
      <dc:creator>bob lee</dc:creator>
      <pubDate>Fri, 26 Jun 2026 04:34:25 +0000</pubDate>
      <link>https://dev.to/bob_lee_2d29b22cee96eb65f/how-i-built-a-self-verifying-ai-agent-with-dynamodb-and-react-reasoning-6dp</link>
      <guid>https://dev.to/bob_lee_2d29b22cee96eb65f/how-i-built-a-self-verifying-ai-agent-with-dynamodb-and-react-reasoning-6dp</guid>
      <description>&lt;p&gt;&lt;em&gt;Built for the #H0Hackathon — Hack the Zero Stack with Vercel v0 and AWS Databases&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Most AI pipelines follow a fixed script: input in, output out, nobody checks the work. For the H0 hackathon (Track 2: Monetizable B2B App), I built &lt;strong&gt;ChemSpectra Agent&lt;/strong&gt; — an FTIR spectral analysis system where the AI verifies its own conclusions and self-corrects when evidence conflicts.&lt;/p&gt;

&lt;h3&gt;
  
  
  The ReAct Loop
&lt;/h3&gt;

&lt;p&gt;Instead of hardcoding which tools to call, the agent uses a ReAct loop with Qwen-3.7-Max function calling. The LLM autonomously selects from 5 tools — &lt;code&gt;identify_material&lt;/code&gt; (130K+ reference spectra), &lt;code&gt;explain_peaks&lt;/code&gt;, &lt;code&gt;assign_functional_groups&lt;/code&gt;, &lt;code&gt;match_library_topk&lt;/code&gt;, and &lt;code&gt;search_public_results&lt;/code&gt;. A material ID request might trigger two tools; a deformulation request triggers all three analytical tools. The LLM decides, not the developer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cross-Validation and Self-Verification
&lt;/h3&gt;

&lt;p&gt;After tools return results, &lt;code&gt;_detect_evidence_conflicts()&lt;/code&gt; compares outputs. If &lt;code&gt;identify_material&lt;/code&gt; says "PET" but &lt;code&gt;assign_functional_groups&lt;/code&gt; found no ester groups, that's a contradiction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;expected_groups&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ester&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c=o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aromatic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nylon&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;amide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n-h&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c=o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent estimates confidence from match scores, candidate score gaps, and functional group coverage. Below 0.75 confidence or any conflicts, a verification round fires automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;needs_verification&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.75&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conflicts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent gets told exactly what went wrong and calls additional tools to investigate. Post-verification confidence is logged, creating traces like &lt;code&gt;[0.62, 0.84]&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  DynamoDB: Beyond Key-Value Storage
&lt;/h3&gt;

&lt;p&gt;Every session persists to DynamoDB with 30-day TTL — tool call logs, confidence traces, synthesis, final report. But we went deeper than basic CRUD:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Two GSIs&lt;/strong&gt; — &lt;code&gt;gsi-created&lt;/code&gt; (partition: &lt;code&gt;ALL&lt;/code&gt;, sort: &lt;code&gt;created_at&lt;/code&gt;) replaces full-table scan with efficient time-ordered query; &lt;code&gt;gsi-material&lt;/code&gt; (partition: &lt;code&gt;top_match&lt;/code&gt;, sort: &lt;code&gt;created_at&lt;/code&gt;) enables "show me all PET analyses" aggregation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic counters&lt;/strong&gt; — a separate &lt;code&gt;chemspectra-stats&lt;/code&gt; table tracks &lt;code&gt;total_analyses&lt;/code&gt; and &lt;code&gt;total_tools_called&lt;/code&gt; via DynamoDB &lt;code&gt;ADD&lt;/code&gt; operations, safe under concurrent requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conditional writes&lt;/strong&gt; — confirmed sessions use &lt;code&gt;attribute_not_exists(session_id) OR step &amp;lt;&amp;gt; :confirmed&lt;/code&gt; to prevent concurrent overwrites of finalized reports&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regulated industries (pharma, forensics) require this audit trail. DynamoDB fits because the primary access is single-item by &lt;code&gt;session_id&lt;/code&gt;, the GSIs cover the two secondary patterns, and TTL handles cleanup automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;p&gt;The loop runs 2-4 iterations in under 30 seconds. Self-repair for malformed LLM JSON has near-100% recovery. This turns "AI that gives answers" into "AI that checks its work" — essential when reports go into regulatory filings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it&lt;/strong&gt;: &lt;a href="https://chemspectra-agent-h0.vercel.app" rel="noopener noreferrer"&gt;chemspectra-agent-h0.vercel.app&lt;/a&gt; | &lt;strong&gt;Code&lt;/strong&gt;: &lt;a href="https://github.com/jxbaoxiaodong/chemspectra-agent-h0" rel="noopener noreferrer"&gt;github.com/jxbaoxiaodong/chemspectra-agent-h0&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This article was written as part of my participation in the H0 AWS+Vercel Hackathon.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>h0hackathon</category>
      <category>aws</category>
      <category>dynamodb</category>
      <category>vercel</category>
    </item>
    <item>
      <title>How I Built an FTIR Analysis Platform with Claude (and What I Learned About AI-Assisted Development)</title>
      <dc:creator>bob lee</dc:creator>
      <pubDate>Wed, 10 Jun 2026 03:32:31 +0000</pubDate>
      <link>https://dev.to/bob_lee_2d29b22cee96eb65f/how-i-built-an-ftir-analysis-platform-with-claude-and-what-i-learned-about-ai-assisted-development-1f0c</link>
      <guid>https://dev.to/bob_lee_2d29b22cee96eb65f/how-i-built-an-ftir-analysis-platform-with-claude-and-what-i-learned-about-ai-assisted-development-1f0c</guid>
      <description>&lt;h1&gt;
  
  
  DEV.to Article: How I Built an FTIR Analysis Platform with Claude
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;Title:&lt;/strong&gt; How I Built an FTIR Analysis Platform with Claude (and What I Learned About AI-Assisted Development)&lt;br&gt;
&lt;strong&gt;Tags:&lt;/strong&gt; python, chemistry, opensource, ai&lt;br&gt;
&lt;strong&gt;Published:&lt;/strong&gt; true (can publish immediately on DEV)&lt;/p&gt;


&lt;h2&gt;
  
  
  The Backstory
&lt;/h2&gt;

&lt;p&gt;I'm a materials science graduate, not a software developer. I know FTIR spectroscopy — identifying polymers, interpreting functional group peaks, matching unknown samples against reference libraries. But when I needed to search FTIR spectra programmatically, I hit a wall: the existing tools were either expensive enterprise packages or Excel macros from the early 2000s.&lt;/p&gt;

&lt;p&gt;So I decided to build my own. And I used Claude (Anthropic's AI assistant) as my coding partner.&lt;/p&gt;

&lt;p&gt;This is the story of how a domain expert with basic Python skills built a production FTIR search platform — 135,000 spectra, MCP server, API, community features — with AI writing about 70% of the code.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 1: The Core Algorithm
&lt;/h2&gt;

&lt;p&gt;FTIR spectrum matching sounds complex, but the core is simple geometry: given a set of peak positions from an unknown sample, find the library spectra with the most matching peaks within a tolerance window (typically ±5 to ±15 cm⁻¹).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Claude helped with:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing the initial peak-matching loop&lt;/li&gt;
&lt;li&gt;Setting up the Django project structure&lt;/li&gt;
&lt;li&gt;Designing the database schema for the spectral library&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What I handled:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understanding which tolerance values actually work (different wavenumber regions need different tolerances)&lt;/li&gt;
&lt;li&gt;Validating match results against known materials&lt;/li&gt;
&lt;li&gt;Rejecting the first three algorithm designs that looked correct on paper but failed on real data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; AI can write the code faster than you can, but it can't tell you if the chemistry is right. Domain expertise is the bottleneck, not code.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 2: Parsing FTIR Instrument Files
&lt;/h2&gt;

&lt;p&gt;This was the hardest technical challenge. FTIR instruments output data in at least 6 different formats:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Origin&lt;/th&gt;
&lt;th&gt;Difficulty&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SPA&lt;/td&gt;
&lt;td&gt;Thermo Nicolet&lt;/td&gt;
&lt;td&gt;Medium — binary, proprietary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPC&lt;/td&gt;
&lt;td&gt;GRAMS&lt;/td&gt;
&lt;td&gt;Medium — documented but complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OPUS&lt;/td&gt;
&lt;td&gt;Bruker&lt;/td&gt;
&lt;td&gt;High — completely proprietary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSV&lt;/td&gt;
&lt;td&gt;Universal&lt;/td&gt;
&lt;td&gt;Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JDX&lt;/td&gt;
&lt;td&gt;JCAMP-DX&lt;/td&gt;
&lt;td&gt;Medium — standard but varied implementations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XLSX&lt;/td&gt;
&lt;td&gt;Labs&lt;/td&gt;
&lt;td&gt;Easy — but infinite variations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;What Claude helped with:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing binary file parsers from format documentation&lt;/li&gt;
&lt;li&gt;Extracting peak tables from raw instrument data&lt;/li&gt;
&lt;li&gt;Handling edge cases (missing metadata, non-standard headers)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What I handled:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Testing with real instrument files from my university lab&lt;/li&gt;
&lt;li&gt;Identifying which format variants actually appear in practice&lt;/li&gt;
&lt;li&gt;Setting up error handling for unparseable files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Claude is surprisingly good at binary file parsing. I pasted format specs from Thermo and Bruker documentation, and it generated working parsers. But I caught three subtle byte-offset errors that would have silently corrupted data.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 3: The MCP Server
&lt;/h2&gt;

&lt;p&gt;MCP (Model Context Protocol) lets AI agents call your tool directly. Instead of a human typing peak values into a web form, an AI agent can send structured requests and receive structured results.&lt;/p&gt;

&lt;p&gt;The MCP server, at &lt;code&gt;fastapi_server/mcp_server.py&lt;/code&gt;, exposes one main tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;analyze_ftir_spectrum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;peaks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Accept either an instrument file or a peak list. Returns ranked matches with similarity scores.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Claude generated:&lt;/strong&gt; ~90% of the MCP server code, including the Pydantic output schemas, error handling, and feature documentation.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: What Broke in Production
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem 1: Memory&lt;/strong&gt;&lt;br&gt;
Loading the entire 135K-spectrum library into memory on every request was fine locally. On a 2GB VPS with other services running, it caused OOM kills within hours.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt; Added Redis caching for frequent searches, lazy loading for the library, and a batch query size limit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Problem 2: Cloudflare timeouts&lt;/strong&gt;&lt;br&gt;
The MCP streamable-http transport needs persistent connections. Cloudflare's default 100-second timeout killed long searches.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt; Server-sent events for progress reporting, and Cloudflare timeout tuning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Problem 3: Hallucination-like false positives&lt;/strong&gt;&lt;br&gt;
The matching algorithm returned chemically impossible candidates for very short peak lists (2-3 peaks).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt; Added a minimum peak count threshold and a confidence penalty for low-peak queries.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Result
&lt;/h2&gt;

&lt;p&gt;FTIR.fun is now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Live at&lt;/strong&gt; &lt;a href="https://ftir.fun" rel="noopener noreferrer"&gt;https://ftir.fun&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP endpoint:&lt;/strong&gt; &lt;code&gt;https://ftir.fun/mcp&lt;/code&gt; — connect from Claude, Cursor, Copilot, or any MCP client&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAPI spec:&lt;/strong&gt; &lt;code&gt;https://ftir.fun/openapi.platform.yaml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/jxbaoxiaodong/ftirfun-mcp" rel="noopener noreferrer"&gt;github.com/jxbaoxiaodong/ftirfun-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~135,000 spectra&lt;/strong&gt; indexed and searchable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~70% of the code&lt;/strong&gt; co-written with Claude&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;~30% of the code&lt;/strong&gt; rewritten after Claude's version failed in production&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I'd Tell Other Domain Experts Considering AI-Assisted Development
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Start with the messy data, not the shiny framework.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I spent two weeks getting Claude to generate a perfect Docker Compose setup. Then I spent two months wrangling real FTIR instrument files. The infrastructure was the easy part — the data was the hard part.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. AI will write code that looks right but is wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude produced beautiful peak-matching code that passed unit tests and failed on real spectra. The peak positions "matched" mathematically but violated basic FTIR chemistry. You need domain knowledge to catch this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Production is where the AI-generated code breaks first.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The code that looks clean in a notebook dies first under real load, real data variety, and real timeout limits. Be ready to rewrite the hot paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. But the framework code is perfect for AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Settings, schemas, API routing, test scaffolding, README files, deployment scripts — Claude wrote these flawlessly. Let AI handle the glue while you focus on the domain logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Confidence calibration (how reliable is a 0.85 similarity score?)&lt;/li&gt;
&lt;li&gt;Expanded file format support&lt;/li&gt;
&lt;li&gt;Public API with usage tiers&lt;/li&gt;
&lt;li&gt;More MCP tools for agent workflows&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;FTIR.fun is an open-spectral-search project by a materials scientist who learned Python by building it. Questions, feedback, or FTIR datasets to contribute? &lt;a href="mailto:ftir.fun@outlook.com"&gt;ftir.fun@outlook.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>chemistry</category>
      <category>opensource</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
