<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chase Naughton</title>
    <description>The latest articles on DEV Community by Chase Naughton (@tensorops9014).</description>
    <link>https://dev.to/tensorops9014</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3784496%2F3bfa8a8d-2b94-4c23-8736-a38c552df360.jpg</url>
      <title>DEV Community: Chase Naughton</title>
      <link>https://dev.to/tensorops9014</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tensorops9014"/>
    <language>en</language>
    <item>
      <title>Why Defense-Specific LLM Testing is a Game-Changer for AI Safety</title>
      <dc:creator>Chase Naughton</dc:creator>
      <pubDate>Sun, 22 Feb 2026 04:03:16 +0000</pubDate>
      <link>https://dev.to/tensorops9014/why-defense-specific-llm-testing-is-a-game-changer-for-ai-safety-1h3f</link>
      <guid>https://dev.to/tensorops9014/why-defense-specific-llm-testing-is-a-game-changer-for-ai-safety-1h3f</guid>
      <description>&lt;p&gt;In an era where AI models are increasingly deployed in high-stakes environments, generic evaluation tools no longer cut it. That’s why Justin Norman’s new open-source framework, DoDHaluEval, is such a standout contribution—it zeroes in on a critical niche: defense-domain hallucinations in large language models (LLMs).&lt;/p&gt;

&lt;p&gt;What caught my eye immediately is the framework’s focus on &lt;strong&gt;context-aware hallucination testing&lt;/strong&gt;. Instead of using generic prompts or public-domain benchmarks, DoDHaluEval includes over 92 military-specific templates and identifies seven distinct hallucination patterns unique to defense knowledge. This approach recognizes that not all inaccuracies are equal: a misstatement about troop movements or equipment specs can have far more severe consequences than an error about a movie plot.&lt;/p&gt;

&lt;p&gt;Justin and his team didn’t just stop at domain-specific data. They implemented an &lt;strong&gt;ensemble detection system&lt;/strong&gt; combining HuggingFace HHEM, G-Eval, and SelfCheckGPT, offering multiple layers of validation. This multi-method approach is smart—it acknowledges that no single tool can catch every type of error, especially in nuanced, high-risk domains like defense.&lt;/p&gt;

&lt;p&gt;For developers and organizations working with LLMs in regulated or sensitive sectors, this framework is a blueprint for building safer, more reliable systems. It’s a reminder that effective AI safety isn’t just about scaling model size—it’s about tailoring evaluation to real-world contexts and consequences.&lt;/p&gt;

&lt;p&gt;If you're working on LLM trust and safety, whether in defense, healthcare, finance, or beyond, this project is worth a close look. Check out the full details and code on &lt;a href="https://github.com/stbiadmin/diu-halu" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Follow Justin's work: &lt;a href="https://bsky.app/profile/justintime.ai" rel="noopener noreferrer"&gt;Bluesky&lt;/a&gt; | &lt;a href="https://github.com/stbiadmin" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://www.linkedin.com/in/justindnorman/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; | &lt;a href="https://justintime.ai" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aisafety</category>
      <category>llmevaluation</category>
      <category>defense</category>
      <category>hallucinationdetection</category>
    </item>
    <item>
      <title>How to Rebuild Portfolio Projects Without Proprietary Code</title>
      <dc:creator>Chase Naughton</dc:creator>
      <pubDate>Sun, 22 Feb 2026 04:02:56 +0000</pubDate>
      <link>https://dev.to/tensorops9014/how-to-rebuild-portfolio-projects-without-proprietary-code-4kdn</link>
      <guid>https://dev.to/tensorops9014/how-to-rebuild-portfolio-projects-without-proprietary-code-4kdn</guid>
      <description>&lt;p&gt;In Justin Norman's latest post, he tackles a challenge familiar to many developers and data scientists: how to showcase past work when the original code is owned by a former employer. His solution? Build a simulation engine to recreate the problem space without proprietary data or IP, then reconstruct the solution using modern tools.&lt;/p&gt;

&lt;p&gt;This approach struck me as brilliant for two reasons. First, it respects intellectual property boundaries—a must in our industry. Second, it allows you to demonstrate not just what you built, but how you’d build it now with updated frameworks and techniques. For example, Justin reimplemented a freight forecasting system using GRUs and Prophet, and a security event clustering pipeline with K-means and LSA—both reflecting current best practices.&lt;/p&gt;

&lt;p&gt;This isn’t just about recreating code; it’s about showcasing adaptability, problem-solving, and technical growth. By rebuilding projects from the ground up, you prove you understand the fundamentals, not just the implementation details locked away in a corporate codebase.&lt;/p&gt;

&lt;p&gt;For anyone struggling to demonstrate real-world experience in interviews or portfolio reviews, Justin’s method offers a clear, ethical path forward. Dive into his full post to see how he generated synthetic data, trained models, and even built a production-style serving layer—all from scratch.&lt;/p&gt;

&lt;p&gt;Check out the original article and code here: &lt;a href="https://blog.justintime.ai/rebuilding-ml-projects-time-series/" rel="noopener noreferrer"&gt;Someone Else Owns My Best Code, So I Rewrote It&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow Justin Norman's work: &lt;a href="https://bsky.app/profile/justintime.ai" rel="noopener noreferrer"&gt;Bluesky&lt;/a&gt; | &lt;a href="https://github.com/stbiadmin" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; | &lt;a href="https://www.linkedin.com/in/justindnorman/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; | &lt;a href="https://justintime.ai" rel="noopener noreferrer"&gt;Blog&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>portfolio</category>
      <category>timeseries</category>
      <category>career</category>
    </item>
  </channel>
</rss>
