<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Samuel Pordeus</title>
    <description>The latest articles on DEV Community by Samuel Pordeus (@samuelpordeus).</description>
    <link>https://dev.to/samuelpordeus</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F205916%2F965aee55-7ae6-46e9-85a8-3e8c9a5b99ad.png</url>
      <title>DEV Community: Samuel Pordeus</title>
      <link>https://dev.to/samuelpordeus</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/samuelpordeus"/>
    <language>en</language>
    <item>
      <title>Testing LLMs with Elixir</title>
      <dc:creator>Samuel Pordeus</dc:creator>
      <pubDate>Wed, 19 Jun 2024 19:19:43 +0000</pubDate>
      <link>https://dev.to/samuelpordeus/testing-llm-output-with-elixir-1l71</link>
      <guid>https://dev.to/samuelpordeus/testing-llm-output-with-elixir-1l71</guid>
      <description>&lt;p&gt;&lt;em&gt;disclaimer: this post was entirely written by a human 🧬&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  Introduction
&lt;/h3&gt;

&lt;p&gt;I've been working on a Gen AI product for the past year and quickly realized that LLMs are wild beasts that require constant vigilance. As someone used to leveraging test coverage to prevent regressions, I find the LLM/AI world pretty frustrating, since model outputs are nondeterministic.&lt;/p&gt;

&lt;p&gt;When you deploy an LLM solution to production, you get an amorphous mass of statistical data that produces ever-changing outputs. And it can get even more chaotic, for various reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompts might need tweaking as you start getting more customers.&lt;/li&gt;
&lt;li&gt;a new and shiny model was released! ✨ but after mindlessly upgrading your model once before, you aren't confident it won't break something this time&lt;/li&gt;
&lt;li&gt;models are changed by vendors without sufficient notice, even when their documentation says they're stable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One way to mitigate these issues is to start evaluating the output from your LLM product before you have your first paying customer 💰&lt;/p&gt;

&lt;h3&gt;
  The Approach
&lt;/h3&gt;

&lt;p&gt;One of the most common testing approaches in Python is &lt;a href="https://cookbook.openai.com/examples/evaluation/getting_started_with_openai_evals"&gt;OpenAI Evals&lt;/a&gt;, but ideally we should integrate our tests closer to our Elixir code, where our business logic lives, so we can seamlessly test prompt changes.&lt;/p&gt;

&lt;p&gt;So let's do it inspired by Evals but with Elixir 🧪&lt;/p&gt;

&lt;p&gt;LLM outputs can either be &lt;strong&gt;structured&lt;/strong&gt;, following a well-defined schema with predictable fields:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Muse T-Shirt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"clothes"&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or &lt;strong&gt;unstructured&lt;/strong&gt;, with high variance, undefined format, and usually free text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Of course. 'Muse T-Shirt' belongs to the 'clothes' Category.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first type of test is quite easy to handle. You send a request, get a response, validate that the schema is correct, and you're done 👌 – OpenAI Evals handles this case with something called &lt;strong&gt;Basic Evals&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The unstructured type is where it gets tricky. Although you should try to use JSON schemas for everything (that might deserve a separate post), responses with significant variance can be unavoidable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product_availability"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Unfortunately I don't have a Muse T-Shirt available in stock now. Would you be interested in a BTS one?"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's where we use &lt;strong&gt;Model-graded Evals&lt;/strong&gt;: A two-step process where a model gives an output and we use &lt;strong&gt;another model&lt;/strong&gt; to validate it.&lt;/p&gt;

&lt;p&gt;Quite chaotic, right? But sometimes when dealing with this crazy AI world you need to fight chaos with chaos ☄️&lt;/p&gt;

&lt;h3&gt;
  Implementation
&lt;/h3&gt;

&lt;p&gt;I've been working with Elixir for the past 5 years, so I'm quite fond of the idea of the LLM tests looking like a regular &lt;code&gt;mix test&lt;/code&gt; suite run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mix test --only llm_test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So let's see what a test will look like before we dive into its internals:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;# test&lt;span class="sr"&gt;/llm_tests/&lt;/span&gt;llm_eval_demo_test&lt;span class="p"&gt;.&lt;/span&gt;exs

alias Test&lt;span class="p"&gt;.&lt;/span&gt;LLMEvaluation

@&lt;span class="k"&gt;tag&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;llm_test
test &lt;span class="s2"&gt;"robot returns a muse t-shirt"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  conversation &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
      %&lt;span class="p"&gt;{&lt;/span&gt;
        role&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;system&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        content&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"The best band t-shirt in the world is a Muse one!"&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      %&lt;span class="p"&gt;{&lt;/span&gt;
        role&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;user&lt;span class="p"&gt;,&lt;/span&gt;
        content&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"tell me a cool band t-shirt to buy, my man"&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

  assert LLMEvaluation&lt;span class="p"&gt;.&lt;/span&gt;basic_eval&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;)&lt;/span&gt;
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pretty Elixir-ish, right? So let's start configuring it.&lt;/p&gt;

&lt;h4&gt;
  Configuration
&lt;/h4&gt;

&lt;p&gt;Add &lt;code&gt;exclude: :llm_test&lt;/code&gt; to the &lt;code&gt;ExUnit.start/1&lt;/code&gt; call in &lt;code&gt;test/test_helper.exs&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;ExUnit&lt;span class="p"&gt;.&lt;/span&gt;start&lt;span class="p"&gt;(&lt;/span&gt;exclude&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;llm_test&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way, we leverage tags to keep the LLM tests, which make real API calls, out of the default run so they don't conflict with mocks &amp;amp; stubs.&lt;/p&gt;
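
&lt;p&gt;If a test file contains nothing but LLM evaluations, ExUnit also lets you tag the whole module once with &lt;code&gt;@moduletag&lt;/code&gt; instead of repeating &lt;code&gt;@tag&lt;/code&gt; on every test. A minimal sketch (the module name is just an example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;# test/llm_tests/llm_eval_demo_test.exs
defmodule LLMEvalDemoTest do
  use ExUnit.Case

  # Every test in this module gets the :llm_test tag, so a plain `mix test`
  # skips them and `mix test --only llm_test` runs only them.
  @moduletag :llm_test

  # ...LLM tests go here...
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;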

&lt;h4&gt;
  Implementing the Test.LLMEvaluation helper module: Basic Eval
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;basic_eval/1&lt;/code&gt; receives a conversation/chat as input, in whatever shape you've implemented it.&lt;/p&gt;

&lt;p&gt;After that, you send a request to your LLM chat completion API provider and parse the response content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;defmodule Test&lt;span class="p"&gt;.&lt;/span&gt;LLMEvaluation &lt;span class="k"&gt;do&lt;/span&gt;
  @moduledoc false

  @default_model &lt;span class="s2"&gt;"gpt-4"&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; basic_eval&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    params &lt;span class="p"&gt;=&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"model"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; fetch_model&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s2"&gt;"response_format"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"type"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"json_object"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

    Client&lt;span class="p"&gt;.&lt;/span&gt;request&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;,&lt;/span&gt; params&lt;span class="p"&gt;)&lt;/span&gt;
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's worth enforcing JSON output so we can beautifully pattern-match the decoded response content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;assert %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"category"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"clothes"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; LLMEvaluation&lt;span class="p"&gt;.&lt;/span&gt;basic_eval&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With OpenAI's Chat Completions API you can achieve that by using the &lt;code&gt;response_format&lt;/code&gt; param.&lt;/p&gt;
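
&lt;p&gt;The &lt;code&gt;Client&lt;/code&gt; module above is whatever HTTP wrapper you already have around your provider. For reference, here's a minimal sketch of what it could look like with &lt;a href="https://hexdocs.pm/req"&gt;Req&lt;/a&gt; and OpenAI's Chat Completions endpoint (the module name, env var, and Req usage are assumptions for illustration, not the post's actual client):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;defmodule Test.LLMClient do
  @moduledoc false

  @url "https://api.openai.com/v1/chat/completions"

  # Hypothetical stand-in for the `Client` used above: posts the conversation,
  # then decodes the JSON content of the first choice so tests can
  # pattern-match on it directly.
  def request(messages, params) do
    body = Map.merge(params, %{"messages" =&amp;gt; messages})

    resp =
      Req.post!(@url,
        json: body,
        auth: {:bearer, System.fetch_env!("OPENAI_API_KEY")}
      )

    %{"choices" =&amp;gt; [%{"message" =&amp;gt; %{"content" =&amp;gt; content}} | _]} = resp.body

    Jason.decode!(content)
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With &lt;code&gt;response_format&lt;/code&gt; set to &lt;code&gt;json_object&lt;/code&gt;, the assistant content is itself a JSON string, which is why the sketch decodes it before returning.&lt;/p&gt;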

&lt;h4&gt;
  Implementing the Test.LLMEvaluation helper module: Model-graded Eval
&lt;/h4&gt;

&lt;p&gt;For the model-graded eval, we include an intermediary step: an extra call to OpenAI that verifies whether the assertion holds.&lt;/p&gt;

&lt;p&gt;For that, we need to craft an &lt;em&gt;Assertion Prompt&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;You are an assertion agent that returns &lt;span class="s1"&gt;'true'&lt;/span&gt; &lt;span class="nb"&gt;or&lt;/span&gt; &lt;span class="s1"&gt;'false'&lt;/span&gt;
depending &lt;span class="k"&gt;on&lt;/span&gt; whether the Generated Message complies with the assertion&lt;span class="p"&gt;.&lt;/span&gt;

Generated Message&lt;span class="p"&gt;:&lt;/span&gt; #&lt;span class="p"&gt;{&lt;/span&gt;llm_output&lt;span class="p"&gt;}&lt;/span&gt;

Assert that&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'#{assertion}'&lt;/span&gt;

Return the following JSON format &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="k"&gt;a&lt;/span&gt; response&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c"&gt;"assertion": true,&lt;/span&gt;
  &lt;span class="c"&gt;"reason": "Explanation on why the assertion failed or not"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I bet you can write something better than that 🙂&lt;/p&gt;

&lt;p&gt;Next, we need to add a second Chat Completion call that uses this new Assertion Prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;defmodule Test&lt;span class="p"&gt;.&lt;/span&gt;LLMEvaluation &lt;span class="k"&gt;do&lt;/span&gt;
  @moduledoc false

  @default_model &lt;span class="s2"&gt;"gpt-4"&lt;/span&gt;
  @assertion_model &lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; model_graded_eval&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;,&lt;/span&gt; assertion&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    params &lt;span class="p"&gt;=&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"model"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; fetch_model&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s2"&gt;"response_format"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"type"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"json_object"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

    conversation
    &lt;span class="p"&gt;|&amp;gt;&lt;/span&gt; Client&lt;span class="p"&gt;.&lt;/span&gt;request&lt;span class="p"&gt;(&lt;/span&gt;params&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;|&amp;gt;&lt;/span&gt; assertion&lt;span class="p"&gt;(&lt;/span&gt;assertion&lt;span class="p"&gt;)&lt;/span&gt;
  end

  defp assertion&lt;span class="p"&gt;(&lt;/span&gt;llm_output&lt;span class="p"&gt;,&lt;/span&gt; assertion&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="nb"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;assertion_prompt&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

    &lt;span class="k"&gt;messages&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;%&lt;span class="p"&gt;{&lt;/span&gt;content&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; role&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;

    params &lt;span class="p"&gt;=&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"model"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; @assertion_model&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"response_format"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"type"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"json_object"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

    &lt;span class="k"&gt;messages&lt;/span&gt;
    &lt;span class="p"&gt;|&amp;gt;&lt;/span&gt; Client&lt;span class="p"&gt;.&lt;/span&gt;request&lt;span class="p"&gt;(&lt;/span&gt;params&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;|&amp;gt;&lt;/span&gt; Map&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"llm_output"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; llm_output&lt;span class="p"&gt;)&lt;/span&gt;
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's important to return the &lt;code&gt;llm_output&lt;/code&gt; so that, if the assertion fails, you can check what the first model produced.&lt;/p&gt;

&lt;p&gt;And the test looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight viml"&gt;&lt;code&gt;@&lt;span class="k"&gt;tag&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;llm_test
test &lt;span class="s2"&gt;"robot returns a muse t-shirt"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  conversation &lt;span class="p"&gt;=&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;
      %&lt;span class="p"&gt;{&lt;/span&gt;
        role&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nb"&gt;system&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        content&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"The best band t-shirt in the world is a Muse one!"&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      %&lt;span class="p"&gt;{&lt;/span&gt;
        role&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;user&lt;span class="p"&gt;,&lt;/span&gt;
        content&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"tell me a cool band t-shirt to buy, my man"&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

  assertion &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Assistant response is a Muse T-Shirt"&lt;/span&gt;

  assert %&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;"assertion"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; true&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; LLMEvaluation&lt;span class="p"&gt;.&lt;/span&gt;model_graded_eval&lt;span class="p"&gt;(&lt;/span&gt;conversation&lt;span class="p"&gt;,&lt;/span&gt; assertion&lt;span class="p"&gt;)&lt;/span&gt;
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pretty Elixir-ish too, right!? 🥹&lt;/p&gt;
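
&lt;p&gt;One optional variation on that last assert: check the fields explicitly so a failure message also surfaces the grading model's reason and the original output (assuming the map shape returned above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;result = LLMEvaluation.model_graded_eval(conversation, assertion)

# If the grading model disagrees, fail with its reasoning and the raw output.
assert result["assertion"] == true,
       "#{result["reason"]}\n\nllm_output: #{inspect(result["llm_output"])}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;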

&lt;p&gt;That's it, folks! Ideally, you can now run &lt;code&gt;mix test --only llm_test&lt;/code&gt; for every prompt or model change you make, to ensure your beloved customers don't experience hallucinations while speaking with your robots 🤖&lt;/p&gt;
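
&lt;p&gt;If typing the flag gets old, a Mix alias in &lt;code&gt;mix.exs&lt;/code&gt; can shorten it (purely optional, and the alias name is just a suggestion):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight elixir"&gt;&lt;code&gt;# mix.exs
def project do
  [
    # ...your existing project config...
    aliases: aliases()
  ]
end

# `mix test.llm` becomes shorthand for `mix test --only llm_test`
defp aliases do
  ["test.llm": ["test --only llm_test"]]
end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;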

&lt;p&gt;I'm planning to write more about using Elixir &amp;amp; LLMs in production. Hopefully with less code so you all don't get bored.&lt;/p&gt;

&lt;p&gt;Don't hesitate to send me a message through &lt;a href="https://www.linkedin.com/in/samuelpordeus/"&gt;LinkedIn&lt;/a&gt; or to my email: &lt;em&gt;&lt;a href="mailto:samuelspordeus@gmail.com"&gt;samuelspordeus@gmail.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>elixir</category>
      <category>ai</category>
      <category>testing</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
