<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kaylynn</title>
    <description>The latest articles on DEV Community by Kaylynn (@kaylynnw).</description>
    <link>https://dev.to/kaylynnw</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1198561%2Fce0b2b4f-105d-47c0-a17f-b054ac4ae65b.png</url>
      <title>DEV Community: Kaylynn</title>
      <link>https://dev.to/kaylynnw</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kaylynnw"/>
    <language>en</language>
    <item>
      <title>Debugging Your RAG Application: A LangChain, Python, and OpenAI Tutorial</title>
      <dc:creator>Kaylynn</dc:creator>
      <pubDate>Thu, 11 Jan 2024 19:41:24 +0000</pubDate>
      <link>https://dev.to/focused_dot_io/debugging-your-rag-application-a-langchain-python-and-openai-tutorial-3a0g</link>
      <guid>https://dev.to/focused_dot_io/debugging-your-rag-application-a-langchain-python-and-openai-tutorial-3a0g</guid>
      <description>&lt;p&gt;Let’s explore a real-world example of debugging a RAG-type application. I recently undertook this process while updating our company knowledge base – a resource for potential clients and employees to learn about us.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack:
&lt;/h2&gt;

&lt;p&gt;I work with Python and the LangChain framework, specifically using LangChain Expression Language (LCEL) to build chains. You can find the LangChain LCEL documentation &lt;a href="https://python.langchain.com/docs/expression_language/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This approach services as a good alternative to LangChain’s debugging tool, &lt;a href="https://www.langchain.com/langsmith" rel="noopener noreferrer"&gt;LangSmith&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Load memory
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_session_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ConversationBufferMemory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConversationBufferMemory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;return_messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_loaded_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;get_session_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;load_memory_variables&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_memory_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;RunnableLambda&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_get_loaded_memory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Create Question
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_question_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                                   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;get_buffer_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
                               &lt;span class="p"&gt;}&lt;/span&gt;
                               &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;CONDENSE_QUESTION_PROMPT&lt;/span&gt;
                               &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
                               &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Retrieve Documents
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Answer
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_answer_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;final_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;combine_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;DEFAULT_DOCUMENT_PROMPT&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;answer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;final_inputs&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;ANSWER_PROMPT&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Final Chain looks like this
&lt;/span&gt;&lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_memory_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nf"&gt;create_question_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nf"&gt;create_answer_chain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While debugging, I prefer using a cheaper model like gpt-3.5-turbo for its cost-effectiveness. The less advanced models are more than adequate for basic testing. For final testing and deployment to production, you might consider upgrading to gpt-4-turbo or a similar advanced model.&lt;/p&gt;

&lt;p&gt;I also favor Jupyter notebooks for much of my debugging. This way, I can include the notebook in a .gitignore file, reducing cleanup from debugging shenanigans in my main code. I can also run very specific pieces of my code without plumbing overhead. &lt;/p&gt;

&lt;h2&gt;
  
  
  Initial Observations
&lt;/h2&gt;

&lt;p&gt;I noticed that basic queries received correct answers, but any follow-up question would lack the appropriate context, indicating that conversational memory was no longer functioning effectively.&lt;/p&gt;

&lt;p&gt;Here's what I observed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Question: What are the Focused Labs core values?
&lt;span class="gt"&gt;&amp;gt; AI: The core values of Focused Labs are Love Your Craft, Listen First, and Learn Why ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Sources: ...&lt;/span&gt;

Question: Tell me more about the first one.
&lt;span class="gt"&gt;&amp;gt; AI: Based on the given context, the first one is about the importance of the "Red" step in Test Driven Development (TDD). ❌&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Sources: ...&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, I expected responses more in line with explanations like "Love Your craft is when you are passionate about what you do."&lt;/p&gt;

&lt;p&gt;For more context, this issue with conversational memory arose while I was implementing a new feature: allowing end users to customize responses based on their role. So, for example, a developer could receive a highly technical answer while a marketing manager would see more high-level details. &lt;/p&gt;

&lt;h2&gt;
  
  
  Debugging Steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Ensure Role Feature Integrity
&lt;/h3&gt;

&lt;p&gt;To avoid impacting the newly implemented role feature, I made it overly obvious and active in every response during this debugging session by temporarily updating my system prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Answer the question from the perspective of a {role}.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;DEBUGGING_SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Answer the question in a {role} accent.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's how the AI responded, clearly adhering to my updated prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Question: What are the Focused Labs core values?
Role: pirate
&lt;span class="gt"&gt;&amp;gt; AI: Arr, the core values of Focused Labs be Love Your Craft, Listen First, and Learn Why, matey! ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Sources: ...&lt;/span&gt;

Question: Tell me more about the first one.
&lt;span class="gt"&gt;&amp;gt; AI: Arr, the first one be talkin' about the importance of reachin' the "Red" stage in Test Driven Development... ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Sources: ...&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Creating a Visual Representation
&lt;/h3&gt;

&lt;p&gt;I created a diagram of the app to visualize the process flow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiabxvee4m6y1co7fmy1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiabxvee4m6y1co7fmy1b.png" alt="RAG App Diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I began at the end of my flow and worked backward to identify issues. I first checked whether my LLM was answering questions based on the provided context. Upon inspecting the sources, I realized that the given context was a blog on TDD.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gt"&gt;&amp;gt; Sources: [{'URL': 'https://focusedlabs.io/blog/tdd-first-step-think'}, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thus, I ruled out the answer component as the source of the bug.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Tracing the Bug's Origin
&lt;/h3&gt;

&lt;p&gt;Next, I examined the logic for retrieving documents. I added a 'standalone question' key to every input and output chain to log runtime values, which revealed that questions were being incorrectly rephrased.&lt;/p&gt;

&lt;p&gt;💡&lt;em&gt;Adding these keys to the chains allows us to log the values seen by the components at runtime. Using breakpoints will only show the code when it’s instantiated and not populated with real-time values.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Code Snippet with added keys
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_answer_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;final_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added 
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I expected the &lt;code&gt;standalone_question&lt;/code&gt; to be more specific, like “What can you tell me about the core value of Love your Craft?”&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Question: What are the Focused Labs core values?
&lt;span class="gt"&gt;&amp;gt; standalone_question: What are the core values of Focused Labs? ✅&lt;/span&gt;

Question: Tell me more about the first one.
&lt;span class="gt"&gt;&amp;gt; standalone_question: What can you tell me about the first one? ❌&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Identifying the Exact Source
&lt;/h3&gt;

&lt;p&gt;I focused on the &lt;code&gt;chat_history&lt;/code&gt; variable, suspecting an issue with how the chat history was being recognized.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;retrieve_documents_chain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_answer_chain&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;final_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added 
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
                &lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standalone_question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;itemgetter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chat_history&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Question: What are the Focused Labs core values?

Question: Tell me more about the first one.
&lt;span class="gt"&gt;&amp;gt; chat_history: [] ❌&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔔 Found the issue! Since the &lt;code&gt;chat_history&lt;/code&gt; was blank, it wasn’t being loaded as I had assumed.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Implementing the Solution
&lt;/h3&gt;

&lt;p&gt;I resolved the issue by checking my conversation memory store. As a &lt;code&gt;dict&lt;/code&gt;, the conversation memory store was sensitive to the type of saved messages. I saved the messages with a &lt;code&gt;str&lt;/code&gt; converted version of &lt;code&gt;session_id&lt;/code&gt;. But, I invoked with an &lt;code&gt;Optional[UUID]&lt;/code&gt; version. So, while the conversation memory store itself was set up correctly, I needed to update how I invoked my chain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Therefore, I updated the &lt;code&gt;session_id&lt;/code&gt; type to &lt;code&gt;str&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; 
&lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Confirming the Fix
&lt;/h3&gt;

&lt;p&gt;I confirmed that the conversation memory now functioned correctly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Question: What are the Focused Labs core values?

Question: Tell me more about the first one.
&lt;span class="gt"&gt;&amp;gt; chat_history: ['What are the Focused Labs core values?'] ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; standalone_question: Can you provide more information about the first core value: Love Your Craft? ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; AI: This value means that we are passionate about being the best at what we do, paying attention to every detail... ✅&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; Sources: [{'URL': 'https://www.notion.so/Who-are-we-c42efb179fa64f6bb7866deb363fb7ef'}, ...] ✅&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Final Cleanup and Future-Proofing
&lt;/h3&gt;

&lt;p&gt;I reverted back from the temporary pirate accent debug feature used for easy identification of the role feature. &lt;/p&gt;

&lt;p&gt;I decided to maintain detailed logging within the system for future debugging efforts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Debugging AI Systems:&lt;/strong&gt; A mix of traditional and AI-specific debugging techniques is essential.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Opting for Cost-Effective Models:&lt;/strong&gt; Use more affordable models to reduce costs during repeated queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Importance of Transparency:&lt;/strong&gt; Clear visibility into each step and component of your RAG accelerates debugging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type Consistency:&lt;/strong&gt; Paying attention to small details, like variable types, can significantly impact functionality.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Thanks for reading!
&lt;/h3&gt;

&lt;p&gt;Stay tuned for more insights into the world of software engineering and AI. Have questions or insights? Feel free to share them in the comments below!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>testing</category>
      <category>openai</category>
    </item>
    <item>
      <title>Open AI Assistants: Limited, but Incredible</title>
      <dc:creator>Kaylynn</dc:creator>
      <pubDate>Fri, 22 Dec 2023 17:58:47 +0000</pubDate>
      <link>https://dev.to/focused_dot_io/open-ai-assistants-limited-but-incredible-512c</link>
      <guid>https://dev.to/focused_dot_io/open-ai-assistants-limited-but-incredible-512c</guid>
      <description>&lt;p&gt;In the ever-evolving realm of AI development, OpenAI has ventured into the domain of Retrieval Augmented Generation (RAG) with its latest offering, &lt;a href="https://platform.openai.com/docs/assistants/overview" rel="noopener noreferrer"&gt;Assistants&lt;/a&gt;. Assistants with Retrieval represents OpenAI's attempt to harness the power of RAG — a technique widely embraced and refined within open-source libraries to augment an AI’s knowledge with custom information. Think ChatGPT, but with the ability to ask it about custom information. While it doesn't yet lead the pack in creating AI-driven knowledge bases, its inception marks a significant step towards more sophisticated and nuanced AI applications. As developers, our dive into this beta tool is not just about evaluating its current capabilities, but also understanding its place in the broader context of RAG's evolution and its potential to reshape our approaches to LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Incredible Accuracy for Incredible Ease of Use&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;OpenAI's RAG tool is an awesome prospect for developers eager to explore the nuances of retrieval-augmented models. It serves as a canvas for experimentation, offering a glimpse into the future of sophisticated AI applications. Here's why it's a valuable experimental platform:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ease of Use&lt;/strong&gt;: Its user-friendly nature invites developers to explore and experiment with minimal setup. This allows more people to try out more use cases to learn where LLM retrieval could improve their workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Respectable Accuracy&lt;/strong&gt;: We swapped out our gpt-3.5-turbo model in our custom chatbot with an OpenAI Assistant, and saw a slightly-lower, but similar level of accuracy. This is so cool! Being able to spin up a custom chatbot in a couple of hours with ~75% accuracy rate is incredible! For more information about our custom chatbot, visit our website: focusedlabs.io/ai.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Understanding the Limitations: A Beta Analysis&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In its beta stage, this tool has its limitations. The current landscape of RAG in open-source libraries such as &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;Langchain&lt;/a&gt; sets a high benchmark, and OpenAI's iteration is an ambitious stride into this territory. However, with ambition comes the teething problems of any beta technology. Here's what developers should be aware of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Source Citation Feature&lt;/strong&gt;: A vital feature for gaining user trust, source citation, is not yet functional, indicating future improvements but also a present gap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Limitations&lt;/strong&gt;: An assistant supports only 20 documents, each up to 512 MB, which is not scalable for most enterprise datasets. For smaller data sets, while the file size itself is not limiting, combining multiple smaller files into 1 larger file is an anti-pattern resulting in loss of structure and context. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of Customization&lt;/strong&gt;: While the simple interface facilitates quick setup times, high abstraction levels mean less control for developers to adjust and optimize the tool for specific use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Polling Over Streaming&lt;/strong&gt;: Without streaming capabilities, developers are left with polling methods, which hamper real-time efficiency.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Tips for Maximizing Your Experience&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Navigating a beta tool requires a mix of patience and strategy. To make the most of your journey with OpenAI's RAG tool, keep these tips in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Management&lt;/strong&gt;: Efficiently handle your limited document space by converting to and concatenating data wherever possible. We recommend using *.txt formats for ease and compression.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Type Performance&lt;/strong&gt;: Experiment with different file types to discover which yields the best results for your specific case. For us, *.txt returned more accurate results over PDFs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use with Other Libraries:&lt;/strong&gt;Langchain supports Open AI Assistants. While Assistants may not be the leader on their own, they are still powerful when combined with other techniques. I also recommend using &lt;a href="https://llamahub.ai/?tab=loaders" rel="noopener noreferrer"&gt;LlamaHub’s loaders&lt;/a&gt; to help integrate with various data sources.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;OpenAI's entry into the RAG space is a testament to the ongoing evolution of AI and machine learning technologies. While this particular tool may not be ready for production, it offers a valuable learning curve for developers keen on the future of retrieval-augmented models. By engaging with it critically and creatively, we can contribute to its growth and simultaneously expand our own understanding of where RAG can take us in the realm of AI development.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>openai</category>
    </item>
    <item>
      <title>Enhancing AI Apps with Streaming: Practical Tips for Smoother AI Generation</title>
      <dc:creator>Kaylynn</dc:creator>
      <pubDate>Thu, 14 Dec 2023 20:01:13 +0000</pubDate>
      <link>https://dev.to/focused_dot_io/enhancing-ai-apps-with-streaming-practical-tips-for-smoother-ai-generation-2n59</link>
      <guid>https://dev.to/focused_dot_io/enhancing-ai-apps-with-streaming-practical-tips-for-smoother-ai-generation-2n59</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Interested in building an AI-powered app using generative models? You're on the right track! The realm of generative AI is brimming with untapped potential. A popular approach is to prompt the AI to generate a list of ideas based on provided context, allowing users to select their preferred option and then refine and expand on the content.&lt;/p&gt;

&lt;p&gt;An example of a generative AI app is one my team and I created for copywriters to seamlessly integrate storytelling into marketing emails. This user-friendly wizard elevates email creation, combining the power of generative AI with the user’s own expertise and creativity for impactful results. Our development journey led to the integration of streaming technology, significantly reducing AI response times. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frs2ab157yhnqld3bquj4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frs2ab157yhnqld3bquj4.png" alt="The image shows a person with headphones using a laptop, surrounded by a colorful, tech-themed fantasy setting with neon trees and floating digital icons." width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I'll share 2 essential tips to enhance user experience in AI-driven apps through effective use of streaming. Let's dive in!&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Backend: Node.js Typescript with Express&lt;/li&gt;
&lt;li&gt;Frontend: React Typescript&lt;/li&gt;
&lt;li&gt;AI Integration: &lt;a href="https://github.com/openai/openai-node" rel="noopener noreferrer"&gt;OpenAI’s Node SDK&lt;/a&gt; and GPT-4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s start with the basics. &lt;/p&gt;

&lt;p&gt;First, send requests to OpenAI leveraging their Node.js SDK. Due to our prompt, the AI response is a list with bullet numbers like “1.”. We use the bullet number format to parse the separate options.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPEN_AI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chatModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;createIdeas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;occasion&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;completion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`I want to host an event for the following occasion: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Write me a list of 4 separate ideas for this event`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;choiceElementElement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;content&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;ideas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;parseIdeas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;choiceElementElement&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Use the bullet number format (ex: "1.") to split the ideas into individual elelements in an array&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parseIdeas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;[&lt;/span&gt;&lt;span class="sr"&gt;0-9&lt;/span&gt;&lt;span class="se"&gt;]&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="sr"&gt;/gm&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messageSliced&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;messageSliced&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, let’s return the parsed ideas to the frontend. This is example method leverages Express.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/generate-ideas&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;occasion&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;generatedIdeas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createIdeas&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;occasion&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;generatedIdeas&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces a nicely formatted json response to the frontend that is very easy to pass into the appropriate UI components.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ideas"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;" Example first idea &lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Challenge: Latency
&lt;/h3&gt;

&lt;p&gt;The hiccup? A waiting time of up to 30 seconds before users can view the AI’s suggestions. Watching a loading icon spin for half a minute is not a good user experience. &lt;/p&gt;

&lt;h2&gt;
  
  
  Tip #1: Leverage Streaming
&lt;/h2&gt;

&lt;p&gt;Enter OpenAI’s “streaming” feature - a savior for reducing latency. By setting OpenAI’s Node SDK input parameter &lt;code&gt;**stream**&lt;/code&gt; to &lt;strong&gt;&lt;code&gt;true&lt;/code&gt;&lt;/strong&gt; , we display words to the user as they became available. This doesn’t expedite the complete generation process, but it cuts down the wait time for the first word. Think of it as the “typewriter” effect seen in ChatGPT.&lt;/p&gt;

&lt;p&gt;To peek under the hood, the streaming feature uses an HTML5 capability called Server Sent Events (SSE). SSE allows servers to push real-time data to web clients over a single HTTP connection. Unlike WebSockets, which is bidirectional, SSE is unidirectional, making it perfect for sending data from the server to the client in scenarios where the client doesn't need to send data back.&lt;/p&gt;

&lt;p&gt;So, we refactor the request to OpenAI to include the input parameter &lt;code&gt;stream&lt;/code&gt; and we return a Stream wrapped in an API Promise to our controller method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;createIdeas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;occasion&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chatModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`I want to host an event for the following occasion: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Write me a list of 4 separate ideas for this event`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In our controller method, we clean up the data a little bit, and then send the data over the HTTP connection to the frontend as it is received from our request to OpenAI.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/generate-ideas&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;occasion&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createIdeas&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="nx"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;}))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="c1"&gt;// If we don't receive any more content, end the connection. &lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Sometimes the AI will format the response with an extra label. &lt;/span&gt;
    &lt;span class="c1"&gt;// Remove this. &lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;data:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;data: &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After understanding the mechanics, our next step was to synchronize the frontend client that communicates with the backend Express endpoint above with the frontend to support SSE.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestIdeas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REACT_APP_API_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/generate-ideas`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="nx"&gt;occasion&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;pipeThrough&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoderStream&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Impact
&lt;/h3&gt;

&lt;p&gt;The results are night and day. From a staggering 30-second wait, users now see the initial AI-generated content within half a second. &lt;/p&gt;

&lt;h2&gt;
  
  
  Tip #2: UI Components of Streaming
&lt;/h2&gt;

&lt;p&gt;When users are given multiple options to choose from, the UI should split those different options into different components. For example, each option should be a radio button or a different div. But streaming text in real-time throws a wrench in the works. How can we differentiate, parse each AI-generated suggestion as distinct UI components?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: Add parsing based on a unique character.
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add a unique Bullet Identifier:&lt;/strong&gt;  Update your prompt to the AI to use an unusual character as a bullet point that is unlikely to appear in the rest of your text. We used the “¶” symbol and updated our prompt to include the following: &lt;code&gt;Start each bullet point with a new line and the ¶ character. Do not include the normal bullet point character, the '-' character, or list numbers.&lt;/code&gt;`&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Splitting the Stream:&lt;/strong&gt; We segmented each byte array from the SSE endpoint into distinct words. This separation was pivotal, given that a single SSE byte array content was unpredictable. Sometimes it included a single word, other times it included full phrases that contained the “¶” character, like &lt;code&gt;subject matter. ¶ Engage&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Append each word:&lt;/strong&gt; Once each word is prepared, we append the value to the appropriate UI component. Tracking the “¶” occurrences helps us assign words to the correct component. For instance, a single “¶” means the content belonged to the first-option component.  Repeating this process in the loop until the SSE endpoint closed. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;&lt;/code&gt;`tsx&lt;br&gt;
export const parseResponseValue = (index: number, value: string, setters: Function[]) =&amp;gt; {&lt;br&gt;
    // Separate into individual words only (no phrases)&lt;br&gt;
  const splitValue = value.split(/(?! {2})/g);&lt;/p&gt;

&lt;p&gt;for (let word of splitValue) {&lt;br&gt;
    if (word.includes('¶')) {&lt;br&gt;
      index++;&lt;br&gt;
    } else {&lt;br&gt;
        setters&lt;a href="https://dev.to(prev:%20string)%20=&amp;gt;%20{&amp;lt;br&amp;gt;%0A%20%20%20%20%20%20%20%20%20%20return%20prev%20+%20word;&amp;lt;br&amp;gt;%0A%20%20%20%20%20%20%20%20}"&gt;index&lt;/a&gt;;&lt;br&gt;
    }&lt;br&gt;
  }&lt;br&gt;
  // Return the index to the calling function for use for the next byte array received from the endpoint. &lt;br&gt;
  return index;&lt;br&gt;
};&lt;br&gt;
`&lt;code&gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;&lt;/code&gt;`tsx&lt;/p&gt;

&lt;p&gt;const [firstIdea, setFirstIdea] = useState("");&lt;br&gt;
const [secondIdea, setSecondIdea] = useState("");&lt;br&gt;
const [thirdIdea, setThirdIdea] = useState("");&lt;br&gt;
const [fourthIdea, setFourthIdea] = useState("");&lt;/p&gt;

&lt;p&gt;const getIdeas = async (newStatus: boolean) =&amp;gt; {&lt;br&gt;
  const responseReader = await requestIdeas(occasion);&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// The index will be incremented each time the unique bullet identifier is seen. This includes the first bullet, so offset by 1. 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;let index = -1;&lt;br&gt;
  while (true) {&lt;br&gt;
    const { value, done } = await responseReader.read();&lt;br&gt;
    if (done) {&lt;br&gt;
      break;&lt;br&gt;
    }&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;index = parseResponseValue(index, value, [
  setFirstIdea,
  setSecondIdea,
  setThirdIdea,
  setFourthIdea,
]);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;}&lt;br&gt;
};&lt;br&gt;
`&lt;code&gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Though string parsing occasionally fails due to edge cases from the AI, it facilitates an overall better user experience by stylizing real-time text streaming. On encountering an AI anomaly, equip users with the ability to retry AI generation. Generally, this fixes any parsing issue encountered the first time. &lt;/p&gt;

&lt;p&gt;Importantly, by avoiding multiple smaller requests, we economized on tokens sent to GPT-4. This not only curtailed costs but also enriched the result quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Harnessing the power of generative AI in applications is undeniably transformative, but it doesn't come without its challenges. As we've explored, latency can be a significant hurdle, potentially hampering user experience. However, with innovative solutions like real-time streaming and strategic UI component parsing, we can overcome these challenges, making our applications not only more responsive but also user-friendly.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>python</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Blueprint for Trustworthy AI: Constructing Accurate Chatbots with Sophisticated Data Pipelines</title>
      <dc:creator>Kaylynn</dc:creator>
      <pubDate>Thu, 09 Nov 2023 15:49:41 +0000</pubDate>
      <link>https://dev.to/focused_dot_io/the-blueprint-for-trustworthy-ai-constructing-accurate-chatbots-with-sophisticated-data-pipelines-57e1</link>
      <guid>https://dev.to/focused_dot_io/the-blueprint-for-trustworthy-ai-constructing-accurate-chatbots-with-sophisticated-data-pipelines-57e1</guid>
      <description>&lt;h2&gt;
  
  
  AI Custom Chatbot with an Advanced Data Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;To get started leveling up your custom AI chatbot, check out our &lt;a href="https://github.com/focused-labs/ai-custom-chatbot-data-pipeline" rel="noopener noreferrer"&gt;GitHub repository here&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This tutorial shows you how to level-up your data pipeline when building a custom AI chatbot.&lt;/p&gt;

&lt;p&gt;We recommend setting up a basic custom chatbot first. This tutorial will build on top of &lt;a href="https://github.com/focused-labs/ai-custom-chatbot-quickstart" rel="noopener noreferrer"&gt;the quick start you can reference here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building AI chatbot apps is easy. Customizing is hard.
&lt;/h2&gt;

&lt;p&gt;We are using our preferred tech stack.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://python.langchain.com/docs/get_started/introduction" rel="noopener noreferrer"&gt;Langchain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://llamahub.ai/" rel="noopener noreferrer"&gt;Llama Hub Data Loaders&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pinecone.io/" rel="noopener noreferrer"&gt;Pinecone Vector Database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;(FastAPI web framework)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are leveraging Retrieval Augmented Generation (RAG). &lt;/p&gt;

&lt;h3&gt;
  
  
  The goal of this tutorial is to help level-up your customized chatbot beyond the basics.
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Increase accuracy&lt;/li&gt;
&lt;li&gt;Grow dataset by integrating more disparate data sources&lt;/li&gt;
&lt;li&gt;Earning user trust&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Objectives
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Increase accuracy by implementing multiple data cleansing techniques&lt;/li&gt;
&lt;li&gt;Integrate Notion&lt;/li&gt;
&lt;li&gt;Earn user trust by providing the link to the original data source&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Here’s a peak at the components of the AI custom chatbot.
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fswzrt5i9n9k50yelwvog.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fswzrt5i9n9k50yelwvog.png" alt="Architecture Overview" width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  For more resources: check out &lt;a href="https://focusedlabs.io/ai" rel="noopener noreferrer"&gt;our other blog posts on AI&lt;/a&gt;.
&lt;/h4&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>python</category>
      <category>openai</category>
    </item>
    <item>
      <title>From Basic to Custom: AI Chatbot Building 101</title>
      <dc:creator>Kaylynn</dc:creator>
      <pubDate>Thu, 02 Nov 2023 15:47:19 +0000</pubDate>
      <link>https://dev.to/focused_dot_io/from-basic-to-custom-ai-chatbot-building-101-ilf</link>
      <guid>https://dev.to/focused_dot_io/from-basic-to-custom-ai-chatbot-building-101-ilf</guid>
      <description>&lt;h2&gt;
  
  
  AI Custom Chatbot Quickstart
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;To get started leveling up your custom AI chatbot, check out our &lt;a href="https://github.com/focused-labs/ai-custom-chatbot-quickstart" rel="noopener noreferrer"&gt;GitHub repository here&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Domain specific AI chatbots are powerful applications that can be used in many use cases: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virtual Assistants&lt;/li&gt;
&lt;li&gt;Knowledge retrieval&lt;/li&gt;
&lt;li&gt;Text synthesis&lt;/li&gt;
&lt;li&gt;Text formatting&lt;/li&gt;
&lt;li&gt;Sentiment analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Building AI chatbot apps is easy. Customizing is hard.
&lt;/h2&gt;

&lt;p&gt;This ready-to-go sample project shows you how to build a custom chatbot using our preferred LLM tech stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://python.langchain.com/docs/get_started/introduction" rel="noopener noreferrer"&gt;Langchain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://llamahub.ai/" rel="noopener noreferrer"&gt;Llama Hub Data Loaders&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pinecone.io/" rel="noopener noreferrer"&gt;Pinecone Vector Database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;(FastAPI web framework)&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are leveraging Retrieval Augmented Generation (RAG). &lt;/p&gt;

&lt;h3&gt;
  
  
  The goal of this tutorial is to demonstrate the basics.
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Demonstrate how to add custom data to an LLM model (we're using OpenAI's gpt-3.5-turbo)&lt;/li&gt;
&lt;li&gt;Demonstrate a conversational memory LLM chatbot&lt;/li&gt;
&lt;li&gt;Demonstrate using agents and tools with the Langchain Framework&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technical Objectives
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Ingest data into a vector database&lt;/li&gt;
&lt;li&gt;Query the vector database&lt;/li&gt;
&lt;li&gt;Query an agent that decides whether to query the vector database&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Here’s a peak at the components of a basic AI custom chatbot.
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1rsq1j3wfp9p12j5m19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff1rsq1j3wfp9p12j5m19.png" alt="Architecture Overview"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  For more resources: check out &lt;a href="https://focusedlabs.io/ai" rel="noopener noreferrer"&gt;our other blog posts on AI&lt;/a&gt;.
&lt;/h4&gt;

</description>
      <category>tutorial</category>
      <category>ai</category>
      <category>beginners</category>
      <category>python</category>
    </item>
  </channel>
</rss>
