<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dewy Knowledge Base</title>
    <description>The latest articles on DEV Community by Dewy Knowledge Base (@dewy).</description>
    <link>https://dev.to/dewy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F8430%2F243b65cd-5b0d-47ac-a322-3eff43311efb.png</url>
      <title>DEV Community: Dewy Knowledge Base</title>
      <link>https://dev.to/dewy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dewy"/>
    <language>en</language>
    <item>
      <title>Building a RAG tool with Vercel's Generative UI components</title>
      <dc:creator>Ryan Michael</dc:creator>
      <pubDate>Thu, 07 Mar 2024 19:52:11 +0000</pubDate>
      <link>https://dev.to/dewy/building-a-rag-tool-with-vercels-generative-ui-components-1ee</link>
      <guid>https://dev.to/dewy/building-a-rag-tool-with-vercels-generative-ui-components-1ee</guid>
      <description>&lt;p&gt;Retrieval-Augmented Generation (RAG) blends the generative abilities of LLMs with the retrieval of information from diverse knowledge bases (KBs). However, traditional implementations of RAG have often relied on pre-scripting the logic for selecting and utilizing these KBs. &lt;/p&gt;

&lt;p&gt;This conventional method, while effective, limits the flexibility, adaptability, and performance of the resulting applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko2h2mtzlk61a5ugoknx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fko2h2mtzlk61a5ugoknx.png" alt="Naive RAG" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But consider an alternative: applications that empower the LLM itself to determine which actions are necessary to generate a response. This approach harnesses not only the LLM's generative capabilities but also its ability to make contextual decisions on the fly, opening up a more dynamic and responsive way to handle knowledge search tasks. As a result, you can build targeted KBs and use the power of LLMs to write application-specific queries that retrieve better, more relevant results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv479gjw6b0s1k0uyehjn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv479gjw6b0s1k0uyehjn.png" alt="RAG Tool" width="800" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To demonstrate this approach, we'll build &lt;a href="https://github.com/DewyKB/dewy-ts/tree/main/examples"&gt;a simple chatbot&lt;/a&gt; backed by a custom tool. The chatbot will have access to a KB containing product information. Any time you ask a question that the chatbot determines could be answered using this tool, it will query the KB and show both the in-progress query and the retrieved results as custom UI elements before responding with the retrieved products as context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sfm36pnvz079ll6axzw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0sfm36pnvz079ll6axzw.png" alt="Image description" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We'll use Vercel's &lt;a href="https://sdk.vercel.ai/docs/concepts/ai-rsc"&gt;Generative UI library&lt;/a&gt; to render the chat window and &lt;a href="https://github.com/DewyKB/dewy"&gt;Dewy&lt;/a&gt; as the knowledge store that implements the information storage and retrieval used by our tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Dewy and Vercel?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Vercel's Generative UI library&lt;/strong&gt; is designed to simplify the creation of custom LLM "tools" and to render rich, interactive UI elements that are specific to each tool the LLM chooses to use. In practice, your UI can dynamically adapt to the specific tools chosen by the LLM, displaying tool-specific information and progress indicators to the user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dewy&lt;/strong&gt; is an OSS knowledge base with the simplicity and ease-of-use of a document store: insert your documents and Dewy takes care of preparing them for semantic search. Dewy's flexibility and ease of use allow developers to focus on building amazing user experiences rather than complex data-processing pipelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before diving into the tutorial, ensure you have the following prerequisites covered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic knowledge of TypeScript and React&lt;/li&gt;
&lt;li&gt;A NextJS environment set up on your local machine&lt;/li&gt;
&lt;li&gt;A copy of Dewy running on your local machine (see Dewy's &lt;a href="https://dev.to/docs/GettingStarted/installation"&gt;installation instructions&lt;/a&gt; if you need help here).&lt;/li&gt;
&lt;li&gt;Access to the &lt;a href="https://platform.openai.com/"&gt;OpenAI API&lt;/a&gt; platform.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Set up your project
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Initialize a new NextJS project&lt;/strong&gt;: Create a NextJS app by running the following command in your terminal:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm dlx create-next-app@canary rag-tool
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Navigate into your new project directory:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;rag-tool
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install required packages&lt;/strong&gt;: Next, we'll install &lt;code&gt;ai&lt;/code&gt; (Vercel's AI library), &lt;code&gt;openai&lt;/code&gt; (OpenAI's official JavaScript SDK, compatible with the Vercel Edge Runtime), and &lt;code&gt;dewy-ts&lt;/code&gt; (the Dewy client library). &lt;code&gt;zod&lt;/code&gt; will be used to describe the input types for our custom tool.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pnpm install ai openai zod dewy-ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prepare your environment&lt;/strong&gt;: Configure your OpenAI key and Dewy endpoint.&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// .env.local
OPENAI_API_KEY=xxxxxxxxx
DEWY_ENDPOINT=localhost:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ol&gt;
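
&lt;p&gt;If one of these two settings is missing, the failure may only surface later as a confusing request error. As a minimal sketch (not part of the tutorial's code), a small check along these lines can fail early with a clear message. The helper name &lt;code&gt;readConfig&lt;/code&gt; and the &lt;code&gt;Env&lt;/code&gt; type are hypothetical; only the two variable names come from the step above.&lt;/p&gt;

```typescript
// Hypothetical helper, not part of the tutorial: read the two settings
// from an environment-like object and fail early if either is missing.
type Env = { [name: string]: string | undefined };

function readConfig(env: Env) {
  const openaiApiKey = env.OPENAI_API_KEY;
  const dewyEndpoint = env.DEWY_ENDPOINT;

  if (!openaiApiKey || !dewyEndpoint) {
    // Report every missing name at once, not just the first one.
    const missing = ["OPENAI_API_KEY", "DEWY_ENDPOINT"].filter(
      (name) => !env[name]
    );
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  return { openaiApiKey, dewyEndpoint };
}
```

&lt;p&gt;In a NextJS app, this could be called once on the server as &lt;code&gt;readConfig(process.env)&lt;/code&gt; before creating the OpenAI and Dewy clients.&lt;/p&gt;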

&lt;h2&gt;
  
  
  Create a custom tool
&lt;/h2&gt;

&lt;p&gt;First, create the custom tool the LLM will use to answer product-related questions. This tool describes its purpose and parameters so the LLM knows when and how to use it, and defines the implementation and UI logic to complete the tool invocation.&lt;/p&gt;

&lt;p&gt;In this case, the tool will "get information about products", given a search query and a result count. When the LLM determines this tool should be used, it will generate the query string and choose how many results are appropriate, then execute the &lt;code&gt;render&lt;/code&gt; function defined below.&lt;/p&gt;
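
&lt;p&gt;With OpenAI function calling, the model supplies these parameters as a JSON string, which has to be parsed and checked before the tool runs. The tutorial relies on Zod for this; the sketch below is only a dependency-free illustration of the same idea, and the names &lt;code&gt;SearchArgs&lt;/code&gt; and &lt;code&gt;parseSearchArgs&lt;/code&gt; are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical, dependency-free stand-in for a schema check:
// parse the model's JSON arguments and validate them before searching.
type SearchArgs = { query: string; count: number };

function parseSearchArgs(rawJson: string): SearchArgs {
  const value: unknown = JSON.parse(rawJson);
  if (typeof value !== "object" || value === null) {
    throw new Error("Tool arguments must be a JSON object");
  }

  const { query, count } = value as { query?: unknown; count?: unknown };

  // `query` is the search text written by the model.
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("`query` must be a non-empty string");
  }
  // `count` is how many results the model asked for.
  if (typeof count !== "number" || Math.floor(count) !== count || 1 > count) {
    throw new Error("`count` must be a positive integer");
  }

  return { query, count };
}
```

&lt;p&gt;For example, &lt;code&gt;parseSearchArgs('{"query":"waterproof jacket","count":3}')&lt;/code&gt; returns the typed arguments, while an empty query or a non-integer count throws before any search is attempted.&lt;/p&gt;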

&lt;p&gt;On execution, the tool searches Dewy for relevant products, then calls the LLM again to pick up where it left off. Since this process can take a while, the &lt;code&gt;render&lt;/code&gt; function &lt;code&gt;yield&lt;/code&gt;s components indicating that the tool is being used and how far along it is. These components allow the UI to reflect the specific information relevant to this tool's outcomes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/productSearch.tsx&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Dewy&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dewy-ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenAIStream&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Tokens&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai/react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;AssistantMessage&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./AssistantMessage&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;ResultCard&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./ResultCard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;SearchCard&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./SearchCard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Create a Dewy client.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dewy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Dewy&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;BASE&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DEWY_ENDPOINT&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;// Implement the tool's logic.&lt;/span&gt;
&lt;span class="c1"&gt;// In this case, we search for the given query&lt;/span&gt;
&lt;span class="c1"&gt;// and return the `count` most similar chunks in the KB.&lt;/span&gt;
&lt;span class="c1"&gt;// The returned chunks are used in the `render` method below.&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchProducts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dewy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieveChunks&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;product_info&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Define the behavior of the product search tool.&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;productSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// A description of the tool.&lt;/span&gt;
        &lt;span class="c1"&gt;// This used by the LLM to decide when to use the tool&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Get information about products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="c1"&gt;// Parameters control how the tool behaves.&lt;/span&gt;
        &lt;span class="c1"&gt;// These values will be picked by the LLM,&lt;/span&gt;
        &lt;span class="c1"&gt;// so be sure to clearly explain what they're&lt;/span&gt;
        &lt;span class="c1"&gt;// used for.&lt;/span&gt;
        &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;```


                A description of a product 
                or what the product can be used for.


            ```&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;```


                The number of products to return.


            ```&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;required&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;

        &lt;span class="c1"&gt;// Configure the tool's behavior.&lt;/span&gt;
        &lt;span class="c1"&gt;// This function will be called after the LLM has&lt;/span&gt;
        &lt;span class="c1"&gt;// chosen to use the tool and generated values for&lt;/span&gt;
        &lt;span class="c1"&gt;// the parameters we configured above.&lt;/span&gt;
        &lt;span class="na"&gt;render&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Let the user know we're looking for &lt;/span&gt;
            &lt;span class="c1"&gt;// products related to their message&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;SearchCard&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;

            &lt;span class="c1"&gt;// Search for products related to the user's question&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;products&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchProducts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;// Update the message history &lt;/span&gt;
            &lt;span class="c1"&gt;// with the results we found&lt;/span&gt;
            &lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;product_search&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="c1"&gt;// Now reply to the user.&lt;/span&gt;
            &lt;span class="c1"&gt;// The products we retrieved are part of the state &lt;/span&gt;
            &lt;span class="c1"&gt;// provided as the messages parameter&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4-0125-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;

            &lt;span class="c1"&gt;// Stream the results back as they're generated&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;onFinal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="c1"&gt;// Update the conversation history &lt;/span&gt;
                    &lt;span class="c1"&gt;// once the full response is received&lt;/span&gt;
                    &lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;done&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
                        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
                        &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;])&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;

            &lt;span class="c1"&gt;// Display the response alongside &lt;/span&gt;
            &lt;span class="c1"&gt;// the products provided to the model.&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"flex flex-col gap-2"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ResultCard&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;results&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Tokens&lt;/span&gt; &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
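
&lt;p&gt;The &lt;code&gt;render&lt;/code&gt; function above is an async generator: it yields an interim component, awaits the search, then returns the final component. To see that control flow in isolation, here is a minimal sketch with no React or Vercel dependencies, where plain strings stand in for components. All three function names are hypothetical, and &lt;code&gt;fakeSearch&lt;/code&gt; is a placeholder for the Dewy lookup, not a real API.&lt;/p&gt;

```typescript
// Hypothetical stand-in for a slow knowledge-base lookup.
async function fakeSearch(query: string, count: number) {
  const results: string[] = [];
  for (let i = 1; count >= i; i++) {
    results.push(`${query} result ${i}`);
  }
  return results;
}

// Yield an interim status first, then return the final output,
// mirroring the shape of the tool's `render` function.
async function* renderSearch(query: string, count: number) {
  yield `Searching for "${query}"...`;
  const results = await fakeSearch(query, count);
  return `Found ${results.length}: ${results.join("; ")}`;
}

// Drive the generator by hand, recording every frame a UI would show.
async function collectFrames(query: string, count: number) {
  const frames: string[] = [];
  const gen = renderSearch(query, count);
  let step = await gen.next();
  while (!step.done) {
    frames.push(step.value); // interim frame from `yield`
    step = await gen.next();
  }
  frames.push(step.value); // final frame from `return`
  return frames;
}
```

&lt;p&gt;Calling &lt;code&gt;collectFrames("tent", 2)&lt;/code&gt; resolves to two frames: the "Searching" status followed by the final "Found 2" summary. In the real tool, each frame is a React component rather than a string.&lt;/p&gt;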



&lt;h2&gt;
  
  
  Set up the server-side message handler
&lt;/h2&gt;

&lt;p&gt;Configure a message handling action to receive new messages from the client. This handler configures the initial state of the chatbot and defines the server's behavior when a message is received from the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/action.tsx&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createStreamableUI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;getMutableAIState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;render&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai/rsc&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;productSearch&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./productSearch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;AssistantMessage&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./AssistantMessage&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Configure the LLM, in this case OpenAI&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;submitUserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// The AI state. This contains our message history,&lt;/span&gt;
  &lt;span class="c1"&gt;// and will be provided to the LLM each time a &lt;/span&gt;
  &lt;span class="c1"&gt;// chat completion is generated.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;aiState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;getMutableAIState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="c1"&gt;// Helper function for building streamable UI components.&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ui&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4-0125-preview&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
        You are a product recommendation engine. 
        Respond only with information about products 
        retrieved using the "product_search" function role.
        `&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;

    &lt;span class="c1"&gt;// This determines how generated text &lt;/span&gt;
    &lt;span class="c1"&gt;// (as opposed to function calls) will&lt;/span&gt;
    &lt;span class="c1"&gt;// be rendered.&lt;/span&gt;
    &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;done&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
          &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;content&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;

    &lt;span class="c1"&gt;// Configure the tools available to the LLM&lt;/span&gt;
    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;product_search&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;productSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;aiState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ui&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Define the initial state of the AI. It can be any JSON object.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initialAIState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;

&lt;span class="c1"&gt;// The initial UI state that the client will keep track of, which contains the message IDs and their UI nodes.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;initialUIState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ReactNode&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Hi! How can I help you today?&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;AssistantMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;}];&lt;/span&gt;

&lt;span class="c1"&gt;// AI is a provider you wrap your application with so you can access AI and UI state in your components.&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;actions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;submitUserMessage&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// Each state can be any shape of object, but for chat applications&lt;/span&gt;
  &lt;span class="c1"&gt;// it makes sense to have an array of messages. Or you may prefer something like { id: number, messages: Message[] }&lt;/span&gt;
  &lt;span class="nx"&gt;initialUIState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;initialAIState&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Set up the chat UI on the client
&lt;/h2&gt;

&lt;p&gt;Finally, set up the chat UI on the client.&lt;br&gt;
Map over the UI state's messages and render each one's &lt;code&gt;display&lt;/code&gt; property. Configure a form that appends the user's message to the UI state, then calls the &lt;code&gt;submitUserMessage&lt;/code&gt; server-side action and appends its response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/page.tsx&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useUIState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useActions&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai/rsc&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./action&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@/components/ui/input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Button&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@/components/ui/button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;UserMessage&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./UserMessage&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PlaneIcon&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./icons&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;inputValue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setInputValue&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useUIState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;submitUserMessage&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useActions&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"flex-1 overflow-auto px-4 mt-4"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"grid gap-4 md:gap-8"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="si"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;display&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
          &lt;span class="si"&gt;}&lt;/span&gt;

        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt; &lt;span class="na"&gt;onSubmit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;preventDefault&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="c1"&gt;// Add user message to UI state&lt;/span&gt;
        &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;currentMessages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;currentMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="na"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;inputValue&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;UserMessage&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;,&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="c1"&gt;// Submit and get response message&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;responseMessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;submitUserMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputValue&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;currentMessages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;currentMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="nx"&gt;responseMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nf"&gt;setInputValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"border-t-2"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"flex items-center h-14 px-4"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Input&lt;/span&gt;
              &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"rounded-full flex-1 min-w-0 bg-gray-200 dark:bg-gray-800"&lt;/span&gt;
              &lt;span class="na"&gt;placeholder&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Type a message..."&lt;/span&gt;
              &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;inputValue&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;
              &lt;span class="na"&gt;onChange&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nf"&gt;setInputValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Button&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"ml-2 h-8 w-8 rounded-full bg-gray-200 dark:bg-gray-800"&lt;/span&gt; &lt;span class="na"&gt;size&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"icon"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;PlaneIcon&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"h-4 w-4"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"sr-only"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Send message&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;span&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To keep this post from getting any longer, several basic UI components have been omitted, but you can check out the full implementation in the &lt;a href="https://github.com/DewyKB/dewy-ts/tree/main/examples/rag-tool-vercel"&gt;examples directory&lt;/a&gt; of the Dewy repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, this blog post has guided you through the process of building a Retrieval-Augmented Generation (RAG) tool using Vercel's Generative UI components. This approach moves away from the traditional methods of pre-scripting logic for selecting and utilizing knowledge bases. By empowering the LLM to determine which knowledge bases are necessary and how best to use them, we've showcased a more dynamic, efficient, and user-centric way of handling information retrieval tasks.&lt;/p&gt;

&lt;p&gt;By utilizing Dewy as our knowledge base, we've emphasized the importance of a flexible, easy-to-use platform for storing, organizing, and retrieving information. This synergy between Dewy's streamlined data management and Vercel's dynamic UI rendering paves the way for developers to create more intelligent, responsive, and user-friendly applications.&lt;/p&gt;

&lt;p&gt;If this tutorial has been helpful and you'd like to help others learn about Dewy, please consider starring &lt;a href="https://github.com/DewyKB/dewy"&gt;our GitHub repo&lt;/a&gt;!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Building a Question-Answering CLI with Dewy and LangChain</title>
      <dc:creator>Ben Chambers</dc:creator>
      <pubDate>Mon, 04 Mar 2024 18:10:42 +0000</pubDate>
      <link>https://dev.to/dewy/building-a-question-answering-cli-with-dewy-and-langchain-2fi9</link>
      <guid>https://dev.to/dewy/building-a-question-answering-cli-with-dewy-and-langchain-2fi9</guid>
      <description>&lt;p&gt;In this tutorial, we're focusing on how to build a question-answering CLI tool using Dewy and LangChain. Dewy is an open-source knowledge base that helps developers organize and retrieve information efficiently. LangChain is a framework that simplifies the integration of large language models (LLMs) into applications. By combining Dewy's capabilities for managing knowledge with LangChain's LLM integration, you can create tools that answer complex queries with precise and relevant information.&lt;/p&gt;

&lt;p&gt;Using a knowledge base to augment the capabilities of an LLM is referred to as retrieval-augmented generation (RAG). This guide walks you through setting up a simple command-line RAG application. It covers everything from setting up your environment and loading documents into Dewy to using an LLM through LangChain to answer questions based on the retrieved results. It's designed for engineers looking to add question-answering functionality to their projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Dewy and LangChain?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Dewy&lt;/strong&gt; is an OSS knowledge base designed to streamline the way developers store, organize, and retrieve information. Its flexibility and ease of use make it an excellent choice for developers aiming to build knowledge-driven applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangChain&lt;/strong&gt;, on the other hand, is a framework that lets developers integrate LLMs into their applications. By combining Dewy's structured knowledge management with LangChain's LLM capabilities, developers can create question-answering systems that process complex queries and return precise, contextually relevant answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Goal
&lt;/h2&gt;

&lt;p&gt;Our aim is to build a simple yet powerful question-answering CLI script. This script will allow users to load documents into the Dewy knowledge base and then use an LLM, through LangChain, to answer questions based on the information stored in Dewy. This tutorial will guide you through the process, from setting up your environment to implementing the CLI script.&lt;/p&gt;

&lt;p&gt;You'll learn how to use LangChain to build a simple question-answering application, and how to integrate Dewy as a source of knowledge, allowing your application to answer questions based on specific documents you provide it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before diving into the tutorial, ensure you have the following prerequisites covered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic knowledge of Python programming&lt;/li&gt;
&lt;li&gt;Familiarity with developing CLI tools&lt;/li&gt;
&lt;li&gt;A copy of Dewy running on your local machine (see Dewy's &lt;a href="https://dewykb.github.io/docs/GettingStarted/installation"&gt;installation instructions&lt;/a&gt; if you need help here).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Set Up Your Project
&lt;/h2&gt;

&lt;p&gt;The final code for this example is available &lt;a href="https://github.com/DewyKB/dewy/tree/main/demos/python-langchain-cli"&gt;in the Dewy repo&lt;/a&gt; if you'd like to jump ahead.&lt;/p&gt;

&lt;p&gt;First, create a directory for the CLI project and change into it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;dewy_qa
&lt;span class="nb"&gt;cd &lt;/span&gt;dewy_qa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With the directory set up, you can create and initialize a project using Poetry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;poetry init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When it asks whether you want to define your main dependencies interactively, choose yes and enter the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;langchain-core&lt;/code&gt; which we'll use for the orchestration&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;langchain-openai&lt;/code&gt; which we'll use for the OpenAI LLM interface&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;click&lt;/code&gt; which we'll use for the CLI&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;dewy-langchain&lt;/code&gt; which provides the LangChain retriever for querying Dewy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now you're ready to create the CLI application.&lt;/p&gt;

&lt;p&gt;We're using &lt;code&gt;click&lt;/code&gt;, which lets us create a CLI using decorators on functions.&lt;br&gt;
To start, we'll create a "group" for the two commands we're going to implement -- one for adding a document and one for asking a question.&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;``&lt;code&gt;python title="CLI entry point"&lt;br&gt;
@click.group()&lt;br&gt;
@click.option("--collection", default="main")&lt;br&gt;
@click.option("--base_url", default="http://localhost:8000")&lt;br&gt;
@click.pass_context&lt;br&gt;
def cli(ctx, collection, base_url):&lt;br&gt;
    # ensure that ctx.obj exists and is a dict (in case&lt;/code&gt;cli()&lt;code&gt;is called&lt;br&gt;
    # by means other than the&lt;/code&gt;if` block below)&lt;br&gt;
    ctx.ensure_object(dict)&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ctx.obj["base_url"] = base_url
ctx.obj["collection"] = collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  Commands will go here
&lt;/h1&gt;

&lt;p&gt;if &lt;strong&gt;name&lt;/strong&gt; == "&lt;strong&gt;main&lt;/strong&gt;":&lt;br&gt;
    cli()&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;


In addition to creating a group, this does the following:

- Accepts a `collection` argument indicating which Dewy collection to operate on.
- Accepts a `base_url` argument (with a default) indicating which Dewy service to connect to.
- Stores both of those options on the context.
- Executes the `cli` group when invoked.

Adding these options to the root allows them to be passed before the specific command, and makes it clear they apply to all (or most) of the commands in the CLI application we're building.

Now you can run your script with `poetry run python -m dewy_qa`.

## Step 2: Implement document loading

Load documents by setting up the Dewy client. The following code adds an `add_file` command which accepts a single positional `url_or_file`. If that corresponds to a valid file path, it uploads the file to Dewy. Otherwise, it creates the document from the given URL and Dewy will fetch it. This logic could be improved (eg., `file://` URLs should be uploaded) but it demonstrates several key abilities:

1. You can create a document from a URL, and Dewy will download and ingest the file.
2. You can create a document without associated content, and then upload content which Dewy will ingest.



```python title="Add File Command"
@cli.command()
@click.pass_context
@click.argument("url_or_file")
def add_file(ctx, url_or_file):
    from dewy_client.api.kb import add_document, upload_document_content
    from dewy_client.models import AddDocumentRequest, BodyUploadDocumentContent
    from dewy_client.types import File

    client = Client(ctx.obj["base_url"])
    if os.path.isfile(url_or_file):
        document = add_document.sync(
            client=client,
            body=AddDocumentRequest(
                collection=ctx.obj["collection"],
            ),
        )
        print(f"Added document {document.id}. Uploading content.")

        with open(url_or_file, "rb") as file:
            payload = file.read()
            upload_document_content.sync(
                document.id,
                client=client,
                body=BodyUploadDocumentContent(
                    content=File(
                        payload=payload,
                        file_name=os.path.basename(url_or_file),
                    ),
                ),
            )
        print(f"Uploaded content for document {document.id}.")

    else:
        document = add_document.sync(
            client=client,
            body=AddDocumentRequest(collection=ctx.obj["collection"], url=url_or_file),
        )
        print(f"Added document {document.id} from URL '{url_or_file}'")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
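
&lt;p&gt;The improvement mentioned above (treating &lt;code&gt;file://&lt;/code&gt; URLs as uploads) could look roughly like the following. This is a minimal sketch using only the standard library; &lt;code&gt;resolve_source&lt;/code&gt; is a hypothetical helper name, not part of Dewy or &lt;code&gt;dewy_client&lt;/code&gt;, and it only decides which branch of &lt;code&gt;add_file&lt;/code&gt; to take.&lt;/p&gt;

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def resolve_source(url_or_file):
    """Return ("file", local_path) or ("url", url) for a CLI argument.

    Hypothetical helper for illustration; not part of Dewy or dewy_client.
    """
    parsed = urlparse(url_or_file)
    if parsed.scheme == "file":
        # file:// URLs name local files, so upload them instead of asking Dewy to fetch.
        return ("file", url2pathname(parsed.path))
    if os.path.isfile(url_or_file):
        return ("file", url_or_file)
    # Anything else is handed to Dewy as a URL to download.
    return ("url", url_or_file)
```

&lt;p&gt;The &lt;code&gt;"file"&lt;/code&gt; result would feed the upload branch and the &lt;code&gt;"url"&lt;/code&gt; result the URL branch of the command shown above.&lt;/p&gt;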


&lt;p&gt;At this point, you should be able to load a document from a URL or local PDF using the command &lt;code&gt;poetry run python -m dewy_qa add-file &amp;lt;url_or_file&amp;gt;&lt;/code&gt; (Click 7 and later expose the &lt;code&gt;add_file&lt;/code&gt; function as the &lt;code&gt;add-file&lt;/code&gt; command).&lt;br&gt;
For example, you could use &lt;code&gt;https://arxiv.org/pdf/2009.08553.pdf&lt;/code&gt; to load a PDF from arXiv.&lt;/p&gt;

&lt;p&gt;You may ask -- "what happens if I upload content to a document that is already ingested?" Conveniently, Dewy will treat this as a new version of the document and re-ingest it!&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 3: Implement question-answering
&lt;/h2&gt;

&lt;p&gt;With the ability to load documents into Dewy, it's time to integrate LangChain to invoke LLMs for answering questions. This step involves setting up LangChain to query the Dewy knowledge base and process the results using an LLM to generate answers.&lt;/p&gt;

&lt;p&gt;We're going to introduce a &lt;code&gt;query&lt;/code&gt; command which accepts a file containing the question (or reads it from stdin).&lt;br&gt;
We'll build this up in several steps.&lt;/p&gt;
&lt;h3&gt;
  
  
  Create DewyRetriever
&lt;/h3&gt;

&lt;p&gt;First, we'll create the &lt;code&gt;query&lt;/code&gt; command and create a &lt;code&gt;DewyRetriever&lt;/code&gt; for our collection.&lt;br&gt;
This is an adapter that lets LangChain know how to retrieve documents from Dewy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dewy_langchain import DewyRetriever

retriever = DewyRetriever.for_collection(
    collection=ctx.obj["collection"], base_url=ctx.obj["base_url"]
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Create a PromptTemplate
&lt;/h3&gt;

&lt;p&gt;This is a string template that tells LangChain how to create the prompt for the LLM. In this case, the LLM is instructed to answer the question, but only using the information it's provided. This reduces the model's tendency to "hallucinate", or make up an answer that's plausible but wrong. The values of &lt;code&gt;context&lt;/code&gt; and &lt;code&gt;question&lt;/code&gt; will be configured when we assemble the "chain".&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """Answer the question based only on the following context:
{context}

Question: {question}
"""
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
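
&lt;p&gt;To see what the template produces, here is a plain-Python stand-in that performs the same placeholder substitution without LangChain. It is only an illustration: &lt;code&gt;render_prompt&lt;/code&gt; is a hypothetical name, and the real &lt;code&gt;ChatPromptTemplate&lt;/code&gt; produces chat messages rather than a bare string.&lt;/p&gt;

```python
# Plain-Python stand-in for the prompt template; no LangChain required.
TEMPLATE = """Answer the question based only on the following context:
{context}

Question: {question}
"""


def render_prompt(context, question):
    """Substitute the two placeholders, as the prompt step does for each request."""
    return TEMPLATE.format(context=context, question=question)


prompt_text = render_prompt(
    context="Dewy can create a document from a URL and ingest it.",
    question="How can documents be added to Dewy?",
)
print(prompt_text)
```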



&lt;h3&gt;
  
  
  Create the Chain
&lt;/h3&gt;

&lt;p&gt;LangChain works by building up "chains" of behavior that control how to query the LLM and other data sources. This example uses &lt;a href="https://python.langchain.com/docs/expression_language/"&gt;LCEL&lt;/a&gt;, which provides a more flexible programming experience than some of LangChain's original interfaces.&lt;/p&gt;

&lt;p&gt;Compose the steps with the &lt;code&gt;|&lt;/code&gt; operator to create an LCEL chain (a &lt;code&gt;RunnableSequence&lt;/code&gt;). This chain describes how to generate the &lt;code&gt;context&lt;/code&gt; and &lt;code&gt;question&lt;/code&gt; values: the context is generated using the retriever created earlier, and the question is generated by passing through the step's input. The documents Dewy retrieves are handed to the prompt as-is; this Python version has no separate formatting step.&lt;/p&gt;

&lt;p&gt;This chain does the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It retrieves documents using the &lt;code&gt;DewyRetriever&lt;/code&gt; and assigns them to &lt;code&gt;context&lt;/code&gt; and assigns the chain's input value to &lt;code&gt;question&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;It formats the prompt string using the &lt;code&gt;context&lt;/code&gt; and &lt;code&gt;question&lt;/code&gt; variables.&lt;/li&gt;
&lt;li&gt;It passes the formatted prompt to the LLM to generate a response.&lt;/li&gt;
&lt;li&gt;It formats the LLM's response as a string.
&lt;/li&gt;
&lt;/ol&gt;
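
&lt;p&gt;The four steps above can be sketched in plain Python, independent of LangChain, to make the data flow concrete. Every function here is a hypothetical stand-in: &lt;code&gt;fake_retriever&lt;/code&gt; and &lt;code&gt;fake_model&lt;/code&gt; return canned values instead of calling Dewy or an LLM.&lt;/p&gt;

```python
def fake_retriever(question):
    # Hypothetical stand-in for DewyRetriever: returns text chunks for a question.
    return ["Dewy can create a document from a URL and ingest it."]


def format_prompt(inputs):
    # Step 2: fill the prompt using the context and question variables.
    return (
        "Answer the question based only on the following context:\n"
        f"{inputs['context']}\n\nQuestion: {inputs['question']}\n"
    )


def fake_model(prompt_text):
    # Hypothetical stand-in for the LLM call; it only reports what it received.
    return {"content": f"(model reply to a {len(prompt_text)}-character prompt)"}


def parse_output(message):
    # Step 4: reduce the model's message to a plain string.
    return message["content"]


def run_chain(question):
    # Step 1: build both variables from the single input value.
    inputs = {"context": fake_retriever(question), "question": question}
    # Step 3 happens inside fake_model.
    return parse_output(fake_model(format_prompt(inputs)))


answer = run_chain("How does Dewy load documents?")
print(answer)
```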

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

model = ChatOpenAI()

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Invoke the Chain and Print Results
&lt;/h3&gt;

&lt;p&gt;Now that the chain has been constructed, execute it and output the results to the console. As you'll see, &lt;code&gt;query&lt;/code&gt; is the file argument supplied by the caller of the function, and the question text is read from it.&lt;/p&gt;

&lt;p&gt;Executing the chain using &lt;code&gt;chain.invoke()&lt;/code&gt; runs it to completion and returns the whole answer as a string, which is then written to the console with &lt;code&gt;click.echo&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;query_str = query.read()
click.echo(f"Invoking chain for:\n{query_str}")
result = chain.invoke(query_str)
click.echo(f"\n\nAnswer:\n{result}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Putting it all together
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;@cli.command()
@click.pass_context
@click.argument("query", type=click.File("r"), default=sys.stdin)
def query(ctx, query):
    from dewy_langchain import DewyRetriever
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough
    from langchain_openai import ChatOpenAI

    retriever = DewyRetriever.for_collection(
        collection=ctx.obj["collection"], base_url=ctx.obj["base_url"]
    )

    prompt = ChatPromptTemplate.from_template(
        """Answer the question based only on the following context:
{context}

Question: {question}
"""
    )

    model = ChatOpenAI()

    chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | model
        | StrOutputParser()
    )

    query_str = query.read()
    click.echo(f"Invoking chain for:\n{query_str}")
    result = chain.invoke(query_str)
    click.echo(f"\n\nAnswer:\n{result}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Trying it out
&lt;/h3&gt;

&lt;p&gt;Now you can run &lt;code&gt;echo "&amp;lt;your question&amp;gt;" | poetry run python -m dewy_qa query&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By following this guide, you've learned how to create a CLI that uses Dewy to manage knowledge and LangChain to process questions and generate answers. This tool demonstrates the practical application of combining a structured knowledge base with the analytical power of LLMs, enabling developers to build more intelligent and responsive applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading and Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Dewy GitHub repository: &lt;a href="https://github.com/DewyKB/dewy"&gt;https://github.com/DewyKB/dewy&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Dewy LangChain integration repository: &lt;a href="https://github.com/DewyKB/dewy_langchain"&gt;https://github.com/DewyKB/dewy_langchain&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;LangChain documentation: &lt;a href="https://python.langchain.com"&gt;https://python.langchain.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenAI documentation: &lt;a href="https://platform.openai.com/docs/introduction"&gt;https://platform.openai.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>rag</category>
      <category>python</category>
      <category>langchain</category>
      <category>dewy</category>
    </item>
    <item>
      <title>Extraction Matters Most</title>
      <dc:creator>Ben Chambers</dc:creator>
      <pubDate>Wed, 28 Feb 2024 21:33:32 +0000</pubDate>
      <link>https://dev.to/dewy/extraction-matters-most-2kc6</link>
      <guid>https://dev.to/dewy/extraction-matters-most-2kc6</guid>
      <description>&lt;p&gt;There is a lot of content on getting started with Retrieval Augmented Generation (RAG), and numerous posts on various forms of “advanced” RAG. LangChain lists 8 different PDF loaders, 4 different text splitters, and a variety of techniques you can add such as Parent-Document Retrieval and Multi-Vector indexing. It is easy to be paralyzed with doubt – which of these should you use?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgbfjdkl6bo54l35r5xj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffgbfjdkl6bo54l35r5xj.png" alt="Some of the many RAG configuration options" width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’re wondering which of these techniques are best for your domain, the long answer is – you should understand which techniques may help your documents and run experiments on your data to see which of those help.&lt;/p&gt;

&lt;p&gt;If you're in a hurry, the short answer is that how you extract information matters most. Embeddings and advanced splitting and indexing strategies don't help (or actively hurt) retrieval.&lt;/p&gt;

&lt;p&gt;To understand the importance of different factors and prioritize implementation of corresponding techniques within Dewy (the open-source knowledge base we’re developing), we ran some experiments on different configurations. We’re posting these to hopefully save you some time by using these results as a starting point, and possibly using some of the methodology from this post for your own measurements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Experimentation
&lt;/h2&gt;

&lt;p&gt;There are a variety of frameworks for evaluating RAG and LLM usage, such as &lt;a href="https://www.trulens.org/"&gt;TruLens&lt;/a&gt; and &lt;a href="https://docs.ragas.io/en/latest/index.html"&gt;RAGAS&lt;/a&gt; – at the time of writing LangSmith supports evaluation but doesn’t consider the context, making it less suited for evaluating RAG. The general pattern is to record the question and answer (and often the retrieved contexts) and then use an LLM to grade the responses. Typically, you may ask for grades along different metrics – such as “are the retrieved contexts relevant to the question” or “is the generated answer grounded in the retrieved context”. This approach lets you measure the performance across a large set of generated responses, and fits well with the existing tracing systems like &lt;a href="https://www.langchain.com/langsmith"&gt;LangSmith&lt;/a&gt;. For this article, I used RAGAS.&lt;/p&gt;

&lt;p&gt;The data set I’m using is the &lt;a href="https://github.com/run-llama/llama-datasets/tree/main/llama_datasets/history_of_alexnet"&gt;History of Alexnet&lt;/a&gt;. This data set includes a PDF representing the source content (the document or documents to index and retrieve) and a set of 160 questions and answers. For your experiments, it’s best to have your LLM application record actual questions and possible golden answers – generated by a human or using a larger model and/or larger context.&lt;/p&gt;

&lt;p&gt;All of the frameworks mentioned support running the experiments out-of-band. This allows you to record the actual questions and answers your users are asking, and then run evaluation after the fact to assess the quality of the answers you’ve served up. This is very useful for monitoring your LLM application.&lt;/p&gt;

&lt;p&gt;RAGAS provides a variety of &lt;a href="https://docs.ragas.io/en/latest/concepts/metrics/index.html"&gt;metrics&lt;/a&gt;. For these experiments, I’m only measuring the following metrics related to retrieval.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.ragas.io/en/latest/concepts/metrics/context_precision.html"&gt;&lt;strong&gt;Context Precision&lt;/strong&gt;&lt;/a&gt;: Whether the context items relevant to the ground-truth answer are ranked higher than those irrelevant.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.ragas.io/en/latest/concepts/metrics/context_recall.html"&gt;&lt;strong&gt;Context Recall&lt;/strong&gt;&lt;/a&gt;: The proportion of sentences in the ground-truth answer attributed to the retrieved context. This uses the ground-truth answer as a proxy for the "correct" context.&lt;/li&gt;
&lt;/ul&gt;
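
&lt;p&gt;As a rough illustration of what these two metrics measure, here is a toy sketch. It assumes the per-sentence and per-chunk relevance judgments are already available as booleans (in RAGAS an LLM makes those judgments), and the formulas are a simplified reading of the definitions above rather than the library's exact implementation.&lt;/p&gt;

```python
def context_recall(attributed):
    """Fraction of ground-truth answer sentences supported by the retrieved context.

    `attributed` holds one boolean per sentence of the ground-truth answer.
    """
    return sum(attributed) / len(attributed)


def context_precision(relevant):
    """Rank-aware precision: relevant chunks near the top score higher.

    `relevant` holds one boolean per retrieved chunk, in rank order.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


# Hypothetical judgments, not taken from the experiments.
print(context_recall([True, True, False, True]))
print(context_precision([True, False, True]))
print(context_precision([False, True, True]))
```

&lt;p&gt;The last two calls retrieve the same number of relevant chunks, but the ranking that puts a relevant chunk first scores higher.&lt;/p&gt;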

&lt;p&gt;To simplify comparisons in this post, we use the harmonic mean of precision and recall. This is often called the “f1 score”, but since the RAGAS context recall and context precision metrics are defined slightly differently from classical precision and recall, we’ll call this the “context f1”.&lt;/p&gt;
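
&lt;p&gt;The F1 score is conventionally the harmonic mean of precision and recall. A minimal sketch, assuming that convention carries over to context F1 (the function name &lt;code&gt;context_f1&lt;/code&gt; is ours, for illustration):&lt;/p&gt;

```python
def context_f1(precision, recall):
    """Harmonic mean of context precision and context recall."""
    if precision + recall == 0:
        return 0.0  # avoid dividing by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores, not taken from the experiments.
print(context_f1(0.9, 0.6))
```

&lt;p&gt;Because the harmonic mean is pulled toward the smaller value, a configuration cannot score well by maximizing only one of the two metrics.&lt;/p&gt;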

&lt;p&gt;See &lt;a href="https://github.com/DewyKB/dewy/tree/main/experiments"&gt;the code for the experiments&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;For our experiments, we ran using 3 different extractors (&lt;a href="https://pypi.org/project/pypdf/"&gt;pypdf&lt;/a&gt;, &lt;a href="https://pypi.org/project/pymupdf/"&gt;pymupdf&lt;/a&gt; and &lt;a href="https://pypi.org/project/unstructured/"&gt;unstructured&lt;/a&gt;), 5 different chunking and indexing strategies (&lt;a href="https://python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter"&gt;recursive chunking&lt;/a&gt;, &lt;a href="https://python.langchain.com/docs/modules/data_connection/retrievers/parent_document_retriever"&gt;parent-document chunking&lt;/a&gt;, &lt;a href="https://python.langchain.com/docs/modules/data_connection/document_transformers/semantic-chunker"&gt;semantic chunking&lt;/a&gt;, and &lt;a href="https://python.langchain.com/docs/modules/data_connection/retrievers/multi_vector#hypothetical-queries"&gt;questions-answered multi-vector embedding&lt;/a&gt; with both recursive and semantic chunking), 2 different embeddings (OpenAI &lt;a href="https://platform.openai.com/docs/guides/embeddings/embedding-models"&gt;ada-002&lt;/a&gt; and &lt;a href="https://huggingface.co/BAAI/bge-small-en-v1.5/tree/main"&gt;bge-small-en-v1.5&lt;/a&gt;) and 2 different search strategies (similarity and maximum marginal relevance). We ran the experiment and evaluation on the cross-product of all of these factors.&lt;/p&gt;
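
&lt;p&gt;The cross-product of these factors can be enumerated in a few lines. This is a sketch of the experiment grid only; the string labels are illustrative and are not the identifiers used in the experiment code.&lt;/p&gt;

```python
from itertools import product

# Illustrative labels for each factor described above.
extractors = ["pypdf", "pymupdf", "unstructured"]
indexing = [
    "recursive",
    "parent-document",
    "semantic",
    "questions-answered (recursive)",
    "questions-answered (semantic)",
]
embeddings = ["ada-002", "bge-small-en-v1.5"]
search = ["similarity", "mmr"]

# One tuple per configuration: (extractor, indexing, embedding, search).
configurations = list(product(extractors, indexing, embeddings, search))
print(len(configurations))  # 3 * 5 * 2 * 2 = 60
```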

&lt;p&gt;The results are available in a &lt;a href="https://drive.google.com/file/d/1AyfIyYrBf1sJz03cCYdbGrF4th0ed8TL/view?usp=drive_link"&gt;zip file&lt;/a&gt; and the &lt;a href="https://colab.research.google.com/drive/1ycGgTPvDRLuEOvK6kOfsxg_nKkki6ncG#scrollTo=VH-7g3CdqSOI"&gt;analysis notebook&lt;/a&gt; used to produce the charts is available in Google Colab. For each of the choices we show a heatmap comparing the median of the relative differences in context F1 between the choices.&lt;/p&gt;
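
&lt;p&gt;One plausible way to compute a single heatmap cell is sketched below. It assumes scores are paired across configurations that differ only in the choice being compared; that pairing, the function name, and the numbers are assumptions for illustration, and the notebook may compute it differently.&lt;/p&gt;

```python
from statistics import median


def median_relative_change(baseline_scores, candidate_scores):
    """Median of (candidate - baseline) / baseline over paired configurations."""
    changes = [
        (candidate - baseline) / baseline
        for baseline, candidate in zip(baseline_scores, candidate_scores)
        if baseline != 0  # skip pairs where the relative change is undefined
    ]
    return median(changes)


# Hypothetical context F1 scores for three matched configurations (not the article's data).
cell = median_relative_change([0.2, 0.4, 0.5], [0.3, 0.6, 1.0])
print(f"{cell:.0%}")
```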

&lt;h3&gt;
  
  
  Extraction
&lt;/h3&gt;

&lt;p&gt;Our suspicion was that extraction would play a significant role in the results. After all, if the wrong text is extracted from a PDF, or it is extracted in a confusing manner there isn’t much that can be done to retrieve the right content.&lt;/p&gt;

&lt;p&gt;The results largely supported this. In this visualization you can see that switching from &lt;code&gt;pymupdf&lt;/code&gt; to &lt;code&gt;unstructured&lt;/code&gt; produces an 87% increase in f1 score, and changing from &lt;code&gt;pypdf&lt;/code&gt; to &lt;code&gt;unstructured&lt;/code&gt; produces a 72% increase.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn25ug02k11329w0a9vic.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn25ug02k11329w0a9vic.png" alt="Heatmap showing median change for Extractors" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Chunking &amp;amp; Indexing
&lt;/h3&gt;

&lt;p&gt;On the indexing front, the results were somewhat surprising.&lt;br&gt;
We expected more sophisticated chunking strategies like &lt;code&gt;semantic&lt;/code&gt; to outperform the basic &lt;code&gt;recursive&lt;/code&gt; chunking. But the results suggest that &lt;code&gt;recursive&lt;/code&gt; generally outperforms the other techniques. As you can see, &lt;code&gt;recursive&lt;/code&gt; indexing produces context f1 scores 81% better than &lt;code&gt;semantic&lt;/code&gt;, and 56% better than parent indexing.&lt;/p&gt;

&lt;p&gt;Generating the questions-answered is a relatively expensive process (applying an LLM during indexing) and didn’t seem to significantly improve upon the baseline chunking.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftue9314a9rrea9x6vgox.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftue9314a9rrea9x6vgox.png" alt="Heatmap showing median change for Indexing" width="800" height="303"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the future, we’d like to rerun these experiments on larger datasets to see if these advanced indexing techniques become more important in different cases. Until then, this suggests that for simple RAG use cases, simple indexing strategies like &lt;code&gt;recursive&lt;/code&gt; work well.&lt;/p&gt;
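
&lt;p&gt;For readers unfamiliar with recursive chunking, the idea can be sketched as follows: split on the coarsest separator first and fall back to finer separators only for pieces that are still too long. This is a simplified illustration with no chunk overlap, not LangChain's implementation, and &lt;code&gt;recursive_split&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters."""
    if chunk_size >= len(text):
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    head, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(head):
        candidate = piece if not current else current + head + piece
        if chunk_size >= len(candidate):
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
        if chunk_size >= len(piece):
            current = piece
        else:
            # Still too long: retry this piece with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("aaa bbb ccc\n\nddd eee", chunk_size=7))
```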

&lt;h3&gt;
  
  
  Search
&lt;/h3&gt;

&lt;p&gt;Many vector store retrievers support basic similarity search as well as “maximum marginal relevance” (MMR) search. MMR is a technique to retrieve more diverse chunks in the hope of producing richer contexts. We expected MMR to outperform similarity, but in our experiments we saw the opposite: MMR reduced f1 scores by 13%.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cjoz221mjhjb1imfuy5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9cjoz221mjhjb1imfuy5.png" alt="Heatmap showing median change for Search" width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our hypothesis is that because the dataset has a single PDF, there isn’t much benefit to increased diversity. This is another experiment we’d like to re-run on different datasets in the future.&lt;/p&gt;
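
&lt;p&gt;To make the difference between the two search strategies concrete, here is a small pure-Python sketch of MMR selection. It is a simplified illustration, not a vector store's implementation; the vectors are made-up two-dimensional "embeddings", and the &lt;code&gt;lambda_mult&lt;/code&gt; parameter name is an assumption modeled on common usage.&lt;/p&gt;

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query, candidates, k, lambda_mult=0.5):
    """Return indices of k candidates chosen by maximum marginal relevance."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) != k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Penalize similarity to chunks that were already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
chunks = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]  # chunks 0 and 1 are near-duplicates
print(mmr(query, chunks, k=2, lambda_mult=1.0))  # pure similarity: [0, 1]
print(mmr(query, chunks, k=2, lambda_mult=0.3))  # favors diversity: [0, 2]
```

&lt;p&gt;With &lt;code&gt;lambda_mult=1.0&lt;/code&gt; the redundancy penalty vanishes and the result matches plain similarity search; lowering it trades some relevance for diversity, which may not help when every chunk comes from the same document.&lt;/p&gt;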

&lt;h3&gt;
  
  
  Embeddings
&lt;/h3&gt;

&lt;p&gt;On embeddings, we expected the larger (and better) embeddings from OpenAI ada-002 with 1536 dimensions to outperform &lt;a href="https://huggingface.co/BAAI/bge-small-en-v1.5/tree/main"&gt;bge-small-en-v1.5&lt;/a&gt;, which has only 384 dimensions. While there is technically a difference, it is basically insignificant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F831n1ibarnip959ym31c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F831n1ibarnip959ym31c.png" alt="Heatmap showing median change for Embeddings" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While it is important to consider (and measure!) the benefits of these techniques on your own documents and questions, the above analysis suggests that the most important thing you can do for your RAG application is ensure you’re extracting the right content. After that, relatively simple chunking, indexing, and retrieval provides good results.&lt;/p&gt;

&lt;p&gt;In the future, we’d like to re-run these experiments with a larger dataset and by feeding the entire set of documents through the models rather than comparing advanced RAG techniques to baseline answers using naive RAG.&lt;/p&gt;

&lt;p&gt;Don’t be paralyzed by all the options – nearly any choice you make will let you answer questions over your domain-specific data and significantly improve the experience of using your GenAI application. Providing the right context reduces hallucinations and lets you use GenAI to answer questions about domain-specific information.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dewykb.github.io/"&gt;Dewy&lt;/a&gt; makes this easy. It provides a RAG service with built-in indexing and retrieval built on the lessons from these experiments and more, allowing you to add documents and retrieve relevant contexts with a &lt;a href="https://dewykb.github.io/docs/api/dewy-knowledge-base-api"&gt;simple API&lt;/a&gt; and then get back to improving your GenAI application!&lt;/p&gt;

</description>
      <category>rag</category>
      <category>langchain</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
