<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: DataStax, an IBM company</title>
    <description>The latest articles on DEV Community by DataStax, an IBM company (@datastax).</description>
    <link>https://dev.to/datastax</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F4276%2Fa4eecb28-13eb-465e-8388-ff038f20507d.png</url>
      <title>DEV Community: DataStax, an IBM company</title>
      <link>https://dev.to/datastax</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/datastax"/>
    <language>en</language>
    <item>
      <title>Improve Your Python Search Relevancy with Astra DB Hybrid Search</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 30 Apr 2025 00:44:04 +0000</pubDate>
      <link>https://dev.to/datastax/improve-your-python-search-relevancy-with-astra-db-hybrid-search-11d6</link>
      <guid>https://dev.to/datastax/improve-your-python-search-relevancy-with-astra-db-hybrid-search-11d6</guid>
      <description>&lt;p&gt;Astra DB now supports hybrid search, which can increase the accuracy of your search by up to 45%. It does this by performing both vector search and BM25 keyword search and then reranking the results from both to return the most relevant results.&lt;/p&gt;

&lt;p&gt;In this post, we'll take a look at how to use Astra DB Hybrid Search in Python.&lt;/p&gt;

&lt;h2&gt;What is hybrid search?&lt;/h2&gt;

&lt;p&gt;Before we get to the code, let's go over what hybrid search actually is and why it helps. You would typically build a retrieval-augmented generation (RAG) app by &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-python?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt; for your unstructured content and storing them in a database. Then, when a user makes a query, you turn the query into a vector embedding and use it to perform a similarity search to return relevant context that you can provide to a large language model (LLM) to generate an answer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphtg571c6x9xj038bun8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphtg571c6x9xj038bun8.png" alt="A flow chart showing the ingestion and generation phases of a RAG application." width="680" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The more accurate and relevant your search results from your database are, the better your RAG application will be. With better context, there’s less opportunity for the LLM to return inaccurate or hallucinated responses.&lt;/p&gt;

&lt;p&gt;To improve on the relevancy of this system, we need to focus on the search element. Vector search is great at understanding context and meaning, but it can miss results that would be returned from a keyword match. Meanwhile, keyword search can be restrictive as it doesn't understand context. Performing both searches gives us the best chance of returning the top results, but you then need to combine those results so you can pass them to an LLM. This is where reranking comes in.&lt;/p&gt;

&lt;p&gt;Reranking is performed by another machine learning model, a cross-encoder, that scores relevance more accurately because it reads the original query and the document together to produce the score. You can't use a reranking model as your primary search because that would require scoring every document in your database against the query every time; for a small subset of your data, however, this is achievable.&lt;/p&gt;

&lt;p&gt;You can actually use a reranker to help improve vector search results, by returning more results than required, reranking to adjust the order, then returning the top results.&lt;/p&gt;
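
&lt;p&gt;As a rough illustration of that over-fetch-then-rerank pattern, here is a minimal sketch in plain Python. The &lt;code&gt;rerank_top_k&lt;/code&gt; helper and the word-overlap &lt;code&gt;overlap_score&lt;/code&gt; function are hypothetical names invented for this example; a real reranker would be a cross-encoder model, not a word count.&lt;/p&gt;

```python
def overlap_score(query, document):
    """Toy stand-in for a cross-encoder: count the words shared by query and document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words.intersection(document_words))


def rerank_top_k(query, candidates, score, k):
    """Score every candidate against the query and keep the k highest-scoring ones."""
    ranked = sorted(candidates, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


# Pretend a vector search over-fetched three candidates when we only need two.
candidates = [
    "steak house and grill",
    "fresh salads and grain bowls",
    "late night salads bar",
]
top_two = rerank_top_k("fresh salads", candidates, overlap_score, k=2)
print(top_two)
```

&lt;p&gt;The scoring function is passed in as an argument, so the same shape would work if you swapped the toy scorer for a call to an actual reranking model.&lt;/p&gt;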

&lt;p&gt;In hybrid search, we use reranking to rescore the combination of results from the vector and keyword searches and pick the top, most relevant results from the output.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht94d2uo6qtox7fjr6dr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fht94d2uo6qtox7fjr6dr.png" alt="A flow chart showing how hybrid search works. From a database a vector search is performed and returns some results, and a keyword search is performed and returns different results. The results are then passed through a reranker and the output is a mix of the results representing the most relevant results." width="440" height="651"&gt;&lt;/a&gt;&lt;/p&gt;
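
&lt;p&gt;One common way to merge two ranked lists into a single candidate list is reciprocal rank fusion (RRF), and an &lt;code&gt;$rrf&lt;/code&gt; score appears in the hybrid search output later in this post. The sketch below shows the general technique only. The function name, the document IDs, and the constant &lt;code&gt;k=60&lt;/code&gt; are assumptions for illustration, not a description of how Astra DB is configured.&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document IDs into a dict of fused scores.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is a conventional default and an assumption here.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


# Hypothetical result lists: "green_leaf" is 4th by vector search, 2nd by keyword search.
vector_results = ["fusion", "bistro", "grill", "green_leaf"]
keyword_results = ["salad_bar", "green_leaf", "bistro"]

fused = reciprocal_rank_fusion([vector_results, keyword_results])
best_first = sorted(fused, key=fused.get, reverse=True)
print(best_first)
print(round(fused["green_leaf"], 8))
```

&lt;p&gt;With a vector rank of 4 and a keyword rank of 2, this formula gives roughly 0.03175, which lines up with the &lt;code&gt;$rrf&lt;/code&gt; value in the sample output further down. That match is suggestive rather than confirmation of the exact calculation used.&lt;/p&gt;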

&lt;p&gt;Astra DB can now perform hybrid search by combining vector search and BM25 keyword search, then reranking using the &lt;a href="https://developer.nvidia.com/nemo-retriever/" rel="noopener noreferrer"&gt;NVIDIA NeMo Retriever&lt;/a&gt; reranking microservices (including the &lt;a href="https://build.nvidia.com/nvidia/llama-3_2-nv-rerankqa-1b-v2/modelcard" rel="noopener noreferrer"&gt;nvidia/llama-3.2-nv-rerankqa-1b-v2 reranking model&lt;/a&gt;). Let's take a look at how to use Astra DB hybrid search to improve search relevancy in your Python application.&lt;/p&gt;

&lt;h2&gt;Hybrid Search in Python with Astra DB&lt;/h2&gt;

&lt;p&gt;Let's start by creating a database in your DataStax account. While it’s provisioning, let's get our coding environment set up.&lt;/p&gt;

&lt;p&gt;To use Hybrid Search in Python, you’ll need to install version 2 of &lt;a href="https://github.com/datastax/astrapy" rel="noopener noreferrer"&gt;astrapy&lt;/a&gt; as well as &lt;a href="https://github.com/theskumar/python-dotenv" rel="noopener noreferrer"&gt;python-dotenv&lt;/a&gt; so that you can load environment variables from an &lt;em&gt;.env&lt;/em&gt; file. Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"astrapy&amp;gt;=2.0,&amp;lt;3.0"&lt;/span&gt; python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file called &lt;em&gt;.env&lt;/em&gt;, add your database API endpoint and access token, and choose a name for your collection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Creating a collection for hybrid search&lt;/h3&gt;

&lt;p&gt;Once the database is created, we'll need to create a collection to store our data in. We'll do this in code, because we want to configure some settings that aren't yet available in the dashboard.&lt;/p&gt;

&lt;p&gt;Create a file called &lt;em&gt;create_collection.py&lt;/em&gt; and add this code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy.info&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CollectionDefinition&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy.constants&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VectorMetric&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;collection_definition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;CollectionDefinition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_dimension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;VectorMetric&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOT_PRODUCT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_vector_service&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nvidia&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NV-Embed-QA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_lexical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tokenizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;standard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;args&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filters&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lowercase&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;porterstem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asciifolding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;definition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;collection_definition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code, we build a definition for our collection and then create the collection from it. The definition describes how the collection should generate vectors for our data and how it should analyze text for keyword search.&lt;/p&gt;

&lt;p&gt;For vector search, we are using &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt; with the built-in NVIDIA NeMo Retriever NV-Embed-QA model to create vector embeddings on insert and search. The model creates vectors with 1024 dimensions, and we configure the collection to use the &lt;a href="https://docs.datastax.com/en/dse/6.9/get-started/vector-concepts.html#dot-product-metric?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;dot product&lt;/a&gt; to calculate similarity between vectors.&lt;/p&gt;

&lt;p&gt;By default, keyword search performs exact keyword matching, but we can tune this with the lexical settings in the collection definition. First, we define the tokenizer, which is how the collection breaks the text up into words. We'll use the standard tokenizer, which splits on word boundaries and strips out punctuation. We then add filters, which transform the text to make it easier to match searches. In this case, we add four filters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;lowercase -&lt;/strong&gt; converts all the text to lowercase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stop -&lt;/strong&gt; removes English stop words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;porterstem -&lt;/strong&gt; applies the Porter Stemming algorithm for English, which translates different forms of words to a common stem, e.g. "search", "searches", and "searched" will all translate to the token "search"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;asciifolding -&lt;/strong&gt; converts characters to ASCII; that is, it turns accented characters into their ASCII equivalent if one exists, e.g. "café" becomes "cafe"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that both the stop and porterstem filters are specific to English texts.&lt;/p&gt;

&lt;p&gt;You can choose to include the filters that will work best for your data. There is more on the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/analyzers.html#supported-built-in-analyzers?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;available filters and links to further information in the Astra DB documentation&lt;/a&gt;.&lt;/p&gt;
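
&lt;p&gt;To get a feel for what that analyzer chain does to a piece of text, here is a small standalone sketch in plain Python. It is only an approximation: the stop word list is a tiny made-up subset, &lt;code&gt;naive_stem&lt;/code&gt; is a crude suffix stripper standing in for the real Porter algorithm, and all of the function names are hypothetical. None of this is Astra DB code.&lt;/p&gt;

```python
import re
import unicodedata

# Tiny illustrative stop list; real analyzers ship a much longer one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def naive_stem(token):
    """Very rough stand-in for Porter stemming: strip one common English suffix."""
    for suffix in ("es", "ed", "s"):
        if token.endswith(suffix):
            return token[: -len(suffix)]
    return token


def ascii_fold(token):
    """Drop accents by decomposing characters and removing the combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Tokenize, then apply lowercase, stop, stem, and ASCII-folding steps in order."""
    tokens = re.findall(r"\w+", text)                    # split into words, drop punctuation
    tokens = [t.lower() for t in tokens]                 # lowercase
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop
    tokens = [naive_stem(t) for t in tokens]             # porterstem (approximation)
    return [ascii_fold(t) for t in tokens]               # asciifolding


print(analyze("The caf\u00e9 searches and searched for salads."))
```

&lt;p&gt;The example sentence reduces to the tokens &lt;code&gt;cafe&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt; and &lt;code&gt;salad&lt;/code&gt;, which is why a query for "salad" can match a document that only mentions "salads".&lt;/p&gt;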

&lt;p&gt;Run the script with &lt;code&gt;python create_collection.py&lt;/code&gt; to create the collection. With the collection in place, we can ingest some data to search against.&lt;/p&gt;

&lt;h3&gt;Indexing data for hybrid search&lt;/h3&gt;

&lt;p&gt;Save this &lt;a href="https://gist.github.com/philnash/2525bb35cc83f7ec544c02e91fc3231f" rel="noopener noreferrer"&gt;list of made-up restaurant descriptions&lt;/a&gt;, which we'll use as our example data, as a JSON file called &lt;em&gt;restaurants.json&lt;/em&gt;. Then create a new file called &lt;em&gt;ingest.py&lt;/em&gt; and add the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;restaurants.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;restaurant_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;restaurant_data&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code, we load the restaurant descriptions and insert each one as a document in Astra DB, passing the description as the &lt;code&gt;$hybrid&lt;/code&gt; property. Creating documents with the &lt;code&gt;$hybrid&lt;/code&gt; property does two things.&lt;/p&gt;

&lt;p&gt;It will use the NVIDIA NeMo Retriever embedding model that we configured when we created the collection to create vector embeddings of the content. This is the same as using Astra Vectorize to generate embeddings.&lt;/p&gt;

&lt;p&gt;It will also index the text for the new BM25 keyword search.&lt;/p&gt;

&lt;p&gt;Run the code with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python ingest.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check your collection in the DataStax dashboard; each document should have both &lt;code&gt;$vectorize&lt;/code&gt; and &lt;code&gt;$lexical&lt;/code&gt; properties.&lt;/p&gt;
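
&lt;p&gt;One way to picture what happened on insert is that a string passed as &lt;code&gt;$hybrid&lt;/code&gt; ends up stored under both &lt;code&gt;$vectorize&lt;/code&gt; and &lt;code&gt;$lexical&lt;/code&gt;. The sketch below mimics that effect on a plain dictionary. The &lt;code&gt;expand_hybrid&lt;/code&gt; helper and the &lt;code&gt;city&lt;/code&gt; field are hypothetical and exist only for this illustration; the real work happens inside Astra DB, not in your Python code.&lt;/p&gt;

```python
def expand_hybrid(document):
    """Return a copy of the document with a "$hybrid" string copied to both search fields."""
    expanded = {key: value for key, value in document.items() if key != "$hybrid"}
    hybrid_text = document.get("$hybrid")
    if isinstance(hybrid_text, str):
        expanded["$vectorize"] = hybrid_text  # text that gets embedded for vector search
        expanded["$lexical"] = hybrid_text    # text that gets indexed for keyword search
    return expanded


stored = expand_hybrid({"$hybrid": "The Green Leaf Eatery: fresh salads", "city": "Lisbon"})
print(stored)
```

&lt;p&gt;Other fields on the document pass through untouched, so you can still store metadata alongside the searchable text.&lt;/p&gt;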

&lt;h3&gt;Performing a hybrid search&lt;/h3&gt;

&lt;p&gt;Having indexed the data using &lt;code&gt;$hybrid&lt;/code&gt;, we can now perform vector and hybrid searches against this collection. Create a file called &lt;em&gt;search.py&lt;/em&gt; and enter the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salads&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;projection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will perform a vector search on the collection using Astra Vectorize when you run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python search.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run this search you will see five results. In position four is "The Green Leaf Eatery," which is the most "salads" sounding place on the list to me. Positions one and two do mention salads, and, because it is vector search and not keyword search, position three, "Fusion Flavors Bistro," doesn't mention salads at all.&lt;/p&gt;

&lt;p&gt;Now, let's update the search to use Hybrid Search and perform reranking on the results. You will need to change the find method to the new &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#find-documents-with-a-hybrid-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;&lt;code&gt;find_and_rerank&lt;/code&gt; method&lt;/a&gt; and pass &lt;code&gt;{"$hybrid": query}&lt;/code&gt; as your sort field. You can add other arguments too, like &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#limit-the-number-of-documents-returned-by-the-underlying-searches?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;hybrid_limits, which sets the number of documents to retrieve from each inner query before reranking&lt;/a&gt;, and &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/find-and-rerank.html#include-the-scores-in-the-response?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;include_scores, which shows the various scores used to rank documents along the way&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salads&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;projection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;include_scores&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you will see these results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Green Leaf Eatery: A bright and airy vegetarian and vegan restaurant focusing on fresh, seasonal produce. Their innovative menu features creative plant-based dishes, from vibrant salads and grain bowls to hearty vegetable curries and decadent vegan desserts. It's a celebration of healthy and delicious eating."&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-3.6972656&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.67285335&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.03175403&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bohemian&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Brew&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bites:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;quirky&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;eclectic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;cafe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;offering&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;relaxed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;atmosphere&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;diverse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Enjoy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;gourmet&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sandwiches&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;artisanal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bread&lt;/span&gt;&lt;span 
class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creative&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;salads&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;house-made&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dressings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;selection&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;globally&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;inspired&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;small&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;plates.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;extensive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;coffee&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;craft&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;beer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;makes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;perfect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="err"&gt;spot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;casual&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bite&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;leisurely&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;hangout.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-4.5507812&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.6813005&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.032002047&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Olive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Grove&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Mediterranean:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Transport&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;yourself&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sunny&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;shores&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Mediterranean&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;at&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;charming&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;features&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavorful&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Greek&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="err"&gt;Turkish&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dishes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;grilled&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;kebabs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;savory&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;spanakopita&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creamy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;hummus&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;vibrant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;salads.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Enjoy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;fresh&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;herbs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;olive&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;oil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sun-drenched&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavors.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-5.1210938&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.68404347&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.032786883&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Fusion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flavors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Bistro:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;contemporary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;creatively&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;blends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;different&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;culinary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;traditions.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Expect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;unexpected&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;exciting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavor&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;combinations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;innovative&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;presentations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;constantly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evolves.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;This&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;place&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;adventurous&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;palates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seeking&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;unique&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dining&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;experience.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-11.375&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.67336804&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.015873017&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$vectorize':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Farmhouse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Kitchen:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;rustic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;charming&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;restaurant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;celebrating&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bounty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;local&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;farm.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Their&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;menu&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;changes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;seasonally&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;featuring&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;dishes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;made&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;freshest&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ingredients&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sourced&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;directly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;nearby&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;farms.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Expect&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;simple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;yet&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;elegant&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;preparations&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;highlight&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;natural&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;flavors&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ingredients.'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'$rerank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-11.375&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vector':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.6582356&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$vectorRank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$bm&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="err"&gt;Rank':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'$rrf':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.014925373&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this output, you can see the results along with the scores that were used to rank them. "The Green Leaf Eatery" now ranks first on the list, even though vector search ranked it fourth and BM25 search ranked it second. The reranker lifted it up to first place.&lt;/p&gt;

&lt;p&gt;There are other similar movements in the list. The fifth position holds a restaurant that vector search initially ranked seventh and whose description doesn't contain the search term "salads." Hybrid Search initially retrieves more results than we need (here, up to 10 from each search, as set by &lt;code&gt;hybrid_limits&lt;/code&gt;), reranks them, and then returns the most relevant, so this result was lifted into a position to be returned. Positions four and five also received the same rerank score, so their order was decided by one more calculated score: reciprocal rank fusion (RRF). &lt;a href="https://www.datastax.com/blog/reranker-algorithm-showdown-vector-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;RRF isn't great for reranking&lt;/a&gt;, but it is very quick, so it is useful for breaking ties here.&lt;/p&gt;
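
&lt;p&gt;To make the tie-break concrete, here is a minimal sketch of how an RRF score can be computed from the vector and BM25 ranks. It assumes the conventional constant &lt;code&gt;k = 60&lt;/code&gt;, which this post does not state but which reproduces the &lt;code&gt;$rrf&lt;/code&gt; values printed above. The &lt;code&gt;rrf_score&lt;/code&gt; helper is hypothetical and is not part of the Astra DB client.&lt;/p&gt;

```python
# Reciprocal rank fusion (RRF) sketch.
# ASSUMPTION: k = 60 is the conventional default, not a value confirmed by
# the post. It happens to reproduce the $rrf scores shown in the output.
RRF_K = 60


def rrf_score(ranks, k=RRF_K):
    """Sum 1 / (k + rank) over every search that ranked the document.

    A rank of None means that search did not return the document,
    so it contributes nothing to the score.
    """
    return sum(1.0 / (k + rank) for rank in ranks if rank is not None)


# (vector rank, BM25 rank) pairs copied from the output, in returned order.
ranks_in_output_order = [(4, 2), (2, 3), (1, 1), (3, None), (7, None)]

for vector_rank, bm25_rank in ranks_in_output_order:
    score = rrf_score([vector_rank, bm25_rank])
    print(vector_rank, bm25_rank, round(score, 6))
```

&lt;p&gt;Under that assumption, the last two rows have no BM25 rank, so their scores come from the vector rank alone. The document ranked third by vector search scores higher than the one ranked seventh, which matches the order in which the two tied results were returned.&lt;/p&gt;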

&lt;p&gt;Try running vector and hybrid searches with other search terms to get a feel for the results. In our testing, we’ve seen &lt;a href="https://www.datastax.com/blog/introducing-astra-db-hybrid-search?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Hybrid Search improve relevance by up to 45%&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Next, we'll take a look at a couple of other things you will need to consider when using Hybrid Search.&lt;/p&gt;

&lt;h2&gt;
  
  
  Providing your own vectors
&lt;/h2&gt;

&lt;p&gt;The example above used Astra Vectorize to automatically create vector embeddings, but you can always use a different model and provide your own vectors.&lt;/p&gt;

&lt;p&gt;If you do use your own vector embedding model, then you will need to provide both the vector and the text that will be indexed for keyword search. You can do this with the special property &lt;code&gt;$lexical&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Imagine you have a function called &lt;code&gt;create_embedding&lt;/code&gt; that &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-python?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creates a vector embedding&lt;/a&gt; from a piece of text. You might then ingest the data like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;restaurants.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;restaurant_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;restaurant_data&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_many&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, when you perform a hybrid search, you need to provide a &lt;code&gt;$vector&lt;/code&gt; with which to search. Also, the default property on which the content is reranked is &lt;code&gt;$vectorize&lt;/code&gt;, so you need to tell the database which property to rerank on too.&lt;/p&gt;

&lt;p&gt;You also need to set the query that you want to use to perform the reranking. It can be the same query that you use for the vector search and the keyword search, or something else. You can see more about using different searches below.&lt;/p&gt;

&lt;p&gt;You can define the query with the &lt;code&gt;rerank_query&lt;/code&gt; argument and the field on which to perform the reranking with the &lt;code&gt;rerank_on&lt;/code&gt; argument. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;salad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;rerank_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rerank_on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Performing different searches
&lt;/h2&gt;

&lt;p&gt;You can also use different terms to perform your initial searches. This is useful because BM25 keyword search acts as a filter on the query keywords.&lt;/p&gt;

&lt;p&gt;In our Hybrid Search example above, only three restaurant descriptions mentioned "salads" so only three results had a &lt;code&gt;$bm25Rank&lt;/code&gt; in the results.&lt;/p&gt;

&lt;p&gt;That worked fine for our example, but when we're dealing with a RAG application, the search queries are often in natural language rather than keyword focused. We already set up our collection to use word stems and translate accented characters into ASCII. You may also want to perform keyword extraction on the user query, using &lt;a href="https://spotintelligence.com/2022/12/13/keyword-extraction/" rel="noopener noreferrer"&gt;something like NLTK, spaCy, or KeyBERT&lt;/a&gt;, so you can then use the keywords for the lexical search. This would look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m looking for a restaurant that serves the best salad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_and_rerank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;create_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$lexical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;extract_keywords&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;rerank_query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rerank_on&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hybrid_limits&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code above performs the vector search using an embedding from your own model, runs the keyword search with keywords extracted from the user query, and then reranks the combined results against the original query.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try hybrid search for better search and RAG relevancy
&lt;/h2&gt;

&lt;p&gt;Combining vector search with keyword search and a reranking model like NVIDIA NeMo Retriever nvidia/llama-3.2-nv-rerankqa-1b-v2 produces more relevant results, improving the output of your RAG application. You can get started with hybrid search and reranking in Astra DB today by &lt;a href="https://astra.datastax.com/?utm_medium=byline&amp;amp;utm_campaign=improve-python-search-relevance-with-astra-db-hybrid-search&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;signing up&lt;/a&gt; and using AstraPy or with &lt;a href="https://docs.langflow.org/components-vector-stores#hybrid-search" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to chat more about improving retrieval accuracy, drop into the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt; or drop me an email at &lt;a href="mailto:phil.nash@datastax.com"&gt;phil.nash@datastax.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>python</category>
      <category>vectordatabase</category>
      <category>rag</category>
      <category>genai</category>
    </item>
    <item>
      <title>Build a RAG Chat App with Firebase Genkit and Astra DB</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 16 Apr 2025 04:17:10 +0000</pubDate>
      <link>https://dev.to/datastax/build-a-rag-chat-app-with-firebase-genkit-and-astra-db-31n1</link>
      <guid>https://dev.to/datastax/build-a-rag-chat-app-with-firebase-genkit-and-astra-db-31n1</guid>
      <description>&lt;p&gt;Today &lt;a href="https://www.datastax.com/blog/introducing-astra-db-plugin-for-firebase-genkit?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;we announced the release of a plugin for Firebase's Genkit framework&lt;/a&gt; for building generative AI applications. &lt;a href="https://firebase.google.com/docs/genkit" rel="noopener noreferrer"&gt;Genkit&lt;/a&gt; is a powerful framework that provides the primitives for building production-quality GenAI applications. From easy access to models, prompts, indexers, and retrievers, to more advanced features like flows, traces, and evals, its power lies in making it easy to do the right thing while building GenAI applications.&lt;/p&gt;

&lt;p&gt;In this post, we'll take a look at how to use the Astra DB plugin for Genkit to build a &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;retrieval-augmented generation application&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/D7MqKnKtpnE"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a RAG application
&lt;/h2&gt;

&lt;p&gt;Let's build a RAG application from scratch and see how straightforward it can be with Genkit and Astra DB. First, you'll need a Gemini API key, which you can get from &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You’ll also need an Astra DB database to store your data and vectors; if you don't already have an account you can sign up for &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;a free DataStax account&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Start by creating a new Astra DB database; give it a name and choose a cloud and region. This takes a couple of minutes, so carry on with the next steps while it starts up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxfptu0z6qyt393d7ai6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxfptu0z6qyt393d7ai6.png" alt="The " width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting up the app
&lt;/h3&gt;

&lt;p&gt;Create a directory for your app and install the dependencies you'll need:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;genkit-astra-db-rag
&lt;span class="nb"&gt;cd &lt;/span&gt;genkit-astra-db-rag
npm init &lt;span class="nt"&gt;--yes&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;genkit @genkit-ai/googleai genkitx-astra-db
npm &lt;span class="nb"&gt;install &lt;/span&gt;genkit-cli tsx &lt;span class="nt"&gt;-D&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file to work in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;touch &lt;/span&gt;index.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open &lt;em&gt;index.ts&lt;/em&gt; and import the dependencies you installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;genkit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Document&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;genkit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;textEmbedding004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;googleAI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gemini20Flash&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@genkit-ai/googleai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;astraDBIndexerRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;astraDBRetrieverRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;astraDB&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;genkitx-astra-db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, we're pulling in &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api" rel="noopener noreferrer"&gt;Google's text-embedding-004 model&lt;/a&gt; for creating vector embeddings, and the &lt;a href="https://deepmind.google/technologies/gemini/flash/" rel="noopener noreferrer"&gt;Gemini 2.0 Flash model&lt;/a&gt; for generation.&lt;/p&gt;

&lt;p&gt;It's about time to create a collection in which we can store our vectors. Hopefully your database has been created now, so head to the &lt;a href="https://astra.datastax.com/?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;DataStax dashboard&lt;/a&gt;, choose your database, open the Data Explorer, and create a collection. Give the collection a name and choose "Bring my own" for the embedding generation method. The text-embedding-004 model creates vectors with 768 dimensions (though you can choose fewer), so enter 768 for the number of dimensions and choose "Cosine" for the similarity metric.&lt;/p&gt;
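&lt;p&gt;For some intuition about what the "Cosine" metric compares, here is a minimal sketch. The &lt;code&gt;cosineSimilarity&lt;/code&gt; function is a hypothetical helper written for this illustration; it is not part of Genkit or Astra DB, and the database computes similarity for you when you search.&lt;/p&gt;

```typescript
// Hypothetical helper for illustration only; Astra DB computes similarity itself.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, i) => {
    dot += value * b[i];
    normA += value * value;
    normB += b[i] * b[i];
  });
  if (normA === 0 || normB === 0) {
    // Convention chosen for this sketch: a zero vector has no direction.
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // same direction: 1
console.log(cosineSimilarity([1, 0], [0, 1])); // orthogonal: 0
```

&lt;p&gt;The length check is why the collection's dimension setting has to match the embedding model: vectors of different lengths cannot be compared.&lt;/p&gt;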

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uhm65n6ylws36uhkm0j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uhm65n6ylws36uhkm0j.png" alt="The " width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you've created the collection, you'll need three things: the API endpoint of the database, the collection name, and an API token, which you can generate from the dashboard.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa2a5citkb3zsvati0aq7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa2a5citkb3zsvati0aq7.png" alt="The database Overview tab. Highlighted on the right hand side of the page are the Database Details including the API endpoint a button for generating an Application Token." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With those, create a &lt;em&gt;.env&lt;/em&gt; file in your application and enter the credentials:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the same &lt;em&gt;.env&lt;/em&gt; file, enter your API key from AI Studio:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can configure Genkit. In &lt;em&gt;index.ts&lt;/em&gt; create the &lt;code&gt;ai&lt;/code&gt; object like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;genkit&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;googleAI&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;astraDB&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;clientParams&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;applicationToken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;apiEndpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;collectionName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;embedder&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;textEmbedding004&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]),&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sets up Genkit with the Google AI plugin for models and embeddings and the Astra DB plugin, configured with the credentials to access the collection you just created and the vector embedding model &lt;em&gt;text-embedding-004&lt;/em&gt;.&lt;/p&gt;
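&lt;p&gt;The configuration above uses the non-null assertion (&lt;code&gt;!&lt;/code&gt;) on each environment variable, so a missing value only shows up later as a runtime error. One option is to fail fast with a small check. This is a sketch only: &lt;code&gt;requireEnv&lt;/code&gt; is a hypothetical helper, not part of Genkit or the Astra DB plugin, and the values below are placeholders.&lt;/p&gt;

```typescript
// Hypothetical helper for illustration; not part of Genkit or genkitx-astra-db.
type Env = { [name: string]: string | undefined };

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In index.ts you would pass process.env; a literal object with placeholder
// values keeps this sketch standalone.
const exampleEnv: Env = {
  ASTRA_DB_API_ENDPOINT: "https://example.invalid",
  ASTRA_DB_COLLECTION_NAME: "genkit_docs",
};

console.log(requireEnv(exampleEnv, "ASTRA_DB_COLLECTION_NAME"));

try {
  requireEnv(exampleEnv, "ASTRA_DB_APPLICATION_TOKEN");
} catch (error) {
  // The token is absent from exampleEnv, so the helper reports it by name.
  console.log((error as Error).message);
}
```

&lt;p&gt;In the app itself this could look like &lt;code&gt;requireEnv(process.env, "ASTRA_DB_COLLECTION_NAME")&lt;/code&gt; in place of the &lt;code&gt;!&lt;/code&gt; assertion.&lt;/p&gt;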

&lt;p&gt;We can now access the Astra DB indexer and retriever via the reference functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;astraDBIndexer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;astraDBIndexerRef&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;astraDBRetriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;astraDBRetrieverRef&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;collectionName&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The indexer is used to store documents in the collection and the retriever is used to perform vector search to return documents from the collection.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingesting data
&lt;/h2&gt;

&lt;p&gt;Now we can ingest some data into Astra DB. For this RAG application, let's grab data from the web. To ingest web data, we'll need to fetch it from a URL and then extract the main content from the returned HTML. I've written before about how I like to &lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;use Readability.js to parse out the content from a page&lt;/a&gt;, so we'll follow that approach. We'll also need something to &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_campaign=rag-powered-chat-bot-genkit-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;turn the content into chunks&lt;/a&gt;; let's use &lt;a href="https://www.npmjs.com/package/llm-chunk" rel="noopener noreferrer"&gt;llm-chunk&lt;/a&gt; for this, as it's relatively simple.&lt;/p&gt;

&lt;p&gt;Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @mozilla/readability jsdom llm-chunk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Import them at the top of the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
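&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Readability&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mozilla/readability&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSDOM&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jsdom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;llm-chunk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;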
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Write a function that takes a URL, fetches the HTML, extracts the main text content and returns it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;fetchTextFromWeb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next thing to do is write our first Genkit flow to ingest data from a URL into the collection. Flows are functions that you can run via the Genkit UI or through code. Flows have strongly defined input and output schemas using &lt;a href="https://zod.dev/" rel="noopener noreferrer"&gt;zod&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For this flow we'll accept a string which is a URL. There's no need for an output as the function will just end when it completes successfully.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;indexWebPage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineFlow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;indexPage&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;URL&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;extract-text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;fetchTextFromWeb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunk-it&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;minLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxLength&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;overlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;indexer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;astraDBIndexer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ingestion pipeline is nice and easy to read as a flow. And using &lt;code&gt;ai.run&lt;/code&gt; around the non-Genkit functions provides an extra level of tracing that we'll be able to see later.&lt;/p&gt;
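&lt;p&gt;To see what the &lt;code&gt;maxLength&lt;/code&gt; and &lt;code&gt;overlap&lt;/code&gt; options in the flow are getting at, here is a deliberately naive sketch of fixed-size chunking with overlap. It is not how llm-chunk is implemented; &lt;code&gt;chunkWithOverlap&lt;/code&gt; is a hypothetical function that slices by character count only, to show how neighbouring chunks share text.&lt;/p&gt;

```typescript
// Hypothetical, naive chunker for illustration; llm-chunk works differently.
function chunkWithOverlap(text: string, maxLength: number, overlap: number): string[] {
  if (!(maxLength > 0) || 0 > overlap || overlap >= maxLength) {
    throw new Error("maxLength must be positive and overlap must be non-negative and smaller than maxLength");
  }
  const chunks: string[] = [];
  const step = maxLength - overlap; // how far the window advances each time
  let start = 0;
  while (text.length > start) {
    chunks.push(text.slice(start, start + maxLength));
    if (start + maxLength >= text.length) {
      break; // the window has reached the end of the text
    }
    start += step;
  }
  return chunks;
}

// Each chunk repeats the last two characters of the previous one.
console.log(chunkWithOverlap("abcdefghij", 4, 2));
// [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

&lt;p&gt;The shared text at each boundary means a sentence that straddles two chunks is still likely to appear whole in at least one of them, which tends to help retrieval.&lt;/p&gt;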

&lt;h3&gt;
  
  
  The Genkit UI
&lt;/h3&gt;

&lt;p&gt;This seems like a good time to test out what we've built so far. Open &lt;em&gt;package.json&lt;/em&gt; and add a script to run your application code and one to start the Genkit server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tsx --env-file .env ./index.ts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"genkit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"genkit start -- npm start"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can run &lt;code&gt;npm run genkit&lt;/code&gt; and open the Genkit UI in your browser at &lt;a href="http://localhost:4000/" rel="noopener noreferrer"&gt;localhost:4000&lt;/a&gt;. You can either find your flow on the dashboard or by clicking on &lt;em&gt;Flows&lt;/em&gt; in the sidebar and then selecting it from the list.&lt;/p&gt;

&lt;p&gt;This gives you a box in which to enter the input. The expected input follows the input schema we defined for the flow; in this case, it's just a string that’s a URL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoabu5vbkb2fgy7ixoyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoabu5vbkb2fgy7ixoyj.png" alt="The Genkit UI showing the input JSON that you can enter to run the flow we just built as well as the schema that it accepts." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Enter a URL and run the flow. Once it's complete, you can open the DataStax dashboard and see the chunks and their vectors stored in the collection.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe144hgbblyyq7e67ktz8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe144hgbblyyq7e67ktz8.png" alt="After successfully ingesting the data through Genkit, the Data Explorer tab in the Datastax dashboard shows the Collection data including the vectors that were created." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Back in the Genkit UI you can click on &lt;em&gt;View trace&lt;/em&gt; and you’ll be shown each of the steps the flow took to fetch, chunk, embed and store the data.&lt;/p&gt;

&lt;p&gt;Head back to the Genkit dashboard and open Retrievers from the sidebar. All we did to define the available retriever was set up the Astra DB plugin and export the &lt;code&gt;astraDBRetrieverRef&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can already use that retriever from the Genkit UI. Click on the retriever and enter the following in the input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"some search term"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the options, change the property &lt;code&gt;k&lt;/code&gt; to 5. Run the retriever and it will perform a vector search using the text you provided in the input and return five results from the database.&lt;/p&gt;

&lt;p&gt;We can now hook this up with a full RAG flow, in which we first retrieve context from the database and then pass it to a model to generate a response. Open the code again and define another flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ragFlow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineFlow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;rag&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;outputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;astraDBRetriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;gemini20Flash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`
You are a helpful AI assistant that can answer questions.

Use only the context provided to answer the question.
If you don't know, do not make up an answer.

Question: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we use the retriever to search for the input string, and then pass the resulting documents, along with a prompt, to the &lt;code&gt;generate&lt;/code&gt; function, which uses the Gemini 2.0 Flash model to perform the generation.&lt;/p&gt;

&lt;p&gt;Restart the Genkit server, open up the Flows section and choose your RAG flow. You can now input a question (make sure it's relevant to the data you indexed) and Gemini will generate a response based on the retrieved docs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyntz1qacmmzilzno5iu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkyntz1qacmmzilzno5iu.png" alt="In the Genkit UI you can run the RAG flow and see the output of the full pipeline." width="800" height="727"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once again, you can hit the &lt;em&gt;View trace&lt;/em&gt; button to see what happened at each stage in this request.&lt;/p&gt;

&lt;p&gt;We've only run these flows in the Genkit interface so far, but you can also call either of them from your own code, like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;indexWebPage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;URL&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Genkit and Astra DB make RAG easy
&lt;/h2&gt;

&lt;p&gt;It took us fewer than 100 lines of code to build the two major flows required for RAG: ingestion and generation. Firebase Genkit made it easy to test our implementation as we went—without us having to build a UI for it. And the tracing in Genkit means it's easier to track down bugs in your flows.&lt;/p&gt;

&lt;p&gt;Astra DB is an easy-to-use, powerful vector database, and it's even easier to use when all you need to do is configure the plugin in Genkit and reference indexers and retrievers.&lt;/p&gt;

&lt;p&gt;You can find the &lt;a href="https://github.com/philnash/genkit-astra-db-rag" rel="noopener noreferrer"&gt;code for this app on GitHub&lt;/a&gt;. The &lt;a href="https://github.com/datastax/genkitx-astra-db" rel="noopener noreferrer"&gt;Astra DB plugin for Genkit&lt;/a&gt; is open source so if you have any issues or requests, please &lt;a href="https://github.com/datastax/genkitx-astra-db/issues" rel="noopener noreferrer"&gt;open an issue on the GitHub repo&lt;/a&gt;. And check out the &lt;a href="https://firebase.google.com/docs/genkit" rel="noopener noreferrer"&gt;Genkit docs&lt;/a&gt; for more on what you can build with Genkit.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is Astra DB?
&lt;/h3&gt;

&lt;p&gt;Astra DB is a cloud-based NoSQL document store. It features an accurate and performant vector index for storing vectors which can be used for similarity searches. It comes with a Genkit plugin for integration with the Firebase Genkit framework.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Genkit?
&lt;/h3&gt;

&lt;p&gt;Genkit is a framework for building generative AI applications. It provides essential tools such as models, prompts, indexers, retrievers, flows, traces, and evaluations. By using Genkit, developers can efficiently create applications that leverage the power of generative AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I get started with building a RAG application using Genkit and Astra DB?
&lt;/h3&gt;

&lt;p&gt;To build a RAG application with Genkit and Astra DB, you need to create a database and collection within Astra DB and then install Genkit and related dependencies into your Node.js application. Once you've configured Genkit with your Astra DB credentials, you can start creating flows.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does building the RAG application involve?
&lt;/h3&gt;

&lt;p&gt;Building the RAG application involves creating a collection in Astra DB to store vectors and setting up flows in Genkit to ingest data and to generate responses based on context retrieved from the database. You can test these flows using the Genkit UI.&lt;/p&gt;


</description>
      <category>python</category>
      <category>firebase</category>
      <category>genai</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>How to Create Vector Embeddings in Python</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 09 Apr 2025 01:26:09 +0000</pubDate>
      <link>https://dev.to/datastax/how-to-create-vector-embeddings-in-python-3am0</link>
      <guid>https://dev.to/datastax/how-to-create-vector-embeddings-in-python-3am0</guid>
      <description>&lt;p&gt;When you’re building a &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; app, the first thing you need to do is prepare your data. You need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;collect your unstructured data&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/chunking-to-get-your-data-ai-ready?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;split it into chunks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;turn those chunks into &lt;a href="https://www.datastax.com/guides/what-is-a-vector-embedding?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector embeddings&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;store the embeddings in a &lt;a href="https://www.datastax.com/products/vector-search?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector database&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are many ways that you can create vector embeddings in Python. In this post, we’ll take a look at four ways to generate vector embeddings: locally, via API, via a framework, and with &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB's Vectorize&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local vector embeddings
&lt;/h2&gt;

&lt;p&gt;There are many &lt;a href="https://huggingface.co/models?pipeline_tag=sentence-similarity&amp;amp;sort=trending" rel="noopener noreferrer"&gt;pre-trained embedding models available on Hugging Face&lt;/a&gt; that you can use to create vector embeddings. &lt;a href="https://www.sbert.net/" rel="noopener noreferrer"&gt;Sentence Transformers (SBERT)&lt;/a&gt; is a library that makes it easy to use these models for vector embedding, as well as cross-encoding for reranking. It even has tools for fine-tuning models, should you need them.&lt;/p&gt;

&lt;p&gt;You can install the library with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;sentence_transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A popular local model for vector embedding is &lt;a href="https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2" rel="noopener noreferrer"&gt;all-MiniLM-L6-v2&lt;/a&gt;. It’s trained as a good all-rounder that produces a 384-dimension vector from a chunk of text.&lt;/p&gt;

&lt;p&gt;To use it, import &lt;code&gt;SentenceTransformer&lt;/code&gt; from &lt;code&gt;sentence_transformers&lt;/code&gt; and create a model using the identifier from Hugging Face, in this case "all-MiniLM-L6-v2". If you want to use a model that isn't in the &lt;a href="https://huggingface.co/sentence-transformers/" rel="noopener noreferrer"&gt;sentence-transformers project&lt;/a&gt;, like the multilingual &lt;a href="https://huggingface.co/BAAI/bge-m3" rel="noopener noreferrer"&gt;BGE-M3&lt;/a&gt;, include the organization in the identifier, for example "BAAI/bge-m3". Once you've loaded the model, use the &lt;code&gt;encode&lt;/code&gt; method to create the vector embedding. The full code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;


&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; [ 1.95171311e-03  1.51085425e-02  3.36140348e-03  2.48030387e-02 ... ]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you pass a list of texts to the model, they’ll all be encoded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;


&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;sentences&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentences&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; [[ 0.00195174  0.01510859  0.00336139 ...  0.07971715  0.09885529  -0.01855042]
# [-0.04523939 -0.00046248  0.02036596 ...  0.08779042  0.04936493  -0.06218244]
# [-0.05453169  0.01125113 -0.00680178 ...  0.06443197  0.08771271  -0.00063468]]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are many &lt;a href="https://huggingface.co/models?pipeline_tag=feature-extraction&amp;amp;library=sentence-transformers&amp;amp;sort=trending" rel="noopener noreferrer"&gt;more models you can use to generate vector embeddings with the sentence-transformers library&lt;/a&gt; and, because you’re running locally, you can try them out to see which is most appropriate for your data. You do need to watch out for any restrictions that these models might have. For example, the all-MiniLM-L6-v2 model doesn’t produce good results for more than 128 tokens and can only handle a maximum of 256 tokens. BGE-M3, on the other hand, can encode up to 8,192 tokens. However, the BGE-M3 model is a couple of gigabytes in size and all-MiniLM-L6-v2 is under 100MB, so there are space and memory constraints to consider, too.&lt;/p&gt;
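&lt;p&gt;One rough way to stay under a model's input limit is to split long text into overlapping windows before encoding it. The sketch below is an illustration rather than a recommended chunking strategy: &lt;code&gt;split_into_windows&lt;/code&gt; is a hypothetical helper, and it counts whitespace-separated words as a stand-in for tokens. Real tokenizers often produce more tokens than words, so the word limit here is an assumption that you would keep well below the model's token limit.&lt;/p&gt;

```python
def split_into_windows(text, max_words=100, overlap=20):
    """Split text into overlapping windows of at most max_words words.

    Words are only a rough proxy for tokens (an assumption of this sketch).
    """
    if not (max_words > overlap >= 0):
        raise ValueError("need max_words > overlap >= 0")
    words = text.split()
    step = max_words - overlap  # how far each new window advances
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return windows


# A made-up 250-word document to show the window sizes.
text = " ".join(f"word{i}" for i in range(250))
windows = split_into_windows(text, max_words=100, overlap=20)
print([len(window.split()) for window in windows])  # prints [100, 100, 90]
```

&lt;p&gt;Each window could then be passed to &lt;code&gt;model.encode&lt;/code&gt;. For exact counts you would measure length with the model's own tokenizer instead of counting words.&lt;/p&gt;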

&lt;p&gt;Local embedding models like this are useful when you’re experimenting on your laptop, or if you have &lt;a href="https://www.sbert.net/docs/installation.html#install-pytorch-with-cuda-support" rel="noopener noreferrer"&gt;hardware that PyTorch can use to speed up the encoding process&lt;/a&gt;. It’s a good way to get comfortable running different models and seeing how they interact with your data.&lt;/p&gt;

&lt;p&gt;If you don't want to run your models locally, there are plenty of available APIs you can use to create embeddings for your documents.&lt;/p&gt;

&lt;h2&gt;
  
  
  APIs
&lt;/h2&gt;

&lt;p&gt;There are several services that make embedding models available as APIs. These include LLM providers like &lt;a href="https://platform.openai.com/docs/guides/embeddings" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;Google&lt;/a&gt;, or &lt;a href="https://docs.cohere.com/docs/cohere-embed" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;, as well as specialist providers like &lt;a href="https://jina.ai/embeddings/" rel="noopener noreferrer"&gt;Jina AI&lt;/a&gt; or model hosts like &lt;a href="https://docs.fireworks.ai/guides/querying-embeddings-models" rel="noopener noreferrer"&gt;Fireworks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These services expose HTTP APIs, often with a Python package that makes them easy to call. You will typically need an API key from the service. Once you have that set up, you can generate vector embeddings by sending your text to the API.&lt;/p&gt;

&lt;p&gt;For example, with Google's &lt;a href="https://ai.google.dev/gemini-api/docs/sdks" rel="noopener noreferrer"&gt;google-genai SDK&lt;/a&gt; and a Gemini API key you can generate a vector embedding with their &lt;a href="https://developers.googleblog.com/en/gemini-embedding-text-model-now-available-gemini-api/" rel="noopener noreferrer"&gt;experimental Gemini embedding model&lt;/a&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;


&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
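&lt;p&gt;The snippet above uses the string "GEMINI_API_KEY" as a placeholder for your key. A common alternative to pasting the real key into your source is to read it from an environment variable. Here is a minimal sketch using only the standard library; &lt;code&gt;require_env&lt;/code&gt; is a hypothetical helper name, and the variable name is an assumption that simply matches the placeholder.&lt;/p&gt;

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this script")
    return value


# Demo only: supply a fake value when the shell has not set the variable.
os.environ.setdefault("GEMINI_API_KEY", "demo-key-not-real")

api_key = require_env("GEMINI_API_KEY")
print(f"Loaded a key of {len(api_key)} characters")
```

&lt;p&gt;You could then write &lt;code&gt;genai.Client(api_key=require_env("GEMINI_API_KEY"))&lt;/code&gt; rather than placing the key in the source file.&lt;/p&gt;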



&lt;p&gt;Each API can be different, though many providers do make OpenAI-compatible APIs. However, each time you try a new provider you might find you have a new API to learn. Unless, of course, you try one of the available frameworks that are intended to simplify this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frameworks
&lt;/h2&gt;

&lt;p&gt;There are several projects available, like &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; or &lt;a href="https://docs.llamaindex.ai/en/stable/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;, that create abstractions over the common components of the GenAI ecosystem, including embeddings.&lt;/p&gt;

&lt;p&gt;Both LangChain and LlamaIndex have methods for creating vector embeddings via APIs or local models, all with the same interface. For example, you can create the same Gemini embedding as the code snippet above with LangChain like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a comparison, here is how you would generate an embedding using an OpenAI embeddings model and LangChain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAIEmbeddings&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text-embedding-3-small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We had to change the import, the class name, the model name, and the API key parameter, but the call to &lt;code&gt;embed_query&lt;/code&gt; is identical. This makes it easy to swap models and experiment.&lt;/p&gt;
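&lt;p&gt;Because both LangChain classes above expose the same &lt;code&gt;embed_query&lt;/code&gt; method, code that depends only on that method doesn't need to know which provider is behind it. The sketch below illustrates the idea with the standard library alone. &lt;code&gt;Embedder&lt;/code&gt;, &lt;code&gt;FakeEmbedder&lt;/code&gt; and &lt;code&gt;embed_all&lt;/code&gt; are hypothetical names for this example, not part of LangChain, and the fake vectors come from a hash, so they carry no meaning.&lt;/p&gt;

```python
import hashlib
from typing import List, Protocol


class Embedder(Protocol):
    """Anything with an embed_query method that returns a list of floats."""

    def embed_query(self, text: str) -> List[float]: ...


class FakeEmbedder:
    """Hypothetical stand-in: derives a deterministic vector from a hash."""

    def __init__(self, dimensions: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so this sketch supports up to 32 dimensions.
        self.dimensions = dimensions

    def embed_query(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255 for byte in digest[: self.dimensions]]


def embed_all(embedder: Embedder, texts: List[str]) -> List[List[float]]:
    """Embed each text using whichever embedder was passed in."""
    return [embedder.embed_query(text) for text in texts]


vectors = embed_all(FakeEmbedder(dimensions=4), ["first law", "second law"])
print(len(vectors), len(vectors[0]))
```

&lt;p&gt;Under that assumption, an instance of either embeddings class shown earlier could be passed to &lt;code&gt;embed_all&lt;/code&gt; in place of the fake one.&lt;/p&gt;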

&lt;p&gt;If you're using LangChain to build your entire RAG pipeline, these embeddings fit in well with the vector database interfaces. You can provide an embedding model to the database object and LangChain handles generating the embeddings as you insert documents or perform queries. For example, here's how you can combine the Google embeddings model with the &lt;a href="https://python.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;LangChain wrapper for Astra DB&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_google_genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_astradb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AstraDBVectorStore&lt;/span&gt;


&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-embedding-exp-03-07&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;google_api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AstraDBVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;astra_vector_langchain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# a list of document objects to store in the db
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can use the same &lt;code&gt;vector_store&lt;/code&gt; object and associated embeddings to perform the vector search, too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LlamaIndex has a similar set of abstractions that enable you to combine different embedding models and vector stores. Check out this &lt;a href="https://docs.llamaindex.ai/en/stable/understanding/rag/" rel="noopener noreferrer"&gt;LlamaIndex introduction to RAG&lt;/a&gt; to learn more.&lt;/p&gt;

&lt;p&gt;If you're new to embeddings, &lt;a href="https://python.langchain.com/docs/integrations/text_embedding/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; has a handy list of embedding models and providers that can help you find different options to try.&lt;/p&gt;

&lt;h2&gt;
  
  
  Directly in the database
&lt;/h2&gt;

&lt;p&gt;The methods we’ve talked through so far create the vector separately from the database: you generate the embedding first, then store it in, or search against, a vector database yourself. When you want to store those vectors in a &lt;a href="https://www.datastax.com/guides/what-is-a-vector-database?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;vector database like Astra DB&lt;/a&gt;, it looks a bit like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;astrapy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataAPIClient&lt;/span&gt;


&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_database&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The above assumes that you have already created your vector-enabled collection with the right number of dimensions for the model you’re using.&lt;/p&gt;
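
&lt;p&gt;As a small illustration of that assumption, the sketch below checks a vector's length before it goes into a document. The &lt;code&gt;check_dimensions&lt;/code&gt; helper and the value 768 are assumptions made up for this example and are not part of astrapy; use whatever dimension your collection and embedding model actually share.&lt;/p&gt;

```python
EXPECTED_DIMENSIONS = 768  # assumed value; match your collection's setting


def check_dimensions(vector, expected=EXPECTED_DIMENSIONS):
    """Raise ValueError if the vector length differs from the expected dimensions."""
    if len(vector) != expected:
        raise ValueError(
            f"Expected {expected} dimensions, got {len(vector)}"
        )
    return vector


# A placeholder vector of zeros stands in for a real embedding here.
document = {
    "text": "A robot may not injure a human being.",
    "$vector": check_dimensions([0.0] * EXPECTED_DIMENSIONS),
}
print(len(document["$vector"]))
```

&lt;p&gt;Failing early like this is one way to surface a model and collection mismatch in your own code rather than in a database error.&lt;/p&gt;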

&lt;p&gt;Performing a vector search then looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vector&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
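
&lt;p&gt;To build some intuition for what sorting by &lt;code&gt;$vector&lt;/code&gt; does, here is a minimal pure-Python sketch that ranks documents by cosine similarity to a query vector. It is an illustration only: the three-dimensional vectors and the &lt;code&gt;rank_by_similarity&lt;/code&gt; helper are made up for this example, cosine similarity is an assumed metric (a collection can be configured with a different one), and a real vector database uses an index rather than comparing the query against every document.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, documents, limit=5):
    """Return up to `limit` documents, most similar to the query first."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["$vector"]),
        reverse=True,
    )
    return ranked[:limit]


# Toy data: real embeddings have hundreds of dimensions.
documents = [
    {"text": "first law", "$vector": [1.0, 0.0, 0.0]},
    {"text": "second law", "$vector": [0.0, 1.0, 0.0]},
    {"text": "third law", "$vector": [0.7, 0.7, 0.0]},
]

for doc in rank_by_similarity([0.9, 0.1, 0.0], documents, limit=2):
    print(doc["text"])
# prints "first law" then "third law"
```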



&lt;p&gt;In these examples, you have to create your vectors yourself before storing them in, or searching against, the database. With the frameworks, you might not see this happen because it has been abstracted away, but the same operations are still being performed.&lt;/p&gt;

&lt;p&gt;With Astra DB, you can have the database generate the vector embeddings for you, either as you insert a document into the collection or at the point of performing a search. This is called &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt;, and it simplifies a crucial step in your RAG pipeline.&lt;/p&gt;

&lt;p&gt;To use Vectorize, you first need to set up an embedding provider integration. There’s one built-in integration that you can use with no extra work, the &lt;a href="https://www.datastax.com/integrations/vectorize-with-nvidia?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;NVIDIA NV-Embed-QA model&lt;/a&gt;, or you can choose one of the other embedding providers and configure it with your API key.&lt;/p&gt;

&lt;p&gt;When you create a collection, you can choose which embedding provider you want to use, along with the number of dimensions its model requires.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8e1poo6z0jw73ilw8wn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8e1poo6z0jw73ilw8wn.png" alt="A screen shot of the form to create a collection. After the name field, there is a drop down list where you can select the embedding provider you want to use, in this example, NVIDIA has been chosen. Then there are fields for the number of dimensions and the similarity metric to use." width="800" height="589"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you set up your collection this way, you can add content and have it automatically vectorized by using the special property &lt;code&gt;$vectorize&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, when a user query comes in, you can perform a vector search by sorting using the &lt;code&gt;$vectorize&lt;/code&gt; property. Astra DB will create the vector embedding for the query and run the search in one step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$vectorize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are several advantages to this approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Astra DB team has already done the work to make embedding creation robust&lt;/li&gt;
&lt;li&gt;Making two separate API calls to create embeddings and then store them is often slower than letting Astra DB handle it&lt;/li&gt;
&lt;li&gt;Using the built-in NVIDIA embedding model is quicker still&lt;/li&gt;
&lt;li&gt;You have less code to write and maintain&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  A world of vector embedding options
&lt;/h2&gt;

&lt;p&gt;As we have seen, there are many choices to make when implementing vector embeddings: how you generate them, which model you use, and which provider you use. It's an important step in your RAG pipeline, so it is worth spending the time to find out which model and method are right for your application and your data.&lt;/p&gt;

&lt;p&gt;You can choose to host your own models, rely on third-party APIs, abstract the problem away through frameworks, or entrust Astra DB to create embeddings for you. Of course, if you want to avoid code entirely, then you can &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-python&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;drag-and-drop your components into place with Langflow&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to chat more about vector embeddings and RAG, drop into the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt; or drop me an email at &lt;a href="mailto:phil.nash@datastax.com"&gt;phil.nash@datastax.com&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently asked questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What are vector embeddings?
&lt;/h3&gt;

&lt;p&gt;Vector embeddings are numerical representations of text in multi-dimensional space used for tasks like document retrieval and recommendation systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  What steps are involved in creating vector embeddings for a retrieval-augmented generation (RAG) app?
&lt;/h3&gt;

&lt;p&gt;To create vector embeddings, you need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Collect unstructured data&lt;/li&gt;
&lt;li&gt;Split data into chunks&lt;/li&gt;
&lt;li&gt;Turn chunks into vector embeddings&lt;/li&gt;
&lt;li&gt;Store embeddings in a vector database&lt;/li&gt;
&lt;/ol&gt;
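
&lt;p&gt;The four steps can be sketched end to end in plain Python. Everything here is a stand-in so that the example runs without any services: &lt;code&gt;split_into_chunks&lt;/code&gt; is a naive word-based splitter, &lt;code&gt;toy_embedding&lt;/code&gt; is a hypothetical placeholder that is not a real embedding model, and a list stands in for the vector database.&lt;/p&gt;

```python
def split_into_chunks(text, max_words=20):
    """Step 2: naive chunking by word count."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def toy_embedding(chunk, dimensions=8):
    """Step 3 placeholder: NOT a real model, just letter counts per bucket."""
    vector = [0.0] * dimensions
    for char in chunk.lower():
        if char.isalpha():
            vector[ord(char) % dimensions] += 1.0
    return vector


# Step 1: collect unstructured data.
text = (
    "A robot may not injure a human being or, through inaction, allow a "
    "human being to come to harm. A robot must obey the orders given it by "
    "human beings except where such orders would conflict with the First Law."
)

# Step 4: store each chunk with its vector (a list stands in for the database).
store = [
    {"text": chunk, "$vector": toy_embedding(chunk)}
    for chunk in split_into_chunks(text)
]
print(len(store), "chunks stored")
```

&lt;p&gt;In a real app you would swap &lt;code&gt;toy_embedding&lt;/code&gt; for one of the embedding options covered in this post and replace the list with inserts into your collection.&lt;/p&gt;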

&lt;h3&gt;
  
  
  How can I create vector embeddings locally in Python?
&lt;/h3&gt;

&lt;p&gt;You can create vector embeddings locally in Python using pre-trained embedding models from HuggingFace, for example with the sentence-transformers library.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are some limitations of local embedding models?
&lt;/h3&gt;

&lt;p&gt;Local embedding models can only handle a limited number of tokens effectively, and larger models require substantial memory and storage.&lt;/p&gt;

&lt;h3&gt;
  
  
  How can I create vector embeddings using an API?
&lt;/h3&gt;

&lt;p&gt;You can create vector embeddings using APIs provided by services such as OpenAI, Google, and Cohere.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are there frameworks to simplify embedding creation?
&lt;/h3&gt;

&lt;p&gt;Yes, frameworks like LangChain and LlamaIndex offer standardized interfaces that abstract the complexities of embedding models and APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Astra Vectorize, and how does it simplify the embedding process?
&lt;/h3&gt;

&lt;p&gt;Astra Vectorize enables Astra DB to automatically generate vector embeddings as documents are inserted or queries are performed.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the advantages of using Astra Vectorize?
&lt;/h3&gt;

&lt;p&gt;The advantages include simplified code maintenance, faster performance, improved efficiency, and robustness through pre-tested integrations.&lt;/p&gt;

</description>
      <category>python</category>
      <category>genai</category>
      <category>vectordatabase</category>
    </item>
    <item>
      <title>How to Create Vector Embeddings in Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Thu, 03 Apr 2025 21:43:17 +0000</pubDate>
      <link>https://dev.to/datastax/how-to-create-vector-embeddings-in-nodejs-2khl</link>
      <guid>https://dev.to/datastax/how-to-create-vector-embeddings-in-nodejs-2khl</guid>
      <description>&lt;p&gt;When you’re building a &lt;a href="https://www.ibm.com/think/topics/retrieval-augmented-generation" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; app, job number one is preparing your data. You’ll need to take your unstructured data and &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;split it up into chunks&lt;/a&gt;, turn those chunks into &lt;a href="https://www.ibm.com/think/topics/vector-embedding" rel="noopener noreferrer"&gt;vector embeddings&lt;/a&gt;, and finally, store the embeddings in a &lt;a href="https://www.ibm.com/think/topics/vector-database" rel="noopener noreferrer"&gt;vector database&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There are many ways that you can create vector embeddings in JavaScript. In this post, we’ll investigate four ways to generate vector embeddings in Node.js: locally, via API, via a framework, and with &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB's Vectorize&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local vector embeddings
&lt;/h2&gt;

&lt;p&gt;There are lots of open-source models available on &lt;a href="https://huggingface.co/" rel="noopener noreferrer"&gt;HuggingFace&lt;/a&gt; that can be used to create vector embeddings. &lt;a href="https://huggingface.co/docs/transformers.js/en/index" rel="noopener noreferrer"&gt;Transformers.js&lt;/a&gt; is a module that lets you use machine learning models in JavaScript, both in the browser and in Node.js. It uses the &lt;a href="https://onnxruntime.ai/" rel="noopener noreferrer"&gt;ONNX runtime&lt;/a&gt; to achieve this, so it works with models that have published ONNX weights, of which there are plenty. We can use some of those models to create vector embeddings.&lt;/p&gt;

&lt;p&gt;You can install the module with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @xenova/transformers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://huggingface.co/docs/transformers.js/index#tasks" rel="noopener noreferrer"&gt;The package can actually perform many tasks&lt;/a&gt;, but &lt;a href="https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.FeatureExtractionPipeline" rel="noopener noreferrer"&gt;feature extraction&lt;/a&gt; is what you want for generating vector embeddings.&lt;/p&gt;

&lt;p&gt;A popular, local model for vector embedding is &lt;a href="https://huggingface.co/Xenova/all-MiniLM-L6-v2" rel="noopener noreferrer"&gt;all-MiniLM-L6-v2&lt;/a&gt;. It’s trained as a good all-rounder and produces a 384-dimension vector from a chunk of text.&lt;/p&gt;

&lt;p&gt;To use it, import the &lt;code&gt;pipeline&lt;/code&gt; function from Transformers.js and create an extractor that will perform "feature-extraction" using your provided model. You can then pass a chunk of text to the extractor and it will return a tensor object which you can turn into a plain JavaScript array of numbers.&lt;/p&gt;

&lt;p&gt;All in all, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pipeline&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@xenova/transformers&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extractor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feature-extraction&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Xenova/all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;pooling&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mean&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [-0.004044221248477697,  0.026746056973934174,   0.0071970801800489426, ... ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also embed multiple texts at a time if you pass an array to the extractor. You can then call &lt;code&gt;tolist()&lt;/code&gt; on the response, which returns an array of arrays, one for each vector.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;pooling&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mean&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tolist&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;// [&lt;/span&gt;
&lt;span class="c1"&gt;//   [ -0.006129210349172354,  0.016346964985132217,   0.009711502119898796, ...],&lt;/span&gt;
&lt;span class="c1"&gt;//   [-0.053930871188640594,  -0.002175076398998499,   0.032391052693128586, ...],&lt;/span&gt;
&lt;span class="c1"&gt;//   [-0.05358131229877472,  0.021030642092227936, 0.0010665050940588117, ...]&lt;/span&gt;
&lt;span class="c1"&gt;// ]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are &lt;a href="https://huggingface.co/models?pipeline_tag=feature-extraction&amp;amp;library=transformers.js" rel="noopener noreferrer"&gt;many models you can use to create vector embeddings from text&lt;/a&gt;, and, because you’re running locally, you can try them out to see which works best for your data. You should pay attention to the length of text that these models can handle. For example, the all-MiniLM-L6-v2 model does not provide good results for more than 128 &lt;a href="https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them" rel="noopener noreferrer"&gt;tokens&lt;/a&gt; and can handle a maximum of 256 tokens, so it’s useful for sentences or small paragraphs. If you have a bigger source of text data than that, you’ll need to &lt;a href="https://philna.sh/blog/2024/09/18/how-to-chunk-text-in-javascript-for-rag-applications/" rel="noopener noreferrer"&gt;split your data into appropriately sized chunks&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Local embedding models like this are useful if you’re experimenting on your own machine, or have the &lt;a href="https://onnxruntime.ai/docs/get-started/with-javascript/node.html" rel="noopener noreferrer"&gt;right hardware to run them efficiently when deployed&lt;/a&gt;. It's an easy way to get comfortable with different models and get a feel for how things work without having to sign up to a bunch of different API services.&lt;/p&gt;

&lt;p&gt;Having said that, there are a lot of useful vector embedding models available as an API, so let's take a look at them next.&lt;/p&gt;

&lt;h2&gt;
  
  
  APIs
&lt;/h2&gt;

&lt;p&gt;There is an abundance of services that provide embedding models as APIs. These include LLM providers, like &lt;a href="https://platform.openai.com/docs/guides/embeddings" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;Google&lt;/a&gt; or &lt;a href="https://docs.cohere.com/docs/cohere-embed" rel="noopener noreferrer"&gt;Cohere&lt;/a&gt;, as well as specialist providers like &lt;a href="https://www.voyageai.com/" rel="noopener noreferrer"&gt;Voyage AI&lt;/a&gt; or &lt;a href="https://jina.ai/embeddings/" rel="noopener noreferrer"&gt;Jina&lt;/a&gt;. Most providers have general purpose embedding models, but some provide models trained for specific datasets, like &lt;a href="https://docs.voyageai.com/docs/embeddings" rel="noopener noreferrer"&gt;Voyage AI's finance, law and code&lt;/a&gt; optimised models.&lt;/p&gt;

&lt;p&gt;These API providers provide HTTP APIs, often with an npm package to make it easy to call them. You’ll typically need an API key from the service and you can then generate embeddings by sending your text to the API.&lt;/p&gt;

&lt;p&gt;For example, you can &lt;a href="https://ai.google.dev/gemini-api/docs/embeddings" rel="noopener noreferrer"&gt;use Google's text embedding models through the Gemini API&lt;/a&gt; like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@google/generative-ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;genAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.04574034, 0.038084425, -0.00916391, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each API is different, though, so while making a request to create embeddings is normally fairly straightforward, you’ll likely have to learn a new method for each API you want to call, unless, of course, you try one of the available frameworks that are intended to simplify this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frameworks
&lt;/h2&gt;

&lt;p&gt;There are many projects out there, like &lt;a href="https://js.langchain.com/docs/introduction/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; or &lt;a href="https://ts.llamaindex.ai/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;, that create abstractions over the various parts of the GenAI toolchain, including embeddings.&lt;/p&gt;

&lt;p&gt;Both LangChain and LlamaIndex enable you to generate embeddings via APIs or local models, all with the same interface. For example, here’s how you can &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/google_generativeai/" rel="noopener noreferrer"&gt;create the same embedding as above using the Gemini API and LangChain&lt;/a&gt; together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.04574034, 0.038084425, -0.00916391, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To compare, this is what it looks like to use the &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/openai/" rel="noopener noreferrer"&gt;OpenAI embeddings model through LangChain&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OpenAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-3-large&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// =&amp;gt; [0.009445431, -0.0073068426, -0.00814802, ...]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Aside from changing the name of the import and sometimes the options, the embedding models all have a consistent interface to make it easier to swap them out.&lt;/p&gt;

&lt;p&gt;If you’re using LangChain to create your entire pipeline, these embedding interfaces work very well alongside the vector database interfaces. You can provide an embedding model to the database integration and LangChain handles generating the embeddings as you insert documents or perform vector searches. For example, here is how to embed some documents using Google's embeddings and store them in &lt;a href="https://js.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;Astra DB via LangChain&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/google-genai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AstraDBVectorStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/community/vectorstores/astradb&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAIEmbeddings&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text-embedding-004&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;AstraDBVectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// a list of document objects to put in the store&lt;/span&gt;
  &lt;span class="nx"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// the embeddings model&lt;/span&gt;
  &lt;span class="nx"&gt;astraConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// config to connect to Astra DB&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you provide the embeddings model to the database object, you can then use it to perform vector searches too.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similaritySearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LlamaIndex allows for similar creation of embedding models and vector stores that use them. Check out &lt;a href="https://ts.llamaindex.ai/docs/llamaindex/tutorials/rag" rel="noopener noreferrer"&gt;the LlamaIndex documentation on RAG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As a bonus, the lists of models that &lt;a href="https://js.langchain.com/docs/integrations/text_embedding/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; and &lt;a href="https://ts.llamaindex.ai/docs/llamaindex/modules/embeddings" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt; integrate are good examples of popular embedding models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Directly in the database
&lt;/h2&gt;

&lt;p&gt;So far, the methods above mostly involve creating a vector embedding independently of storing the embedding in a vector database. When you want to store those vectors in a vector database like Astra DB, it looks a bit like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;$vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This assumes you have already created a vector-enabled collection with the correct number of dimensions for the model you are using.&lt;/p&gt;

&lt;p&gt;You can also search against the documents in your collection using a vector like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.04574034&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.038084425&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.00916391&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, you have to create your vectors first, and then store or search against the database with them. Even in the case of the frameworks, that process happens, but it’s just abstracted away.&lt;/p&gt;

&lt;p&gt;With Astra DB, you can have the database generate the embeddings for you as you’re inserting documents into a collection or as you perform a vector search against a collection.&lt;/p&gt;

&lt;p&gt;This is called &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=how-to-create-vector-embeddings-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB vectorize&lt;/a&gt;; here's how it works.&lt;/p&gt;

&lt;p&gt;First, set up an embedding provider integration. There is a built-in integration offering the &lt;a href="https://build.nvidia.com/nvidia/embed-qa-4" rel="noopener noreferrer"&gt;NVIDIA NV-Embed-QA model&lt;/a&gt;, or you can choose one of the other providers and configure them with your own API key.&lt;/p&gt;

&lt;p&gt;Then when you set up a collection, you can choose which embedding provider you want to use and set the correct number of dimensions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpb60b3qbxf9b2tbbp34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffpb60b3qbxf9b2tbbp34.png" width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, when you add a document to this collection, you can add the content using the special key &lt;code&gt;$vectorize&lt;/code&gt; and a vector embedding will be created.&lt;/p&gt;

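&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;await collection.insertOne({
  $vectorize: "A robot may not injure a human being or, through inaction, allow a human being to come to harm."
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;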

&lt;p&gt;When you want to perform a vector search against this collection, you can sort by the special &lt;code&gt;$vectorize&lt;/code&gt; field and again, Astra DB will handle creating vector embeddings and then performing the search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselve?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This has several advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It's robust, as Astra DB handles the interaction with the embedding provider&lt;/li&gt;
&lt;li&gt;It can be quicker than making two separate API calls to create embeddings and then store them&lt;/li&gt;
&lt;li&gt;It's less code for you to write&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Choose the method that works best for your application
&lt;/h2&gt;

&lt;p&gt;There are many models, providers, and methods you can use to turn text into vector embeddings. Creating vector embeddings from your content is a vital part of the RAG pipeline and it does require some experimentation to get it right for your data.&lt;/p&gt;

&lt;p&gt;You have the choice to host your own models, call on APIs, use a framework, or let Astra DB handle creating vector embeddings for you. And, if you want to avoid code altogether, you could choose to use &lt;a href="https://docs.langflow.org/chat-with-rag" rel="noopener noreferrer"&gt;Langflow's drag-and-drop interface to create your RAG pipeline&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>node</category>
      <category>genai</category>
      <category>vectordatabase</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Building a Weather App with a Raspberry Pi, Astra DB, and Langflow</title>
      <dc:creator>Aaron Ploetz</dc:creator>
      <pubDate>Fri, 14 Mar 2025 19:11:35 +0000</pubDate>
      <link>https://dev.to/datastax/building-a-weather-app-with-a-raspberry-pi-astra-db-and-langflow-1fdl</link>
      <guid>https://dev.to/datastax/building-a-weather-app-with-a-raspberry-pi-astra-db-and-langflow-1fdl</guid>
      <description>&lt;p&gt;To celebrate Pi Day this year, we thought it would be fun to build something with a Raspberry Pi that uses Astra DB and/or Langflow. Fortunately, I have just the project in mind: a weather application!&lt;/p&gt;

&lt;p&gt;To that end, our goal will be to use the National Weather Service’s (NWS) data API. Essentially, we will call this API to get the most recent weather data, store it in Astra DB, and display it on a simple front-end.&lt;/p&gt;

&lt;h2&gt;
  
  
  Requirements
&lt;/h2&gt;

&lt;p&gt;To build our project, we’re going to need a few things. First of all, our development environment will use the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Java 17&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Spring Boot&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Maven&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Vaadin&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An Astra DB account with an active database and Langflow instance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And of course, we’ll also need a Raspberry Pi. For this project, we used a &lt;a href="https://www.canakit.com/canakit-raspberry-pi-5-starter-kit-turbine-black.html?srsltid=AfmBOoom-Ao7vUEyjsHw82_uCTXSI9467dsXh8lqSu9nGP2zi-1K6zvg" rel="noopener noreferrer"&gt;Cana Kit™ Raspberry Pi 5 Starter Kit PRO Turbine Black&lt;/a&gt; (4GB RAM / 128GB Micro SD).&lt;/p&gt;

&lt;h2&gt;
  
  
  The weather application
&lt;/h2&gt;

&lt;p&gt;The weather application we will use can be found in this GitHub repository: &lt;a href="https://github.com/aar0np/weather-app/tree/main" rel="noopener noreferrer"&gt;https://github.com/aar0np/weather-app/tree/main&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This application originally appeared in Chapter 8 of the book &lt;a href="https://www.amazon.com/Code-Java-practical-efficient-applications/dp/9355519990/" rel="noopener noreferrer"&gt;Code with Java 21&lt;/a&gt;. The original project was designed to work with DataStax Astra DB, using the CQL protocol. Our fork of it is a bit different, as it can refresh its data view from either the Astra DB Data API or from a Langflow API endpoint.&lt;/p&gt;

&lt;p&gt;At its core, the project is a Java Spring Boot application with a Vaadin web front end that also exposes two RESTful endpoints. The endpoints are there as much for testing as for functionality. One endpoint pulls the most recent update from the NWS API for a given weather station ID and stores it in Astra DB. The other retrieves the most recent reading from Astra DB for a particular station and year/month combination.&lt;/p&gt;

&lt;h3&gt;
  
  
  RESTful examples
&lt;/h3&gt;

&lt;p&gt;This call pulls the latest reading from the NWS for the station KMSP (Minneapolis/St. Paul International Airport) and stores it in Astra DB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -X PUT http://127.0.0.1:8080/weather/astradb/api/latest/station/kmsp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint returns a response similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"stationId":"https://api.weather.gov/stations/KMSP","monthBucket":202503,"timestamp":"2025-03-07T22:53:00Z","readingIcon":"https://api.weather.gov/icons/land/day/few?size=medium","stationCoordinatesLatitude":-93.22,"stationCoordinatesLongitude":44.88,"temperatureCelsius":5.6,"windDirectionDegrees":310,"windSpeedKMH":20.52,"windGustKMH":0.0,"visibilityM":16090,"precipitationLastHour":0.0,"cloudCover":{"7620":"FEW","1830":"FEW"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint pulls the latest reading for a specific station and year/month combination:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -X GET http://127.0.0.1:8080/weather/astradb/api/latest/station/kmsp/month/202503
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This endpoint returns a response similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"stationId":"kmsp","monthBucket":202503,"timestamp":"2025-03-07T22:53:00Z","readingIcon":"https://api.weather.gov/icons/land/day/few?size=medium","stationCoordinatesLatitude":-93.22,"stationCoordinatesLongitude":44.88,"temperatureCelsius":5.6,"windDirectionDegrees":310,"windSpeedKMH":0.0,"windGustKMH":0.0,"visibilityM":0,"precipitationLastHour":0.0,"cloudCover":{"7620":"FEW","1830":"FEW"}}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: The above RESTful GET call is also used to populate the web frontend.&lt;/em&gt;&lt;/p&gt;
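
&lt;p&gt;Both sample responses pair a &lt;code&gt;monthBucket&lt;/code&gt; of 202503 with a timestamp in March 2025, which suggests the bucket is the year and month written as &lt;code&gt;yyyyMM&lt;/code&gt;. The repository is the authority on how it is actually computed. As a minimal sketch under that assumption, a hypothetical &lt;code&gt;monthBucket&lt;/code&gt; helper (not taken from the project) could derive the value from a UTC timestamp like this:&lt;/p&gt;

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class MonthBucketSketch {

    // Hypothetical helper, not from the weather-app repository:
    // turns an ISO-8601 UTC timestamp into a yyyyMM integer.
    static int monthBucket(String isoTimestamp) {
        ZonedDateTime utc = Instant.parse(isoTimestamp).atZone(ZoneOffset.UTC);
        return utc.getYear() * 100 + utc.getMonthValue();
    }

    public static void main(String[] args) {
        // Timestamp copied from the sample responses above.
        System.out.println(monthBucket("2025-03-07T22:53:00Z")); // prints 202503
    }
}
```

&lt;p&gt;A bucket like this keeps the lookup narrow: a query for the latest reading only has to consider one station and one month.&lt;/p&gt;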

&lt;h3&gt;
  
  
  The Astra DB Data API
&lt;/h3&gt;

&lt;p&gt;First, create a new Astra DB database. We can also use an existing database, as long as we create the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Keyspace named: “weatherapp”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Non-vector collection named: “weather_data”&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are two primary controller methods that handle the above data calls. The first method is named &lt;code&gt;putLatestAstraAPIData&lt;/code&gt; and handles the RESTful PUT call. First, it performs a GET call on the NWS API endpoint for the &lt;code&gt;stationid&lt;/code&gt; that was passed in. It then maps the payload to an Astra DB Data API &lt;code&gt;Document&lt;/code&gt; named &lt;code&gt;weatherDoc&lt;/code&gt; and saves &lt;code&gt;weatherDoc&lt;/code&gt; in Astra DB (via the Data API). Finally, it maps the response to a &lt;code&gt;WeatherReading&lt;/code&gt; object named &lt;code&gt;currentReading&lt;/code&gt;, and returns it. The code for the &lt;code&gt;putLatestAstraAPIData()&lt;/code&gt; method is shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@PutMapping("/astradb/api/latest/station/{stationid}")
public ResponseEntity&amp;lt;WeatherReading&amp;gt; putLatestAstraAPIData(
             @PathVariable(value="stationid") String stationId) {

       LatestWeather response = restTemplate.getForObject(
                    "https://api.weather.gov/stations/" + stationId + 
                    "/observations/latest", LatestWeather.class);

       Document weatherDoc = mapLatestWeatherToDocument(response, stationId);

       // save weather reading
       collection.insertOne(weatherDoc);

       // build response
       WeatherReading currentReading =
                    mapLatestWeatherToWeatherReading(response);

      return ResponseEntity.ok(currentReading);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The other controller method is named &lt;code&gt;getLatestAstraAPIData&lt;/code&gt; and it handles the RESTful GET call. This method takes the &lt;code&gt;stationid&lt;/code&gt; and the &lt;code&gt;monthBucket&lt;/code&gt; that were passed in, and uses the Data API to find any matching documents. As there might be multiple matches, the results are sorted in descending order by timestamp, and the top document is processed. This ensures that the latest document is mapped and returned.&lt;/p&gt;

&lt;p&gt;The code for the &lt;code&gt;getLatestAstraAPIData()&lt;/code&gt; method is shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@GetMapping("/astradb/api/latest/station/{stationid}/month/{month}")
public ResponseEntity&amp;lt;WeatherReading&amp;gt; getLatestAstraAPIData(
             @PathVariable(value="stationid") String stationId,
             @PathVariable(value="month") int monthBucket) {

       Filter filters = Filters.and(eq("station_id",(stationId)),
                           (eq("month_bucket",monthBucket)));
       Sort sort = Sorts.descending("timestamp");
       FindOptions findOpts = new FindOptions().sort(sort);
       FindIterable&amp;lt;Document&amp;gt; weatherDocs = collection.find(filters, findOpts);
       List&amp;lt;Document&amp;gt; weatherDocsList = weatherDocs.all();

       if (weatherDocsList.size() &amp;gt; 0) {
              Document weatherTopDoc = weatherDocsList.get(0);
              WeatherReading currentReading =
                           mapDocumentToWeatherReading(weatherTopDoc);
              return ResponseEntity.ok(currentReading);
       }

       return ResponseEntity.ok(new WeatherReading());
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
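
&lt;p&gt;The "sort descending by timestamp, then take the first result" step can be illustrated without a database. The following is a minimal in-memory sketch, not code from the project: the &lt;code&gt;Reading&lt;/code&gt; record and the &lt;code&gt;latest&lt;/code&gt; helper are hypothetical stand-ins for the stored documents and the Data API query.&lt;/p&gt;

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;

public class LatestReadingSketch {

    // Hypothetical stand-in for a stored weather document.
    record Reading(String stationId, Instant timestamp, double temperatureCelsius) {}

    // Sorts a copy of the readings newest-first and returns the top one,
    // or null when there are no readings at all.
    static Reading latest(Reading[] readings) {
        Reading[] sorted = readings.clone();
        Arrays.sort(sorted, Comparator.comparing(Reading::timestamp).reversed());
        return sorted.length == 0 ? null : sorted[0];
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not real observations.
        Reading[] readings = {
            new Reading("kmsp", Instant.parse("2025-03-07T20:53:00Z"), 4.4),
            new Reading("kmsp", Instant.parse("2025-03-07T22:53:00Z"), 5.6),
            new Reading("kmsp", Instant.parse("2025-03-07T21:53:00Z"), 5.0),
        };
        System.out.println(latest(readings));
    }
}
```

&lt;p&gt;Unlike the controller method above, which returns an empty &lt;code&gt;WeatherReading&lt;/code&gt; when nothing matches, this sketch returns &lt;code&gt;null&lt;/code&gt; for an empty input to keep the example short.&lt;/p&gt;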



&lt;h3&gt;
  
  
  The Langflow API
&lt;/h3&gt;

&lt;p&gt;This entire process can also work through Langflow. Open up Langflow, create a new flow, and pick the “Simple Agent” template. &lt;em&gt;This simple agent is all that we need.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61w0oze19un6hzdc8r8l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61w0oze19un6hzdc8r8l.png" alt="A sample flow created by selecting the “Simple Agent” template in Langflow." width="753" height="675"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;A sample flow created by selecting the “Simple Agent” template in Langflow.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The agent is built with the URL “tool,” which allows the agent to call out to external web addresses, including APIs. To expose this agent, we simply need to click on the “API” tab and make note of the Langflow endpoint URL. We will add this URL as an environment variable for our application.&lt;/p&gt;

&lt;p&gt;Inside our application, our call to Langflow is handled by a method named &lt;code&gt;askAgent&lt;/code&gt;. Simply put, this method calls our Langflow API endpoint, maps the result, and returns it to the UI. The code for the &lt;code&gt;askAgent()&lt;/code&gt; method can be seen below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public WeatherReading askAgent (AgentRequest req) {

       String reqJSON = new Gson().toJson(req);
       HttpEntity&amp;lt;String&amp;gt; requestEntity =
                    new HttpEntity&amp;lt;&amp;gt;(reqJSON, langflowHeader);

       ResponseEntity&amp;lt;LangflowResponse&amp;gt; resp =
                    restTemplate.exchange(LANGFLOW_URL,
                    HttpMethod.POST,
                    requestEntity,
                    LangflowResponse.class);

       LangflowResponse lfResp = resp.getBody();
       LangflowOutput1[] outputs = lfResp.getOutputs();

       return mapLangflowResponseToWeatherReading(outputs);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The method inside our Vaadin UI code that calls &lt;code&gt;askAgent()&lt;/code&gt; is named &lt;code&gt;refreshLangflow()&lt;/code&gt; and is triggered by a button on the UI. It composes a message for our Langflow agent, sends it, and uses the data returned to refresh the UI. The code can be seen below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;private void refreshLangflow() {

      String message = "Please retrieve the latest weather data (including the weather icon url) in a text format using this endpoint: "
+ "https://api.weather.gov/stations/" + stationId.getValue() + 
"/observations/latest";
       latestWeather = controller.askAgent(new AgentRequest(message));

       refreshData(latestWeather);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After reviewing the code, we should now be ready to build and configure our hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  Raspberry Pi
&lt;/h2&gt;

&lt;p&gt;First of all, we will need to assemble the Pi. Fortunately, Cana Kit has a great &lt;a href="https://www.canakit.com/pi5-case" rel="noopener noreferrer"&gt;setup video&lt;/a&gt; that walks through the entire process.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: I prefer to use Cana Kits, because they come with everything that you need, such as a Micro HDMI cable, a heat sink with fan, and a Micro SD card.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Once the Pi is assembled and running, it will check for updates and reboot. When we get to the Raspberry Pi OS desktop, we will have a few things to install (using the Terminal application).&lt;/p&gt;

&lt;h3&gt;
  
  
  Java
&lt;/h3&gt;

&lt;p&gt;For our application to run, we need a Java Virtual Machine (JVM). As we will also need to build our application locally, we’ll need a Java Development Kit (JDK) as well. In our case, our Pi had Java 17 installed, and this is sufficient for our purposes.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: The Raspberry Pi OS makes it difficult to install and configure newer versions of Java. Fortunately, our project compiles just fine with Java 17.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Maven
&lt;/h3&gt;

&lt;p&gt;Maven is a build- and dependency-management tool for Java. Our project was built with Maven, so we will need to install it as well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install maven
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Git
&lt;/h3&gt;

&lt;p&gt;Our Pi also had Git installed. After creating a new SSH key and adding it to our GitHub account, we should be able to clone the project repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone git@github.com:aar0np/weather-app.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will download the code and create a local directory for our application, where we can build and run it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it all together
&lt;/h2&gt;

&lt;p&gt;First, we will &lt;code&gt;cd&lt;/code&gt; into our project directory and then build it with Maven:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd weather-app
mvn clean install -Pproduction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we need to define three environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export ASTRA_DB_API_ENDPOINT=https://not-real-us-east1.apps.astra.datastax.com
export ASTRA_DB_APP_TOKEN=AstraCS:wtqNOTglg:725REAL238dEITHER563486d
export ASTRA_LANGFLOW_URL=https://api.langflow.astra.datastax.com/lf/6f-not-real-9493/api/v1/run/060d2-not-real-caef?stream=false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can now run our application. The Maven build created a JAR file in the &lt;code&gt;weather-app/target&lt;/code&gt; directory, so from the project root we can run it like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java -jar target/weatherapp-0.0.1-SNAPSHOT.jar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A successful run should produce several log messages, the last of which should look similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2025-03-07T20:02:02.259-06:00  INFO 53787 --- [WeatherApp] [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port 8080 (http) with context path '/'
2025-03-07T20:02:02.278-06:00  INFO 53787 --- [WeatherApp] [           main] c.d.weatherapp.WeatherappApplication     : Started WeatherappApplication in 1.795 seconds (process running for 2.147)
2025-03-07T20:02:04.855-06:00  INFO 53787 --- [WeatherApp] [nio-8080-exec-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring DispatcherServlet 'dispatcherServlet'
2025-03-07T20:02:04.855-06:00  INFO 53787 --- [WeatherApp] [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : Initializing Servlet 'dispatcherServlet'
2025-03-07T20:02:04.856-06:00  INFO 53787 --- [WeatherApp] [nio-8080-exec-1] o.s.web.servlet.DispatcherServlet        : Completed initialization in 0 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From another terminal window/tab, let’s add some data to the DB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -X PUT http://127.0.0.1:8080/weather/astradb/api/latest/station/kmsp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, navigate to the local IP on port 8080: &lt;a href="http://127.0.0.1:8080/" rel="noopener noreferrer"&gt;http://127.0.0.1:8080/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Station ID is pre-populated with “kmsp” and the current year/month is auto-generated. Clicking either “Astra DB Refresh” or “Langflow Refresh” should produce something similar to this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3i0uug0pq1ylomb71iz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs3i0uug0pq1ylomb71iz.png" alt="Our finished application running on a Raspberry Pi" width="720" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: The Astra DB Refresh will be faster than the Langflow Refresh.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Depending on the intended usage, it might be preferable to add a crontab entry for the PUT call to keep the data recent. The entry below runs at 15 minutes past every hour:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;crontab -e
...
15 * * * * curl -X PUT http://127.0.0.1:8080/weather/astradb/api/latest/station/kmsp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that, we should have a working weather application running on a Raspberry Pi! And without having to mount any pesky sensors in the backyard. Want to try this yourself? Get started with &lt;a href="https://www.datastax.com/products/datastax-astra?utm_medium=byline&amp;amp;utm_campaign=building-a-weather-app-with-raspberry-pi-astra-db-langflow&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; and &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=building-a-weather-app-with-raspberry-pi-astra-db-langflow&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; today!&lt;/p&gt;

&lt;p&gt;Happy Pi Day!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>langflow</category>
      <category>raspberrypi</category>
      <category>ai</category>
    </item>
    <item>
      <title>Top 3 Mistakes I Made While Building AI Agents</title>
      <dc:creator>melienherrera</dc:creator>
      <pubDate>Wed, 12 Mar 2025 15:13:56 +0000</pubDate>
      <link>https://dev.to/datastax/top-3-mistakes-i-made-while-building-ai-agents-ah1</link>
      <guid>https://dev.to/datastax/top-3-mistakes-i-made-while-building-ai-agents-ah1</guid>
      <description>&lt;p&gt;Agents have become a hot topic in AI, and, like many of you, I initially wondered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Why can’t I just prompt my LLM to do this task for me?”&lt;/li&gt;
&lt;li&gt;“What’s the difference between prompting a model versus using an agent?”&lt;/li&gt;
&lt;li&gt;“Oh no, another AI concept that I have to learn?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After diving into agent development, I quickly realized why this approach has generated so much buzz. Unlike simple LLM prompts, agents can interact with external tools, maintain state across multiple steps, and execute complex workflows. Agents are like a personal assistant who can email contacts, write documentation, and schedule appointments – deciding which tool to use when, and understanding the right moment to apply it.&lt;/p&gt;

&lt;p&gt;This journey wasn’t without challenges. Like many developers in their discovery phase, I made mistakes along the way while building my personal assistant app. Each misstep taught me valuable lessons that improved my approach to building agents.&lt;/p&gt;

&lt;p&gt;In this post, I’ll share my top three mistakes, in hopes that by “building in public,” I can help you avoid these same pitfalls. Let’s dive in!&lt;/p&gt;

&lt;h2&gt;
  
  
  Mistake #1: Overestimating the agent’s capabilities
&lt;/h2&gt;

&lt;p&gt;My first mistake when I started building agents was a simple but critical one. I learned that agents have agency: an ability to decide and reason that a base LLM does not have. They can select tools, maintain context, and execute multi-step plans. Because of this, I drastically underestimated the importance of clear, detailed instructions in the agent’s system prompt, and overestimated the agent’s capability to figure things out on its own.&lt;/p&gt;

&lt;p&gt;Newsflash: agents are still powered by LLMs! Agents use LLMs as their core reasoning and decision-making engine. This means that they have the same strengths and the same limitations as their underlying language model.&lt;/p&gt;

&lt;p&gt;I initially created vague prompts like “You are a helpful assistant that can email people, create docs, and other operational tasks. Be clear and concise and maintain a professional tone throughout.” I assumed that because the agent had access to email tools and documentation tools, it would intuitively understand when and how to use them appropriately. However, this was not the case, and my prompt was simply not enough.&lt;/p&gt;

&lt;p&gt;What I learned through trial and error is that while agents add powerful tool-using capabilities, they are not entirely “magic.” They still need the same level of clear guidance and explicit instructions that you’d provide in a direct LLM prompt, perhaps even more so. The agent needs to know not just that it has access to tools, but precisely when to invoke them, how to interpret their outputs, and how to integrate them back into the expected workflow. Here’s my improved prompt:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9kjta958fjite89fh75n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9kjta958fjite89fh75n.png" alt="An image of an improved prompt, starting with " width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I fundamentally misunderstood how to effectively prompt an agent. Once I started writing detailed system prompts like the one above, providing examples where needed, and referencing tool names where I could, my agent’s performance improved dramatically.&lt;/p&gt;

&lt;p&gt;My second critical mistake was attempting to create ONE “super agent” equipped with every possible tool needed for my personal assistant app. I understood the concept of agents and tools, so I began to connect a bunch of these tools to my agent. Remember, the actions I needed the agent to be able to do spanned across document processing, email communication, data retrieval, even basic chatbot capabilities. I thought this would essentially create a powerful, all-in-one assistant.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjrbqa8xjj6qu3a5xcseb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjrbqa8xjj6qu3a5xcseb.png" alt="Example of overloading the agent with multiple tools" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;An example of overloading the agent with multiple tools.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I found out that my agent became overwhelmed with options and struggled to manage context across complex, multi-step requests such as “Access this doc [doc link], summarize it, then draft an email.” The agent would take incorrect steps, forget steps in the process, confuse the Google Docs tool with the URL tool, or skip the tools entirely and hallucinate a response from the LLM. Additionally, response times would increase dramatically depending on how complex the task was.&lt;/p&gt;

&lt;p&gt;The solution came when I restructured my approach to use a multi-agent architecture with specialized components. I created several agents, each with a focused toolset: a document agent, an email agent, and a RAG agent, and I plan to implement more! Connecting these is an orchestrator agent that acts as the “decision-maker” of the app and routes tasks to the appropriate specialized agent based on the user’s request.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4wjg9u0eg0q82ifhb31.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn4wjg9u0eg0q82ifhb31.png" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The orchestrator agent’s role is to understand the user’s intent, break complex requests into subtasks, and delegate them to the right agent. It was now able to handle requests such as the one above “Access this doc [doc link], summarize it, then draft an email” and break it down into something like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First, use GOOGLEDOCS_AGENT to access the doc link&lt;/li&gt;
&lt;li&gt;Second, use LLM to summarize it, and form the content for the email draft&lt;/li&gt;
&lt;li&gt;Third, use GMAIL_AGENT to create the actual email draft for the user to be able to review and easily send it off&lt;/li&gt;
&lt;/ol&gt;
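
&lt;p&gt;The breakdown above can be sketched in a few lines of Python. This is only an illustration of the routing idea, not how Langflow implements it: the &lt;code&gt;Agent&lt;/code&gt; class, the &lt;code&gt;run_plan&lt;/code&gt; function, and the stub handlers are all hypothetical, and the plan is hard-coded where a real orchestrator would ask its LLM to produce one.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A specialized agent with a narrow scope (hypothetical stand-in)."""
    name: str
    description: str
    handle: Callable[[str], str]


# Stub handlers: a real agent would call an LLM plus its own small toolset.
AGENTS: Dict[str, Agent] = {
    "GOOGLEDOCS_AGENT": Agent(
        "GOOGLEDOCS_AGENT",
        "Retrieve, create, or edit Google Docs",
        lambda task: "doc text",
    ),
    "LLM": Agent(
        "LLM",
        "Summarize or rewrite text",
        lambda task: "summary of " + task,
    ),
    "GMAIL_AGENT": Agent(
        "GMAIL_AGENT",
        "Create Gmail drafts for the user to review",
        lambda task: "draft: " + task,
    ),
}


def run_plan(plan: List[str], request: str):
    """Run each step in order, feeding one agent's output to the next."""
    trace = []
    payload = request
    for agent_name in plan:
        payload = AGENTS[agent_name].handle(payload)
        trace.append(agent_name)
    return trace, payload


# A real orchestrator would derive this plan from the request; here it is fixed.
plan = ["GOOGLEDOCS_AGENT", "LLM", "GMAIL_AGENT"]
trace, result = run_plan(plan, "Access this doc, summarize it, then draft an email")
print(trace)
print(result)
```

&lt;p&gt;Each agent in the sketch only knows about its own job, and the orchestrator only needs the agent names and descriptions to decide the order of the steps.&lt;/p&gt;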

&lt;p&gt;This mistake taught me that complex AI workflows benefit from division of labor, just as humans do. Each agent should have a clearly defined scope with the right tools for the specific job. Just make sure you assign and describe those tools and jobs correctly.&lt;/p&gt;

&lt;p&gt;This leads me to my third and probably most critical mistake. After refining my agent prompts and implementing a multi-agent architecture, I thought I was on the right track. But I quickly encountered another obstacle: my tools were not being used correctly, or in some cases, not at all. The cause was that I had not properly named or described each tool for the agent.&lt;/p&gt;

&lt;p&gt;When implementing tools in &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=top-three-mistakes-building-agents&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;, I initially gave them generic names like “Email Tool” or “Docs Tool” with minimal descriptions. I assumed that since I had properly connected the APIs through Composio (a third-party app integration tool) and the functionality worked when tested individually, the agent would inherently understand how to use them. The agent did pick the right tool sometimes, but not reliably.&lt;/p&gt;

&lt;p&gt;I discovered that meaningful tool names and descriptions are critical for the agent’s decision-making process. For example, if the input is “Access this marketing doc and summarize it: [docs link]”, the agent has to match the intent to the appropriate tool. My original Google Docs tool implementation looked something like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name: “Docs Agent”&lt;/li&gt;
&lt;li&gt;Description: “Use this to create and access docs”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With vague names and descriptions, the agent would sometimes pick the wrong tool, fail to use a tool at all, or use it incorrectly. With the above description, it would attempt to use the URL tool instead of accessing the doc through the Google Docs API, which has the proper permissions.&lt;/p&gt;

&lt;p&gt;After recognizing the issue, I implemented a more descriptive name and description:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Name - “GOOGLEDOCS_AGENT”&lt;/li&gt;
&lt;li&gt;Description - “A Google Docs tool with access to the following tools: creating new Google Docs (give relevant titles, context, etc), edit existing Google Docs, retrieve existing Google Docs via the Google Docs link”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1us7n6oxbdhtpib19w0b.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1us7n6oxbdhtpib19w0b.gif" width="1024" height="1024"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The improvement was immediate and significant. With clearer naming conventions and more detailed and prescriptive descriptions, the agent began consistently selecting the right tools for each task. Tool descriptions are essentially API documentation for your agent. Through this mistake, I learned that an agent, just like an LLM, can only be as effective as the information you provide about its available tools.&lt;/p&gt;
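
&lt;p&gt;To see why the wording matters, it helps to look at roughly what the model reads. The sketch below is a hedged illustration, not Langflow or Composio code: &lt;code&gt;ToolSpec&lt;/code&gt; and &lt;code&gt;render_tool_section&lt;/code&gt; are hypothetical names, and real frameworks format tool listings differently (and usually include a parameter schema too).&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolSpec:
    """The text an agent sees about a tool (hypothetical shape)."""
    name: str
    description: str


def render_tool_section(tools: List[ToolSpec]):
    """Build a tool listing of the kind that gets added to a system prompt."""
    lines = ["You can use these tools:"]
    for tool in tools:
        lines.append(f"- {tool.name}: {tool.description}")
    return "\n".join(lines)


vague = ToolSpec("Docs Agent", "Use this to create and access docs")
descriptive = ToolSpec(
    "GOOGLEDOCS_AGENT",
    "Create new Google Docs, edit existing Google Docs, and retrieve "
    "existing Google Docs via the Google Docs link",
)

# Compare what the agent has to work with in each case.
print(render_tool_section([vague]))
print(render_tool_section([descriptive]))
```

&lt;p&gt;With the vague entry, nothing tells the model that a Google Docs link belongs to this tool rather than to a generic URL tool; the descriptive entry states it outright.&lt;/p&gt;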

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;After overcoming these three mistakes – underestimating the importance of prompting, overloading the agent with tools, and poorly implementing tools – I’ve gained valuable insights into effective agent development.&lt;/p&gt;

&lt;p&gt;The most important lesson that I learned is something I feel that I’ve always known: our tools are only as powerful as we make them. Agents are powerful, and they can seem magical, but they aren’t. We still have to provide the right tools, detailed descriptions, and structured architecture in order for them to shine. Tools like &lt;a href="https://langflow.new/ui" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; have really helped me break down the concept of agents, fail fast, and iterate on my mistakes. It’s about giving your agents enough information while avoiding overwhelming them with too many options or vague instructions.&lt;/p&gt;

&lt;p&gt;For those who are embarking on their own agent-building journey, I hope you learned a thing or two from this post. The field of AI agents is still growing and evolving fast – what works well today may change tomorrow as models improve.&lt;/p&gt;

&lt;p&gt;What mistakes have you encountered while getting started with agents? Please start a discussion in our &lt;a href="https://dtsx.io/join-discord" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What are agents and how do they differ from regular LLM prompts?
&lt;/h3&gt;

&lt;p&gt;Unlike simple LLM prompts, agents can interact with external tools, maintain state across multiple steps, and execute complex workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are multi-agents?
&lt;/h3&gt;

&lt;p&gt;“Multi-agents” use specialized AI agents to focus on specific tasks or domains. Instead of using one “super agent” connected to every possible tool, a multi-agent architecture uses dedicated agents for specific functions (like document processing, email management, or data retrieval).&lt;/p&gt;

&lt;h3&gt;
  
  
  What does this personal assistant app do?
&lt;/h3&gt;

&lt;p&gt;This personal assistant uses AI agents to handle both single and multi-step tasks, such as drafting emails, summarizing meeting notes, and retrieving knowledge from a database.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Langflow, and why use it?
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; is a visual IDE for building generative and agentic AI workflows. It simplifies creating complex AI flows, enables quick iteration, and integrates seamlessly with applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  What tools are used?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Langflow – AI app development, agents&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://astra.datastax.com/?utm_medium=byline&amp;amp;utm_campaign=top-three-mistakes-building-agents&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; – Vector database, data retrieval, RAG&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://composio.dev/" rel="noopener noreferrer"&gt;Composio&lt;/a&gt; – Application integration platform for AI Agents and LLMs, handles Gmail and Google Doc API integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where can I find the flow file?
&lt;/h3&gt;

&lt;p&gt;On my GitHub: &lt;a href="https://github.com/melienherrera/personal-assistant-langflow" rel="noopener noreferrer"&gt;https://github.com/melienherrera/personal-assistant-langflow&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>Introducing Astra DB for AI Agents: A New Era of Database Interaction</title>
      <dc:creator>Tejas Kumar</dc:creator>
      <pubDate>Wed, 12 Mar 2025 08:06:35 +0000</pubDate>
      <link>https://dev.to/datastax/introducing-astra-db-for-ai-agents-a-new-era-of-database-interaction-1364</link>
      <guid>https://dev.to/datastax/introducing-astra-db-for-ai-agents-a-new-era-of-database-interaction-1364</guid>
      <description>&lt;p&gt;Today we're thrilled to unveil a new way of interacting with our flagship vector database, Astra DB. Say hello to Astra DB over MCP—an innovative way to communicate with your database that leverages the Model Context Protocol (MCP) to let you create and manage databases without writing a single line of code.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/wRR-SzcI0zc"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Model Context Protocol (MCP)?
&lt;/h2&gt;

&lt;p&gt;MCP is an open protocol &lt;a href="https://www.anthropic.com/news/model-context-protocol" rel="noopener noreferrer"&gt;introduced by Anthropic&lt;/a&gt; in late 2024. It’s a standardized way of sharing context between language models and tools. This means that any MCP server can communicate with any MCP client, enabling language models to execute functions agentically on your behalf. Imagine being able to hand off entire functions to an AI: MCP makes that possible.&lt;/p&gt;

&lt;p&gt;For example, popular MCP clients include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;Claude Desktop&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.cursor.com/" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both can consume data from MCP servers and act agentically as a result. Let’s get hands-on with Astra DB over MCP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hands-on with Astra DB over MCP
&lt;/h2&gt;

&lt;p&gt;In our demo, we explore how Astra DB over MCP unlocks a new way of interacting with your data. Let’s walk through the process.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Set up your Astra DB environment
&lt;/h3&gt;

&lt;p&gt;To get started, you need an Astra DB application token and an API endpoint. To get these, you’ll have to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sign up for Astra DB -&lt;/strong&gt; It’s free and quick. Just &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_campaign=astra-db-for-ai-agents-mcp&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;sign up&lt;/a&gt;, create a database, and you’ll receive your API endpoint along with an application token. Here are &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/dataapiclient.html#set-environment-variables?utm_medium=byline&amp;amp;utm_campaign=astra-db-for-ai-agents-mcp&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;more detailed instructions&lt;/a&gt; to do so.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create your database -&lt;/strong&gt; For this demo, we set up a vector database named “my_mcp_db.” Astra DB’s multi-cloud capability means you can choose your preferred region, and the database is ready within minutes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Integrating with an MCP client
&lt;/h3&gt;

&lt;p&gt;Once your Astra DB instance is ready, you can integrate it with an MCP client like Claude Desktop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Configure Claude Desktop -&lt;/strong&gt; Open the app and go to Preferences → Developer → Edit Config. This will take you to a JSON file. Paste in a JSON configuration snippet that includes your DB token and API endpoint.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Launch and verify -&lt;/strong&gt; Restart Claude Desktop and watch as it connects to Astra DB—instantly revealing 10 available MCP tools.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
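
&lt;p&gt;The configuration for the first step might look like the sketch below. It mirrors the package name and environment variables used in the Cursor command later in this post; the &lt;code&gt;astra-db-mcp&lt;/code&gt; key is just a label chosen here for illustration, and the token and endpoint values are placeholders to replace with your own.&lt;/p&gt;

```json
{
  "mcpServers": {
    "astra-db-mcp": {
      "command": "npx",
      "args": ["-y", "@datastax/astra-db-mcp"],
      "env": {
        "ASTRA_DB_APPLICATION_TOKEN": "your_astra_db_token",
        "ASTRA_DB_API_ENDPOINT": "your_astra_db_endpoint"
      }
    }
  }
}
```

&lt;p&gt;If the config file already lists other MCP servers, add this entry alongside them under the same &lt;code&gt;mcpServers&lt;/code&gt; object rather than replacing the file.&lt;/p&gt;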

&lt;p&gt;From here, you can ask Claude to do anything you like inside your database: create collections, insert data, clean up, and more. This is a handy way of interacting with your database via an AI assistant, but we can do more when we use Cursor as our MCP client.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a full, end-to-end application with a UI, database, and API
&lt;/h2&gt;

&lt;p&gt;The real magic happens when you use Astra DB over MCP in Cursor. To set this up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Go to Settings -&amp;gt; Cursor Settings -&amp;gt; MCP&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;From there, you can add the server by clicking the "+ Add New MCP Server" button and entering the following values:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Name - Whatever you want&lt;/li&gt;
&lt;li&gt;Type - Command&lt;/li&gt;
&lt;li&gt;Command -
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;env ASTRA_DB_APPLICATION_TOKEN=your_astra_db_token ASTRA_DB_API_ENDPOINT=your_astra_db_endpoint npx -y @datastax/astra-db-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once added, your editor will be fully connected to your Astra DB database.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnd1zc2a77nbggkcrvmd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frnd1zc2a77nbggkcrvmd.png" alt="an image showing your editor connected to your database." width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now you can invoke the Cursor agent (by pressing Cmd+I on macOS) and ask it to build anything you want: whenever a database is needed, it will automatically operate Astra DB to do whatever is required.&lt;/p&gt;

&lt;p&gt;In our demo in the video above, the language model agent executes a series of tasks: from setting up the collection to auto-generating Next.js route handlers and fixing UI issues on the fly. The result? A fully functional to-do list app powered entirely by Astra DB over MCP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Astra DB over MCP demonstrates the incredible potential of combining any tool with AI agents. By enabling agentic interactions between tools and language models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Developers can accelerate time to production without the overhead of boilerplate code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Non-technical users can create applications that are normally reserved for seasoned programmers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Innovation is democratized, letting you build everything from a Twitter clone to a YouTube replica with minimal effort.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s next?
&lt;/h2&gt;

&lt;p&gt;We’re excited to see what you’ll build using this new mode of development. Whether you’re a developer, a startup founder, or a tech enthusiast, Astra DB over MCP opens up a world of possibilities. So, what will you create? &lt;a href="https://discord.com/invite/datastax" rel="noopener noreferrer"&gt;Join the conversation on Discord&lt;/a&gt;, try out Astra DB over MCP, and let us know how you’re leveraging the power of agentic database interactions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. What is Astra DB over MCP?
&lt;/h3&gt;

&lt;p&gt;Astra DB over MCP is a new method of interacting with our flagship vector database, Astra DB, using the Model Context Protocol. It allows you to perform database operations through prompts with an AI agent—without writing any code.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. What is the Model Context Protocol (MCP)?
&lt;/h3&gt;

&lt;p&gt;MCP is an open standard, first pioneered by Anthropic in late 2024, that enables seamless communication between language models and external tools. It allows AI systems to share context and execute functions agentically on your behalf.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Do I need to write any code to interact with Astra DB over MCP?
&lt;/h3&gt;

&lt;p&gt;No! One of the key benefits of this new integration is that you can perform complex database operations, such as creating collections, inserting data, and building entire applications, without writing a single line of code. MCP clients such as Claude Desktop and Cursor manage all the interactions for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. How do I get started with Astra DB over MCP?
&lt;/h3&gt;

&lt;p&gt;Simply &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_campaign=astra-db-for-ai-agents-mcp&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;sign up for Astra DB&lt;/a&gt;, create your database to receive an API endpoint and an application token, and then configure your MCP client by updating its settings with these credentials.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. What are some examples of MCP clients?
&lt;/h3&gt;

&lt;p&gt;Popular MCP clients include &lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;Claude Desktop&lt;/a&gt;, a desktop application for interacting with AI models, and &lt;a href="https://www.cursor.com/" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;, an AI-enabled code editor based on VS Code that integrates MCP tools directly into your development workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Is Astra DB over MCP an open-source project?
&lt;/h3&gt;

&lt;p&gt;Yes, Astra DB over MCP is an open-source project. You can &lt;a href="http://github.com/datastax/astra-db-mcp" rel="noopener noreferrer"&gt;access the code and contribute to its development via GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Where can I get help if I encounter issues?
&lt;/h3&gt;

&lt;p&gt;You can refer to our detailed documentation, check out the GitHub repository for troubleshooting tips, or &lt;a href="https://discord.com/invite/datastax" rel="noopener noreferrer"&gt;join our Discord community&lt;/a&gt; where fellow users and developers share advice and best practices.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>claude</category>
      <category>cursor</category>
    </item>
    <item>
      <title>5 GenAI Things You Didn't Know About Astra DB</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Thu, 06 Mar 2025 23:07:23 +0000</pubDate>
      <link>https://dev.to/datastax/5-genai-things-you-didnt-know-about-astra-db-3am9</link>
      <guid>https://dev.to/datastax/5-genai-things-you-didnt-know-about-astra-db-3am9</guid>
      <description>&lt;p&gt;Astra DB is a high-performance NoSQL database powered by Apache Cassandra® with built-in vector search, but that's just what &lt;a href="https://www.datastax.com/products/datastax-astra?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;the product page&lt;/a&gt; says. Not everything fits onto one page, so I wanted to share a few things that you might not already know about Astra DB and how it helps you to build accurate, low-latency, retrieval-augmented generation (RAG) powered generative AI apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB can create vector embeddings for you
&lt;/h2&gt;

&lt;p&gt;When ingesting data for a RAG application, there are several steps you need to take: document loading, text parsing, &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;chunking text&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-node-js?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt;, and storing it in the database. Astra DB can simplify the process by combining those last two steps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.datastax.com/blog/simplifying-vector-embedding-generation-with-astra-vectorize?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize&lt;/a&gt; can create vector embeddings for your text chunks at the point of inserting them into the collection.&lt;/p&gt;

&lt;p&gt;When you create an Astra DB collection, you can choose one of the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html#supported-embedding-providers?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;supported embedding models&lt;/a&gt;. There are models available from OpenAI (including Azure OpenAI), Voyage AI, Mistral AI, Jina AI, and Upstage. Astra DB also hosts NVIDIA embedding models that run in the same environment as the database, boosting performance—&lt;a href="https://www.datastax.com/blog/build-generative-ai-wikidata-datastax-nvidia#ingestion-workflow-3?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Wikidata reduced their data ingestion time from 30 days to two with Vectorize&lt;/a&gt;—and ensuring the data never leaves the database.&lt;/p&gt;

&lt;p&gt;Once you have set up your collection with your embedding provider of choice, ingesting data with Vectorize is a case of providing the text you want turned into a vector as a special &lt;code&gt;$vectorize&lt;/code&gt; property in the documents you are storing. In TypeScript, this looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertOne&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A robot may not injure a human being or, through inaction, allow a human being to come to harm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
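
&lt;p&gt;The same property applies to each document when you store many chunks at once. As a rough Python sketch of preparing such a batch: the helper name &lt;code&gt;to_vectorize_documents&lt;/code&gt; and the &lt;code&gt;source&lt;/code&gt; field are hypothetical, chosen only for illustration, and the final comment assumes the astrapy Python client rather than the TypeScript client shown above.&lt;/p&gt;

```python
def to_vectorize_documents(chunks, source):
    """Build one document per non-empty text chunk, ready for Vectorize."""
    return [
        {"$vectorize": chunk.strip(), "source": source}
        for chunk in chunks
        if chunk.strip()
    ]


# Toy input: the empty chunk is skipped and whitespace is trimmed.
chunks = ["A robot may not injure a human being.  ", "", "A robot must obey orders."]
documents = to_vectorize_documents(chunks, source="three-laws.txt")

# Assumption: with the astrapy Python client, a list like this could then be
# stored in one call, for example collection.insert_many(documents).
```

&lt;p&gt;Keeping the document-building step separate from the database call makes it easy to inspect what will be embedded before anything is written.&lt;/p&gt;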



&lt;p&gt;Then, to perform a vector search against this collection, you sort by the &lt;code&gt;$vectorize&lt;/code&gt; field, passing your query text as its value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Are robots allowed to protect themselves?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can learn more about &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Astra Vectorize in the documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB supports graph RAG
&lt;/h2&gt;

&lt;p&gt;Depending on your data, regular vector search can sometimes miss context, which makes it harder for &lt;a href="https://www.datastax.com/guides/what-is-a-large-language-model?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;large language models (LLMs)&lt;/a&gt; to answer certain queries. &lt;a href="https://www.datastax.com/guides/graph-rag?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Graph RAG&lt;/a&gt; is a technique that takes your documents, extracts links between them, and uses those links to retrieve extra contextual information at the retrieval stage. Providing extra linked context to an LLM makes for &lt;a href="https://www.datastax.com/blog/better-llm-integration-and-relevancy-with-content-centric-knowledge-graphs?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;more accurate and informed answers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Astra DB supports graph RAG via LangChain. You can replace the &lt;code&gt;AstraDBVectorStore&lt;/code&gt; with &lt;a href="https://python.langchain.com/api_reference/astradb/graph_vectorstores/langchain_astradb.graph_vectorstores.AstraDBGraphVectorStore.html" rel="noopener noreferrer"&gt;&lt;code&gt;AstraDBGraphVectorStore&lt;/code&gt;&lt;/a&gt; and ensure you ingest your data in a way that extracts the links between documents. A simplified ingestion example that reads a URL, extracts HTML links, strips the HTML, and splits the text into chunks before storing in Astra DB (using Astra Vectorize to create embeddings) might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncHtmlLoader&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.graph_vectorstores.extractors&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;HtmlLinkExtractor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LinkExtractorTransformer&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoupTransformer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_astradb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AstraDBGraphVectorStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CollectionVectorServiceOptions&lt;/span&gt;

&lt;span class="n"&gt;vectorize_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CollectionVectorServiceOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nvidia&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NV-Embed-QA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AstraDBGraphVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;graph&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;api_endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;collection_vector_service_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;vectorize_options&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.datastax.com/guides/graph-rag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.datastax.com/blog/build-graph-rag-with-unstructured-and-astra-db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncHtmlLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;transformer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinkExtractorTransformer&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nc"&gt;HtmlLinkExtractor&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;as_document_extractor&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
&lt;span class="n"&gt;bs4_transformer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoupTransformer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;transformer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bs4_transformer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transform_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, to search Astra DB, you can use the graph store's &lt;code&gt;traversal_search&lt;/code&gt; method to first retrieve a number of document chunks (&lt;em&gt;k&lt;/em&gt;) before traversing the graph to the specified depth for additional chunks. In this case, the search first finds four chunks using a similarity search and then traverses the graph to a depth of two to return related chunks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;traversal_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;traversal_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What are the differences between Graph RAG and naive RAG?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
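
&lt;p&gt;To make the &lt;em&gt;k&lt;/em&gt; and depth parameters concrete, here is a minimal, library-free sketch of the idea behind a traversal search. It is not the LangChain implementation: the &lt;code&gt;traverse&lt;/code&gt; function, the toy &lt;code&gt;links&lt;/code&gt; graph, and the chunk names are all hypothetical, and the seed list stands in for the chunks that a similarity search would return.&lt;/p&gt;

```python
from collections import deque


def traverse(seeds, links, depth):
    """Return the seed chunks plus every chunk reachable within `depth` links."""
    seen = list(seeds)
    queue = deque((chunk, 0) for chunk in seeds)
    while queue:
        chunk, distance = queue.popleft()
        if distance == depth:
            continue  # do not follow links beyond the requested depth
        for neighbour in links.get(chunk, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append((neighbour, distance + 1))
    return seen


# Hypothetical link graph: chunk "a" links to "b", "b" to "c", "c" to "d".
links = {"a": ["b"], "b": ["c"], "c": ["d"]}

# One seed chunk and a depth of two reaches "b" and "c", but not "d".
found = traverse(["a"], links, depth=2)
```

&lt;p&gt;A larger depth pulls in more linked context at the cost of returning more chunks to pass to the LLM.&lt;/p&gt;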



&lt;p&gt;Check out this &lt;a href="https://www.datastax.com/blog/build-graph-rag-with-unstructured-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;full tutorial on building graph RAG with Unstructured and Astra DB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB supports ColBERT
&lt;/h2&gt;

&lt;p&gt;Graph RAG can help if your context is spread across chunks, but there are other situations where graph RAG won't necessarily help. If your data contains terms that aren't in the training data of your embedding model, it can be difficult to get accurate similarity search results.&lt;/p&gt;

&lt;p&gt;One way to overcome this is to use &lt;a href="https://thenewstack.io/overcoming-the-limits-of-rag-with-colbert/" rel="noopener noreferrer"&gt;ColBERT&lt;/a&gt;. ColBERT creates a vector per token in a body of text, creating a sliding window of context over entire passages and capturing unknown context much better. This does require more storage for the extra vectors, but if accuracy is your priority, it’s worthwhile.&lt;/p&gt;
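
&lt;p&gt;Because there is a vector per token, a ColBERT-style score compares every query token with every document token: each query token keeps its best match, and those best matches are summed. The sketch below illustrates that scoring step only, using made-up two-dimensional vectors; the function names are hypothetical and real ColBERT embeddings are learned and much larger.&lt;/p&gt;

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vectors, doc_vectors):
    """Sum, over query tokens, of the best match among the document tokens."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Toy "token embeddings", invented for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has a close match for each query token
doc_b = [[0.5, 0.5]]                          # only a partial match for both

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

&lt;p&gt;Here &lt;code&gt;doc_a&lt;/code&gt; scores higher than &lt;code&gt;doc_b&lt;/code&gt; because each query token finds a strong match among its token vectors, which is also why the extra vectors need to be stored.&lt;/p&gt;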

&lt;p&gt;You can use ColBERT with Astra DB in LangChain by using the RAGStack implementation.&lt;/p&gt;

&lt;p&gt;To ingest the data, you can use the &lt;code&gt;ColbertEmbeddingModel&lt;/code&gt; and &lt;code&gt;ColbertVectorStore&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertVectorStore&lt;/span&gt;

&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_astra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;astra_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;database_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ASTRA_DB_DATABASE_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;keyspace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default_keyspace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_texts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_LIST_OF_TEXTS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;myDocs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then performing a similarity search is pretty much the same as any other vector store search in LangChain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ColbertEmbeddingModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;ragstack_langchain.colbert&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ColbertVectorStore&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;LangchainColbertVectorStore&lt;/span&gt;

&lt;span class="n"&gt;colbert_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ColbertEmbeddingModel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;colbert_database&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CassandraDatabase&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_astra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;astra_token&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_ASTRA_DB_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;database_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;YOUR_ASTRA_DB_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;keyspace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default_keyspace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vector_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LangchainColbertVectorStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;colbert_database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;embedding_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;colbert_embedding&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is ColBERT?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vector_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;similarity_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check out this full tutorial on &lt;a href="https://www.datastax.com/blog/highly-accurate-retrieval-for-your-rag-application-with-colbert-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;using ColBERT with Astra DB&lt;/a&gt;, or for a faster alternative, &lt;a href="https://www.datastax.com/blog/colbert-live-makes-your-vector-database-smarter?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Jonathan Ellis's ColBERT Live!, which uses Answer AI's colbert-small-v1 model and is supported by Astra DB&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB indexes your vectors live
&lt;/h2&gt;

&lt;p&gt;Your vector database needs to be both accurate and speedy to ensure the performance of your application. When you are ingesting or updating data in your collection, rebuilding the index takes time and leaves you with slow queries or out-of-date data.&lt;/p&gt;

&lt;p&gt;Astra DB's vector indexing capabilities are a combination of &lt;a href="https://docs.datastax.com/en/cql/hcd/develop/indexing/sai/sai-overview.html?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Cassandra's storage-attached indexing (SAI)&lt;/a&gt; and &lt;a href="https://github.com/jbellis/jvector" rel="noopener noreferrer"&gt;JVector, a non-blocking, concurrent, graph-based vector index&lt;/a&gt;. What this means is that Astra DB doesn't need to rebuild or block access to its index when you are inserting vectors; the index is updated live.&lt;/p&gt;

&lt;p&gt;The upshot of this is high throughput and accuracy even under mixed loads of reads and writes. Check out &lt;a href="https://www.datastax.com/blog/5-vector-search-challenges-and-how-we-solved-them-in-apache-cassandra#problem-3-concurrency-4?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;this benchmark of throughput and accuracy against Pinecone&lt;/a&gt;, particularly when Pinecone is performing indexing. Astra DB doesn't sacrifice throughput or accuracy under load; it will always be there for your application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB is integrated in all your favourite frameworks
&lt;/h2&gt;

&lt;p&gt;We've seen so far in this post that &lt;a href="https://python.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;Astra DB is available in LangChain&lt;/a&gt;, but you can also find it in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://js.langchain.com/docs/integrations/vectorstores/astradb/" rel="noopener noreferrer"&gt;LangChain.JS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.llamaindex.ai/en/stable/examples/vector_stores/AstraDBIndexDemo/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt; and &lt;a href="https://ts.llamaindex.ai/docs/api/classes/AstraDBVectorStore" rel="noopener noreferrer"&gt;LlamaIndex.TS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/using-genai-to-find-a-needle-with-haystack-and-astra-db?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Haystack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://mastra.ai/examples/rag/insert-embedding-in-astra" rel="noopener noreferrer"&gt;Mastra&lt;/a&gt; (a newer framework, built by the team behind &lt;a href="https://www.gatsbyjs.com/" rel="noopener noreferrer"&gt;Gatsby&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And of course Astra DB is integrated into &lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;. Deeply integrated! Once you enter your application token into the Astra DB component, your databases will automatically load. Then once you select your database, you can pick the collection you need too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhydqtvi2dq3i1i38wb9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhydqtvi2dq3i1i38wb9.gif" alt="An animation showing how to use the Astra DB Langflow component. After you set an Application Token a dropdown option for Database appears, populated with your databases. Once you pick a database, a dropdown option for Collection appears allowing you to pick the collection to use." width="412" height="760"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can even create a new database from within Langflow. Oh, and Langflow supports using Astra Vectorize when ingesting or performing vector search too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u2p8mqguohxqve1i2jg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3u2p8mqguohxqve1i2jg.gif" alt="An animation showing an Astra DB component in Langflow. When changing the collection to a Vectorize powered collection, the component updates to Astra Vectorize and disconnects the embedding model, which we then delete." width="600" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Langflow is a great visual way to build agents, and Astra DB makes it easy to build RAG or agentic RAG within Langflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Astra DB is ready to help you build transformative AI
&lt;/h2&gt;

&lt;p&gt;Whether you're looking to build with Langflow or any number of other frameworks, or try out alternative vector searches like graph RAG or ColBERT, Astra DB is there to help. And it will do it quickly, creating vectors for you via Vectorize and indexing them live so your data is always up to date.&lt;/p&gt;

&lt;p&gt;There are so many different applications you can build; check out examples like this &lt;a href="https://www.datastax.com/blog/building-resumai-langflow-astra-db-openai?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;AI resume assistant&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/rag-voice-agent-twilio-openai-astra-db-node-js?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;RAG-powered voice agent&lt;/a&gt;, or &lt;a href="https://www.datastax.com/blog/building-hum-to-search-music-recognition-app-vector-search?utm_medium=byline&amp;amp;utm_campaign=five-genai-things-about-astra-db&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;hum-to-search music recognition app&lt;/a&gt;, all powered by Astra DB.&lt;/p&gt;

&lt;p&gt;From chat bots to autonomous agents, Astra DB supports you in building the GenAI apps that are going to transform your business.&lt;/p&gt;

</description>
      <category>genai</category>
      <category>vectordatabase</category>
      <category>rag</category>
    </item>
    <item>
      <title>How to Stream Responses from the Langflow API in Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 05 Mar 2025 21:34:53 +0000</pubDate>
      <link>https://dev.to/datastax/how-to-stream-responses-from-the-langflow-api-in-nodejs-41l5</link>
      <guid>https://dev.to/datastax/how-to-stream-responses-from-the-langflow-api-in-nodejs-41l5</guid>
      <description>&lt;p&gt;Building flows and AI agents in Langflow is one of the fastest ways to experiment with generative AI. Once you've built your flow, you’ll want to integrate it into your own application. Langflow exposes an API for this; &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;we’ve written before about how to use it in Node.js&lt;/a&gt;. We've also seen that &lt;a href="https://www.datastax.com/blog/fetch-streams-api-for-faster-ux-generative-ai-apps?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;streaming GenAI outputs makes for a better user experience&lt;/a&gt;. So today, we're going to combine the two and show you how to stream results from your Langflow flows in Node.js.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvt09ihkjb1c9tg09gpkf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvt09ihkjb1c9tg09gpkf.png" alt="Example code for using the JavaScript Langflow client. It reads:  import { LangflowClient } from " width="800" height="482"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Using the Langflow client
&lt;/h2&gt;

&lt;p&gt;The easiest way to use the Langflow API is with the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client" rel="noopener noreferrer"&gt;@datastax/langflow-client npm module&lt;/a&gt;. You can get started with the client by installing the module with npm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @datastax/langflow-client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Langflow client can be used with both self-hosted and DataStax-hosted Langflow. You can see in-depth examples of &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js#initializing-for-datastax-hosted-langflow-2?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;how to set it up for either version of Langflow in this blog post&lt;/a&gt;. But the quick version is that for either type of Langflow, you start by importing the client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LangflowClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/langflow-client&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For self-hosted Langflow you need the URL where you’re hosting Langflow and, if you've set up user authorisation, an &lt;a href="https://docs.langflow.org/configuration-api-keys" rel="noopener noreferrer"&gt;API key&lt;/a&gt;. You then initialise the client with both:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://localhost:7860&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;baseURL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For DataStax-hosted Langflow, &lt;a href="https://www.datastax.com/blog/use-langflow-api-in-node-js#initializing-for-datastax-hosted-langflow-2?utm_medium=byline&amp;amp;utm_campaign=how-to-stream-responses-from-langflow-api-in-node-js&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;you need your Langflow ID&lt;/a&gt; and to generate an API key. Then you create a client with the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_LANGFLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LangflowClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;langflowId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
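
&lt;p&gt;Hard-coding credentials is fine for a quick test, but you will probably want to read them from the environment. As a minimal sketch, the hypothetical helper below builds the options object for either kind of Langflow. The variable names &lt;code&gt;LANGFLOW_BASE_URL&lt;/code&gt;, &lt;code&gt;LANGFLOW_ID&lt;/code&gt; and &lt;code&gt;LANGFLOW_API_KEY&lt;/code&gt; are assumptions for this example, not names the client looks for itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical helper: the environment variable names are assumptions for
// this sketch, not something the Langflow client reads on its own.
function langflowOptionsFromEnv(env = process.env) {
  const apiKey = env.LANGFLOW_API_KEY;
  if (env.LANGFLOW_ID) {
    // DataStax-hosted Langflow needs a Langflow ID and an API key.
    if (!apiKey) {
      throw new Error("LANGFLOW_API_KEY is required for DataStax-hosted Langflow");
    }
    return { langflowId: env.LANGFLOW_ID, apiKey };
  }
  // Self-hosted Langflow needs a URL; the API key is only needed if you have
  // set up user authorisation.
  const baseURL = env.LANGFLOW_BASE_URL ? env.LANGFLOW_BASE_URL : "http://localhost:7860";
  return apiKey ? { baseURL, apiKey } : { baseURL };
}

// const client = new LangflowClient(langflowOptionsFromEnv());
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;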



&lt;h2&gt;
  
  
  Streaming with the Langflow client
&lt;/h2&gt;

&lt;p&gt;To stream through the API, you need a flow that’s set up for streaming responses. A streaming flow needs a model that supports streaming, with its stream flag turned on, connected to a chat output. The basic prompting flow, with streaming turned on, meets these requirements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qgitvo2l7u4vbotakif.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qgitvo2l7u4vbotakif.png" alt="A screenshot of a Langflow flow with a chat input and prompt that are both connected to the OpenAI model. The OpenAI model component has the Stream setting enabled and it is connected to a Chat Output component." width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you don't already have a flow, you can use the basic prompting flow as an example.&lt;/p&gt;

&lt;p&gt;Once you have your flow in place, open the API modal and get the flow ID.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevtcg4n1p5ge30jdvbxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fevtcg4n1p5ge30jdvbxl.png" alt="A screenshot of the Langflow API modal. It shows the API URL and points out that the flow ID can be found in the URL after api/v1/run." width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the flow ID and the Langflow client, you can create a flow object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flowId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_FLOW_ID&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;flowId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To stream a response from the flow, you can use the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client#streaming" rel="noopener noreferrer"&gt;&lt;code&gt;stream&lt;/code&gt; function&lt;/a&gt;. The response is a &lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream#examples" rel="noopener noreferrer"&gt;&lt;code&gt;ReadableStream&lt;/code&gt;&lt;/a&gt; that you can iterate over asynchronously.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are three types of event that the stream emits; this is what each of them means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;add_message&lt;/code&gt;: a message has been added to the chat. It can refer to a human input message or a response from an AI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;token&lt;/code&gt;: a token has been emitted as part of a message being generated by the model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;end&lt;/code&gt;: all tokens have been returned; this event also contains the same full response that you get from a non-streaming request.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to log just the text from a flow response, you can do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8sltv8sbs7dknlt3s2z5.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8sltv8sbs7dknlt3s2z5.gif" alt="An animation of a terminal program running the code that logs each chunk. It logs its response one word at a time, but quickly." width="558" height="404"&gt;&lt;/a&gt;&lt;/p&gt;
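
&lt;p&gt;If you need the complete text as well as the individual chunks, you can accumulate the &lt;code&gt;token&lt;/code&gt; events as they arrive. The sketch below uses a hypothetical &lt;code&gt;mockStream&lt;/code&gt; generator in place of &lt;code&gt;flow.stream&lt;/code&gt; so that it runs on its own. The token events have the same shape as above, but the data on the &lt;code&gt;add_message&lt;/code&gt; and &lt;code&gt;end&lt;/code&gt; events is left empty because its exact shape isn't covered here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical stand-in for `await flow.stream(...)`, so the sketch runs
// without a Langflow server. Only the token event shape comes from above.
async function* mockStream() {
  yield { event: "add_message", data: {} };
  yield { event: "token", data: { chunk: "Hello" } };
  yield { event: "token", data: { chunk: ", world" } };
  yield { event: "end", data: {} };
}

// Collect every token chunk into one string, calling the optional onChunk
// callback as each chunk arrives.
async function collectText(stream, onChunk) {
  let text = "";
  for await (const event of stream) {
    if (event.event === "token") {
      text += event.data.chunk;
      if (onChunk) onChunk(event.data.chunk);
    }
  }
  return text;
}

async function main() {
  // Write each chunk as it arrives, then log the accumulated text.
  const text = await collectText(mockStream(), function (chunk) {
    process.stdout.write(chunk);
  });
  console.log("\nFull response:", text);
}

main();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a real flow, you would pass the result of &lt;code&gt;await flow.stream(...)&lt;/code&gt; to &lt;code&gt;collectText&lt;/code&gt; instead of the mock.&lt;/p&gt;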

&lt;p&gt;The stream function takes all the same arguments as the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client#calling-a-flow" rel="noopener noreferrer"&gt;run function&lt;/a&gt;, so you can provide tweaks for your components, too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating with Express
&lt;/h2&gt;

&lt;p&gt;If you want to make an API request from an &lt;a href="https://expressjs.com/" rel="noopener noreferrer"&gt;Express server&lt;/a&gt; and then stream it to your own front-end, you can do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Transfer-Encoding&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;chunked&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;flow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Hello, how are you?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
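
&lt;p&gt;The handler above assumes that the stream always succeeds. As a sketch of one way to handle failures, the hypothetical &lt;code&gt;createStreamHandler&lt;/code&gt; function below wraps the same loop in a &lt;code&gt;try&lt;/code&gt;/&lt;code&gt;catch&lt;/code&gt;: if the flow fails before any text has been written it sets a 500 status, and either way it ends the response. It also reads the prompt from an assumed &lt;code&gt;prompt&lt;/code&gt; query parameter, falling back to the fixed prompt used above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical helper: builds an Express route handler around any object
// that has a stream function, such as the flow object created earlier.
function createStreamHandler(flow) {
  return async function streamHandler(req, res) {
    // Assumed query parameter, for example /stream?prompt=Hi
    const prompt =
      typeof req.query.prompt === "string" ? req.query.prompt : "Hello, how are you?";
    try {
      const response = await flow.stream(prompt);
      res.set("Content-Type", "text/plain");
      res.set("Transfer-Encoding", "chunked");
      for await (const event of response) {
        if (event.event === "token") {
          res.write(event.data.chunk);
        }
      }
    } catch (error) {
      console.error("Streaming from Langflow failed:", error);
      // Only change the status if nothing has been sent to the client yet.
      if (!res.headersSent) {
        res.status(500);
      }
    } finally {
      res.end();
    }
  };
}

// app.get("/stream", createStreamHandler(flow));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;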



&lt;p&gt;We explored &lt;a href="https://www.datastax.com/blog/fetch-streams-api-for-faster-ux-generative-ai-apps" rel="noopener noreferrer"&gt;how you can handle a stream on the front-end in this blog post&lt;/a&gt;.&lt;/p&gt;
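
&lt;p&gt;As a brief sketch of the front-end side, you can read the chunked response from the &lt;code&gt;/stream&lt;/code&gt; endpoint above with &lt;code&gt;fetch&lt;/code&gt; and the response body's reader. The &lt;code&gt;readTextStream&lt;/code&gt; name, the &lt;code&gt;onChunk&lt;/code&gt; callback and the &lt;code&gt;#output&lt;/code&gt; element are assumptions for this example; &lt;code&gt;getReader&lt;/code&gt; and &lt;code&gt;TextDecoder&lt;/code&gt; are standard web APIs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Read a streamed text response chunk by chunk, passing each piece of decoded
// text to the optional onChunk callback and returning the full text at the end.
async function readTextStream(response, onChunk) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    const chunk = decoder.decode(value, { stream: true });
    text += chunk;
    if (onChunk) onChunk(chunk);
  }
  return text;
}

// In the browser, with a hypothetical element to render into:
// const response = await fetch("/stream");
// await readTextStream(response, function (chunk) {
//   document.querySelector("#output").textContent += chunk;
// });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;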

&lt;h2&gt;
  
  
  Stream your flows
&lt;/h2&gt;

&lt;p&gt;Langflow enables you to rapidly build, experiment with, and deploy GenAI applications, and with the &lt;a href="https://www.npmjs.com/package/@datastax/langflow-client" rel="noopener noreferrer"&gt;JavaScript Langflow client&lt;/a&gt; you can easily stream their responses in your JavaScript applications.&lt;/p&gt;

&lt;p&gt;Please do try out the Langflow client; if you have any issues, please raise them on &lt;a href="https://github.com/datastax/langflow-client-ts" rel="noopener noreferrer"&gt;the GitHub repo&lt;/a&gt;. If you're looking for more inspiration for building AI agents with Langflow, check out these posts that cover how to &lt;a href="https://www.datastax.com/blog/build-simple-ai-agent-with-langflow-composio" rel="noopener noreferrer"&gt;build an agent that can manage your calendar with Langflow and Composio&lt;/a&gt; or see how you can &lt;a href="https://www.datastax.com/blog/local-ai-using-ollama-with-agents" rel="noopener noreferrer"&gt;build local agents with Langflow and Ollama&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>node</category>
      <category>ai</category>
      <category>langflow</category>
      <category>genai</category>
    </item>
    <item>
      <title>Build a RAG-Powered Voice Agent with Twilio Voice, OpenAI, Astra DB, and Node.js</title>
      <dc:creator>Phil Nash</dc:creator>
      <pubDate>Wed, 19 Feb 2025 23:08:48 +0000</pubDate>
      <link>https://dev.to/datastax/build-a-rag-powered-voice-agent-with-twilio-voice-openai-astra-db-and-nodejs-266n</link>
      <guid>https://dev.to/datastax/build-a-rag-powered-voice-agent-with-twilio-voice-openai-astra-db-and-nodejs-266n</guid>
      <description>&lt;p&gt;With the &lt;a href="https://platform.openai.com/docs/guides/realtime" rel="noopener noreferrer"&gt;OpenAI Realtime API&lt;/a&gt;, you can build speech-to-speech applications that let you interact directly with a generative AI model by speaking with it. Talking directly to a model feels really natural, and the Realtime API makes it possible to build experiences like this into your own applications and businesses.&lt;/p&gt;

&lt;p&gt;One example of this was built by Twilio: it enables you to &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;connect a phone call to GPT-4o with Node.js&lt;/a&gt; (or, if you prefer, &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-python" rel="noopener noreferrer"&gt;Python&lt;/a&gt;). The example is great, but it only shows connecting to a plain GPT-4o with a system prompt that encourages owl facts and jokes. Much as I like owl facts, I wanted to see what else we could achieve with a voice agent like this.&lt;/p&gt;

&lt;p&gt;In this post, we'll show you how to extend the original assistant into an agent that can choose to use tools to augment its response. We'll give it additional, up-to-date knowledge via &lt;a href="https://www.datastax.com/guides/what-is-retrieval-augmented-generation?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;retrieval-augmented generation (RAG)&lt;/a&gt; using Astra DB.&lt;/p&gt;

&lt;p&gt;Want to try it out before we dive into the details? Call (855) 687-9438 (that's 855-6-TSWIFT) and have a chat!&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;First, you’ll need to set up the application from the &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;Twilio blog post&lt;/a&gt;, so you'll need a Twilio account and an OpenAI API key. Make sure you can make a call and chat with the bot successfully.&lt;/p&gt;

&lt;p&gt;You will also need a &lt;a href="https://astra.datastax.com/signup?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;free DataStax account&lt;/a&gt; so you can set up RAG with Astra DB.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we’re going to build
&lt;/h2&gt;

&lt;p&gt;We already have a voice-capable bot that you can speak to over the phone. We're going to gather some up-to-date data and store it in Astra DB to help the bot answer questions.&lt;/p&gt;

&lt;p&gt;The OpenAI Realtime API enables you to define tools that the model can use to execute functions and extend its capabilities. We’ll give the model a tool that enables it to search the database for additional information (this is an example of &lt;a href="https://www.youtube.com/watch?v=MYPDsV_825U" rel="noopener noreferrer"&gt;agentic RAG&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Ingesting data
&lt;/h2&gt;

&lt;p&gt;To test out this agent, we're going to write a quick script to load and parse a web page, turn the content into chunks, turn those chunks into vector embeddings, and store them in Astra DB.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create your database
&lt;/h3&gt;

&lt;p&gt;To kick this process off, you'll need to create a database. &lt;a href="https://astra.datastax.com?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Log into your DataStax account&lt;/a&gt; and, on the Astra DB dashboard, click &lt;em&gt;Create a Database&lt;/em&gt;. Choose a &lt;em&gt;Serverless (Vector)&lt;/em&gt; database, give it a name, and pick a provider and region. That will take a couple of minutes to provision. While it's doing that, have a think about some good web pages you might want to ingest into this database.&lt;/p&gt;

&lt;p&gt;Once the database is ready, click on the &lt;em&gt;Data Explorer&lt;/em&gt; tab and then the &lt;em&gt;Create Collection +&lt;/em&gt; button. Give your collection a name, ensure it is a vector-enabled collection and choose NVIDIA as the embedding generation method. This will &lt;a href="https://docs.datastax.com/en/astra-db-serverless/databases/embedding-generation.html?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;automatically generate vector embeddings for the content we insert into the collection&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connect to the database
&lt;/h3&gt;

&lt;p&gt;Open the &lt;a href="https://github.com/twilio-samples/speech-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;application code&lt;/a&gt; in your favourite text editor. To get the application running, you’ll have created a &lt;em&gt;.env&lt;/em&gt; file and populated it with your OpenAI API key (and if you haven't done that yet, now is definitely the time). Open that &lt;em&gt;.env&lt;/em&gt; file and add some more environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="nv"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fill in the variables with the information from your database. You can find the API endpoint and generate an application token from the database overview in the Astra DB dashboard. Enter the name of the collection you just created, too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhwc58vn9zlqz8r250tw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmhwc58vn9zlqz8r250tw.png" alt="A screenshot of the database details in the DataStax dashboard. On the right of the page you can see your API endpoint and generate an application token. The collection can be seen in the Data Explorer tab." width="800" height="639"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we can connect to the database in the application. Install the Astra DB client from npm.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @datastax/astra-db-ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a new file in the application called &lt;em&gt;db.js&lt;/em&gt;. Open the file and enter the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DataAPIClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@datastax/astra-db-ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DataAPIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_APPLICATION_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_API_ENDPOINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ASTRA_DB_COLLECTION_NAME&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code imports the client from the Astra DB module and loads the variables in the &lt;em&gt;.env&lt;/em&gt; file into the environment. It then uses those environment variables as credentials to connect to the collection, and exports the collection object to be used elsewhere in the application.&lt;/p&gt;
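
&lt;p&gt;If one of those variables is missing, the problem may only surface later as a less obvious error. As a small optional sketch, a hypothetical &lt;code&gt;requireEnv&lt;/code&gt; helper (not part of dotenv or the Astra DB client) can fail fast with a clearer message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Hypothetical helper, not part of dotenv or the Astra DB client: return the
// named variables, or throw an error listing any that are not set.
function requireEnv(names, env = process.env) {
  const values = {};
  const missing = [];
  for (const name of names) {
    if (env[name]) {
      values[name] = env[name];
    } else {
      missing.push(name);
    }
  }
  if (missing.length !== 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return values;
}

// In db.js this could replace the destructuring of process.env:
// const { ASTRA_DB_APPLICATION_TOKEN, ASTRA_DB_API_ENDPOINT, ASTRA_DB_COLLECTION_NAME } =
//   requireEnv(["ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT", "ASTRA_DB_COLLECTION_NAME"]);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;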

&lt;h3&gt;
  
  
  Get some data
&lt;/h3&gt;

&lt;p&gt;Now let's create a script that loads and parses a web page, then splits it into chunks and stores it in Astra DB. This script is going to combine some of the techniques from our blog posts about &lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;scraping web pages&lt;/a&gt;, &lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;chunking text&lt;/a&gt;, and &lt;a href="https://www.datastax.com/blog/how-to-create-vector-embeddings-in-node-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;creating vector embeddings&lt;/a&gt;. Check out those posts to read about each technique in more depth.&lt;/p&gt;

&lt;p&gt;Install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @langchain/textsplitters @mozilla/readability jsdom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a file called &lt;em&gt;ingest.js&lt;/em&gt; and copy the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;RecursiveCharacterTextSplitter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@langchain/textsplitters&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Readability&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@mozilla/readability&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSDOM&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jsdom&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;parseArgs&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node:util&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parseArgs&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;short&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;u&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSDOM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;chunkOverlap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splitText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insertMany&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;uses the Node.js &lt;a href="https://nodejs.org/api/util.html#utilparseargsconfig" rel="noopener noreferrer"&gt;argument parser&lt;/a&gt; to get a URL from the command line arguments&lt;/li&gt;
&lt;li&gt;loads the web page at that URL&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/blog/html-content-retrieval-augmented-generation-readability-js?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;parses the content from the page using Readability.js and JSDOM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.datastax.com/blog/how-to-chunk-text-in-javascript-for-rag-applications?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;splits the text into 500-character chunks with a 100-character overlap&lt;/a&gt; using the &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;turns the chunks into objects where the chunk of text becomes the &lt;code&gt;$vectorize&lt;/code&gt; property&lt;/li&gt;
&lt;li&gt;inserts all the documents into the collection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.datastax.com/blog/simplifying-vector-embedding-generation-with-astra-vectorize?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Using the &lt;code&gt;$vectorize&lt;/code&gt; property tells Astra DB to automatically create vector embeddings&lt;/a&gt; for this content.&lt;/p&gt;
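
&lt;p&gt;To make the chunk size and overlap settings concrete, here is a minimal sketch of a fixed-size chunker and the mapping to documents. It is a simplification: &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt; also tries to break on paragraph, sentence, and word boundaries. The &lt;code&gt;chunkText&lt;/code&gt; and &lt;code&gt;toDocuments&lt;/code&gt; helpers are hypothetical names used only for this illustration.&lt;/p&gt;

```javascript
// Simplified fixed-size chunker that illustrates chunk size and overlap only.
// This is NOT the algorithm RecursiveCharacterTextSplitter uses.
function chunkText(text, chunkSize, chunkOverlap) {
  const step = chunkSize - chunkOverlap; // how far each window advances
  if (Math.sign(step) !== 1) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  if (text.length === 0) {
    return [];
  }
  const chunks = [];
  let start = 0;
  while (true) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // the last window reached the end
    start += step;
  }
  return chunks;
}

// Wrap each chunk in the document shape that the ingestion script inserts.
function toDocuments(chunks) {
  return chunks.map(function (chunk) {
    return { $vectorize: chunk };
  });
}

console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
console.log(toDocuments(["abcd"])); // [ { '$vectorize': 'abcd' } ]
```

&lt;p&gt;In the small example each chunk starts with the last character of the previous one. With a chunk size of 500 and an overlap of 100, neighbouring chunks in this sketch share 100 characters in the same way.&lt;/p&gt;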

&lt;p&gt;We can now run this file from the command line. For example, here's how to ingest the Wikipedia page on Taylor Swift:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node ingest.js &lt;span class="nt"&gt;--url&lt;/span&gt; https://en.wikipedia.org/wiki/Taylor_Swift
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this command has been run, check the collection in the DataStax dashboard to see the contents and the vectors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy54sly6h2c8zijyiejm3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy54sly6h2c8zijyiejm3.png" alt="Viewing the Data Explorer in Astra DB. There should be a table of the content that you have ingested, in the $vectorize column you will find the text data and in the $vector column, the vector data that the database created for you." width="800" height="639"&gt;&lt;/a&gt;&lt;/p&gt;
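
&lt;p&gt;The ingestion script passes whatever follows &lt;code&gt;--url&lt;/code&gt; straight to &lt;code&gt;fetch&lt;/code&gt;. If you would rather fail early on a missing or malformed URL, a small check like the one below is one option. This is a sketch only: &lt;code&gt;requireHttpUrl&lt;/code&gt; is a hypothetical helper and is not part of the script above.&lt;/p&gt;

```javascript
// Hypothetical helper (not part of the original script): check the value
// parsed from --url before handing it to fetch.
function requireHttpUrl(value) {
  if (typeof value !== "string" || value.length === 0) {
    throw new Error("Usage: node ingest.js --url https://example.com/page");
  }
  let parsed;
  try {
    parsed = new URL(value); // the URL constructor throws on malformed input
  } catch (error) {
    throw new Error("Not a valid URL: " + value);
  }
  if (!["http:", "https:"].includes(parsed.protocol)) {
    throw new Error("Only http and https URLs are supported: " + value);
  }
  return parsed.href;
}

console.log(requireHttpUrl("https://en.wikipedia.org/wiki/Taylor_Swift"));
```

&lt;p&gt;In &lt;em&gt;ingest.js&lt;/em&gt; you could then write &lt;code&gt;const url = requireHttpUrl(values.url);&lt;/code&gt; in place of destructuring &lt;code&gt;url&lt;/code&gt; directly from &lt;code&gt;values&lt;/code&gt;.&lt;/p&gt;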

&lt;h2&gt;
  
  
  Build the voice agent
&lt;/h2&gt;

&lt;p&gt;To turn our existing voice assistant into an agent that can choose to search the database for more information, we need to provide it with a tool, or function, that it can choose to use.&lt;/p&gt;

&lt;p&gt;Create a new file called &lt;em&gt;tools.js&lt;/em&gt; and open it in your editor. Start by importing &lt;code&gt;collection&lt;/code&gt; from &lt;em&gt;db.js&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we need to create the function that the agent can use to search the database.&lt;/p&gt;

&lt;p&gt;When the OpenAI agent provides parameters to call a function with, it does so as an object. So the function should receive an object, which we can destructure to extract the query. We'll then use the query to perform a vector search against our collection.&lt;/p&gt;

&lt;p&gt;We can use Astra DB Vectorize to automatically create a vector embedding of the query. We'll also limit the results to the top 10 and ensure we return the text from the chunks by selecting &lt;code&gt;$vectorize&lt;/code&gt; in the &lt;a href="https://docs.datastax.com/en/astra-db-serverless/api-reference/documents.html?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent#projection-operations" rel="noopener noreferrer"&gt;projection&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Calling &lt;code&gt;find&lt;/code&gt; on the collection with these arguments will return a cursor, which we can turn into an array by calling &lt;code&gt;toArray&lt;/code&gt;. We then iterate over the array of documents, extracting just the text and then joining the resulting array with a newline to create a single string result that can be provided as context to the agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;projection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$vectorize&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've called the function &lt;code&gt;taylorSwiftFacts&lt;/code&gt; because that's what I loaded with my ingestion script; feel free to use a different name.&lt;/p&gt;
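
&lt;p&gt;To see the shape of the string the tool returns without calling the database, here is the mapping and joining step on its own, run against hand-written sample documents. Both &lt;code&gt;formatContext&lt;/code&gt; and the sample data are hypothetical and exist only for this illustration.&lt;/p&gt;

```javascript
// Hypothetical helper isolating the final step of the tool function:
// keep only the text of each document and join the pieces with newlines.
function formatContext(docs) {
  return docs
    .map(function (doc) {
      return doc.$vectorize;
    })
    .join("\n");
}

// Hand-written stand-ins for documents returned by the collection.
const sampleDocs = [
  { _id: "1", $vectorize: "First chunk of text." },
  { _id: "2", $vectorize: "Second chunk of text." },
];

console.log(formatContext(sampleDocs));
```

&lt;p&gt;The result is a single string with one chunk per line, which is the context the agent receives.&lt;/p&gt;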

&lt;p&gt;This is our first tool; we can write more, but for now we can just export this as an object of tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To help the model choose when to use this tool, it needs a description of what it can do and the arguments it expects. For each tool you provide a type, name, description, and the parameters.&lt;/p&gt;

&lt;p&gt;For our function the type will be "function" and the name is &lt;code&gt;taylorSwiftFacts&lt;/code&gt;. The description will tell the agent that we have up-to-date information about Taylor Swift that it can search for. The parameters are a JSON schema description of the arguments your function expects. This tool is relatively simple, as it only requires one parameter called &lt;code&gt;query&lt;/code&gt;, which is a string. The full description looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;taylorSwiftFacts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search for up to date information about Taylor Swift from her wikipedia page&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The search query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
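
&lt;p&gt;The names in the descriptions have to match the keys of the tools object, because the name is what the model sends back when it wants to call a function. As the list of tools grows, a quick consistency check can catch a mismatch. The sketch below uses a hypothetical &lt;code&gt;findMissingTools&lt;/code&gt; helper and stand-in data shaped like the objects above.&lt;/p&gt;

```javascript
// Hypothetical helper: list the described function names that have no
// matching implementation in the tools object.
function findMissingTools(descriptions, tools) {
  return descriptions
    .map(function (description) {
      return description.name;
    })
    .filter(function (name) {
      return typeof tools[name] !== "function";
    });
}

// Stand-ins shaped like the TOOLS object and DESCRIPTIONS array.
const exampleTools = {
  taylorSwiftFacts: async function () {
    return "";
  },
};
const exampleDescriptions = [
  { type: "function", name: "taylorSwiftFacts" },
  { type: "function", name: "weatherLookup" }, // deliberately unimplemented
];

console.log(findMissingTools(exampleDescriptions, exampleTools)); // [ 'weatherLookup' ]
```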



&lt;p&gt;Our tool definition is complete for now, so let's add it to our agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Handling function calls in a voice agent
&lt;/h3&gt;

&lt;p&gt;We've been building supporting functions around the existing application so far, but to connect our tool to the agent we need to dig into the main body of code. Open &lt;em&gt;index.js&lt;/em&gt; in your editor and start by importing the tools and descriptions we just defined:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Fastify&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fastify&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;WebSocket&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ws&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fastifyFormBody&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@fastify/formbody&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fastifyWs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@fastify/websocket&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./tools.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to update the system prompt to more accurately describe what the agent is capable of with the tool available to it. Since we ingested the Wikipedia page for Taylor Swift earlier, we can update the prompt so the agent behaves like a Taylor Swift superfan.&lt;/p&gt;

&lt;p&gt;Find the &lt;code&gt;SYSTEM_MESSAGE&lt;/code&gt; constant and update it to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_MESSAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a helpful and bubbly AI assistant who loves Taylor Swift. You can use your knowledge about Taylor Swift to answer questions, but if you don't know the answer, you can search for relevant facts with your available tools.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next we need to provide the tool we have built to the agent. Find the &lt;code&gt;initializeSession&lt;/code&gt; function, which defines a &lt;code&gt;sessionUpdate&lt;/code&gt; object that includes all the details to initialize the agent. Add a &lt;code&gt;tools&lt;/code&gt; property to the session object using the &lt;code&gt;DESCRIPTIONS&lt;/code&gt; array we imported earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sessionUpdate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;session.update&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;turn_detection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;server_vad&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="na"&gt;input_audio_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;g711_ulaw&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;output_audio_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;g711_ulaw&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;voice&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;VOICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_MESSAGE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;modalities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                    &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DESCRIPTIONS&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can also provide tools on a request-by-request basis, but this agent will benefit from access to this tool in all its interactions.&lt;/p&gt;
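
&lt;p&gt;For comparison, here is a rough sketch of what offering tools for a single response could look like. It assumes that the Realtime API's "response.create" event accepts a &lt;code&gt;tools&lt;/code&gt; array inside its &lt;code&gt;response&lt;/code&gt; field, so check the current OpenAI reference before relying on it. The &lt;code&gt;buildResponseCreate&lt;/code&gt; helper is a hypothetical name.&lt;/p&gt;

```javascript
// Sketch: build a response.create message that offers tools for one
// response only. Assumption: the event takes a tools array in its
// response field; verify this against the current API reference.
function buildResponseCreate(toolDescriptions) {
  return {
    type: "response.create",
    response: { tools: toolDescriptions },
  };
}

const message = buildResponseCreate([
  { type: "function", name: "taylorSwiftFacts" },
]);
console.log(JSON.stringify(message));
```

&lt;p&gt;In the application you would send the result over the existing socket, for example with &lt;code&gt;openAiWs.send(JSON.stringify(buildResponseCreate(DESCRIPTIONS)))&lt;/code&gt;.&lt;/p&gt;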

&lt;p&gt;Finally we need to handle the event when the model requests to use a tool. Find the event handler that runs when the connection to OpenAI receives a message. It looks like this: &lt;code&gt;openAiWs.on('message', … )&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Change the event handler to an &lt;code&gt;async&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the model wants to use a tool, the request arrives in an event with the type "response.done". The event's &lt;code&gt;response&lt;/code&gt; object contains an array of outputs, and if one of the outputs has a type of "function_call" we know the model wants to use one of its tools.&lt;/p&gt;

&lt;p&gt;The output provides the name of the function it wants to call and the arguments. We can look up the tool in our object of &lt;code&gt;TOOLS&lt;/code&gt; that we imported, then call it with the arguments.&lt;/p&gt;
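
&lt;p&gt;The lookup and call can be sketched as a small dispatcher. This version goes a little further than the walkthrough by returning a short message when the tool name is unknown or the arguments are not valid JSON. That is an assumption about how you might want failures to reach the model, and &lt;code&gt;callTool&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```javascript
// Hypothetical dispatcher: look up a tool by name and call it with the
// JSON-encoded arguments taken from a function_call output.
async function callTool(tools, functionCall) {
  const tool = tools[functionCall.name];
  if (typeof tool !== "function") {
    return "Unknown tool: " + functionCall.name;
  }
  let args;
  try {
    args = JSON.parse(functionCall.arguments); // arguments arrive as a string
  } catch (error) {
    return "Could not parse arguments for " + functionCall.name;
  }
  return tool(args);
}

// Stand-in tool and function call, used only to exercise the dispatcher.
const demoTools = {
  echo: async function (args) {
    return "You asked: " + args.query;
  },
};
callTool(demoTools, { name: "echo", arguments: '{"query":"tour dates"}' }).then(
  function (result) {
    console.log(result); // You asked: tour dates
  }
);
```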

&lt;p&gt;When we have the result of the function call we pass it back to the model so that it can choose what to do next. We do so by creating a new message with the type "conversation.item.create". Within that message we include an item with the type "function_call_output", the output of the function call, and the &lt;code&gt;call_id&lt;/code&gt; from the original function call, so that the model can tie the result to its request.&lt;/p&gt;

&lt;p&gt;We send this to the model as well as another message with the type "response.create" which requests the model use this new information to return a new response.&lt;/p&gt;

&lt;p&gt;Overall, this enables the model to request to use the database search function we defined and provide the arguments it wants to call the function with. We are then responsible for calling the function and returning the results to the model. The whole code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;      &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;LOG_EVENT_TYPES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Received event: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;response.done&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;outputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
              &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;functionCall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;outputs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function_call&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
              &lt;span class="p"&gt;);&lt;/span&gt;
              &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;
                  &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;conversationItemCreate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;conversation.item.create&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="na"&gt;item&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function_call_output&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;functionCall&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;call_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="p"&gt;};&lt;/span&gt;
                &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;conversationItemCreate&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
                &lt;span class="nx"&gt;openAiWs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;response.create&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
              &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="c1"&gt;// other event handlers&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the application and make sure it is connected to your Twilio number as described in the &lt;a href="https://www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;Twilio blog post&lt;/a&gt;. Now we can call and chat all things Taylor Swift.&lt;/p&gt;

&lt;p&gt;If you want to try this out with my assistant, you can give it a call on (855) 687-9438.&lt;/p&gt;

&lt;p&gt;This is now a new way to connect with the &lt;a href="https://www.datastax.com/blog/using-astradb-vector-to-build-taylor-swift-chatbot?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;Taylor Swift bot we built&lt;/a&gt; a while back. So now you can &lt;a href="https://www.tswift.ai/" rel="noopener noreferrer"&gt;chat with SwiftieGPT online&lt;/a&gt; or on the phone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Give your voice assistants some agency
&lt;/h2&gt;

&lt;p&gt;Real-time voice agents are very cool, but they have all the same drawbacks as a plain LLM. In this post we added agentic RAG capabilities to our voice agent and it was able to use up-to-date knowledge to answer our questions about Taylor Swift.&lt;/p&gt;

&lt;p&gt;When you provide a voice agent with tools, like context from a vector database, the results are very impressive. The combination of Twilio, OpenAI, and Astra DB creates a very powerful agent.&lt;/p&gt;

&lt;p&gt;You can find the code for this in my &lt;a href="https://github.com/philnash/speech-assistant-openai-realtime-api-node" rel="noopener noreferrer"&gt;fork of the Twilio project&lt;/a&gt;. You don't have to stop here though; you can define and add further tools to the agent. Make sure you check out &lt;a href="https://platform.openai.com/docs/guides/function-calling#best-practices-for-defining-functions" rel="noopener noreferrer"&gt;OpenAI's best practices for defining functions for your models&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're interested in building other agents, check out &lt;a href="https://www.datastax.com/blog/build-simple-ai-agent-with-langflow-composio?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=voice-agent" rel="noopener noreferrer"&gt;how to work with Langflow and Composio&lt;/a&gt; or the &lt;a href="https://www.youtube.com/watch?v=mn1ZnlqnQlg" rel="noopener noreferrer"&gt;workshop and videos from the recent Hacking Agents event&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Are you excited about voice agents or agentic RAG? Come chat about it and what you're building in the &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;DataStax Devs Discord&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Want to roll up your sleeves and build with OpenAI, Twilio, Cloudflare, Unstructured, and DataStax? &lt;a href="https://lu.ma/hacking-agents-hackathon" rel="noopener noreferrer"&gt;Join us on Feb. 28 in San Francisco for the Hacking Agents Hackathon&lt;/a&gt;, an epic 24-hour hackathon where we'll be diving into what developers can build with the latest and greatest in AI tooling.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>twilio</category>
      <category>openai</category>
      <category>ai</category>
    </item>
    <item>
      <title>Unlocking Local AI: How to Use Ollama with Agents</title>
      <dc:creator>David Jones-Gilardi</dc:creator>
      <pubDate>Thu, 13 Feb 2025 17:13:50 +0000</pubDate>
      <link>https://dev.to/datastax/unlocking-local-ai-how-to-use-ollama-with-agents-2hc7</link>
      <guid>https://dev.to/datastax/unlocking-local-ai-how-to-use-ollama-with-agents-2hc7</guid>
      <description>&lt;p&gt;By now, there’s a good chance you’ve heard about generative AI or agentic flows. (If you’re not familiar with agents and how they work, watch &lt;a href="https://www.youtube.com/watch?v=NuxsHifAQa4" rel="noopener noreferrer"&gt;this video&lt;/a&gt; to get up to speed.) There’s plenty of information out there about building agents with providers like OpenAI or Anthropic. However, not everyone is comfortable with exposing their data to public model providers. We get a steady drumbeat of questions from folks wondering if there’s a more secure and cheaper way to run agents. &lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; is the answer.&lt;/p&gt;

&lt;p&gt;If you've ever wondered how to run AI models securely on your own machine without sharing your data with external providers, well, here you go!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you’d rather watch this content, &lt;a href="https://www.youtube.com/watch?v=bZDk5sgMLsk&amp;amp;t=2s" rel="noopener noreferrer"&gt;here’s a video&lt;/a&gt; covering the same topic.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why use Ollama?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://ollama.com/" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; enables you to run models locally, ensuring that your data remains private and secure. Not only that, it won’t cost you any tokens. With Ollama, you can confidently run models on your hardware, knowing that your data is safe.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Getting started with Ollama&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install the model
&lt;/h3&gt;

&lt;p&gt;If you haven’t used Ollama before, you’ll need to install it locally first. Download and install the version needed for your operating system. It takes about five minutes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2iaipmn74p4yjnqi6pyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2iaipmn74p4yjnqi6pyy.png" alt="Download and install Ollama" width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, navigate to the models section and select tools. It's crucial to choose models that support tool calling when you want to build an agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h4m19kuaoozfmuh3um2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h4m19kuaoozfmuh3um2.png" alt="Choose " width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg88t8z6f2y12d26jn32n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg88t8z6f2y12d26jn32n.png" alt="Choose " width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For this post, we'll use &lt;a href="https://ollama.com/library/qwen2.5" rel="noopener noreferrer"&gt;Alibaba's Qwen 2.5 7 billion parameter model&lt;/a&gt;, which is a great choice for local tool calling and agent interactions. It's only a 4.7GB download (Llama 3.1 405b is 243GB!) and is suitable to run on most machines.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzsni3raiihk2954v9x7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzsni3raiihk2954v9x7q.png" alt="Copy the " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy the installation command and paste it into your terminal after installing Ollama. Once the download is complete, you're ready to start working with the model!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2dl5t56pztvn1v1tpgq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2dl5t56pztvn1v1tpgq.png" alt="Execute " width="800" height="131"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Setting up Langflow
&lt;/h3&gt;

&lt;p&gt;Next, we'll use &lt;a href="https://www.langflow.org/" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;, a visual IDE that enables you to build generative and agentic AI flows in a low-code or no-code environment. If you're not familiar with Langflow, check out &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_campaign=local-ai-using-ollama-with-agents&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;this link&lt;/a&gt; for more information.&lt;/p&gt;

&lt;p&gt;1. Install Langflow: Use “&lt;a href="https://github.com/langflow-ai/langflow?tab=readme-ov-file#-quickstart" rel="noopener noreferrer"&gt;uv pip install langflow&lt;/a&gt;” in your terminal to install Langflow locally.&lt;/p&gt;

&lt;p&gt;2. Create a new flow: Choose the “Simple Agent” template.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6j4tyh2wy5aaggs1ybiq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6j4tyh2wy5aaggs1ybiq.png" alt="Choose the " width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once opened, you’ll see a ready-made simple agentic flow complete with an agent (defaulting to OpenAI’s gpt-4o-mini LLM), both URL and calculator tools, and chat input and output components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb29clbfwjsq02slom8o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb29clbfwjsq02slom8o.png" alt="The Simple Agent flow defaults to using OpenAI's gpt-4o-mini" width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Transitioning to Ollama
&lt;/h2&gt;

&lt;p&gt;Now, let's switch from OpenAI to Ollama:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Select custom model:&lt;/strong&gt; Since our goal is to use Ollama and not OpenAI, click the “Model Provider” dropdown in the agent component and choose “Custom.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Add Ollama component:&lt;/strong&gt; Drag and drop the Ollama model into your flow and connect the “Language Model” nodes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Refresh the model list and choose qwen2.5:&lt;/strong&gt; Make sure to refresh the model name dropdown to populate the available models. It's essential to have Ollama running locally for this setup to work.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;To use an Ollama model with your agent, it must support tool calling. In Langflow, enable the “Tool Model Enabled” radio button to filter models that have this capability. Once enabled, select &lt;strong&gt;qwen2.5&lt;/strong&gt; for your operations.&lt;/em&gt;&lt;/p&gt;
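
&lt;p&gt;If the model dropdown stays empty, it can help to confirm that Ollama is actually running and that the model has been pulled. The following is a minimal sketch, assuming Ollama is listening on its default local address (&lt;code&gt;http://localhost:11434&lt;/code&gt;); the helper names &lt;code&gt;model_names&lt;/code&gt;, &lt;code&gt;has_model&lt;/code&gt;, and &lt;code&gt;fetch_tags&lt;/code&gt; are made up for this illustration. It asks Ollama's &lt;code&gt;/api/tags&lt;/code&gt; endpoint for the installed models and checks whether a qwen2.5 variant is among them.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags_response):
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [model["name"] for model in tags_response.get("models", [])]


def has_model(tags_response, wanted="qwen2.5"):
    """Return True if an installed model belongs to the wanted family.

    Names look like "qwen2.5:latest", so compare the part before the colon.
    """
    return any(name.split(":")[0] == wanted for name in model_names(tags_response))


def fetch_tags(base_url=OLLAMA_URL):
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        tags = fetch_tags()
    except OSError as err:
        # URLError is a subclass of OSError, so this also covers "connection refused".
        print(f"Could not reach Ollama at {OLLAMA_URL}: {err}")
    else:
        print("Installed models:", model_names(tags))
        print("qwen2.5 available:", has_model(tags))
```

&lt;p&gt;If the script cannot reach the server, start Ollama and refresh the dropdown in Langflow again.&lt;/p&gt;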

&lt;h2&gt;
  
  
  &lt;strong&gt;Running your query&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now, let's run a query using the Ollama model. Open the “Playground” and try typing in an example like “convert 200 USD to INR”. If everything is wired up correctly, the model will attempt to answer your query using the tools at its disposal.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmkakblgict6do1hm085.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmkakblgict6do1hm085.png" alt="Open the Playground using the top right hand corner menu" width="800" height="557"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryfeuq4iasscdw961c95.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fryfeuq4iasscdw961c95.png" alt="Try an example like " width="800" height="402"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Keep in mind that local models may take longer to process, especially larger ones. However, Qwen 2.5 is optimized for smaller machines, making it pretty solid for local use.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Experimenting with inputs
&lt;/h3&gt;

&lt;p&gt;When working with smaller local models, you may need to experiment with your inputs. Sometimes, you might have to explicitly instruct the model to do something, like use the web to find the latest exchange rates. Adjusting the model's temperature settings can also help; starting with a conservative value (like 0.10) is a good practice, but feel free to increase it for more creative responses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu73y96bijss56j08m3f3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu73y96bijss56j08m3f3.png" alt="The smaller model didn't quite get it right the first time. Give it a nudge with an extra instruction like " width="800" height="857"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice how the agent updated its approach when I told it to “use the web to get the latest exchange rates” and gave the correct answer. This time, it used the URL tool to grab the latest exchange rate from the web rather than relying solely on the knowledge it was trained on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1f4emdciml6nic937br0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1f4emdciml6nic937br0.png" alt="Now we can see it properly used the URL tool to fetch exchange rates" width="800" height="897"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Finally, once your Ollama agent is set up within Langflow, you can integrate it into your applications via API, allowing you to enable your apps with full agentic capability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu27huh5jd0gu5zup68b9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu27huh5jd0gu5zup68b9.png" alt="Use the API option to connect AI flows to your applications" width="800" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's all it takes to harness the power of local models securely with Ollama and your agents. If you have any questions or need further assistance, feel free to reach out on our &lt;a href="https://discord.gg/datastax" rel="noopener noreferrer"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

</description>
      <category>softwaredevelopment</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to Build a Simple AI Agent with Langflow and Composio</title>
      <dc:creator>melienherrera</dc:creator>
      <pubDate>Mon, 10 Feb 2025 19:51:42 +0000</pubDate>
      <link>https://dev.to/datastax/how-to-build-a-simple-ai-agent-with-langflow-and-composio-13d4</link>
      <guid>https://dev.to/datastax/how-to-build-a-simple-ai-agent-with-langflow-and-composio-13d4</guid>
      <description>&lt;p&gt;Are you trying to understand AI agents? Or perhaps you’ve started building agents, but are still struggling with tools and how to connect them to app integrations. DataStax Langflow and Composio are a great combination to help you understand these concepts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.datastax.com/products/langflow?utm_source=google&amp;amp;utm_medium=cpc&amp;amp;utm_campaign=ggl_s_apac_idph_brand&amp;amp;utm_term=datastax+database&amp;amp;utm_content=brand&amp;amp;utm_medium=byline&amp;amp;utm_campaign=build-simple-ai-agent-with-langflow-composio&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; is a visual low-code AI application builder that allows you to build agents quickly for rapid development, and &lt;a href="https://composio.dev/" rel="noopener noreferrer"&gt;Composio&lt;/a&gt; is an integration platform that gives developers access to hundreds of tools like GitHub, Salesforce, and Google.&lt;/p&gt;

&lt;p&gt;In this tutorial, you’ll learn how to create a simple agent in Langflow using Composio as a tool to connect to your Google calendar. Let’s get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Set up
&lt;/h2&gt;

&lt;p&gt;For this tutorial, you’ll need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://astra.datastax.com/signup?type=langflow&amp;amp;utm_medium=byline&amp;amp;utm_campaign=build-simple-ai-agent-with-langflow-composio&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;DataStax Langflow account&lt;/a&gt; to build your AI agent&lt;/li&gt;
&lt;li&gt;A &lt;a href="https://app.composio.dev/apps" rel="noopener noreferrer"&gt;Composio account&lt;/a&gt; for connecting your tools and integrations&lt;/li&gt;
&lt;li&gt;An &lt;a href="https://platform.openai.com/docs/quickstart" rel="noopener noreferrer"&gt;OpenAI account and API key&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once those accounts are created, proceed to the following steps to get started on building an AI agent with Composio.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with Composio
&lt;/h2&gt;

&lt;p&gt;Composio is an application integration platform that gives you access to many different tools that you could use within your AI application. This means that you no longer have to manage APIs for performing actions like creating, deleting, or updating a Google Calendar event; you just need to go through Composio and the work is done for you. We’ll walk through this here.&lt;/p&gt;

&lt;p&gt;Once you’ve created your Composio account, you should be dropped into the dashboard. Copy your API key from the top right-hand corner and save it in your preferred notes application for later.&lt;/p&gt;

&lt;p&gt;Once you have obtained your API key, head over to the “Apps” tab on the left side navigation bar.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fib1z0k1lbw6eiyfa0rwh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fib1z0k1lbw6eiyfa0rwh.png" alt="an image describing how to get your API key in composio" width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here, you’ll see all of the available tools and integrations that you can connect with through Composio (283 and counting at the time of writing this blog post!). Select the “Googlecalendar” integration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq231ktih3wc81sg6nrb8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq231ktih3wc81sg6nrb8.png" alt="An image showing the available tools and integrations that you can connect with through Composio  - and highlighting the “Googlecalendar” integration." width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then go to “Setup Googlecalendar integration.”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwot2ftrbroqvmtvkks8p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwot2ftrbroqvmtvkks8p.png" alt="An image highlighting the Setup Googlecalendar integration" width="800" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Follow the steps to complete the integration with your preferred method. Composio offers options through code with Python or JavaScript, or you can simply authenticate via Google sign-on. Once this is completed, you should receive an “Integration Successful” message, which means that you have successfully connected to Google through Composio.&lt;/p&gt;

&lt;p&gt;You’ll be dropped into Step 3/3, “Execute tools,” where you can play around with each individual action in a playground with natural language, test out different parameters, and connect with JS and Python via various frameworks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0003rc58lm5141xklemj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0003rc58lm5141xklemj.png" alt="An image showing the " width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that you have your Google Calendar integration set up and your API key handy, you'll start building a simple AI agent with Langflow using Composio as a tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  Langflow
&lt;/h3&gt;

&lt;p&gt;Head over to your &lt;a href="https://astra.datastax.com/langflow?utm_medium=byline&amp;amp;utm_campaign=build-simple-ai-agent-with-langflow-composio&amp;amp;utm_source=devto" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt; account and create a new flow by clicking the “Create Flow” button, which will bring up the startup menu below. You’ll be using the “Simple Agent” flow on the “Get started” menu.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqfyen2qujkeq1qoaeecp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqfyen2qujkeq1qoaeecp.png" alt="The Langflow start menu" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You’ll be dropped into the visual editor where you’ll notice that there’s already a flow built out. The blocks that you see are called “components.” Each component represents a functional step in the end-to-end AI flow. The “Agent” component defaults to using the gpt-4o-mini model from OpenAI, but you can choose to use other models if you prefer. This is where you’ll need to put your OpenAI API key.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduwpfeuu30gajladsydl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduwpfeuu30gajladsydl.png" alt="An image showing where to enter your OpenAI API key in Langflow" width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, on the left-side navigation, scroll down to “Bundles” and find the Composio bundle. Drag and drop it into the flow and connect it to the “Agent” component using the “Tool” linking points.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlxf3i4z3f5lsqyp5jan.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzlxf3i4z3f5lsqyp5jan.png" alt="An image showing you where to find the Composio bundle in Langflow and how to drag and drop this to the flow and connect it to the “Agent” component” using the “Tool” linking points." width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Refer back to the API key you got from the Composio dashboard and put it in the “Composio” component. Select the “GOOGLECALENDAR” app name, and press the “refresh” button. You’ll know that the connection with the integration has been successful when you see “GOOGLECALENDAR CONNECTED” appear under “Auth Status.”&lt;/p&gt;

&lt;p&gt;For the purpose of this demo, select all of the actions from the dropdown under “Actions to use.” This will allow you to create, update, delete, and retrieve events!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fremuskoof010t9qta7dz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fremuskoof010t9qta7dz.png" alt="How to select from the dropdown under “Actions to use”" width="498" height="816"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You've now set up all the components you need for your agent with Composio. It’s time to run the flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run the flow
&lt;/h2&gt;

&lt;p&gt;To test the flow, go to the “Playground” located in the top right corner. You can use the chat interface to give example queries to your agent flow and see how the agent makes decisions between tools.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqvbin70tub4k3vuotbu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqvbin70tub4k3vuotbu.png" alt="An image highlighting the Langflow " width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, try typing “Add 1+1” in the chat input and you’ll notice that the agent determines that it needs to use the Calculator tool to answer the query. You can inspect this by clicking the dropdown menu in the agent logs where it says “AI gpt-4o-mini”.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjho0799lchvtxkvj3pvb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjho0799lchvtxkvj3pvb.png" alt="An image showing how to type in the chat input: “Add one plus one” in the Langflow playground" width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsglxvn23qdfu78qq638o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsglxvn23qdfu78qq638o.png" alt="an image showing the results of typing add one plus one in the playground chat input" width="800" height="915"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, try giving it a query such as “Can you check if I have availability for January 28, 2025 at 3pm? If it's free, schedule a meeting with Bob.” Observe the response here and what decisions the agent had to make using Composio. What actions do you see it calling? What was the final response? Navigate to your Google calendar and see the created event appear on your calendar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;You’ve officially set up a simple AI agent using Composio as a tool! You were able to easily connect with your Google Calendar and perform actions without having to configure the API yourself, thanks to the power of the Composio integration and Langflow’s component-based visual app-building interface. But the exploration doesn’t end here. As you saw, there are lots of integrations to try within Composio, and you can easily test them all using &lt;a href="https://www.datastax.com/products/langflow?utm_medium=byline&amp;amp;utm_source=devto&amp;amp;utm_campaign=composio&amp;amp;utm_content=" rel="noopener noreferrer"&gt;Langflow&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>langflow</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
