<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aasawari Sahasrabuddhe</title>
    <description>The latest articles on DEV Community by Aasawari Sahasrabuddhe (@aasawari).</description>
    <link>https://dev.to/aasawari</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2754078%2F1155f87a-c336-4d9c-a104-92365f8a63da.jpg</url>
      <title>DEV Community: Aasawari Sahasrabuddhe</title>
      <link>https://dev.to/aasawari</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aasawari"/>
    <language>en</language>
    <item>
      <title>Enhancing MongoDB's Performance With Horizontal Scaling</title>
      <dc:creator>Aasawari Sahasrabuddhe</dc:creator>
      <pubDate>Mon, 30 Jun 2025 12:53:00 +0000</pubDate>
      <link>https://dev.to/mongodb/enhancing-mongodbs-performance-with-horizontal-scaling-2963</link>
      <guid>https://dev.to/mongodb/enhancing-mongodbs-performance-with-horizontal-scaling-2963</guid>
      <description>&lt;p&gt;Imagine you work for a small business that just boomed with that one TikTok or Instagram reel. The application is flooded with requests, while the company was already planning on adding some new products to the application. &lt;/p&gt;

&lt;p&gt;As a team maintaining databases for your organization, you will likely encounter an inevitable challenge when your single database instance that once handled everything smoothly comes crashing down because of increased demand. As your product catalog and customer base expand exponentially, your single MongoDB instance starts to struggle. Query response times increase, impacting user experience, and scaling vertically (adding more resources to a single server) proves to be unsustainable and costly. With the growing infrastructure demand, you as the tech decision maker need a more efficient solution to address the performance bottlenecks and deliver a better user experience. &lt;/p&gt;

&lt;p&gt;This is where scaling your application through MongoDB’s sharding transforms the architecture, helps increase in throughput, provides better performance, and much more. &lt;/p&gt;

&lt;p&gt;In this article, we will gain a deeper understanding of MongoDB sharding capabilities, explore the difference between horizontal and vertical scaling, and discuss how  you can scale your application better. &lt;/p&gt;

&lt;h2&gt;
  
  
  The difference between horizontal and vertical scaling
&lt;/h2&gt;

&lt;p&gt;Before we get into the details of the horizontal scaling approach, it is equally important to understand the difference between both types of scaling strategies. &lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Horizontal scaling&lt;/th&gt;
&lt;th&gt;Vertical scaling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Load distribution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This distributes the load across multiple servers.&lt;/td&gt;
&lt;td&gt;This improves the performance by upgrading the hardware being utilised.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Adding VMs in a cluster&lt;/td&gt;
&lt;td&gt;Adding memory capacity of existing VM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Workload distribution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Distributed across different nodes&lt;/td&gt;
&lt;td&gt;Single node handles the workload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Concurrency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Since the workload is distributed, it becomes easy to handle concurrent requests.&lt;/td&gt;
&lt;td&gt;Multi-threading is responsible for concurrency.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Costs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimal over time&lt;/td&gt;
&lt;td&gt;Less cost-effective&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Failure resilience&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lower as other nodes have the backup&lt;/td&gt;
&lt;td&gt;Single point of failure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;From the above table, we see that in certain scenarios, vertical scaling proves inefficient because, among other things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The CPU limit is reached after a point and no further vertical scaling is possible.
&lt;/li&gt;
&lt;li&gt;The single point of failure would result in a crash and application downtime.
&lt;/li&gt;
&lt;li&gt;The interrelation between performance and cost becomes exponentially worse.
&lt;/li&gt;
&lt;li&gt;Setting up global users from a single server location introduces latency issues. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Based on parameters like cost, future growth, reliability, flexibility, and the complexity of the application, you have enough information to choose between the horizontal or vertical approach. &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding sharding in databases
&lt;/h2&gt;

&lt;p&gt;Sharding in a database is the concept of distributing the data across different machines. To avoid excessive load on a single machine and improve the retrieval process, the database administrator (DBA) makes the decision to divide the data into small, manageable pieces from the existing database. These pieces are known as shards. This helps in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Managing only a subset of reads and writes on the complete application.
&lt;/li&gt;
&lt;li&gt;Allowing queries in different shards to operate parallelly.
&lt;/li&gt;
&lt;li&gt;Providing better scalability. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To address the system growth, MongoDB supports horizontal scaling using &lt;a href="https://www.mongodb.com/docs/manual/sharding/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;sharding&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;In the next section, we will get into a deeper understanding of MongoDB’s sharded clusters. &lt;/p&gt;

&lt;h2&gt;
  
  
  A MongoDB sharded cluster
&lt;/h2&gt;

&lt;p&gt;In MongoDB, when the data is distributed into different shards, there are a few terms which are important:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Shard: The small pieces of the larger dataset are known as chunks. These chunks are distributed across various shards and each shard is typically a  &lt;a href="https://www.mongodb.com/docs/manual/tutorial/deploy-replica-set/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;replica sets&lt;/a&gt; in themselves.
&lt;/li&gt;
&lt;li&gt;Mongos: When a request is received, a mongos is responsible to route the query to the responsible shard.
&lt;/li&gt;
&lt;li&gt;Config server: These maintain the metadata and configuration settings for the cluster.
&lt;/li&gt;
&lt;li&gt;Shard keys: This is the key parameter to divide the dataset into smaller shards. &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-choose-a-shard-key/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;Choosing the right shard key&lt;/a&gt; is the most essential and important step in creating the shards. Therefore, this decision should be made wisely. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The below diagram is a representation of sharded cluster architecture used in production:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfu7dy01qr1jyo37yi0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfu7dy01qr1jyo37yi0f.png" alt="A representation of sharded cluster architecture used in production" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating a sharded cluster
&lt;/h2&gt;

&lt;p&gt;To create a sharded cluster on a local deployment, follow the steps in the &lt;a href="https://www.mongodb.com/docs/manual/tutorial/deploy-shard-cluster/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;official MongoDB documentation&lt;/a&gt;. Maintaining this is not as daunting as it seems, but if you are looking for a more streamlined process, &lt;a href="https://www.mongodb.com/cloud/atlas/register/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB Atlas&lt;/a&gt; offers a fully managed sharded cluster setup. &lt;/p&gt;

&lt;p&gt;With MongoDB Atlas, you can deploy a sharded cluster in just a few clicks, and the platform handles all aspects of maintenance, including monitoring, backups, and automatic scaling. This allows you to focus on developing your application without worrying about the underlying infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring and balancing a sharded cluster
&lt;/h2&gt;

&lt;p&gt;Managing a sharded cluster is necessary to get optimal performance from the application. After a collection is sharded, MongoDB will try to distribute it across the shards created based on the selection of the shard key. Therefore, it becomes important to analyse the shards and understand how data is being distributed across. To help monitor the shards better, MongoDB provides basic commands like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;sh.status&lt;span class="o"&gt;()&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;db.printSghardingStatus&lt;span class="o"&gt;()&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These commands provide basic information about the shards and existing chunks in the cluster. &lt;/p&gt;

&lt;p&gt;It is also important to note that selecting the right shard key plays an important role in creating &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-troubleshooting-shard-keys/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;hot shards&lt;/a&gt; and &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-troubleshooting-shard-keys/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;jumbo chunks&lt;/a&gt;. To help with managing these chunks, the &lt;a href="https://www.mongodb.com/docs/manual/core/sharding-balancer-administration/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB Sharded Cluster Balancer&lt;/a&gt; plays an important role. This monitors the amount of data in each shard for each collection. &lt;/p&gt;

&lt;p&gt;In essence, effective shard key selection and regular monitoring with the balancer are vital to maintain a well-distributed and high-performing MongoDB sharded cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  To shard or not to shard?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  When to shard
&lt;/h3&gt;

&lt;p&gt;MongoDB supports horizontal scaling using &lt;a href="https://www.mongodb.com/docs/manual/sharding/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;sharding&lt;/a&gt;. The question here is: What are the ideal scenarios when sharding is a better approach?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Increasing data size:&lt;/strong&gt; When the data can no longer fit into single server storage space, sharding is what one should consider as it horizontally scales and distributes the data across different shards.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Increase in performance:&lt;/strong&gt; When the application is required to have high read and write rates, a single replica set would not be sufficient to do this, hence spreading across reduces the contention and improves the performance of the database.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability requirements:&lt;/strong&gt; If the application is expected to grow in terms of users and traffic, and it is always good to shard to handle the load in a manageable way. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While knowing when to shard your database is important, it is equally important to understand when &lt;em&gt;not&lt;/em&gt; to shard. &lt;/p&gt;

&lt;h3&gt;
  
  
  When &lt;em&gt;not&lt;/em&gt; to shard
&lt;/h3&gt;

&lt;p&gt;As a software engineer, knowing the best fit is the most crucial step before we reach a decision. To decide whether to shard or not to shard, it is important to understand: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;If the database is relatively small and fits within the storage, there is probably no requirement to shard the collection.
&lt;/li&gt;
&lt;li&gt;Selection of the shard key is the most important step for sharding the collection. Messing up the selection of the shard key can result in jumbo chunks and might not help improve the application.
&lt;/li&gt;
&lt;li&gt;Selection of the wrong shard key could also result in operating on scatter gather queries and could therefore impact the performance of the application.
&lt;/li&gt;
&lt;li&gt;Managing the backups and restores in a sharded collection is more complex compared to a non-sharded collection. Therefore, maintaining the backups could be difficult to manage.
&lt;/li&gt;
&lt;li&gt;Finally, sharding in a multi-cloud environment can result in latency problems between the shards. &lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Sharding best practices
&lt;/h2&gt;

&lt;p&gt;If you have a sharded cluster locally or on-prem, there are a few considerations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Selecting the right sharding key helps in even distribution of data and read and write operations.
&lt;/li&gt;
&lt;li&gt;Avoid scatter gather queries.
&lt;/li&gt;
&lt;li&gt;Use hash based sharding to help in uniform distribution of reads and writes.
&lt;/li&gt;
&lt;li&gt;Add a new shard before the current shards gets overloaded, as a precautionary measure.
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;As modern day applications scale rapidly, ensuring the database can keep up is an important step. Horizontal sharding in MongoDB offers a powerful way to scale, distribute storage, and ease the compute across multiple machines. This article aims to give you a clear and concise understanding of sharding in MongoDB, what it is, how it works, and why it matters. As your application scales and your data grows, knowing when to move from a single-node setup to a sharded cluster can make all the difference in performance and reliability.&lt;/p&gt;

&lt;p&gt;To know more about the technical concepts, visit &lt;a href="https://www.mongodb.com/docs/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=sharding_mongodb&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB’s official documentation&lt;/a&gt;.  &lt;/p&gt;

</description>
      <category>scaling</category>
      <category>mongodb</category>
      <category>database</category>
      <category>development</category>
    </item>
    <item>
      <title>Building a Chatbot With Symfony and MongoDB</title>
      <dc:creator>Aasawari Sahasrabuddhe</dc:creator>
      <pubDate>Wed, 28 May 2025 12:30:00 +0000</pubDate>
      <link>https://dev.to/mongodb/building-a-chatbot-with-symfony-and-mongodb-5c8g</link>
      <guid>https://dev.to/mongodb/building-a-chatbot-with-symfony-and-mongodb-5c8g</guid>
      <description>&lt;p&gt;We are living in the age of AI. Almost every modern application or website offers some level of AI integration. Whether it is Google Docs, WhatsApp chat, or Zoom, you will come across an AI integration, especially chatbots, which have become essential tools for enhancing user engagement and support. So why not build one tailored for your own application?&lt;/p&gt;

&lt;p&gt;A well-designed chatbot doesn't just answer questions—it creates interactions that enhance user experience and build loyalty. The convergence of advanced AI technologies with robust web frameworks has opened new possibilities for businesses to deploy intelligent conversational agents that truly understand user intent.&lt;/p&gt;

&lt;p&gt;In this tutorial, we will walk through the steps to build a chatbot application for the &lt;a href="https://symfony.com/doc/current/index.html" rel="noopener noreferrer"&gt;Symfony Documentation&lt;/a&gt; along with&lt;a href="https://www.doctrine-project.org/projects/doctrine-orm/en/3.3/index.html" rel="noopener noreferrer"&gt; Doctrine ORM&lt;/a&gt; and &lt;a href="https://www.doctrine-project.org/projects/doctrine-mongodb-odm/en/2.9/index.html" rel="noopener noreferrer"&gt;MongoDB Doctrine ODM&lt;/a&gt; documentation pages. This application would help you answer Symfony-related questions related to ORM or ODM with structured, accurate, and context-aware responses. A chatbot, in general, uses the powerful AI techniques called &lt;strong&gt;r&lt;/strong&gt;etrieval augmented generation, or RAG.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is retrieval augmented generation (RAG)?
&lt;/h2&gt;

&lt;p&gt;RAG is a technique that enhances traditional language models by combining them with a retrieval system. This is how it works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieval: When a user prompts a question, the system retrieves information from the knowledge base or database. &lt;/li&gt;
&lt;li&gt;Augmentation: The retrieval document is then passed to the &lt;a href="https://www.mongodb.com/resources/basics/artificial-intelligence/large-language-models/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;large language models&lt;/a&gt;, or LLMs, to generate the response. &lt;/li&gt;
&lt;li&gt;Generation: The final response is crafted by the LLMs, but it’s based on the real-world content that was fetched during the retrieval step.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The below diagram provides you with more context on how the RAG applications are created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faa0mr7s0jggd3tcqjz8g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faa0mr7s0jggd3tcqjz8g.png" alt="Diagram representing the request flow in a RAG application" width="800" height="373"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this comprehensive guide, we will walk through all the steps required for building a chatbot application using the Symfony framework. We are using the source files of the documentation of &lt;a href="https://github.com/symfony/symfony-docs" rel="noopener noreferrer"&gt;Symfony&lt;/a&gt; and &lt;a href="https://github.com/doctrine" rel="noopener noreferrer"&gt;Doctrine&lt;/a&gt; from their GitHub repositories to clone all the files on a local system and then import them into &lt;a href="https://www.mongodb.com/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB&lt;/a&gt; in the form of chunks. &lt;/p&gt;

&lt;p&gt;The chunks are basically partitions of a complete document to be stored in the database. We will be using &lt;a href="https://github.com/LLPhant/LLPhant?tab=readme-ov-file#get-started" rel="noopener noreferrer"&gt;LLPhant&lt;/a&gt;, a comprehensive PHP generative AI framework. Further, we are making use of &lt;a href="https://www.voyageai.com/" rel="noopener noreferrer"&gt;Voyage AI&lt;/a&gt; to generate the embeddings and later using the &lt;a href="https://platform.openai.com/docs/overview" rel="noopener noreferrer"&gt;OpenAI LLM&lt;/a&gt; model for formatting our responses. &lt;/p&gt;

&lt;p&gt;Let’s understand each of these steps in detail. &lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Project overview&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This chatbot application uses the RAG architecture to provide accurate responses based on Symfony documentation. Key components include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Symfony back end: Manages API communication, handles user queries, and integrates with MongoDB, Voyage AI, and OpenAI &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;MongoDB: Stores chunked Symfony documentation and corresponding vector embeddings; uses &lt;a href="https://www.mongodb.com/products/platform/atlas-vector-search/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;Atlas's vector search&lt;/a&gt; for retrieving relevant content &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Voyage AI: Converts documentation chunks into semantic vector embeddings for efficient similarity search &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;OpenAI: Generates context-aware responses using retrieved documentation snippets and user queries &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Twig front end: Provides a simple web interface for user interaction and response display&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Setting up the development environment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To start setting up the environment for the project, we need to have a few things ready:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an Atlas cluster: Create your first free cluster. Follow the MongoDB documentation page, &lt;a href="https://www.mongodb.com/docs/atlas/tutorial/deploy-free-tier-cluster/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;Deploy a Free Cluster&lt;/a&gt;, or make use of the &lt;a href="https://www.mongodb.com/blog/post/announcing-mongodb-mcp-server/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB Model Context Protocol (MCP) Server&lt;/a&gt; to create your free cluster. You can follow the documentation on the &lt;a href="https://mcpmarket.com/server/mongodb-atlas" rel="noopener noreferrer"&gt;MCP market&lt;/a&gt; for detailed steps. &lt;/li&gt;
&lt;li&gt;PHP 8 and above: Download the latest PHP version from the official site's &lt;a href="https://www.php.net/downloads.php" rel="noopener noreferrer"&gt;downloads&lt;/a&gt; page. &lt;/li&gt;
&lt;li&gt;Symfony version 7 and above: Get the steps to install Symfony from the &lt;a href="https://symfony.com/doc/current/setup.html" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Creating a Voyage AI API key: Get your Voyage AI API key from the Voyage AI &lt;a href="https://dashboard.voyageai.com/organization/api-keys" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; page. Also, make sure to set the pricing so that embeddings are being created. &lt;/li&gt;
&lt;li&gt;OpenAI API key: Create your OpenAI key from the&lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt; API key page&lt;/a&gt;. &lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Creating the Symfony chatbot application&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Creating the Symfony project&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To create the Symfony project, you need to follow the steps below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer create-project symfony/skeleton SymfonyDocsChatBot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the project is created, a few dependencies need to be downloaded. These steps will be addressed sequentially as required.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Setting up project dependencies&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To build the chatbot application, there are a few dependencies we need to install. &lt;/p&gt;

&lt;p&gt;We will be using LLPhant to perform the chunking of the documents. To install in your Symfony application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require theodo-group/llphant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In order to store the chunks into MongoDB, we need to have Doctrine MongoDB ODM installed with the Symfony application. To do so: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install the MongoDB extension using PECL:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pecl &lt;span class="nb"&gt;install &lt;/span&gt;mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Once installed, install the Doctrine Bundle. The bundle integrates the Doctrine MongoDB ODM into Symfony, helping you to configure and use it in your application. To do so:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require doctrine/mongodb-odm-bundle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To create the front end, we will be making use of &lt;a href="https://twig.symfony.com/" rel="noopener noreferrer"&gt;Twig&lt;/a&gt;. It is a fast, secure, and  flexible templating engine for PHP used to build clean and structured HTML views in Symfony applications.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require symfony/twig-bundle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this application, we are sending HTTP requests to Voyage AI and OpenAI to perform embedding and formatting of the response. To make use of HTTP requests, install the guzzlehttp package repository.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require symfony/http-client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, to have a better look and feel of the response being generated, we will make use of CommonMark, which converts the Markdown into HTML format. To do so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;composer require &lt;span class="s2"&gt;"league/commonmark”
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, at the end, the &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/composer.json" rel="noopener noreferrer"&gt;composer.json&lt;/a&gt; will look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"require"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"php"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;gt;=8.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ext-ctype"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ext-iconv"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"doctrine/mongodb-odm-bundle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^5.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"league/commonmark"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^2.7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/console"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/dotenv"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/flex"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/framework-bundle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/http-client"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/runtime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/twig-bundle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"symfony/yaml"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"7.2.*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"theodo-group/llphant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^0.10.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"twig/extra-bundle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^3.21"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"twig/markdown-extra"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^3.21"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"twig/twig"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^2.12|^3.0"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, we are all set to start building the application. &lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Building the chatbot application&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In this section, we will walk through building a chatbot using Symfony, MongoDB, and LLPhant. We’ll go through each step—from ingesting raw docs to returning intelligent answers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Copy the RST files into Symfony project’s public directory
&lt;/h3&gt;

&lt;p&gt;To create the chatbot application, the first step is to get the raw files on which the processing has to happen. To do that, clone the &lt;a href="https://github.com/symfony/symfony-docs" rel="noopener noreferrer"&gt;Symfony Docs GitHub repository&lt;/a&gt;. Perform the following steps:&lt;/p&gt;

&lt;p&gt;Once this is cloned, it will create a folder for all the RST files for the documentation pages. The next step here is to store the raw files inside the Symfony project. &lt;/p&gt;

&lt;p&gt;The Symfony documentation repository has now been cloned to have better and more polished Symfony answers to the questions asked to the chatbot. &lt;/p&gt;

&lt;p&gt;Perform the same steps for &lt;a href="https://github.com/doctrine/mongodb-odm" rel="noopener noreferrer"&gt;Doctrine ODM&lt;/a&gt; and &lt;a href="https://github.com/doctrine/orm" rel="noopener noreferrer"&gt;Doctrine ORM&lt;/a&gt; documentations. &lt;/p&gt;

&lt;p&gt;The documentation's raw files are available in the &lt;code&gt;docs&lt;/code&gt; folder. &lt;/p&gt;

&lt;p&gt;Once done, move the RST files into the public directory of the Symfony project. After this, you should be able to see your project structure like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspmu4qqqsrgvywoxxaf2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fspmu4qqqsrgvywoxxaf2.png" alt="Screenshot representing how the raw files will be placed inside the folder" width="750" height="1102"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Create chunks and store them in MongoDB
&lt;/h3&gt;

&lt;p&gt;Generally, most RAG applications are centered around using LLMs to answer questions about large repositories, which, in our case, are the Symfony documentation pages. Processing larger files for the LLMs becomes expensive, and chunking addresses this by breaking large texts into smaller segments. Also, in RAG, embedding these chunks allows the system to retrieve only the most relevant parts for a query, reducing token usage and improving answer accuracy. &lt;/p&gt;

&lt;p&gt;To perform the chunking of the RST files, we are creating a Symfony Command class file that uses the LLPhant’s Document Splitter function. This function splits the documents based on the number of characters and returns one chunk at a time for the complete document. &lt;/p&gt;

&lt;p&gt;In the below &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/src/Command/CreateChunks.php" rel="noopener noreferrer"&gt;CreateChunks.php&lt;/a&gt; class file, this function will create the chunks and store them in MongoDB.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Document\ChunkedDocuments&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;LLPhant\Embeddings\DataReader\FileDataReader&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Attribute\AsCommand&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Command\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Input\InputInterface&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Output\OutputInterface&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Style\SymfonyStyle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;LLPhant\Embeddings\DocumentSplitter\DocumentSplitter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Doctrine\ODM\MongoDB\DocumentManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;AsCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'app:create-chunks'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'This command will generate chunks from the rst files and store into MongoDB'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CreateChunks&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="nv"&gt;$defaultName&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'app:create-chunks'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;DocumentManager&lt;/span&gt; &lt;span class="nv"&gt;$documentManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;DocumentManager&lt;/span&gt; &lt;span class="nv"&gt;$documentManager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;parent&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$documentManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;InputInterface&lt;/span&gt; &lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;OutputInterface&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$io&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SymfonyStyle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Chunking all .rst files and storing them into MongoDB"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nv"&gt;$directory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'../../public/'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;is_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$directory&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Directory not found: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$directory&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$totalChunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\DirectoryIterator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$directory&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$fileInfo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$fileInfo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;isFile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nv"&gt;$fileInfo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getExtension&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s1"&gt;'rst'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$filePath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$fileInfo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getPathname&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Processing file: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$fileInfo&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getFilename&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

            &lt;span class="nv"&gt;$dataReader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FileDataReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$filePath&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="nv"&gt;$documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$dataReader&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getDocuments&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

            &lt;span class="nv"&gt;$splittedDocuments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DocumentSplitter&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;splitDocuments&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$documents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$splittedDocuments&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nv"&gt;$chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ChunkedDocuments&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
                &lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                &lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setSourceName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;sourceName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setCreatedAt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\DateTime&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

                &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$chunk&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$totalChunks&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; 
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Successfully stored &lt;/span&gt;&lt;span class="nv"&gt;$totalChunks&lt;/span&gt;&lt;span class="s2"&gt; chunks in MongoDB."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SUCCESS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To execute the command, use the below command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php bin/console app:create-chunks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create an embedding for the chunk text that has been stored in MongoDB
&lt;/h3&gt;

&lt;p&gt;Once the chunks have been stored, the next step is to create the &lt;a href="https://www.mongodb.com/resources/basics/vector-embeddings/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;embeddings&lt;/a&gt;. An embedding is a mathematical representation of data in the high dimensional space known as vector space. These are the vector representations of the words, phrases, or, at times, complete texts or images. These numerical representations provide more contextual awareness of the words. &lt;/p&gt;

&lt;p&gt;To create these embeddings, we will be making use of the &lt;a href="https://docs.voyageai.com/docs/embeddings" rel="noopener noreferrer"&gt;Voyage AI embedding model&lt;/a&gt; and storing them in MongoDB. To do so, we will create another Symfony command to create those embeddings and flush them into the database.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/src/Command/EmbeddedChunks.php" rel="noopener noreferrer"&gt;EmbeddedChunks.php&lt;/a&gt; command class is responsible for generating vector embeddings for document chunks stored in MongoDB. It contains two main functions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The execute() method retrieves all chunked documents from the ChunkedDocuments collection and processes them in batches. Each batch of content is then passed to the Voyage AI API for embedding.&lt;/li&gt;
&lt;li&gt;The embedAndPersist() method handles the actual REST API call to Voyage AI. It sends the batch of texts, receives the generated embeddings, and stores them back into the corresponding documents in the database.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;?php&lt;/span&gt;

&lt;span class="k"&gt;declare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strict_types&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kn"&gt;namespace&lt;/span&gt; &lt;span class="nn"&gt;App\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;App\Document\ChunkedDocuments&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Doctrine\ODM\MongoDB\DocumentManager&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Contracts\HttpClient\HttpClientInterface&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Attribute\AsCommand&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Command\Command&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Input\InputInterface&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Output\OutputInterface&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\Console\Style\SymfonyStyle&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;use&lt;/span&gt; &lt;span class="nc"&gt;Symfony\Component\DependencyInjection\Attribute\Autowire&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;AsCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'app:embed-chunks'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'This command will create embedding for the stored chunks using Voyage AI API key and store into MongoDB database'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddedChunks&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;DocumentManager&lt;/span&gt; &lt;span class="nv"&gt;$documentManager&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;#[Autowire(env: 'VOYAGE_API_KEY')]&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$voyageAiApiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="na"&gt;#[Autowire(env: 'VOYAGE_ENDPOINT')]&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$voyageEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="na"&gt;#[Autowire(env: 'BATCH_SIZE')]&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$batchSize&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="na"&gt;#[Autowire(env: 'MAX_RETRIES')]&lt;/span&gt;
        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$maxRetries&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

        &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;HttpClientInterface&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;parent&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;__construct&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;protected&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;InputInterface&lt;/span&gt; &lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;OutputInterface&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$io&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SymfonyStyle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$output&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Batch embedding documents from MongoDB using Voyage AI'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nv"&gt;$documents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getRepository&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChunkedDocuments&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;findAll&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$documents&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'No documents found.'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SUCCESS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;


        &lt;span class="nv"&gt;$batchedContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
        &lt;span class="nv"&gt;$originalDocs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
        &lt;span class="nv"&gt;$embeddedCount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$documents&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getContent&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nv"&gt;$text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;mb_substr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 

            &lt;span class="nv"&gt;$batchedContent&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nv"&gt;$originalDocs&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$doc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$batchedContent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;batchSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nv"&gt;$embeddedCount&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;embedAndPersist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$batchedContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$originalDocs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$batchedContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
                &lt;span class="nv"&gt;$originalDocs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
                &lt;span class="nb"&gt;gc_collect_cycles&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$batchedContent&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$embeddedCount&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;embedAndPersist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$batchedContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$originalDocs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

        &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Embedded &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$embeddedCount&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; documents."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Command&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SUCCESS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;embedAndPersist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;HttpClientInterface&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$inputTexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$originalDocs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;SymfonyStyle&lt;/span&gt; &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'input'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$inputTexts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'model'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'voyage-3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'input_type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'document'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="nv"&gt;$embedded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;$attempt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;$success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;maxRetries&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$success&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

            &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POST'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;voyageEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'headers'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                    &lt;span class="s1"&gt;'Authorization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bearer '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;voyageAiApiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="s1"&gt;'json'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getContent&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'data'&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\RuntimeException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Missing "data" in Voyage AI response'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'data'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$index&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$embeddingResult&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nv"&gt;$embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$embeddingResult&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$embedding&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

                &lt;span class="nv"&gt;$originalDoc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$originalDocs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$index&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
                &lt;span class="nv"&gt;$originalDoc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setcontentEmbedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$embedding&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$originalDoc&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;setcreatedAt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\DateTime&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
                &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$originalDoc&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nv"&gt;$embedded&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

            &lt;span class="nv"&gt;$success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Symfony\Contracts\HttpClient\Exception\TransportExceptionInterface&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
                 &lt;span class="nc"&gt;\Symfony\Contracts\HttpClient\Exception\ClientExceptionInterface&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;
                 &lt;span class="nc"&gt;\Symfony\Contracts\HttpClient\Exception\ServerExceptionInterface&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Attempt &lt;/span&gt;&lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="s2"&gt; failed: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
            &lt;span class="nb"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nv"&gt;$attempt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Throwable&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$io&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Unexpected error: '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$embedded&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before the REST API call is made, it is important to store the API key and URL in the .env file as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;VOYAGE_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;VoyageAPI_KEY&amp;gt;
&lt;span class="nv"&gt;VOYAGE_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.voyageai.com/v1/embeddings
&lt;span class="nv"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;32
&lt;span class="nv"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run the above command to create an embedding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;php &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;memory_limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2G bin/console app:embed-chunks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is important to note here that this step works in batches and hence, this requires some time to create the embedding and store it in the database. &lt;/p&gt;

&lt;p&gt;At this point, you should be able to see chunks and their embedding being stored inside the MongoDB database. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcoe2nwsnyfmzrkphv6t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffcoe2nwsnyfmzrkphv6t.png" alt="Atlas UI representing how the chunks of the raw files has been stored" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Create vector search index
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;Vector search&lt;/a&gt; in MongoDB is a search method that allows you to perform semantic searching. It searches based on the vector embeddings that are close to the search query. It makes use of the vector search index to perform the vector search. &lt;/p&gt;

&lt;p&gt;To create a vector search index, navigate to the Search Tab of the Atlas UI. Click on the Vector Search tab and name the index, and select the right collection name on which the index is to be created. Click “Next” and select the path and the similarity method for the index. The below picture gives a reference for how the fields are to be filled.  &lt;/p&gt;

&lt;p&gt;Once done, click “Next,” and you should have your first vector index created. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmwno3hbcp8ciou15jyq7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmwno3hbcp8ciou15jyq7.png" alt="Atlas UI representing steps to generate the vector index" width="800" height="698"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To know more about creating an index, follow the documentation, &lt;a href="https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-type/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;How to Index Fields for Vector Search&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;At this point, we are all set to create the responses. &lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Create a vector search response into LLM formed response
&lt;/h3&gt;

&lt;p&gt;This is the last and final step to perform the vector search on the stored, chunked data. &lt;/p&gt;

&lt;p&gt;To do this, we first need to create the &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/src/Controller/ChatController.php" rel="noopener noreferrer"&gt;ChatController.php&lt;/a&gt;, which will hit the response call to generate the response. &lt;/p&gt;

&lt;p&gt;Let’s first create the ChatController as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Response&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nv"&gt;$answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;isMethod&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'POST'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;strtolower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'question'&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;

        &lt;span class="nv"&gt;$answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getPredefinedResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$answer&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nv"&gt;$answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;responseService&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getResponseForQuestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'chat.html.twig'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'question'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'answer'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This controller will be hit from the front-end twig page created. This &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/templates/chat.html.twig" rel="noopener noreferrer"&gt;chat.html.twig&lt;/a&gt; renders the below page:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhy6a15lt7bwmv7ysomx2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhy6a15lt7bwmv7ysomx2.png" alt="Screenshot for the chatbot UI" width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When a user submits a question via the form and clicks the "Send" button, the controller processes the input. &lt;/p&gt;

&lt;p&gt;It first checks for any predefined answers. If none are found, it delegates the query to &lt;a href="https://github.com/mongodb-developer/SymfonyChatBotApp/blob/main/src/Service/ResponseService.php" rel="noopener noreferrer"&gt;ResponseService.php&lt;/a&gt;, which performs the vector search and returns the most relevant answer based on the indexed content.&lt;/p&gt;

&lt;p&gt;The response service file has three functions:&lt;/p&gt;

&lt;p&gt;getResponseForQuestion() is the main entrypoint for generating the response. This generates an embedding using the generateEmbedding function and then performs the $vectorSearch aggregation. &lt;/p&gt;

&lt;p&gt;This further passes the query response to the formatResponse() function to generate an LLM formatted response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getResponseForQuestion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$embeddedquestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;generateEmbedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nv"&gt;$embeddedquestion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"Sorry, couldn't understand the question."&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;documentManager&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getDocumentCollection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;ChunkedDocuments&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;class&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nv"&gt;$pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Pipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Stage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;vectorSearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'vector_index'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'contentEmbedding'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queryVector&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;$embeddedquestion&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numCandidates&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Stage&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;project&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_id&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="nv"&gt;$cursor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$collection&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;aggregate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$pipeline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toArray&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$contents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;array_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nv"&gt;$cursor&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;formatResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$contents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;generateEmbeddings() sends the user's question to Voyage AI to generate a vector embedding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generateEmbedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;?array&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\GuzzleHttp\Client&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;voyageEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'headers'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'Authorization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bearer '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;voyageAiApiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'json'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'input'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$text&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="s1"&gt;'model'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'voyage-3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'input_type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'query'&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'timeout'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getBody&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getContents&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'data'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'embedding'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Throwable&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And finally, the formatResponse()** **sends the retrieved context and user question to OpenAI's API to format the response in a structured and helpful way. It creates a structured message payload expected by OpenAI's chat completion endpoint.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;formatResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;array&lt;/span&gt; &lt;span class="nv"&gt;$contents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nv"&gt;$client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;\GuzzleHttp\Client&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nv"&gt;$combineResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;implode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;array_map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'trim'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$contents&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="nv"&gt;$messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'system'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'You are a Symfony chatbot. You help users by answering questions based on Symfony documentation.'&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"Using the following context from Symfony documentation:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$combineResponse&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$messages&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"User query: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;
        &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$client&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;openAiApiUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'headers'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'Authorization'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bearer '&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;openAiApiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'json'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="s1"&gt;'model'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'gpt-4.1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$messages&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="s1"&gt;'timeout'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;

        &lt;span class="nv"&gt;$data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;json_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getBody&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getContents&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nv"&gt;$OpenAiResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'choices'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'message'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="s1"&gt;'No reply from OpenAI.'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$OpenAiResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;\Throwable&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;"Error generating response: "&lt;/span&gt; &lt;span class="mf"&gt;.&lt;/span&gt; &lt;span class="nv"&gt;$e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;getMessage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, before using these functions, it is important to add the OPENAI_KEY and URL added in the .env file as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;OPENAI_KEY&amp;gt;
&lt;span class="nv"&gt;OPENAI_API_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.openai.com/v1/chat/completions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, the getPredefinedResponse() method is used in the ChatController to handle simple, commonly used conversational phrases like "hi," "thanks," and "goodbye" with instant, hardcoded replies. This approach offers several key benefits:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Performance: It prevents the need to call the external APIs, resulting in faster response. &lt;/li&gt;
&lt;li&gt;Cost-efficient: Since this reduces the number of requests being sent to Voyage AI and OpenAI, it helps largely in reducing the costs. &lt;/li&gt;
&lt;li&gt;Resource optimization: This prevents unnecessary computational overhead, reserving the more complex and expensive AI operations for actual Symfony-related technical queries.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To do so, we have added an extra function that takes care of the predefined questions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;getPredefinedResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$message&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;?string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$responses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'hi'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Hello! How can I help you with Symfony today?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'hello'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Hello! What can I do for you today? We recommend asking a Symfony based question'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'hey'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Hey there! You can ask me anything about Symfony'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'how are you'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'I am doing fine, how are you? Are you looking for Symfony questions'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'good morning'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Good morning! Do you have any Symfony related queries?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'good evening'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Good evening! Need help with something Symfony-related?'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'bye'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Goodbye! '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'goodbye'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Bbye'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'thank you'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'You are welcome! Please feel free to ask any further questions'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'thanks'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Glad to be helpful'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'good night'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'Good night'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$responses&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$key&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;strpos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this stage, your application is fully set up and ready to be deployed as a chatbot for the Symfony Documentation page. To launch the application locally and start interacting with the chatbot, simply run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;symfony server:start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will start the Symfony development server and make your chatbot accessible for use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing
&lt;/h2&gt;

&lt;p&gt;Once the application is started, you can start by asking questions like: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are the key differences between Doctrine ORM and ODM?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You should get a response like: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57l0ai1em9tkaowlmuut.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F57l0ai1em9tkaowlmuut.png" alt="Screenshot of the chat application responding the question asked to describe difference between doctrine orm and odm" width="800" height="906"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Or ask: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does Symfony support different persistence layers like Doctrine ORM vs ODM?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The application replies: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww0e4o6n9hqr28k54fdb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fww0e4o6n9hqr28k54fdb.png" alt="Screenshot of the chat application responding the question asked to describe persistence layer  like doctrine orm and odm" width="800" height="906"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Similarly, let’s also check how the application responds to preformatted questions like “hello,” “bye,” etc. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4v5rmtuolfigqaqerbh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4v5rmtuolfigqaqerbh.png" alt="Screenshot of the chat application responding the question asked to answer to Hi" width="800" height="344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After this, you can play around with the questions for chatbot and see the response being received.  &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Finally, building intelligent chatbots is no longer reserved for complex, cloud-heavy architectures. With tools like Symfony, MongoDB Atlas, and vector search, we now have a clean, elegant, and developer-friendly way to create search-driven, conversational agents that are fast, context-aware, and scalable.&lt;/p&gt;

&lt;p&gt;In this tutorial, we have explored how to build a powerful chatbot application using Symfony and MongoDB’s vector search. By integrating natural language processing, embeddings, and MongoDB Atlas, we created a bot that can intelligently search through technical documentation and return accurate responses in real time. Symfony's modularity and MongoDB's rich document model make this a scalable and extensible solution for developer support, internal tools, and customer service platforms.&lt;/p&gt;

&lt;p&gt;If you have any further questions related to Symfony or any tech stack that has been mentioned, please refer to our &lt;a href="https://www.mongodb.com/docs/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=chatbot_symfony&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;official documentation page&lt;/a&gt; for more understanding. &lt;/p&gt;

</description>
      <category>symfony</category>
      <category>ai</category>
      <category>rag</category>
      <category>mongodb</category>
    </item>
    <item>
      <title>Data Modeling for Java Developers: Structuring With PostgreSQL and MongoDB</title>
      <dc:creator>Aasawari Sahasrabuddhe</dc:creator>
      <pubDate>Thu, 20 Mar 2025 23:00:00 +0000</pubDate>
      <link>https://dev.to/mongodb/data-modeling-for-java-developers-structuring-with-postgresql-and-mongodb-456l</link>
      <guid>https://dev.to/mongodb/data-modeling-for-java-developers-structuring-with-postgresql-and-mongodb-456l</guid>
      <description>&lt;p&gt;Application and system designs have always been considered the most essential step in application development. All the later steps and technologies to be used depend on how the system has been designed. If you are a Java developer, choosing the right approach can mean distinguishing between a rigid, complex schema and a nimble, scalable solution. If you are a Java developer who works with PostgreSQL or other relational databases, you understand the pain of representing the many-to-many relationships between the tables. &lt;/p&gt;

&lt;p&gt;This tutorial will ease your pain with these or other relationships defined in databases by making use of a document database, MongoDB. &lt;/p&gt;

&lt;p&gt;In this article, we’ll understand both approaches, contrasting PostgreSQL’s relational rigour with MongoDB’s document-oriented simplicity. You’ll learn how to weigh trade-offs like ACID compliance versus scalability and discover why MongoDB’s design might save you time. &lt;/p&gt;

&lt;h2&gt;
  
  
  Relationships in databases
&lt;/h2&gt;

&lt;p&gt;Relationships in databases introduce the connection between the entities of two or more different collections or tables. These relationships help the applications to have the dependencies interlinked with each other. To understand this in brief, let’s look at an example of a library management system that has tables for authors, books, issue details, etc. inside the database. &lt;/p&gt;

&lt;h3&gt;
  
  
  One-to-one relationship
&lt;/h3&gt;

&lt;p&gt;Each record in Table A is related to only one record in Table B. Example: A &lt;strong&gt;Library Member&lt;/strong&gt; table and an &lt;strong&gt;Issue Details&lt;/strong&gt; table can have a one-to-one relationship, where each member can have only one active book issued at a time.&lt;/p&gt;

&lt;h3&gt;
  
  
  One-to-many relationship
&lt;/h3&gt;

&lt;p&gt;A single record in Table A can be related to multiple records in Table B. For instance, a &lt;strong&gt;Book&lt;/strong&gt; can have many &lt;strong&gt;Reviews&lt;/strong&gt; or a user can have many contact addresses. &lt;/p&gt;

&lt;h3&gt;
  
  
  Many-to-many relationship
&lt;/h3&gt;

&lt;p&gt;In this type of relationship, multiple records in Table A can relate to various records in Table B, requiring a junction (or bridge) table. For example, a &lt;strong&gt;Book&lt;/strong&gt; can be issued to multiple &lt;strong&gt;Members&lt;/strong&gt; over time, and a &lt;strong&gt;Member&lt;/strong&gt; can borrow numerous &lt;strong&gt;Books&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relational vs. document databases
&lt;/h2&gt;

&lt;p&gt;A relational database is based on rigid predefined schemas and emphasises relationships. These work well when the structure is strict and is not subjected to much change in the future. SQL or MySQL has been the backbone of relational databases for many years now. &lt;/p&gt;

&lt;p&gt;Whether you’re organizing customer contacts or powering complex financial systems, SQL’s straightforward yet powerful commands have made it a go-to tool for businesses worldwide—and it’s been that way for decades. Relational databases thrive on flexibility, adapting to everything from small-scale apps to enterprise-level solutions. That’s why industries of all kinds, from retail to healthcare, rely on them to store, manage, and make sense of their data. Think of SQL as the quiet workhorse behind the scenes, keeping everything running smoothly.&lt;/p&gt;

&lt;p&gt;A document database, on the other hand, is meant for a more flexible structure. They work well with modern applications where a variety of data is stored and processed from the applications. A document database represents data as documents, typically using formats like JSON (JavaScript object notation) or BSON (Binary JSON). A document database is often referred to as a NoSQL database.&lt;/p&gt;

&lt;p&gt;MongoDB is the most popular database used in modern application development. The flexible storage capability allows the application to scale horizontally and grow efficiently with the rapid changes. &lt;/p&gt;

&lt;h2&gt;
  
  
  Postgres implementation with Java
&lt;/h2&gt;

&lt;p&gt;To understand how data in a Postgres database would be implemented with Java, let us consider an example ER diagram for a library management system consisting of Books, Authors, and Issue Details. &lt;/p&gt;

&lt;p&gt;The ER diagram below depicts the one-to-many relationship between Books and Issue Details and the many-to-many relationship between Author and Books using the Author_Books table created using join operations. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8e3jyy84glm4d45ge9g8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8e3jyy84glm4d45ge9g8.png" alt="ER diagram for the library management system using four different tables" width="800" height="652"&gt;&lt;/a&gt; &lt;br&gt;
&lt;em&gt;Fig: ER diagram for the library management system using four different tables&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Now, let us try to understand how these entities will be represented using the Java Hibernate code and the relationship annotations. &lt;/p&gt;

&lt;p&gt;Let's first understand the Author representing the Author entity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Author&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="nd"&gt;@GeneratedValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GenerationType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;IDENTITY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;authorID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;nationality&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@ManyToMany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mappedBy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cascade&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CascadeType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ALL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Book&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, the Books entity would be represented as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="nd"&gt;@GeneratedValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GenerationType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;IDENTITY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;bookID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;isbn&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;publishedYear&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@ManyToMany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cascade&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CascadeType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ALL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nd"&gt;@JoinTable&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"author_books"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;joinColumns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;@JoinColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"bookID"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;inverseJoinColumns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nd"&gt;@JoinColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"authorID"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Author&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@OneToMany&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mappedBy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"book"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cascade&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CascadeType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ALL&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;IssueDetails&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;issueDetails&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;//Getters and Setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, the Author_books table has been created using the join methods and hence, we do not need to define a separate entity class.&lt;br&gt;&lt;br&gt;
The relationship between the entities has been established using the relationship annotations like @ManytoMany. &lt;/p&gt;

&lt;p&gt;Similarly, to represent the one-to-many relationship between Books and the Issue Details using the @OneToMany annotation, we can write the entity class as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;IssueDetails&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="nd"&gt;@GeneratedValue&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GenerationType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;IDENTITY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;issueID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@ManyToOne&lt;/span&gt;
    &lt;span class="nd"&gt;@JoinColumn&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"bookID"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nullable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="n"&gt;userID&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;issueDate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;returnDate&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Getters and Setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The relationships implemented in the above code snippet could be very simple. If you are wondering how, the answer to this is in the next section where we will understand how simple it can be to use MongoDB as the database. &lt;/p&gt;

&lt;h2&gt;
  
  
  MongoDB implementation with Java
&lt;/h2&gt;

&lt;p&gt;Let us understand the representation in a simpler way using the MongoDB document model and its data modelling techniques. &lt;/p&gt;

&lt;p&gt;For example, representing the many-to-many relationship between the authors and books can be done using &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/concepts/embedding-vs-references/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=data-medding-with-java&amp;amp;utm_term=aasawari.sahasrabuddhe#embedded-data-models" rel="noopener noreferrer"&gt;embedding&lt;/a&gt; or &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/concepts/embedding-vs-references/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=data-modelling-with-java&amp;amp;utm_term=aasawari.sahasrabuddhe#std-label-data-modeling-referencing" rel="noopener noreferrer"&gt;referencing&lt;/a&gt; data modelling techniques. &lt;/p&gt;

&lt;p&gt;It is also important to note that if authors and books are frequently queried together, embedding author references within books or vice versa might be efficient. However, since an author can write many books and a book can have multiple authors, embedding might lead to duplication. So maybe using references with arrays of author IDs in books and book IDs in authors would be better. But that could complicate updates if an author's information changes. Alternatively, just using an array of author IDs in the book documents and not maintaining the inverse in authors might suffice, depending on query needs. &lt;/p&gt;

&lt;p&gt;Depending on the modelling technique chosen, the documents inside in the collection would change. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
  &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;J.K. Rowling&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jkrowling@example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;nationality&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;British&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;books&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
    &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The book collection would look like the below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a2&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
  &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Harry Potter and the Sorcerer's Stone&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;isbn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;9780590353427&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;genre&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Fantasy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;publishedYear&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1997&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;60d5ec9f4b1a8e2a1c8f7a4&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
      &lt;span class="na"&gt;issueDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2023-10-01&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="na"&gt;returnDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;ISODate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2023-10-15&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Returned&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see from the above example, the document model of MongoDB is similar to the POJOs representation in Java. Therefore, the representation of the model class becomes simpler with Java. &lt;/p&gt;

&lt;p&gt;For example, the author class in Java would be written as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Author&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;nationality&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Book&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;bookIds&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; 
    &lt;span class="c1"&gt;//Getters and Setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The book class can be written as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;ObjectId&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;isbn&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;genre&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;publishedYear&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Author&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;authorIds&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;//  Getters and Setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the above code snippet, you can see cleaner and less complex code without any complicated annotations to represent the relationships. &lt;/p&gt;

&lt;h2&gt;
  
  
  Scalability and performance
&lt;/h2&gt;

&lt;p&gt;After the demonstration of a few small code snippets from both databases, it is also important to understand the scalability and the performance considerations while creating a Java application. &lt;/p&gt;

&lt;p&gt;We have all heard by now that Postgres is a powerful relational database that handles the structured query, but what does it do when an application needs to be scaled? Postgres scales vertically, which involves upgrading the hardware of a single server (e.g., adding more CPU, RAM, or storage). This approach is ideal for applications with predictable growth and moderate traffic. Still, it comes with limitations as hardware upgrades can become expensive, and there’s a physical ceiling to how much a single server can handle.&lt;/p&gt;

&lt;p&gt;On the other hand, MongoDB is designed for scalability and high performance, especially regarding unstructured data and modern applications. MongoDB performs scalability using horizontal scaling, which means distributing data across multiple servers. This process in MongoDB is known as &lt;a href="https://www.mongodb.com/docs/manual/sharding/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=data-modelling-with-java&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;sharding,&lt;/a&gt; and the shards are created by selecting the right and appropriate shard key. Additionally, MongoDB optimizes read and write throughput through features like replica sets (for read scaling) and batch operations (for write scaling).&lt;/p&gt;

&lt;p&gt;The performance of Java applications with MongoDB is optimised using demoralizations and indexing. Embedding and referencing the data reduces the need for expensive joins and enhances the query and application performance. &lt;/p&gt;

&lt;h2&gt;
  
  
  Migration considerations
&lt;/h2&gt;

&lt;p&gt;When you migrate the data from Postgres to MongoDB, the process involves more than just moving the data. It specifically requires you to think about the structuring of the data and also how to manage the data. In the next section, we’ll talk about those briefly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rethinking schema design
&lt;/h3&gt;

&lt;p&gt;In a relational database like Postgres, data is often normalised across multiple tables—in our case, between Authors, Books, and Issue Details. In MongoDB, we need to combine the related data into embedded documents. For example, instead of storing authors and books in separate tables, you can embed author information directly within a book document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Book&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Author&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Embedded authors&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;IssueDetails&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Embedded issues&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Replacing joins with aggregations
&lt;/h3&gt;

&lt;p&gt;Postgres makes use of joins to retrieve the related data. On the other hand, MongoDB relies mostly on writing aggregation pipelines to retrieve the related information. For example, a Postgres query might be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; 
&lt;span class="n"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt; 
&lt;span class="n"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;author_books&lt;/span&gt; &lt;span class="n"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;book_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;author_books&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;book_id&lt;/span&gt; 
&lt;span class="n"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt; &lt;span class="n"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;author_books&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;authors&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;author_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This could simply be done using the Java code using the aggregation as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="n"&gt;lookupStage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"$lookup"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"from"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"localField"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"foreignField"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;append&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"as"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"authorDetails"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;);&lt;/span&gt; 
&lt;span class="nc"&gt;AggregateIterable&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;booksCollection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Arrays&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;asList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lookupStage&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Data modelling is a crucial step in application development and selecting the right database can significantly impact the scalability, performance, and maintainability of the system. For Java developers, identifying the strengths and weaknesses of the relational and non-relational databases is essential to correctly structure and manage the data. &lt;/p&gt;

&lt;p&gt;From this tutorial, we have understood that PostgreSQL, with its relational model, excels in scenarios requiring complex queries, ACID compliance, and structured data. Its ability to handle relationships through joins and enforce constraints makes it ideal for applications like financial systems, ERP platforms, and reporting tools. However, its reliance on vertical scaling can become a limitation as applications grow.&lt;/p&gt;

&lt;p&gt;On the other hand, MongoDB’s data document model offers greater flexibility, scalability, and higher performance, which makes it a great fit for modern applications. Its ability to handle unstructured data and scale horizontally allows developers to build nimble, scalable solutions without the rigidity of a predefined schema.&lt;/p&gt;

&lt;p&gt;MongoDB’s document model becomes a natural fit for the unstructured data in modern applications, offering Java developers the ability to store, query, and manipulate complex data without rigid schemas. With powerful indexing, aggregation pipelines, and native support for JSON-like BSON documents, MongoDB simplifies data access, transformations, and manipulation easier. Whether handling product catalogues, customer interactions, or AI-driven insights, MongoDB empowers Java developers to build scalable, high-performance applications with minimal friction.  &lt;/p&gt;

</description>
      <category>java</category>
      <category>mongodb</category>
      <category>postgres</category>
      <category>database</category>
    </item>
    <item>
      <title>Building a Spring Boot CRUD Application Using MongoDB’s Relational Migrator</title>
      <dc:creator>Aasawari Sahasrabuddhe</dc:creator>
      <pubDate>Tue, 18 Feb 2025 14:00:00 +0000</pubDate>
      <link>https://dev.to/mongodb/building-a-spring-boot-crud-application-using-mongodbs-relational-migrator-59kf</link>
      <guid>https://dev.to/mongodb/building-a-spring-boot-crud-application-using-mongodbs-relational-migrator-59kf</guid>
      <description>&lt;p&gt;Imagine this: You’re working with a relational database that’s served you well for years, but now your applications demand flexibility, scalability, and faster development cycles. Transitioning from structured tables to MongoDB’s dynamic, schema-less JSON documents might be daunting. That’s where MongoDB’s Relational Migrator becomes your secret weapon.&lt;/p&gt;

&lt;p&gt;The Relational Migrator tool built by MongoDB no longer only supports migrating data from relational databases to MongoDB. Rather, it supports features like code generation and also a query converter that converts a relational database, such as a Postgres query, or queries from views and stored procedures, into a MongoDB query. &lt;/p&gt;

&lt;p&gt;Relational Migrator is a free tool that leverages intelligent algorithms and Gen AI to solve the most common data modeling, code conversion, and migration challenges, reducing the effort and risk involved in migration projects.&lt;/p&gt;

&lt;p&gt;In this tutorial, we’ll explore how to harness the power of Spring Boot to build a fully functional CRUD application on top of your modernized database using the Relational Migrator. &lt;/p&gt;

&lt;p&gt;This guide will help modernize a legacy application built with relational databases into a complete application using MongoDB.&lt;/p&gt;

&lt;p&gt;Let's get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Pre-requisites
&lt;/h2&gt;

&lt;p&gt;Before we get into the actual implementation to build the application, below are the pre-requisites that you need to have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Relational Migrator—you can download the tool available for free using the &lt;a href="https://www.mongodb.com/try/download/relational-migrator?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB Tools&lt;/a&gt; download page&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a href="https://github.com/mongodb-developer/relational-migrator-lab/blob/main/docker/sample-postgres-library/init/1-library-schema-and-data.sql" rel="noopener noreferrer"&gt;SQL schema script&lt;/a&gt; to create the Postgres database and tables &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.oracle.com/in/java/technologies/downloads/" rel="noopener noreferrer"&gt;Java 21&lt;/a&gt; or above&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The IDE of your choice&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A free Atlas cluster—create your first&lt;a href="https://account.mongodb.com/account/register?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt; Atlas cluster for free&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Relational Migrator
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://www.mongodb.com/docs/relational-migrator/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;Relational Migrator&lt;/a&gt; is a tool developed by MongoDB that helps you migrate data from a relational database into MongoDB. It is an application with a UI that helps you map the data from the relational schema to the MongoDB schema. &lt;/p&gt;

&lt;p&gt;Along with providing migration capability, the tool offers more than the name suggests. &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generate the application code&lt;/strong&gt;: After the data has been migrated into the Atlas cluster, the developer asks how we begin the text. Well! The code generation feature of the RM tool is the answer to the questions. This allows you to get the boilerplate code for performing the basic CRUD operations and also expand the project to perform other complex functionality based on the project requirements. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Query converter&lt;/strong&gt;: The tool's query converter feature helps you convert your SQL queries, views, and stored procedures into equivalent MongoDB queries and validate them. The query converter uses the relational schema, the MongoDB schema, and the mapping rules in your current project to determine how the queries should be converted. Conversions may fail or be incorrect if the queries reference tables that are not in your relational schema or if they are not mapped to MongoDB collections. This query converter also works for views and stored procedures.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Migrating the data from the PostgreSQL schema to MongoDB
&lt;/h2&gt;

&lt;p&gt;Once your free Atlas cluster is ready, download the &lt;a href="https://github.com/mongodb-developer/relational-migrator-lab/blob/main/docker/sample-postgres-library/init/1-library-schema-and-data.sql" rel="noopener noreferrer"&gt;SQL schema script&lt;/a&gt;, which has a sample  Postgres schema with data inserted into the tables. Use the &lt;a href="https://www.pgadmin.org/download/" rel="noopener noreferrer"&gt;PgAdmin&lt;/a&gt; application to upload the schema and explore the created schema. The sample schema created nine different tables and their relationships for this tutorial. We will use the same schema to generate the equivalent and simpler MongoDB schema, following the different design patterns. First, let's examine both schemas in detail and then understand how to convert them to the equivalent MongoDB schema for further application development. &lt;/p&gt;

&lt;h3&gt;
  
  
  Analysing the Postgres schema
&lt;/h3&gt;

&lt;p&gt;In this section, we will understand an example of a library management system where we have nine different tables, and the relationship between them is shown in the diagram below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjz0onokqk2qp839418n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjz0onokqk2qp839418n.png" alt="Image representing the relational schema diagram for the database" width="800" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ER diagram above also depicts the relationship between these tables. Relational databases often require intricate joins, nested subqueries, and multiple layers of aggregation to retrieve even moderately complex data relationships. As a result, what might seem like a simple query can quickly escalate into a large, complex, and unwieldy SQL statement. This not only increases the cognitive load for developers but also makes the code harder to maintain and prone to errors.&lt;/p&gt;

&lt;p&gt;For example, let’s consider a scenario where we need to list all authors with their aliases.&lt;/p&gt;

&lt;p&gt;To do so, we would write the SQL query as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT a.name, aa.alias
FROM library.authors a
LEFT JOIN library.author_alias aa ON a.id = aa.author_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This leads to the joining of two different tables using the foreign key constraint. In the next section, we will understand how this query would look in MongoDB. &lt;/p&gt;

&lt;h3&gt;
  
  
  Creating mappings to generate the equivalent MongoDB schema
&lt;/h3&gt;

&lt;p&gt;The nine different tables in the relational schema need to be converted into five different collections using the &lt;a href="https://www.mongodb.com/docs/relational-migrator/mapping-rules/mapping-rules/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;mapping technique&lt;/a&gt; given by MongoDB. These techniques follow the &lt;a href="https://www.mongodb.com/docs/manual/data-modeling/design-patterns/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;design patterns&lt;/a&gt; followed by MongoDB in data modelling concepts. &lt;/p&gt;

&lt;p&gt;After the mappings have been applied to the tables, the MongoDB schema would like the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8dycmqr8zxtnpi6ovgtq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8dycmqr8zxtnpi6ovgtq.png" alt="Image representing the desired MongoDB schema" width="800" height="687"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's try to convert the above SQL query into the equivalent MongoDB query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aggregate&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="na"&gt;$project&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;aliases&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one such example where a huge ER diagram with nine different tables can be combined to form five different collections and make the query simpler. &lt;/p&gt;

&lt;p&gt;After applying the mapping techniques, once the desired MongoDB schema is achieved, the next step is to migrate the data into the MongoDB Atlas cluster. &lt;/p&gt;

&lt;h3&gt;
  
  
  Migrating the data into MongoDB
&lt;/h3&gt;

&lt;p&gt;To migrate the data into the cluster, click on “Data Migration” and select “Create Job Migration.” &lt;/p&gt;

&lt;p&gt;To begin the migration, you should have the host as the relational database connection secure with you, and the destination is where you can apply the MongoDB connection string. Once both the source and the destination databases are all set up, the next step would be to select the &lt;a href="https://www.mongodb.com/docs/relational-migrator/jobs/sync-jobs/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;type of migrations.&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;In MongoDB, Relational Migrator offers two types of migrations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Continuous migrations&lt;/strong&gt;: Continuous migration jobs cover new incoming data with a zero-downtime change data capture (CDC) migration strategy. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Snapshot migration&lt;/strong&gt;: This only runs once and usually does a point-in-time migration. &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After selecting the appropriate migration job based on your requirements, the next step is to click on “Review Summary” and then “Start.” The migration process will take some time, depending on the size of your data.&lt;/p&gt;

&lt;p&gt;Once the migration is complete, you'll find your data seamlessly transferred to the MongoDB Atlas cluster. However, this is often the point where developers encounter uncertainty about what to do next.&lt;/p&gt;

&lt;p&gt;But don’t worry—MongoDB Relational Migrator eliminates that confusion.&lt;/p&gt;

&lt;p&gt;The tool’s Code Converter feature is designed to simplify your journey further by providing boilerplate code in the programming language of your choice. This makes it easy to jumpstart your application development, so you can focus on building great features without worrying about the groundwork.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code generation with Relational Migrator
&lt;/h2&gt;

&lt;p&gt;To create an application with migrated data using Relational Migrator, you should click on the “Code Generation” tab and select the language and template as Java and Spring Data, respectively. The below screenshot from the Relational Migrator tools explains the steps for the code generation. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgftcazognj7pfmnk7clw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgftcazognj7pfmnk7clw.png" alt="Screenshot from Relational Migrator tool performing the code generation" width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you select the framework, you are all set to download/copy the code in the editor of your choice for further processing. &lt;/p&gt;

&lt;h2&gt;
  
  
  Building Spring Boot application
&lt;/h2&gt;

&lt;p&gt;At this point, building the application becomes very easy, as all the essential code is provided. You just need to perform the business logic, and the application is all set to fulfil the requirements. &lt;/p&gt;

&lt;p&gt;In this tutorial, we are using IntelliJ as the IDE and copying all the code provided by the Relational Migrator into a new Spring Boot project created using the Spring Initializr. &lt;/p&gt;

&lt;p&gt;Depending on the data model that has been selected, the Relational Migrator tools create the &lt;a href="https://spring.io/guides/gs/accessing-data-jpa" rel="noopener noreferrer"&gt;entity&lt;/a&gt; and the &lt;a href="https://docs.spring.io/spring-data/data-commons/docs/1.6.1.RELEASE/reference/html/repositories.html" rel="noopener noreferrer"&gt;repository&lt;/a&gt; files for the application. &lt;/p&gt;

&lt;p&gt;Once the entity and the repository files are all copied/downloaded, all you need to do is create the service and the controller to perform the business logic of the applications. After all the files are copied, the project will look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjq1djs4ewlxkux75y8id.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjq1djs4ewlxkux75y8id.png" alt="Screenshot from InteliJ representing the project structure of the application" width="800" height="1254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The controller has all the REST API calls, and the service files are where we need to write the aggregate pipelines. &lt;/p&gt;

&lt;p&gt;Let us understand a few examples in the next section.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples of aggregation pipelines
&lt;/h3&gt;

&lt;p&gt;At this point in the application development, we have all the entity and repository files required for building the complete application. &lt;/p&gt;

&lt;p&gt;Let us take an example where you need to find all books reviewed within a specific date range. To perform so in the Postgres query, you would have the query as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    r.timestamp,
    u.name AS reviewer_name,
    b.title AS book_title,
    r.rating,
    r.text
FROM 
    library.reviews r
JOIN 
    library.books b ON r.book_id = b.id
JOIN 
    library.users u ON r.user_id = u.id
WHERE 
    r.timestamp BETWEEN '2022-01-01' AND '2024-12-31'
ORDER BY 
    r.timestamp ASC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The equivalent MongoDB Java query in the schema can simply be written as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ReviewsEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getBookReviews&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Criteria&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;gte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;util&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1640995200000L&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;lte&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;util&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1735603200000L&lt;/span&gt;&lt;span class="o"&gt;))),&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;lookup&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"books"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_id.bookId"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"book"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;unwind&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"book"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"book.title"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"bookTitle"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"rating"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"rating"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;Direction&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ASC&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;AggregationResults&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ReviewsEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"reviews"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;ReviewsEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, a simple query can get an alias name for the author, for which the Postgres query can be written as below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT a.name, aa.alias
FROM library.authors a
LEFT JOIN library.author_alias aa ON a.id = aa.author_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This Postgres query can be simply used in MongoDB using the $project stage as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AuthorAlias&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;getAuthorAliases&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Aggregation&lt;/span&gt; &lt;span class="n"&gt;aggregation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newAggregation&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;Aggregation&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andInclude&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"authorAliases"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;andExclude&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"_id"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;AggregationResults&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;AuthorAlias&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mongoTemplate&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;aggregate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;aggregation&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"authors"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
                &lt;span class="nc"&gt;AuthorAlias&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getMappedResults&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More such examples are provided in the &lt;a href="https://github.com/mongodb-developer/Spring-Boot-With-Relational-Migrator" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;, which you can use and extend to perform more functionalities. &lt;/p&gt;

&lt;p&gt;This tutorial therefore explains how to leverage the Relational Migrator and its functionality to create applications with no complexity and where the focus is more on the business logic and less on the boilerplate repetitive code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This article shows how transitioning from traditional relational databases to MongoDB’s dynamic and schema-less architecture can be overwhelming, but MongoDB’s Relational Migrator simplifies the process. This tool facilitates seamless data migration and offers advanced features like SQL-to-MongoDB query conversion and automatic boilerplate code generation. In a practical tutorial, the article demonstrates how to modernize a legacy relational database-driven application into a Spring Boot-based CRUD application using MongoDB and Relational Migrator. By converting a sample relational schema into an optimized MongoDB schema, the tool reduces the complexity of joins and nested queries, paving the way for faster and more intuitive development cycles.&lt;/p&gt;

&lt;p&gt;The guide further explores the step-by-step implementation, from migrating data into MongoDB Atlas using Snapshot or Continuous Migration strategies to generating ready-to-use application code for CRUD operations. The Relational Migrator provides a streamlined workflow for developers by offering code templates compatible with Java and Spring Data, making it easy to build on top of the migrated database. Additionally, the article showcases how MongoDB’s aggregation framework simplifies queries, such as filtering reviews within a date range or retrieving authors with aliases, compared to SQL. With examples and practical insights, this tutorial empowers developers to create scalable, modern applications effortlessly.&lt;/p&gt;

&lt;p&gt;If you have any questions related to the article or you wish to learn more about MongoDB concepts, please visit the &lt;a href="https://www.mongodb.com/community/forums/?utm_campaign=devrel&amp;amp;utm_source=third-party-content&amp;amp;utm_medium=cta&amp;amp;utm_content=Building%20a%20Spring%20Boot%20CRUD%20Application%20Using%20MongoDB%E2%80%99s%20Relational%20Migrator&amp;amp;utm_term=aasawari.sahasrabuddhe" rel="noopener noreferrer"&gt;MongoDB community forums&lt;/a&gt; for the latest updates.&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>springboot</category>
      <category>developers</category>
      <category>java</category>
    </item>
  </channel>
</rss>
