<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dheeraj Malhotra</title>
    <description>The latest articles on DEV Community by Dheeraj Malhotra (@dheeraj-lee27).</description>
    <link>https://dev.to/dheeraj-lee27</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2896492%2Fddfa77ee-0615-4ad7-827c-267eeb818658.png</url>
      <title>DEV Community: Dheeraj Malhotra</title>
      <link>https://dev.to/dheeraj-lee27</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dheeraj-lee27"/>
    <language>en</language>
    <item>
      <title>A Comprehensive Comparison of the Three Major Python Web Frameworks: Which One Do You Favor? | Opinion</title>
      <dc:creator>Dheeraj Malhotra</dc:creator>
      <pubDate>Tue, 20 May 2025 10:38:27 +0000</pubDate>
      <link>https://dev.to/dheeraj-lee27/a-comprehensive-comparison-of-the-three-major-python-web-frameworks-which-one-do-you-favor--2ec9</link>
      <guid>https://dev.to/dheeraj-lee27/a-comprehensive-comparison-of-the-three-major-python-web-frameworks-which-one-do-you-favor--2ec9</guid>
      <description>&lt;p&gt;When you search for Python web frameworks, Django, Flask, and FastAPI consistently appear. Our latest Python developer survey results confirm that these three frameworks remain the top choices for developers using Python for backend web development. All three are open-source and compatible with the latest Python versions.&lt;br&gt;
But how do you determine which web framework is best for your project? This article will explore the strengths and weaknesses of each framework and compare their performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;a href="https://www.djangoproject.com/" rel="noopener noreferrer"&gt;Django&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Django is a "batteries-included" full-stack web framework used by companies like Instagram, Spotify, and Dropbox. Hailed as "the web framework for perfectionists with deadlines," Django was designed to make building robust web applications simpler and faster.&lt;br&gt;
First launched as an open-source project in 2005, Django is quite mature nearly 20 years later, yet it's still under active development. It's suitable for many web applications, including social media, e-commerce, and news and entertainment websites.&lt;br&gt;
Django follows a Model-View-Template (MVT) architecture, where each component has a specific role. The Model handles data and defines its structure. The View manages business logic, processes requests, and fetches necessary data from the Model. Finally, the Template renders this data to the end-user, similar to the View in a Model-View-Controller (MVC) architecture.&lt;br&gt;
As a full-stack web framework, Django can be used to build an entire web application, from the database to the HTML and JavaScript frontend.&lt;br&gt;
Alternatively, you can combine Django with a frontend framework (like React) using Django REST Framework to build mobile and browser-based applications.&lt;br&gt;
Explore our comprehensive Django guide, which includes an overview of the basics, a structured learning path, and other resources to help you master the framework.&lt;/p&gt;
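&lt;p&gt;To make the MVT split concrete, here is a minimal sketch (illustrative only, using a hypothetical &lt;code&gt;articles&lt;/code&gt; app) showing where the Model and View live:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# models.py -- the Model defines the data and its structure
from django.db import models

class Article(models.Model):
    title = models.CharField(max_length=200)
    published = models.DateTimeField(auto_now_add=True)

# views.py -- the View holds the business logic and fetches data from the Model
from django.shortcuts import render

def article_list(request):
    articles = Article.objects.order_by("-published")
    # The Template (articles/list.html) then renders this data for the end-user
    return render(request, "articles/list.html", {"articles": articles})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;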

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fym7wvq7aya793enoge02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fym7wvq7aya793enoge02.png" alt="Image description" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Advantages of Django
&lt;/h3&gt;

&lt;p&gt;There are many reasons why Django remains one of the most widely used Python web frameworks, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extensive Functionality: Django's "batteries-included" approach provides built-in features such as authentication, caching, data validation, and session management. Its Don't Repeat Yourself (DRY) principle speeds up development and reduces bugs.&lt;/li&gt;
&lt;li&gt;Easy Setup: Django simplifies dependency management by leveraging its built-in features, reducing the need for external packages. This streamlines initial setup, minimizes compatibility issues, and helps you get to work quickly.&lt;/li&gt;
&lt;li&gt;Database Support: Django's ORM (Object-Relational Mapping) makes data handling straightforward, allowing you to use databases like SQLite, MySQL, and PostgreSQL without SQL knowledge. However, it's less suited for non-relational databases like MongoDB.&lt;/li&gt;
&lt;li&gt;Security: Built-in defenses against common vulnerabilities like Cross-Site Scripting (XSS), SQL injection, and clickjacking can help you secure your application quickly from the start.&lt;/li&gt;
&lt;li&gt;Scalability: While monolithic, Django still allows for horizontal scaling of application architecture (business logic and templates), caching to reduce database load, and asynchronous processing for improved efficiency.&lt;/li&gt;
&lt;li&gt;Community and Documentation: Django boasts a large, active community and detailed documentation, providing ready-made tutorials and support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Disadvantages of Django
&lt;/h3&gt;

&lt;p&gt;Despite its many advantages, you might consider options other than Django when developing your next web application.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not Lightweight: Its "batteries-included" design can be overkill for small applications, where a lightweight framework like Flask might be more suitable.&lt;/li&gt;
&lt;li&gt;Steeper Learning Curve: With Django's extensive functionality comes a naturally steeper learning curve, though many resources are available to help new developers.&lt;/li&gt;
&lt;li&gt;Performance: Django is generally slower compared to frameworks like Flask and FastAPI, but built-in caching and asynchronous processing can help improve response times.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;a href="https://flask.palletsprojects.com/en/stable/" rel="noopener noreferrer"&gt;Flask&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;Flask is a Python-based microframework for backend web development. But don't let the "micro" fool you; as we'll see, Flask isn't limited to small web applications.&lt;/p&gt;

&lt;p&gt;Flask is designed with a simple core based on Werkzeug WSGI (Web Server Gateway Interface) and Jinja2 templates. Notable Flask users include Netflix, Airbnb, and Reddit.&lt;br&gt;
Originally an April Fool's joke, Flask was released as an open-source project in 2010, a few years after Django. The microframework approach is fundamentally different from Django's. While Django takes a "batteries-included" approach with many features needed to build web applications, Flask is much more minimalist.&lt;br&gt;
The philosophy behind microframeworks is that everyone has their preferences, and developers should be free to choose their components. As a result, Flask doesn't include a database, ORM (Object-Relational Mapper), or ODM (Object Document Mapper).&lt;br&gt;
When building a web application with Flask, very little is predetermined. This can offer significant benefits, which we'll discuss below.&lt;/p&gt;
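&lt;p&gt;As a quick illustration of that minimalism, here is a complete "hello world" Flask application (a generic sketch, not tied to any particular project):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# app.py -- a complete Flask application in a few lines
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    # Flask serializes the returned dict to a JSON response
    return {"message": "Hello from Flask"}

if __name__ == "__main__":
    app.run(debug=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;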

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wh8mzdwukd3kx5hit89.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9wh8mzdwukd3kx5hit89.png" alt="Image description" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Advantages of Flask
&lt;/h3&gt;

&lt;p&gt;Through our State of the Developer Ecosystem survey, we've seen Flask's usage steadily increase over the past five years, surpassing Django for the first time in 2021.&lt;br&gt;
Reasons to choose Flask as your backend web framework include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lightweight Design: Flask's minimalist approach offers a flexible alternative to Django, ideal for smaller applications or projects that don't need Django's extensive features. However, Flask isn't limited to small projects and can be scaled as needed.&lt;/li&gt;
&lt;li&gt;Flexibility: Flask allows you to choose libraries and frameworks for core functionalities like data handling and user authentication. This enables you to select the best tools for your project and swap components out as your needs evolve.&lt;/li&gt;
&lt;li&gt;Scalability: Flask's modular design makes it easy to scale horizontally. Using a NoSQL database layer can further enhance scalability.&lt;/li&gt;
&lt;li&gt;Gentle Learning Curve: Flask's simple design makes it easy to learn, though for more complex applications, you might need to explore more extensions.&lt;/li&gt;
&lt;li&gt;Community and Documentation: Flask has extensive (perhaps slightly more technical) documentation and a clear codebase. While Flask's community is smaller than Django's, it's consistently active and steadily growing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Disadvantages of Flask
&lt;/h3&gt;

&lt;p&gt;While Flask has many advantages, you should consider a few things before using it for your web development project.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bring Your Own Everything: Flask's microframework design and flexibility require you to handle most core functionalities, including data validation, session management, and caching. While beneficial for flexibility, this can slow down development as you need to find existing libraries or build features from scratch. Additionally, long-term dependency management is necessary to ensure compatibility with Flask.&lt;/li&gt;
&lt;li&gt;Security: Flask has minimal built-in security. Beyond protecting client cookies, you must implement web security best practices and ensure the security of included dependencies, applying updates as needed.&lt;/li&gt;
&lt;li&gt;Performance: While Flask performs slightly better than Django, it lags behind FastAPI. Flask offers some ASGI support (the standard used by FastAPI), but it's more closely tied to WSGI.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;As the name suggests, FastAPI is a microframework for building high-performance web APIs using Python. Although relatively new (first released as open source in 2018), FastAPI has quickly gained popularity among developers, consistently ranking third in our list of most popular Python web frameworks since 2021.&lt;br&gt;
FastAPI is built on the ASGI (Asynchronous Server Gateway Interface) server Uvicorn and the web microframework Starlette. FastAPI adds data validation, serialization, and documentation to simplify building web APIs.&lt;/p&gt;

&lt;p&gt;When developing FastAPI, the creators of this microframework drew on their experience using many different frameworks and tools. Django was developed before frontend JavaScript web frameworks (like React or Vue.js) became popular, but FastAPI was designed with this environment in mind.&lt;br&gt;
In recent years, OpenAPI (formerly Swagger) has emerged as a standard format for defining and documenting API structures, providing FastAPI with an industry standard it could leverage.&lt;br&gt;
Beyond the implicit use case of creating RESTful APIs, FastAPI is also ideal for applications requiring real-time responses, such as messaging platforms and dashboards. Its high performance and asynchronous capabilities make it excellent for data-intensive applications, including machine learning models, data processing, and analysis.&lt;/p&gt;
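&lt;p&gt;To show how type hints drive FastAPI's validation and documentation, here is a minimal sketch (illustrative only, with a made-up &lt;code&gt;Item&lt;/code&gt; model):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# main.py -- run with: uvicorn main:app --reload
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    name: str
    price: float

@app.post("/items/")
async def create_item(item: Item):
    # FastAPI validates the request body against Item and
    # documents this endpoint automatically at /docs (OpenAPI)
    return {"name": item.name, "price": item.price}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;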

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0psma4oh1a9s72nnh4wc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0psma4oh1a9s72nnh4wc.png" alt="Image description" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Advantages of FastAPI
&lt;/h3&gt;

&lt;p&gt;In 2021, FastAPI first earned its own category in our State of the Developer Ecosystem survey, with 14% of respondents using this microframework.&lt;br&gt;
Since then, its usage has increased to 20%, while Flask and Django have seen slight declines in usage.&lt;br&gt;
Here are some reasons why developers choose FastAPI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Performance: FastAPI is designed for speed, supporting asynchronous processing and bidirectional WebSockets (provided by Starlette). It outperforms Django and Flask in benchmarks, making it ideal for high-traffic applications.&lt;/li&gt;
&lt;li&gt;Scalability: Like Flask, FastAPI is highly modular, making it easy to scale and perfectly suited for containerized deployments.&lt;/li&gt;
&lt;li&gt;Adherence to Industry Standards: FastAPI is fully compatible with OAuth 2.0, OpenAPI (formerly Swagger), and JSON Schema. This allows for easy implementation of secure authentication and generation of API documentation.&lt;/li&gt;
&lt;li&gt;Ease of Use: FastAPI uses Pydantic for type hints and validation, accelerating development by providing type checking, autocompletion, and request validation.&lt;/li&gt;
&lt;li&gt;Documentation: FastAPI comes with extensive documentation, and third-party resources are continuously growing, making it accessible to developers of all levels.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Disadvantages of FastAPI
&lt;/h3&gt;

&lt;p&gt;Before deciding to use FastAPI for your project, consider the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maturity: FastAPI is newer and lacks the maturity of Django or Flask. Its community is smaller, and the user experience might be less smooth due to less widespread use.&lt;/li&gt;
&lt;li&gt;Compatibility: As a microframework, FastAPI requires additional functionality to become a fully functional application. There are fewer compatible libraries compared to Django or Flask, which might require you to develop your own extensions.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Choosing Between Flask, Django, and FastAPI
&lt;/h2&gt;

&lt;p&gt;So, which Python web framework is best? As with many programming endeavors, the answer is "it depends."&lt;br&gt;
The right choice depends on the answers to several questions: What type of application are you building? What are your priorities? How do you expect the project to evolve in the future?&lt;/p&gt;

&lt;p&gt;All three popular &lt;a href="https://www.servbay.com/features/python" rel="noopener noreferrer"&gt;Python web frameworks&lt;/a&gt; have unique strengths, so evaluating them based on your application will help you make the best decision.&lt;/p&gt;

&lt;p&gt;If you need standard web application features out of the box, Django is a good choice, suitable for projects requiring a more robust structure. Its advantages are particularly evident when using relational databases, as its ORM simplifies data management and provides built-in security features. However, for smaller projects or simple applications, such extensive features might be overkill.&lt;br&gt;
Flask offers greater flexibility. Its minimalist design allows developers to pick and choose their desired extensions and libraries, making it suitable for projects needing custom functionality. This approach is ideal for startups or MVPs where requirements might change and evolve quickly. While Flask is easy to get started with, remember that building more complex applications will involve exploring many extensions.&lt;/p&gt;

&lt;p&gt;If speed is the top priority, FastAPI is a strong contender, especially for API-first or machine learning projects. It leverages modern Python features like type hints to provide automatic data validation and documentation. For applications requiring high performance, such as microservices or data-driven APIs, FastAPI is an excellent choice. Nevertheless, it might not be as rich in built-in features as Django or Flask, and you might need to manually implement additional functionalities.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comprehensive Comparison of the Three Python Web Frameworks
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tf5kfewocycy5pacht9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1tf5kfewocycy5pacht9.png" alt="Image description" width="800" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kickstart Your Web Development Project with ServBay
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxw6m1ofrtd19c1dkychw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxw6m1ofrtd19c1dkychw.png" alt="Image description" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No matter which framework you choose, you'll need to set up a &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;Python environment&lt;/a&gt; before you can start web development, and ServBay is a particularly convenient tool for this. It supports Python 2.7 and 3.5 through 3.14, making it suitable for both new and legacy projects.&lt;br&gt;
Do you have a specific project in mind that would help you decide which framework to use?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Build Your Own AI Chatbot: A Complete Guide to Local Deployment with ServBay, Python, and ChromaDB</title>
      <dc:creator>Dheeraj Malhotra</dc:creator>
      <pubDate>Fri, 14 Mar 2025 14:53:28 +0000</pubDate>
      <link>https://dev.to/dheeraj-lee27/build-your-own-ai-chatbot-a-complete-guide-to-local-deployment-with-servbay-python-and-chromadb-84m</link>
      <guid>https://dev.to/dheeraj-lee27/build-your-own-ai-chatbot-a-complete-guide-to-local-deployment-with-servbay-python-and-chromadb-84m</guid>
      <description>&lt;p&gt;In an era where data privacy is paramount, setting up your own &lt;a href="https://www.servbay.com" rel="noopener noreferrer"&gt;local language model&lt;/a&gt; (LLM) provides a crucial solution for companies and individuals alike. This tutorial is designed to guide you through the process of creating a custom chatbot using &lt;a href="https://www.servbay.com" rel="noopener noreferrer"&gt;ServBay&lt;/a&gt;, &lt;a href="https://www.servbay.com" rel="noopener noreferrer"&gt;Python 3&lt;/a&gt;, and &lt;a href="https://www.trychroma.com/" rel="noopener noreferrer"&gt;ChromaDB&lt;/a&gt;, all hosted locally on your system. Exactly, you don't need to download any software except for &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;Servbay&lt;/a&gt;. Here are the key reasons why you need this tutorial:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Complete Customization: Full control over configuration allows you to tailor the model to your specific needs without relying on third-party services.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Improved Privacy: Deploying your language model (LLM) locally protects sensitive information from online transmission risks, crucial for organizations handling private data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data Security Assurance: Minimizes security threats by keeping training materials, such as PDF files, secure within your environment, reducing exposure to external risks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Control Over Data Management: Freedom to handle and process data as desired, including embedding proprietary information into a ChromaDB vector store, ensuring alignment with your standards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Internet Independence: Ensures consistent access to your chatbot without needing an internet connection, maintaining service even when offline.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This tutorial aims to guide you in building a robust and secure local chatbot that prioritizes your privacy and control.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmznyu9jledeys570ysl3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmznyu9jledeys570ysl3.png" alt="Image description" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Retrieval-Augmented Generation (RAG)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/what-is/retrieval-augmented-generation/" rel="noopener noreferrer"&gt;Retrieval-Augmented Generation&lt;/a&gt; (RAG) is an advanced technique that combines the strengths of information retrieval and text generation to create more accurate and contextually relevant responses. Here's a breakdown of how RAG works and why it's beneficial:&lt;/p&gt;

&lt;h4&gt;
  
  
  What is RAG?
&lt;/h4&gt;

&lt;p&gt;RAG is a hybrid model that enhances the capabilities of language models by incorporating an external knowledge base or document store. The process involves two main components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Retrieval: In this phase, the model retrieves relevant documents or pieces of information from an external source, such as a database or a vector store, based on the input query.&lt;/li&gt;
&lt;li&gt;Generation: The retrieved information is then used by a generative language model to produce a coherent and contextually appropriate response.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  How Does RAG Work?
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Query Input: The user inputs a query or question.&lt;/li&gt;
&lt;li&gt;Document Retrieval: The system uses the query to search an external knowledge base, retrieving the most relevant documents or snippets of information.&lt;/li&gt;
&lt;li&gt;Response Generation: The generative model processes the retrieved information, integrating it with its own knowledge to generate a detailed and accurate response.&lt;/li&gt;
&lt;li&gt;Output: The final response, enriched with specific and relevant details from the knowledge base, is presented to the user.&lt;/li&gt;
&lt;/ul&gt;
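&lt;p&gt;In code, the whole loop is short. The sketch below is schematic: &lt;code&gt;retriever&lt;/code&gt; and &lt;code&gt;llm&lt;/code&gt; are placeholders for whichever retriever and language model you wire up (for example, the ChromaDB retriever and Ollama model used later in this tutorial):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Schematic retrieve-then-generate loop (retriever and llm are placeholders)
def rag_answer(question, retriever, llm):
    # 1. Retrieval: fetch the most relevant chunks for the query
    docs = retriever.get_relevant_documents(question)
    context = "\n\n".join(doc.page_content for doc in docs)
    # 2. Generation: answer grounded ONLY in the retrieved context
    prompt = f"Answer based ONLY on this context:\n{context}\n\nQuestion: {question}"
    return llm.invoke(prompt)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;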

&lt;h4&gt;
  
  
  Benefits of RAG
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Enhanced Accuracy: By leveraging external data, RAG models can provide more precise and detailed answers, especially for domain-specific queries.&lt;/li&gt;
&lt;li&gt;Contextual Relevance: The retrieval component ensures that the generated response is grounded in relevant and up-to-date information, improving the overall quality of the response.&lt;/li&gt;
&lt;li&gt;Scalability: RAG systems can be easily scaled to incorporate vast amounts of data, enabling them to handle a wide range of queries and topics.&lt;/li&gt;
&lt;li&gt;Flexibility: These models can be adapted to various domains by simply updating or expanding the external knowledge base, making them highly versatile.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Why Use RAG Locally?
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Privacy and Security: Running a RAG model locally ensures that sensitive data remains secure and private, as it does not need to be sent to external servers.&lt;/li&gt;
&lt;li&gt;Customization: You can tailor the retrieval and generation processes to suit your specific needs, including integrating proprietary data sources.&lt;/li&gt;
&lt;li&gt;Independence: A local setup ensures that your system remains operational even without internet connectivity, providing consistent and reliable service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By setting up a local RAG application with tools like Ollama, Python, and ChromaDB, you can enjoy the benefits of advanced language models while maintaining control over your data and customization options.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsx2ehfhlj8y67rpcq5dg.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsx2ehfhlj8y67rpcq5dg.PNG" alt="Image description" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ServBay
&lt;/h2&gt;

&lt;p&gt;ServBay is an integrated, graphical &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;local web development environment&lt;/a&gt; with one-click installation, designed specifically for web developers, Python developers, AI developers, and PHP developers. The software is particularly well-suited for macOS. It includes a range of commonly used web development services and tools, covering web servers, databases, programming languages, mail servers, queue services, and more. ServBay aims to provide developers with a convenient, efficient, and unified development environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Features of ServBay&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Support for Multiple Python Versions: Run multiple Python versions simultaneously to meet the needs of different projects.&lt;/li&gt;
&lt;li&gt;Custom Domain Names and SSL Support: Easily configure local domain names and SSL certificates to simulate real production environments.&lt;/li&gt;
&lt;li&gt;Quick Operations: Supports startup on boot, quick access via the menu bar, and command-line management to enhance development efficiency.&lt;/li&gt;
&lt;li&gt;Unified Service Management: Integrates Python, PHP, Node.js, and Ollama, making it easy to manage multiple development services.&lt;/li&gt;
&lt;li&gt;Clean System Environment: Avoids system pollution by running all services in isolated environments.&lt;/li&gt;
&lt;li&gt;Tunneling and Sharing: Supports tunneling local websites to the public internet, making it easier to share development results with team members.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  ServBay Installation Guide
&lt;/h3&gt;

&lt;p&gt;Requirements: macOS 12.0 Monterey or later&lt;br&gt;
&lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;Download the Latest Version of ServBay&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Double-click the downloaded .dmg file to open it.&lt;/li&gt;
&lt;li&gt;In the opened window, drag the ServBay.app icon into the Applications folder.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5zksbj87oq7oaqi3e4a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5zksbj87oq7oaqi3e4a.png" alt="Image description" width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When using ServBay for the first time, initialization is required. Generally, you can select the default installation, or optionally select Ollama for AI programming support.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9afhsrgh6zryuwgzrwgf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9afhsrgh6zryuwgzrwgf.png" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After installation is complete, open ServBay.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx9hnymcc7oy8bsaz5hpx.jpeg" alt="Image description" width="800" height="468"&gt;
&lt;/li&gt;
&lt;li&gt;Enter your password. After the installation is complete, you can find ServBay in the Applications directory.&lt;/li&gt;
&lt;li&gt;Access the main interface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ykufp41undmdb4p31of.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ykufp41undmdb4p31of.png" alt="Image description" width="800" height="468"&gt;&lt;/a&gt;&lt;br&gt;
In addition to Python, ServBay also provides robust support for PHP and &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt;, covering a wide range of versions from PHP 5.6 to PHP 8.5 and Node.js 12 to Node.js 23.&lt;br&gt;
One of ServBay's key features is the ability to quickly switch between different software versions. This flexibility is essential for developers who need to test and deploy applications in various environments.&lt;/p&gt;
&lt;h4&gt;
  
  
  One-click installation of all Python versions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fein61efxd2t2w7p8qrxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fein61efxd2t2w7p8qrxf.png" alt="Image description" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  One-click installation of all Ollama models
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Facpzg5lttzg7owgw4165.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Facpzg5lttzg7owgw4165.png" alt="Image description" width="800" height="465"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqgvnevblmsv19k61rzg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feqgvnevblmsv19k61rzg.png" alt="Image description" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;Before diving into the setup, ensure you have the following prerequisites in place:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3: Python is a versatile programming language that you'll use to write the code for your RAG app.&lt;/li&gt;
&lt;li&gt;ChromaDB: A vector database that will store and manage the embeddings of your data.&lt;/li&gt;
&lt;li&gt;ServBay: To download and serve custom LLMs on your local machine.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;
  
  
  Step 1: Install Python 3 and setup your environment
&lt;/h4&gt;

&lt;p&gt;To install and set up your Python 3 environment, follow these steps:&lt;br&gt;
Click the &lt;strong&gt;Python&lt;/strong&gt; button in ServBay, then select a Python version.&lt;br&gt;
Then make sure Python 3 is installed and runs successfully:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ python3 --version# Python 3.12.9
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a folder for your project. For example, local-rag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ mkdir local-rag
$ cd local-rag
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a virtual environment named venv:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ python3 -m venv venv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Activate the virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ source venv/bin/activate
# On Windows: venv\Scripts\activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 2: Install ChromaDB and other dependencies
&lt;/h4&gt;

&lt;p&gt;Install ChromaDB using pip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pip install --q chromadb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install LangChain tools to work seamlessly with your model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pip install --q unstructured langchain langchain-text-splitters
$ pip install --q "unstructured[all-docs]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install Flask to serve your app as an HTTP service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pip install --q flask
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 3: Install Ollama
&lt;/h4&gt;

&lt;p&gt;To install Ollama, follow these steps:&lt;br&gt;
Click the &lt;strong&gt;AI&lt;/strong&gt; button in ServBay, then select a model you like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv58ot47fbodok9gv0u71.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv58ot47fbodok9gv0u71.PNG" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Build the RAG app
&lt;/h3&gt;

&lt;p&gt;Now that you've set up your environment with Python, Ollama, ChromaDB, and the other dependencies, it's time to build your custom local RAG app. In this section, we'll walk through the hands-on Python code and provide an overview of how to structure your application.&lt;br&gt;
&lt;code&gt;app.py&lt;/code&gt;&lt;br&gt;
This is the main Flask application file. It defines routes for embedding files into the vector database and for retrieving responses from the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;jsonify&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;embed&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt;  

&lt;span class="c1"&gt;# 设置临时文件夹  
&lt;/span&gt;&lt;span class="n"&gt;TEMP_FOLDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TEMP_FOLDER&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;./_temp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TEMP_FOLDER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/embed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_embed&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No file part&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;  

    &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;file&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No selected file&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;  

    &lt;span class="n"&gt;embedded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;embedded&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File embedded successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File embedded unsuccessfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;  

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_query&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;query&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;jsonify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Something went wrong&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;  

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
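&lt;p&gt;Once the app is running, you can exercise both routes from the command line (assuming a PDF named &lt;code&gt;example.pdf&lt;/code&gt; in the current directory):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl -X POST -F "file=@example.pdf" http://localhost:8080/embed
$ curl -X POST -H "Content-Type: application/json" \
       -d '{"query": "What is this document about?"}' \
       http://localhost:8080/query
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;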



&lt;p&gt;&lt;code&gt;embed.py&lt;/code&gt;&lt;br&gt;
This module handles the embedding process, including saving uploaded files, loading and splitting data, and adding documents to the vector database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;werkzeug.utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;secure_filename&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.document_loaders&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;UnstructuredPDFLoader&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_text_splitters&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt;  

&lt;span class="n"&gt;TEMP_FOLDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TEMP_FOLDER&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;./_temp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="c1"&gt;# Function to check if the uploaded file is allowed (only PDF files)  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;allowed_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pdf&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  

&lt;span class="c1"&gt;# Function to save the uploaded file to the temporary folder  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="c1"&gt;# Save the uploaded file with a secure filename and return the file path  
&lt;/span&gt;    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
    &lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;secure_filename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  
    &lt;span class="n"&gt;file_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TEMP_FOLDER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;file_path&lt;/span&gt;  

&lt;span class="c1"&gt;# Function to load and split the data from the PDF file  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_and_split_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="c1"&gt;# Load the PDF file and split the data into chunks  
&lt;/span&gt;    &lt;span class="n"&gt;loader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;UnstructuredPDFLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
    &lt;span class="n"&gt;text_splitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecursiveCharacterTextSplitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text_splitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;  

&lt;span class="c1"&gt;# Main function to handle the embedding process  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="c1"&gt;# Check if the file is valid, save it, load and split the data, add to the database, and remove the temporary file  
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;allowed_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
        &lt;span class="n"&gt;file_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;save_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
        &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_and_split_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
        &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_vector_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;persist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;query.py&lt;/code&gt;&lt;br&gt;
This module processes user queries by generating multiple versions of the query, retrieving relevant documents, and providing answers based on the context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.chat_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatOllama&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.runnables&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain.retrievers.multi_query&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MultiQueryRetriever&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_vector_db&lt;/span&gt;  

&lt;span class="n"&gt;LLM_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;LLM_MODEL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deepseek-r1:1.5b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="c1"&gt;# Function to get the prompt templates for generating alternative questions and answering based on context  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_prompt&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  
    &lt;span class="n"&gt;QUERY_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;  
        &lt;span class="n"&gt;input_variables&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  
        &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are an AI language model assistant. Your task is to generate five  
        different versions of the given user question to retrieve relevant documents from  
        a vector database. By generating multiple perspectives on the user question, your  
        goal is to help the user overcome some of the limitations of the distance-based  
        similarity search. Provide these alternative questions separated by newlines.  
        Original question: {question}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;  
    &lt;span class="p"&gt;)&lt;/span&gt;  

    &lt;span class="n"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Answer the question based ONLY on the following context:  
    {context}  
    Question: {question}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;  

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;QUERY_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;  

&lt;span class="c1"&gt;# Main function to handle the query process  
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  
        &lt;span class="c1"&gt;# Initialize the language model with the specified model name  
&lt;/span&gt;        &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatOllama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;LLM_MODEL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

        &lt;span class="c1"&gt;# Get the vector database instance  
&lt;/span&gt;        &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_vector_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  

        &lt;span class="c1"&gt;# Get the prompt templates  
&lt;/span&gt;        &lt;span class="n"&gt;QUERY_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_prompt&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  

        &lt;span class="c1"&gt;# Set up the retriever to generate multiple queries using the language model and the query prompt  
&lt;/span&gt;        &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MultiQueryRetriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_llm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_retriever&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;QUERY_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

        &lt;span class="c1"&gt;# Define the processing chain to retrieve context, generate the answer, and parse the output  
&lt;/span&gt;        &lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  

        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chain&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;get_vector_db.py&lt;/code&gt;&lt;br&gt;
This module initializes and returns the vector database instance used for storing and retrieving document embeddings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.embeddings&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OllamaEmbeddings&lt;/span&gt;  
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_community.vectorstores.chroma&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Chroma&lt;/span&gt;  

&lt;span class="n"&gt;CHROMA_PATH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CHROMA_PATH&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chroma&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="n"&gt;COLLECTION_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;local-rag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="n"&gt;TEXT_EMBEDDING_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TEXT_EMBEDDING_MODEL&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nomic-embed-text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_vector_db&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  
    &lt;span class="c1"&gt;# Create an instance of the embedding model  
&lt;/span&gt;    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OllamaEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TEXT_EMBEDDING_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_progress&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  

    &lt;span class="c1"&gt;# Initialize the Chroma vector store with specified parameters  
&lt;/span&gt;    &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Chroma&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;  
        &lt;span class="n"&gt;collection_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;COLLECTION_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
        &lt;span class="n"&gt;persist_directory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;CHROMA_PATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
        &lt;span class="n"&gt;embedding_function&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;  
    &lt;span class="p"&gt;)&lt;/span&gt;  

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Run your app!
&lt;/h3&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file to store your environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TEMP_FOLDER = './_temp'
CHROMA_PATH = 'chroma'
COLLECTION_NAME = 'local-rag'
LLM_MODEL = 'mistral'
TEXT_EMBEDDING_MODEL = 'nomic-embed-text'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
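

&lt;p&gt;These variables are read with &lt;code&gt;os.getenv&lt;/code&gt; in the modules above, so they need to be loaded into the process environment before the server starts. Here is a minimal sketch of one way to do that, assuming the &lt;code&gt;python-dotenv&lt;/code&gt; package is installed (the exact loading mechanism in &lt;code&gt;app.py&lt;/code&gt; may differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os
from dotenv import load_dotenv  # pip install python-dotenv (assumed here)

# Read key=value pairs from .env into the process environment;
# python-dotenv tolerates spaces around '=' and quoted values.
load_dotenv()

TEMP_FOLDER = os.getenv('TEMP_FOLDER', './_temp')
os.makedirs(TEMP_FOLDER, exist_ok=True)  # make sure the upload folder exists
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;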



&lt;p&gt;Run &lt;code&gt;app.py&lt;/code&gt; to start the app server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the server is running, you can start making requests to the following endpoints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Example command to embed a PDF file:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash  &lt;/span&gt;

curl &lt;span class="nt"&gt;--request&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--url&lt;/span&gt; http://localhost:8080/embed &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: multipart/form-data'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--form&lt;/span&gt; &lt;span class="nv"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;@/Users/liyinan/Documents/works/matrix_multi.pdf

&lt;span class="c"&gt;# Response&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
&lt;span class="s2"&gt;"message"&lt;/span&gt;: &lt;span class="s2"&gt;"File embedded successfully"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Example &lt;span class="nb"&gt;command &lt;/span&gt;to ask a question to your model:


&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;--request&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; http://localhost:8080/query &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s1"&gt;'{ "query": "Who is Nasser?" }'&lt;/span&gt;

&lt;span class="c"&gt;# Response&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"message"&lt;/span&gt;: &lt;span class="s2"&gt;"Nasser Maronie is a Full Stack Developer with experience in web and mobile app development. He has worked as a Lead Full Stack Engineer at Ulventech, a Senior Full Stack Engineer at Speedoc, a Senior Frontend Engineer at Irvins, and a Software Engineer at Tokopedia. His tech stacks include Typescript, ReactJS, VueJS, React Native, NodeJS, PHP, Golang, Python, MySQL, PostgresQL, MongoDB, Redis, AWS, Firebase, and Supabase. He has a Bachelor's degree in Information System from Universitas Amikom Yogyakarta."&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
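

&lt;p&gt;If you prefer Python to curl, here is a minimal client sketch for the same two endpoints, assuming the server is listening on &lt;code&gt;localhost:8080&lt;/code&gt; as in the examples above and that the &lt;code&gt;requests&lt;/code&gt; package is installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import requests

BASE_URL = 'http://localhost:8080'  # same host/port as the curl examples

# Embed a PDF: multipart/form-data with the field name "file",
# matching the --form file=@... flag in the curl example.
with open('resume.pdf', 'rb') as f:  # any local PDF path works here
    resp = requests.post(f'{BASE_URL}/embed', files={'file': f})
print(resp.json())  # e.g. {"message": "File embedded successfully"}

# Query the embedded documents: a JSON body with a "query" key.
resp = requests.post(f'{BASE_URL}/query', json={'query': 'Who is Nasser?'})
print(resp.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;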






&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;By following these instructions, you can effectively run and interact with your custom local RAG app using Python, Ollama, and ChromaDB, tailored to your needs. Adjust and expand the functionality as necessary to enhance the capabilities of your application.&lt;br&gt;
By harnessing the capabilities of local deployment, you not only safeguard sensitive information but also optimize performance and responsiveness. Whether you're enhancing customer interactions or streamlining internal processes, a locally deployed RAG application offers flexibility and robustness to adapt and grow with your requirements.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>llm</category>
      <category>programming</category>
    </item>
    <item>
      <title>Unlocking Local AI Magic: DeepSeek and CodeGPT Empower Developers to Evolve Efficiently</title>
      <dc:creator>Dheeraj Malhotra</dc:creator>
      <pubDate>Thu, 06 Mar 2025 12:24:23 +0000</pubDate>
      <link>https://dev.to/dheeraj-lee27/unlocking-local-ai-magic-deepseek-and-codegpt-empower-developers-to-evolve-efficiently-2bmn</link>
      <guid>https://dev.to/dheeraj-lee27/unlocking-local-ai-magic-deepseek-and-codegpt-empower-developers-to-evolve-efficiently-2bmn</guid>
      <description>&lt;p&gt;With the rapid development of artificial intelligence technology, more and more developers are looking to integrate AI into their workflows. However, many cloud-based AI services may pose issues such as privacy and data security risks, high usage costs, reliance on internet connectivity, and limited customization options.Therefore, installing and running DeepSeek locally allows developers to harness the powerful capabilities of AI to enhance development efficiency while safeguarding privacy.&lt;/p&gt;

&lt;p&gt;CodeGPT is an AI tool based on GPT technology, designed specifically for software developers. It can assist with tasks such as code generation, optimization, debugging, and documentation creation, and it provides precise, context-aware suggestions. By integrating CodeGPT with DeepSeek, you can achieve efficient AI-assisted development in a local environment without depending on external cloud services.&lt;/p&gt;

&lt;p&gt;Below is a step-by-step guide to help you install and run DeepSeek locally and configure CodeGPT to enhance your development workflow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkj3bb03jq0f2sbze85gg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkj3bb03jq0f2sbze85gg.png" alt="Image description" width="800" height="788"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Install Ollama and CodeGPT in VSCode
&lt;/h2&gt;

&lt;p&gt;To run DeepSeek locally, we first need to install Ollama, which allows us to run large language models (LLMs) on our machine, and CodeGPT, a VSCode extension that integrates these models to provide coding assistance. Using Ollama directly can be inconvenient: models must be downloaded from the command line, and download speeds can be unstable. Therefore, I decided to use a new integration tool called &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;ServBay&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Introduction to ServBay
&lt;/h3&gt;

&lt;p&gt;Ollama is a lightweight platform that makes running local LLMs simple. ServBay, on the other hand, is a more user-friendly integration tool for Ollama: it provides an intuitive graphical interface and one-click installation. ServBay is a comprehensive, graphical local &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;web development environment&lt;/strong&gt;&lt;/a&gt; designed for web developers, Python developers, AI developers, and PHP developers, particularly on macOS. It incorporates a suite of commonly used web development tools and software, including web servers, databases, development languages, mail servers, queue services, and more, aiming to provide developers with a convenient, efficient, and unified development environment. Below is the list of &lt;a href="https://www.servbay.com/packages" rel="noopener noreferrer"&gt;&lt;strong&gt;tools and software packages&lt;/strong&gt;&lt;/a&gt; currently supported by ServBay.&lt;/p&gt;

&lt;h3&gt;
  
  
  Download ServBay
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Visit the official website: &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;ServBay&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0kwbq34dxdjcv8vcq02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0kwbq34dxdjcv8vcq02.png" alt="Image description" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download the installer for macOS. Currently, only macOS is supported.
&lt;/li&gt;
&lt;li&gt;During installation, you can automatically select the option to install Ollama.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4lez2o85dzxv18py33gq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4lez2o85dzxv18py33gq.png" alt="Image description" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Once you enter the interface, you can directly install DeepSeek.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgsekwfovd7i2xkaa175c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgsekwfovd7i2xkaa175c.png" alt="Image description" width="800" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Install CodeGPT in Visual Studio Code
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Open VSCode&lt;/strong&gt; and navigate to &lt;strong&gt;the Extensions Marketplace&lt;/strong&gt; (press Ctrl + Shift + X on Windows/Linux, or Cmd + Shift + X on macOS).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search for “CodeGPT”&lt;/strong&gt; and &lt;strong&gt;click Install&lt;/strong&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgm47rd3eyhbaz2bv5pj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgm47rd3eyhbaz2bv5pj.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Alternatively, you can create a free account at: &lt;a href="https://codegpt.co" rel="noopener noreferrer"&gt;https://codegpt.co&lt;/a&gt;. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After installing Ollama and CodeGPT, you're now ready to download and configure DeepSeek to start coding with AI locally. 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Preparing the Models
&lt;/h2&gt;

&lt;p&gt;Now that you have successfully installed ServBay and CodeGPT, it's time to download the models you'll use locally.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chat Model: deepseek-r1:1.5b&lt;/strong&gt;, optimized for smaller environments and capable of running smoothly on most computers.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autocompletion Model: deepseek-coder:1.3b&lt;/strong&gt;, which utilizes &lt;strong&gt;Fill-In-The-Middle (FIM)&lt;/strong&gt; technology. This allows it to make intelligent autocompletion suggestions as you write code, predicting the intermediate portions of a function or method, not just the beginning or end (see the sketch after this list).
&lt;/li&gt;
&lt;/ul&gt;
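
&lt;p&gt;To make the FIM idea concrete, here is a purely illustrative Python sketch (not CodeGPT's or Ollama's actual API): the model is shown the code before and after the cursor and generates only the missing middle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative only: what a Fill-In-The-Middle (FIM) completion does conceptually.
# The editor supplies the code before (prefix) and after (suffix) the cursor;
# a FIM-capable model generates just the gap between them.

prefix = "def average(numbers):\n    total = sum(numbers)\n"
suffix = "    return result\n"

# A FIM model prompted with (prefix, suffix) might propose this middle line:
completion = "    result = total / len(numbers)\n"

print(prefix + completion + suffix)  # the assembled, complete function
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;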

&lt;h3&gt;
  
  
  Download the Chat Model (deepseek-r1:1.5b)
&lt;/h3&gt;

&lt;p&gt;Follow the steps in the interface and simply click to download.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0hkaktabjpyi5xl15iz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0hkaktabjpyi5xl15iz.png" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To Start Using the Chat Model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;CodeGPT&lt;/strong&gt; in &lt;strong&gt;VSCode&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Navigate to the &lt;strong&gt;Local LLMs&lt;/strong&gt; section in the sidebar.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnwqv9xxd4smhv2w8fb7d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnwqv9xxd4smhv2w8fb7d.png" alt="Image description" width="712" height="1096"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;From the available options, select Ollama as the local LLM provider.
&lt;/li&gt;
&lt;li&gt;Select the model &lt;strong&gt;deepseek-r1:1.5b&lt;/strong&gt;.
Now, you can effortlessly query the model about your code. Simply highlight any code in the editor, add extra files to the query using the # symbol, and leverage powerful command shortcuts, such as:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rfut2nlu2l0q5k6k8zv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rfut2nlu2l0q5k6k8zv.png" alt="Image description" width="800" height="642"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;/fix&lt;/strong&gt; — Used to fix errors in the code or suggest improvements.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/refactor&lt;/strong&gt; — Used to clean up and improve the structure of the code.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/explain&lt;/strong&gt; — Get a detailed explanation of any code snippet.
This chat model is well suited to assisting with specific problems or getting suggestions about your code.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Download the Autocompletion Model (deepseek-coder:latest)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp3vqv2sc0kp5of75m20.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcp3vqv2sc0kp5of75m20.png" alt="Image description" width="800" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From the list of available models, select &lt;strong&gt;deepseek-coder:latest&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Once selected, you can start coding. As you type, the model will begin providing real-time code suggestions, helping you effortlessly complete functions, methods, or even entire code blocks. &lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Enjoy Seamless, Local, and Private AI-Driven Coding
&lt;/h2&gt;

&lt;p&gt;After setting up the models, you can now fully enjoy the benefits of these powerful tools without relying on external APIs. By running everything locally on your computer, you ensure complete privacy and control over your coding environment.  Rest assured, with no data leaving your computer, everything remains secure and private.👏&lt;/p&gt;
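
&lt;p&gt;As a quick sanity check that generation really happens locally, you can call the Ollama server's REST API directly. A minimal sketch, assuming Ollama's default endpoint &lt;code&gt;http://localhost:11434&lt;/code&gt;, the &lt;code&gt;deepseek-r1:1.5b&lt;/code&gt; model downloaded earlier, and the &lt;code&gt;requests&lt;/code&gt; package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import requests

# Ollama's /api/generate endpoint; 'stream': False returns one JSON object.
resp = requests.post(
    'http://localhost:11434/api/generate',  # default Ollama port; adjust if yours differs
    json={
        'model': 'deepseek-r1:1.5b',
        'prompt': 'Write a one-line docstring for a function that reverses a string.',
        'stream': False,
    },
)
print(resp.json()['response'])  # generated entirely on your machine
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;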

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzsy1vpi7njpovjg3m3l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzsy1vpi7njpovjg3m3l.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz3gyctj69hwl3xsrvuv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmz3gyctj69hwl3xsrvuv.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you enjoyed this article 👏 👏 👏, please give it a clap.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>developers</category>
      <category>deepseek</category>
    </item>
    <item>
      <title>Make AI Models Your Perfect Roommate!</title>
      <dc:creator>Dheeraj Malhotra</dc:creator>
      <pubDate>Tue, 04 Mar 2025 06:55:52 +0000</pubDate>
      <link>https://dev.to/dheeraj-lee27/make-ai-models-your-perfect-roommate-1lj4</link>
      <guid>https://dev.to/dheeraj-lee27/make-ai-models-your-perfect-roommate-1lj4</guid>
      <description></description>
      <category>ai</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Make AI Models Your Perfect Roommate! (ServBay+Ollama+ChatBox)</title>
      <dc:creator>Dheeraj Malhotra</dc:creator>
      <pubDate>Sat, 01 Mar 2025 19:59:23 +0000</pubDate>
      <link>https://dev.to/dheeraj-lee27/make-ai-models-your-perfect-roommate-servbayollamachatbox-2ond</link>
      <guid>https://dev.to/dheeraj-lee27/make-ai-models-your-perfect-roommate-servbayollamachatbox-2ond</guid>
      <description>&lt;h2&gt;
  
  
  Before we begin:
&lt;/h2&gt;

&lt;p&gt;Recently, DeepSeek has become so popular that everyone wants to give it a try; its arrival has sparked a nationwide AI craze.&lt;br&gt;
Initially, I thought deploying DeepSeek locally would be very challenging, but after testing, it turned out to be quite simple. The process is as easy as installing new software on your computer, and it takes only about ten minutes to complete the local deployment.&lt;br&gt;
Today, let's talk about how to deploy the DeepSeek-R1-1.5B model locally on your own computer.&lt;br&gt;
Installation Steps Overview:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Install &lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;ServBay&lt;/strong&gt;&lt;/a&gt;: Why install it? Because it allows you to one-click install Ollama and one-click install DeepSeek.&lt;/li&gt;
&lt;li&gt;Install a GUI that supports Ollama: Is the command-line interface not user-friendly? Don't worry, we've got the perfect GUI prepared for you.&lt;/li&gt;
&lt;li&gt;Start seamless conversations with DeepSeek!&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Ⅰ. Why Deploy DeepSeek Locally?
&lt;/h2&gt;

&lt;p&gt;Some of you may have had the same experience as me: frequently encountering the message "Server busy, please try again later" when using DeepSeek. With local deployment, however, this issue simply doesn’t exist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqh1b58ut022z8z4img6j.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqh1b58ut022z8z4img6j.PNG" alt="Image description" width="800" height="226"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I recommend everyone deploy the &lt;strong&gt;DeepSeek-R1-1.5B&lt;/strong&gt; model. Why this version? Here’s why:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Compact and Efficient: DeepSeek-R1-1.5B is a lightweight model with only 1.5 billion parameters. Sounds “tiny,” right? But don’t underestimate it: this is a “small but mighty” model! It only requires 3GB of VRAM to run, meaning even computers with modest configurations can handle it with ease. Moreover, it performs exceptionally well in mathematical reasoning, even surpassing GPT-4o and Claude 3.5 in some benchmarks. Of course, if your computer has a higher configuration, you can opt for other versions.&lt;/li&gt;
&lt;li&gt;Higher Flexibility: Locally deployed large models offer more flexibility, as they are usually not limited by external platforms. Users gain complete control over the model's behavior and content.&lt;/li&gt;
&lt;li&gt;No Content Censorship: Large models accessed via API calls may face restrictions from the service provider’s content policies (e.g., OpenAI’s ChatGPT limits responses on sensitive topics). Locally deployed models can be adjusted to the user's needs, allowing discussion of a broader range of topics.&lt;/li&gt;
&lt;li&gt;Privacy Protection: All data is processed locally without the need to upload anything to the cloud, making it suitable for scenarios with high data-privacy requirements.&lt;/li&gt;
&lt;li&gt;Full Control: Users have complete control over the model’s runtime environment, data inputs and outputs, and updates and optimizations of the model.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In summary, local deployment eliminates response failures and significantly enhances user satisfaction. Users with other requirements can refer to the content below for alternative options.&lt;/p&gt;

&lt;h2&gt;
  
  
  II. Hardware Requirements for Different Versions of DeepSeek
&lt;/h2&gt;

&lt;p&gt;Below are the hardware requirements for different versions of the DeepSeek model. You can choose the version that best matches your computer's configuration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzktl90xe4gulnw5wa3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzzktl90xe4gulnw5wa3t.png" alt="Image description" width="800" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Ⅲ. Deployment Steps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Download ServBay
&lt;/h3&gt;

&lt;p&gt;Requires macOS 12.0 Monterey or later. Currently, they do not support a Windows version, but according to the official statement, it will be available soon. &lt;br&gt;
&lt;a href="https://www.servbay.com/" rel="noopener noreferrer"&gt;Download the latest version of ServBay&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsp8cr1lte3mado0aikvu.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsp8cr1lte3mado0aikvu.PNG" alt="Image description" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The installation file is only 20MB. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Installation Steps&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Double-click the downloaded .dmg file.
&lt;/li&gt;
&lt;li&gt;In the opened window, drag the ServBay.app icon into the Applications folder.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0s6a5mo9raygzpouua0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0s6a5mo9raygzpouua0.png" alt="Image description" width="800" height="566"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When using ServBay for the first time, initialization is required.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  One-click selection of Ollama
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzeykylxdg86snjxzvjp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzeykylxdg86snjxzvjp.png" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the installation is complete, open ServBay to begin using it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqobylvkv274v5r1vva9.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqobylvkv274v5r1vva9.jpeg" alt="Image description" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enter the password. Once the installation is complete, you can find ServBay in the Applications directory.&lt;/li&gt;
&lt;li&gt;Access the main interface.
Ollama normally requires a multi-step process to install and start its services, but ServBay starts it with a single click, installs the AI models you need, and spares you from configuring environment variables, so even users with no development background can use it. It offers one-click start and stop, multi-threaded rapid model downloads, and, if your Mac's hardware allows, several large AI models running simultaneously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Remember Ollama's local path and port number&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlhs6my83fop5asofzhr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqlhs6my83fop5asofzhr.png" alt="Image description" width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  One-click download for DeepSeek
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjgu7tes88xx9zzoooy8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjgu7tes88xx9zzoooy8.png" alt="Image description" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On my computer, the download speed for the DeepSeek 8B model even exceeded 60 MB per second, far surpassing other similar tools.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frky3t68nlyeb47yesw6n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frky3t68nlyeb47yesw6n.png" alt="Image description" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With ServBay and Ollama, I was able to deploy DeepSeek locally. Look, it's running smoothly!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwusvqadvhyqdsg9l77cc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwusvqadvhyqdsg9l77cc.png" alt="Image description" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This way, we've achieved local deployment of large models using ServBay + Ollama. However, it's currently limited to the command line, and there's no GUI yet!&lt;/p&gt;

&lt;h3&gt;
  
  
  2. GUI Recommendation - Chatbox (Personally Tested, Best Option)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Chatbox is easy to use (Free, powerful, supports file uploads, Recommendation rating: 🌟🌟🌟🌟🌟).
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://chatboxai.app/" rel="noopener noreferrer"&gt;Chatbox Download&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwnp1apgg6q7ei0acjt58.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwnp1apgg6q7ei0acjt58.png" alt="Image description" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After downloading, access the main interface:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vcpqq75ptg3s63jrmt3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vcpqq75ptg3s63jrmt3.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Click Settings and follow the steps shown to point Chatbox at your local Ollama instance (the address and port you noted earlier).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcbn65jah3khs7yevkzjj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcbn65jah3khs7yevkzjj.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Save changes, and you can start conversations. You'll see that the modifications have been successfully applied.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwulo053f6bb2n7wj5q6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwulo053f6bb2n7wj5q6x.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2sjt95h9h8pk53ziluoa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2sjt95h9h8pk53ziluoa.png" alt="Image description" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this, we've completed the full deployment of the DeepSeek-R1-1.5B model based on ServBay.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary:
&lt;/h2&gt;

&lt;p&gt;Friends, isn't it incredibly easy to locally deploy the DeepSeek-R1-1.5B model using ServBay? By following the steps above, it takes only 10 minutes to turn your computer into an "intelligent assistant."&lt;br&gt;&lt;br&gt;
Moreover, this model not only runs efficiently but can also make a big impact in various scenarios. Go ahead and give it a try!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>deepseek</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
