Ajeet Singh Raina

Originally published at collabnix.com

Getting Started with GenAI Stack powered with Docker, LangChain, Neo4j and Ollama

At DockerCon 2023, Docker announced a new GenAI Stack - a great way to quickly get started building GenAI-backed applications with only a few clicks.

The GenAI Stack came about through a collaboration between Docker, Neo4j, LangChain, and Ollama. The goal of the collaboration was to create a pre-built GenAI stack of best-in-class technologies that are well integrated, come with sample applications, and make it easy for developers to get up and running.


What is GenAI Stack?

The GenAI Stack is a one-stop shop for getting started with GenAI app development. It is basically a set of Docker containers that are orchestrated by Docker Compose. It provides all the tools and resources you need to build and run GenAI apps, without having to worry about setting up and configuring everything yourself. It makes it easy to build and run AI apps that can generate text, code, and other creative content.

What is GenAI Stack composed of?

The stack is a set of Docker containers that make it easy to experiment with building and running Generative AI (GenAI) apps. The containers provide a dev environment of a pre-built support-agent app with data-import and response-generation use cases. It includes:

  1. Ollama - a tool for running and managing LLMs locally
  2. Neo4j - a graph database for grounding LLMs
  3. GenAI apps built with LangChain

Why Ollama, Neo4j and LangChain?

LangChain and Ollama were involved in the collaboration because of their expertise in LLMs. LangChain is a programming and orchestration framework for LLMs, and Ollama is a tool for running and managing LLMs locally.

Neo4j was involved in the collaboration because of its expertise in graph databases and knowledge graphs. Neo4j recognized that the combination of graphs and LLMs is powerful and that it could be used to build GenAI applications that are more accurate and reliable.

What is Ollama all about?


Ollama is a lightweight, extensible framework for building and running large language models (LLMs) on a local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
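To make this concrete, here is a minimal sketch of calling that API from Python with the requests library; it assumes Ollama is running locally on its default port (11434) and that the llama2 model has already been pulled.

import json
import requests

# Ollama's local REST API listens on http://localhost:11434 by default.
# This assumes you have already run `ollama pull llama2`.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Why is the sky blue?"},
    stream=True,
)

# The endpoint streams one JSON object per line until "done" is true.
for line in response.iter_lines():
    if line:
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)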

Benefits of Ollama

If you are interested in building and running LLMs on your local machine, I encourage you to check out Ollama. Here are some of the benefits of using Ollama:

  • It is easy to use and install.
  • It supports a wide range of LLMs.
  • It is extensible and customizable.
  • It is actively maintained and updated.

Supported File Formats by Ollama

Ollama supports importing the GGUF and GGML file formats, which means that you can use it to run a wide range of open LLMs, including:

  • Llama 2
  • Code Llama
  • Mistral
  • Vicuna

Ollama also supports customizing and creating your own models. This makes it a powerful tool for researchers and developers who are working on new advances in LLM technology.
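As a quick sanity check of which models you have pulled locally, you can query Ollama's /api/tags endpoint; this short sketch again assumes Ollama is running on its default port.

import requests

# /api/tags lists the models currently available on this machine.
tags = requests.get("http://localhost:11434/api/tags").json()
for model in tags.get("models", []):
    print(model["name"])  # e.g. "llama2:latest"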

Ollama is now available as an official Docker image

With 4,300 Docker Hub pulls to date, Ollama is now available as a Docker Sponsored Open Source image, making it simpler to get up and running with large language models using Docker containers.

Ollama is a great tool for getting started with GenAI app development.

On macOS, Ollama runs natively and can take advantage of Apple Silicon GPU acceleration. Docker Desktop does not currently expose the Mac's GPU to containers, which is why the stack expects a native Ollama installation on macOS and reaches it from the containers via host.docker.internal.

What is Neo4j?


Neo4j is a native graph database that is used in the GenAI Stack to provide grounding for large language models (LLMs). Grounding is the process of anchoring LLMs to real-world knowledge and context. This is important because it helps LLMs to generate more accurate and relevant responses.

Neo4j is a good choice for grounding LLMs because it is fast and scalable. It can also store and query complex graph data, which is ideal for representing the relationships between different entities in the real world.

In the GenAI Stack, Neo4j is used to store a knowledge graph that contains information about a variety of topics, such as people, places, and events. The LLM can then access this knowledge graph to generate more accurate and relevant responses to user queries.

For example, if a user asks the LLM "What is the capital of France?", the LLM can query the knowledge graph to find out that the capital of France is Paris. The LLM can then generate a response such as "The capital of France is Paris."
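As an illustration of that lookup, here is a hedged sketch using the official neo4j Python driver; the Country and City labels and the CAPITAL relationship are hypothetical, chosen only to mirror the example above, and the connection details match the stack's defaults.

from neo4j import GraphDatabase

# From the host, Neo4j is reachable on localhost once the stack is up.
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Hypothetical schema for the example: (:Country)-[:CAPITAL]->(:City)
query = """
MATCH (c:Country {name: $country})-[:CAPITAL]->(city:City)
RETURN city.name AS capital
"""

with driver.session() as session:
    record = session.run(query, country="France").single()
    if record:
        print(f"The capital of France is {record['capital']}.")

driver.close()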

Neo4j is also used in the GenAI Stack to provide context for LLMs. Context is the information that surrounds a piece of text. It is important because it helps LLMs to better understand the meaning of the text.

Here are some of the benefits of using Neo4j in the GenAI Stack:

  • It is fast and scalable.
  • It can store and query complex graph data.
  • It can provide grounding and context for LLMs.
  • It can help LLMs to generate more accurate and relevant responses to user queries.

If you are working on developing GenAI applications, I encourage you to consider using Neo4j. It is a powerful tool that can help you to build more accurate and reliable AI systems.

What is LangChain?


LangChain is a programming and orchestration framework for large language models (LLMs). It provides a simple and intuitive way to interact with LLMs, and it makes it easy to build GenAI applications.

What is LangChain built on?

LangChain is implemented in Python (with a parallel JavaScript/TypeScript library) and provides a Pythonic API for interacting with LLMs. Rather than being built on a particular ML framework, it integrates with model providers and tools through their APIs. LangChain also provides a number of features that make it easy to build and deploy GenAI applications, including:

  • Integrations with many LLM providers, both hosted (such as OpenAI) and local (such as Ollama).
  • Building blocks such as prompt templates, chains, memory, and agents for composing LLM applications.
  • LangSmith, a companion service for tracing, debugging, and monitoring LLM applications (configured via the LANGCHAIN_* variables later in this post).

LangChain plays an important role in the GenAI Stack: it provides the programming and orchestration framework for the sample applications, tying the LLM, the embedding model, and the Neo4j database together.
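To give a feel for the API, here is a minimal sketch (using the langchain Python package as of late 2023) that wires the locally running llama2 model from Ollama into a prompt-template chain; the model name and prompt are just examples.

from langchain.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Point LangChain at the locally running Ollama server.
llm = Ollama(base_url="http://localhost:11434", model="llama2")

# A simple prompt template with one input variable.
prompt = PromptTemplate.from_template(
    "Explain {topic} to a developer in two sentences."
)

chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(topic="graph databases"))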

Benefits of LangChain

Here are some of the benefits of using LangChain in the GenAI Stack:

  • It provides a simple and intuitive way to interact with LLMs.
  • It makes it easy to build GenAI applications.
  • It lets you swap between local models (via Ollama) and hosted APIs (such as OpenAI's) with minimal code changes.
  • It offers ready-made building blocks for RAG, such as vector stores and retrievers.
  • Through LangSmith, it provides tracing and monitoring for LLM applications.

If you are interested in building GenAI applications, I encourage you to check out LangChain. It is a powerful tool that can help you to build and deploy reliable AI systems.

Components of the GenAI Stack

The GenAI Stack:

  • Comes bundled with the core components you need to get started, already integrated and set up for you in Docker containers.
  • Makes it really easy to experiment with new models, hosted locally on your machine (such as Llama 2) or via APIs (like OpenAI's GPT).
  • Is already set up to help you use the Retrieval Augmented Generation (RAG) architecture for LLM apps, which, in my opinion, is the easiest way to integrate an LLM into an application and give it access to your own data (see the sketch after this list).
  • Includes Neo4j as the default database for vector search and knowledge graphs, and it's available completely for free.
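To see how these pieces fit together, here is a hedged RAG sketch in the spirit of the stack's sample apps, using LangChain's Neo4jVector store and a RetrievalQA chain; the index name "stackoverflow", the embedding model, and the question are assumptions for illustration, not the repository's exact configuration.

from langchain.chains import RetrievalQA
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.llms import Ollama
from langchain.vectorstores import Neo4jVector

embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Reuse a vector index over nodes already stored in Neo4j.
# "stackoverflow" is an assumed index name for illustration.
store = Neo4jVector.from_existing_index(
    embeddings,
    url="neo4j://localhost:7687",
    username="neo4j",
    password="password",
    index_name="stackoverflow",
)

llm = Ollama(base_url="http://localhost:11434", model="llama2")

# RetrievalQA stuffs the top-k most similar documents into the prompt (RAG).
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=store.as_retriever(search_kwargs={"k": 4}),
)
print(qa.run("How do I merge two dictionaries in Python?"))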

Getting Started

Prerequisites

Step 1. Install Docker Desktop for Mac 4.23.0


Note: There is a performance issue that impacts Python applications in the latest release of Docker Desktop, v4.24.0. Until a fix is available, please use version 4.23.0 or earlier.

Please note that Ollama does not currently support Windows, so Windows users need to generate an OpenAI API key and configure the stack to use gpt-3.5 or gpt-4 in the .env file.

Step 2. Install Ollama on Mac OS


Visit this link to download and install Ollama on your Mac.

Choose your preferred operating system on the download page.


Step 3. Create OpenAI Secret API Keys

Visit this link to create your new OpenAI Secret API Keys.

Step 4. Sign Up for LangChain Beta for API Keys

Visit this link to create your LangChain endpoint and API key. You will need the following information:

LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_TRACING_V2=true # false
LANGCHAIN_PROJECT=default
LANGCHAIN_API_KEY=ls__cbabccXXXXXX

Step 5. Clone the repository

 git clone https://github.com/docker/genai-stack
 cd genai-stack

Step 6. Create .env file

cat .env 
OPENAI_API_KEY=sk-EsNJzI5uMBCXXXXXXXX
OLLAMA_BASE_URL=http://host.docker.internal:11434
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
LLM=llama2 #or any Ollama model tag, or gpt-4 or gpt-3.5
EMBEDDING_MODEL=sentence_transformer #or openai or ollama

LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_TRACING_V2=true # false
LANGCHAIN_PROJECT=default
LANGCHAIN_API_KEY=ls__cbabccXXXXXX

Don't forget to change "localhost" to "database" in the NEO4J_URI entry: the app containers reach Neo4j over the Compose network, where the service name database (not localhost) is what resolves.
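Once the services are up (after Step 7), you can sanity-check this wiring with the hedged snippet below, which loads the .env file and verifies that Neo4j is reachable; note that from the host you connect via localhost, since the database hostname only resolves inside the Compose network.

import os

from dotenv import load_dotenv  # pip install python-dotenv
from neo4j import GraphDatabase

load_dotenv()  # reads the .env file from the current directory

# Inside Compose the URI is neo4j://database:7687; from the host, use localhost.
uri = os.environ["NEO4J_URI"].replace("database", "localhost")
driver = GraphDatabase.driver(
    uri, auth=(os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"])
)
driver.verify_connectivity()  # raises an exception if Neo4j is unreachable
print("Neo4j is reachable at", uri)
driver.close()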

Step 7. Bring up Compose services

 docker compose up -d --build
pulling 8daa9615cce3...  93% |██████████████  | (3.5/3.8 GB, 29 MB/s)
genai-stack-pull-model-1  | ... pulling model (250s) - will take several minutes
genai-stack-pull-model-1  | ... pulling model (260s) - will take several minutes
pulling 8daa9615cce3... 100% |███████████████| (3.8/3.8 GB, 29 MB/s)
genai-stack-pull-model-1  | ... pulling model (270s) - will take several minutes
pulling 8c17c2ebb0ea... 100% |█████████████████| (7.0/7.0 kB, 3.9 MB/s)
pulling 7c23fb36d801... 100% |█████████████████| (4.8/4.8 kB, 989 kB/s)
genai-stack-pull-model-1  | ... pulling model (280s) - will take several minutes
pulling bec56154823a... 100% |████████████████████| (59/59 B, 103 kB/s)
pulling e35ab70a78c7... 100% |█████████████████████| (90/90 B, 15 kB/s)
genai-stack-pull-model-1  | ... pulling model (290s) - will take several minutes
pulling 09fe89200c09... 100% |██████████████████| (529/529 B, 4.2 MB/s)
genai-stack-pull-model-1  | verifying sha256 digest
genai-stack-pull-model-1  | writing manifest
genai-stack-pull-model-1  | removing any unused layers
genai-stack-pull-model-1  | success
genai-stack-pull-model-1 exited with code 0
genai-stack-loader-1      |
genai-stack-loader-1      | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
genai-stack-loader-1      |
genai-stack-pdf_bot-1     |
genai-stack-pdf_bot-1     | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
genai-stack-pdf_bot-1     |
genai-stack-bot-1         |
genai-stack-bot-1         | Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
genai-stack-bot-1         |
genai-stack-bot-1         |
genai-stack-bot-1         |   You can now view your Streamlit app in your browser.
genai-stack-bot-1         |
genai-stack-bot-1         |   URL: http://0.0.0.0:8501
genai-stack-bot-1         |
genai-stack-pdf_bot-1     |
genai-stack-pdf_bot-1     |   You can now view your Streamlit app in your browser.
genai-stack-pdf_bot-1     |
genai-stack-pdf_bot-1     |   URL: http://0.0.0.0:8503
genai-stack-pdf_bot-1     |
genai-stack-loader-1      |
genai-stack-loader-1      |   You can now view your Streamlit app in your browser.
genai-stack-loader-1      |
genai-stack-loader-1      |   URL: http://0.0.0.0:8502
genai-stack-loader-1      |

Step 8. Viewing the Services on the Docker Dashboard

Open the Docker Dashboard to confirm that all of the stack's services (the loader, bot, pdf_bot, pull-model, and database containers) are up and running.

Step 9. Accessing the app

Visit http://0.0.0.0:8502 to access the Stack Overflow data loader.

Click "Import". It will take a minute or two to run the import. Most of the time is spent generating the embeddings. After or during the import you can click the link to http://localhost:7474 and log in with username “neo4j” and password “password” as configured in docker compose. There, you can see an overview in the left sidebar and show some connected data by clicking on the “pill” with the counts.

The data loader will import the graph using the following schema.


The graph schema for Stack Overflow consists of nodes representing Questions, Answers, Users, and Tags. Users are linked to Questions they’ve asked via the “ASKED” relationship and to Answers they’ve provided with the “ANSWERS” relationship. Each Answer is also inherently associated with a specific Question. Furthermore, Questions are categorized by their relevant topics or technologies using the “TAGGED” relationship connecting them to Tags.
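Once the import finishes, you can explore that schema programmatically; here is a small sketch with the neo4j Python driver, following the labels and relationships described above (the exact property names are assumptions).

from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

# Top tags by number of questions, following the schema described above:
# (User)-[:ASKED]->(Question)-[:TAGGED]->(Tag)
query = """
MATCH (q:Question)-[:TAGGED]->(t:Tag)
RETURN t.name AS tag, count(q) AS questions
ORDER BY questions DESC
LIMIT 5
"""

with driver.session() as session:
    for record in session.run(query):
        print(record["tag"], record["questions"])

driver.close()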

Step 10. Accessing Neo4j

As instructed, open http://localhost:7474 and log in with username “neo4j” and password “password” as configured in docker compose.


Query the Imported Data via a Chat Interface Using Vector + Graph Search

The application at http://localhost:8501 has the classic LLM chat UI and lets the user ask questions and get answers.

There’s a switch called RAG mode, where the user can rely either entirely on the LLM's trained knowledge (RAG: Disabled) or use the more capable mode (RAG: Enabled), in which the application combines similarity search over text embeddings with graph queries to find the most relevant questions and answers in the database.

Click "Highly ranked questions"

Accessing GenAI Stack PDF Bot

Visit http://0.0.0.0:8503/ to access the PDF bot.
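Under the hood, a PDF bot like this typically chunks the uploaded document, embeds the chunks, and stores them for retrieval. Here is a hedged sketch of that ingestion step with LangChain; the file name, chunk sizes, and embedding model are placeholders, not the app's exact settings.

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Neo4jVector

# Load the PDF and split it into overlapping chunks (sizes are placeholders).
pages = PyPDFLoader("example.pdf").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(pages)

# Embed the chunks and store them in Neo4j for similarity search.
store = Neo4jVector.from_documents(
    chunks,
    SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2"),
    url="neo4j://localhost:7687",
    username="neo4j",
    password="password",
)
print(f"Indexed {len(chunks)} chunks from example.pdf")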

