<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Eti Ijeoma </title>
    <description>The latest articles on DEV Community by Eti Ijeoma  (@aijeyomah).</description>
    <link>https://dev.to/aijeyomah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1069843%2F6d883091-4801-4303-b463-1e4f7849a260.jpg</url>
      <title>DEV Community: Eti Ijeoma </title>
      <link>https://dev.to/aijeyomah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aijeyomah"/>
    <language>en</language>
    <item>
      <title>Docker Volumes vs. Bind Mounts: Choosing the Right Storage for Your Containers.</title>
      <dc:creator>Eti Ijeoma </dc:creator>
      <pubDate>Tue, 07 Jan 2025 21:10:50 +0000</pubDate>
      <link>https://dev.to/aijeyomah/docker-volumes-vs-bind-mounts-choosing-the-right-storage-for-your-containers-3pb8</link>
      <guid>https://dev.to/aijeyomah/docker-volumes-vs-bind-mounts-choosing-the-right-storage-for-your-containers-3pb8</guid>
      <description>&lt;p&gt;Data persistence in application deployment using containerization is often a critical challenge that can make or break your application’s performance and reliability. Containers are ephemeral, meaning they spin up, execute tasks, and can be destroyed in moments.&lt;/p&gt;

&lt;p&gt;When a containerized application is removed or destroyed, all changes made to the container itself, including the data stored in it, are lost. As a result, any files or data stored in the container's file system are erased when the containers are removed, and a new container will be created without the previous changes.&lt;/p&gt;

&lt;p&gt;This behavior can pose a threat when working with containerized applications that rely on persisting data such as logs, databases, or sensitive configuration files. Losing such data every time a container is recreated can disrupt your software development workflow, and affect the overall functionality of the application. &lt;/p&gt;

&lt;p&gt;To avoid losing your application data, Docker provides two key features, Docker volumes and bind mounts, as solutions to persist data within your Docker containers. &lt;/p&gt;

&lt;p&gt;In this tutorial, we will take an extensive look at Docker volumes and bind mounts, covering their unique features, a comparative analysis, and use case recommendations for both.&lt;/p&gt;

&lt;h3&gt;
  
  
  Docker Volumes
&lt;/h3&gt;

&lt;p&gt;Docker volumes are data stores used for persistent data storage for your containerized applications. In the Docker environment, you can create a volume using the docker volume create command or use the default volumes that are created whenever a container is created.&lt;/p&gt;

&lt;p&gt;Docker volumes are decoupled from the host’s specific file system structure: Docker stores them in a dedicated directory within its own storage area, typically located at &lt;code&gt;/var/lib/docker/volumes/&lt;/code&gt; on Linux and Unix-based systems. This helps to eliminate the complexity of managing storage locations.&lt;/p&gt;

&lt;p&gt;The following commands can be used to create, list, inspect, and remove volumes using the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker volume create my_volume   # Create a new volume
$ docker volume ls                 # List all volumes
$ docker volume inspect my_volume  # Inspect a specific volume
$ docker volume rm my_volume       # Remove a volume
$ docker run -v &amp;lt;my_volume&amp;gt;:&amp;lt;container_path&amp;gt; &amp;lt;image_name&amp;gt;:&amp;lt;tag&amp;gt;  # Run a container with the specified volume mounted to the container path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Consider a scenario with a PostgreSQL database. With a Docker volume, you can ensure that the database files persist even if the container is deleted by running the following command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -v postgres_data:/var/lib/postgresql/data postgres: latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, the PostgreSQL data stored in &lt;code&gt;/var/lib/postgresql/data&lt;/code&gt; is persisted in the &lt;code&gt;postgres_data&lt;/code&gt; volume, which will survive container restarts and removals.&lt;/p&gt;

&lt;p&gt;Docker volumes provide a separation layer between the container environment and the storage. The containers can access these volumes using mount points, while Docker manages the overall storage infrastructure.&lt;/p&gt;

&lt;p&gt;Volumes work with Linux, Unix, and Windows Docker environments. Different volume drivers store data in different backing services: local storage is the default, but drivers for alternatives such as NFS volumes and CIFS shares are also available.&lt;/p&gt;

&lt;p&gt;Docker’s API also allows volumes to be created dynamically, which is especially useful within a continuous integration and deployment pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features of Docker Volumes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Independence&lt;/strong&gt;&lt;br&gt;
Docker volumes offer isolation and independence from host directory structures. They can also be moved from one environment to another with minimal access to the host system, enhancing security.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Easy Backup and Migration capabilities&lt;/strong&gt;&lt;br&gt;
Since Docker volumes are portable, they offer a good mechanism for backup and migration within your infrastructure. Volumes can easily be copied using Docker CLI commands, and there is also support for cloud-based backup strategies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Universal Storage compatibility&lt;/strong&gt;&lt;br&gt;
Docker volumes are platform-agnostic, which means that they provide uniform functionality across different operating systems, cloud platforms, and several other containerized environments.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
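&lt;p&gt;To illustrate the backup capability described above, the following command sketch (assuming a volume named &lt;code&gt;my_volume&lt;/code&gt;; the names are placeholders) mounts the volume read-only into a temporary container and archives its contents to the current host directory:&lt;/p&gt;

```shell
# Back up the contents of a named volume to a tar archive on the host.
# my_volume is a placeholder; the temporary alpine container is removed afterwards (--rm).
docker run --rm \
  -v my_volume:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/my_volume_backup.tar.gz -C /data .
```

&lt;p&gt;Restoring is the reverse: mount an empty volume and extract the archive into it.&lt;/p&gt;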

&lt;h2&gt;
  
  
  When to use Docker Volumes
&lt;/h2&gt;

&lt;p&gt;Docker volumes are specifically designed to support stateful applications, making them ideal for various use cases, including the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Database storage&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When working with databases such as MySQL, PostgreSQL, or MongoDB, you should mount a volume to the storage directories used by the database to ensure that your data persists after the container restarts.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Application Data Storage&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data that is generated by your application should be stored in a volume for persistent storage. Data includes documents, photos, and file uploads.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cached Data Storage.&lt;/strong&gt;
Use a volume to persist any cached data generated within your application that would take time to rebuild if the container restarts. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Bind Mounts
&lt;/h3&gt;

&lt;p&gt;Bind Mounts are one of the most direct and fundamental methods of persistent storage within a containerized environment. A bind mount creates a bi-directional and real-time mapping that mirrors the host filesystem’s exact state within your container environment. &lt;/p&gt;

&lt;p&gt;Unlike Docker volumes, bind mounts allow you to mount a specific file or directory from the host's file system into the container. This connection creates a direct link between the host and container file systems, with the same path structure and access to the host files.&lt;/p&gt;

&lt;p&gt;When you create a bind mount, Docker creates a reference point between the host directory and the container directory. Any change made in either the host or container directory reflects on the corresponding location.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Basic bind mount syntax
docker run -v /host/path:/container/path image_name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;/host/path&lt;/code&gt; is mounted to &lt;code&gt;/container/path&lt;/code&gt; to create a connection between the host and the container filesystem.&lt;/p&gt;

&lt;p&gt;Within the Docker ecosystem, bind mounts existed before Docker volumes and are built on Linux mount features. When Docker was created, bind mounts were the initial method for persistent storage.&lt;/p&gt;

&lt;p&gt;Consider a scenario in which you need to mount a web development environment to a development container. The following Docker script can achieve this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -v /home/developer/project:/app \
-w /app \
node:latest \
npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command above mounts the local project directory into the container’s &lt;code&gt;/app&lt;/code&gt; directory, allowing quick, real-time code changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features of Bind Mounts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct access to the host machine’s filesystem&lt;/strong&gt;
Bind mounts create a direct connection to the host machine's filesystem, providing minimal latency and instant synchronization between the host and the container environment. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Control over the mounted directory&lt;/strong&gt;
Developers gain fine-grained control over which files and directories are exposed to containers.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Mount a specific subdirectory
docker run -v /home/user/project/src:/app/src image_name
# Mount individual files
docker run -v /home/user/config.json:/app/config.json image_name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The setup above selects the exact files to be exposed to the container environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Cases of Bind Mounts
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Local development environments&lt;/strong&gt;&lt;br&gt;
Bind mounts work well in local development workflows because they provide instant code synchronization, which helps maintain consistency in the development environment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sharing of configuration files&lt;/strong&gt;&lt;br&gt;
Here, you can mount host configuration files into a container to configure web servers, databases, and application systems.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -v /host/nginx.conf:/etc/nginx/nginx.conf nginx: latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database Data Storage&lt;/strong&gt;
Bind mounts help map database storage directories on the host to the container for data persistence. Consider the scenario for a PostgreSQL database.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -v /host/pgdata:/var/lib/postgresql/data postgres: latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Comparative Analysis Between Docker Volumes and Bind Mounts
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Performance Comparison
&lt;/h3&gt;

&lt;p&gt;Docker volumes and bind mounts differ in the performance characteristics they offer in a containerized environment.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docker Volumes&lt;/strong&gt;: Volumes are designed for optimized storage, offering faster I/O operations and caching across host systems. Docker volumes manage storage allocation efficiently, but they often have a higher memory overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bind Mounts&lt;/strong&gt;: Bind mounts provide direct access to the host's file system. &lt;br&gt;
However, their performance depends on the host filesystem. Since they map directly to that filesystem, they use minimal memory. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The performance of Docker volumes and Bind Mount can also depend on the type of storage (HDD vs. SSD), the host filesystem (NTFS, ext4), and container runtime settings.&lt;br&gt;
In terms of performance, Docker volumes provide more consistent and optimized performance with good storage management features. They are ideal for production environments and in the management of large-scale stateful applications like databases.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Security Considerations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docker Volumes&lt;/strong&gt;: Volumes provide an environment isolated from the host filesystem, reducing potential security risks. Controlled access mechanisms also ensure limited exposure to the host system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bind Mounts&lt;/strong&gt;: Bind mounts expose the host filesystem directly to the container and inherit the host system’s permissions. This may lead to unauthorized access, with unintended file modifications. To avoid this, use read-only mounts and implement strict access controls for the host directories.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
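&lt;p&gt;The read-only mounts mentioned above are created by appending &lt;code&gt;:ro&lt;/code&gt; to the mount specification; a sketch with placeholder paths and image name:&lt;/p&gt;

```shell
# The container can read /app/config but cannot modify the host files.
# /host/config and image_name are placeholders.
docker run -v /host/config:/app/config:ro image_name
```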

&lt;h3&gt;
  
  
  3. Portability and Compatibility
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Docker Volumes&lt;/strong&gt;: These are portable and consistent across different platforms, making storage management, backup, and migration strategies easier. They also work well with other Docker tools, such as Docker Desktop. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bind Mounts&lt;/strong&gt;: These are platform-dependent, and their behavior is tied to the host operating system. Configuring them may require extensive path mapping, especially when the source and target systems run different operating systems. File system differences may also surface during configuration, so extra attention is required.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Use Case Recommendations for Docker Volumes vs Bind Mounts
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;When to use Docker Volumes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Docker volumes are the best choice for production environments where reliability and scalability are critical. They also work well in microservice architectures, where different containers need shared access to data. When working with persistent database storage, Docker volumes keep files safe after a container restarts or is recreated. This makes them ideal for setting up database infrastructure such as PostgreSQL or MongoDB.&lt;/p&gt;

&lt;p&gt;Due to their platform-agnostic nature, Docker volumes are ideal for cross-platform deployments. This feature makes it easier to deploy containerized applications across diverse operating systems, both on-premise and in the cloud.&lt;/p&gt;

&lt;p&gt;In addition, Docker volumes are ideal in a CI/CD pipeline, where sensitive files and data, such as logs, artifacts, build files and dependencies, need to be stored and reused. This ensures proper data protection and isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use Bind Mounts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Bind mounts are the best option for development and testing scenarios. They offer flexibility and real-time synchronization of files from the host to the container. Bind mounts also help developers edit code and make changes in the container. Thus, they are ideal for web development scenarios where comprehensive testing is important in the application's development stage.&lt;/p&gt;

&lt;p&gt;Bind mounts have low overhead requirements, making them a lightweight option for applications that do not require persistent storage or cross-platform compatibility. They are best for short-lived containers that will not scale to multiple environments.&lt;/p&gt;
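&lt;p&gt;The recommendations above can be combined in one deployment description. The following &lt;code&gt;docker-compose.yml&lt;/code&gt; fragment is a hypothetical sketch (service names, build context, and paths are placeholders) that uses a named volume for the database and a bind mount for live code editing:&lt;/p&gt;

```yaml
services:
  db:
    image: postgres:latest
    volumes:
      - postgres_data:/var/lib/postgresql/data   # named volume: survives container recreation
  web:
    build: .
    volumes:
      - ./src:/app/src   # bind mount: real-time code synchronization during development
volumes:
  postgres_data:   # declares the named volume managed by Docker
```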

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Docker Volumes and bind mounts are two approaches to container storage in a containerized environment. They each have unique strengths and trade-offs. In this article, we have shown how Docker volumes provide platform-agnostic storage with great security features across different environments, while bind mounts offer access to the host filesystem with minimal overhead.&lt;/p&gt;

&lt;p&gt;This article provides the information you need to adopt the best practices for a successful application deployment. A proper understanding of Docker volumes and bind mounts is essential for choosing the right persistent storage strategy for your application. &lt;/p&gt;

</description>
      <category>docker</category>
      <category>devops</category>
    </item>
    <item>
      <title>Building RAG-Powered Applications with LangChain, Pinecone, and OpenAI</title>
      <dc:creator>Eti Ijeoma </dc:creator>
      <pubDate>Fri, 13 Dec 2024 02:41:26 +0000</pubDate>
      <link>https://dev.to/aijeyomah/building-rag-powered-applications-with-langchain-pinecone-and-openai-2501</link>
      <guid>https://dev.to/aijeyomah/building-rag-powered-applications-with-langchain-pinecone-and-openai-2501</guid>
      <description>&lt;p&gt;The rise of large language models (LLMs) that can understand and generate human-like text has really transformed the area of artificial intelligence, allowing machines to understand and generate text that feels very human-like. While there are numbers of LLMs are available, In this article, we will focus on the generative pre-trained transformer (GPT), one of OpenAI's most advanced models.&lt;/p&gt;

&lt;p&gt;GPT models were initially trained on large datasets, most of which were gathered from the Internet. This greatly improved the models' reasoning ability. As a result, they perform well on different natural language processing (NLP) tasks, including question-answering, summarizing, and generating human-like text.&lt;/p&gt;

&lt;p&gt;However, LLMs are limited to the knowledge captured in their training data and can hallucinate when asked about unfamiliar or recent information. In this article, we will explore how the retrieval-augmented generation technique can mitigate these limitations and improve the performance of language models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retrieval Augmentation Generation
&lt;/h2&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) is a technique for generating a high-quality, context-aware response by combining the initial prompt with information retrieved from external knowledge sources and passing the augmented prompt to the LLM. These knowledge sources may include collections of web pages, documents, or other textual resources that improve the LLM's understanding of information. Giving language models access to external data after initial training reduces the need to retrain them. In this article, we will leverage the capabilities of OpenAI alongside Langchain and Pinecone to create a context-aware chatbot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Langchain
&lt;/h2&gt;

&lt;p&gt;Langchain is a tool that is great for creating applications that use large language models. It is available as a Python or JavaScript package, allowing software developers to create applications based on pre-existing AI models. Langchain can connect LLMs to data sources, allowing them to interact with their environment. Check out the Langchain documentation for a guide on how to get started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pinecone
&lt;/h2&gt;

&lt;p&gt;Pinecone is a cloud-based vector database specializing in efficiently storing, indexing, and querying high-dimensional vectors. It is designed for effective similarity searches, enabling you to find vectors that are most similar based on metrics like &lt;a href="https://en.wikipedia.org/wiki/Euclidean_vector" rel="noopener noreferrer"&gt;Euclidean distance&lt;/a&gt; or &lt;a href="https://en.wikipedia.org/wiki/Cosine_similarity" rel="noopener noreferrer"&gt;cosine similarity&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To improve information retrieval for our AI model, we convert our knowledge documents into a special form called a "word embedding" or "word vector" by using an embedding model and storing the vectors in a vector database. This makes searching for relevant information faster and more accurate, especially when dealing with unstructured or semi-structured text data.&lt;/p&gt;
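&lt;p&gt;To make the similarity idea concrete, here is a minimal, self-contained sketch of cosine similarity over toy three-dimensional "embeddings" (real embedding models produce hundreds or thousands of dimensions; the vectors below are invented for illustration):&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: semantically close words should get close vectors.
dog = [0.9, 0.1, 0.3]
puppy = [0.8, 0.2, 0.3]
car = [0.1, 0.9, 0.5]

print(cosine_similarity(dog, puppy))  # close to 1.0
print(cosine_similarity(dog, car))    # much smaller
```

&lt;p&gt;This is the metric Pinecone applies, at scale and with indexing, when it ranks stored vectors against a query vector.&lt;/p&gt;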

&lt;h2&gt;
  
  
  Bringing it All Together
&lt;/h2&gt;

&lt;p&gt;With our understanding of retrieval augmentation generation, we will leverage the power of OpenAI LLM, Langchain, and Pinecone to create a question-answering application.&lt;/p&gt;

&lt;p&gt;The implementation will be as follows: we will provide knowledge base documents, embed them, and store them in Pinecone. When a query is provided, it is converted into word embeddings using the same embedding model as the knowledge base text. The embedded query is then used to query Pinecone to find the most similar and relevant vectors. These similar vectors are translated back into the original text and used to help the LLM generate context-based responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps Involved in Retrieval Augmented Generation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Data preparation and Ingestion&lt;/strong&gt;: After we gather our knowledge base data from different sources, it is important to ingest (transform) the data into a standard structure that the language model can easily process. Langchain provides document loader tools that are responsible for loading text from different sources (text files, CSV files, YouTube transcripts, etc.) to create a Langchain document. A Langchain document has two fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;page_content&lt;/strong&gt;: Contains the text of the file&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;metadata&lt;/strong&gt;: It stores additional relevant information about the text such as the text URL.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples of document loaders are text loaders, which can open a text file, and transform loaders, which can open any of a list of specific formats (HTML, CSV, etc.) and load the contents into a document.&lt;/p&gt;
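&lt;p&gt;For intuition, the two-field document structure can be sketched as a plain Python dataclass (a stand-in for illustration only, not the actual Langchain class; the content and metadata values are invented):&lt;/p&gt;

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for a Langchain document: the text plus its metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)

doc = Document(
    page_content="Scotland beat England 2-1.",
    metadata={"source": "results.csv", "row": 42},  # e.g. where the text came from
)
print(doc.metadata["source"])  # results.csv
```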

&lt;p&gt;&lt;strong&gt;Chunking Text&lt;/strong&gt;: This entails breaking down a document into smaller, more meaningful fragments, often shaped like sentences. This procedure is critical, especially when dealing with lengthy text inside the context window constraints of GPT-4, which presently supports a maximum of 8,192 tokens. The idea is to construct manageable fragments that are semantically coherent.&lt;/p&gt;

&lt;p&gt;To accomplish this, we use a length function, such as tiktoken, to calculate the sizes of these smaller fragments. The goal is to avoid abruptly splitting related information during chunking. In addition, we include an overlap that allows adjacent chunks to share content. This overlap contributes to continuity by repeating common words or phrases at the end of one chunk and the start of the next.&lt;/p&gt;
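&lt;p&gt;The overlap idea can be illustrated with a simplified, self-contained chunker that counts words instead of tokens (later in this article we use RecursiveCharacterTextSplitter with tiktoken; this sketch only shows how neighbouring chunks share content):&lt;/p&gt;

```python
def chunk_words(text, chunk_size=8, overlap=3):
    """Split text into word chunks; consecutive chunks share `overlap` words."""
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

text = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(text)
print(len(chunks))             # 4
print(chunks[0].split()[-3:])  # the same words that start the next chunk
```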

&lt;p&gt;&lt;strong&gt;Embedding&lt;/strong&gt;: This process transforms complex data e.g. text, images, and audio into high-dimensional numeric vectors. Numerical storage allows for effective storage and processing. Also, embedding methods such as Word embeddings (Word2Vec) capture the semantic relationship between words and concepts. This semantic information is essential during the retrieval augmentation where understanding the context and relationships between terms is crucial for generating relevant content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embedding tools&lt;/strong&gt;: Langchain integrates with several models for generating word or sentence embeddings. In this article, we will make use of the OpenAIEmbeddings to create embeddings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Storage&lt;/strong&gt;: The embedded documents are then stored in Pinecone’s vector database. Pinecone has some indexing techniques that organize and optimize the search to facilitate efficient retrieval of vectors similar to a query vector.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retrieval&lt;/strong&gt;: The received query is embedded and then used to search Pinecone’s vector database for the most relevant documents. For example, in a similarity search, the term "dog" may be represented numerically as [0.617]. Whenever a word is searched, it is also converted into a vector. A good model ensures that words with similar contexts, such as "puppy," yield closely aligned number series, like [0.691], reflecting the shared context between the words.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generation&lt;/strong&gt;: The language model produces an accurate response for the query by utilizing the retrieved document as an additional context.&lt;/p&gt;
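&lt;p&gt;Putting the retrieval and generation steps together, here is a toy end-to-end sketch with stubbed components: word-overlap ranking stands in for the vector search, and a prompt template stands in for the LLM call (all function names and documents are invented for illustration):&lt;/p&gt;

```python
def retrieve(query, knowledge_base, top_k=2):
    """Toy retrieval: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    def overlap(doc):
        return len(query_words.intersection(doc.lower().split()))
    return sorted(knowledge_base, key=overlap, reverse=True)[:top_k]

def build_augmented_prompt(query, docs):
    """Combine the retrieved context with the user's question (RAG augmentation)."""
    context = "\n".join(docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

kb = [
    "Scotland scored 419 away goals in the last century.",
    "The first international match was played in 1872.",
    "Brazil has won five World Cups.",
]
query = "How many away goals did Scotland score?"
prompt = build_augmented_prompt(query, retrieve(query, kb))
print(prompt.splitlines()[1])  # the most relevant document comes first
```

&lt;p&gt;In the real pipeline below, Pinecone's similarity search replaces the toy ranking and an OpenAI model consumes the augmented prompt.&lt;/p&gt;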

&lt;p&gt;&lt;strong&gt;Implementing the question-and-answering chatbot&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this section, we’ll walk through a practical example of building an international football match question-answering bot. The context-aware documents come from this &lt;a href="https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017" rel="noopener noreferrer"&gt;Kaggle data set&lt;/a&gt;, which you can download.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up the environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To begin, we will create a .env file in which we will keep the Pinecone and OpenAI secret keys. To generate or find the secret keys for &lt;a href="https://app.pinecone.io/" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt; and &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;OpenAI&lt;/a&gt;, follow the attached links.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OPENAI_API_KEY=""
PINECONE_ENV=""
PINECONE_API_KEY=""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we'll utilize pip to install the necessary packages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!pip install openai \
      \ langchain 
       \ pinecone 
       \ tiktoken
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we'll import the relevant libraries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
import time
from langchain.llms import OpenAI
from langchain.vectorstores import Pinecone
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import ConversationalRetrievalChain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT
from langchain.chains.question_answering import load_qa_chain
from langchain.embeddings.openai import OpenAIEmbeddings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We'll then use the Langchain DirectoryLoader to load documents from a directory. The documents in this example will be in CSV format and placed in the /data directory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# In your project folder, create this directory structure which will hold the context-aware documents downloaded earlier 
directory = '/data' 
def load_docs(directory):
  loader = DirectoryLoader(directory, glob='**/*.csv', show_progress=True, loader_cls=CSVLoader)
  documents = loader.load()
  return documents

documents = load_docs(directory)
print(f"Loaded {len(documents)} documents")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we will split the documents into smaller chunks. This can be complex, but we will simplify it using RecursiveCharacterTextSplitter from Langchain. First, we will define a custom length function using the tiktoken library, which will enable us to recursively split the data into chunks of n tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Tell tiktoken what model we'd like to use for embeddings
tiktoken.encoding_for_model('text-embedding-ada-002')

# Intialize a tiktoken tokenizer (i.e. a tool that identifies individual tokens (words))
tokenizer = tiktoken.get_encoding('cl100k_base')

# Create our custom tiktoken function
def tiktoken_len(text: str) -&amp;gt; int:
    """
    Split up a body of text using a custom tokenizer.

    :param text: Text we'd like to tokenize.
    """
    tokens = tokenizer.encode(
        text,
        disallowed_special=()
    )
    return len(tokens)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def chunk_by_size(text: str, size: int = 50) -&amp;gt; list[Document]:
    """
    Chunk up text recursively.

    :param text: Text to be chunked up
    :return: List of Document items (i.e. chunks).|
    """
    text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = size,
    chunk_overlap = 20,
    length_function = tiktoken_len,
    add_start_index = True,
)
    return text_splitter.create_documents([text])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we will initialize our OpenAI embedding model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Initialize our OpenAI model
OPENAI_API_KEY = getpass("OpenAI API Key: ")
model_name = 'text-embedding-ada-002'

embeddings = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=OPENAI_API_KEY
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, Pinecone will be initialized using the environment and Pinecone API key. If an index does not already exist, one will be created and configured to store 1536 &lt;a href="https://docs.pinecone.io/docs/choosing-index-type-and-size" rel="noopener noreferrer"&gt;dimension vectors&lt;/a&gt; that correspond to the embedding's length, using cosine similarity as the &lt;a href="https://www.pinecone.io/learn/vector-similarity/" rel="noopener noreferrer"&gt;similarity metric&lt;/a&gt;. The Pinecone instance will be created using the embeddings and index that have been provided, and the documents will be added to the vector store using the Pinecone.from_documents() method.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PINECONE_API_KEY = os.getenv('PINECONE_API_KEY')
PINECONE_ENV = os.getenv('PINECONE_ENV')
# Initialize Pinecone client
pinecone.init(
    api_key=PINECONE_API_KEY,
    environment=tPINECONE_ENV
)
index_name = 'international-sport'

if index_name not in pinecone.list_indexes():
    print(f'Creating Index {index_name}...')
    pinecone.create_index(index_name, dimension=1536, metric='cosine') # Create index. This might take a while to create
    # wait a moment for the index to be fully initialized
     time.sleep(1)
     print('Done')
else:
    print(f'index {index_name} already exists')
index = Pinecone.from_documents(docs, embeddings, index_name=index_name) 

# to retieve the number of vectors in the embedding
index = pinecone.Index(INDEX_NAME)
print(index.describe_index_stats())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Finding similar documents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We will now define the function to search Pinecone for similar documents based on the user query input&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;get_similar_docs(cls, query, index, k=5):
    found_docs = index.similarity_search(query, k=k)
    print(found_docs)
    if len(found_docs) == 0:
      return "Sorry, There is no relevant answer to your question. Please try again."
    logger.info("Found document similar to the query")
    return found_docs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we'll use the Langchain PromptTemplate to create a predefined parameterized text format that will be used to direct response generation in a specific context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;template = """
You are an AI chatbot with a sense of humor.
Your mission is to turn the user's input into funny jokes.

{chat_history}
Human: {human_input}
Chatbot:"""

new_prompt = PromptTemplate(
    input_variables=["chat_history", "human_input"],
    template=template
)
new_memory = ConversationBufferMemory(memory_key="chat_history")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we will write a function that uses OpenAI LLM, a question-answering chain &lt;code&gt;load_qa_chain&lt;/code&gt; from Langchain, and takes the user's query as input. You can choose from a variety of chain types depending on your use case; in this case, we'll use the stuff &lt;code&gt;chain_type&lt;/code&gt;, which uses all of the text from the prompt's documents.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model_name = "gpt-3.5-turbo-0301"
llm = OpenAI(model_name=model_name)

chain = load_qa_chain(llm, chain_type="stuff")

def retrieve_answer(query):
  similar_docs = get_similar_docs(query)
  return chain.run(input_documents=similar_docs, question=query)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lastly, we will test the functionality of the question-answering system using the query below&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query = "How many away goals did scotland score in the last century?"
reponse = retrieve_answer(query)
print(answer)
 //419
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In this article, we explained how we can mitigate hallucination during response generation by giving the model some context. We wrapped up by building a context-aware question-answering system that utilizes the power of semantic search to extract relevant information from a set of documents, giving context to the OpenAI model.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>rag</category>
      <category>python</category>
      <category>langchain</category>
    </item>
  </channel>
</rss>
