3GPP Insights: Expert Chatbot with Amazon Bedrock & RAG

Having been in the Telecom field for quite a few years now, I often hear the same questions: "Can you show me where in the 3GPP standard this feature is mentioned?" or "Is this solution aligned with the current 3GPP standards?" Whether it's a passionate new grad who just joined the company and wants to prove their value to the team, or a skeptical customer who loves to dive deep into every detail to look good in front of their boss 😉, the goal is the same: get the desired data in a human-readable format.

Thinking about the different ways to use GenAI in the Telecom field, I came up with this blog entry. What if you could simply ask a GenAI model a 3GPP feature-compliance question and get the answer in seconds (or minutes, depending on which LLM you are testing)? Let's get started 🤓

Call-Flow:

Below is the architecture and call flow of this 3GPP chatbot; I will briefly explain each item in the following section:

(Architecture diagram: 3GPP Chatbot)

Call Flow Explanation:

Data Integration Workflow:

  1. Load the data into memory through PyPDFLoader, a document loader provided by LangChain, which lets us read all the data from the PDF into memory. This initial action should be performed by our good old Telco guru.
  2. Break down the ingested data into single pages, paragraphs, lines, and then characters until we can create vector embeddings from smaller chunks of data. For that, I will use "RecursiveCharacterTextSplitter," defining the desired chunk size in characters.
  3. Create vector embeddings from the chunks of characters. In this demo, I will use the Amazon Titan Text Embeddings model.
  4. The last step is to store the vector embeddings in a vector store and create an index for easier search and retrieval.

End-user Flow:

A - The flow starts with our Telco new grad posting a question, which then goes to the Titan Text Embeddings model.
B - Titan will create vector embeddings for that specific question.
C - Once these vector embeddings are created, a similarity search will be performed against the vector store.
D - If a match is found, the retrieved text, or "context", will be sent to the next step, the Foundation Model.
E - The question and context will then be combined and sent to our Foundation Model, in this case Llama3 (sketched in code right after this list).
F - A human-readable answer will be generated and prepared to be sent back to our Telco new grad.
G - The final step is an accurate response sent back through the chatbox, solving our new grad's 3GPP-related questions and saving them minutes (or hours) in the process.
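To make steps A-G concrete, here is a minimal sketch of what happens under the hood of the retrieve-and-generate loop. It uses the same LangChain classes introduced in the implementation section below (BedrockEmbeddings, FAISS and the Bedrock LLM wrapper); the saved index path "3gpp_faiss_index", the prompt wording and the sample question are illustrative only and assume an index has already been built and saved.

from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms.bedrock import Bedrock

# Assumes a FAISS index was already built from the 3GPP PDF and saved locally (hypothetical path)
embeddings = BedrockEmbeddings(credentials_profile_name='default',
                               model_id='amazon.titan-embed-text-v1')
vector_store = FAISS.load_local("3gpp_faiss_index", embeddings)

question = "Which standard HTTP headers are mandatory?"
# B/C: embed the question and run a similarity search against the vector store
matches = vector_store.similarity_search(question, k=3)
context = "\n".join(doc.page_content for doc in matches)

# D/E: combine the question and the retrieved context and send both to the foundation model
llm = Bedrock(credentials_profile_name='default', model_id='meta.llama3-8b-instruct-v1:0')
answer = llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)  # F/G: human-readable answer back to the user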

Implementation

Pre-requisites:

The following tools must be installed before you apply and test the code, so please check all items before moving on to the next steps.

  • VSCode (recommended for its Anaconda integration)
  • Python
  • AWS CLI
  • IAM role for VSCode
  • Anaconda Navigator --> Open VSCode from Anaconda Navigator
  • Install boto3: pip install boto3
  • Install LangChain: pip install langchain
  • Install Streamlit for an easy front-end option: pip install streamlit
  • Amazon Bedrock itself needs no separate package; it is reached through boto3 and LangChain's Bedrock integrations, so just make sure model access is enabled in your AWS account
  • Install Flask-SQLAlchemy: pip3 install flask-sqlalchemy
  • Install pypdf: pip install pypdf
  • Install FAISS: pip install faiss-gpu (or pip install faiss-cpu if you don't have a GPU)
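With the prerequisites in place, it's worth running a quick sanity check to confirm that your AWS credentials and region can actually reach Bedrock. This is a minimal sketch, assuming the 'default' AWS CLI profile used throughout this post and a Bedrock-enabled region such as us-east-1 (also make sure model access for Titan Embeddings and Llama 3 is enabled in the Bedrock console):

import boto3

# Assumes the 'default' profile and a Bedrock-enabled region
session = boto3.Session(profile_name='default', region_name='us-east-1')
bedrock = session.client('bedrock')

# List the foundation models visible to your account; Titan Embeddings and Llama 3 should appear
for model in bedrock.list_foundation_models()['modelSummaries']:
    print(model['modelId'])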

1. Data Load Operation:

Our first piece of code covers the Data Load operation. We will create a new .py file and use the code below as a reference.

import os
from langchain.document_loaders import PyPDFLoader

# Load the 3GPP TS 29.510 specification straight from the ETSI site
data_load=PyPDFLoader('https://www.etsi.org/deliver/etsi_ts/129500_129599/129510/16.04.00_60/ts_129510v160400p.pdf')
# load_and_split() returns a list of Document objects, one per page
data_test=data_load.load_and_split()
print(len(data_test))
print(data_test[0]) ##You can test by replacing [0] with the page number you want to fetch


The code below can then be omitted, as its sole purpose is to help us understand how PyPDFLoader works :)

print(len(data_test))
print(data_test[0]) ##You can test by replacing [0] with the page number you want to fetch

2. Data Transformation:

For the Data transformation, we need to start by splitting the original text into smaller chunks.

Refer to the official LangChain documentation for the text splitter: Langchain-Text-Splitter
Name of this file: data_split_test.py

#1. Import OS, Document Loader and Text Splitter
import os
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

#2. Define the data source and load data with PDFLoader
data_load=PyPDFLoader('https://www.etsi.org/deliver/etsi_ts/129500_129599/129510/16.04.00_60/ts_129510v160400p.pdf')
#3. Split the Text based on Character, Tokens etc. - Recursively split by character - ["\n\n", "\n", " ", ""]
data_split=RecursiveCharacterTextSplitter(separators=["\n\n", "\n", " ", ""], chunk_size=100, chunk_overlap=10)
data_sample = 'The mandatory standard HTTP headers as specified in clause 5.2.2.2 of 3GPP TS 29.500 [4] shall be supported.'
data_split_test = data_split.split_text(data_sample)
print(data_split_test)
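The sample string above shows how the splitter behaves on a single sentence. If you want to see the same splitter applied to the actual PDF pages (which is what the index creator does for us in the next step), a minimal sketch could look like this; it simply reuses the data_load and data_split objects defined above:

# Optional: apply the same splitter to the loaded PDF pages instead of a sample string
pages = data_load.load_and_split()          # one Document per page
chunks = data_split.split_documents(pages)  # the small chunks that will be embedded later
print(len(pages), "pages ->", len(chunks), "chunks")
print(chunks[0].page_content)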

3. Embedding, Vector Store & Index operation

For this step, we will invoke our Bedrock Titan model "amazon.titan-embed-text-v1" and create a vector store and Index.

Name of this file: rag_backend.py

#Import OS, Document Loader, Text Splitter, Bedrock Embeddings, Vector DB, VectorStoreIndex, Bedrock-LLM
import os
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS
from langchain.indexes import VectorstoreIndexCreator
from langchain.llms.bedrock import Bedrock

# Wrap the whole ingestion pipeline within a function
# (Python identifiers cannot start with a digit, so "3gpp_index" is renamed to three_gpp_index)
def three_gpp_index():
    #2. Define the data source and load data with PDFLoader
    data_load=PyPDFLoader('https://www.etsi.org/deliver/etsi_ts/129500_129599/129510/16.04.00_60/ts_129510v160400p.pdf')
    #3. Split the Text based on Character, Tokens etc. - Recursively split by character - ["\n\n", "\n", " ", ""]
    data_split=RecursiveCharacterTextSplitter(separators=["\n\n", "\n", " ", ""], chunk_size=100, chunk_overlap=10)
    #4. Create Embeddings -- Client connection
    data_embeddings=BedrockEmbeddings(
        credentials_profile_name='default',
        model_id='amazon.titan-embed-text-v1')
    #5a. Create Vector DB, Store Embeddings and Index for Search - VectorstoreIndexCreator
    data_index=VectorstoreIndexCreator(
        text_splitter=data_split,
        embedding=data_embeddings,
        vectorstore_cls=FAISS)
    #5b. Create index for 3GPP document
    db_index=data_index.from_loaders([data_load])
    return db_index
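One practical note: calling three_gpp_index() re-downloads and re-embeds the whole specification, which is slow. If you want to build the index once and reuse it, a minimal sketch of persisting the FAISS store to disk could look like this; it assumes the wrapper returned by VectorstoreIndexCreator exposes the store via its vectorstore attribute, the path "3gpp_faiss_index" is just an example, and depending on your LangChain version load_local may also require allow_dangerous_deserialization=True:

# Optional: build the index once and persist the FAISS store to disk
if __name__ == "__main__":
    index = three_gpp_index()
    index.vectorstore.save_local("3gpp_faiss_index")   # example path
    # Later runs could reload it instead of re-embedding the whole PDF, e.g.:
    # db = FAISS.load_local("3gpp_faiss_index", BedrockEmbeddings(
    #     credentials_profile_name='default', model_id='amazon.titan-embed-text-v1'))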

4. LLM creation + Context

It's time to connect to a Foundation Model that will process both the query and the retrieved context. I have selected "meta.llama3-8b-instruct-v1:0" as it's an open-source model, and open source == no hidden licence costs ;)

Name of this file: rag_backend.py

#Function to connect to Bedrock Foundation Model - Llama3 Foundation Model
def three_gpp_llm():
    llm=Bedrock(
        credentials_profile_name='default',
        model_id='meta.llama3-8b-instruct-v1:0',
        model_kwargs={
        "max_gen_len": 2048,   # Llama models on Bedrock use max_gen_len rather than max_tokens_to_sample
        "temperature": 0.1,
        "top_p": 0.9})
    return llm
# The following function takes the user prompt, retrieves the best match from the Vector DB and sends both to the LLM.
def three_gpp_rag_response(index,question):
    rag_llm=three_gpp_llm()
    rag_query=index.query(question=question,llm=rag_llm)
    return rag_query
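Before wiring up the front end, you can smoke-test the two backend functions from a terminal. A minimal sketch, assuming both functions live in rag_backend.py as above (the file name quick_test.py and the sample question are just examples):

# quick_test.py -- backend smoke test
from rag_backend import three_gpp_index, three_gpp_rag_response

index = three_gpp_index()   # builds embeddings for the whole spec, so this takes a while
question = "Which standard HTTP headers are mandatory according to this specification?"
print(three_gpp_rag_response(index, question))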

5. FrontEnd and Final Integration

The frontend code below is based on samples provided by AWS and Streamlit. The following modifications were made to align it with our lab:
Name of this file: rag_frontend.py

import streamlit as st 
import rag_backend as demo ### replace rag_backend with your backend filename

st.set_page_config(page_title="3GPP Q and A with RAG") 

new_title = '<p style="font-family:sans-serif; color:Blue; font-size: 30px;">3GPP Chatbot Guru with RAG 🧩</p>'
st.markdown(new_title, unsafe_allow_html=True) 

if 'vector_index' not in st.session_state: 
    with st.spinner("⏳ Please wait for our minions to finish preparing your answer in the back👾👾"): 
        st.session_state.vector_index = demo.three_gpp_index() ### Your Index Function name from Backend File

input_text = st.text_area("Input text", label_visibility="collapsed") 
go_button = st.button("📌Answer this Chatbot Guru", type="primary") ### Button Name

if go_button: 

    with st.spinner("📢Minions are still working 👾👾"): ### Spinner message
        response_content = demo.three_gpp_rag_response(index=st.session_state.vector_index, question=input_text) ### replace with RAG Function from backend file
        st.write(response_content) 

Once the above code is ready, you just need to run it with the command below:
streamlit run rag_frontend.py

Wrap-up

Thank you for joining me on this journey through the exciting potential of generative AI in the telecommunications sector. As I've shown in this short blog entry, using large language models (LLMs) like Llama3 can revolutionize how we interact with complex 3GPP standards, providing rapid, precise answers that empower both technical professionals and business stakeholders.

Whether you're a developer looking to integrate advanced AI capabilities into your applications, or a non-developer curious about leveraging AI to enhance operational efficiency, I encourage you to experiment with LLMs.

Why wait? Start your LLM journey now and unleash the full potential of AI in your personal or professional projects.

Happy Learning!
