Index
- Introduction
- Azure Resources
- Code
- Do not forget to Clean the Cloud
- Conclusion
Introduction
In this lab, we will build a full RAG pipeline using Azure. RAG is a technique where, instead of relying solely on a language model's training data, we first retrieve relevant documents from an external knowledge base and then pass them to the model to generate a more accurate and grounded answer.
To do this, we will use two Azure services: Azure AI Search as the vector database to store and retrieve document embeddings, and Azure AI Foundry to deploy the embedding model and the generation model.
By the end of this lab, you will have a working RAG pipeline running on Azure.
Azure Resources
Azure AI Search
Azure AI Search is a cloud search service that supports full-text search, filters, and vector search. In this lab, we are using it as a vector database. We store document embeddings in it and query them using cosine similarity to find the most relevant documents for a given input.
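Cosine similarity scores two vectors by the angle between them, ignoring magnitude. A minimal sketch of the scoring function (illustrative only, not the service's internals):

```python
from math import sqrt

def cosine_similarity(a, b):
    # cosine(a, b) = dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # -> ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # -> 0.0
```

During retrieval, the query embedding is compared against every stored document embedding this way, and the top-scoring documents are returned.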
To set it up, we need to provision the resource and get two values: VECTOR_SEARCH_ENDPOINT and VECTOR_SEARCH_KEY, which will be used as environment variables.
- Go to the Azure Portal and open `AI Search`.
- Click `Create` to create a new search service.
- Create a new resource group or use an existing one. (Suggestion: create a new one so it's easy to delete the resources later.)
- Give a unique name for `Service name`.
- Make sure the `Pricing tier` is Free, unless you want to try a paid tier.
- Finally, click the `Review + Create` button at the bottom, then click `Create` to create the resource.
- Next, go to the resource dashboard and copy the `Url` from `Essentials`. This `Url` will be used as `VECTOR_SEARCH_ENDPOINT`.
- To get the `VECTOR_SEARCH_KEY`, go to the `Keys` tab under the `Settings` section in the left navbar and copy the `Primary admin key`.
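With those two values in hand, a retrieval query is a single REST call to the search service. A minimal sketch, assuming an index named `rag-index` with a vector field called `embedding` (both names are illustrative, not something the portal created for you):

```python
import json
import urllib.request

VECTOR_SEARCH_ENDPOINT = "https://<your-service>.search.windows.net"  # the Url you copied
VECTOR_SEARCH_KEY = "<your-primary-admin-key>"

def build_vector_query(query_vector, k=3):
    # Request body for Azure AI Search's vector query: return the k
    # documents whose `embedding` field is nearest to query_vector.
    return {
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": query_vector,
                "fields": "embedding",
                "k": k,
            }
        ],
        "select": "content",
    }

def search(query_vector, index_name="rag-index"):
    # POST the vector query to the index's search endpoint.
    url = (f"{VECTOR_SEARCH_ENDPOINT}/indexes/{index_name}"
           f"/docs/search?api-version=2023-11-01")
    req = urllib.request.Request(
        url,
        data=json.dumps(build_vector_query(query_vector)).encode(),
        headers={"Content-Type": "application/json",
                 "api-key": VECTOR_SEARCH_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["value"]
```

The notebook uses the Azure SDK rather than raw HTTP, but the request shape is the same either way.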
Azure AI Foundry
Azure AI Foundry is a platform for deploying and managing AI models on Azure. It lets you deploy base models (like OpenAI models) as endpoints that you can call from your own code. In this lab, we are using it to deploy two models: text-embedding-3-small to generate embeddings, and a generation model to produce the final answer.
To set it up, we need to provision the resource and get two values: AZURE_OPEN_API_KEY and AZURE_OPEN_API_ENDPOINT, which will be used as environment variables.
- Go to the Azure AI Foundry portal.
- Click the `Create new` button to create a new project.
- For the resource type, keep the recommended option and click `Next`.
- Give it a good name, keep everything else at the defaults, and click `Create`.
- Finally, from the project overview page, copy the `API Key` to use as `AZURE_OPEN_API_KEY` and the `Microsoft Foundry project endpoint` to use as `AZURE_OPEN_API_ENDPOINT`.
- Now, from the left navbar, click `Models + endpoints` under the `My assets` section.
- Click `Deploy base model`. We will deploy an embedding model and a generation model.
- Search for `text-embedding-3-small` and click `Confirm`.
- Change the `Deployment type` to `Standard` and click `Deploy` to deploy the model.
- Repeat the last two steps to deploy a generation model of your choice.
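Once deployed, each model is reachable at a REST endpoint keyed by its deployment name. A minimal sketch of calling the embedding deployment (the endpoint format and `api-version` are assumptions based on the standard Azure OpenAI REST shape; adjust them to match your project):

```python
import json
import urllib.request

AZURE_OPEN_API_ENDPOINT = "https://<your-resource>.openai.azure.com"  # illustrative
AZURE_OPEN_API_KEY = "<your-api-key>"

def embedding_payload(text):
    # Body for the embeddings endpoint: just the text to embed.
    return {"input": text}

def embed(text, deployment="text-embedding-3-small"):
    # Call the deployed embedding model and return its vector.
    url = (f"{AZURE_OPEN_API_ENDPOINT}/openai/deployments/{deployment}"
           f"/embeddings?api-version=2024-02-01")
    req = urllib.request.Request(
        url,
        data=json.dumps(embedding_payload(text)).encode(),
        headers={"Content-Type": "application/json",
                 "api-key": AZURE_OPEN_API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"][0]["embedding"]
```

The generation deployment is called the same way, just against its `chat/completions` path instead of `embeddings`.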
Code
- Open this notebook in Google Colab.
- Add the following environment variables by clicking the key (Secrets) button in the left sidebar, and grant them notebook access:
  - `AZURE_OPEN_API_ENDPOINT`
  - `AZURE_OPEN_API_KEY`
  - `VECTOR_SEARCH_ENDPOINT`
  - `VECTOR_SEARCH_KEY`
- Finally, run the cells in the notebook.
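The notebook's RAG loop boils down to three steps: embed the question, retrieve the nearest documents, and ask the generation model with those documents pasted in as context. A minimal sketch of the prompt-assembly step (the function name and prompt wording are illustrative, not the notebook's exact code):

```python
def build_rag_prompt(question, documents):
    # Ground the model by placing the retrieved documents in the prompt
    # ahead of the user's question.
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "What is RAG?",
    ["RAG retrieves documents before generating.",
     "It grounds answers in sources."],
)
print(prompt)
```

This prompt then goes to the generation deployment, which answers from the retrieved context instead of relying only on its training data.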
Do Not Forget to Clean the Cloud
- On the Azure Portal, go to `All resources` and delete all the resources we created for this lab, so you are not billed for resources you no longer use.
Conclusion
In this lab, we set up a full RAG pipeline on Azure using Azure AI Search as the vector database and Azure AI Foundry to deploy the embedding and generation models. Thanks for reading! If you want to understand the math behind how the retrieval step works, check out my other blog on the Math behind Embeddings and Cosine Similarity.