Dev J. Shah 🥑

Hands-on: Azure AI Search & AI Foundry for RAG

Index

  1. Introduction
  2. Azure Resources
  3. Code
  4. Clean the Cloud
  5. Conclusion

Introduction

In this lab, we will build a full RAG pipeline using Azure. RAG is a technique where, instead of relying solely on a language model's training data, we first retrieve relevant documents from an external knowledge base and then pass them to the model to generate a more accurate and grounded answer.

To do this, we will use two Azure services: Azure AI Search as the vector database to store and retrieve document embeddings, and Azure AI Foundry to deploy the embedding model and the generation model.
By the end of this lab, you will have a working RAG pipeline running on Azure.
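The retrieve-then-generate flow can be sketched in a few lines. Everything below is a toy stand-in (the keyword-count `embed` and the string-building `generate` are hypothetical placeholders): in the lab, embeddings come from text-embedding-3-small on Azure AI Foundry and retrieval runs inside Azure AI Search.

```python
# A minimal sketch of the RAG flow: retrieve relevant context, then generate.
# `embed`, `similarity`, and `generate` are toy stand-ins for the Azure calls
# we wire up later in the lab.

def embed(text):
    # Toy "embedding": counts of a few keywords. Real embeddings come from
    # text-embedding-3-small deployed on Azure AI Foundry.
    vocab = ["azure", "search", "rag", "embedding"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def similarity(a, b):
    # A plain dot product is enough for this toy; the lab uses cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def retrieve(query, documents, top_k=1):
    # Rank documents by similarity of their embeddings to the query embedding.
    q = embed(query)
    return sorted(documents, key=lambda d: similarity(q, embed(d)), reverse=True)[:top_k]

def generate(query, context):
    # Stand-in for the generation model: shows how retrieved context grounds the answer.
    return f"Answer to {query!r} using context: {context}"

docs = ["Azure AI Search stores embeddings.", "RAG retrieves documents before generating."]
context = retrieve("How does RAG use retrieval?", docs)
print(generate("How does RAG use retrieval?", context))
```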

Azure Resources

Azure AI Search

Azure AI Search is a cloud search service that supports full-text search, filters, and vector search. In this lab, we are using it as a vector database. We store document embeddings in it and query them using cosine similarity to find the most relevant documents for a given input.
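For intuition, cosine similarity is just the normalized dot product of two vectors; Azure AI Search computes this server-side, so this plain-Python version is only an illustration:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 means the vectors point the
    # same way, 0.0 means they are orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```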

To set it up, we need to provision the resource and get two values: VECTOR_SEARCH_ENDPOINT and VECTOR_SEARCH_KEY, which will be used as environment variables.

  1. Go to the Azure Portal and open AI Search.

Screenshot From 2026-03-07 11-48-30

  2. Click Create to create a new search service.

Screenshot From 2026-03-07 11-49-39

  3. Create a new resource group or use an existing one. (Suggestion: create a new one so it's easy to delete the resources later.)
  4. Give the Service name a unique value.
  5. Make sure the Pricing tier is Free, unless you want to try a paid tier.
  6. Finally, click the Review + Create button at the bottom, then click the Create button to create the resource.

Screenshot From 2026-03-07 11-53-43

  7. Next, go to the resource dashboard and copy the Url from Essentials. This Url will be used as VECTOR_SEARCH_ENDPOINT.

Screenshot From 2026-03-07 11-57-59

  8. To get the VECTOR_SEARCH_KEY, go to the Keys tab under the Settings section in the left navbar. From this screen, copy the Primary admin key.

Screenshot From 2026-03-07 11-59-14
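With both values in hand, the service can be queried over REST. As a sketch of what a vector query body looks like (the field name `embedding` is a hypothetical placeholder for whatever your index schema defines, and the exact shape depends on the api-version you target):

```python
def vector_search_payload(query_vector, k=3, vector_field="embedding"):
    # Request body for Azure AI Search's vector query API. "fields" names the
    # vector field in the index schema; "k" is how many nearest neighbors
    # (by cosine similarity) to return.
    return {
        "count": True,
        "vectorQueries": [
            {"kind": "vector", "vector": query_vector, "fields": vector_field, "k": k}
        ],
    }

payload = vector_search_payload([0.1, 0.2, 0.3])
print(payload["vectorQueries"][0]["k"])  # 3
```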

Azure AI Foundry

Azure AI Foundry is a platform for deploying and managing AI models on Azure. It lets you deploy base models (like OpenAI models) as endpoints that you can call from your own code. In this lab, we are using it to deploy two models: text-embedding-3-small to generate embeddings, and a generation model to produce the final answer.

To set it up, we need to provision the resource and get two values: AZURE_OPEN_API_KEY and AZURE_OPEN_API_ENDPOINT, which will be used as environment variables.
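Once you have these values (and the two from Azure AI Search), a common pattern is to read them from environment variables instead of hard-coding them; the defaults below are hypothetical placeholders:

```python
import os

# Read credentials from environment variables. In the Colab notebook used
# later, these come from the Secrets panel; locally, export them first.
# The default values here are hypothetical placeholders, not real endpoints.
AZURE_OPEN_API_ENDPOINT = os.environ.get(
    "AZURE_OPEN_API_ENDPOINT", "https://example.services.ai.azure.com"
)
AZURE_OPEN_API_KEY = os.environ.get("AZURE_OPEN_API_KEY", "replace-with-your-key")

print("Using Foundry endpoint:", AZURE_OPEN_API_ENDPOINT)
```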

  1. Go to Azure AI Foundry.
  2. Click the Create new button to create a new project.
  3. For the resource type, keep the recommended option and click Next.

Screenshot From 2026-03-07 12-21-42

  4. Give it a good name, keep everything else at the default, and click Create.

Screenshot From 2026-03-07 12-33-28

  5. Finally, from the project overview page, copy the API Key to use as AZURE_OPEN_API_KEY and the Microsoft Foundry project endpoint to use as AZURE_OPEN_API_ENDPOINT.

Screenshot From 2026-03-07 12-37-15

  6. Now, from the left navbar, click Models + endpoints under the My assets section.
  7. Click Deploy base model. Now we will deploy an embedding model and a generation model.

Screenshot From 2026-03-07 12-41-27

  8. Search for text-embedding-3-small and click Confirm.

Screenshot From 2026-03-07 12-43-59

  9. Change the Deployment type to Standard and click Deploy to deploy the model. Repeat the last two steps to deploy your generation model.

Screenshot From 2026-03-07 12-45-04
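Once deployed, the model is addressed by its deployment name rather than the underlying model name. As a sanity check on the URL shape, the usual Azure OpenAI REST route for embeddings looks roughly like this (the api-version value is an assumption; in practice the SDK in the notebook builds this URL for you):

```python
def embeddings_url(endpoint, deployment, api_version="2024-02-01"):
    # Azure OpenAI routes requests by *deployment name*, not model name,
    # so the deployment you just created is part of the path.
    return (
        f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
        f"/embeddings?api-version={api_version}"
    )

url = embeddings_url("https://example.openai.azure.com/", "text-embedding-3-small")
print(url)
```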


Code

  1. Open this notebook in Google Colab.
  2. Add the following environment variables by clicking the key button (Secrets) in Colab's left sidebar, and grant them notebook access:
    • AZURE_OPEN_API_ENDPOINT
    • AZURE_OPEN_API_KEY
    • VECTOR_SEARCH_ENDPOINT
    • VECTOR_SEARCH_KEY

Screenshot From 2026-03-07 13-01-29

  3. Finally, you can run the commands in the notebook.

Clean the Cloud

  1. On the Azure Portal, go to All Resources and delete all the resources we created for this lab.

Screenshot From 2026-03-07 13-05-17


Conclusion

In this lab, we set up a full RAG pipeline on Azure using Azure AI Search as the vector database and Azure AI Foundry to deploy the embedding and generation models. Thanks for reading! If you want to understand the math behind how the retrieval step works, check out my other blog on the Math behind Embeddings and Cosine Similarity.
