Azure AI Search — Powering RAG Applications

Build semantic search and RAG applications with Azure AI Search — vector search, hybrid search, and AI enrichment.

Slide 1 / 9

Azure AI Search

Semantic & Vector Search — The Foundation of RAG
Add as a Connection in Azure AI Foundry for RAG apps
Azure AI & Machine Learning — Episode 20

Speaker Script

“Welcome back. Today we're covering Azure AI Search (formerly Azure Cognitive Search), which has evolved from a traditional search engine into the critical data layer for AI applications. If you want to build a RAG system that lets users chat with your documents, Azure AI Search is where your documents live and how the relevant passages get retrieved. In Azure AI Foundry you add AI Search as a named connection, and Prompt Flow can then use that connection for its retrieval step. Understanding AI Search is fundamental to building enterprise AI applications.”

Slide 2 / 9

What is Azure AI Search?

Fully managed search service
Index any structured or unstructured content
Multiple search modes: keyword, semantic, vector, hybrid
AI enrichment — extract structure from unstructured data during indexing
Scales to billions of documents

Speaker Script

“Azure AI Search is a fully managed search service that indexes your content and makes it searchable. You can index documents from Azure Blob Storage, Azure SQL, Cosmos DB, or push data directly via API. What makes it modern is its support for multiple search paradigms — traditional keyword search, semantic ranking using language models, and vector search for meaning-based similarity. For AI applications, hybrid search combining keyword and vector gives the best results.”
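
As a rough illustration of how a hybrid query can merge a keyword ranking with a vector ranking, here is a minimal Reciprocal Rank Fusion (RRF) sketch in plain Python. Azure AI Search documents RRF as the merge step for hybrid queries; the function name, the document ids, and the constant k = 60 below are illustrative assumptions, not the service's internals.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists for the same query.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-c", "doc-a", "doc-d"]
fused = rrf_fuse([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (doc-a and doc-c here) rise above documents that appear in only one, which is the intuition behind combining the two modes.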

Slide 3 / 9

Keyword vs Semantic vs Vector Search

Keyword — exact word matching (traditional BM25)
Semantic — re-ranks results using language model for relevance
Vector — finds semantically similar content using embeddings
Hybrid — combines keyword + vector for best of both worlds
Vector search finds meaning, not just words

Speaker Script

“Traditional keyword search finds documents containing the exact words you searched for. If you search 'car maintenance', a pure keyword match won't return documents about 'vehicle servicing' even though they describe the same thing. Semantic ranking uses a language model to re-rank the initial results by how well they match the meaning of the query. Vector search converts your query and all documents into numerical vectors using an embedding model. Documents with similar meaning end up close together in vector space, so a search for 'car maintenance' finds 'vehicle servicing' too.”
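
The vocabulary gap described above can be reproduced with a toy exact-token matcher. This is a deliberately simplified sketch: it only checks for shared words and does no BM25 scoring, and the tokenizer, function names, and sample documents are all made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase and split on non-letters; a stand-in for a real analyzer."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_match(query, documents):
    """Return documents sharing at least one exact token with the query."""
    query_terms = tokenize(query)
    return [doc for doc in documents if query_terms & tokenize(doc)]


documents = [
    "Car maintenance schedule for 2024 models",
    "Vehicle servicing checklist",
    "Office cafeteria menu",
]
hits = keyword_match("car maintenance", documents)
print(hits)
```

Only the first document is returned. "Vehicle servicing checklist" shares no token with the query, so an exact-word matcher cannot see it.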

Slide 4 / 9

Vector Embeddings Explained

Embedding = array of numbers representing text meaning
Similar text → similar vectors (close in vector space)
Generated by embedding model (text-embedding-3-large)
Stored in your search index as a vector field
Query text is also embedded → find nearest vectors

Speaker Script

“Embeddings are the mathematical representation of text meaning. An embedding model, like OpenAI's text-embedding-3-large, converts a piece of text into an array of numbers: 3,072 by default for that model. Texts with similar meaning produce similar number arrays. When you index your documents, you generate embeddings for each chunk and store them. At query time, you generate an embedding for the user's question and find the document chunks whose embeddings are closest. Those are the most relevant passages.”
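
To make "closest" concrete, here is a small sketch that ranks chunks by cosine similarity, one common closeness measure for embeddings. The three-dimensional vectors are invented for readability and do not come from any embedding model; a real index would hold vectors with thousands of dimensions and use an approximate nearest-neighbor structure rather than a full sort.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors; real embeddings have far more dimensions.
chunk_vectors = {
    "vehicle servicing": [0.9, 0.1, 0.0],
    "car maintenance tips": [0.8, 0.2, 0.1],
    "cafeteria menu": [0.0, 0.1, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend embedding of "car maintenance"

# Sort chunk names from most to least similar to the query.
ranked = sorted(
    chunk_vectors,
    key=lambda name: cosine_similarity(query_vector, chunk_vectors[name]),
    reverse=True,
)
print(ranked)
```

The two automotive chunks land at the top and the cafeteria chunk lands last, even though "vehicle servicing" shares no word with the query.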

Slide 5 / 9

Building a Search Index

Index = the searchable data store
Fields: id, title, content, category (searchable/filterable)
Vector field: contentVector (dimensions matching embedding model)
Indexer — automatic crawling from Azure data sources
Chunking — split large documents into searchable passages

Speaker Script

“A search index is like a database table optimized for search. You define fields — title, content, category — and mark which are searchable, filterable, or sortable. For vector search, add a vector field configured with the right dimensions for your embedding model. An indexer automatically crawls data sources — Azure Blob Storage, SQL Database, Cosmos DB — and updates the index on a schedule. Large documents must be chunked into smaller passages before indexing — typically 500-1000 tokens per chunk, with some overlap.”
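
Chunking with overlap can be sketched in a few lines. This version counts whitespace-separated words as a stand-in for tokens, which is an assumption made to keep the example dependency-free; a real pipeline would count tokens with the embedding model's tokenizer. The function name and default sizes are illustrative.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into overlapping chunks, counting whitespace-separated words.

    Words stand in for tokens here; a real pipeline would count tokens with
    the embedding model's tokenizer.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Ten placeholder words, chunked four at a time with one word of overlap.
sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)
```

Each chunk repeats the last word of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.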

Slide 6 / 9

RAG Architecture with AI Search

1. Ingest: chunk documents → embed → index in AI Search
2. Query: embed user question → hybrid (keyword + vector) search → top K chunks
3. Augment: include top K chunks in GPT-4o prompt
4. Generate: GPT-4o answers using chunks as context
5. Cite: include source references in response

Speaker Script

“The complete RAG architecture has two phases. Ingestion: split your documents into chunks, generate embeddings for each chunk using Azure OpenAI, and store both the text and vectors in Azure AI Search. Query: when a user asks a question, embed the question, run a hybrid search to find the most relevant chunks, include those chunks in a GPT-4o prompt, and return the generated answer with citations. This pattern lets users chat with any document corpus — manuals, contracts, research papers, knowledge bases.”
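
The "augment" step is mostly string assembly. The sketch below shows one possible way to place retrieved chunks into a prompt with numbered sources so the model can cite them. The prompt wording, the chunk dictionary keys, and the sample chunks are assumptions for illustration; the call to the chat model itself is left out.

```python
def build_rag_prompt(question, chunks):
    """Format retrieved chunks as numbered sources followed by the question."""
    sources = "\n".join(
        f"[{i}] ({chunk['source']}) {chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by their bracketed number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


# Hypothetical top-K chunks as a retrieval step might return them.
top_chunks = [
    {"source": "manual.pdf", "text": "Change the oil every 10,000 km."},
    {"source": "faq.md", "text": "Tyre pressure should be checked monthly."},
]
prompt = build_rag_prompt("How often should I change the oil?", top_chunks)
print(prompt)
```

Keeping the source name next to each numbered chunk is what makes the final "cite" step possible: the bracketed numbers in the answer map back to documents.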

Slide 7 / 9

AI Enrichment During Indexing

Skillsets — apply AI transformations during indexing
OCR — extract text from scanned PDFs and images
Entity extraction — find names, locations, organizations
Key phrase extraction — surface main topics
Language detection — multilingual document handling
Custom skills — call any external API during indexing

Speaker Script

“AI Search can apply AI transformations to your content during indexing using Skillsets. OCR extracts text from scanned PDFs or images — making even old paper-based documents searchable. Entity extraction automatically identifies people, places, and organizations in documents. Key phrase extraction surfaces the main topics. These enriched fields are stored in the index alongside the original content, dramatically improving search relevance and enabling new query patterns.”
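
For the custom skills bullet, here is a minimal sketch of the request and response envelope a custom Web API skill exchanges with the indexer: a "values" array of records, each with a "recordId" and a "data" object. The word-count logic and the "text" and "wordCount" names are hypothetical; only the envelope shape follows the documented contract, and the HTTP hosting layer is omitted.

```python
def run_custom_skill(request_body):
    """Handle one custom-skill batch: count words in each record's text.

    Follows the values/recordId/data envelope documented for custom Web API
    skills; the "text" input and "wordCount" output names are hypothetical.
    """
    results = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text")
        result = {
            "recordId": record["recordId"],
            "data": {},
            "errors": [],
            "warnings": [],
        }
        if isinstance(text, str):
            result["data"]["wordCount"] = len(text.split())
        else:
            # Report the problem per record instead of failing the batch.
            result["errors"].append({"message": "Missing 'text' input."})
        results.append(result)
    return {"values": results}


request = {
    "values": [
        {"recordId": "1", "data": {"text": "Vehicle servicing checklist"}},
        {"recordId": "2", "data": {}},
    ]
}
response = run_custom_skill(request)
print(response)
```

Each output record echoes the incoming recordId, which is how the indexer matches enriched data back to the right document.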

Slide 8 / 9

Live Azure Demo

Create AI Search resource
Define an index with vector field
Index sample documents with embeddings
Run keyword, vector, and hybrid searches
Demonstrate RAG: search → prompt → GPT-4o answer

Speaker Script

“Let's build a search index from scratch. I'll create an AI Search resource, define an index with both text and vector fields, upload and index some sample documents, then compare the results of keyword search versus vector search on the same query. Finally, I'll wire it up to Azure OpenAI to demonstrate the complete RAG flow — user question to search to GPT-4o answer with document citations.”
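
An index with text and vector fields might be defined roughly as follows. This is a sketch of the JSON body shape used by recent stable versions of the REST API, written as a Python dictionary; the index name, profile name, and algorithm name are made up, and 3,072 assumes text-embedding-3-large at its default output size. The small checker function is a hypothetical helper, not part of any SDK.

```python
import json

EMBEDDING_DIMENSIONS = 3072  # default output size of text-embedding-3-large

# Hypothetical index definition; names are illustrative.
index_definition = {
    "name": "docs-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        {"name": "title", "type": "Edm.String", "searchable": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "category", "type": "Edm.String", "filterable": True},
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": True,
            "dimensions": EMBEDDING_DIMENSIONS,
            "vectorSearchProfile": "default-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-config", "kind": "hnsw"}],
        "profiles": [{"name": "default-profile", "algorithm": "hnsw-config"}],
    },
}


def vector_field_problems(index, expected_dimensions):
    """Return a list of problems found in the index's vector fields."""
    profiles = {
        p["name"] for p in index.get("vectorSearch", {}).get("profiles", [])
    }
    problems = []
    for field in index["fields"]:
        if field["type"] != "Collection(Edm.Single)":
            continue  # not a vector field
        if field.get("dimensions") != expected_dimensions:
            problems.append(f"{field['name']}: dimensions mismatch")
        if field.get("vectorSearchProfile") not in profiles:
            problems.append(f"{field['name']}: unknown vector profile")
    return problems


print(json.dumps(index_definition, indent=2))
print(vector_field_problems(index_definition, EMBEDDING_DIMENSIONS))
```

Checking the dimensions before uploading documents is worthwhile because a vector whose length differs from the field definition cannot be stored in that field.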

Slide 9 / 9

Summary & What's Next

✅ AI Search — managed search engine for AI applications
✅ Vector search — meaning-based similarity using embeddings
✅ Hybrid search — best results for RAG retrieval
✅ RAG pattern: ingest → embed → index → search → generate
✅ AI enrichment — OCR, entities, key phrases during indexing
Next: Azure Machine Learning — Build Custom AI Models →

Speaker Script

“Azure AI Search is the search backbone of enterprise AI. Paired with Azure OpenAI, it enables powerful RAG applications that let users interact naturally with any document corpus. Next video we go deep on Azure Machine Learning — for when pre-built AI APIs aren't enough and you need to train custom models on your own data. This is where data science meets cloud engineering.”

🖥️ Azure Demo Steps

1. Create an Azure AI Search resource in the Azure Portal
2. In Azure AI Foundry (ai.azure.com), add AI Search as a connection
3. Create an index with title, content, and vector fields
4. Upload and index sample documents
5. Run a keyword search query
6. Run a vector similarity search
7. Run a hybrid search (keyword + vector)
8. Use AI Foundry Prompt Flow to wire AI Search + OpenAI for RAG
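
The three query steps in the list above differ only in which parts of the search request are filled in. The sketch below builds request bodies in the shape used by recent stable versions of the Search Documents REST API ("search" for text, "vectorQueries" for vectors). The helper name, the field name contentVector, and the short placeholder vector are assumptions; no HTTP call is made.

```python
import json


def build_search_body(text=None, vector=None, vector_field="contentVector", top=5):
    """Build a search request body for keyword, vector, or hybrid queries."""
    body = {"top": top, "select": "id,title,content"}
    if text is not None:
        body["search"] = text  # keyword part of the query
    if vector is not None:
        body["vectorQueries"] = [
            {
                "kind": "vector",
                "vector": list(vector),
                "fields": vector_field,
                "k": top,
            }
        ]
    return body


# Placeholder vector; a real one comes from the embedding model.
query_vector = [0.1, 0.2, 0.3]

keyword_body = build_search_body(text="car maintenance")
vector_body = build_search_body(vector=query_vector)
hybrid_body = build_search_body(text="car maintenance", vector=query_vector)
print(json.dumps(hybrid_body, indent=2))
```

Sending both "search" and "vectorQueries" in one request is what makes a query hybrid; leaving either one out gives the pure keyword or pure vector variant.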