The Designing and Implementing a Microsoft Azure AI Solution (AI-102) exam is for developers who build AI into apps. It is not a data science exam (that's DP-100). It is about APIs, SDKs, architecture, and security.
If you are preparing for AI-102, this cheat sheet covers the critical services, decision matrices, and implementation details you need to know.
1. AI Architecture & Security
Before you code, you must secure the resource.
Resource Types
- Multi-Service Resource: Access to Vision, Language, Search, etc., with a single key/endpoint. Good for development/prototyping.
- Single-Service Resource: Access to only one service (e.g., "Computer Vision" resource). Required if you need fine-grained cost tracking or specific tiers (e.g., Free Tier F0).
Authentication
- Subscription Keys: Easiest, but least secure. Each resource has two keys, so you can regenerate one while the app uses the other and avoid downtime. Keep keys in Key Vault rather than in code.
- Managed Identity: The "Best Practice." No keys in code. Assign an RBAC role (e.g., `Cognitive Services User`) to the VM/App Service.
- Key Vault: Store keys here and access via Secret URI.
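To make the two authentication styles concrete, here is a minimal sketch of the HTTP header each one produces on a raw REST call. `Ocp-Apim-Subscription-Key` is the key header Azure AI services accept; the helper name `build_auth_headers` and the placeholder values are assumptions for illustration. In a real app the bearer token would come from the managed identity (for example through the Azure Identity library) rather than being passed in by hand.

```python
from typing import Optional


def build_auth_headers(key: Optional[str] = None, token: Optional[str] = None) -> dict:
    """Return request headers for key-based or token-based auth.

    Hypothetical helper: exactly one of `key` or `token` must be supplied.
    """
    if (key is None) == (token is None):
        raise ValueError("Supply exactly one of key or token")
    if key is not None:
        # Subscription key auth: simple, but the secret lives in your config.
        return {"Ocp-Apim-Subscription-Key": key}
    # Managed identity auth: the token is short-lived and no key is stored.
    return {"Authorization": f"Bearer {token}"}


print(build_auth_headers(key="<your-key>"))
print(build_auth_headers(token="<token-from-managed-identity>"))
```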
Containers (Docker)
- Use Case: Compliance (data cannot leave premises) or low-latency edge computing.
- Billing: Containers are not free. They must reach the Azure billing endpoint to report usage (roughly every 10 to 15 minutes) and stop serving requests if they cannot connect.
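As a rough sketch of what the billing requirement means in practice, Azure AI containers are started with three settings: `Eula`, `Billing` (the resource endpoint), and `ApiKey`. The helper below only assembles the `docker run` argument list. The function name, the port mapping, and the placeholder image and endpoint values are assumptions.

```python
def docker_run_args(image: str, billing_endpoint: str, api_key: str, port: int = 5000) -> list:
    """Assemble a `docker run` command for an Azure AI container (illustrative)."""
    return [
        "docker", "run", "--rm",
        "-p", f"{port}:5000",           # the documented examples map container port 5000
        image,
        "Eula=accept",                  # license acceptance is mandatory
        f"Billing={billing_endpoint}",  # endpoint the container reports usage to
        f"ApiKey={api_key}",            # key of the Azure resource being billed
    ]


args = docker_run_args(
    "<container-image>",
    "https://<resource>.cognitiveservices.azure.com/",
    "<key>",
)
print(" ".join(args))
```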
2. Computer Vision & Custom Vision
Knowing which service to pick is 50% of the battle.
The Decision Matrix
| Service | Use Case | Example |
|---|---|---|
| Computer Vision | Pre-trained models. Generic analysis. | "Describe this image." "Is there a dog?" "Generate a thumbnail." |
| Custom Vision | You bring your own training data. Specific domains. | "Is this my specific product?" "Is this screw defective?" |
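| Face API | Human face analysis. | Face detection, identity verification. (Older material lists emotion detection and age estimation; Microsoft has since retired those attributes for general use.) |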
Custom Vision Types
- Classification: "What is this image?" (Output: Tag `Toaster` 98%).
  - Multiclass: One tag per image (Dog OR Cat).
  - Multilabel: Multiple tags per image (Dog AND Grass AND Ball).
- Object Detection: "Where is the object?" (Output: Bounding Box coordinates `[x,y,w,h]` + Label).
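To show what an object detection result looks like once you consume it, here is a small sketch that filters predictions by probability and scales the bounding boxes to pixels. It assumes the prediction JSON carries `tagName`, `probability`, and a `boundingBox` with normalized `left`/`top`/`width`/`height` values. The sample data is hand-written, and `to_pixel_boxes` is a hypothetical helper.

```python
def to_pixel_boxes(predictions, image_width, image_height, min_probability=0.5):
    """Keep confident detections and scale normalized boxes to pixels."""
    results = []
    for p in predictions:
        if p["probability"] < min_probability:
            continue  # drop low-confidence detections
        box = p["boundingBox"]
        results.append({
            "tag": p["tagName"],
            "probability": p["probability"],
            "x": round(box["left"] * image_width),
            "y": round(box["top"] * image_height),
            "w": round(box["width"] * image_width),
            "h": round(box["height"] * image_height),
        })
    return results


# Hand-written sample shaped like a prediction response (not real output).
sample = [
    {"tagName": "Toaster", "probability": 0.98,
     "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25}},
    {"tagName": "Kettle", "probability": 0.12,
     "boundingBox": {"left": 0.0, "top": 0.0, "width": 0.5, "height": 0.5}},
]
print(to_pixel_boxes(sample, 800, 600))
```

The threshold matters on the exam as well as in code: the service returns every candidate, and it is your app that decides which probabilities count.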
The "Read" API (OCR)
Standard OCR is synchronous. The Read API is asynchronous for large documents (PDFs/Images).
- POST request to the `analyze` endpoint.
- Receive `202 Accepted` + `Operation-Location` header.
- GET request to `Operation-Location` in a loop until status is `succeeded`.
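The polling step above can be sketched as a small loop. To keep the example runnable without a network call, the HTTP GET is injected as a function: `get_status` is a hypothetical callable that fetches the `Operation-Location` URL and returns the parsed JSON body. The status values `notStarted`, `running`, `succeeded`, and `failed` are the ones the Read operation reports.

```python
import time


def poll_until_done(get_status, interval_seconds=1.0, max_attempts=30, sleep=time.sleep):
    """Poll an async operation until it succeeds, fails, or times out."""
    for _ in range(max_attempts):
        body = get_status()
        status = body.get("status")
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError("Read operation failed")
        sleep(interval_seconds)  # still notStarted/running: wait, then retry
    raise TimeoutError("Operation did not finish in time")


# Simulated service responses so the sketch runs offline.
responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"readResults": []}},
])
result = poll_until_done(lambda: next(responses), sleep=lambda s: None)
print(result["status"])
```

In a real client, `get_status` would send the same auth header as the original POST, and the interval would respect any throttling limits on your tier.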
3. Natural Language Processing (Language Service)
The consolidation of LUIS and Text Analytics.
Conversational Language Understanding (CLU)
Replaces LUIS.
- Intent: What the user wants to do (e.g., `BookFlight`).
- Entity: The parameters of the action (e.g., `Paris`, `Tomorrow`).
- Utterance: What the user actually said (e.g., "Fly me to Paris tomorrow").
- None Intent: Crucial for handling garbage input. Always required.
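Here is a minimal sketch of how an app might act on a CLU prediction, including the fallback to `None`. It assumes the prediction object exposes `topIntent`, an `intents` list with `category` and `confidenceScore`, and an `entities` list with `category` and `text`. The sample is hand-written, and the `route` helper, the entity category names, and the 0.6 threshold are illustrative choices.

```python
def route(prediction, min_confidence=0.6):
    """Return (intent, entities), falling back to 'None' on weak matches."""
    scores = {i["category"]: i["confidenceScore"] for i in prediction["intents"]}
    top = prediction["topIntent"]
    if scores.get(top, 0.0) < min_confidence:
        top = "None"  # treat low-confidence input as unrecognized
    entities = {e["category"]: e["text"] for e in prediction["entities"]}
    return top, entities


# Hand-written sample for "Fly me to Paris tomorrow" (not real service output).
prediction = {
    "topIntent": "BookFlight",
    "intents": [
        {"category": "BookFlight", "confidenceScore": 0.93},
        {"category": "None", "confidenceScore": 0.04},
    ],
    "entities": [
        {"category": "Destination", "text": "Paris"},
        {"category": "Date", "text": "tomorrow"},
    ],
}
print(route(prediction))
```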
Key Capabilities
- Sentiment Analysis: Returns confidence scores (0 to 1) for Positive, Neutral, Negative.
- Key Phrase Extraction: Pulls main points ("The food was good but the service was slow" -> "food", "service").
- Entity Linking: Connects words to Wikipedia/Knowledge Graph (e.g., "Venus" -> Planet vs. "Venus" -> Goddess).
Translator Service
- Translate: Text to Text.
- Transliterate: Convert script (e.g., Japanese characters to Latin alphabet) without translating the meaning.
- Profanity Filtering: Settings: `NoAction`, `Marked` (***), or `Deleted`.
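The difference between the operations shows up clearly in the request URLs. The sketch below builds them for the Translator v3 REST API using the global endpoint and the `profanityAction`, `fromScript`, and `toScript` query parameters. The helper names are assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode

BASE = "https://api.cognitive.microsofttranslator.com"
PROFANITY_ACTIONS = ("NoAction", "Marked", "Deleted")


def translate_url(to, profanity_action="NoAction"):
    """Build a /translate URL (text to text) with a profanity setting."""
    if profanity_action not in PROFANITY_ACTIONS:
        raise ValueError(f"Unknown profanityAction: {profanity_action}")
    params = {"api-version": "3.0", "to": to, "profanityAction": profanity_action}
    return f"{BASE}/translate?" + urlencode(params)


def transliterate_url(language, from_script, to_script):
    """Build a /transliterate URL (script conversion, meaning unchanged)."""
    params = {"api-version": "3.0", "language": language,
              "fromScript": from_script, "toScript": to_script}
    return f"{BASE}/transliterate?" + urlencode(params)


print(translate_url("fr", "Marked"))
print(transliterate_url("ja", "Jpan", "Latn"))
```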
4. Knowledge Mining (Azure AI Search)
Formerly Azure Cognitive Search (and, before that, Azure Search). The most complex topic on the exam.
The Enrichment Pipeline
- Data Source: Where data lives (SQL, Blob, Cosmos).
- Indexer: The engine that crawls the data.
- Skillset: The AI processing steps (OCR, Translate, Entity Extraction).
  - Built-in Skills: Microsoft provided.
  - Custom Skills: Call a Function App/Web API for custom logic.
- Index: The searchable JSON document store.
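The four pieces are wired together by name: the indexer definition points at the data source, the skillset, and the target index. The sketch below shows trimmed, hand-written definitions as Python dicts plus a hypothetical `check_wiring` helper that confirms the references line up. All resource names are placeholders, and real definitions need more properties (credentials, field mappings, and so on).

```python
# Trimmed, hand-written definitions; every name is a placeholder.
data_source = {"name": "docs-ds", "type": "azureblob", "container": {"name": "docs"}}

skillset = {
    "name": "docs-skills",
    "skills": [{
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",  # built-in OCR skill
        "context": "/document/normalized_images/*",
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": "text", "targetName": "text"}],
    }],
}

index = {
    "name": "docs-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
    ],
}

indexer = {
    "name": "docs-indexer",
    "dataSourceName": "docs-ds",
    "skillsetName": "docs-skills",
    "targetIndexName": "docs-index",
}


def check_wiring(indexer, data_source, skillset, index):
    """Return the indexer references that do not match (empty means consistent)."""
    problems = []
    if indexer["dataSourceName"] != data_source["name"]:
        problems.append("dataSourceName")
    if indexer.get("skillsetName") != skillset["name"]:
        problems.append("skillsetName")
    if indexer["targetIndexName"] != index["name"]:
        problems.append("targetIndexName")
    return problems


print(check_wiring(indexer, data_source, skillset, index))
```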
Key Concepts
- Push vs. Pull:
  - Pull: Indexer crawls data (SQL/Blob).
  - Push: Your code pushes JSON directly to the index (good for data sources that indexers do not support).
- Knowledge Store: Saves the enriched data (e.g., the text extracted from images) into tables/blobs for other apps to use (like Power BI).
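For the push model, the request body is a batch of documents, each tagged with an `@search.action` (`upload`, `merge`, `mergeOrUpload`, or `delete`). The sketch below only builds that payload. `build_push_batch` is a hypothetical helper, and the document fields are made up for illustration.

```python
import json

ALLOWED_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def build_push_batch(docs, action="mergeOrUpload"):
    """Wrap documents in the batch shape used when pushing to an index."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"Unknown action: {action}")
    return {"value": [{"@search.action": action, **doc} for doc in docs]}


batch = build_push_batch([
    {"id": "1", "category": "Luxury", "name": "Grand Hotel"},
    {"id": "2", "category": "Budget", "name": "Roadside Inn"},
])
print(json.dumps(batch, indent=2))
```

Your code would POST this JSON to the index's document endpoint. Nothing in the service schedules it for you, which is the practical difference from an indexer.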
Search Syntax
- Full Text: `search=run` (Finds "run", "running", "runner").
- OData Filter: `$filter=category eq 'Luxury'` (Exact match).
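Both parameters can appear in the same query string. The sketch below assembles one and shows the usual OData rule of doubling single quotes inside string literals. The helper names `odata_eq` and `search_query` are assumptions.

```python
from urllib.parse import parse_qs, urlencode


def odata_eq(field, value):
    """Build an exact-match OData filter, escaping embedded single quotes."""
    escaped = value.replace("'", "''")
    return f"{field} eq '{escaped}'"


def search_query(text, filter_expr=None):
    """Return a URL-encoded query string with search text and optional filter."""
    params = {"search": text}
    if filter_expr:
        params["$filter"] = filter_expr
    return urlencode(params)


query = search_query("run", odata_eq("category", "Luxury"))
print(query)
print(parse_qs(query))  # decoded view of the same parameters
```

A useful way to remember the split: `search` is scored and analyzed, while `$filter` is a yes/no test applied to filterable fields.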
5. Document Intelligence (Form Recognizer)
Extracting Key-Value pairs from documents.
Model Types
- Read: Just text extraction (OCR).
- General Document: Pre-trained for common structures.
- Prebuilt: Invoices, Receipts, ID Cards, Tax Forms (W2).
- Custom Template: You label data based on visual layout.
  - Requirements: Minimum 5 examples of the same layout to train.
- Custom Neural: Understands complex, unstructured documents.
  - Trade-offs: Slower to train, more expensive.
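Whichever model you pick, the app ends up reading structured JSON. As an illustration, the sketch below flattens key-value pairs from an analyze result, assuming a `keyValuePairs` list whose items carry `key.content`, an optional `value.content`, and a `confidence`. The sample is hand-written, and `extract_pairs` is a hypothetical helper.

```python
def extract_pairs(analyze_result, min_confidence=0.5):
    """Flatten key-value pairs into a dict, skipping low-confidence ones."""
    pairs = {}
    for kv in analyze_result.get("keyValuePairs", []):
        if kv.get("confidence", 0.0) < min_confidence:
            continue
        key = kv["key"]["content"]
        # A key can be detected without a value (e.g., an empty form field).
        value = kv.get("value", {}).get("content")
        pairs[key] = value
    return pairs


# Hand-written sample shaped like an analyze result (not real output).
sample_result = {
    "keyValuePairs": [
        {"key": {"content": "Invoice No"}, "value": {"content": "INV-001"}, "confidence": 0.95},
        {"key": {"content": "PO Number"}, "confidence": 0.80},
        {"key": {"content": "Smudge"}, "value": {"content": "??"}, "confidence": 0.10},
    ]
}
print(extract_pairs(sample_result))
```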
6. Conversational AI (Bot Framework)
Building the interface.
- Bot Framework Composer: Visual drag-and-drop tool to build dialogs.
- Adaptive Cards: JSON snippets that render UI (Buttons, Forms, Images) consistently across any channel (Teams, Web, Slack).
- Direct Line: The channel used to connect a bot to a custom mobile app or website.
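To make "JSON snippets that render UI" concrete, here is a minimal sketch that builds an Adaptive Card and wraps it as a bot message attachment. The card uses the standard `AdaptiveCard`, `TextBlock`, and `Action.Submit` types and the `application/vnd.microsoft.card.adaptive` content type. The helper names, the version `1.4`, and the card text are illustrative assumptions.

```python
import json


def choice_card(title, choices):
    """Build an Adaptive Card with a heading and one submit button per choice."""
    return {
        "type": "AdaptiveCard",
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "version": "1.4",
        "body": [{"type": "TextBlock", "text": title, "weight": "Bolder", "wrap": True}],
        "actions": [
            {"type": "Action.Submit", "title": c, "data": {"choice": c}}
            for c in choices
        ],
    }


def as_attachment(card):
    """Wrap a card in the attachment shape a bot activity carries."""
    return {"contentType": "application/vnd.microsoft.card.adaptive", "content": card}


card = choice_card("Where would you like to fly?", ["Paris", "Tokyo"])
print(json.dumps(as_attachment(card), indent=2))
```

The same JSON is sent to every channel, and each channel's renderer decides how the buttons and text actually look.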
📝 Exam "Gotchas"
- Content Safety: Know the difference between breaking (error) and flagging (warning) content.
- Video Indexer: It provides insights like OCR in video, Face detection, and Speaker diarization (Who spoke when?).
- Private Endpoints: If a question mentions "Banking" or "High Security," the answer almost always involves disabling public access and using Private Link.
- Speech Translation: You can translate Speech-to-Text (See words on screen) or Speech-to-Speech (Hear translated audio).
- Region Support: Not all features (especially Neural Voice) are available in every Azure region.
Good luck with your AI-102!