🤖 Exam Guide: AI Practitioner
Domain 2: Fundamentals of Generative AI
📘Task Statement 2.1
Domain 1 gives you the language of AI and ML.
Domain 2 shifts the focus to something more specific and increasingly central in AWS exam content and real-world workloads: Generative AI.
Instead of predicting a label such as fraud/not fraud or a number such as next month’s demand, GenAI models generate new content: text, images, audio, video, and code, based on patterns learned from massive datasets.
This domain is about understanding the core building blocks (tokens, embeddings, transformers, diffusion), common use cases, and the high-level lifecycle of foundation models.
🎯 Objectives
The objective of Task 2.1 is to define the core concepts and terminology of generative AI, recognize common real-world use cases for generative AI models, and describe the foundation model lifecycle from data and model selection through pre-training, fine-tuning, evaluation, deployment, and feedback.
1) Define foundational GenAI concepts
1.1 Tokens
Tokens are the basic units a language model reads and writes.
A token might be a word, part of a word, punctuation, or whitespace.
Why Tokens Matter: cost, latency, and context-window limits are often tied to token counts (input + output).
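To make this concrete, here is a minimal sketch of counting tokens with the open-source tiktoken library (an assumption for illustration; each model family ships its own tokenizer, so counts vary by model):

```python
import tiktoken  # assumes the tiktoken package is installed (pip install tiktoken)

# Tokenizers split text into subword units; the encoding below is one
# common choice, used here purely for illustration.
enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Generative AI creates new content.")
print(len(tokens))  # token count: what context limits and pricing are based on
print(tokens[:5])   # the first few integer token IDs
```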
1.2 Chunking
Chunking is splitting large text into smaller segments (“chunks”) so it can be processed effectively.
Chunking is commonly used in Retrieval-Augmented Generation (RAG): store/search chunks, then provide the most relevant chunks to the model.
Why Chunking Matters: models have context window limits, so chunking helps you fit the most useful information into the prompt.
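A minimal sketch of fixed-size chunking with overlap (the sizes below are arbitrary; real pipelines often split on sentence or section boundaries instead):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks. Overlap means a
    sentence cut at one boundary still appears whole in the next chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Each chunk can then be embedded, stored in a vector store,
# and retrieved at query time for RAG.
chunks = chunk_text("..." * 1000)
print(len(chunks), len(chunks[0]))
```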
1.3 Embeddings
Embeddings are numeric representations of content (text/images/etc.) that capture meaning.
Similar items end up with similar embeddings.
Embeddings Are Used For:
- semantic search
- clustering
- recommendation
- retrieval
1.4 Vectors
A vector is the array of numbers that represents an embedding.
You compare vectors with similarity metrics (e.g., cosine similarity) to find “closest meaning.”
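A minimal sketch of cosine similarity between two vectors (the tiny 3-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Close to 1.0 = similar meaning; near 0 = unrelated."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

cat    = np.array([0.9, 0.1, 0.0])  # made-up toy "embeddings"
kitten = np.array([0.8, 0.2, 0.1])
car    = np.array([0.0, 0.1, 0.9])
print(cosine_similarity(cat, kitten))  # high: similar meaning
print(cosine_similarity(cat, car))     # low: different meaning
```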
1.5 Prompt Engineering
Prompt Engineering is crafting instructions and context to guide model outputs.
Prompt Engineering Techniques Include:
- clear instructions
- role prompting
- examples (few-shot)
- constraints
- formatting requirements
- grounding with retrieved context
Why Prompt Engineering Matters: these techniques can dramatically improve output quality without retraining the model.
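For example, a few-shot prompt can combine a role, a clear instruction, a constraint, and examples in one template (the ticket labels below are invented for illustration):

```python
# Role + instruction + constraint + few-shot examples in a single prompt.
prompt = """You are a support-ticket classifier. Reply with exactly one label:
BILLING, TECHNICAL, or OTHER.

Ticket: "I was charged twice this month."
Label: BILLING

Ticket: "The app crashes when I upload a file."
Label: TECHNICAL

Ticket: "What are your office hours?"
Label:"""
# Sending `prompt` to any text model steers both the output format
# and the behavior, with no retraining involved.
```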
1.6 Transformer-based LLMs
Many modern LLMs are based on the transformer architecture.
The key idea in transformer-based LLMs is the attention mechanism, which helps the model focus on the most relevant parts of the input (a minimal sketch follows the list below).
Strengths of Transformers:
- language understanding/generation
- summarization
- extraction
- reasoning-like behavior (with limitations)
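Here is a minimal numpy sketch of scaled dot-product attention, the core operation; real transformers add learned projections, multiple heads, and many stacked layers:

```python
import numpy as np

def attention(Q, K, V):
    """Each query token mixes the value vectors of all tokens,
    weighted by how similar its query is to each key."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])         # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over tokens
    return weights @ V

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))  # 3 toy tokens, 4-dim representations
print(attention(Q, K, V).shape)      # (3, 4): one output vector per token
```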
1.7 Foundation Models (FMs)
Foundation models are large, general-purpose models trained on broad datasets and adaptable to many tasks.
LLMs are a common type of foundation model, but FMs also exist for images, audio, and multimodal tasks.
1.8 Multimodal Models
Multimodal models can accept and/or generate more than one modality (e.g., text + image).
Example: provide an image and ask for a description, or provide text and generate an image.
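As a hedged sketch, a multimodal request (image in, text out) might look like this using the Amazon Bedrock Converse API via boto3; the model ID and file name are placeholders, and exact field names should be verified against current AWS documentation:

```python
import boto3

client = boto3.client("bedrock-runtime")

with open("product_photo.png", "rb") as f:  # placeholder local image
    image_bytes = f.read()

response = client.converse(
    modelId="your-multimodal-model-id",  # placeholder, not a real model ID
    messages=[{
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": image_bytes}}},
            {"text": "Describe this image in one sentence."},
        ],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```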
1.9 Diffusion models
Diffusion models are commonly used for image generation.
They learn to reverse a “noising” process: starting from pure noise, they iteratively denoise until an image emerges.
Why Diffusion Models Matter: diffusion is a major approach behind high-quality text-to-image generation.
2) Identify Potential Use Cases for GenAI Models
GenAI is useful when you want to generate, transform, or interact with content at scale.
Common real-world use cases include:
2.1 Summarization
- Meeting notes
- incident reports
- legal docs
- support tickets
- medical notes (with governance)
2.2 AI Assistants And Chatbots
- Employee helpdesk
- IT ops assistant
- knowledge-base Q&A
- HR policy assistant
2.3 Translation
- Multilingual customer support
- global documentation
- localization pipelines
2.4 Code Generation
- Boilerplate generation
- code explanation
- test generation
- refactoring assistance
2.5 Customer service agents
- Draft responses
- classify intents
- suggest resolutions
- automate routine interactions
2.6 Search (Semantic Search)
Search by meaning, not just keywords (often powered by embeddings + retrieval)
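Building on the cosine-similarity idea from section 1.4, here is a minimal sketch of semantic search, assuming document and query embeddings have already been produced by an embedding model:

```python
import numpy as np

def top_k_matches(query_vec, doc_vecs, k=3):
    """Return indices of the k documents whose embeddings are
    closest to the query embedding (cosine similarity)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                       # one similarity score per document
    return np.argsort(scores)[::-1][:k]  # best matches first
```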
2.7 Recommendation engines
Use embeddings and generative reasoning to propose relevant products/content
2.8 Image, Video, and Audio Generation
- Marketing creatives
- product mockups
- voiceovers
- prototyping
- content creation workflows
GenAI is especially strong when tasks involve generating language or content, or understanding unstructured data; it is a poor fit when you need guaranteed deterministic outputs.
3) Describe The Foundation Model Lifecycle
Foundation models follow a lifecycle similar in spirit to ML, but with GenAI-specific steps and decisions:
3.1 Data selection
Choose large, diverse datasets (text, images, code, etc.), applying filtering and quality controls.
Includes handling sensitive data, licensing, and safety considerations.
3.2 Model selection
Choose an existing FM or decide to pre-train/customize based on needs:
- capability requirements (quality, reasoning, multimodal)
- latency/cost constraints
- domain specificity
- governance requirements
3.3 Pre-training
Train the foundation model on massive corpora to learn general representations.
This is expensive and typically done by large providers.
3.4 Fine-tuning
Adapt the model to a specific domain/task using additional data (often smaller and higher quality).
Can improve tone, format, domain knowledge, and task performance.
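Fine-tuning data is commonly supplied as prompt/completion pairs in a JSON Lines file; the field names below are illustrative, since the exact schema varies by provider:

```python
import json

# Illustrative fine-tuning records: small, targeted, high quality.
records = [
    {"prompt": "Summarize: Customer reports login failures since the 2am deploy.",
     "completion": "Login failures began after the 2am deploy."},
    {"prompt": "Summarize: Invoice #8812 was paid twice; refund requested.",
     "completion": "Duplicate payment on invoice #8812; customer wants a refund."},
]

# JSON Lines: one JSON object per line.
with open("train.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```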
3.5 Evaluation
Evaluate quality, safety, and performance:
- task quality (helpfulness, correctness)
- robustness (edge cases)
- bias/fairness
- toxicity/safety
- hallucination tendencies (as applicable)
3.6 Deployment
Serve the model for real-time or batch use.
Requires decisions on scaling, latency, cost controls, and access management.
3.7 Feedback
Collect user feedback and operational signals (e.g., thumbs up/down, corrections, failure reports).
Use feedback to improve prompts, retrieval data, guardrails, or to trigger fine-tuning/retraining.
Key Idea: In production, GenAI improvement is often iterative via prompting + retrieval + feedback, not just “train once and forget.”
💡 Quick Questions
1. What is a token, and why does token count matter for LLM usage?
2. What is chunking, and when would you use it in a GenAI solution?
3. What are embeddings (vectors), and what problem do they commonly help solve?
4. What does prompt engineering aim to improve, and name one simple technique.
5. In the foundation model lifecycle, what’s the difference between pre-training and fine-tuning?
Additional Resources
- What are Foundation Models?
- What is Generative AI?
- What is generative AI?
- What Is Generative AI (GenAI)? How Does It Work?
✅ Answers to Quick Questions
1. A token is a unit of text the model processes (word, subword, punctuation, etc.). Token count matters because it affects context limits (how much the model can read at once), latency, and often cost (pricing commonly depends on input/output tokens).
2. Chunking is splitting large text into smaller pieces so it can be searched or fit into a model’s context window. It’s commonly used in RAG workflows (store chunks, retrieve the most relevant ones, then add them to the prompt).
3. Embeddings are numeric representations of content; the embedding is stored as a vector. They’re commonly used for semantic similarity tasks such as semantic search, retrieval, clustering, and recommendations.
4. Prompt engineering aims to improve the model’s output quality and reliability by giving clearer instructions and context. One technique is few-shot prompting (providing a couple of examples of the desired input/output format).
5. Pre-training teaches a model broad, general capabilities from massive datasets. Fine-tuning adapts that pre-trained model to a specific domain or task using additional targeted data.