Ntombizakhona Mabaso

for AWS Community Builders

Posted on Jan 17

Understand The Capabilities And Limitations Of Generative AI For Solving Business Problems

#aws #ai #aipractitioner #cloud

🤖 Exam Guide: AI Practitioner
Domain 2: Fundamentals of Generative AI
📘 Task Statement 2.2

🎯 Objectives

This task is about making good business decisions with GenAI: knowing what it’s great at, where it fails, how to pick the right model for the job, and how to measure success beyond “the demo looked good.”

1) Advantages of GenAI

1.1 Adaptability

One model can perform many tasks with minimal changes (often just by changing the prompt).
Example:
The same LLM can

summarize
classify
extract fields
draft emails
and answer questions
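To make the idea concrete, here is a minimal sketch of "one model, many tasks". The template wording, the `PROMPT_TEMPLATES` dictionary, and the `build_prompt` helper are all hypothetical, and no real model is called: the point is only that the task changes while the model stays the same.

```python
# Hypothetical prompt templates: the task changes, the model does not.
PROMPT_TEMPLATES = {
    "summarize": "Summarize the following text in two sentences:\n\n{text}",
    "classify": "Classify the following text as 'billing', 'technical', or 'other':\n\n{text}",
    "extract": "Extract the customer name and order ID from the following text:\n\n{text}",
}


def build_prompt(task: str, text: str) -> str:
    """Return the prompt for a task; raise ValueError if the task is unknown."""
    if task not in PROMPT_TEMPLATES:
        raise ValueError(f"Unknown task: {task}")
    return PROMPT_TEMPLATES[task].format(text=text)


ticket = "Hi, I'm Sam. Order 1234 was charged twice."
for task in PROMPT_TEMPLATES:
    prompt = build_prompt(task, ticket)
    # In a real application, each prompt would go to the same model endpoint.
    print(f"--- {task} ---\n{prompt}\n")
```

Adding a new task here means adding one template, not training or deploying another model.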

1.2 Responsiveness

Produces outputs quickly and interactively, enabling real-time experiences (assistants, copilots).
Supports iterative refinement: the user can correct an output and the model responds immediately.

1.3 Simplicity

You can often deliver useful functionality without building a custom ML pipeline.
Prompting + retrieval can replace complex rules-based systems or multiple specialized NLP models.

1.4 Works Well With Unstructured Data

Especially strong for text-heavy workflows:

tickets
documents
chats
knowledge bases

1.5 Cross-Domain Generalization

Foundation models can handle tasks across domains (legal, HR, IT, marketing) without being built from scratch for each one, which narrow, single-task models cannot do.

2) Disadvantages And Limitations of GenAI

2.1 Hallucinations

The model may generate confident but incorrect information.
Risk increases when prompts are vague, context is missing, or the question requires precise factual grounding.

2.2 Interpretability

Hard to explain why the model produced a specific output.
This matters in regulated or high-stakes decisions.

2.3 Inaccuracy

Even when not “hallucinating,” outputs can be partially wrong, incomplete, or misaligned with business rules.
LLMs are not guaranteed to be factually correct or up-to-date.

2.4 Nondeterminism

Outputs can vary between runs even with the same prompt (depending on sampling/temperature and system behavior).
Makes strict reproducibility and test assertions harder than traditional software.
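The role of sampling and temperature can be illustrated with a toy next-token sampler. This is a simplified sketch, not how any particular provider implements decoding: the `logits` scores are made up, and `sample_token` is a hypothetical helper. It also does not model the "system behavior" factor mentioned above.

```python
import math
import random


def sample_token(logits: dict, temperature: float, rng: random.Random) -> str:
    """Pick one token; temperature 0 means greedy (always the top-scoring token)."""
    if temperature <= 0:
        return max(logits, key=logits.get)
    # Softmax with temperature: a higher temperature flattens the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Toy scores for the next word; the numbers are invented for illustration.
logits = {"refund": 2.0, "replacement": 1.5, "apology": 0.5}
rng = random.Random()  # unseeded, so the sampled results differ between runs

greedy = {sample_token(logits, 0.0, rng) for _ in range(20)}
sampled = [sample_token(logits, 1.0, rng) for _ in range(20)]
print("temperature 0:", greedy)
print("temperature 1:", sampled)
```

At temperature 0 the toy sampler always returns the same token, while at temperature 1 the mix of tokens changes from run to run. That variation is why tests for GenAI features tend to check properties of the output (format, presence of required facts) rather than exact strings.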

2.5 Other Practical Constraints

Data privacy/security: prompts may contain sensitive data; requires controls.
Latency and cost: larger models can be slow/expensive at scale.
Context window limits: cannot “read everything”; needs chunking/retrieval strategies.
Policy and safety concerns: risk of toxic output, leakage of sensitive info, or policy violations.
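As a rough illustration of the chunking mentioned under context window limits, the sketch below splits a document into overlapping pieces. It counts words as a stand-in for tokens, which is an assumption: real systems measure the context window in model-specific tokens. The `chunk_words` helper is hypothetical.

```python
def chunk_words(text: str, max_words: int, overlap: int = 0) -> list:
    """Split text into chunks of at most max_words words, sharing `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


document = "Refunds are issued within five business days of approval by the billing team"
for i, chunk in enumerate(chunk_words(document, max_words=6, overlap=2)):
    print(i, chunk)
```

In a retrieval setup, only the chunks most relevant to the question would be placed in the prompt, rather than the whole document.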

3) Factors For Selecting The Right GenAI Model

When choosing a GenAI approach or model, consider:

3.1 Model Type / Modality

Text-only LLM vs multimodal model (text+image) vs image generation (diffusion), etc.
Choose based on required inputs/outputs (text, image, audio, video).

3.2 Performance Requirements

Latency Targets: interactive chat vs offline processing
Throughput / Concurrency: how many requests
Cost per request and budget constraints
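A back-of-the-envelope cost estimate can help when comparing models. The sketch below assumes token-based pricing with separate input and output rates; the prices and traffic numbers are placeholders, not any provider's actual rates.

```python
def cost_per_request(input_tokens: int, output_tokens: int,
                     input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimated cost of one request under per-1,000-token pricing."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)


def monthly_cost(requests_per_day: int, per_request: float, days: int = 30) -> float:
    """Scale a per-request cost to a monthly budget figure."""
    return requests_per_day * per_request * days


# Placeholder numbers for illustration only.
per_request = cost_per_request(
    input_tokens=1500, output_tokens=500,
    input_price_per_1k=0.003, output_price_per_1k=0.015,
)
print(f"per request: ${per_request:.4f}")
print(f"per month:   ${monthly_cost(10_000, per_request):.2f}")
```

With these placeholder inputs, a request costs about $0.012, which scales to roughly $3,600 a month at 10,000 requests per day. Running the same arithmetic for a smaller model makes the cost side of the trade-off explicit.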

3.3 Capability Fit

Does the model perform well on your task?

Summarization quality
Instruction following
Tool/function calling (if building agents)
Domain-specific language

3.4 Constraints

Context length needs: long documents may require retrieval
Output format needs: JSON, strict templates
Reliability requirements: do you need citations/grounding?
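When the output format must be strict JSON, a small validation layer can reject malformed responses before they reach downstream systems. This is a minimal sketch using only the standard library; the `REQUIRED_FIELDS` schema and the `parse_model_output` helper are hypothetical.

```python
import json

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"customer_name": str, "order_id": str}


def parse_model_output(raw: str) -> dict:
    """Parse a model response as JSON and check the required fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"Missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"Field {field} has the wrong type")
    return data


good = '{"customer_name": "Sam", "order_id": "1234"}'
bad = "Sure! The customer is Sam and the order is 1234."
print(parse_model_output(good))
try:
    parse_model_output(bad)
except ValueError as err:
    print("rejected:", err)
```

A caller could retry the request or fall back to a human when validation fails, instead of passing free-form text to code that expects structured data.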

3.5 Compliance and Governance

Data residency requirements
PII handling and retention policies
Audit/logging needs
Model/provider restrictions: acceptable use, training data policies

3.6 Customization Needs

Can Prompting / RAG meet requirements?
Do you need Fine-tuning for tone, style, or domain patterns?
Do you need Guardrails and validation layers?

4) Determine Business Value And Metrics for GenAI Applications

GenAI success should be measured with both Model Quality and Business Outcomes.

4.1 Business Value Examples

Reduced average handle time (AHT) for support agents
Higher conversion rate from better product discovery
Faster content creation cycles for marketing
Reduced cost of operations via automation
Improved customer satisfaction through better self-service

4.2 Example Metrics

4.2.1 Cross-Domain Performance

How well the solution generalizes across different departments/topics without rework.
Metric Example: task success rate across multiple knowledge domains.

4.2.2 Efficiency

Time saved, fewer manual steps, reduced escalations.
Metric Examples:

AHT (Average Handle Time)
Tickets Resolved Per Hour
Cost Per Case
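As a sketch of how an efficiency metric might be tracked, the snippet below compares AHT before and after an assistant launch. The handle times are invented sample values, and both helper functions are hypothetical.

```python
def average_handle_time(handle_minutes: list) -> float:
    """Mean handle time in minutes across a set of resolved cases."""
    if not handle_minutes:
        raise ValueError("need at least one case")
    return sum(handle_minutes) / len(handle_minutes)


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative means a drop)."""
    return (after - before) / before * 100


# Invented sample data: minutes per case before and after the assistant launch.
before_launch = [12, 15, 9]
after_launch = [8, 10, 6]

aht_before = average_handle_time(before_launch)
aht_after = average_handle_time(after_launch)
print(f"AHT before: {aht_before:.1f} min, after: {aht_after:.1f} min")
print(f"change: {percent_change(aht_before, aht_after):.1f}%")
```

In practice the two samples should cover comparable case types and time periods, otherwise the change may reflect a shift in ticket mix rather than the assistant.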

4.2.3 Conversion Rate

Improved purchase or signup completion due to better guidance or recommendations.
Metric Example: checkout conversion uplift after assistant launch.
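Conversion uplift is usually reported relative to a baseline. The sketch below assumes a simple comparison between a control group and a group that saw the assistant; the visitor and order counts are invented, and the function names are hypothetical.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who completed the target action (e.g. checkout)."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_uplift(control_rate: float, treatment_rate: float) -> float:
    """Uplift of the treatment rate over the control rate, as a fraction."""
    return (treatment_rate - control_rate) / control_rate


# Invented sample data for illustration.
control = conversion_rate(conversions=200, visitors=10_000)
with_assistant = conversion_rate(conversions=230, visitors=10_000)

print(f"control: {control:.2%}, with assistant: {with_assistant:.2%}")
print(f"relative uplift: {relative_uplift(control, with_assistant):.1%}")
```

Here the rate moves from 2.0% to 2.3%, a relative uplift of 15%. A real evaluation would also check whether the difference is statistically significant before attributing it to the assistant.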

4.2.4 Average Revenue Per User (ARPU)

Monetization impact.
Metric Example: ARPU increase for users exposed to assistant/recommendations.

4.2.5 Accuracy

Must be defined for the task, e.g., correctly extracted fields or correct classifications.
Metric Examples:

human-rated correctness
exact match for extracted fields
grounded answer rate
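Exact match for extracted fields is one of the easier accuracy metrics to automate. The sketch below is one possible definition, assuming each document has a hand-labeled set of expected fields; the field names and the `field_exact_match` helper are hypothetical.

```python
def field_exact_match(predicted: dict, expected: dict) -> float:
    """Fraction of expected fields whose predicted value matches exactly."""
    if not expected:
        raise ValueError("expected must contain at least one field")
    hits = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return hits / len(expected)


# Hand-labeled reference vs. a model output with one wrong field.
expected = {"customer_name": "Sam", "order_id": "1234"}
predicted = {"customer_name": "Sam", "order_id": "1243"}

score = field_exact_match(predicted, expected)
print(f"exact match: {score:.0%}")
```

Averaging this score over a labeled test set gives a number that can be tracked across prompt or model changes. Exact match is strict by design, so formatting differences (such as "1234" versus 1234) count as errors unless values are normalized first.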

4.2.6 Customer Lifetime Value (CLV)

Long-term retention or loyalty impact.
Metric Example: churn reduction in cohorts that use GenAI support.
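Churn reduction can be estimated by comparing cohorts over the same period. This is a simplified sketch with invented cohort sizes; `churn_rate` is a hypothetical helper, and a real analysis would need to control for differences between the cohorts.

```python
def churn_rate(customers_lost: int, customers_at_start: int) -> float:
    """Share of a cohort's starting customers who left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Invented cohorts measured over the same period.
without_genai = churn_rate(customers_lost=80, customers_at_start=1000)
with_genai = churn_rate(customers_lost=60, customers_at_start=1000)

reduction_points = (without_genai - with_genai) * 100
print(f"churn without GenAI support: {without_genai:.1%}")
print(f"churn with GenAI support:    {with_genai:.1%}")
print(f"reduction: {reduction_points:.1f} percentage points")
```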

Pick metrics aligned to the business goal. A model can be “impressive” and still fail if it doesn’t improve efficiency, revenue, or customer outcomes, or if its risk or cost is too high.

💡 Quick Questions

1. Name two advantages of GenAI for business workflows.
2. What is a hallucination, and why is it risky in customer-facing apps?
3. Give one reason nondeterminism can be a problem in production.
4. List two factors you’d consider when selecting a GenAI model for a regulated industry.
5. For a GenAI shopping assistant, name one metric tied to business value.

✅ Answers to Quick Questions

1. Adaptability (one model can handle many tasks via prompting) and responsiveness (interactive, real-time outputs).

(Simplicity/time-to-value is also a valid advantage.)

2. A hallucination is when the model generates information that sounds confident but is incorrect or fabricated. It’s risky because it can mislead users, create compliance/legal issues, and damage trust (especially if presented as fact).

3. One reason nondeterminism is a production problem is that the same prompt can yield different outputs across runs, which makes results harder to test, reproduce, and consistently control for quality or policy compliance.

4. Compliance/governance requirements (PII handling, audit/logging, data residency) and interpretability/reliability needs (e.g., grounded answers, stricter controls/guardrails, lower risk of hallucinations).

(Latency/cost and provider policies are also valid factors.)

5. Conversion rate (e.g., increased checkout completion) or ARPU (average revenue per user).
