Romina Mendez
GenAI Foundations – Chapter 1: Prompt Basics: From Theory to Practice

👉 “From key concepts to first practical prompts”


Introduction

In recent years, Artificial Intelligence (AI) has evolved from being a distant concept to becoming a transversal technology that impacts industries, processes, and products. Within this landscape, Generative AI has emerged as one of the most disruptive branches, capable of creating new content (📄text, 🏞 images, 📢 audio, or 🎬 video) from existing data.
Around this field, many terms and approaches are often used interchangeably, which can create confusion. For this reason, this article aims to bring clarity by explaining the most relevant concepts and showing how to take them from theory to practice.


Fundamentals of generative AI

What is Generative AI?

Generative AI is a branch of artificial intelligence focused on creating new and original content, such as 📄 text, 🏞 images, 📢audio, 🎬 video or even synthetic data.
Its development has been made possible by deep learning, particularly through advanced architectures like transformers, which can process information in parallel and capture complex relationships across large volumes of data.


What are Foundation Models?

Foundation Models (FMs) are deep learning neural networks trained on massive, diverse, and largely unlabeled datasets.

Essentially, these models allow for significantly faster and cheaper development of AI solutions by reusing pre-trained models, avoiding the need to build them from scratch, which would be extremely costly due to the volume of data, parameters, and computational resources required.

🧩 Key elements of the process:

  1. Unlabeled data: more abundant and easier to obtain than labeled data, which requires specialized manual work.
  2. Large-scale training: millions or billions of parameters, powerful infrastructure, and extensive preprocessing.
  3. Base model: the result of training, capable of handling a wide range of tasks before being adapted to specific cases.

What is a token?

Understanding how foundation models process information requires diving into the mechanics of language processing. These models need a structured way to handle and understand textual data, and the token is the fundamental unit of that process.

A token is the basic unit of information that foundation models use to process language. Instead of working with entire sentences, the text is broken into smaller fragments (words, subwords, or characters) depending on the tokenization strategy.

Each token is converted into a numerical representation (embedding) within a vector space, where tokens with similar meanings are positioned closer together. This mechanism allows foundation models to capture semantic and contextual relationships between words, phrases, or concepts.


Why are tokens important?

Tokens are the basic unit of text that language models process, and they are also the unit by which usage is billed. Every interaction with a model is measured in tokens, so costs are directly tied to the number of tokens processed.

On average, 1 token ≈ 4 characters in English, or about 0.75 of a word. This approximation helps estimate how much text you can input or expect in an output before hitting usage or 💰 cost limits.

Types of tokens considered in billing:

  • Input tokens → the text you send as a prompt.
  • Output tokens → the text generated in the response.
  • Cached tokens → reused tokens from the conversation history (cheaper).
  • Reasoning tokens → intermediate reasoning steps in advanced models.
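Combining the rough rule above (1 token ≈ 4 characters) with per-million-token prices gives a back-of-the-envelope budget. The prices below are illustrative placeholders, not any provider's actual rates:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: ~1 token per 4 characters of English text."""
    return max(1, len(text) // 4)

def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in USD given per-million-token prices (hypothetical values)."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000

prompt = "Summarize the benefits of foundation models in two sentences."
n_in = estimate_tokens(prompt)   # 61 characters -> ~15 tokens
n_out = 120                      # assumed response length

# Illustrative prices: $2.50 / 1M input tokens, $10.00 / 1M output tokens
cost = estimate_cost(n_in, n_out, 2.50, 10.00)
print(f"~{n_in} input tokens, estimated cost ${cost:.6f}")
```

Note that output tokens are typically priced higher than input tokens, which is why they dominate this estimate despite the short prompt.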


🎚 How can I calculate a budget and analyze token usage with 🐍Python?

When building real applications with large language models, it becomes crucial to estimate costs and enforce input limits.
For this purpose, OpenAI provides a dedicated Python library called tiktoken, which ensures tokenization is consistent with the models. Using this library, developers can:

  1. Count tokens precisely in any given text.
  2. Estimate costs by combining token counts with the official pricing tiers.
  3. Validate context length to confirm that prompts and conversation history fit within a model’s maximum context window.

This makes it easy to prepare accurate budget estimates and to design systems that prevent users from exceeding token limits.
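The context-length validation step can be sketched as follows. The counter here is a character-based stand-in (a real application would plug in tiktoken's encoder instead), and the window size and helper names are my own:

```python
from typing import Callable

def rough_token_count(text: str) -> int:
    """Stand-in counter: ~4 characters per token. Swap in tiktoken for accuracy."""
    return max(1, len(text) // 4)

def validate_context(messages: list[str], max_context: int, reserved_output: int,
                     count_fn: Callable[[str], int] = rough_token_count) -> bool:
    """Return True if prompt + history + reserved output fit the context window."""
    used = sum(count_fn(m) for m in messages)
    return used + reserved_output <= max_context

history = ["You are a helpful assistant.", "Explain tokens briefly."]
print("fits:", validate_context(history, max_context=8192, reserved_output=512))
```

Reserving space for the expected output is the detail most often missed: a prompt that exactly fills the window leaves the model no room to respond.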

To make this more tangible, I created a Streamlit demo where you can see how your text is tokenized and instantly get an estimated cost.

The project is available on my GitHub: r0mymendez/tiktoken-streamlit-budget, a Streamlit app to calculate tokens using tiktoken.


🔤 Tokenization Demo with tiktoken

This project is a simple Streamlit app that demonstrates how text is tokenized using tiktoken, the official tokenization library developed by OpenAI.

tiktoken is used by OpenAI models to split text into tokens — the fundamental units that language models process.
With this app, you can:

  • Enter any text and see how it is divided into tokens.
  • Visualize tokens with colors for better readability.
  • Check token IDs and bytes.
  • Estimate costs based on a per-token price.



⚙️ Installation

Clone the repository and install the dependencies:

```bash
pip install streamlit tiktoken
```

🚀 Usage

Run the Streamlit app with:

```bash
streamlit run app.py
```

📚 About tiktoken

Tiktoken is a fast and efficient library created by OpenAI to ensure that tokenization is consistent with their models. It is essential for:

  • Counting tokens in prompts and responses.
  • Estimating usage costs.
  • Validating that text fits within the model’s maximum context window.

🏷 What are the types of Foundation Models?

Foundation models cover multiple modalities and domains, going beyond text to include other forms of content such as code, images, or even audiovisual data.
Among them, Large Language Models (LLMs) are the most popular, specialized in understanding and generating natural language. Their success has been so evident that, for many, "generative AI" is directly associated with them, although in reality they represent only a part of this broader ecosystem.

One way to classify Foundation models is by modality, that is, the type of content they are capable of generating or processing. Prominent examples include:

  1. Text: models like GPT, Claude, or LLaMA, capable of understanding and producing coherent text in multiple languages and styles.
  2. 💻 Code: tools like GitHub Copilot, CodeT5, StarCoder, or CodeLlama, trained to generate and complete code in different programming languages.
  3. 🖼️ Image: models like Stable Diffusion, DALL·E, or Midjourney, specialized in producing original images from written prompts.
  4. 🎵 Audio: systems like MusicLM, Jukebox, Whisper, or Suno, ranging from music creation to voice-to-text transcription.
  5. 🎬 Video: generators like Sora, RunwayML, or Pika, which create audiovisual sequences from textual descriptions.
  6. 🧊 3D: generators like Point-E or Shap-E, which create three-dimensional objects, from point clouds to textured meshes, based on descriptions.
  7. 🔀 Multimodal: models like GPT, Claude, or Gemini, capable of combining text, images, and in some cases audio, to perform complex tasks requiring understanding of multiple content types.

 The prompt as an interface

 ✏️ What is a prompt?

A prompt is the instruction or set of instructions that we provide to a generative AI model to guide its response.

What is the structure of a prompt?

Most well-designed prompts can be divided into two main components:

  • 1️⃣ Preamble (defines the context and expectations of the model)
    • Context: background or relevant information so that the response is aligned with the situation or specific domain.
    • Instruction/Task: the specific action we want the model to perform accurately.
    • Example: an expected format or sample response, so the model has a clear guide of the desired outcome.
  • 2️⃣ Input (the content that the model must process, transform, or generate).

Practical Example:

  • Context: "You are a virtual assistant specialized in professional writing for a technology company that offers cloud solutions."
  • Instruction: "Write a formal email to invite a client to the launch of a new product."
  • Input: Client: Joan Smith, Product: AI for Supply Chain, Date: September 3, Company: TechLogistics Inc.
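The example above can be assembled programmatically. This sketch builds a chat-style message list from the three parts; the `build_messages` helper and its field names are my own, not from any specific SDK:

```python
def build_messages(context: str, instruction: str, input_data: dict) -> list[dict]:
    """Assemble a preamble (context + instruction) and the input into chat messages."""
    input_block = "\n".join(f"{k}: {v}" for k, v in input_data.items())
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": f"{instruction}\n\n{input_block}"},
    ]

messages = build_messages(
    context=("You are a virtual assistant specialized in professional writing "
             "for a technology company that offers cloud solutions."),
    instruction="Write a formal email to invite a client to the launch of a new product.",
    input_data={
        "Client": "Joan Smith",
        "Product": "AI for Supply Chain",
        "Date": "September 3",
        "Company": "TechLogistics Inc.",
    },
)
print(messages[1]["content"])
```

Separating the preamble from the input like this makes the same context and instruction reusable across many inputs, which is the usual pattern in production prompt templates.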

❤️ Core principles for effective prompts

  • Be explicit about the task. State the action and the expected output format.
  • Provide only relevant context. Include background that constrains the answer; avoid unrelated details.
  • Constrain scope. Specify boundaries (domain, length limits, style, audience).
  • Make evaluation possible. Define acceptance criteria or a rubric when feasible.

🔮 What’s Next?

Now that we understand the foundations of generative AI and prompts, the next step is to explore concrete prompt engineering techniques in action. You can continue with the next chapter in this series: GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses


📖 Series Overview

You can find the entire series on my Profile:

  • ✏️ GenAI Foundations – Chapter 1: Prompt Basics – From Theory to Practice
  • 🧩 GenAI Foundations – Chapter 2: Prompt Engineering in Action – Unlocking Better AI Responses
  • 📚 GenAI Foundations – Chapter 3: RAG Patterns – Building Smarter AI Systems
  • ✅ GenAI Foundations – Chapter 4: Model Customization & Evaluation – Can We Trust the Outputs?
  • 🗂️ GenAI Foundations – Chapter 5: AI Project Planning – The Generative AI Canvas

📚 References

  1. OpenAI Academy. (2025, February 13). Advanced prompt engineering. https://academy.openai.com/home/videos/advanced-prompt-engineering-2025-02-13
  2. Anthropic. (n.d.). Creating message batches. Anthropic Documentation. https://docs.anthropic.com/en/api/creating-message-batches
  3. AWS. (n.d.). What are foundation models? https://aws.amazon.com/es/what-is/foundation-models/
  4. AWS. (n.d.). What is Retrieval-Augmented Generation (RAG)? https://aws.amazon.com/es/what-is/retrieval-augmented-generation/
  5. Cloud Skills Boost. (n.d.). Introduction to generative AI. Google Cloud. https://www.cloudskillsboost.google/course_templates/536
  6. Google Developers. (n.d.). Prompt engineering for generative AI. https://developers.google.com/machine-learning/resources/prompt-eng?hl=es-419
  7. Google Developers. (n.d.). Overview: What is a generative model? https://developers.google.com/machine-learning/gan/generative?hl=es-419
  8. IBM. (n.d.). What is LLM temperature? https://www.ibm.com/think/topics/llm-temperature
  9. IBM. (n.d.). What is prompt engineering? https://www.ibm.com/es-es/think/topics/prompt-engineering
  10. IBM. (n.d.). AI hallucinations. https://www.ibm.com/es-es/think/topics/ai-hallucinations
  11. Luke Salamone. (n.d.). What is temperature? https://blog.lukesalamone.com/posts/what-is-temperature/
  12. McKinsey & Company. (2024, April 2). What is generative AI? https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai
  13. The New York Times. (2025, May 8). AI is getting more powerful, but its hallucinations are getting worse. https://www.nytimes.com/es/2025/05/08/espanol/negocios/ia-errores-alucionaciones-chatbot.html
  14. Prompt Engineering. (2024, April 6). Complete guide to prompt engineering with temperature and top-p. https://promptengineering.org/prompt-engineering-with-temperature-and-top-p/
  15. Prompting Guide. (n.d.). ReAct prompting. https://www.promptingguide.ai/techniques/react
  16. Prompting Guide. (n.d.). Consistency prompting. https://www.promptingguide.ai/techniques/consistency
  17. Learn Prompting. (2024, September 27). Self-calibration prompting. https://learnprompting.org/docs/advanced/self_criticism/self_calibration
  18. AI Prompt Theory. (2025, July 8). Temperature and top-p: Controlling creativity and predictability. https://aiprompttheory.com/temperature-and-top-p-controlling-creativity-and-predictability/?utm_source=chatgpt.com
  19. Vellum. (n.d.). How to use JSON mode. https://www.vellum.ai/llm-parameters/json-mode?utm_source=www.vellum.ai&utm_medium=referral
  20. OpenAI. (2025, August). What are tokens and how to count them? https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them
  21. Milvus. (n.d.). What are benchmark datasets in machine learning, and where can I find them? https://milvus.io/ai-quick-reference/what-are-benchmark-datasets-in-machine-learning-and-where-can-i-find-them
