<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yashasviben Patel</title>
    <description>The latest articles on DEV Community by Yashasviben Patel (@yashasviben_patel).</description>
    <link>https://dev.to/yashasviben_patel</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2968769%2Ff10688ae-0131-4730-8a25-a70ccbd28520.jpg</url>
      <title>DEV Community: Yashasviben Patel</title>
      <link>https://dev.to/yashasviben_patel</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yashasviben_patel"/>
    <language>en</language>
    <item>
      <title>Advanced Prompting Techniques and Embeddings in AI</title>
      <dc:creator>Yashasviben Patel</dc:creator>
      <pubDate>Mon, 26 May 2025 02:12:30 +0000</pubDate>
      <link>https://dev.to/yashasviben_patel/advanced-prompting-techniques-and-embeddings-in-ai-3o69</link>
      <guid>https://dev.to/yashasviben_patel/advanced-prompting-techniques-and-embeddings-in-ai-3o69</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As AI continues to evolve, mastering how we prompt and guide language models has become just as important as the models themselves. This chapter explores cutting-edge &lt;strong&gt;prompting strategies&lt;/strong&gt; and &lt;strong&gt;vector-based&lt;/strong&gt; text representations that significantly enhance the capabilities of modern AI systems.&lt;/p&gt;

&lt;p&gt;From adjusting randomness with temperature and &lt;strong&gt;top-P&lt;/strong&gt; sampling to guiding AI thought processes with techniques like &lt;strong&gt;Chain-of-Thought (CoT)&lt;/strong&gt; and &lt;strong&gt;ReAct&lt;/strong&gt; prompting, we unlock ways to improve both the creativity and reliability of AI-generated responses. You'll also discover how embeddings and cosine similarity allow us to quantify meaning and relevance between pieces of text, laying the groundwork for powerful applications in search, recommendation, and question-answering systems.&lt;/p&gt;

&lt;p&gt;Finally, we explore &lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt; and &lt;strong&gt;ChromaDB&lt;/strong&gt; for combining traditional knowledge retrieval with generative AI, offering a practical approach to building systems that are both smart and informed.&lt;/p&gt;

&lt;p&gt;Whether you're building chatbots, search engines, or decision-making tools, the techniques in this chapter will help you get the most out of your AI models, making them not just responsive, but truly intelligent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Prompting Strategies&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Understanding AI Model Configuration&lt;/strong&gt;&lt;br&gt;
When working with AI models like &lt;strong&gt;Gemini 2.0&lt;/strong&gt;, configuring the model parameters is crucial for controlling the diversity, randomness, and output length. Some of the key parameters include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Temperature&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Controls randomness in token selection.&lt;/li&gt;
&lt;li&gt;Higher values (e.g., &lt;strong&gt;0.8–1.0&lt;/strong&gt;) produce more diverse and creative responses.&lt;/li&gt;
&lt;li&gt;Lower values (e.g., &lt;strong&gt;0.1–0.3&lt;/strong&gt;) make the model more deterministic and focused.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setting temperature to 0 forces greedy decoding&lt;/strong&gt; (selecting the most probable token at each step).&lt;/li&gt;
&lt;/ul&gt;
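Conceptually, temperature rescales the model's raw token scores (logits) before they are turned into probabilities: dividing by a small temperature sharpens the distribution toward the top token, approaching greedy decoding. A minimal sketch with made-up logit values, just for illustration:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Rescale logits by temperature, then apply softmax."""
    scaled = np.array(logits) / temperature
    scaled -= scaled.max()  # shift for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [2.0, 1.0, 0.5]  # toy next-token scores

print(softmax_with_temperature(logits, 1.0))  # relatively flat: diverse sampling
print(softmax_with_temperature(logits, 0.1)) # mass piles onto the top token: near-greedy
```

At temperature 1.0 several tokens keep meaningful probability; at 0.1 almost all the mass sits on the highest-scoring token, which is why low temperatures feel deterministic.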

&lt;p&gt;&lt;strong&gt;Top-P (Nucleus Sampling)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Defines a probability threshold for selecting tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Top-P = 1&lt;/strong&gt; considers all tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Top-P &amp;lt; 1&lt;/strong&gt; restricts token selection to the most probable ones.&lt;/li&gt;
&lt;/ul&gt;
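The nucleus-sampling idea can be sketched in a few lines: sort tokens by probability, then keep the smallest prefix whose cumulative probability reaches the top-P threshold. The toy probabilities below are made up for illustration:

```python
import numpy as np

def top_p_filter(probs, top_p):
    """Return indices of the smallest token set whose cumulative probability reaches top_p."""
    order = np.argsort(probs)[::-1]                 # most probable tokens first
    cumulative = np.cumsum(np.array(probs)[order])  # running probability mass
    cutoff = np.searchsorted(cumulative, top_p) + 1
    return sorted(order[:cutoff].tolist())

probs = [0.6, 0.25, 0.1, 0.05]
print(top_p_filter(probs, 1.0))  # [0, 1, 2, 3] — all tokens stay in play
print(top_p_filter(probs, 0.8))  # [0, 1] — only the most probable tokens survive
```

Sampling then proceeds over just the surviving tokens (with their probabilities renormalized), which trims away the unlikely long tail.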

&lt;p&gt;&lt;strong&gt;Example Configuration in Python&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from google.genai import types

# Low temperature keeps output focused; top_p=1 leaves the full token distribution in play.
model_config = types.GenerateContentConfig(
    temperature=0.1,
    top_p=1,
    max_output_tokens=5,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Zero-Shot Prompting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Zero-shot learning is when an AI model is given a prompt without prior examples and must generate a response based solely on its training knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt:&lt;/strong&gt;&lt;br&gt;
Classify movie reviews as POSITIVE, NEUTRAL, or NEGATIVE.&lt;br&gt;
Review: "The movie had stunning visuals but a weak storyline."&lt;br&gt;
Sentiment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI Response:&lt;/strong&gt;&lt;br&gt;
NEUTRAL&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Use Zero-Shot Learning?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires no training data.&lt;/li&gt;
&lt;li&gt;Works well for basic classification and fact-based queries.&lt;/li&gt;
&lt;li&gt;May struggle with nuanced or domain-specific tasks.&lt;/li&gt;
&lt;/ul&gt;
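The review-classification prompt above can be templated so any review slots in, with a small parser to normalize whatever the model sends back. A minimal sketch; the actual model call is left out since it depends on your client library:

```python
def build_sentiment_prompt(review):
    """Assemble a zero-shot prompt: task description, input, and an answer slot."""
    return (
        "Classify movie reviews as POSITIVE, NEUTRAL, or NEGATIVE.\n"
        f'Review: "{review}"\n'
        "Sentiment:"
    )

def parse_sentiment(model_reply):
    """Pull the first recognized label out of the model's raw reply."""
    for label in ("POSITIVE", "NEGATIVE", "NEUTRAL"):
        if label in model_reply.upper():
            return label
    return "UNKNOWN"

prompt = build_sentiment_prompt("The movie had stunning visuals but a weak storyline.")
print(prompt)
print(parse_sentiment(" neutral "))  # → NEUTRAL
```

Keeping the template and the parser separate makes it easy to swap in other label sets without touching the calling code.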

&lt;p&gt;&lt;strong&gt;3. Chain of Thought (CoT) Prompting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Chain of Thought (CoT) prompting enhances reasoning by making the AI model explicitly break down its thought process step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Without CoT:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Prompt: What is 23 × 47?&lt;br&gt;
AI Response: 1081&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example With CoT:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Prompt: Solve step by step: What is 23 × 47?&lt;br&gt;
AI Response: First, break it down:&lt;br&gt;
23 × 47 = (23 × 40) + (23 × 7)&lt;br&gt;
= 920 + 161&lt;br&gt;
= 1081&lt;br&gt;
&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why CoT Prompting&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enhances logical reasoning &lt;/li&gt;
&lt;li&gt;Reduces AI hallucinations&lt;/li&gt;
&lt;li&gt;Makes AI outputs transparent and verifiable&lt;/li&gt;
&lt;/ul&gt;
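One nice property of CoT outputs is that each intermediate step can be checked mechanically. A few lines reproducing and verifying the decomposition above:

```python
# Reproduce the chain-of-thought decomposition of 23 × 47.
partials = {"23 x 40": 23 * 40, "23 x 7": 23 * 7}
total = sum(partials.values())

for step, value in partials.items():
    print(f"{step} = {value}")  # the partial products: 920 and 161
print(f"total = {total}")       # 920 + 161 = 1081

assert total == 23 * 47  # the stepwise answer matches direct multiplication
```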

&lt;p&gt;&lt;strong&gt;4. ReAct Prompting (Reasoning + Acting)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ReAct (Reasoning + Acting)&lt;/strong&gt; prompting is a method where the AI model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thinks through the problem (reasoning).&lt;/li&gt;
&lt;li&gt;Performs an action (e.g., searching an external source).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Using Wikipedia Search in LangChain&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain.agents import initialize_agent, AgentType

llm = ChatGoogleGenerativeAI(model="gemini-pro")
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

agent = initialize_agent(
    tools=[wikipedia],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

agent.run("Who discovered penicillin?")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why ReAct Prompting&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automates reasoning and fact-finding&lt;/li&gt;
&lt;li&gt;Improves factual accuracy by verifying sources&lt;/li&gt;
&lt;li&gt;Reduces incorrect assumptions by AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5. Thinking Mode in Gemini Flash 2.0&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The experimental Thinking Mode in &lt;strong&gt;Gemini Flash 2.0&lt;/strong&gt; is designed to simulate a model's internal reasoning process before generating a final response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The AI internally brainstorms ideas before finalizing an answer.&lt;/li&gt;
&lt;li&gt;The API only returns the final response, but you can view the thought process in AI Studio.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Prompt: Who discovered penicillin?&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Thinking Mode's Internal Thought Process:&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Penicillin is an antibiotic. The discovery happened in the early 20th century. The scientist who discovered it was Alexander Fleming in 1928.&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Final AI Response:&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Alexander Fleming discovered penicillin in 1928.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Thinking Mode&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stronger reasoning capabilities without extra prompting&lt;/li&gt;
&lt;li&gt;Improves response accuracy&lt;/li&gt;
&lt;li&gt;Best for knowledge-based and analytical queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Embeddings&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Understanding Embeddings and Cosine Similarity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Embeddings convert text into numerical vectors so that AI models can compare meaning mathematically. Each sentence is mapped to a point in a high-dimensional vector space, where semantically similar texts land close together.&lt;/p&gt;

&lt;p&gt;Example Embedding Output:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ID   Document                     Embedding&lt;br&gt;
0    "AI is the future"          [0.12, 0.34, ...]&lt;br&gt;
1    "Farming is important"      [0.56, 0.78, ...]&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
Cosine Similarity measures how similar two embeddings are:&lt;br&gt;
&lt;code&gt;Cosine Similarity = (A ⋅ B) / (||A|| * ||B||)&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1.0 → Identical meaning&lt;/li&gt;
&lt;li&gt;0.0 → Unrelated (orthogonal) text&lt;/li&gt;
&lt;li&gt;-1.0 → Opposite meaning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example in Python:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

a = np.array([0.12, 0.34, 0.56])
b = np.array([0.12, 0.33, 0.57])

similarity_score = cosine_similarity([a], [b])
print(similarity_score)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;2. Retrieval-Augmented Generation (RAG)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;RAG enhances AI models by fetching external documents to generate better responses. Steps include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieve relevant documents from a knowledge base.&lt;/li&gt;
&lt;li&gt;Embed the retrieved text into a structured format.&lt;/li&gt;
&lt;li&gt;Generate a final answer by combining AI generation with retrieved content.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fedth68ti07g3iclasdky.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fedth68ti07g3iclasdky.png" alt="Image description" width="764" height="522"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;query = "Impact of climate change on agriculture"

# `retrieved_passages` is assumed to hold the texts returned by the retrieval step.
context = "\n".join(retrieved_passages)

prompt = f"You are an AI assistant. Answer the question using the retrieved text.\nQUESTION: {query}\nCONTEXT: {context}\n"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Implementing Embeddings in ChromaDB&lt;/strong&gt;&lt;br&gt;
ChromaDB allows storing and retrieving embeddings efficiently. Example code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import chromadb

DB_NAME = "agriculture_db"

# GeminiEmbeddingFunction (a custom wrapper around the Gemini embedding API) and
# `documents` (a list of texts) are assumed to be defined earlier.
embed_fn = GeminiEmbeddingFunction()
embed_fn.document_mode = True

chroma_client = chromadb.Client()
db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)

db.add(documents=documents, ids=[str(i) for i in range(len(documents))])

# To retrieve: switch the embedding function to query mode and search the collection.
embed_fn.document_mode = False
result = db.query(query_texts=["How does climate affect crops?"], n_results=1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Visualizing Similarity with Heatmaps&lt;/strong&gt;&lt;br&gt;
A heatmap represents how similar different texts are. Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import seaborn as sns
import pandas as pd

# `response.embeddings` and `truncated_texts` are assumed to come from an earlier
# embedding call: one embedding vector per text, with short labels for the axes.
df = pd.DataFrame([e.values for e in response.embeddings], index=truncated_texts)

# Dot products of normalized embeddings give a pairwise similarity matrix.
similarity_matrix = df @ df.T

# Plot heatmap: darker green = more similar texts.
sns.heatmap(similarity_matrix, vmin=0, vmax=1, cmap="Greens")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
This chapter provided a deep dive into advanced AI prompting techniques, embeddings, and similarity scoring. By leveraging these tools, you can build more intelligent and reliable AI applications, ensuring responses are accurate, structured, and contextually relevant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/app/prompts" rel="noopener noreferrer"&gt;https://aistudio.google.com/app/prompts&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.langchain.com/docs/components/agents" rel="noopener noreferrer"&gt;https://docs.langchain.com/docs/components/agents&lt;/a&gt;&lt;br&gt;
&lt;a href="https://ai.google.dev/gemini-api/docs/embed" rel="noopener noreferrer"&gt;https://ai.google.dev/gemini-api/docs/embed&lt;/a&gt;&lt;br&gt;
&lt;a href="https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html" rel="noopener noreferrer"&gt;https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html&lt;/a&gt;&lt;br&gt;
&lt;a href="https://haystack.deepset.ai/overview/intro" rel="noopener noreferrer"&gt;https://haystack.deepset.ai/overview/intro&lt;/a&gt;&lt;br&gt;
&lt;a href="https://docs.trychroma.com" rel="noopener noreferrer"&gt;https://docs.trychroma.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>llm</category>
      <category>python</category>
    </item>
  </channel>
</rss>
