<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: rahulbhave</title>
    <description>The latest articles on DEV Community by rahulbhave (@rahulbhave).</description>
    <link>https://dev.to/rahulbhave</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F769031%2Fe7fd9184-bd08-4384-b68d-3afe600ed9e0.png</url>
      <title>DEV Community: rahulbhave</title>
      <link>https://dev.to/rahulbhave</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rahulbhave"/>
    <language>en</language>
    <item>
      <title>🤖 AutoReviewer AI – Daily Digest for AI Researchers using Runner H</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Sun, 08 Jun 2025 17:27:32 +0000</pubDate>
      <link>https://dev.to/rahulbhave/autoreviewer-ai-daily-digest-for-ai-researchers-using-runner-h-4d2h</link>
      <guid>https://dev.to/rahulbhave/autoreviewer-ai-daily-digest-for-ai-researchers-using-runner-h-4d2h</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/runnerh"&gt;Runner H "AI Agent Prompting" Challenge&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;AutoReviewer AI is an autonomous agent that acts as a daily research assistant for AI/ML enthusiasts, researchers, and developers. It automates the process of tracking, filtering, summarizing, and categorizing the most recent and relevant content from trusted sources such as arXiv, GitHub Trending, and Medium.&lt;/p&gt;

&lt;p&gt;Every day, it delivers a neatly formatted Markdown newsletter highlighting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The newest and most impactful AI research papers
&lt;/li&gt;
&lt;li&gt;Trending open-source tools and repos
&lt;/li&gt;
&lt;li&gt;Interesting articles and use cases in GenAI and ML
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🔗 &lt;strong&gt;GitHub Repo&lt;/strong&gt;: &lt;a href="https://github.com/rahul-bhave/AutoReviewerAI" rel="noopener noreferrer"&gt;github.com/rahul-bhave/AutoReviewerAI&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;🎥 &lt;strong&gt;Demo Video&lt;/strong&gt;:&lt;br&gt;&lt;br&gt;
&lt;a href="https://youtu.be/uslNumzYO7Q" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fulyt1f5zu6aude2xg28y.jpg" alt="Watch the demo on YouTube" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Used Runner H
&lt;/h2&gt;

&lt;p&gt;Runner H was instructed to carry out a multi-step autonomous process:&lt;/p&gt;

&lt;h3&gt;
  
  
  🔎 Fetch Content
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;RSS feeds from arXiv (&lt;code&gt;cs.LG&lt;/code&gt;, &lt;code&gt;cs.AI&lt;/code&gt;, &lt;code&gt;cs.CL&lt;/code&gt;, &lt;code&gt;stat.ML&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;GitHub Trending filtered by ML/AI
&lt;/li&gt;
&lt;li&gt;Medium articles using keywords: “Generative AI”, “Machine Learning”, “AI use cases”&lt;/li&gt;
&lt;/ul&gt;
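As a rough illustration of the fetch step (the function name and item structure here are my own sketch, not necessarily what the repo uses), an arXiv-style RSS feed can be parsed with nothing but the standard library:

```python
import xml.etree.ElementTree as ET

def parse_rss_items(xml_text: str) -> list[dict]:
    """Extract title, link, and publication date from an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items

# Small inline sample standing in for a real arXiv feed download.
sample = """<rss version="2.0"><channel>
<item><title>Paper A</title><link>https://arxiv.org/abs/0000.00001</link>
<pubDate>Sun, 08 Jun 2025 00:00:00 +0000</pubDate></item>
</channel></rss>"""

print(parse_rss_items(sample)[0]["title"])  # Paper A
```

In the real agent the feed text would come from an HTTP fetch of the arXiv RSS endpoints for the categories listed above.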

&lt;h3&gt;
  
  
  🧠 Filter &amp;amp; Classify
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Only content published or updated within the last 2 days is considered
&lt;/li&gt;
&lt;li&gt;Items are categorized into: GenAI, ML, NLP, Tools, or Open-source Projects
&lt;/li&gt;
&lt;li&gt;Every paper or repo is summarized in 2–3 lines
&lt;/li&gt;
&lt;li&gt;Optionally includes use cases or critiques&lt;/li&gt;
&lt;/ul&gt;
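The recency filter and the keyword-based categorization above can be sketched as follows; the category names come from the list, but the keyword rules are illustrative assumptions of mine:

```python
from datetime import datetime, timedelta, timezone

# First matching category wins; unmatched items fall through to "Tools".
CATEGORY_KEYWORDS = {
    "GenAI": ["generative", "diffusion", "llm"],
    "NLP": ["language", "translation", "nlp"],
    "ML": ["learning", "regression", "neural"],
}

def is_recent(pub_date: datetime, now: datetime, days: int = 2) -> bool:
    """Keep only items published or updated within the last `days` days."""
    return now - pub_date <= timedelta(days=days)

def categorize(title: str) -> str:
    """Assign the first category whose keyword appears in the title."""
    lowered = title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return category
    return "Tools"

now = datetime(2025, 6, 8, tzinfo=timezone.utc)
print(categorize("A Survey of Generative Agents"))                # GenAI
print(is_recent(datetime(2025, 6, 7, tzinfo=timezone.utc), now))  # True
```

Summarization and critique generation are where the LLM comes in; the parts above are plain filtering logic.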

&lt;h3&gt;
  
  
  📄 Format Output
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Uses Markdown to generate a clean, categorized newsletter grouped by topic
&lt;/li&gt;
&lt;li&gt;Ready to share in email, Slack, blog posts, or personal reading feeds
&lt;/li&gt;
&lt;/ul&gt;
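A minimal Markdown renderer for the grouped digest might look like this (the input structure is an assumption for illustration, not taken from the repo):

```python
def render_newsletter(grouped: dict[str, list[dict]]) -> str:
    """Render {category: [items]} into a Markdown digest grouped by topic."""
    lines = ["# Daily AI Digest", ""]
    for category, items in grouped.items():
        lines.append(f"## {category}")
        for item in items:
            lines.append(f"- [{item['title']}]({item['link']}): {item['summary']}")
        lines.append("")
    return "\n".join(lines)

digest = render_newsletter({
    "GenAI": [{"title": "Paper A", "link": "https://arxiv.org/abs/0000.00001",
               "summary": "Two-line summary here."}],
})
print(digest)
```

The resulting string can be dropped straight into an email body, a Slack message, or a blog post.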

&lt;p&gt;To record the demo I used the Runner H UI, since I had consumed all 10 runs provided with my login during testing and recording :). However, the entire codebase needed to automate this process is available in the GitHub repo shared with this submission.&lt;/p&gt;




&lt;h2&gt;
  
  
  Use Case &amp;amp; Impact
&lt;/h2&gt;

&lt;p&gt;This project is built for researchers, AI developers, and tech-curious professionals who are overwhelmed by the volume of new content but want to stay ahead of the curve. It turns a noisy landscape into a curated, actionable summary quickly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who Benefits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AI/ML practitioners in academia or industry
&lt;/li&gt;
&lt;li&gt;Founders &amp;amp; product leads in GenAI
&lt;/li&gt;
&lt;li&gt;Students or job-seekers needing daily awareness of trends
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Impact
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Saves hours of manual scanning and summarizing
&lt;/li&gt;
&lt;li&gt;Ensures you never miss important releases
&lt;/li&gt;
&lt;li&gt;Gives actionable insights on how new research can be applied
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Social Love 💙
&lt;/h3&gt;

&lt;p&gt;Thanks to Runner H and H Company for creating such a powerful platform — the potential of autonomous AI workflows is just beginning to unfold.&lt;/p&gt;

&lt;p&gt;Let me know if you'd like to try AutoReviewer AI in your own workflow! 🚀&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>runnerhchallenge</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Crafting Effective Unit Tests for Generative AI Applications</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Sat, 30 Nov 2024 19:45:49 +0000</pubDate>
      <link>https://dev.to/rahulbhave/crafting-effective-unit-tests-for-generative-ai-applications-lp1</link>
      <guid>https://dev.to/rahulbhave/crafting-effective-unit-tests-for-generative-ai-applications-lp1</guid>
      <description>&lt;p&gt;&lt;strong&gt;Overview:&lt;/strong&gt;&lt;br&gt;
Testing generative AI applications presents unique challenges due to the multitude of ways a valid response can be phrased for a given input. Despite these challenges, implementing thorough tests is crucial for maintaining the stability and reliability of any application.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Objective&lt;/strong&gt;&lt;br&gt;
The goal is to develop robust testing strategies for generative AI applications. This involves writing tests that consider the variability in valid responses to given inputs, thereby ensuring the stability and consistency of the application.&lt;/p&gt;

&lt;p&gt;In this blog, we will explore various techniques and best practices for writing unit tests that can effectively handle the dynamic nature of generative AI outputs. By the end, you'll have a good understanding of how to create tests that not only validate the correctness of responses but also enhance the overall robustness of your AI application.&lt;/p&gt;

&lt;p&gt;Stay tuned for detailed insights and practical examples!&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Environment:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The following examples use a Colab Enterprise notebook environment in the Google Cloud Console. You will need to enable the Vertex AI APIs in the Google Cloud console. After enabling the APIs, create a new notebook in the Colab environment. In the new notebook, install the required packages and follow the steps shown below:&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Notebook setup:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The following command starts the process of allocating a runtime for you. It may take a few minutes to fully initialize.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;from IPython.display import clear_output&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install the ipytest package, which lets you run pytest tests inside a notebook&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;!pip install --quiet ipytest&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get the project id for your environment
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;project_id = !gcloud config get project
project_id = project_id[0]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Import basic packages
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig
from vertexai.language_models import TextGenerationModel

import pytest
import ipytest
ipytest.autoconfig()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;p&gt;&lt;strong&gt;Unit Testing:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's see how to write test cases covering the following points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Write a test to generate and evaluate content&lt;/li&gt;
&lt;li&gt;Write a test to ensure the model avoids off-topic content&lt;/li&gt;
&lt;li&gt;Write a test to ensure the model adheres to the provided context&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;&lt;u&gt;Write a test to generate and evaluate content&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Let's first create a prompt template and test it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%%writefile prompt_template.txt

Respond to the user's query.
If the user asks about something other
than olympics 2024, reply with,
"Sorry, I don't know about that. Ask me something about sports instead."

Context: {context}

User Query: {query}
Response:

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can load this template with the following fixture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@pytest.fixture
def prompt_template():
  with open("prompt_template.txt", "r") as f:
    return f.read()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we will write the test case. This will be an LLM-specific test. In the test function below, we provide a specific context, representing the context you would typically pull from a RAG retrieval system or another external lookup to enhance your model’s response.&lt;/p&gt;

&lt;p&gt;We will use a known context and a query that you know can be answered from that context. Next, we will provide an evaluation prompt, clearly giving the evaluation model the expected answer.&lt;/p&gt;

&lt;p&gt;Our primary gen_model is asked to answer the query given the context using the prompt_template you created earlier. Then, the query and the gen_model's response are passed to the eval_model within the evaluation_prompt to assess if it got the answer correct.&lt;/p&gt;

&lt;p&gt;The eval_model can evaluate if the substance of the response is correct, even if the generative model has responded with full sentences that may not exactly match a pre-prepared reference answer. You’ll ask the eval_model to respond with a clear ‘yes’ or ‘no’ to assert that the test should pass.&lt;/p&gt;

&lt;p&gt;Note that we will use &lt;code&gt;gemini-1.5-flash-001&lt;/code&gt; as the gen_model and &lt;code&gt;gemini-1.5-pro-001&lt;/code&gt; as the eval_model; you can substitute models to suit your use cases or requirements.&lt;/p&gt;
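One practical wrinkle: evaluation models sometimes reply with extra punctuation or casing ("Yes.", "yes\n"). The tests in this post use a plain .strip(); a slightly more forgiving normalizer (my own helper, not part of the Vertex AI SDK) keeps the assertions stable:

```python
def normalize_verdict(text: str) -> str:
    """Lowercase the eval model's reply and strip whitespace and trailing punctuation."""
    return text.strip().rstrip(".!").strip().lower()

# The assertion would then become:
#   assert normalize_verdict(evaluation.text) == "yes"
print(normalize_verdict(" Yes.\n"))  # yes
```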

&lt;p&gt;Initialize the Vertex AI models and configuration (this is a one-time step, reused by all the tests discussed in this blog).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vertexai.init(project=project_id, location="us-central1")

gen_config = GenerationConfig(
    temperature=0,
    top_p=0.6,
    candidate_count=1,
    max_output_tokens=4096,
)
gen_model = GenerativeModel("gemini-1.5-flash-001", generation_config=gen_config)

eval_config = {
        "temperature": 0,
        "max_output_tokens": 1024,
        "top_p": 0.6,
        "top_k": 40,
    }
eval_model = GenerativeModel("gemini-1.5-pro-001", generation_config=eval_config)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Define the test&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_basic_response(prompt_template):

  context = ("The 2024 Summer Olympics will be held in Paris, "
             + "featuring a wide range of sports including athletics, "
             + "swimming, and gymnastics. The event is expected to "
             + "attract athletes from all over the world.")

  query = "Where will the 2024 Summer Olympics be held?"

  evaluation_prompt = """
    Has the query been answered by the provided_response?
    The 2024 Summer Olympics will be held in Paris.
    Respond with only one word: yes or no

    query: {query}
    provided_response: {provided_response}
    evaluation: """

  prompt = prompt_template.format(context=context, query=query)

  response = gen_model.generate_content(prompt)
  print(response.text)
  ep = evaluation_prompt.format(query=query, provided_response=response.text)
  evaluation = eval_model.generate_content(ep)

  assert evaluation.text.strip() == "yes"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the test as follows, you can see that it has passed.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;ipytest.run('-rP')&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cstqsbr6jqhvoyac2jx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cstqsbr6jqhvoyac2jx.png" alt="Image description" width="800" height="92"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;u&gt;Write a test to ensure the model avoids off-topic content&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Let's see how we can check whether the model avoids off-topic content.&lt;/p&gt;

&lt;p&gt;Paste this code into a cell and run it to confirm that the model could indeed answer a question like this if not restricted by your prompt template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;expo_prompt = "Which is the capital of France?"
response = gen_model.generate_content(expo_prompt)
print(response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives the following answer:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexq4hgywsygxegowhvoz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexq4hgywsygxegowhvoz.png" alt="Image description" width="800" height="74"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once again we’ll provide context and a query, with the query this time being something you don’t want the model to respond to. Your evaluation prompt then checks whether the gen_model declines to answer and directs the user back to the topic mentioned in context:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_staying_on_topic(prompt_template):
  evaluation_prompt = """
    Does the response decline to discuss a non-sports related topic
    and encourage the user to ask about sports instead?
    Respond with only one word: yes or no

    query: {query}
    provided_response: {provided_response}
    evaluation: """

  context = ("The 2024 Summer Olympics will be held in Paris, "
             + "featuring a wide range of sports including athletics, "
             + "swimming, and gymnastics. The event is expected to "
             + "attract athletes from all over the world.")

  query = "Which is the capital of France?"

  prompt = prompt_template.format(context=context, query=query)

  response = gen_model.generate_content(prompt)
  print(response.text)
  ep = evaluation_prompt.format(query=query, provided_response=response.text)
  evaluation = eval_model.generate_content(ep)

  assert evaluation.text.strip() == "yes"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the test again and you can see it has passed, confirming that the model declined to answer the question.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttp07ynhmmwmkl52pr4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttp07ynhmmwmkl52pr4l.png" alt="Image description" width="800" height="160"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;u&gt;Write a test to ensure the model adheres to the provided context&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Now, let's check whether the model adheres to the provided context. Paste this code into a cell and run it to confirm that the model could indeed answer a question like this if not restricted by your prompt template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;expo_prompt = "what is honey bee?"
response = gen_model.generate_content(expo_prompt)
print(response.text)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3ayfov51ckl2kb9e048.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3ayfov51ckl2kb9e048.png" alt="Image description" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now run the following test&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_staying_off_topic_1(prompt_template):
  evaluation_prompt = """
    Does the response decline to discuss a non-Olympics 2024 related topic
    and encourage the user to ask about the Olympics instead?
    Respond with only one word: yes or no

    query: {query}
    provided_response: {provided_response}
    evaluation: """

  context = ("The 2024 Summer Olympics will be held in Paris, "
             + "featuring a wide range of sports including athletics, "
             + "swimming, and gymnastics. The event is expected to "
             + "attract athletes from all over the world.")

  query = "What is honey bee?"

  prompt = prompt_template.format(context=context, query=query)

  response = gen_model.generate_content(prompt)
  print(response.text)
  ep = evaluation_prompt.format(query=query, provided_response=response.text)
  evaluation = eval_model.generate_content(ep)

  assert evaluation.text.strip() == "yes"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This test fails, throwing an assertion error.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8l5u5uimnlkc5tsskdw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8l5u5uimnlkc5tsskdw.png" alt="Image description" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3ui1kk0w7wal7cl3whw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff3ui1kk0w7wal7cl3whw.png" alt="Image description" width="800" height="201"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now modify the prompt template as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;%%writefile prompt_template.txt

Respond to the user's query. You should only talk about the following things:

sports
sports techniques
sports-related events
sports-related news
athletic events
sports industry

If the user asks about something that is not related to sports, ask yourself again if it might be related to sports or the athletic industry. If you still believe the query is not related to sports or athletics, respond with: "Sorry, I don't know about that. Ask me something about sports instead." When answering, use only information included in the context.

Context: {context}

User Query: {query}
Response:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Update the test as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_staying_off_topic_2(prompt_template):
  evaluation_prompt = """
    Does the response decline to discuss a non-sports related topic
    and encourage the user to ask about sports instead?
    Respond with only one word: yes or no

    query: {query}
    provided_response: {provided_response}
    evaluation: """

  context = ("The 2024 Summer Olympics will be held in Paris, "
             + "featuring a wide range of sports including athletics, "
             + "swimming, and gymnastics. The event is expected to "
             + "attract athletes from all over the world.")

  query = "What is honey bee?"

  prompt = prompt_template.format(context=context, query=query)

  response = gen_model.generate_content(prompt)
  print(response.text)
  ep = evaluation_prompt.format(query=query, provided_response=response.text)
  evaluation = eval_model.generate_content(ep)

  assert evaluation.text.strip() == "yes"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After running the test, you can see that it now passes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F54kf64v34ucf8pez5ti3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F54kf64v34ucf8pez5ti3.png" alt="Image description" width="800" height="75"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see the following two differences between the tests:&lt;/p&gt;

&lt;p&gt;Evaluation Focus: The primary difference lies in the evaluation prompt. Test 1 focuses on declining non-Olympics related topics, while Test 2 focuses on declining non-sports related topics.&lt;/p&gt;

&lt;p&gt;Expected Behavior: Both tests expect the model to decline answering the query about honey bees, but the context of what the model should encourage the user to ask about differs (Olympics vs. sports).&lt;/p&gt;

&lt;p&gt;These differences highlight how the evaluation criteria can be tailored to specific contexts, ensuring that the model stays on topic based on the given context.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I have discussed a few approaches for bringing unit testing of generative AI applications into your SDLC. You can try different contexts, evaluation prompts, and queries for your own use cases, and also experiment with the generation configuration, such as adjusting the temperature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gen_config = GenerationConfig(
    temperature=0,
    top_p=0.6,
    candidate_count=1,
    max_output_tokens=4096,
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I hope you found this useful. I would be happy to connect and hear about your experience with LLM testing and Gen AI. You can connect with &lt;a href="https://www.linkedin.com/in/rahulbhave/" rel="noopener noreferrer"&gt;me&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;References:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-code-completion-test-function#aiplatform_sdk_code_completion_test_function-python" rel="noopener noreferrer"&gt;https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-code-completion-test-function#aiplatform_sdk_code_completion_test_function-python&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-code-generation-unittest" rel="noopener noreferrer"&gt;https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-code-generation-unittest&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloud.google.com/vertex-ai/docs/evaluation/model-evaluation-notebook-tutorials" rel="noopener noreferrer"&gt;https://cloud.google.com/vertex-ai/docs/evaluation/model-evaluation-notebook-tutorials&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>testing</category>
      <category>ai</category>
      <category>unittest</category>
    </item>
    <item>
      <title>Using faker and pandas Python Libraries to Create Synthetic Data for Testing</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Sun, 15 Sep 2024 18:11:12 +0000</pubDate>
      <link>https://dev.to/rahulbhave/using-faker-and-pandas-python-libraries-to-create-synthetic-data-for-testing-4gn4</link>
      <guid>https://dev.to/rahulbhave/using-faker-and-pandas-python-libraries-to-create-synthetic-data-for-testing-4gn4</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction:&lt;/strong&gt;&lt;br&gt;
Comprehensive testing is essential for data-driven applications, but it often relies on having the right datasets, which may not always be available. Whether you are developing web applications, machine learning models, or backend systems, realistic and structured data is crucial for proper validation and ensuring robust performance. Acquiring real-world data may be limited due to privacy concerns, licensing restrictions, or simply the unavailability of relevant data. This is where synthetic data becomes valuable.&lt;/p&gt;

&lt;p&gt;In this blog, we will explore how Python can be used to generate synthetic data for different scenarios, including:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Interrelated Tables: Representing one-to-many relationships.&lt;/li&gt;
&lt;li&gt;Hierarchical Data: Often used in organizational structures.&lt;/li&gt;
&lt;li&gt;Complex Relationships: Such as many-to-many relationships in enrollment systems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We’ll leverage the &lt;a href="https://faker.readthedocs.io/en/master/" rel="noopener noreferrer"&gt;faker&lt;/a&gt; and &lt;a href="https://pandas.pydata.org/docs/" rel="noopener noreferrer"&gt;pandas&lt;/a&gt; libraries to create realistic datasets for these use cases.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Example 1: Creating Synthetic Data for Customers and Orders (One-to-Many Relationship)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In many applications, data is stored in multiple tables with foreign key relationships. Let’s generate synthetic data for customers and their orders. A customer can place multiple orders, representing a one-to-many relationship.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Customers Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Customers table contains basic information such as CustomerID, name, and email address.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
from faker import Faker
import random

fake = Faker()

def generate_customers(num_customers):
    customers = []
    for _ in range(num_customers):
        customer_id = fake.uuid4()
        name = fake.name()
        email = fake.email()
        customers.append({'CustomerID': customer_id, 'CustomerName': name, 'Email': email})
    return pd.DataFrame(customers)

customers_df = generate_customers(10)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kt6e1y6q24wd65a8rr1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kt6e1y6q24wd65a8rr1.png" alt="Screen Shot" width="800" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This code generates 10 random customers using Faker to create realistic names and email addresses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Orders Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now, we generate the &lt;code&gt;Orders&lt;/code&gt; table, where each order is associated with a customer through &lt;code&gt;CustomerID&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_orders(customers_df, num_orders):
    orders = []
    for _ in range(num_orders):
        order_id = fake.uuid4()
        customer_id = random.choice(customers_df['CustomerID'].tolist())
        product = fake.random_element(elements=('Laptop', 'Phone', 'Tablet', 'Headphones'))
        price = round(random.uniform(100, 2000), 2)
        orders.append({'OrderID': order_id, 'CustomerID': customer_id, 'Product': product, 'Price': price})
    return pd.DataFrame(orders)

orders_df = generate_orders(customers_df, 30)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vm5xqt4nspagqgh3vnw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6vm5xqt4nspagqgh3vnw.png" alt="Screen shot" width="800" height="515"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this case, the &lt;code&gt;Orders&lt;/code&gt; table links each order to a customer using the &lt;code&gt;CustomerID&lt;/code&gt;. Each customer can place multiple orders, forming a one-to-many relationship.&lt;/p&gt;
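As a quick sanity check on the one-to-many relationship, you can join the two frames and count orders per customer. Small literal frames are used here in place of the Faker output, so the expected counts are known in advance:

```python
import pandas as pd

customers = pd.DataFrame({
    "CustomerID": ["c1", "c2"],
    "CustomerName": ["Alice", "Bob"],
})
orders = pd.DataFrame({
    "OrderID": ["o1", "o2", "o3"],
    "CustomerID": ["c1", "c1", "c2"],
    "Price": [100.0, 250.0, 80.0],
})

# Left-join each order to its customer, then count orders per customer.
merged = orders.merge(customers, on="CustomerID", how="left")
counts = merged.groupby("CustomerName")["OrderID"].count()
print(counts.to_dict())  # {'Alice': 2, 'Bob': 1}
```

The same check works unchanged on the generated &lt;code&gt;customers_df&lt;/code&gt; and &lt;code&gt;orders_df&lt;/code&gt;, except that the counts will vary with the random draw.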




&lt;p&gt;&lt;strong&gt;Example 2: Generating Hierarchical Data for Departments and Employees&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hierarchical data is often used in organizational settings, where departments have multiple employees. Let’s simulate an organization with departments, each of which has multiple employees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Departments Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Departments&lt;/code&gt; table contains each department's unique &lt;code&gt;DepartmentID&lt;/code&gt;, name, and manager.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_departments(num_departments):
    departments = []
    for _ in range(num_departments):
        department_id = fake.uuid4()
        department_name = fake.company_suffix()
        manager = fake.name()
        departments.append({'DepartmentID': department_id, 'DepartmentName': department_name, 'Manager': manager})
    return pd.DataFrame(departments)

departments_df = generate_departments(10)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr735i1eo7ncmtwyfll3q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr735i1eo7ncmtwyfll3q.png" alt="Screen shot" width="800" height="175"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Employees Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, we generate the &lt;code&gt;Employees&lt;/code&gt; table, where each employee is associated with a department via &lt;code&gt;DepartmentID&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_employees(departments_df, num_employees):
    employees = []
    for _ in range(num_employees):
        employee_id = fake.uuid4()
        employee_name = fake.name()
        email = fake.email()
        department_id = random.choice(departments_df['DepartmentID'].tolist())
        salary = round(random.uniform(40000, 120000), 2)
        employees.append({
            'EmployeeID': employee_id,
            'EmployeeName': employee_name,
            'Email': email,
            'DepartmentID': department_id,
            'Salary': salary
        })
    return pd.DataFrame(employees)

employees_df = generate_employees(departments_df, 100)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2no4i4n99ff1no1uvru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2no4i4n99ff1no1uvru.png" alt="Screen shot" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This hierarchical structure links each &lt;code&gt;employee&lt;/code&gt; to a &lt;code&gt;department&lt;/code&gt; through &lt;code&gt;DepartmentID&lt;/code&gt;, forming a parent-child relationship.&lt;/p&gt;
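&lt;p&gt;As with the orders example, the parent-child link can be exercised with a join and an aggregation. Here is a minimal sketch using tiny hand-built stand-ins for the two tables (the rows and salary figures are hypothetical):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical miniature Departments and Employees tables
departments_df = pd.DataFrame({
    'DepartmentID': ['d1', 'd2'],
    'DepartmentName': ['Engineering', 'Sales'],
})
employees_df = pd.DataFrame({
    'EmployeeID': ['e1', 'e2', 'e3'],
    'DepartmentID': ['d1', 'd1', 'd2'],
    'Salary': [90000.0, 110000.0, 70000.0],
})

# Roll the child rows (employees) up to their parent department
joined = employees_df.merge(departments_df, on='DepartmentID')
avg_salary = joined.groupby('DepartmentName')['Salary'].mean()
print(avg_salary.to_dict())  # {'Engineering': 100000.0, 'Sales': 70000.0}
```

&lt;p&gt;A per-department aggregate like this is a quick way to confirm that every employee row resolves to exactly one parent department.&lt;/p&gt;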




&lt;p&gt;&lt;strong&gt;Example 3: Simulating Many-to-Many Relationships for Course Enrollments&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In certain scenarios, many-to-many relationships arise, where each record on one side can relate to many records on the other, and vice versa. Let’s simulate this with students enrolling in multiple courses, where each course has multiple students.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Courses Table&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_courses(num_courses):
    courses = []
    for _ in range(num_courses):
        course_id = fake.uuid4()
        course_name = fake.bs().title()
        instructor = fake.name()
        courses.append({'CourseID': course_id, 'CourseName': course_name, 'Instructor': instructor})
    return pd.DataFrame(courses)

courses_df = generate_courses(20)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0svzrz1uea3yudfs3tf1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0svzrz1uea3yudfs3tf1.png" alt="Screen shot" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Students Table&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_students(num_students):
    students = []
    for _ in range(num_students):
        student_id = fake.uuid4()
        student_name = fake.name()
        email = fake.email()
        students.append({'StudentID': student_id, 'StudentName': student_name, 'Email': email})
    return pd.DataFrame(students)

students_df = generate_students(50)
print(students_df)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Famui7coh2og6rmxuaxcy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Famui7coh2og6rmxuaxcy.png" alt="Screen shot" width="800" height="556"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Generating the Course Enrollments Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;CourseEnrollments&lt;/code&gt; table captures the many-to-many relationship between students and courses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def generate_course_enrollments(students_df, courses_df, num_enrollments):
    enrollments = []
    for _ in range(num_enrollments):
        enrollment_id = fake.uuid4()
        student_id = random.choice(students_df['StudentID'].tolist())
        course_id = random.choice(courses_df['CourseID'].tolist())
        enrollment_date = fake.date_this_year()
        enrollments.append({
            'EnrollmentID': enrollment_id,
            'StudentID': student_id,
            'CourseID': course_id,
            'EnrollmentDate': enrollment_date
        })
    return pd.DataFrame(enrollments)

enrollments_df = generate_course_enrollments(students_df, courses_df, 200)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslyd1pnelpxusprrod1q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslyd1pnelpxusprrod1q.png" alt="Screen shot" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this example, we create a linking table to represent many-to-many relationships between students and courses.&lt;/p&gt;
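&lt;p&gt;Resolving a many-to-many relationship takes two joins through the linking table. Below is a minimal sketch with hand-built stand-ins for the three tables (the miniature rows are hypothetical):&lt;/p&gt;

```python
import pandas as pd

# Hypothetical miniature Students, Courses, and CourseEnrollments tables
students_df = pd.DataFrame({'StudentID': ['s1', 's2'],
                            'StudentName': ['Ann', 'Ben']})
courses_df = pd.DataFrame({'CourseID': ['c1', 'c2'],
                           'CourseName': ['Algebra', 'Biology']})
enrollments_df = pd.DataFrame({
    'StudentID': ['s1', 's1', 's2'],
    'CourseID':  ['c1', 'c2', 'c1'],
})

# Two joins through the linking table resolve the many-to-many relationship
resolved = (enrollments_df
            .merge(students_df, on='StudentID')
            .merge(courses_df, on='CourseID'))
per_course = resolved.groupby('CourseName')['StudentID'].count()
print(per_course.to_dict())  # {'Algebra': 2, 'Biology': 1}
```

&lt;p&gt;Each student can appear under several courses and each course under several students, which is exactly the behavior the linking table is meant to capture.&lt;/p&gt;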




&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;
Using Python and libraries like Faker and Pandas, you can generate realistic and diverse synthetic datasets to meet a variety of testing needs. In this blog, we covered:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Interrelated Tables: Demonstrating a one-to-many relationship between customers and orders.&lt;/li&gt;
&lt;li&gt;Hierarchical Data: Illustrating a parent-child relationship between departments and employees.&lt;/li&gt;
&lt;li&gt;Complex Relationships: Simulating many-to-many relationships between students and courses.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;These examples provide a solid foundation for generating synthetic data. However, further enhancements can be made to increase complexity and specificity, such as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Database-Specific Data: Customizing data generation for different database systems (e.g., SQL vs. NoSQL).&lt;/li&gt;
&lt;li&gt;More Complex Relationships: Creating additional interdependencies, such as temporal relationships, multi-level hierarchies, or unique constraints.&lt;/li&gt;
&lt;li&gt;Scaling Data: Generating larger datasets for performance testing or stress testing, ensuring the system can handle real-world conditions at scale.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By generating synthetic data tailored to your needs, you can simulate realistic conditions for developing, testing, and optimizing applications without relying on sensitive or hard-to-acquire datasets.&lt;/p&gt;

&lt;p&gt;If you like the article, please share it with your friends and colleagues. You can connect with &lt;a href="https://www.linkedin.com/in/rahulbhave/" rel="noopener noreferrer"&gt;me&lt;/a&gt; on LinkedIn to discuss any further ideas.&lt;/p&gt;




</description>
      <category>testing</category>
      <category>python</category>
      <category>faker</category>
      <category>pandas</category>
    </item>
    <item>
      <title>Automating Data Builds with dbt and GitHub Actions</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Sun, 23 Apr 2023 09:45:44 +0000</pubDate>
      <link>https://dev.to/rahulbhave/automating-data-builds-with-dbt-and-github-actions-m3o</link>
      <guid>https://dev.to/rahulbhave/automating-data-builds-with-dbt-and-github-actions-m3o</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction:&lt;/strong&gt;&lt;br&gt;
As more and more organizations rely on data to drive their decision-making processes, it's becoming increasingly important to ensure that this data is accurate, consistent, and up-to-date. This is where dbt (data build tool) comes in – it's a popular open-source tool that's designed to help data analysts and engineers build, test, and maintain data pipelines.&lt;/p&gt;

&lt;p&gt;In this blog post, I will show you how to use GitHub Actions to automate data builds with dbt, along with one custom test. Specifically, I will walk you through the YAML file provided in the &lt;a href="https://github.com/rahul-bhave/dbt-work/blob/main/.github/workflows/main.yml" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;, which sets up a CI/CD pipeline for dbt. This is not a production-grade YAML file, but it should be helpful for building a conceptual understanding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is YAML?&lt;/strong&gt;&lt;br&gt;
Before we dive into the YAML file, let's take a step back and define what YAML is. YAML is a human-readable data serialization language that's often used for configuration files in software development. It's similar to JSON in that it uses key-value pairs to represent data, but it's designed to be more readable and easier to work with.&lt;/p&gt;

&lt;p&gt;As a DevOps engineer, you'll likely be working with YAML files on a regular basis to configure CI/CD pipelines, automate deployments, and manage infrastructure as code. Understanding how to write and work with YAML files is a crucial skill for anyone in this field.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;YAML File Overview:&lt;/strong&gt;&lt;br&gt;
Now, let's take a closer look at the YAML file provided in the GitHub repository. This file is a GitHub Actions workflow that's designed to run dbt whenever changes are pushed to a specific branch in a GitHub repository.&lt;/p&gt;

&lt;p&gt;The first section of the file defines the name of the workflow ("ci-test") and specifies when it should run. In this case, it's set to run whenever changes are pushed to the "main" branch of the repository or a pull request targets that branch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: ci-test

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
env:
   DBT_PROFILES_DIR: ./

   DBT_POSTGRES_PW: ${{ secrets.DBT_POSTGRES_PW }}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next section of the file defines the job that will be run by the workflow. In this case, there's a single job called "test" that runs on the latest version of Ubuntu, with a Postgres 14 service container for dbt to connect to. The steps for this job include checking out the repository, setting up Python version 3.9, installing dependencies (including dbt), and running dbt's debug, seed, run, and test commands.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jobs:

  test:
    name: Test
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: jaffle_shop
          POSTGRES_SCHEMA: dbt_alice
          POSTGRES_THREAD: 4
        ports:
          - 5432:5432
        options: &amp;gt;-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:

    - uses: actions/checkout@v2
    - name: Set up Python 3.9
      uses: actions/setup-python@v2
      with:
        python-version: 3.9
    - name: Install dependencies
      run: |
        cd jaffle_shop/
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install dbt-postgres
        pip install dbt-core
        pip install pytest
        dbt debug
        dbt seed
        dbt run
        dbt test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, an additional step runs after the dbt commands. This step runs tests using the pytest framework to check the results with assertions. You can add tests in this step as per your model or business logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- name: Run tests
      run: |
         cd jaffle_shop/
         python -m pytest tests/functional/test_example_failing.py -sv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
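&lt;p&gt;As a sketch of what such a pytest check can look like, the snippet below stubs out the warehouse query so that only the shape of the assertions is shown. The file name, function names, and the stubbed row count are hypothetical; a real test would query the Postgres service that &lt;code&gt;dbt run&lt;/code&gt; just populated.&lt;/p&gt;

```python
# tests/functional/test_row_counts.py -- a hypothetical pytest-style check.
# A real test would execute a query such as
#   SELECT count(*) FROM dbt_alice.orders
# against the Postgres warehouse built by `dbt run`.

def fetch_order_count():
    # Stubbed result standing in for the warehouse query
    return 99

def test_orders_table_is_populated():
    count = fetch_order_count()
    assert count != 0, 'dbt run should have produced at least one order row'
```

&lt;p&gt;Keeping assertions about business logic in pytest, separate from dbt's own schema tests, makes failures easier to attribute to either the model or the pipeline.&lt;/p&gt;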



&lt;p&gt;Final output will look like below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhsfwv1540osobo4ijhn6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhsfwv1540osobo4ijhn6.png" alt="Image description" width="800" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benefits of Automating Data Builds with dbt and GitHub Actions:&lt;/strong&gt;&lt;br&gt;
By automating the data build process with dbt and GitHub Actions, you can achieve a number of benefits, including:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Increased efficiency:&lt;/strong&gt;&lt;br&gt;
Automating the build process can save time and resources, as it eliminates the need for manual intervention and reduces the risk of human error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Improved accuracy:&lt;/strong&gt;&lt;br&gt;
By running tests and checks automatically, you can ensure that your data is accurate and consistent, which can lead to better decision-making.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Better collaboration:&lt;/strong&gt;&lt;br&gt;
By using GitHub Actions, you can collaborate more easily with other members of your team, as everyone can see the status of the build process and make changes as needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Increased transparency:&lt;/strong&gt; &lt;br&gt;
By automating the build process, you can create a more transparent and auditable data pipeline, which can be important for regulatory compliance and data governance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;
In conclusion, automating data builds with dbt and GitHub Actions can be a powerful tool for data analysts and engineers who want to ensure that their data is accurate, consistent, and up-to-date. By understanding how to write and work with YAML files, you can set up a CI/CD pipeline that automates the build process and saves time and resources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;br&gt;
This is just an attempt to show how dbt builds and tests can be set up. The example YAML file can still be improved, for example with better naming conventions and wider use of GitHub secrets. If you are planning to use this file in your daily builds, please modify it accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References:&lt;/strong&gt;&lt;br&gt;
To set up dbt, the dbt test data, dbt custom tests, and the Postgres service, I found the following references useful:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;To know more about dbt you can refer &lt;a href="https://docs.getdbt.com/docs/introduction" rel="noopener noreferrer"&gt;dbt introduction&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To setup the dbt test data used for this blog you can refer this &lt;a href="https://github.com/dbt-labs/jaffle_shop" rel="noopener noreferrer"&gt;Test data setup&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Creating a &lt;a href="https://docs.github.com/en/actions/using-containerized-services/creating-postgresql-service-containers" rel="noopener noreferrer"&gt;Postgres service&lt;/a&gt; using GitHub Actions&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Example of &lt;a href="https://github.com/dbt-labs/dbt-tests-adapter-custom-tests" rel="noopener noreferrer"&gt;dbt custom tests&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>dbt</category>
      <category>githubactions</category>
      <category>cicd</category>
      <category>testing</category>
    </item>
    <item>
      <title>Building a Solid Foundation: Best Practices for Test Automation Architecture in Microservices Testing</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Sun, 09 Apr 2023 11:26:15 +0000</pubDate>
      <link>https://dev.to/rahulbhave/building-a-solid-foundation-best-practices-for-test-automation-architecture-in-microservices-testing-24b0</link>
      <guid>https://dev.to/rahulbhave/building-a-solid-foundation-best-practices-for-test-automation-architecture-in-microservices-testing-24b0</guid>
      <description>&lt;p&gt;As software applications become more complex and distributed, the adoption of microservices architecture has become increasingly popular. However, testing microservices presents unique challenges, including ensuring the functionality of each individual service, as well as the integration of multiple services. To address these challenges, it is crucial to have a solid test automation architecture in place.&lt;/p&gt;

&lt;p&gt;In this blog post, we'll explore some best practices for test automation architecture in microservices testing and provide code examples wherever possible to illustrate each practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Testing Considerations:
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Deciding on the appropriate testing framework:
&lt;/h4&gt;

&lt;p&gt;Selecting the right testing framework is crucial for the success of test automation in microservices testing. There are several testing frameworks available, such as JUnit, TestNG, pytest and Spock, to name a few. When selecting a framework, it is important to consider factors such as the programming language used in the microservices, the level of integration required, and the team's familiarity with the framework.&lt;/p&gt;

&lt;p&gt;For example, let's say we have a microservice that is written in Java, and we want to perform integration testing between this microservice and another microservice. We can use the TestNG framework to write integration tests, as shown in the following code snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@Test
public void testIntegration() {
  // API calls to the other microservice
  // assert expected response is returned
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Integrating with CI/CD pipeline:
&lt;/h4&gt;

&lt;p&gt;Test scripts should be created for each microservice to ensure their individual functionality. Additionally, it is important to integrate these test scripts with the CI/CD pipeline to enable continuous testing. This can be done using tools like Jenkins, CircleCI, or GitHub Actions, which can be configured to run the tests automatically whenever changes are made to the microservices.&lt;/p&gt;

&lt;p&gt;Here's an example of how we can integrate a microservice's test scripts with Jenkins:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pipeline {
  agent any

  stages {
    stage('Build') {
      steps {
        // build the microservice
      }
    }
    stage('Test') {
      steps {
        // run the microservice's test scripts
      }
    }
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Leveraging containerization:
&lt;/h4&gt;

&lt;p&gt;Microservices are often deployed in containers, making it easier to manage and scale them. It is also beneficial to use containers for test environment management, as they provide a consistent and reproducible testing environment.&lt;/p&gt;

&lt;p&gt;For example, we can use Docker to create a container for a microservice's test environment, as shown in the following Dockerfile:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM openjdk:8-jre-slim

WORKDIR /app

COPY target/microservice.jar .

CMD ["java", "-jar", "microservice.jar"]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Implementing test data management strategies:
&lt;/h4&gt;

&lt;p&gt;Test data management is critical for ensuring the accuracy of tests. It is important to have a strategy in place for creating, managing, and maintaining test data.&lt;/p&gt;

&lt;p&gt;For example, let's say we have a microservice that requires user authentication. We can use a test data management tool like Faker to generate random user data, as shown in the following code snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User user = new User();
user.setUsername(Faker.instance().name().username());
user.setPassword(Faker.instance().internet().password());

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Utilizing mocking and stubbing techniques:
&lt;/h4&gt;

&lt;p&gt;Mocking and stubbing techniques are essential for testing microservices in isolation. They allow us to create mock or stub objects that mimic the behavior of external dependencies, enabling us to test individual microservices without relying on the functionality of other services.&lt;/p&gt;

&lt;p&gt;For example, let's say we have a microservice that depends on an external email service. We can use a mocking framework like Mockito to create tests. The example code snippet below demonstrates the use of Mockito for mocking an external email service in a microservice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;public class EmailServiceTest {

  @Mock
  private EmailClient emailClient;

  @InjectMocks
  private EmailService emailService;

  @BeforeEach
  public void setUp() {
    MockitoAnnotations.initMocks(this);
  }

  @Test
  public void testSendEmail() {
    // create a mock email response
    EmailResponse emailResponse = new EmailResponse();
    emailResponse.setStatus(200);
    emailResponse.setMessage("Email sent successfully");

    // configure the mock email client to return the mock email response
    when(emailClient.sendEmail(any(EmailRequest.class))).thenReturn(emailResponse);

    // call the email service to send the email
    EmailRequest emailRequest = new EmailRequest();
    emailRequest.setTo("recipient@example.com");
    emailRequest.setFrom("sender@example.com");
    emailRequest.setSubject("Test email");
    emailRequest.setBody("This is a test email");
    EmailResponse response = emailService.sendEmail(emailRequest);

    // verify that the email was sent successfully
    assertEquals(200, response.getStatus());
    assertEquals("Email sent successfully", response.getMessage());
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we are testing an EmailService that depends on an external EmailClient to send emails. We use Mockito to create a mock EmailClient, and configure it to return a mock EmailResponse when the sendEmail() method is called with any EmailRequest. We then call the EmailService's sendEmail() method with a test EmailRequest, and verify that the EmailResponse returned by the EmailService matches the mock EmailResponse.&lt;/p&gt;

&lt;h4&gt;
  
  
  Using Contract tests:
&lt;/h4&gt;

&lt;p&gt;Contract testing is a testing approach that focuses on testing the interactions between services in a microservices architecture. In this approach, the contracts that define these interactions are first defined and agreed upon by the teams responsible for the services. The contract specifies the input, output, and behavior of the service. Once the contract is agreed upon, tests are written to verify that each service is conforming to its contract. This approach can help catch issues early on in the development process and can help ensure that services can interoperate smoothly.&lt;/p&gt;

&lt;p&gt;Here's an example of how contract testing can work in practice:&lt;/p&gt;

&lt;p&gt;Let's say you have a microservices architecture where Service A sends a request to Service B and expects a response. The contract for this interaction might include the expected format of the request, the expected format of the response, and any constraints on the behavior of Service B (such as response time or error handling).&lt;/p&gt;

&lt;p&gt;To test this contract, you might create a set of tests that simulate requests from Service A to Service B and verify that the responses are in the expected format and meet the agreed-upon behavior constraints. These tests could be written in a testing framework like Pact, which is specifically designed for contract testing in microservices architectures.&lt;/p&gt;

&lt;p&gt;If one of these tests fails, it indicates that there's a problem with either Service A or Service B not conforming to the contract. By catching these issues early on in the development process, you can avoid more complex and time-consuming debugging later on. Here's an example of how you could write a consumer contract test using the Pact framework in Java:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import au.com.dius.pact.consumer.MockServer;
import au.com.dius.pact.consumer.dsl.PactDslWithProvider;
import au.com.dius.pact.consumer.junit5.PactConsumerTestExt;
import au.com.dius.pact.consumer.junit5.PactTestFor;
import au.com.dius.pact.core.model.annotations.Pact;
import au.com.dius.pact.core.model.RequestResponsePact;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

import static org.junit.jupiter.api.Assertions.assertEquals;

@ExtendWith(PactConsumerTestExt.class)
@PactTestFor(providerName = "example-provider")
public class ExampleConsumerPactTest {

    @Pact(consumer = "example-consumer")
    public RequestResponsePact createPact(PactDslWithProvider builder) {
        return builder
                .given("a request for a user with id 123")
                .uponReceiving("a request for a user with id 123")
                    .path("/users/123")
                    .method("GET")
                .willRespondWith()
                    .status(HttpStatus.OK.value())
                    .headers(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                    .body("{\"id\": 123, \"name\": \"John Smith\"}")
                .toPact();
    }

    @Test
    @PactTestFor(pactMethod = "createPact")
    public void shouldReturnUserWithId123(MockServer mockServer) {
        // Arrange
        RestTemplate restTemplate = new RestTemplate();
        String url = mockServer.getUrl() + "/users/123";

        // Act
        ResponseEntity&amp;lt;String&amp;gt; response = restTemplate.getForEntity(url, String.class);

        // Assert
        assertEquals(HttpStatus.OK, response.getStatusCode());
        assertEquals(MediaType.APPLICATION_JSON_VALUE, response.getHeaders().getContentType().toString());
        assertEquals("{\"id\": 123, \"name\": \"John Smith\"}", response.getBody());
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, we define a createPact method which creates a Pact for a request to the /users/123 endpoint of the example-provider. The Pact specifies that the response should have an HTTP 200 status code, a Content-Type header of application/json, and a body containing a JSON object with an id field of 123 and a name field of "John Smith".&lt;/p&gt;

&lt;p&gt;The Test method shouldReturnUserWithId123 uses the RestTemplate to send a request to the endpoint specified by the mockServer, which is provided by the Pact framework. The test then asserts that the response received from the server matches the expectations specified in the Pact.&lt;/p&gt;

&lt;p&gt;In conclusion, microservices testing can be a complex and challenging task due to the distributed nature of microservices and their dependencies on external services. However, by following the best practices outlined in this blog, such as isolating dependencies, leveraging containerization, and using appropriate testing frameworks and tools, you can ensure that your microservices are thoroughly tested and reliable. Additionally, automating your testing process can greatly improve the speed and efficiency of your testing efforts, enabling you to deliver high-quality microservices to your users more quickly. Remember, the key to successful microservices testing is to approach it with a comprehensive and well-planned strategy, and to continuously refine and improve your testing process as your microservices architecture evolves.&lt;/p&gt;

&lt;p&gt;Here are some references that you may find helpful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fowler, M. (2014). Microservices: a definition of this new architectural term. &lt;a href="https://martinfowler.com/articles/microservices.html" rel="noopener noreferrer"&gt;https://martinfowler.com/articles/microservices.html&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Newman, S. (2015). Building Microservices: Designing Fine-Grained Systems. O'Reilly Media.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Newman, S. (2019). Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith. O'Reilly Media.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PACT. (n.d.). &lt;a href="https://pact.io/" rel="noopener noreferrer"&gt;https://pact.io/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I hope these references help you in your learning journey!&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>testing</category>
      <category>pact</category>
      <category>qa</category>
    </item>
    <item>
      <title>Write a test to check the code written to compare Teradata and snowflake table data using python mock</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Wed, 22 Mar 2023 09:42:37 +0000</pubDate>
      <link>https://dev.to/rahulbhave/write-a-test-to-compare-teradata-and-snowflake-table-using-python-mock-5a56</link>
      <guid>https://dev.to/rahulbhave/write-a-test-to-compare-teradata-and-snowflake-table-using-python-mock-5a56</guid>
      <description>&lt;p&gt;Database infrastructure is a critical component of any data-driven organization. It is important to ensure that the data in your database is accurate and up to date. In this article, we will discuss how to use Python mocking and patching to compare data in two Teradata and Snowflake table, in case infrastructure is not available for testing and you want to check your code is working fine. We will be leveraging the Python unittest module to write our tests and the unittest.mock module to mock the behavior of our database objects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python code for comparing two tables in Teradata and Snowflake:
&lt;/h2&gt;

&lt;p&gt;To demonstrate this concept, I have written a simple Python script that compares a table in Teradata with a table in Snowflake. The script takes the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teradata database name&lt;/li&gt;
&lt;li&gt;Teradata table name&lt;/li&gt;
&lt;li&gt;Snowflake database name&lt;/li&gt;
&lt;li&gt;Snowflake table name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After reading the data from each table into a pandas DataFrame, the script compares the two DataFrames and prints the number of rows that differ.&lt;/p&gt;

&lt;p&gt;The script is available &lt;a href="https://github.com/rahul-bhave/mock_teradata_snowflake_db/tree/main/compare_tables" rel="noopener noreferrer"&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;
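&lt;p&gt;The comparison step can be sketched as follows; note that the function name, column names, and sample data here are illustrative, not taken from the repo:&lt;/p&gt;

```python
import pandas as pd

def count_different_rows(td_df: pd.DataFrame, sf_df: pd.DataFrame) -> int:
    """Count rows that appear in only one of the two frames."""
    # An outer merge with indicator=True labels every row 'both',
    # 'left_only' (Teradata only), or 'right_only' (Snowflake only).
    merged = td_df.merge(sf_df, how="outer", indicator=True)
    return int((merged["_merge"] != "both").sum())

teradata_df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
snowflake_df = pd.DataFrame({"id": [1, 2, 4], "name": ["a", "b", "d"]})
print(count_different_rows(teradata_df, snowflake_df))  # 2
```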

&lt;h2&gt;
  
  
  Writing the Tests:
&lt;/h2&gt;

&lt;p&gt;The first step is to write the tests for our script. We will use the Python unittest module, and the tests will live in a separate file called test_compare_tables.py.&lt;br&gt;
There are three tests that we will write:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Test that the script can connect to Teradata.&lt;/li&gt;
&lt;li&gt;Test that the script can connect to Snowflake.&lt;/li&gt;
&lt;li&gt;Test that the script can compare two tables.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Test that the script can connect to Teradata
&lt;/h2&gt;

&lt;p&gt;This test checks whether the script can connect to Teradata. We use the unittest.mock module to mock the behavior of the Teradata database object; the mock returns a pandas DataFrame with some sample data. The test then verifies that the script can connect to Teradata and return the data in the DataFrame.&lt;/p&gt;
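&lt;p&gt;A condensed sketch of such a test is shown below. The reader function is an illustrative stand-in for the script's Teradata access code, and the mock plays the role of a DB-API driver entry point such as teradatasql.connect, so no live Teradata instance is needed:&lt;/p&gt;

```python
import unittest
from unittest.mock import MagicMock

import pandas as pd

def read_table(connect, database, table):
    """Illustrative helper: fetch a table through a DB-API style driver."""
    conn = connect()
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {database}.{table}")
    cols = [d[0] for d in cur.description]
    return pd.DataFrame(cur.fetchall(), columns=cols)

class TestConnectToTeradata(unittest.TestCase):
    def test_connect_to_teradata(self):
        # The mock stands in for the real driver and returns sample rows.
        mock_connect = MagicMock()
        cursor = mock_connect.return_value.cursor.return_value
        cursor.description = [("id",), ("name",)]
        cursor.fetchall.return_value = [(1, "a"), (2, "b")]

        df = read_table(mock_connect, "sales_db", "orders")

        self.assertEqual(list(df.columns), ["id", "name"])
        self.assertEqual(len(df), 2)
        cursor.execute.assert_called_once_with("SELECT * FROM sales_db.orders")
```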
&lt;h2&gt;
  
  
  Test that the script can connect to Snowflake
&lt;/h2&gt;

&lt;p&gt;This test checks whether the script can connect to Snowflake. We use the unittest.mock module to mock the behavior of the Snowflake database object; the mock returns a pandas DataFrame with some sample data. The test then verifies that the script can connect to Snowflake and return the data in the DataFrame.&lt;/p&gt;
&lt;h2&gt;
  
  
  Test that the script can compare two tables
&lt;/h2&gt;

&lt;p&gt;This test checks whether the script can compare two tables. We use the unittest.mock module to mock the behavior of both the Teradata and Snowflake database objects; the mocks return pandas DataFrames with some sample data. The test then verifies that the script can compare the two DataFrames and return the number of rows that differ.&lt;/p&gt;

&lt;p&gt;You can find the test file &lt;a href="https://github.com/rahul-bhave/mock_teradata_snowflake_db/blob/main/tests/test_compare_tables.py" rel="noopener noreferrer"&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;
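&lt;p&gt;A condensed sketch of the compare-tables test; the mocked reader and the sample data are chosen purely for illustration:&lt;/p&gt;

```python
import unittest
from unittest.mock import MagicMock

import pandas as pd

class TestCompareTables(unittest.TestCase):
    def test_compare_tables(self):
        # One mock stands in for both database readers: side_effect returns
        # the Teradata frame on the first call, the Snowflake frame on the second.
        read_table = MagicMock(side_effect=[
            pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]}),
            pd.DataFrame({"id": [1, 2, 9], "name": ["a", "b", "z"]}),
        ])

        td_df = read_table("teradata_db", "orders")
        sf_df = read_table("snowflake_db", "orders")

        # Rows flagged anything other than 'both' exist on one side only.
        merged = td_df.merge(sf_df, how="outer", indicator=True)
        self.assertEqual(int((merged["_merge"] != "both").sum()), 2)
```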
&lt;h2&gt;
  
  
  How to test code used in this blog:
&lt;/h2&gt;

&lt;p&gt;To run the tests yourself, complete the following steps:&lt;br&gt;
I. Install Python 3.8 or higher.&lt;br&gt;
II. Create a virtual environment using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 -m venv venv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;III. Activate the virtual environment using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;IV. Install the required packages using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;V. Run the tests as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ python -m pytest tests/test_compare_tables.py -sv
============================= test session starts =============================
platform win32 -- Python 3.9.5, pytest-7.2.2, pluggy-1.0.0 -- C:\Users\rahulbhave\code\mock_teradata_snowflake_db\venv3\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\rahulbhave\code\mock_teradata_snowflake_db
collecting ... collected 3 items

tests/test_compare_tables.py::TestConnectToTeradata::test_connect_to_teradata PASSED
tests/test_compare_tables.py::TestConnectToSnowflake::test_connect_to_snowflake PASSED
tests/test_compare_tables.py::TestCompareTables::test_compare_tables PASSED

============================== 3 passed in 2.28s ==============================
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion:
&lt;/h2&gt;

&lt;p&gt;I hope you found this article useful. We discussed how to use Python mocking and patching to compare a Teradata table with a Snowflake table when the real infrastructure is not available for testing and you still want to verify that your code works. We used the Python unittest module to write our tests and the unittest.mock module to mock the behavior of our database objects.&lt;/p&gt;

&lt;p&gt;The code for this article is available &lt;a href="https://github.com/rahul-bhave/mock_teradata_snowflake_db" rel="noopener noreferrer"&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you liked the article, please share it with your friends and colleagues, and don't forget to give the GitHub repo a star.&lt;/p&gt;

&lt;p&gt;You can &lt;a href="https://www.linkedin.com/in/rahulbhave/" rel="noopener noreferrer"&gt;connect&lt;/a&gt; with me on LinkedIn.&lt;/p&gt;

</description>
      <category>python</category>
      <category>testing</category>
      <category>mock</category>
      <category>pytest</category>
    </item>
    <item>
      <title>Selenium Tests using GitHub action</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Mon, 06 Dec 2021 17:37:42 +0000</pubDate>
      <link>https://dev.to/rahulbhave/running-selenium-tests-using-github-action-40hb</link>
      <guid>https://dev.to/rahulbhave/running-selenium-tests-using-github-action-40hb</guid>
      <description>&lt;h3&gt;
  
  
  My Workflow
&lt;/h3&gt;

&lt;p&gt;This workflow sets up a GitHub Action that runs Selenium tests in a headless Chrome browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitHub action steps:
&lt;/h3&gt;

&lt;p&gt;I. Install dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Run Python Test
on:
  push:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install Python 3
        uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: Install dependencies
        run: |
          set -ex
          wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
          sudo apt install ./google-chrome-stable_current_amd64.deb
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install msedge-selenium-tools
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;II. Run the tests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Execute Test
      - name: Run test 
        run: python -m pytest tests/test_form.py 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Submission Category:
&lt;/h3&gt;

&lt;p&gt;DIY Deployments&lt;/p&gt;

&lt;h3&gt;
  
  
  Yaml File:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Run Python Test
on:
  push:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install Python 3
        uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: Install dependencies
        run: |
          set -ex
          wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
          sudo apt install ./google-chrome-stable_current_amd64.deb
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install msedge-selenium-tools
      # Execute Test
      - name: Run test
        run: python -m pytest tests/test_form.py

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>actionshackathon21</category>
      <category>productivity</category>
      <category>devops</category>
      <category>testing</category>
    </item>
    <item>
      <title>Running Tests against LocalStack using GitHub action</title>
      <dc:creator>rahulbhave</dc:creator>
      <pubDate>Mon, 06 Dec 2021 06:30:56 +0000</pubDate>
      <link>https://dev.to/rahulbhave/running-tests-against-localstack-using-github-action-52h0</link>
      <guid>https://dev.to/rahulbhave/running-tests-against-localstack-using-github-action-52h0</guid>
      <description>&lt;h3&gt;
  
  
  My Workflow
&lt;/h3&gt;

&lt;p&gt;I designed this workflow to run tests against &lt;a href="https://localstack.cloud/" rel="noopener noreferrer"&gt;LocalStack&lt;/a&gt;. I hope this GitHub Action helps developers and testers run their test cases against LocalStack instead of the AWS cloud, which can be costly in some cases. For the sample Lambda tests, I referred to this &lt;a href="https://github.com/ciaranevans/localstack_and_pytest_1" rel="noopener noreferrer"&gt;repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This GitHub Action flow is very simple and consists of the following three steps:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1&lt;/strong&gt;&lt;br&gt;
Run the LocalStack services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2&lt;/strong&gt;&lt;br&gt;
Install Python and the other test dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3&lt;/strong&gt;&lt;br&gt;
Run the tests against LocalStack.&lt;/p&gt;

&lt;p&gt;The YAML file and a link to the code are given below.&lt;/p&gt;
&lt;h3&gt;
  
  
  Submission Category:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;DIY Deployments&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Yaml File or Link to Code
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Yaml file&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name: Localstack CI
on: push
jobs:
  localstack:
    runs-on: ubuntu-latest
    services:
      localstack:
        image: localstack/localstack:latest
        env:
          SERVICES: lambda
          DEFAULT_REGION: eu-west-1
          AWS_ACCESS_KEY_ID: localkey
          AWS_SECRET_ACCESS_KEY: localsecret
        ports:
          - 4566:4566
          - 4571:4571
    steps:
      - uses: actions/checkout@v2
      - name: Install Python 3.8
        uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: Install dependencies
        run: |
          pip3 install --upgrade pip==20.0.1
          pip3 install -r requirements.txt
      # Execute Tests lambda
      - name: Run test for sample lambda
        run: |
          cd lambda
          pytest -sv

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/rahul-bhave/localstack-utilities/blob/2780e0f4219d285ebe0c48de9e7e0c2da2d9f4ba/README.md" rel="noopener noreferrer"&gt;Link to GitHub Repo&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional Resources / Info
&lt;/h3&gt;

&lt;p&gt;I referred to the following projects while creating this GitHub Action:&lt;br&gt;
&lt;a href="https://localstack.cloud/" rel="noopener noreferrer"&gt;LocalStack&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/ciaranevans/localstack_and_pytest_1" rel="noopener noreferrer"&gt;Testing Lambda with Pytest&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
