Marco Baturan

Generating BDD Test Cases Automatically with Google Gemini and React

Behavior-Driven Development (BDD) has transformed how modern software teams define and verify features. By focusing on behavior rather than just implementation, BDD puts developers, QA engineers, and business stakeholders on the same page. However, the bridge between a raw business requirement and a technical Gherkin test case often feels like a manual, repetitive chore.

In this tutorial, we will explore how to eliminate this friction by building an AI-powered BDD tool. We will combine the power of Google Gemini 2.0 Flash, React, and Vercel Serverless Functions to generate high-quality Gherkin scenarios in seconds.

The Problem: Writing Gherkin Is Tedious
The BDD process typically looks like this: a Product Owner describes a feature, and then a QA engineer or Developer must translate that description into Gherkin syntax. While Gherkin is human-readable and powerful for automation, the actual process of writing it manually is prone to several "invisible" costs:

Syntactic Overhead: Even for experts, writing Given, When, And, Then blocks for every single scenario takes time. It’s a lot of typing for relatively simple logic.
Scenario Discovery: Brainstorming the "Happy Path" is easy. But remembering to document the negative cases (invalid emails, expired tokens) and the edge cases (null values, maximum character limits) requires significant mental focus.
Inconsistency: Different engineers might use slightly different phrasing for the same actions ("When I log in" vs "When the user logs in"), which can confuse automated test runners like Cucumber or Behave later in the pipeline.
Developers often find themselves staring at a blank .feature file, copy-pasting existing scenarios from other projects just to save time. This leads to "thin" test coverage where only the most obvious paths are documented. What if we could give an AI a simple sentence—"The user can reset their password via email"—and get back a structured suite of tests covering the happy path, failures, and those elusive edge cases?

What We Are Building
We are building the Gherkin Test Case Generator. It’s a clean, modern web application designed for one purpose: to take a functional description and turn it into a BDD-ready Gherkin file.

The application follows a simple but powerful flow:

Input Phase: The user types a natural language description into a "Functional Description" panel.
Processing Phase: The request is sent through a secure serverless function to Google’s Gemini 2.0 Flash model.
Refinement Phase: The AI, prompted as a BDD expert, identifies 3-5 distinct scenarios including happy paths, negative paths, and edge cases.
Output Phase: The results are streamed back to the frontend and displayed in a clean, monospaced view with a "Copy to Clipboard" feature.
This tool isn't meant to replace the QA engineer; it's meant to be their copilot. It provides a 90% complete foundation that can then be reviewed and refined, saving hours of manual drafting.
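To make that concrete, a description like "The user can reset their password via email" might yield output along these lines (illustrative only, not captured model output):

```gherkin
Feature: Password reset via email

  Scenario: Successful password reset (happy path)
    Given a registered user with the email "user@example.com"
    When the user requests a password reset
    Then a reset link is sent to "user@example.com"

  Scenario: Reset requested for an unknown email (negative case)
    Given no account exists for "stranger@example.com"
    When the user requests a password reset for that email
    Then no reset link is sent
    And a generic confirmation message is shown

  Scenario: Expired reset link (edge case)
    Given a user received a reset link more than 24 hours ago
    When the user opens the link
    Then an "expired link" message is shown
```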

Tech Stack
To build a tool that feels fast, stays secure, and looks premium, we chose a modern stack that handles most of the heavy lifting for us:

Frontend: React (Vite) We used React for its component-based architecture and state management. Vite was chosen as the build tool because it’s significantly faster than legacy options like Create React App (CRA). In a world where developer experience (DX) matters, Vite’s instant hot module replacement is a game-changer.
Styling: Vanilla CSS & Tailwind (CDN) For this project, we prioritize visual excellence without the overhead of complex build steps. Tailwind CSS allows us to implement "glassmorphism" effects (blurred backgrounds, subtle borders) and dark mode with minimal effort.
AI Engine: Google Gemini 2.0 Flash While many developers default to GPT-4, Gemini 2.0 Flash is a standout choice for this use case. It is incredibly fast, features a gigantic context window, and is free for reasonable hobbyist usage. The "Flash" model is specifically optimized for low-latency tasks like text generation.
Backend / Hosting: Vercel Vercel provides the infrastructure for both our static frontend and our API layer. We use Vercel Serverless Functions (Node.js) to communicate with Google’s API. This ensures that our private API keys never touch the user’s browser.
Project Setup
The project is organized into specialized, single-purpose modules. We avoid monolithic files to ensure the code remains maintainable and easy to follow.

Folder Structure
gherkin-generator/
├── api/
│   └── generate.js          # The secure serverless bridge to Gemini
├── src/
│   ├── components/
│   │   ├── InputPanel.jsx   # Handles user textarea and button logic
│   │   └── OutputPanel.jsx  # Displays Gherkin and handles clipboard
│   ├── App.jsx              # Central state management and UI layout
│   └── main.jsx             # React rendering entry point
├── index.html               # HTML shell (including Tailwind CDN)
├── vite.config.js           # Configuration for local dev proxy
└── package.json             # Core dependency manifest
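The vite.config.js above exists mainly to make local development work: in production Vercel serves the /api routes itself, but the Vite dev server does not. A minimal sketch of such a proxy config (the target port is an assumption — point it at wherever `vercel dev` runs on your machine):

```javascript
// vite.config.js — dev-only proxy so fetch('/api/generate') resolves locally.
// The localhost:3000 target is an assumption; adjust to your `vercel dev` port.
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  server: {
    proxy: {
      '/api': 'http://localhost:3000',
    },
  },
});
```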
Dependencies
Our package.json reflects a lightweight, modern React 19 environment. We deliberately keep external dependencies to a minimum to reduce the application’s bundle size.

{
  "name": "gherkin-generator",
  "private": true,
  "version": "1.0.0",
  "type": "module",
  "dependencies": {
    "react": "^19.0.0",
    "react-dom": "^19.0.0"
  },
  "devDependencies": {
    "@vitejs/plugin-react": "^4.3.0",
    "vite": "^6.0.0"
  }
}
Connecting to the Gemini API (Vercel Serverless)
One of the most common mistakes in early AI development is putting API keys directly into the frontend code. This is a massive security risk, as anyone can inspect your website and steal your credits.

To solve this, we create a server-side route at /api/generate.js. When the user clicks "Generate", our React app sends the description to this endpoint. The endpoint then adds our secret GEMINI_API_KEY (stored securely in Vercel’s environment settings) and calls Google’s servers.

Here is the implementation of our serverless function:

/**
 * Vercel Serverless Function: /api/generate
 * Acts as a secure proxy to talk to the Gemini API.
 */

export default async function handler(req, res) {
    // 1. Security check: Only allow POST requests
    if (req.method !== 'POST') {
        return res.status(405).json({ error: 'Method not allowed' });
    }

    // 2. Input validation
    const { description } = req.body || {};
    if (!description || !description.trim()) {
        return res.status(400).json({ error: 'No description provided' });
    }

    // 3. Load secret key from environment variables
    const apiKey = process.env.GEMINI_API_KEY;
    if (!apiKey) {
        return res.status(500).json({ error: 'Server misconfiguration: missing API key.' });
    }

    // 4. Construct the prompt (engineering the response)
    const prompt = `You are a QA engineer expert in BDD and Gherkin syntax.

Given the following functional description, generate between 3 and 5 test cases in Gherkin format.
Include: at least one happy path, one negative case, and one edge case.
Output only valid Gherkin. No explanations, no markdown fences, no extra text.

Functional description:
${description}`;

    try {
        // 5. Call the Google Gemini 2.0 Flash API
        const response = await fetch(
            `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=${apiKey}`,
            {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({
                    contents: [{ parts: [{ text: prompt }] }],
                    generationConfig: {
                        temperature: 0.4,
                        maxOutputTokens: 1024,
                    }
                })
            }
        );

        if (!response.ok) {
            throw new Error(`Gemini API error: ${response.statusText}`);
        }

        const data = await response.json();
        const gherkinResult = data.candidates?.[0]?.content?.parts?.[0]?.text || '';

        // 6. Return the raw text result to the React app
        return res.status(200).json({ result: gherkinResult });
    } catch (err) {
        console.error('Generation failed:', err);
        return res.status(502).json({ error: 'Failed to generate test cases. Please try again.' });
    }
}
The Prompt Engineering: Getting Clean Gherkin Output
Getting a Large Language Model (LLM) to output code instead of a chatty response is an art. If you ask Gemini "Write a Gherkin case for login", it might reply with: "Sure! Here is a Gherkin case... [Code Block]... Hope that helps!"

That "Hope that helps!" is noise that breaks our application. We use two specific techniques to ensure the output is ready for immediate use:

  1. Persona and Negative Constraints
    We explicitly tell Gemini who it is (You are a QA engineer expert) and what NOT to do (No explanations, no markdown fences). This ensures we get raw text without the "```gherkin" fences, which would be visible and ugly in our UI.

  2. Tuning the Temperature
    In the generationConfig, we set the temperature to 0.4.

A low temperature (0.0 - 0.2) would make the AI extremely repetitive and "stiff".
A high temperature (0.8 - 1.0) would make the AI creative, but it might start hallucinating non-existent BDD keywords or bizarre scenarios.
The "Goldilocks" zone (0.3 - 0.5) is perfect for technical tasks. It ensures the Gherkin syntax is strictly followed while allowing the model enough "creativity" to think of clever edge cases (like "What happens if the user tries to log in with a 300-character password?").
Rendering the Output in React
On the frontend, we need to handle the state of the request and display the result in a way that respects the Gherkin formatting.

Our App.jsx component manages the primary state:

const [input, setInput] = useState('');
const [output, setOutput] = useState('');
const [loading, setLoading] = useState(false);
const [error, setError] = useState('');

const handleGenerate = async () => {
    setLoading(true);
    // ... fetch logic from Section 5 ...
    setLoading(false);
};
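The elided fetch logic can be sketched as a small standalone helper, under the assumption that the serverless route from the previous section lives at /api/generate and returns { result } on success and { error } on failure:

```javascript
// Sketch of the fetch logic: posts the description to the serverless route
// and unwraps its JSON reply. fetchImpl is injectable purely so the helper
// is easy to unit-test outside a browser.
async function generateGherkin(description, fetchImpl = fetch) {
  const res = await fetchImpl('/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ description }),
  });
  const data = await res.json();
  if (!res.ok) {
    throw new Error(data.error || 'Generation failed');
  }
  return data.result;
}
```

Inside handleGenerate you would await this helper in a try/catch between the two setLoading calls, writing the returned text into output and any error message into error.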
We then pass this state down to the OutputPanel.jsx. This component is designed to feel like a code editor. We use a <pre> tag with whitespace-pre-wrap to ensure the indentation and line breaks generated by Gemini are preserved exactly.

// Simplified OutputPanel.jsx
import { useState } from 'react';

export default function OutputPanel({ output }) {
    const [copied, setCopied] = useState(false);

    const handleCopy = async () => {
        await navigator.clipboard.writeText(output);
        setCopied(true);
        setTimeout(() => setCopied(false), 2000);
    };

    return (
        <section className="bg-gray-900 border border-gray-800 p-5 rounded-xl">
            <div className="flex justify-between items-center mb-4">
                <h2 className="text-gray-300 font-bold">Generated Scenarios</h2>
                <button
                  onClick={handleCopy}
                  className="bg-brand-600 text-white px-4 py-2 rounded-lg text-sm"
                >
                  {copied ? 'Copied!' : 'Copy to Clipboard'}
                </button>
            </div>
            <pre className="text-green-300 font-mono text-sm leading-relaxed">
                {output}
            </pre>
        </section>
    );
}
The use of a green tint on the text (text-green-300) against a dark background gives the application a professional, "hacker-like" aesthetic that resonates well with developers.

Live Demo
Building in public is all about sharing the results. You can visit the live version of this app right now. It is deployed on Vercel and backed by the Gemini API in production.

🚀 Try the Gherkin Generator Live

What I Learned
Building this project from scratch was a masterclass in modern AI integration. Here are the five most honest lessons I walked away with:

AI Speed is User Experience: Using the "Flash" model was the right choice. Users hate waiting. When the Gherkin appears in under 2 seconds, the app feels like a tool. If it took 15 seconds, it would feel like a chore.
Prompt Engineering is Never Done: I initially didn't include "no markdown fences." The UI looked terrible because I was seeing "```gherkin" at the top. You have to be incredibly explicit with LLMs about the exact characters you want.
Sanitization is Key: Sometimes, Gemini still includes an intro line like "Here is your output:". In a production-grade app, I’d eventually need to write a post-processing function to scan the text and remove anything before the Feature: keyword.
The "Temperature" Trap: I started with temperature 0.7, thinking it would give better edge cases. Instead, it started suggesting test steps that didn't make sense for web apps. Turning it down to 0.4 made the output much more reliable.
Vercel Serverless is Magic: Setting up a backend to hide API keys used to take an hour of configuring Express and CORS. With Vercel, it’s a single file in the /api folder. It’s the fastest way to build full-stack AI apps today.
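As a concrete illustration of that sanitization point, a hypothetical post-processing helper (not something the deployed app does yet) could look like this:

```javascript
// Hypothetical cleanup helper: strip any chatty preamble such as
// "Here is your output:" by keeping only the text from the first
// "Feature:" keyword onwards. Falls back to trimming if none is found.
function sanitizeGherkin(text) {
  const idx = text.indexOf('Feature:');
  return idx === -1 ? text.trim() : text.slice(idx).trimEnd();
}
```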
If you are a developer looking to integrate AI into your workflow, don't start with complex agents. Start with a "helper" tool like this one. It solves a real problem, teaches you the basics of security and prompting, and provides immediate value to your team.

Built by GetDevWorks — https://www.getdevworks.com
