You have six years of automation experience and a Cucumber suite that takes forty minutes to run, and you still can't tell me what a single feature file actually tests without reading three layers of glue code.
I've been there. You're the senior automation engineer on a team that adopted BDD because "business stakeholders need to read the tests." Except those stakeholders stopped reading them after sprint two. Now you're maintaining a suite where every scenario is a small novel, every step definition calls three other step definitions, and the person who wrote the original framework left the company eighteen months ago.
Here's what nobody tells you about BDD before your Cucumber suite becomes a maintenance nightmare.
Thing One: Gherkin Is Not a Programming Language, But You're Treating It Like One
Your feature files look like this:
Scenario: User completes checkout with valid payment
  Given the user is logged in with email "test@example.com" and password "Password123!"
  And the user has 3 items in their cart
  And the user's shipping address is "123 Main St, Springfield, IL 62701"
  And the user's billing address matches their shipping address
  When the user selects "Credit Card" as payment method
  And the user enters card number "4111111111111111"
  And the user enters expiry date "12/28"
  And the user enters CVV "123"
  And the user clicks "Place Order"
  Then the order confirmation page is displayed
  And the order number is shown
  And the user receives an email confirmation
This is not BDD. This is a UI test script that happens to be written in Gherkin.
The original promise of BDD was that scenarios describe behavior, not implementation. But somewhere along the way, teams started treating Gherkin as a readable scripting language. Every parameter gets inlined. Every step becomes a concrete action. The feature file becomes a play-by-play of the UI interaction.
The result? When the checkout flow changes (and it will), you have to update every scenario that touches checkout. Not the step definitions. The feature files themselves. Because the UI details are embedded in the Gherkin, not abstracted behind it.
Here's what a better version looks like:
Scenario: User completes checkout with valid payment
  Given the user has items ready to purchase
  When the user pays with a valid credit card
  Then the order is confirmed
That's it. Three lines. The "how" lives in the step definition. The "what" lives in the feature file. When the payment provider changes, you update one step definition, not twenty scenarios.
Your step definition file should be the place where complexity lives. Your feature files should be so simple that a product manager could read them in a standup and nod along. If your Gherkin looks like a test script, you've already lost.
Thing Two: Your Step Definitions Are a Dependency Graph, Not a Library
Here's the pattern I see in every struggling Cucumber suite:
// step_definitions/checkout_steps.ts
import { Given, When, Then } from '@cucumber/cucumber';
import { loginAsUser } from './auth_steps';
import { addItemsToCart } from './cart_steps';
import { fillShippingAddress } from './shipping_steps';

Given('the user has items ready to purchase', async function () {
  await loginAsUser.call(this, 'test@example.com', 'Password123!');
  await addItemsToCart.call(this, 3);
  await fillShippingAddress.call(this, '123 Main St');
});
This is a dependency graph masquerading as a test framework. Every step definition imports other step definitions. The call order matters. The shared this context accumulates state across steps. If one step fails, the entire scenario is poisoned, and you can't tell whether the failure is in the step that failed or in the accumulated state from the three steps before it.
The counterargument: "But we need to reuse steps across scenarios. That's the whole point of BDD."
I agree with reuse. I disagree with how you're doing it.
The fix is to separate your page objects or service clients from your step definitions entirely. Your step definitions should be thin wrappers that call a shared domain layer. Not other step definitions.
// domain/checkout.ts
// Assumes Playwright: goto, fill, click and waitForURL are Playwright Page methods.
import type { Page } from 'playwright';

export class CheckoutFlow {
  constructor(private page: Page) {}

  // Logs in and gets the cart into a purchasable state.
  async prepareItemsForPurchase(): Promise<void> {
    await this.page.goto('/login');
    await this.page.fill('#email', 'test@example.com');
    await this.page.fill('#password', 'Password123!');
    await this.page.click('#login-button');
    await this.page.waitForURL('/dashboard');
    // ... add items, set address
  }

  // Selects the card option, enters card details, and places the order.
  async payWithCreditCard(): Promise<void> {
    await this.page.click('#credit-card-option');
    await this.page.fill('#card-number', '4111111111111111');
    // ... expiry date, CVV
    await this.page.click('#place-order');
  }
}
// step_definitions/checkout_steps.ts
import { Given, When } from '@cucumber/cucumber';
import { CheckoutFlow } from '../domain/checkout';

// Each step is a one-line hand-off to the domain layer.
Given('the user has items ready to purchase', async function () {
  this.checkout = new CheckoutFlow(this.page);
  await this.checkout.prepareItemsForPurchase();
});

When('the user pays with a valid credit card', async function () {
  await this.checkout.payWithCreditCard();
});
Now your step definitions are thin. They don't import each other. The only state they share is one domain object on the scenario context (this.checkout), and the behavior lives inside that object, not in a chain of nested step calls. If a step fails, the failure points at a single domain operation.
Your step definition file becomes a routing table, not a tangled web of imports. You can delete a step definition without worrying about breaking three other step definitions that depend on it.
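One side effect worth sketching: once the behavior lives in a domain object, you can exercise it without Cucumber or a browser. The version below is a minimal illustration, not the setup from the suite above. PageLike and RecordingPage are hypothetical names, and the class is trimmed to the payment method and typed against a narrow interface instead of Playwright's Page so it runs on its own.

```typescript
// Hypothetical narrow interface: only the page methods the flow needs.
interface PageLike {
  click(selector: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
}

// Trimmed copy of the domain class, depending on PageLike instead of Playwright.
class CheckoutFlow {
  private readonly page: PageLike;

  constructor(page: PageLike) {
    this.page = page;
  }

  async payWithCreditCard(): Promise<void> {
    await this.page.click('#credit-card-option');
    await this.page.fill('#card-number', '4111111111111111');
    await this.page.click('#place-order');
  }
}

// Hypothetical fake that records interactions instead of driving a browser.
class RecordingPage implements PageLike {
  readonly calls: string[] = [];

  async click(selector: string): Promise<void> {
    this.calls.push(`click ${selector}`);
  }

  async fill(selector: string, value: string): Promise<void> {
    this.calls.push(`fill ${selector}=${value}`);
  }
}

// Runs the payment flow against the fake and returns what it did.
async function recordPayment(): Promise<string[]> {
  const page = new RecordingPage();
  await new CheckoutFlow(page).payWithCreditCard();
  return page.calls;
}
```

If a check like this fails, you know the problem is in the domain layer before a single scenario runs. A real Playwright Page should satisfy the same interface structurally, so the step definitions would not need to change.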
Thing Three: Your Cucumber Suite Is a Documentation Project That You're Treating as a Test Project
This is the one nobody wants to say out loud.
BDD was never primarily about testing. It was about communication. The original vision was that scenarios would serve as living documentation — a shared language between developers, testers, and business stakeholders. The tests were a side effect.
But somewhere in the last decade, teams started measuring BDD success by test coverage. "We have 500 Cucumber scenarios." "Our BDD suite runs in CI." "We achieved 90% Gherkin coverage."
None of those metrics tell you whether your scenarios are readable. None of them tell you whether a new team member can open a feature file and understand what the system does without running the tests.
Here's the uncomfortable truth: if your Cucumber suite is a maintenance nightmare, it's because you optimized for the wrong thing. You optimized for test execution speed, for step reuse, for parameterization. You forgot to optimize for reading.
I learned this the hard way. I had a suite with beautiful step definitions. Clean abstractions. Zero duplication. But when a product manager asked me what the checkout flow looked like, I had to open three feature files and trace through five scenario outlines to explain it. The documentation was technically there, but it was useless.
The fix is brutal but simple: every time you add a scenario, ask yourself whether someone who has never seen your codebase can understand it in under thirty seconds. If the answer is no, rewrite the scenario. Not the step definition. The scenario.
Your feature files should be the first thing a new engineer reads when they join your team. Not the README. Not the wiki. The feature files. If they can't understand the system from those files alone, your BDD suite has failed its primary purpose.
What to Do Tomorrow
Pick one feature file. The worst one. The one with the most steps, the most parameters, the most inline data.
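If you can't tell which file is the worst, a rough count usually settles it. The sketch below is a naive, line-based scan that I'm assuming is good enough for triage: it is not a real Gherkin parser, it ignores Background steps, and it can miscount lines inside doc strings. The function and type names are made up for this example.

```typescript
interface ScenarioStats {
  name: string;
  steps: number;        // number of Given/When/Then/And/But lines
  inlineValues: number; // quoted strings and bare numbers inside steps
}

const SCENARIO_HEADER = /^(Scenario|Scenario Outline|Example):\s*(.*)$/;
const STEP_KEYWORD = /^(Given|When|Then|And|But|\*)\s/;
const INLINE_VALUE = /"[^"]*"|\b\d+\b/g;

// Returns per-scenario counts for one feature file, worst offender first.
function scenarioStats(featureText: string): ScenarioStats[] {
  const stats: ScenarioStats[] = [];
  let current: ScenarioStats | undefined;

  for (const rawLine of featureText.split(/\r?\n/)) {
    const line = rawLine.trim();
    const header = SCENARIO_HEADER.exec(line);
    if (header) {
      current = { name: header[2] ?? '', steps: 0, inlineValues: 0 };
      stats.push(current);
    } else if (current && STEP_KEYWORD.test(line)) {
      current.steps += 1;
      current.inlineValues += (line.match(INLINE_VALUE) ?? []).length;
    }
  }

  return stats.sort((a, b) => b.steps - a.steps);
}
```

Read each feature file, pass its text to scenarioStats, and start with whatever lands at the top of the list. A scenario with a dozen steps and nearly as many inline values is the test script from Thing One.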
Rewrite it to three to five lines per scenario. Move every concrete value into a step definition or a test data factory. Delete every step definition that imports another step definition. Replace them with domain objects.
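For the test data factory, something this small is usually enough to start with. This is a sketch under my own assumptions: the CheckoutData shape and the buildCheckoutData name are hypothetical, and the default values are simply the ones that were inlined in the original scenario.

```typescript
interface Card {
  number: string;
  expiry: string;
  cvv: string;
}

interface CheckoutData {
  email: string;
  password: string;
  itemCount: number;
  shippingAddress: string;
  card: Card;
}

// Overrides may replace any top-level field and any subset of card fields.
type CheckoutOverrides = Partial<Omit<CheckoutData, 'card'>> & { card?: Partial<Card> };

// Defaults lifted from the values that used to live in the feature file.
const defaultCheckoutData: CheckoutData = {
  email: 'test@example.com',
  password: 'Password123!',
  itemCount: 3,
  shippingAddress: '123 Main St, Springfield, IL 62701',
  card: { number: '4111111111111111', expiry: '12/28', cvv: '123' },
};

// Builds a fresh data object: defaults first, then whatever the caller overrides.
function buildCheckoutData(overrides: CheckoutOverrides = {}): CheckoutData {
  return {
    ...defaultCheckoutData,
    ...overrides,
    card: { ...defaultCheckoutData.card, ...overrides.card },
  };
}
```

A step definition or domain object can then call buildCheckoutData() for the happy path, or buildCheckoutData({ itemCount: 1 }) when a scenario cares about one specific value. The concrete data stays out of the Gherkin.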
Run the suite. If it passes, you've just made your documentation better and your tests more maintainable in one move.
Then do it again for the next file.
A Question for You
Your Cucumber suite runs in CI. It passes. Your team is shipping. But if you had to explain the system's behavior to a new hire using only your feature files, could you do it in under five minutes?
If the answer is no, you know where to start.