We tried all three. Two of them broke at 200 tests. The third one still works at 2,400.
When you start an API automation project, organizing tests feels like a trivial decision.
You create a few folders, give the tests sensible names, and move on.
Six months later, the suite has grown to a few hundred tests.
A year later, you're searching for the same endpoint in three different folders.
Someone adds tags to make filtering easier.
Another team creates a "Regression" folder.
Someone else duplicates a payment test because they couldn't find the original.
Suddenly, your biggest maintenance challenge isn't writing tests—it's finding them.
Over the years, I've experimented with nearly every way of organizing API tests. Some structures looked elegant when the suite contained 50 tests but became impossible to maintain once it crossed a few hundred.
Eventually, we settled on a domain-driven structure combined with a disciplined tagging strategy. That approach has comfortably scaled to thousands of API tests across multiple services without becoming difficult to navigate.
Here's what worked, what didn't, and why.
The Folder-per-Endpoint Approach — When It Breaks
Most teams start here.
The structure feels obvious.
tests/
├── users/
├── orders/
├── products/
├── invoices/
└── payments/
Inside each folder:
users/
├── GET User
├── POST User
├── PUT User
└── DELETE User
For small projects, it's perfectly reasonable.
Everyone knows where to place a new test.
Finding endpoints is easy.
Then the application grows.
The First Problem: Shared Business Flows
Suppose you're testing checkout.
The workflow touches:
- Customers
- Orders
- Inventory
- Coupons
- Payments
- Shipping
Where does the test belong?
Inside:
orders/
or
payments/
or
shipping/
No answer feels correct.
Eventually teams duplicate the test in multiple locations.
Maintenance immediately becomes painful.
The Second Problem: Versioning
Now introduce:
v1/
and later:
v2/
Then:
internal/
Soon your folders resemble:
orders/
├── v1/
├── v2/
├── admin/
└── legacy/
Finding the right test becomes increasingly difficult.
Endpoint-Centric Thinking
The biggest issue is conceptual.
Endpoints aren't how businesses think.
Businesses think in capabilities:
- Customer onboarding
- Checkout
- Subscription management
- Identity verification
Organizing purely around URLs eventually creates friction.
Tags as the Primary Organization (The Trap)
Once folders become messy, many teams switch to tags.
Everything receives labels.
Examples include:
smoke
regression
payments
critical
api
checkout
release
v2
Filtering becomes powerful.
Searching becomes fast.
Initially, it feels like the perfect solution.
Until tags become your primary organizational model.
The Tag Explosion
After several months, you might discover:
payment
payments
payment-api
payment-service
pay
billing
billing-api
All describing roughly the same thing.
Different engineers invent slightly different naming conventions.
Now filtering becomes inconsistent.
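One way to contain this is to collapse every variant into a single canonical tag before any filtering happens. The sketch below is only an illustration: the alias table, the choice of `payments` as the canonical name, and the function names are assumptions, not something our tooling shipped with.

```python
# Hypothetical alias table: every variant seen in the suite maps to one canonical tag.
# Treating "billing" as a synonym for "payments" is an assumption for this example.
TAG_ALIASES = {
    "payment": "payments",
    "payment-api": "payments",
    "payment-service": "payments",
    "pay": "payments",
    "billing": "payments",
    "billing-api": "payments",
}


def canonical_tag(tag: str) -> str:
    """Trim and lower-case the tag, then collapse known variants to one name."""
    cleaned = tag.strip().lower()
    return TAG_ALIASES.get(cleaned, cleaned)


def canonical_tags(tags):
    """Canonicalize a list of tags, dropping duplicates but keeping first-seen order."""
    seen = {}
    for tag in tags:
        seen.setdefault(canonical_tag(tag), None)
    return list(seen)
```

Running `canonical_tags` over a test's tags at import or review time means a filter on `payments` matches the test no matter which spelling the author used.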
Tags Drift Over Time
Folders usually remain stable.
Tags evolve continuously.
A regression tag might originally mean:
Runs nightly
Six months later it means:
Runs before release
Eventually nobody remembers which tags are authoritative.
The test suite becomes harder—not easier—to navigate.
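Drift is easier to spot when each tag's meaning is written down in exactly one place. Here is a minimal sketch of such a registry, assuming it lives in a small Python module next to the tests. The two meanings are the ones from the example above; the names `TAG_MEANINGS` and `undefined_tags` are hypothetical.

```python
# Hypothetical registry: the single place where a tag's meaning is recorded.
# Changing a meaning now requires a reviewed change to this file.
TAG_MEANINGS = {
    "nightly": "Runs nightly",
    "regression": "Runs before release",
}


def undefined_tags(tags):
    """Return tags that have no recorded meaning, sorted for stable output."""
    return sorted(set(tags) - set(TAG_MEANINGS))
```

A check like `undefined_tags(test_tags)` can fail a review when someone introduces a tag nobody has defined yet.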
Tags Should Describe, Not Organize
The biggest lesson we learned:
Folders answer:
Where does this test belong?
Tags answer:
What characteristics does this test have?
Those are different questions.
Trying to solve both with tags usually fails.
Domain-Driven Structure: payments/, identity/, catalog/
The structure that scaled best mirrored the business itself.
Instead of endpoints:
users/
orders/
products/
we organized around domains.
Example:
payments/
identity/
catalog/
subscriptions/
fulfillment/
notifications/
Notice these aren't APIs.
They're business capabilities.
Why This Works Better
Consider a refund workflow.
It might involve:
- Orders
- Payments
- Customer accounts
- Tax calculations
The workflow touches all of them, but the test lives in exactly one place:
payments/
because the business capability being tested is payment processing.
The workflow remains together.
Maintenance becomes significantly easier.
Teams Already Think This Way
Most engineering organizations already assign ownership by domain.
Examples:
Identity Team
Payments Team
Catalog Team
Fulfillment Team
Matching the test suite to organizational boundaries makes ownership obvious.
Developers immediately know where changes belong.
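When folders mirror teams, ownership can be derived from a test's path instead of being tracked separately. The following is a rough sketch, assuming tests live under a `tests/` root with the domain as the first folder. The mapping and the example file name are illustrative, not our real configuration.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level domain folder to owning team.
DOMAIN_OWNERS = {
    "identity": "Identity Team",
    "payments": "Payments Team",
    "catalog": "Catalog Team",
    "fulfillment": "Fulfillment Team",
}


def owner_of(test_path: str, root: str = "tests") -> str:
    """Return the team that owns a test, based on its domain folder."""
    parts = PurePosixPath(test_path).parts
    if len(parts) < 2 or parts[0] != root:
        raise ValueError(f"{test_path!r} is not under {root}/<domain>/")
    domain = parts[1]
    if domain not in DOMAIN_OWNERS:
        raise ValueError(f"no owner registered for domain {domain!r}")
    return DOMAIN_OWNERS[domain]
```

For example, `owner_of("tests/payments/refunds/refund_full_amount.py")` resolves to the Payments Team, and a path under an unregistered domain raises an error rather than silently going unowned.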
The Hybrid We Landed On (And the Rule That Made It Stick)
Eventually we stopped searching for a perfect system.
Instead, we combined the strengths of folders and tags.
Our approach became:
Folders represent business domains.
Tags represent execution behavior.
For example:
payments/
contains:
- Card authorization
- Refunds
- Settlements
- Chargebacks
Tags include:
smoke
regression
critical
contract
integration
performance
The distinction remains clear.
Folders describe ownership.
Tags describe execution.
The Rule That Prevented Chaos
We introduced one simple rule:
A test can belong to only one folder, but it can have many tags.
That eliminated endless debates.
No duplicated tests.
No multiple folder locations.
One source of truth.
If another team needs the same scenario, they reference it instead of copying it.
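A rule like this tends to hold only if something checks it. As a sketch, assuming each test is one file and that a copied test keeps its file name, the function below reports names that appear in more than one folder. The paths are made up for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def duplicated_tests(paths):
    """Map each test file name to its folders, keeping only names found in several."""
    folders_by_name = defaultdict(set)
    for path in paths:
        p = PurePosixPath(path)
        folders_by_name[p.name].add(str(p.parent))
    return {
        name: sorted(folders)
        for name, folders in folders_by_name.items()
        if len(folders) > 1
    }


# Hypothetical paths; the second refund test is the kind of copy the rule forbids.
paths = [
    "tests/payments/refunds/refund_full_amount.py",
    "tests/orders/refund_full_amount.py",
    "tests/identity/sessions/expire_session.py",
]
duplicates = duplicated_tests(paths)
```

This only catches copies that kept their name. A renamed copy would need a content-based comparison, which is outside this sketch.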
Naming Conventions Matter Too
We also standardized test names.
Instead of:
Test Payment
we wrote:
Authorize payment with expired card
Instead of:
User API
we wrote:
Create customer with duplicate email
Searching became dramatically easier.
Intent became obvious.
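A naming convention can be nudged along with a simple lint. The heuristic here, that a name with fewer than three words rarely states an intent, is an assumption chosen to fit the examples above, not a rule we can claim is universally right.

```python
def is_descriptive(name: str, min_words: int = 3) -> bool:
    """Heuristic only: treat names with at least min_words words as descriptive."""
    return len(name.split()) >= min_words


def vague_names(names, min_words: int = 3):
    """Return the names that fail the heuristic, in their original order."""
    return [name for name in names if not is_descriptive(name, min_words)]
```

With the names from this section, `vague_names` flags "Test Payment" and "User API" and lets the two longer names through.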
A Working Folder Tree from One of Our Services
Here's a simplified version of the structure we've used successfully.
tests/
│
├── identity/
│ ├── authentication/
│ ├── authorization/
│ ├── users/
│ └── sessions/
│
├── catalog/
│ ├── products/
│ ├── categories/
│ └── pricing/
│
├── payments/
│ ├── authorization/
│ ├── capture/
│ ├── refunds/
│ ├── disputes/
│ └── settlement/
│
├── fulfillment/
│ ├── shipping/
│ ├── inventory/
│ └── warehouse/
│
└── shared/
  ├── fixtures/
  ├── helpers/
  ├── authentication/
  └── utilities/
Notice what's missing.
There's no:
GET Tests
POST Tests
PUT Tests
DELETE Tests
HTTP methods don't define business intent.
Capabilities do.
How Tags Fit Into This Structure
Every test still carries metadata.
Examples:
smoke
critical
contract
integration
nightly
pci
Now executing:
Smoke + Payments
or
Regression + Identity
becomes straightforward without changing the folder hierarchy.
Folders stay stable.
Execution remains flexible.
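To make the "domain + tag" selection concrete, here is a small sketch of how a runner could combine the two. The `ApiTest` record, the `select` function, and the example paths are hypothetical. Most test runners have their own filtering mechanism, and this only illustrates the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiTest:
    path: str        # location in the folder tree, e.g. tests/<domain>/...
    tags: frozenset  # execution metadata such as "smoke" or "regression"


def select(tests, domain: str, tag: str, root: str = "tests"):
    """Return tests that sit under the given domain folder and carry the tag."""
    prefix = f"{root}/{domain}/"
    return [t for t in tests if t.path.startswith(prefix) and tag in t.tags]


# Hypothetical suite used to show "Smoke + Payments".
suite = [
    ApiTest("tests/payments/refunds/refund_full_amount.py", frozenset({"smoke", "pci"})),
    ApiTest("tests/payments/capture/capture_partial_amount.py", frozenset({"regression"})),
    ApiTest("tests/identity/sessions/expire_session.py", frozenset({"smoke"})),
]
smoke_payments = select(suite, "payments", "smoke")
```

The folder supplies the "where" and the tag supplies the "how", so adding a new run profile means adding a tag, not moving files.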
Scaling Beyond a Thousand Tests
One question often comes up:
"Will this still work when the suite grows?"
In my experience, yes, because domains evolve much more slowly than endpoints.
Endpoints may:
- Change versions
- Merge
- Split
- Be deprecated
Business capabilities rarely change as dramatically.
Organizations still process:
- Payments
- Orders
- Customers
- Products
The underlying APIs evolve.
The business domains remain recognizable.
That's why domain-driven structures continue to scale while endpoint-based structures often become cluttered.
Choosing the Right Strategy for Your Team
There isn't a universal folder structure that fits every organization.
However, there are a few principles that consistently produce maintainable test suites:
- Organize by business capability rather than individual endpoints.
- Use folders to express ownership and intent.
- Use tags to control execution and filtering.
- Avoid duplicating tests across multiple locations.
- Standardize naming conventions from the beginning.
- Keep shared utilities separate from business scenarios.
Most importantly, agree on these rules early.
Reorganizing fifty tests is easy.
Reorganizing two thousand tests usually becomes a months-long effort.
Final Thoughts
The way you organize your API test suite matters far more than most teams realize.
Poor organization doesn't become obvious at fifty tests.
It becomes obvious at five hundred.
By the time your suite reaches a few thousand tests, every structural decision either accelerates development or slows it down.
After we experimented with endpoint-based folders, tag-heavy approaches, and domain-driven organization, the hybrid model consistently delivered the best balance of discoverability, maintainability, and scalability.
Folders answer where a test belongs.
Tags answer how a test should run.
Keeping those responsibilities separate is what allowed our suite to grow from a few dozen tests to well over two thousand without becoming unmanageable.
If you're comparing API testing platforms, take a look at how Shift-Left API compares with Insomnia.
A well-organized test suite isn't just easier to navigate—it becomes easier to maintain, easier to scale, and ultimately far more valuable to the engineering team that depends on it every day.