If you’ve written integration tests in Python long enough, you’ve hit this wall.
Your test calls three external services.
You mock one endpoint.
Then another.
Then another.
Suddenly your test is 60 lines long and half of it is patching.
At that point you’re not testing behavior.
You’re maintaining scaffolding.
I ran into this repeatedly while working on service-to-service flows, especially when:
- A single operation triggered multiple HTTP calls
- Different tests required different combinations of responses
- Fixtures started turning into mini frameworks
The tooling ecosystem is strong: `responses`, `unittest.mock`, and httpx's mocking utilities all work well. But as the endpoint count grows, the ergonomics degrade.
The issue is not capability.
It is readability.
## The Breaking Point
Here is what multi-endpoint mocking often turns into:
```python
import responses

@responses.activate
def test_checkout():
    responses.add(
        responses.GET,
        "https://api.example.com/users/1",
        json={"id": 1, "name": "Alice"},
        status=200,
    )
    responses.add(
        responses.POST,
        "https://api.example.com/orders",
        json={"order_id": 42},
        status=201,
    )

    result = checkout_flow()
    assert result.success
```
This works.
But scale it:
- Different combinations per test
- Conditional responses
- Dynamic payloads
- Partial URL matching
- Multiple external services
Now your fixtures grow. Helpers grow. Patching spreads.
Tests become infrastructure.
## What I Wanted Instead
I wanted something that:
- Lives naturally inside pytest
- Keeps mocks close to test logic
- Makes multi-endpoint flows readable
- Avoids spinning up test servers
- Avoids deep patch trees
So I built a small utility called `api-mocker`.
The philosophy was simple: minimal surface area. No heavy DSL. No framework abstraction.
Just explicit endpoint declarations.
## Example
```python
def test_checkout_flow(api_mocker):
    api_mocker.get("/users/1").respond_with(
        status=200,
        json={"id": 1, "name": "Alice"},
    )
    api_mocker.post("/orders").respond_with(
        status=201,
        json={"order_id": 42},
    )

    result = checkout_flow()
    assert result.success
```
No decorators.
No activation context.
No scattered patch logic.
The fixture handles lifecycle and cleanup per test.
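A minimal sketch of that lifecycle pattern (illustrative only, not `api-mocker`'s actual internals): a per-test registry yielded by a generator-style fixture, with teardown after the `yield`.

```python
class MockRegistry:
    """Collects endpoint stubs for a single test."""

    def __init__(self):
        self.stubs = {}

    def get(self, path, json):
        # Register a canned response for GET <path>.
        self.stubs[("GET", path)] = json

    def reset(self):
        self.stubs.clear()


def api_mocker_sketch():
    # A generator-style pytest fixture body: setup, yield to the test,
    # then guaranteed teardown. In real code this would be decorated
    # with @pytest.fixture in a conftest.py.
    registry = MockRegistry()
    yield registry      # the test body runs here
    registry.reset()    # cleanup runs after every test, pass or fail
```

Because pytest resumes the generator during teardown, the code after `yield` runs even when the test fails, which is what prevents state leaking between tests.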
## Design Principles
### 1. Isolation Per Test
Mocks reset automatically after each test. No shared state leakage.
### 2. Explicit Failure
If an expected endpoint is not called, the test fails.
If an unexpected endpoint is called, the test fails.
Silent success hides integration problems.
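Both rules can be enforced with a small bookkeeping object. This is a hypothetical sketch of the idea, not the library's real API:

```python
class CallVerifier:
    """Tracks expected vs. actual calls and fails loudly in both directions."""

    def __init__(self):
        self.expected = set()
        self.called = set()

    def expect(self, method, path):
        self.expected.add((method, path))

    def record(self, method, path):
        # An unregistered call fails immediately.
        key = (method, path)
        if key not in self.expected:
            raise AssertionError(f"Unexpected call: {method} {path}")
        self.called.add(key)

    def verify(self):
        # Run at teardown: any registered-but-unused endpoint fails the test.
        missing = self.expected - self.called
        if missing:
            raise AssertionError(f"Expected endpoints never called: {missing}")
```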
### 3. Lightweight Interception
No embedded server. No process overhead.
Interception happens at the request layer.
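In stdlib terms, request-layer interception can look like the following. This is a sketch under the assumption that the application routes HTTP through a single transport function (`http_client` and `send` are hypothetical names, not `api-mocker`'s implementation):

```python
import types
from unittest import mock

# Hypothetical client module the application routes all HTTP through.
http_client = types.SimpleNamespace()

def real_send(method, url):
    raise RuntimeError("would open a real network connection")

http_client.send = real_send

# Stub table: (method, url) -> canned JSON payload.
STUBS = {
    ("GET", "https://api.example.com/users/1"): {"id": 1, "name": "Alice"},
}

def stub_send(method, url):
    # Answer from the table; anything unregistered fails loudly.
    if (method, url) not in STUBS:
        raise AssertionError(f"Unexpected request: {method} {url}")
    return STUBS[(method, url)]

# Swap the transport only for the duration of the block; no socket
# is ever opened and no server process is started.
with mock.patch.object(http_client, "send", stub_send):
    user = http_client.send("GET", "https://api.example.com/users/1")
```

Because the swap happens in-process at the call site, teardown is just `mock.patch` restoring the original attribute.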
## Where This Approach Works Best
- Microservice architectures
- Services calling multiple third-party APIs
- Payment or auth flows
- Orchestrator style backends
Anywhere a single flow touches two or more HTTP integrations.
## Open Questions I’m Exploring
Mocking libraries always face tension between simplicity and flexibility.
Some areas I’m actively thinking about:
- Async client handling
- Streaming responses
- When mocking should give way to contract testing
- Detecting over-mocking in large test suites
These are tradeoffs that affect long-term test quality.
## Why I’m Sharing This
Mocking strategy has a direct impact on codebase health.
Readable tests scale.
Fixture jungles do not.
If you’ve dealt with messy multi-endpoint integration tests in Python, I’d genuinely like to hear:
- What worked well
- Where it broke down
- When you moved to contract testing
Project links if you want to explore the implementation:
- PyPI: https://pypi.org/project/api-mocker/
- GitHub: https://github.com/Sherin-SEF-AI/api-mocker
Curious to hear how others approach this problem.