You pick a testing framework when you start a project and rarely revisit the decision. Pick the wrong one and you end up refactoring your entire test suite six months later — or shipping bugs that your tests structurally cannot catch. In 2026, three frameworks dominate the Python testing landscape: unittest (standard library), pytest (ecosystem default), and hypothesis (property-based testing). Here is what each one actually does, where it falls short, and how to combine them effectively.
unittest: the built-in baseline
unittest ships with Python. No installation required. It follows the xUnit pattern: subclass TestCase, prefix methods with test_, and use self.assert* helpers.
import unittest

from myapp.auth import TokenValidator, TokenExpiredError


class TokenValidatorTest(unittest.TestCase):
    def setUp(self):
        self.validator = TokenValidator(secret="s3cr3t", max_age=3600)

    def test_valid_token_passes(self):
        token = self.validator.sign({"user_id": 42})
        result = self.validator.verify(token)
        self.assertEqual(result["user_id"], 42)

    def test_expired_token_raises(self):
        token = self.validator.sign({"user_id": 42}, issued_at=-7200)
        with self.assertRaises(TokenExpiredError):
            self.validator.verify(token)


if __name__ == "__main__":
    unittest.main()
Where it shines: zero-dependency libraries, corporate environments where the approved package list is locked down, or codebases that already use it and don't justify migration cost.
Where it falls short: sharing fixtures across modules requires careful use of setUpModule or test inheritance hierarchies that grow awkward fast. Parametrization is verbose: the closest built-in tool is subTest inside a loop. Assertion failure output is less detailed than what pytest produces.
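To make the parametrization point concrete, here is a minimal sketch using only the standard library. The parse_user_id helper is hypothetical and stands in for whatever you are testing; the test cases are illustrative, not taken from a real codebase.

```python
import unittest


def parse_user_id(raw):
    """Hypothetical helper: accept positive integers only."""
    if isinstance(raw, bool) or not isinstance(raw, int):
        raise TypeError("user_id must be an int")
    if raw <= 0:
        raise ValueError("user_id must be positive")
    return raw


class ParseUserIdTest(unittest.TestCase):
    def test_invalid_inputs(self):
        cases = [
            (None, TypeError),
            ("42", TypeError),
            (0, ValueError),
            (-1, ValueError),
        ]
        for raw, expected_error in cases:
            # subTest reports each failing case separately instead of
            # stopping at the first failure in the loop.
            with self.subTest(raw=raw):
                with self.assertRaises(expected_error):
                    parse_user_id(raw)
```

Run it with python -m unittest. The loop, the subTest context manager, and the nested assertRaises are the boilerplate that a parametrize decorator replaces.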
The ergonomics gap is real. A failing self.assertEqual(expected, result) shows you a diff. A plain assert expected == result in raw Python gives you AssertionError with no context. Pytest solves this by rewriting assert statements in test modules at import time, so a failing plain assert reports the values on both sides of the comparison.
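The difference is easy to see without any test runner. This sketch compares the two failure messages when executed with the plain python interpreter; the payload values are made up for illustration.

```python
import unittest

expected = {"user_id": 42, "roles": ["admin"]}
result = {"user_id": 42, "roles": ["guest"]}

# A bare assert carries no message of its own when run by plain Python.
plain_message = None
try:
    assert expected == result
except AssertionError as exc:
    plain_message = str(exc)

# unittest's assertEqual builds a message containing both values and a diff.
unittest_message = None
case = unittest.TestCase()
try:
    case.assertEqual(expected, result)
except AssertionError as exc:
    unittest_message = str(exc)

print("plain assert message:", repr(plain_message))
print("assertEqual message:")
print(unittest_message)
```

Under pytest, the same bare assert in a test module would produce output closer to the second message, because pytest rewrites the statement before running it.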
Pytest: where most projects land
Pytest's key features are fixture injection with configurable scope, smart assertion introspection, and @pytest.mark.parametrize. No boilerplate class required:
import pytest

from myapp.auth import TokenValidator, TokenExpiredError


@pytest.fixture(scope="module")
def validator():
    return TokenValidator(secret="s3cr3t", max_age=3600)


def test_valid_token_passes(validator):
    token = validator.sign({"user_id": 42})
    assert validator.verify(token)["user_id"] == 42


def test_expired_token_raises(validator):
    token = validator.sign({"user_id": 42}, issued_at=-7200)
    with pytest.raises(TokenExpiredError):
        validator.verify(token)


@pytest.mark.parametrize("payload,expected_error", [
    ({"user_id": None}, ValueError),
    ({}, KeyError),
    ({"user_id": "not-an-int"}, TypeError),
])
def test_invalid_payloads(validator, payload, expected_error):
    with pytest.raises(expected_error):
        validator.sign(payload)
Fixtures are composable and scoped at function, class, module, package, or session level. A db_session fixture can depend on a db_engine fixture without any global state: pytest resolves the dependency by matching argument names. This scales cleanly across hundreds of tests without duplication.
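A minimal sketch of that db_engine and db_session relationship follows. It uses an in-memory sqlite3 connection as a stand-in for a real engine, and the users table and rollback strategy are assumptions made for illustration.

```python
import sqlite3

import pytest


@pytest.fixture(scope="session")
def db_engine():
    # Created once per test run; sqlite3 stands in for a real database engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    yield conn
    conn.close()


@pytest.fixture
def db_session(db_engine):
    # Function-scoped: each test gets the shared connection and its
    # uncommitted changes are rolled back afterwards.
    yield db_engine
    db_engine.rollback()


def test_insert_is_visible_inside_session(db_session):
    db_session.execute("INSERT INTO users (name) VALUES ('alice')")
    count = db_session.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    assert count == 1


def test_table_starts_empty(db_session):
    count = db_session.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    assert count == 0
```

The expensive resource lives at session scope while the cleanup lives at function scope, so tests stay isolated without rebuilding the schema each time.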
Pytest runs unittest-based tests transparently, so incremental migration is painless. The plugin ecosystem is extensive — pytest-asyncio for async code, pytest-django for Django, pytest-xdist for parallel execution:
# Run tests in parallel across all CPU cores
pytest -n auto tests/
# Re-run only the tests that failed on the previous run
pytest --lf tests/
Performance note: pytest's collection overhead tends to become noticeable once a suite grows past roughly 10k tests. Use -x (exit on first failure) during development and --lf for quick iteration. Save full suite runs for CI.
Hypothesis: what example-based tests structurally miss
The fundamental problem with hand-written test cases is that you only test what you think of. Hypothesis inverts this: you describe the shape of valid inputs and the library generates many of them (100 per test by default, more if you raise max_examples), including boundary conditions you would never write manually.
from hypothesis import given, settings
from hypothesis import strategies as st

from myapp.auth import TokenValidator

validator = TokenValidator(secret="s3cr3t", max_age=3600)


@given(
    user_id=st.integers(min_value=1, max_value=2**31 - 1),
    roles=st.lists(
        st.sampled_from(["admin", "user", "guest"]),
        min_size=1,
        max_size=10
    )
)
@settings(max_examples=500)
def test_round_trip_never_corrupts_payload(user_id, roles):
    payload = {"user_id": user_id, "roles": roles}
    token = validator.sign(payload)
    result = validator.verify(token)
    assert result["user_id"] == user_id
    assert result["roles"] == roles
When Hypothesis finds a failing case, it shrinks the input toward a minimal reproduction. Instead of user_id=1_732_891_043, roles=["admin","user","guest","user","admin"], it reports something like user_id=1, roles=["admin"]. The failing example is saved in a local database (.hypothesis/ by default) and replayed on subsequent runs, so a fix is checked against the exact input that broke the test.
Property-based testing catches real classes of bugs, provided the strategies are wide enough to reach them: integer overflow when a user_id exceeds a 32-bit signed boundary, Unicode normalization issues when a role name contains combining characters, off-by-one in expiry logic when max_age exactly equals the time offset. These are precisely the kinds of inputs attackers probe for manually.
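Note that the strategies in the earlier test cap user_id at 2**31 - 1 and draw roles from a fixed list, so they cannot reach the overflow or Unicode cases. The sketch below widens them. Because TokenValidator is not available here, a JSON round trip is used as a stand-in codec; that substitution is an assumption, and encode and decode are hypothetical names.

```python
import json

from hypothesis import given
from hypothesis import strategies as st


def encode(payload):
    """Stand-in codec (an assumption): JSON text encoded as UTF-8 bytes."""
    return json.dumps(payload).encode("utf-8")


def decode(blob):
    return json.loads(blob.decode("utf-8"))


@given(
    # Unbounded integers: values are no longer capped at 2**31 - 1.
    user_id=st.integers(),
    # Arbitrary Unicode text, which can include combining characters.
    roles=st.lists(st.text(), max_size=10),
)
def test_round_trip_with_wide_inputs(user_id, roles):
    payload = {"user_id": user_id, "roles": roles}
    assert decode(encode(payload)) == payload
```

This property holds for the JSON stand-in. Pointed at a codec that packs user_id into a fixed-width integer or normalizes strings, the same test is the kind that would surface the bugs listed above.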
Hypothesis integrates directly with pytest — @given decorates a standard def test_ function and failures surface as normal pytest output.
When to use what
Keep unittest if you have an existing test suite where migration cost isn't justified, or if you're writing a library that must ship with zero additional dependencies.
Default to pytest for everything new. The ergonomics, fixture system, and plugin ecosystem are significantly better. It also runs existing unittest tests transparently.
Add hypothesis on top of pytest for any module that processes external input — parsers, serializers, authentication helpers, API input validators. It does not replace example-based tests; it finds the bugs your examples can't.
A practical configuration for production projects:
# pyproject.toml
[tool.pytest.ini_options]
addopts = "-ra -q --tb=short"
testpaths = ["tests"]
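Hypothesis does not read pyproject.toml, so a [tool.hypothesis] table there has no effect. Its settings are registered as named profiles in Python, usually in conftest.py, with settings.register_profile and settings.load_profile. The example database lives in .hypothesis/ by default.
In CI, increase coverage by selecting a profile with a higher max_examples, for example pytest --hypothesis-profile=ci tests/ (this assumes a profile named ci has been registered).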
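A minimal sketch of such a profile setup follows. The profile names, the example counts, and the HYPOTHESIS_PROFILE variable are illustrative choices; the environment variable is read by this file, not by Hypothesis itself.

```python
# conftest.py
import os

from hypothesis import settings

# Profile names and example counts are illustrative assumptions.
settings.register_profile("dev", max_examples=100)
settings.register_profile("ci", max_examples=1000)

# Pick the profile from the environment, falling back to the fast one locally.
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

With this in place, HYPOTHESIS_PROFILE=ci pytest tests/ selects the larger profile, while a plain pytest tests/ keeps local runs quick.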
Security-critical code deserves property-based tests
If your codebase includes token validation, permission checks, input sanitization, or any authentication logic, hypothesis belongs in your test suite. The security hardening checklists we publish include property-based test coverage as a requirement for auth and authorization modules — alongside SAST tooling and manual review. Example-based tests systematically miss integer boundary conditions and encoding edge cases. Those are precisely the inputs attackers try first.
The takeaway
Don't overthink the initial choice: start with pytest. Add hypothesis to any module that handles external input or implements security-sensitive logic. Keep unittest only where you already have it.
The combination of pytest + hypothesis + pytest-cov covers most of what production Python projects need. Install pytest-xdist once test runtime crosses two minutes. Everything else is noise.
I run AYI NEDJIMI Consultants, a cybersecurity consulting firm. We publish free security hardening checklists — PDF and Excel.