Dhiraj Das

Posted on • Originally published at dhirajdas.dev

Announcing pytest-mockllm v0.2.1: "True Fidelity"

🎯 What's New in v0.2.1

  • True Async & Await: Native coroutines for OpenAI, Anthropic, Gemini, and LangChain
  • Pro Tokenizers: tiktoken integration for >99% token accuracy
  • PII Redaction: Automatic scrubbing of API keys before cassette storage
  • Chaos Engineering: Simulate rate limits, timeouts, and network jitter
  • Python 3.14 Ready: First to officially support and verify the latest Python

We are thrilled to announce the release of pytest-mockllm v0.2.1, codenamed "True Fidelity".

This release is a complete technical overhaul designed to make LLM testing as robust as the systems you're building. Developers can now test complex asynchronous AI workflows with behavior that closely mirrors production environments.

🚀 The Challenge We Solved

When we first released pytest-mockllm, our async support was a "best-effort" wrapper around synchronous mocks. While this worked for simple cases, it failed in production-grade environments where developers used:

  • Complex coroutine orchestration: Real async workflows with multiple awaits
  • Asynchronous generators: Streaming responses via LangChain's astream and ainvoke
  • Strict type checking: MyPy compatibility requirements
  • Enterprise security: VCR-style recordings risking API key leaks

True Async & Await

We've rewritten our core mocks from the ground up to support real asynchronous patterns. No more fake awaitables—pytest-mockllm now provides native coroutines and async iterators for OpenAI, Anthropic, Gemini, and LangChain.

Every provider mock now implements native async def methods that return real coroutines. This ensures that await calls behave exactly as they do with real SDKs.

import pytest
from openai import AsyncOpenAI
from pytest_mockllm import mock_openai

@pytest.mark.asyncio
async def test_async_completion():
    with mock_openai() as mock:
        mock.set_response("Hello from pytest-mockllm!")

        # Real async/await - no fake wrappers.
        # The client is constructed inside the mock context, so its
        # network calls are intercepted; no real key is needed.
        client = AsyncOpenAI(api_key="test-key")
        response = await client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": "Hi"}]
        )

        assert response.choices[0].message.content == "Hello from pytest-mockllm!"
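
Streaming works the same way: because the mock yields chunks from a real async generator, `async for` behaves exactly as it does against the SDK. As a rough illustration of what "native async" means here (plain Python only, no plugin APIs; `fake_stream` is a hypothetical stand-in):

```python
import asyncio

# A real async generator: each chunk is produced by a genuine coroutine
# step, not a sync value wrapped in a fake awaitable.
async def fake_stream(chunks):
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, like real network I/O
        yield chunk

async def main():
    parts = []
    async for chunk in fake_stream(["Hel", "lo"]):
        parts.append(chunk)
    return "".join(parts)

print(asyncio.run(main()))  # → Hello
```

A "best-effort" sync wrapper cannot support `async for` at all, which is exactly the class of failure the rewrite eliminates.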

Pro Tokenizers (tiktoken)

Standard character-based token estimation is often off by 20-30%. By integrating tiktoken (OpenAI) and custom heuristics (Anthropic), we brought our accuracy to >99% for standard models.

This allows developers to write precise assertions on usage and cost—critical for prompt window testing and budget limits.
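
To see why character-based estimates drift, here is a hedged sketch (not the plugin's internal code): the common 4-characters-per-token heuristic alongside a tiktoken-backed count, falling back to the heuristic when tiktoken is not installed:

```python
def naive_token_estimate(text: str) -> int:
    # The rough heuristic many test suites use: ~4 characters per token.
    # As noted above, this can be off by 20-30% in practice.
    return max(1, len(text) // 4)

def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Prefer tiktoken's exact count; fall back to the rough heuristic."""
    try:
        import tiktoken
        return len(tiktoken.encoding_for_model(model).encode(text))
    except ImportError:
        return naive_token_estimate(text)

print(count_tokens("Summarize the quarterly report in three bullets."))
```

With an exact count, assertions like `assert usage.prompt_tokens <= BUDGET` become meaningful rather than approximate.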

Real Accuracy

Token counts now match exactly what you'd see in your OpenAI dashboard.

ROI Dashboard

Run your tests and see your savings! Every session now ends with a professional terminal summary showing exactly how many tokens you avoided paying for.

═══════════════════════════════════════════════════════
   pytest-mockllm ROI Summary
═══════════════════════════════════════════════════════
   Tests Run:        47
   API Calls Mocked: 312
   Tokens Saved:     847,291
   Estimated Cost:   $12.71 (at GPT-4 pricing)
═══════════════════════════════════════════════════════
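
The savings figure is simple arithmetic over the mocked token count. A sketch of that calculation (the $0.015-per-1K rate is an assumption back-derived from the sample output above, not an official price; check current model pricing):

```python
# Assumed rate: $12.71 for 847,291 tokens works out to ~$0.015 per 1K.
PRICE_PER_1K_TOKENS = 0.015

def estimated_savings(tokens_saved: int) -> float:
    """Dollar cost avoided, rounded to cents."""
    return round(tokens_saved / 1000 * PRICE_PER_1K_TOKENS, 2)

print(estimated_savings(847_291))  # → 12.71
```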

PII Redaction by Default

Security should never be an afterthought. We implemented a PIIRedactor that automatically scrubs sensitive data before the cassette is ever written to disk, so secrets never land in version control:

  • api_key and sk-... strings
  • Authorization: Bearer ... headers
  • Sensitive parameters in request bodies
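
The behavior can be approximated with a couple of regexes. This is a simplified sketch of the idea, not the plugin's actual PIIRedactor:

```python
import re

# Mask OpenAI-style keys and bearer tokens before a cassette is written.
PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9]{8,}"), "sk-REDACTED"),
    (re.compile(r"(Authorization:\s*Bearer\s+)\S+"), r"\1REDACTED"),
]

def redact(text: str) -> str:
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text

print(redact("Authorization: Bearer abc123 sk-test12345678"))
# → Authorization: Bearer REDACTED sk-REDACTED
```

The real redactor also walks structured request bodies; running a pass like this over the serialized cassette is the last line of defense.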

Enterprise Ready

Teams can now safely share VCR cassettes across repositories without security risk.

Chaos Engineering for LLMs

The real world is messy. Our new chaos tools allow you to simulate network jitter and random API refusals to ensure your retry logic and fallback systems are bulletproof.

from pytest_mockllm import mock_openai, chaos

def test_retry_logic():
    with mock_openai() as mock:
        # Simulate rate limit on first 2 calls, then succeed
        mock.add_chaos(chaos.rate_limit(times=2))
        mock.set_response("Success after retry!")

        # call_with_retry is your application's retry helper; it should
        # absorb the two rate-limit errors and return the final result
        response = call_with_retry(prompt="Hello")
        assert response == "Success after retry!"
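
Note that `call_with_retry` is your application's code, not something the plugin ships. A minimal sketch of such a helper, with a stand-in exception class and a simulated flaky call:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception."""

def call_with_retry(call, max_attempts=3, backoff=0.01):
    # Retry on rate limits with simple exponential backoff.
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff * 2 ** attempt)

# Simulate what the chaos mock does: two rate limits, then success.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise RateLimitError
    return "Success after retry!"

print(call_with_retry(flaky))  # → Success after retry!
```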

Python 3.14 Ready

We are proud to be one of the first AI testing tools to officially support and verify compatibility with Python 3.14. We are building for the future, today.

🎯 Outcomes

  • Zero Flakiness: True async support eliminated TypeError crashes and "coroutine was never awaited" warnings in CI
  • Enterprise Ready: Secure recording allows teams to share cassettes without security risk
  • Future Proof: Full verification against Python 3.14 ensures the library is ready for the next decade of AI development

Get Started

pip install -U pytest-mockllm

Built by Dhiraj Das

Automation Architect. Making LLM testing as reliable as the AI systems you're building.
