
Sergey Inozemtsev

How I built a bulletproof CI/CD for my LLM Python library

When building an open-source library that integrates with multiple LLM providers (OpenAI, Anthropic, Google), reliability matters. Users expect upgrades to be safe and predictable.

This post describes the CI/CD setup I use for llm-api-adapter. The key idea is simple: test not only the code, but the actual published package.


The Strategy: two pipelines, three stages

I use a dual-pipeline setup aligned with GitHub Flow:

  • Dev pipeline – runs on every push to dev. Its job is early feedback and validating the distribution process.
  • Main pipeline – runs on main and version tags. Its job is stable, repeatable releases to PyPI.
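As a rough sketch, the trigger blocks of the two workflows could look like this (file names and the exact trigger set are illustrative, not necessarily the repository's actual workflow files):

```yaml
# dev workflow (illustrative): early feedback on every push to dev
on:
  push:
    branches: [dev]
---
# main workflow (illustrative): full tests on PRs, release on version tags
on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:
    branches: [main]
```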

Most CI setups stop at unit or integration tests. This one goes further by validating the artifact installed from TestPyPI.


1. Dev pipeline: pre-flight validation

The dev workflow is where most of the safety guarantees come from.

Stage A: Unit & Integration tests

  • Executed with pytest
  • Tests are separated via markers (unit, integration)
  • Fast feedback on logic and provider integration
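Marker-based test separation might look like the following sketch (the test names are hypothetical, and the markers would also need to be registered, e.g. under `[tool.pytest.ini_options]` in pyproject.toml):

```python
import pytest


@pytest.mark.unit
def test_build_request_payload():
    # Pure-logic check: fast, no network access needed
    payload = {"model": "gpt-4o", "messages": []}
    assert payload["model"] == "gpt-4o"


@pytest.mark.integration
def test_provider_roundtrip():
    # Would exercise a real provider client; skipped in the fast lane
    ...
```

CI can then run `pytest -m unit` for quick feedback and `pytest -m integration` as a separate, slower job.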

Stage B: publish to TestPyPI

After tests pass, the package is built and published to TestPyPI.

This step catches issues that tests alone cannot:

  • Incorrect pyproject.toml
  • Missing files in the source distribution
  • Broken dependency declarations
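A TestPyPI publish job along these lines would cover this stage (job name, Python version, and secret name are assumptions, not the repo's exact configuration):

```yaml
publish-testpypi:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    # Build sdist and wheel; a bad pyproject.toml fails here
    - run: python -m pip install build && python -m build
    - uses: pypa/gh-action-pypi-publish@release/v1
      with:
        repository-url: https://test.pypi.org/legacy/
        password: ${{ secrets.TEST_PYPI_API_TOKEN }}
```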

Stage C: E2E tests from TestPyPI

This is the critical part of the pipeline.

The job:

  1. Waits for TestPyPI to index the new release
  2. Installs the package from TestPyPI, not from source
  3. Pulls dependencies from the real PyPI
  4. Runs real end-to-end tests using live API keys

pip install --index-url https://test.pypi.org/simple/ \
            --extra-index-url https://pypi.org/simple \
            llm-api-adapter

At this point, the CI environment matches what users will experience after pip install.
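Putting those four steps together, the E2E job could be sketched as follows (the job and marker names, retry counts, and secret names are assumptions for illustration):

```yaml
e2e-testpypi:
  needs: publish-testpypi        # hypothetical name of the TestPyPI publish job
  runs-on: ubuntu-latest
  steps:
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    # TestPyPI can take a while to index a fresh upload, so retry the install
    - run: |
        for i in $(seq 1 10); do
          pip install --index-url https://test.pypi.org/simple/ \
                      --extra-index-url https://pypi.org/simple \
                      llm-api-adapter && break
          sleep 30
        done
    # Run the E2E suite against the installed package, not the source tree
    - run: pip install pytest && pytest -m e2e tests/
      env:
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```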


2. Main pipeline: controlled release

Once the package is validated in dev, changes move to main.

What runs on main

  • Full unit + integration test suite on every PR
  • No publishing on pushes

What triggers a release

  • A version tag (vX.Y.Z)
  • Build and publish to PyPI
  • Credentials handled via GitHub Secrets
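The release job itself can stay minimal, since everything risky was already validated in dev. A sketch (the tag filter and secret name are assumptions):

```yaml
release:
  if: startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: python -m pip install build && python -m build
    # Publish to the real PyPI; token comes from GitHub Secrets
    - uses: pypa/gh-action-pypi-publish@release/v1
      with:
        password: ${{ secrets.PYPI_API_TOKEN }}
```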

By the time a tag is pushed, the same artifact has already passed E2E tests via TestPyPI.


Why this setup works

Before publishing to PyPI, I know that:

  • The code behaves correctly (unit + integration tests)
  • The package is installable from a registry (TestPyPI)
  • External LLM providers respond as expected (E2E tests)

Most importantly, this approach prevents broken versions from ever being published to PyPI.


Conclusion

If your library depends on external APIs, testing only the source code is not enough.

Testing the published artifact is what makes releases predictable and safe.

The full setup is public and reproducible:

👉 Repository:

https://github.com/Inozem/llm_api_adapter

👉 GitHub Actions workflows:

https://github.com/Inozem/llm_api_adapter/tree/main/.github/workflows


Question for you

How do you usually set up CI for your open-source projects?
