
Sergey Inozemtsev

How I built a bulletproof CI/CD for my LLM Python library

When building an open-source library that integrates with multiple LLM providers (OpenAI, Anthropic, Google), reliability matters. Users expect upgrades to be safe and predictable.

This post describes the CI/CD setup I use for llm-api-adapter. The key idea is simple: test not only the code, but the actual published package.


The Strategy: two pipelines, three stages

I use a dual-pipeline setup aligned with GitHub Flow:

  • Dev pipeline – runs on every push to dev. Its job is early feedback and validating the distribution process.
  • Main pipeline – runs on main and version tags. Its job is stable, repeatable releases to PyPI.
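As a rough sketch, the trigger blocks of the two workflows could look like this (file names and the exact trigger set are illustrative, not necessarily the repository's actual workflow files):

```yaml
# dev workflow (illustrative): early feedback on every push to dev
on:
  push:
    branches: [dev]
---
# main workflow (illustrative): full tests on PRs, release on version tags
on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:
    branches: [main]
```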

Most CI setups stop at unit or integration tests. This one goes further by validating the artifact installed from TestPyPI.


1. Dev pipeline: pre-flight validation

The dev workflow is where most of the safety guarantees come from.

Stage A: Unit & Integration tests

  • Executed with pytest
  • Tests are separated via markers (unit, integration)
  • Fast feedback on logic and provider integration
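Marker-based test separation might look like the following sketch (the test names are hypothetical, and the markers would also need to be registered, e.g. under `[tool.pytest.ini_options]` in pyproject.toml):

```python
import pytest


@pytest.mark.unit
def test_build_request_payload():
    # Pure-logic check: fast, no network access needed
    payload = {"model": "gpt-4o", "messages": []}
    assert payload["model"] == "gpt-4o"


@pytest.mark.integration
def test_provider_roundtrip():
    # Would exercise a real provider client; skipped in the fast lane
    ...
```

CI can then run `pytest -m unit` for quick feedback and `pytest -m integration` as a separate, slower job.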

Stage B: publish to TestPyPI

After tests pass, the package is built and published to TestPyPI.

This step catches issues that tests alone cannot:

  • Incorrect pyproject.toml
  • Missing files in the source distribution
  • Broken dependency declarations
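A TestPyPI publish job along these lines would cover this stage (job name, Python version, and secret name are assumptions, not the repo's exact configuration):

```yaml
publish-testpypi:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    # Build sdist and wheel; a bad pyproject.toml fails here
    - run: python -m pip install build && python -m build
    - uses: pypa/gh-action-pypi-publish@release/v1
      with:
        repository-url: https://test.pypi.org/legacy/
        password: ${{ secrets.TEST_PYPI_API_TOKEN }}
```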

Stage C: E2E tests from TestPyPI

This is the critical part of the pipeline.

The job:

  1. Waits for TestPyPI to index the new release
  2. Installs the package from TestPyPI, not from source
  3. Pulls dependencies from the real PyPI
  4. Runs real end-to-end tests using live API keys

pip install --index-url https://test.pypi.org/simple/ \
            --extra-index-url https://pypi.org/simple \
            llm-api-adapter

At this point, the CI environment matches what users will experience after pip install.
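Putting those four steps together, the E2E job could be sketched as follows (the job and marker names, retry counts, and secret names are assumptions for illustration):

```yaml
e2e-testpypi:
  needs: publish-testpypi        # hypothetical name of the TestPyPI publish job
  runs-on: ubuntu-latest
  steps:
    - uses: actions/setup-python@v5
      with:
        python-version: '3.12'
    # TestPyPI can take a while to index a fresh upload, so retry the install
    - run: |
        for i in $(seq 1 10); do
          pip install --index-url https://test.pypi.org/simple/ \
                      --extra-index-url https://pypi.org/simple \
                      llm-api-adapter && break
          sleep 30
        done
    # Run the E2E suite against the installed package, not the source tree
    - run: pip install pytest && pytest -m e2e tests/
      env:
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```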


2. Main pipeline: controlled release

Once the package is validated in dev, changes move to main.

What runs on main

  • Full unit + integration test suite on every PR
  • No publishing on pushes

What triggers a release

  • A version tag (vX.Y.Z)
  • Build and publish to PyPI
  • Credentials handled via GitHub Secrets
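The release job itself can stay minimal, since everything risky was already validated in dev. A sketch (the tag filter and secret name are assumptions):

```yaml
release:
  if: startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - run: python -m pip install build && python -m build
    # Publish to the real PyPI; token comes from GitHub Secrets
    - uses: pypa/gh-action-pypi-publish@release/v1
      with:
        password: ${{ secrets.PYPI_API_TOKEN }}
```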

By the time a tag is pushed, the same artifact has already passed E2E tests via TestPyPI.


Why this setup works

Before publishing to PyPI, I know that:

  • The code behaves correctly (unit + integration tests)
  • The package is installable from a registry (TestPyPI)
  • External LLM providers respond as expected (E2E tests)

Most importantly, this approach prevents broken versions from ever being published to PyPI.


Conclusion

If your library depends on external APIs, testing only the source code is not enough.

Testing the published artifact is what makes releases predictable and safe.

The full setup is public and reproducible:

👉 Repository:

https://github.com/Inozem/llm_api_adapter

👉 GitHub Actions workflows:

https://github.com/Inozem/llm_api_adapter/tree/main/.github/workflows


Question for you

How do you usually set up CI for your open-source projects?
