Brij Kishore Pandey

The Complete Guide to Building and Publishing a Python Library

In this comprehensive guide, we'll walk through the entire process of creating a Python library from scratch, testing it, documenting it, and finally publishing it on PyPI. We'll use a hypothetical library called "DataWizard" as our example.

Table of Contents

  1. Project Setup
  2. Writing the Library Code
  3. Setting Up Testing
  4. Configuring for Packaging
  5. Creating Documentation
  6. Setting Up Continuous Integration
  7. Managing Dependencies
  8. Creating a Readme and License
  9. Version Control with Git
  10. Publishing to PyPI
  11. Setting Up Documentation Hosting
  12. Maintaining Your Project

1. Project Setup

First, create a new directory for your project and set up a virtual environment:

mkdir datawizard
cd datawizard
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

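If you are unsure whether the environment was activated, a quick check from Python can confirm it. This is a minimal sketch; the helper name `in_virtualenv` is made up for illustration and it only covers environments created with `venv` or tools that behave the same way.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # still points at the interpreter the environment was created from.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

With the environment active, the interpreter path should point inside the venv/ directory.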

Create a basic project structure:

datawizard/
├── src/
│   └── datawizard/
│       └── __init__.py
├── tests/
├── docs/
├── README.md
├── LICENSE
├── .gitignore
└── pyproject.toml
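If you would rather not create these files by hand, the layout above can be generated with a few lines of Python. This is only a sketch: the `scaffold` helper is hypothetical, and it creates empty placeholder files that the later sections fill in.

```python
from pathlib import Path

# Directories and files from the layout above, relative to the project root.
DIRECTORIES = ["src/datawizard", "tests", "docs"]
FILES = [
    "src/datawizard/__init__.py",
    "README.md",
    "LICENSE",
    ".gitignore",
    "pyproject.toml",
]


def scaffold(root: Path) -> None:
    """Create the empty project skeleton under ``root`` (hypothetical helper)."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        # touch() keeps the content of files that already exist.
        (root / name).touch(exist_ok=True)
```

Calling `scaffold(Path.cwd())` from inside the datawizard/ directory creates the skeleton in place.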

2. Writing the Library Code

Write your library code in the src/datawizard/ directory. For example, create a file core.py:

# src/datawizard/core.py

def process_data(data: list) -> list:
    """
    Process the input data.

    Args:
        data (list): Input data to process.

    Returns:
        list: Processed data.
    """
    return [item.upper() if isinstance(item, str) else item for item in data]

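To make the behavior concrete, here is a small sketch that repeats the function so it runs on its own and feeds it a mixed list. Note that only top-level strings are changed, nested lists are passed through as-is, and the input list is not modified.

```python
def process_data(data: list) -> list:
    """Same logic as core.py: upper-case strings, pass everything else through."""
    return [item.upper() if isinstance(item, str) else item for item in data]


original = ["hello", 42, None, ["nested"]]
result = process_data(original)

print(result)    # ['HELLO', 42, None, ['nested']]
print(original)  # ['hello', 42, None, ['nested']]  (the input is not modified)
```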

In __init__.py, import and expose your main functions:

# src/datawizard/__init__.py

from .core import process_data

__all__ = ['process_data']

3. Setting Up Testing

Install pytest:

pip install pytest

Create a test file tests/test_core.py:

# tests/test_core.py

from datawizard import process_data

def test_process_data():
    input_data = ["hello", 42, "world"]
    expected_output = ["HELLO", 42, "WORLD"]
    assert process_data(input_data) == expected_output

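The tests import the installed package, so with a src/ layout you need to install the project in editable mode first (pip install -e ., which requires the pyproject.toml from the next section). Then run your tests with: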

pytest
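A single assertion covers the happy path only. One way to extend it, sketched below, is a table of input and expected pairs checked in a loop. The function is copied in so the sketch runs on its own, and the case list is our own choice of edge cases rather than anything the library requires.

```python
def process_data(data: list) -> list:
    # Copied from core.py so the sketch is self-contained; in the real test
    # file this would be `from datawizard import process_data`.
    return [item.upper() if isinstance(item, str) else item for item in data]


# (input, expected) pairs; each one pins down a behavior worth keeping.
CASES = [
    ([], []),                            # empty input
    (["hello"], ["HELLO"]),              # plain string
    ([42, 3.5, None], [42, 3.5, None]),  # non-strings pass through
    (["MiXeD", ""], ["MIXED", ""]),      # mixed case and empty string
]


def test_process_data_cases():
    for data, expected in CASES:
        assert process_data(data) == expected, f"failed for {data!r}"


def test_process_data_does_not_mutate_input():
    data = ["hello"]
    process_data(data)
    assert data == ["hello"]
```

pytest collects both functions automatically because their names start with test_.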

4. Configuring for Packaging

Create a pyproject.toml file in the root directory:

[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "datawizard"
version = "0.1.0"
description = "A Python library for data processing wizardry"
readme = "README.md"
authors = [{ name = "Your Name", email = "your.email@example.com" }]
license = { file = "LICENSE" }
classifiers = [
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python",
    "Programming Language :: Python :: 3",
]
keywords = ["data", "processing", "wizard"]
dependencies = [
    "numpy>=1.20.0",
    "pandas>=1.2.0",
]
requires-python = ">=3.7"

[project.urls]
Homepage = "https://github.com/yourusername/datawizard"

[tool.setuptools.packages.find]
where = ["src"]
include = ["datawizard*"]
exclude = ["tests*"]

5. Creating Documentation

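Install MkDocs with the Material theme and the mkdocstrings plugin (the quotes keep shells such as zsh from interpreting the square brackets):

pip install mkdocs-material "mkdocstrings[python]"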

Create an mkdocs.yml file in the root directory:

site_name: DataWizard Documentation
theme:
  name: material

plugins:
  - search
  - mkdocstrings:
      default_handler: python

nav:
  - Home: index.md
  - Installation: installation.md
  - Usage: usage.md
  - API Reference: api_reference.md

Create Markdown files in the docs/ directory for each section of your documentation.

Preview your documentation locally with:

mkdocs serve
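Since mkdocstrings builds the API reference from your docstrings, any example you put in a docstring ends up in the published docs. One way to keep such examples honest, sketched here with the standard-library `doctest` module, is to run them as tests. The `Examples` section and the `run_docstring_examples` helper are additions for illustration, not part of the original core.py.

```python
import doctest


def process_data(data: list) -> list:
    """
    Upper-case every string in ``data`` and leave other items unchanged.

    Args:
        data (list): Input data to process.

    Returns:
        list: Processed data.

    Examples:
        >>> process_data(["hello", 42, "world"])
        ['HELLO', 42, 'WORLD']
    """
    return [item.upper() if isinstance(item, str) else item for item in data]


def run_docstring_examples(func) -> doctest.TestResults:
    """Run the ``>>>`` examples in ``func``'s docstring and report the totals."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_docstring_examples(process_data)
print(results.failed, results.attempted)  # 0 1
```

If the function and its docstring example ever disagree, `results.failed` becomes non-zero.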

6. Setting Up Continuous Integration

Create a .github/workflows/ci.yml file for GitHub Actions:

name: CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
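        # Quote every version so YAML never reads one as a number (3.10 would become 3.1).
        # 3.7 is end-of-life; drop it from the list if the runner can no longer set it up.
        python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        # pyproject.toml defines no "dev" extra, so install the dev tools from requirements.txt
        pip install -r requirements.txt
        pip install .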
    - name: Run tests
      run: pytest

7. Managing Dependencies

Create a requirements.txt file for development dependencies:

pytest==7.3.1
mkdocs-material==9.1.15
mkdocstrings[python]==0.22.0

Install these dependencies:

pip install -r requirements.txt
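Pinned versions only help if the environment actually matches them. The sketch below compares the pins with what is installed, using the standard-library `importlib.metadata`. The helpers `parse_pins` and `find_mismatches` are hypothetical, and the parser is deliberately simple: it understands `name==version` lines (with optional extras) and ignores everything else.

```python
import re
from importlib import metadata

# Matches lines such as "pytest==7.3.1" or "mkdocstrings[python]==0.22.0".
PIN_PATTERN = re.compile(r"^([A-Za-z0-9._-]+)(?:\[[^\]]*\])?==([^\s;#]+)")


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Map each package pinned with == to its version; other lines are ignored."""
    pins = {}
    for line in requirements_text.splitlines():
        match = PIN_PATTERN.match(line.strip())
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, str]:
    """Return packages whose installed version differs from the pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


requirements = """\
pytest==7.3.1
mkdocs-material==9.1.15
mkdocstrings[python]==0.22.0
"""
print(parse_pins(requirements))
# {'pytest': '7.3.1', 'mkdocs-material': '9.1.15', 'mkdocstrings': '0.22.0'}
```

An empty result from `find_mismatches(parse_pins(...))` means the environment matches the file.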

8. Creating a Readme and License

Create a README.md file in the root directory:

# DataWizard

DataWizard is a Python library for advanced data processing.

## Installation

pip install datawizard

## Usage

from datawizard import process_data

data = ["hello", 42, "world"]
result = process_data(data)
print(result)  # Outputs: ['HELLO', 42, 'WORLD']

Choose a license (e.g., MIT License) and add it to the LICENSE file.

9. Version Control with Git

Create a .gitignore file first, so the virtual environment and build artifacts never end up in a commit:

venv/
__pycache__/
*.pyc
.pytest_cache/
dist/
build/
*.egg-info/

Then initialize a git repository and make your first commit:

git init
git add .
git commit -m "Initial commit"

10. Publishing to PyPI

First, install build and twine:

pip install build twine

Build your package (this writes a source distribution and a wheel to the dist/ directory):

python -m build
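Before uploading anything, it is worth checking that the wheel really contains your modules, since a misconfigured src layout can produce an empty package. A wheel is a zip archive, so the standard library can inspect it. This is a sketch with hypothetical helper names, and the wheel filename in the usage note is an assumption about what the build produces for this project.

```python
import zipfile
from pathlib import Path


def wheel_members(wheel_path: Path) -> list[str]:
    """List the files stored in a wheel (a wheel is a zip archive)."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(wheel.namelist())


def missing_from_wheel(wheel_path: Path, expected: list[str]) -> list[str]:
    """Return the entries of ``expected`` that the wheel does not contain."""
    members = set(wheel_members(wheel_path))
    return [name for name in expected if name not in members]
```

For example, `missing_from_wheel(Path("dist/datawizard-0.1.0-py3-none-any.whl"), ["datawizard/__init__.py", "datawizard/core.py"])` should return an empty list when packaging worked.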

To publish automatically whenever you create a GitHub release, create a .github/workflows/publish.yml file:

name: Publish to PyPI

on:
  release:
    types: [created]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: '3.x'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install build twine
    - name: Build and publish
      env:
        TWINE_USERNAME: __token__
        TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
      run: |
        python -m build
        twine upload dist/*

Create a PyPI account at https://pypi.org and generate an API token. Add this token as a secret in your GitHub repository settings with the name PYPI_API_TOKEN.

11. Setting Up Documentation Hosting

Create a .github/workflows/docs.yml file:

name: Publish docs via GitHub Pages
on:
  push:
    branches:
      - main

jobs:
  build:
    name: Deploy docs
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main
        uses: actions/checkout@v4

      - name: Deploy docs
        uses: mhausenblas/mkdocs-deploy-gh-pages@master
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          CONFIG_FILE: mkdocs.yml
          EXTRA_PACKAGES: build-base

Enable GitHub Pages in your repository settings, setting the source to the gh-pages branch.

12. Maintaining Your Project

  • Regularly update your dependencies
  • Respond to issues and pull requests on GitHub
  • Keep your documentation up-to-date
  • Release new versions when you make significant changes or bug fixes

Remember to increment the version number in pyproject.toml for each new release. PyPI rejects an upload whose version already exists, so every release needs a new number.
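If you want to avoid editing the number by hand, the bump itself is easy to script. The sketch below assumes a plain MAJOR.MINOR.PATCH version like the 0.1.0 used in this guide; `bump_version` is a hypothetical helper and does not handle pre-release or other suffixes.

```python
def bump_version(version: str, part: str = "patch") -> str:
    """Increment one part of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("0.1.0"))           # 0.1.1
print(bump_version("0.1.0", "minor"))  # 0.2.0
print(bump_version("0.1.0", "major"))  # 1.0.0
```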

Conclusion

Building, testing, documenting, and publishing a Python library involves many steps, but it's a rewarding process that contributes to the Python ecosystem. By following this guide, you can create a well-structured, documented, and maintainable Python library that others can easily install and use via PyPI.

Remember that this is an iterative process. As you develop your library further, you'll likely need to update your documentation, add new tests, and possibly refactor your code. Always keep your users in mind and strive to make your library as user-friendly and well-documented as possible.

Happy coding!