Darshan Bamankar

I Built My First Python Library From Scratch — Here's Everything I Learned

A few weeks ago I had a simple goal: understand how Python libraries actually work — not just how to use them, but how they're built, packaged, and shipped.

So I built one. From scratch. No tutorials that skip the hard parts. No boilerplate generators. Just me, Python, and a lot of mistakes.

The result is valify — a data validation library that's now at v0.7.0 with 2,000+ downloads. Here's everything I learned along the way.


Why Build a Validation Library?

I picked validation because it's something every project needs, it's genuinely useful, and it touches every important part of the Python ecosystem:

  • Clean OOP design
  • Custom exceptions
  • Type hints and mypy
  • Packaging and PyPI
  • Testing with pytest
  • Documentation with Sphinx

It also gave me a chance to study how real libraries like pydantic and marshmallow work under the hood.


The Project Structure That Professionals Use

The first thing I learned is that Python library structure matters a lot more than I thought.

Most tutorials show you this:

myproject/
├── mypackage/
│   └── __init__.py
└── setup.py

But many professionally maintained libraries use the src layout:

valify/
├── src/
│   └── valify/          ← the actual package
│       ├── __init__.py
│       ├── exceptions.py
│       ├── validators.py
│       └── schema.py
├── tests/
├── docs/
├── pyproject.toml
├── README.md
└── CHANGELOG.md

Why src/? Without it, when you're in your project folder and run import valify, Python might import your local development files instead of the installed package. The src/ folder prevents that subtle bug entirely.


pyproject.toml — The Modern Standard

The old way of packaging a Python project could involve three files: setup.py, setup.cfg, and MANIFEST.in. It was a mess.

Today, everything lives in one file:

[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "valify"
version = "0.7.0"
description = "A composable, expressive data validation library for Python"
readme = "README.md"
requires-python = ">=3.10"
authors = [
    {name = "Darshan Bamankar", email = "darshanbamankar7@gmail.com"}
]
dependencies = []

dependencies = [] is intentional — valify has zero external dependencies. This is a design goal. A good library doesn't bloat its users' environments.


Write Your Exceptions First — Always

This is the lesson I wish someone had told me before I started.

Every real library defines its own exception hierarchy. Here's why:

# Without custom exceptions: which library raised this?
try:
    ...
except ValueError:
    ...

# With custom exceptions: crystal clear
try:
    ...
except valify.ValidationError as e:
    print(e.field)    # which field failed
    print(e.value)    # what value was rejected
    print(e.message)  # human-readable message

Here's valify's exception hierarchy:

Exception
└── ValifyError                  base: catch everything valify raises
    ├── ValidationError          a value failed validation
    │   └── RequiredFieldError   a required field was missing
    └── SchemaError              the schema definition is invalid

The key design decision: every exception stores structured data as attributes, not just a string message. This lets callers inspect errors programmatically.

class ValidationError(ValifyError):
    def __init__(self, message: str, *, field: str | None = None, value: Any = None) -> None:
        self.message = message
        self.field = field      # "name"
        self.value = value      # "A"
        full_message = f"[{field}] {message}" if field else message
        super().__init__(full_message)

Validators as Objects — The Strategy Pattern

The core insight of valify's design: validators are objects, not functions.

# Function approach — can't reuse or compose
validate_string("hello", min_length=2)

# Object approach — reusable, composable
v = StringValidator(min_length=2)
v.validate("hello")

Every validator inherits from a base class:

class Validator:
    def validate(self, value: Any) -> Any:
        raise NotImplementedError(
            f"{type(self).__name__} must implement validate()"
        )

    def to_json_schema(self) -> dict[str, Any]:
        raise NotImplementedError(
            f"{type(self).__name__} must implement to_json_schema()"
        )

This is the Strategy Pattern — each validator encapsulates one validation strategy. Because they're objects, you can store them in dictionaries, pass them around, and compose them together in a Schema.
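
Here is a rough sketch of what "validators as interchangeable objects" buys you. MinLength and NonNegative are hypothetical strategies written for this example (they raise a plain ValueError to stay short), not valify's real classes.

```python
from typing import Any


class Validator:
    def validate(self, value: Any) -> Any:
        raise NotImplementedError(f"{type(self).__name__} must implement validate()")


class MinLength(Validator):
    """Hypothetical strategy: accept strings of at least `n` characters."""

    def __init__(self, n: int) -> None:
        self.n = n

    def validate(self, value: Any) -> Any:
        if not isinstance(value, str) or len(value) < self.n:
            raise ValueError(f"Must be at least {self.n} characters long.")
        return value


class NonNegative(Validator):
    """Hypothetical strategy: accept numbers >= 0 (bool excluded)."""

    def validate(self, value: Any) -> Any:
        if isinstance(value, bool) or not isinstance(value, (int, float)) or value < 0:
            raise ValueError("Must be at least 0.")
        return value


# Because strategies are plain objects, they can live in a dict and be swapped freely.
rules: dict[str, Validator] = {"name": MinLength(2), "age": NonNegative()}
record = {"name": "Darshan", "age": 20}
cleaned = {key: rules[key].validate(record[key]) for key in rules}
print(cleaned)
```

The calling code never needs to know which concrete validator it holds; it only calls validate().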


The Detail That Trips Everyone Up — bool is a subclass of int

Python has a quirk that burned me early:

isinstance(True, int)   # True !!
isinstance(False, int)  # True !!

bool is a subclass of int in Python. This means if you check int first, True and False pass as valid integers. The fix:

if not isinstance(value, int) or isinstance(value, bool):
    raise ValidationError(...)

Always exclude bool explicitly wherever you accept int, and in an if/elif chain test bool before int. This pattern appears in IntValidator, FloatValidator, and Schema.from_example().
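
The check is small enough to isolate and try on a few values. is_strict_int is a hypothetical helper name used only for this sketch.

```python
from typing import Any


def is_strict_int(value: Any) -> bool:
    """True only for real integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


samples = [0, 42, True, False, 3.0, "7"]
for sample in samples:
    print(repr(sample), is_strict_int(sample))
```

Only 0 and 42 pass. True and False are rejected despite being int instances, and 3.0 and "7" fail the int check outright.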


Accumulating Errors — The Most Important UX Decision

A naive validator stops at the first error:

❌ name: too short
# stops here, never checks age or email

valify collects ALL errors before raising:

❌ name: Must be at least 2 characters long.
❌ age: Must be at least 0.
❌ email: 'bad' is not a valid email address.

The implementation uses a simple dict to collect errors:

def validate(self, data: dict[str, Any]) -> dict[str, Any]:
    errors: dict[str, str] = {}
    result: dict[str, Any] = {}

    for field_name, validator in self.fields.items():
        if field_name not in data:
            errors[field_name] = "Required field is missing."
            continue
        try:
            result[field_name] = validator.validate(data[field_name])
        except ValidationError as e:
            errors[field_name] = e.message  # collect, don't raise

    if errors:  # raise everything at once
        raise ValidationError(...)

    return result

This single design decision makes valify far more useful in real applications: someone fixing a form or an API payload sees every problem at once instead of resubmitting once per error.
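
The same collect-then-raise idea, as a self-contained sketch you can run. Everything here is an assumption made for illustration: FieldErrors is a hypothetical aggregate exception (the article does not show what valify actually raises), and plain functions stand in for validator objects to keep it short.

```python
from typing import Any, Callable


class FieldErrors(Exception):
    """Hypothetical aggregate error; valify's real exception may differ."""

    def __init__(self, errors: dict[str, str]) -> None:
        self.errors = errors
        super().__init__("; ".join(f"{k}: {v}" for k, v in errors.items()))


def min_length_2(value: Any) -> Any:
    if not isinstance(value, str) or len(value) < 2:
        raise ValueError("Must be at least 2 characters long.")
    return value


def non_negative(value: Any) -> Any:
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("Must be at least 0.")
    return value


def validate_all(checks: dict[str, Callable[[Any], Any]], data: dict[str, Any]) -> dict[str, Any]:
    errors: dict[str, str] = {}
    result: dict[str, Any] = {}
    for name, check in checks.items():
        if name not in data:
            errors[name] = "Required field is missing."
            continue
        try:
            result[name] = check(data[name])
        except ValueError as exc:
            errors[name] = str(exc)  # collect and keep going
    if errors:
        raise FieldErrors(errors)  # one exception carrying every failure
    return result


try:
    validate_all({"name": min_length_2, "age": non_negative}, {"name": "A", "age": -1})
except FieldErrors as exc:
    report = exc.errors

print(report)
```

Keeping the per-field messages in a dict on the exception lets a caller render them next to the right form field without parsing a string.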


Schema.from_example() — The Killer Feature

This is valify's standout feature. I haven't come across it in the validation libraries I studied:

schema = Schema.from_example({
    "name":    "Darshan",
    "age":     20,
    "email":   "darshan@example.com",
    "score":   9.5,
    "active":  True,
    "address": {
        "city": "Pune",
        "pin":  "411001",
    },
    "tags": ["python", "developer"],
})

valify looks at your sample data and automatically infers:

  • "Darshan" → StringValidator()
  • 20 → IntValidator()
  • "darshan@example.com" → EmailValidator() (detected via regex)
  • 9.5 → FloatValidator()
  • True → BoolValidator()
  • {...} → nested Schema() (recursive!)
  • [...] → ListValidator() (inferred from first item)

The implementation uses a @classmethod — a factory method that creates a new Schema instance:

@classmethod
def from_example(cls, example: dict[str, Any]) -> "Schema":
    fields: dict[str, Validator] = {}

    for key, value in example.items():
        if isinstance(value, bool):      # bool before int — critical!
            fields[key] = BoolValidator()
        elif isinstance(value, int):
            fields[key] = IntValidator()
        elif isinstance(value, float):
            fields[key] = FloatValidator()
        elif isinstance(value, str):
            if _EMAIL_RE.match(value):
                fields[key] = EmailValidator()
            else:
                fields[key] = StringValidator()
        elif isinstance(value, dict):
            fields[key] = cls.from_example(value)  # recursion!
        elif isinstance(value, list) and value:
            # infer from first item
            ...

    return cls(fields)
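
To make the branching order easy to experiment with, here is a stripped-down sketch that returns plain labels instead of validator objects. It is not valify's code: infer_kind is a hypothetical function, EMAIL_RE is a deliberately loose pattern standing in for _EMAIL_RE, and the list and fallback branches are my own guesses at reasonable behavior.

```python
import re
from typing import Any

# Deliberately loose pattern for illustration; not valify's actual _EMAIL_RE.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def infer_kind(value: Any) -> Any:
    """Map a sample value to a label describing the validator it suggests."""
    if isinstance(value, bool):  # must come before the int branch
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "email" if EMAIL_RE.match(value) else "string"
    if isinstance(value, dict):
        return {key: infer_kind(item) for key, item in value.items()}  # recurse
    if isinstance(value, list) and value:
        return ["list", infer_kind(value[0])]  # first item decides the element kind
    return "unknown"  # e.g. None or an empty list: nothing to infer from


example = {
    "age": 20,
    "active": True,
    "email": "darshan@example.com",
    "address": {"city": "Pune"},
    "tags": ["python", "developer"],
}
print(infer_kind(example))
```

Swapping the first two branches makes "active" come back as "int", which is exactly the bool-before-int trap. The explicit fallback matters too: a value such as None or an empty list matches none of the typed branches, so any inference routine has to decide what to do with it.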

JSON Schema Export

Every validator in valify can export itself as standard JSON Schema:

from valify import Schema, StringValidator, IntValidator, EmailValidator
from valify import OptionalValidator
import json

schema = Schema({
    "name":  StringValidator(min_length=2),
    "age":   IntValidator(min_value=0, max_value=120),
    "email": EmailValidator(),
    "bio":   OptionalValidator(StringValidator(), default=""),
})

print(json.dumps(schema.to_json_schema(), indent=2))

Output:

{
  "type": "object",
  "properties": {
    "name": {"type": "string", "minLength": 2},
    "age":  {"type": "integer", "minimum": 0, "maximum": 120},
    "email": {"type": "string", "format": "email"},
    "bio":  {"anyOf": [{"type": "string"}, {"type": "null"}]}
  },
  "required": ["name", "age", "email"]
}

This means valify schemas can be used to generate OpenAPI/Swagger documentation, validate JSON APIs, and integrate with any tool that understands JSON Schema.
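
The object-level export mostly comes down to merging per-field fragments and leaving optional fields out of "required". A minimal sketch of that assembly step, using a hypothetical object_schema helper rather than valify's own to_json_schema():

```python
import json
from typing import Any


def object_schema(properties: dict[str, dict[str, Any]], optional: set[str]) -> dict[str, Any]:
    """Assemble a JSON Schema object from per-field fragments.

    Hypothetical helper: fields named in `optional` are left out of "required".
    """
    return {
        "type": "object",
        "properties": properties,
        "required": [name for name in properties if name not in optional],
    }


fragments: dict[str, dict[str, Any]] = {
    "name": {"type": "string", "minLength": 2},
    "age": {"type": "integer", "minimum": 0, "maximum": 120},
    "bio": {"anyOf": [{"type": "string"}, {"type": "null"}]},
}
doc = object_schema(fragments, optional={"bio"})
print(json.dumps(doc, indent=2))
```

Because the result is an ordinary dict, it serializes with json.dumps and can be handed to any tool that consumes JSON Schema.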


Type Hints and mypy — Non-Negotiable

Every method in valify is fully typed:

def validate(self, value: Any) -> str:
    ...

def to_json_schema(self) -> dict[str, Any]:
    ...

Running mypy src/valify with strict mode passes with zero errors. This isn't just for show — it catches real bugs before runtime and makes the library a pleasure to use in IDEs with autocomplete.

One lesson: annotate a variable explicitly whenever mypy would otherwise infer a narrower type than you intend:

# mypy infers dict[str, str] — wrong!
schema = {"type": "string"}
schema["minLength"] = 2  # error: int is not str

# explicit annotation — correct
schema: dict[str, Any] = {"type": "string"}
schema["minLength"] = 2  # ✅

Testing — 76 Tests and Counting

Every feature has tests. Every validator, every edge case, every error path:

import pytest
from valify import IntValidator, ValidationError

class TestIntValidator:
    def test_bool_rejected(self):
        v = IntValidator()
        with pytest.raises(ValidationError):
            v.validate(True)  # bool subclasses int, but must still be rejected

    def test_coerce_string_to_int(self):
        v = IntValidator(coerce=True)
        assert v.validate("42") == 42

The key testing principles I learned:

  • One assertion per test — when it fails, you know exactly what broke
  • Test the sad path as much as the happy path
  • Group tests in classes with setup_method for shared setup

The Numbers


What's Next

  • 0.8.0: Schema.is_valid(), Schema.errors(), RegexValidator
  • Flask/FastAPI integration
  • CLI tool — valify validate data.json schema.json
  • Road to 1.0

Try It

pip install valify

Then, in Python:

from valify import Schema, StringValidator, IntValidator, EmailValidator

schema = Schema({
    "name":  StringValidator(min_length=2),
    "age":   IntValidator(min_value=0, max_value=120),
    "email": EmailValidator(),
})

schema.validate({
    "name":  "Darshan",
    "age":   20,
    "email": "darshan@example.com",
})

The Biggest Lesson

Building a library teaches you things that no tutorial can. You stop being a consumer of the ecosystem and start understanding how it actually works.

If you've been thinking about building your own library — just start. Pick something small, build it properly, ship it. The Python community is welcoming and the tooling has never been better.

And if you find valify useful, a ⭐ on GitHub means the world to a new open source maintainer.

