A few weeks ago I had a simple goal: understand how Python libraries actually work — not just how to use them, but how they're built, packaged, and shipped.
So I built one. From scratch. No tutorials that skip the hard parts. No boilerplate generators. Just me, Python, and a lot of mistakes.
The result is valify — a data validation library that's now at v0.7.0 with 2,000+ downloads. Here's everything I learned along the way.
Why Build a Validation Library?
I picked validation because it's something every project needs, it's genuinely useful, and it touches every important part of the Python ecosystem:
- Clean OOP design
- Custom exceptions
- Type hints and mypy
- Packaging and PyPI
- Testing with pytest
- Documentation with Sphinx
It also gave me a chance to study how real libraries like pydantic and marshmallow work under the hood.
The Project Structure That Professionals Use
The first thing I learned is that Python library structure matters a lot more than I thought.
Most tutorials show you this:
myproject/
├── mypackage/
│   └── __init__.py
└── setup.py
But professional libraries use the src layout:
valify/
├── src/
│   └── valify/          ← the actual package
│       ├── __init__.py
│       ├── exceptions.py
│       ├── validators.py
│       └── schema.py
├── tests/
├── docs/
├── pyproject.toml
├── README.md
└── CHANGELOG.md
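Why src/? Without it, running import valify from your project folder can pick up your local development files instead of the installed package, because the current directory is typically searched before site-packages. Your tests can then pass against code that was never packaged correctly. The src/ folder prevents that subtle bug entirely.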
pyproject.toml — The Modern Standard
The old way of packaging Python involved three files: setup.py, setup.cfg, and MANIFEST.in. It was a mess.
Today, everything lives in one file:
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "valify"
version = "0.7.0"
description = "A composable, expressive data validation library for Python"
readme = "README.md"
requires-python = ">=3.10"
authors = [
{name = "Darshan Bamankar", email = "darshanbamankar7@gmail.com"}
]
dependencies = []
dependencies = [] is intentional — valify has zero external dependencies. This is a design goal. A good library doesn't bloat its users' environments.
Write Your Exceptions First — Always
This is the lesson I wish someone had told me before I started.
Every real library defines its own exception hierarchy. Here's why:
# Without custom exceptions: which library raised this?
try:
    schema.validate(data)
except ValueError:
    ...

# With custom exceptions: crystal clear
try:
    schema.validate(data)
except valify.ValidationError as e:
    print(e.field)    # which field failed
    print(e.value)    # what value was rejected
    print(e.message)  # human-readable message
Here's valify's exception hierarchy:
Exception
└── ValifyError                  ← base: catch everything valify raises
    ├── ValidationError          ← a value failed validation
    │   └── RequiredFieldError   ← a required field was missing
    └── SchemaError              ← the schema definition is invalid
The key design decision: every exception stores structured data as attributes, not just a string message. This lets callers inspect errors programmatically.
from typing import Any

class ValidationError(ValifyError):
    def __init__(self, message: str, *, field: str | None = None, value: Any = None) -> None:
        self.message = message
        self.field = field   # e.g. "name"
        self.value = value   # e.g. "A"
        full_message = f"[{field}] {message}" if field else message
        super().__init__(full_message)
Validators as Objects — The Strategy Pattern
The core insight of valify's design: validators are objects, not functions.
# Function approach — can't reuse or compose
validate_string("hello", min_length=2)
# Object approach — reusable, composable
v = StringValidator(min_length=2)
v.validate("hello")
Every validator inherits from a base class:
class Validator:
    def validate(self, value: Any) -> Any:
        raise NotImplementedError(
            f"{type(self).__name__} must implement validate()"
        )

    def to_json_schema(self) -> dict[str, Any]:
        raise NotImplementedError(
            f"{type(self).__name__} must implement to_json_schema()"
        )
This is the Strategy Pattern — each validator encapsulates one validation strategy. Because they're objects, you can store them in dictionaries, pass them around, and compose them together in a Schema.
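Here is a rough sketch of what one concrete strategy could look like. It is not valify's actual StringValidator: the ValidationError stand-in and the "Must be a string." message are assumptions, and only min_length is modelled:

```python
from typing import Any


class ValidationError(Exception):
    """Stand-in for valify.ValidationError so the sketch runs on its own."""


class Validator:
    def validate(self, value: Any) -> Any:
        raise NotImplementedError(f"{type(self).__name__} must implement validate()")


class StringValidator(Validator):
    def __init__(self, min_length: int = 0) -> None:
        self.min_length = min_length

    def validate(self, value: Any) -> str:
        if not isinstance(value, str):
            raise ValidationError("Must be a string.")  # assumed wording
        if len(value) < self.min_length:
            raise ValidationError(f"Must be at least {self.min_length} characters long.")
        return value


# One configured instance can be reused and stored under several field names.
name_rule = StringValidator(min_length=2)
fields: dict[str, Validator] = {"first_name": name_rule, "last_name": name_rule}

cleaned = {key: rule.validate("Darshan") for key, rule in fields.items()}
print(cleaned)
```

The configuration lives in the object, so the same rule can be applied to many values without repeating its arguments.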
The Detail That Trips Everyone Up — bool is a subclass of int
Python has a quirk that burned me early:
isinstance(True, int) # True !!
isinstance(False, int) # True !!
bool is a subclass of int in Python, so a plain isinstance(value, int) check accepts True and False as valid integers. The fix is to reject bool explicitly:
if not isinstance(value, int) or isinstance(value, bool):
    raise ValidationError(...)
Whenever you check for int, rule out bool as well, and in an if/elif chain test bool first. This pattern appears in IntValidator, FloatValidator, and Schema.from_example().
Accumulating Errors — The Most Important UX Decision
A lot of validation code stops at the first error:
❌ name: too short
# stops here, never checks age or email
valify collects ALL errors before raising:
❌ name: Must be at least 2 characters long.
❌ age: Must be at least 0.
❌ email: 'bad' is not a valid email address.
The implementation uses a simple dict to collect errors:
def validate(self, data: dict[str, Any]) -> dict[str, Any]:
    errors: dict[str, str] = {}
    result: dict[str, Any] = {}
    for field_name, validator in self.fields.items():
        if field_name not in data:
            errors[field_name] = "Required field is missing."
            continue
        try:
            result[field_name] = validator.validate(data[field_name])
        except ValidationError as e:
            errors[field_name] = e.message  # collect, don't raise
    if errors:  # raise everything at once
        raise ValidationError(...)
    return result
This single design decision makes valify dramatically more useful for real applications.
Schema.from_example() — The Killer Feature
This is the feature that sets valify apart. I haven't come across another validation library that infers a schema from sample data like this:
schema = Schema.from_example({
    "name": "Darshan",
    "age": 20,
    "email": "darshan@example.com",
    "score": 9.5,
    "active": True,
    "address": {
        "city": "Pune",
        "pin": "411001",
    },
    "tags": ["python", "developer"],
})
valify looks at your sample data and automatically infers:
- "Darshan" → StringValidator()
- 20 → IntValidator()
- "darshan@example.com" → EmailValidator() (detected via regex)
- 9.5 → FloatValidator()
- True → BoolValidator()
- {...} → nested Schema() (recursive!)
- [...] → ListValidator() (inferred from first item)
The implementation uses a @classmethod — a factory method that creates a new Schema instance:
@classmethod
def from_example(cls, example: dict[str, Any]) -> "Schema":
    fields: dict[str, Validator] = {}
    for key, value in example.items():
        if isinstance(value, bool):  # bool before int, critical!
            fields[key] = BoolValidator()
        elif isinstance(value, int):
            fields[key] = IntValidator()
        elif isinstance(value, float):
            fields[key] = FloatValidator()
        elif isinstance(value, str):
            if _EMAIL_RE.match(value):
                fields[key] = EmailValidator()
            else:
                fields[key] = StringValidator()
        elif isinstance(value, dict):
            fields[key] = cls.from_example(value)  # recursion!
        elif isinstance(value, list) and value:
            # infer from first item
            ...
    return cls(fields)
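The inference rules can be tried out in isolation with a small stand-alone sketch. It returns plain labels instead of valify validator objects, the email pattern is a deliberately loose one I made up (valify's _EMAIL_RE may differ), and raising TypeError for unsupported values is also an assumption:

```python
import re
from typing import Any

# Loose pattern assumed for this sketch only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def infer(value: Any) -> Any:
    """Return a label, or a nested structure of labels, for the inferred validator."""
    if isinstance(value, bool):  # bool must be tested before int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "email" if EMAIL_RE.match(value) else "string"
    if isinstance(value, dict):
        return {key: infer(item) for key, item in value.items()}  # recurse into nested data
    if isinstance(value, list) and value:
        return ("list", infer(value[0]))  # only the first item is inspected
    raise TypeError(f"Cannot infer a validator from {value!r}")


example = {
    "age": 20,
    "active": True,
    "email": "darshan@example.com",
    "tags": ["python", "developer"],
    "address": {"pin": "411001"},
}
print(infer(example))
```

One consequence of looking only at the first list item is that a mixed list such as ["python", 3] would be inferred as a list of strings, which is worth knowing before relying on inference for messy data.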
JSON Schema Export
Every validator in valify can export itself as standard JSON Schema:
from valify import Schema, StringValidator, IntValidator, EmailValidator
from valify import OptionalValidator
import json
schema = Schema({
    "name": StringValidator(min_length=2),
    "age": IntValidator(min_value=0, max_value=120),
    "email": EmailValidator(),
    "bio": OptionalValidator(StringValidator(), default=""),
})
print(json.dumps(schema.to_json_schema(), indent=2))
Output:
{
  "type": "object",
  "properties": {
    "name": {"type": "string", "minLength": 2},
    "age": {"type": "integer", "minimum": 0, "maximum": 120},
    "email": {"type": "string", "format": "email"},
    "bio": {"anyOf": [{"type": "string"}, {"type": "null"}]}
  },
  "required": ["name", "age", "email"]
}
This means valify schemas can be used to generate OpenAPI/Swagger documentation, validate JSON APIs, and integrate with any tool that understands JSON Schema.
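As a rough illustration of how the object-level schema could be assembled from per-field fragments, here is a sketch. The helper object_schema and its optional parameter are hypothetical names, not valify's API; the fragments are copied from the output above:

```python
import json
from typing import Any


def object_schema(properties: dict[str, dict[str, Any]], optional: set[str]) -> dict[str, Any]:
    """Combine per-field JSON Schema fragments into one object schema."""
    return {
        "type": "object",
        "properties": properties,
        # Optional fields are simply left out of the "required" list.
        "required": [name for name in properties if name not in optional],
    }


schema = object_schema(
    {
        "name": {"type": "string", "minLength": 2},
        "age": {"type": "integer", "minimum": 0, "maximum": 120},
        "email": {"type": "string", "format": "email"},
        "bio": {"anyOf": [{"type": "string"}, {"type": "null"}]},
    },
    optional={"bio"},
)
print(json.dumps(schema, indent=2))
```

Each validator only needs to know its own fragment; the schema's job is to collect them and decide which keys are required.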
Type Hints and mypy — Non-Negotiable
Every method in valify is fully typed:
def validate(self, value: Any) -> str:
    ...

def to_json_schema(self) -> dict[str, Any]:
    ...
Running mypy --strict src/valify passes with zero errors. This isn't just for show: it catches real bugs before runtime and makes the library a pleasure to use in IDEs with autocomplete.
One lesson: always annotate local variables when mypy can't infer the type:
# mypy infers dict[str, str] — wrong!
schema = {"type": "string"}
schema["minLength"] = 2 # error: int is not str
# explicit annotation — correct
schema: dict[str, Any] = {"type": "string"}
schema["minLength"] = 2 # ✅
Testing — 76 Tests and Counting
Every feature has tests. Every validator, every edge case, every error path:
import pytest
from valify import IntValidator, ValidationError

class TestIntValidator:
    def test_bool_rejected(self):
        v = IntValidator()
        with pytest.raises(ValidationError):
            v.validate(True)  # bool is not accepted as an int

    def test_coerce_string_to_int(self):
        v = IntValidator(coerce=True)
        assert v.validate("42") == 42
The key testing principles I learned:
- One assertion per test — when it fails, you know exactly what broke
- Test the sad path as much as the happy path
- Group tests in classes with setup_method for shared setup
The Numbers
- 2,000+ downloads in the first week
- 7 versions shipped
- 76 automated tests
- 0 external dependencies
- Live at https://valify.readthedocs.io
What's Next
- 0.8.0: Schema.is_valid(), Schema.errors(), RegexValidator
- Flask/FastAPI integration
- CLI tool: valify validate data.json schema.json
- Road to 1.0
Try It
pip install valify
from valify import Schema, StringValidator, IntValidator, EmailValidator
schema = Schema({
    "name": StringValidator(min_length=2),
    "age": IntValidator(min_value=0, max_value=120),
    "email": EmailValidator(),
})
schema.validate({
    "name": "Darshan",
    "age": 20,
    "email": "darshan@example.com",
})
- 📦 PyPI: https://pypi.org/project/valify/
- ⭐ GitHub: https://github.com/DarshanBamankar/valify
- 📖 Docs: https://valify.readthedocs.io
The Biggest Lesson
Building a library teaches you things that no tutorial can. You stop being a consumer of the ecosystem and start understanding how it actually works.
If you've been thinking about building your own library — just start. Pick something small, build it properly, ship it. The Python community is welcoming and the tooling has never been better.
And if you find valify useful, a ⭐ on GitHub means the world to a new open source maintainer.