Pydantic is a full data validation, serialization, and settings management library. With v2, whose validation core was rewritten in Rust, it is also blazingly fast. This article covers everything from the basics to the advanced patterns that will change how you write Python.
What Is Pydantic?
Pydantic lets you define data schemas using Python type hints and then validates data against those schemas at runtime. When validation fails, you get clear, structured error messages instead of cryptic KeyError or AttributeError exceptions buried in your business logic.
pip install pydantic
pip install "pydantic[email]" # for EmailStr and other extras
The Basics: BaseModel
from pydantic import BaseModel
class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True

# Valid data
user = User(id=1, name="Alice", email="alice@example.com")
print(user.id) # 1
print(user.is_active) # True (default)
# Pydantic converts compatible types
user2 = User(id="42", name="Bob", email="bob@example.com")
print(type(user2.id)) # <class 'int'> — "42" was coerced to 42
Type coercion is one of Pydantic's most useful behaviors. It tries to convert compatible values rather than failing immediately.
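Coercion has limits, and it helps to see where they are. The sketch below uses a hypothetical Item model (not part of the examples above) to show which inputs the default, non-strict mode accepts and which it rejects because converting them would lose information.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    count: int
    in_stock: bool


# Lossless conversions succeed in the default (lax) mode
item = Item(count="42", in_stock="yes")
print(item.count, item.in_stock)


def is_valid(**data) -> bool:
    """Return True if the data validates against Item."""
    try:
        Item(**data)
        return True
    except ValidationError:
        return False


whole_float_ok = is_valid(count=3.0, in_stock=True)     # no fractional part
fractional_ok = is_valid(count=3.5, in_stock=True)      # would lose data
text_ok = is_valid(count="abc", in_stock=True)          # not a number
print(whole_float_ok, fractional_ok, text_ok)
```

Only the first of the three checks passes: Pydantic converts a value when nothing is lost, and raises otherwise.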
Validation
from pydantic import BaseModel, ValidationError
class Product(BaseModel):
    name: str
    price: float
    quantity: int

try:
    p = Product(name="Widget", price="not_a_number", quantity=10)
except ValidationError as e:
    # indent=2 pretty-prints the JSON, matching the output shown below
    print(e.json(indent=2))
Output:
[
{
"type": "float_parsing",
"loc": ["price"],
"msg": "Input should be a valid number, unable to parse string as a number",
"input": "not_a_number",
"url": "https://errors.pydantic.dev/2.x/v/float_parsing"
}
]
Structured, machine-readable errors with field location, error type, and the offending input.
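Because the errors are structured, you can also work with them as Python objects instead of parsing JSON. Here is a minimal sketch using e.errors(); the collect_errors helper is a made-up name for illustration.

```python
from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    name: str
    price: float
    quantity: int


def collect_errors(data: dict) -> list[tuple[str, str]]:
    """Return (field path, error type) pairs, or an empty list if valid."""
    try:
        Product(**data)
    except ValidationError as e:
        # Each error is a dict with 'type', 'loc', 'msg', and 'input' keys
        return [
            (".".join(str(part) for part in err["loc"]), err["type"])
            for err in e.errors()
        ]
    return []


# One bad value and one missing field are reported together
problems = collect_errors({"name": "Widget", "price": "not_a_number"})
print(problems)
```

All problems come back in a single exception, so a client can fix every field in one round trip.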
Field: Fine-Grained Validation
The Field function lets you add constraints, defaults, aliases, and metadata to individual fields.
from pydantic import BaseModel, Field
from typing import Optional
class Article(BaseModel):
    title: str = Field(..., min_length=3, max_length=200)
    slug: str = Field(..., pattern=r"^[a-z0-9-]+$")
    body: str = Field(..., min_length=10)
    views: int = Field(default=0, ge=0)
    rating: float = Field(default=5.0, ge=1.0, le=5.0)
    tags: list[str] = Field(default_factory=list, max_length=10)
    author_id: Optional[int] = Field(None, description="Foreign key to users table")

... (Ellipsis) means the field is required. Common constraints:
- min_length/max_length: for strings and lists
- ge/le/gt/lt: numeric bounds (≥, ≤, >, <)
- pattern: regex validation for strings
- default_factory: callable that produces the default (use this instead of default=[])
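To see these constraints fail, here is a small sketch with a hypothetical Review model. Each violated constraint produces its own error type, which you can branch on in calling code.

```python
from pydantic import BaseModel, Field, ValidationError


class Review(BaseModel):  # hypothetical model, for illustration only
    title: str = Field(..., min_length=3)
    rating: float = Field(default=5.0, ge=1.0, le=5.0)
    tags: list[str] = Field(default_factory=list, max_length=2)


error_types: dict[str, str] = {}
try:
    # Title too short, rating above the upper bound, too many tags
    Review(title="Hi", rating=7, tags=["a", "b", "c"])
except ValidationError as e:
    error_types = {err["loc"][0]: err["type"] for err in e.errors()}

print(error_types)
```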
Custom Validators
Field Validators
from pydantic import BaseModel, field_validator
class UserRegistration(BaseModel):
    username: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        return v.lower()  # normalize to lowercase

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain at least one uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain at least one digit")
        return v

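A validator does two jobs: it can reject a value, and it can return a modified one. The sketch below shows both on a trimmed-down model (Signup is an illustrative name, not part of the original example).

```python
from pydantic import BaseModel, ValidationError, field_validator


class Signup(BaseModel):  # trimmed-down illustration of the model above
    username: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        return v.lower()


# The returned value replaces the input, so the name is stored lowercased
ok = Signup(username="Alice42")
print(ok.username)

first_error = None
try:
    Signup(username="alice!")
except ValidationError as e:
    # A ValueError raised in a validator surfaces as a 'value_error' entry
    first_error = e.errors()[0]
    print(first_error["type"], first_error["msg"])
```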
Model Validators (Cross-Field Validation)
Sometimes you need to validate multiple fields together. Use @model_validator:
from datetime import date
from pydantic import BaseModel, model_validator

class PasswordReset(BaseModel):
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_must_match(self) -> "PasswordReset":
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self

class DateRange(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> "DateRange":
        if self.end_date <= self.start_date:
            raise ValueError("end_date must be after start_date")
        return self

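Here is a short usage sketch for the DateRange model. It repeats the class so the snippet runs on its own, and it shows that a cross-field failure is reported against the model as a whole rather than a single field.

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class DateRange(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> "DateRange":
        if self.end_date <= self.start_date:
            raise ValueError("end_date must be after start_date")
        return self


# ISO strings are parsed into date objects before the model validator runs
valid = DateRange(start_date="2024-01-01", end_date="2024-01-31")

rejected = False
error_loc = None
try:
    DateRange(start_date="2024-02-01", end_date="2024-01-01")
except ValidationError as e:
    rejected = True
    error_loc = e.errors()[0]["loc"]  # empty tuple: the error is model-level

print(valid.end_date, rejected, error_loc)
```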
Types That Do the Heavy Lifting
Pydantic ships with a rich set of built-in types that handle common validation patterns:
from pydantic import BaseModel, EmailStr, HttpUrl, AnyUrl
from pydantic import IPvAnyAddress, SecretStr, PositiveInt, NegativeFloat
from datetime import datetime
from uuid import UUID
class RichModel(BaseModel):
    # Validated email address
    email: EmailStr
    # Validated URLs
    website: HttpUrl
    profile_picture: AnyUrl
    # UUID (accepts strings and UUID objects)
    user_id: UUID
    # Hides value in repr and logs, great for passwords/tokens
    api_key: SecretStr
    # Type shortcuts
    age: PositiveInt  # int > 0
    temperature: NegativeFloat  # float < 0
    # IP address (v4 or v6)
    ip_address: IPvAnyAddress
    # Datetime parsing from strings, timestamps, etc.
    created_at: datetime

data = RichModel(
    email="user@example.com",
    website="https://example.com",
    profile_picture="s3://bucket/image.png",
    user_id="550e8400-e29b-41d4-a716-446655440000",
    api_key="super-secret-key",
    age=25,
    temperature=-3.5,
    ip_address="192.168.1.1",
    created_at="2024-01-15T10:30:00Z",
)
print(data.api_key) # ********** (hidden!)
print(data.api_key.get_secret_value()) # super-secret-key
Nested Models and Composition
Pydantic models compose naturally:
from pydantic import BaseModel
from typing import Optional
class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: Optional[str] = None

class Company(BaseModel):
    name: str
    address: Address

class Employee(BaseModel):
    name: str
    company: Company
    work_address: Optional[Address] = None

# Nested dict input works seamlessly
data = {
    "name": "Alice",
    "company": {
        "name": "Acme Corp",
        "address": {
            "street": "123 Main St",
            "city": "Nairobi",
            "country": "Kenya"
        }
    }
}
employee = Employee(**data)
print(employee.company.address.city) # Nairobi
Validation cascades through all nested models automatically.
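The cascade also applies to error reporting: when something deep in the structure is wrong, the error's loc gives the full path to it. A minimal sketch, using shortened versions of the models above:

```python
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str


class Company(BaseModel):
    name: str
    address: Address


class Employee(BaseModel):
    name: str
    company: Company


bad_data = {
    "name": "Alice",
    "company": {
        "name": "Acme Corp",
        "address": {"street": "123 Main St"},  # 'city' is missing
    },
}

nested_error = None
try:
    Employee(**bad_data)
except ValidationError as e:
    nested_error = e.errors()[0]
    # loc is the path from the outer model down to the failing field
    print(nested_error["loc"], nested_error["type"])
```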
Generics: Reusable Response Wrappers
One of Pydantic's most useful patterns for APIs is a generic response wrapper:
from pydantic import BaseModel
from typing import Generic, TypeVar, Optional, List
T = TypeVar("T")
class PaginatedResponse(BaseModel, Generic[T]):
    items: List[T]
    total: int
    page: int
    per_page: int
    has_next: bool

class APIResponse(BaseModel, Generic[T]):
    success: bool
    data: Optional[T] = None
    error: Optional[str] = None

class UserOut(BaseModel):
    id: int
    name: str
    email: str

# Usage
response = APIResponse[UserOut](
    success=True,
    data=UserOut(id=1, name="Alice", email="alice@example.com")
)
paginated = PaginatedResponse[UserOut](
    items=[UserOut(id=1, name="Alice", email="alice@example.com")],
    total=1,
    page=1,
    per_page=10,
    has_next=False,
)
In FastAPI, these generics show up correctly in the OpenAPI schema too.
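The type parameter is not just documentation. A parametrized model such as APIResponse[UserOut] validates its payload against UserOut, which the following sketch demonstrates with raw dict input (UserOut is shortened here to keep the snippet small).

```python
from typing import Generic, Optional, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T")


class APIResponse(BaseModel, Generic[T]):
    success: bool
    data: Optional[T] = None
    error: Optional[str] = None


class UserOut(BaseModel):  # shortened version of the model above
    id: int
    name: str


# Raw dicts are validated against UserOut because of the type parameter
response = APIResponse[UserOut].model_validate(
    {"success": True, "data": {"id": "1", "name": "Alice"}}
)
print(type(response.data).__name__, response.data.id)

payload_rejected = False
try:
    APIResponse[UserOut].model_validate({"success": True, "data": {"id": "x"}})
except ValidationError:
    payload_rejected = True
```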
Serialization: model_dump and model_dump_json
from pydantic import BaseModel, Field
from datetime import datetime
class Event(BaseModel):
    event_id: int = Field(alias="eventId")
    title: str
    created_at: datetime

event = Event(eventId=1, title="Conference", created_at=datetime.now())
# Convert to dict
print(event.model_dump())
# {'event_id': 1, 'title': 'Conference', 'created_at': datetime(...)}
# Use the alias in output
print(event.model_dump(by_alias=True))
# {'eventId': 1, 'title': 'Conference', 'created_at': datetime(...)}
# Exclude fields
print(event.model_dump(exclude={"created_at"}))
# Include only specific fields
print(event.model_dump(include={"title", "event_id"}))
# Serialize to JSON string (fast Rust-based serializer)
print(event.model_dump_json())
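One detail worth knowing: model_dump() keeps Python objects such as datetime as they are, while model_dump(mode="json") converts them to JSON-compatible values but still returns a dict. A small sketch of the difference, plus a round trip through JSON:

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class Event(BaseModel):
    title: str
    created_at: datetime


event = Event(
    title="Conference",
    created_at=datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
)

python_dict = event.model_dump()           # datetime stays a datetime
json_ready = event.model_dump(mode="json")  # datetime becomes a string

# Round trip: serialize to a JSON string, then validate it back into a model
restored = Event.model_validate_json(event.model_dump_json())

print(type(python_dict["created_at"]).__name__)
print(type(json_ready["created_at"]).__name__)
print(restored == event)
```

The mode="json" form is handy when you need to pass the result to something that expects plain JSON types, such as json.dumps or a message queue client.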
Aliases: Bridging Different Naming Conventions
APIs often use camelCase while Python uses snake_case. Pydantic handles this elegantly:
from pydantic import BaseModel, Field
from pydantic.alias_generators import to_camel
class UserProfile(BaseModel):
    model_config = {"alias_generator": to_camel, "populate_by_name": True}

    first_name: str
    last_name: str
    phone_number: str

# Accept camelCase input (from a JS frontend)
profile = UserProfile(firstName="Alice", lastName="Smith", phoneNumber="+254712345678")
print(profile.first_name) # Alice
# Output camelCase (for a JS frontend)
print(profile.model_dump(by_alias=True))
# {'firstName': 'Alice', 'lastName': 'Smith', 'phoneNumber': '+254712345678'}
Model Config: Customizing Behavior
model_config controls how your model behaves:
from pydantic import BaseModel, ConfigDict
class StrictUser(BaseModel):
    model_config = ConfigDict(
        # Reject extra fields instead of ignoring them
        extra="forbid",
        # Don't coerce types: "42" won't become 42
        strict=True,
        # Validate default values too
        validate_default=True,
        # Allow population by field name even when alias is set
        populate_by_name=True,
        # Freeze the model (make it immutable/hashable)
        frozen=True,
    )

    id: int
    name: str
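The extra setting is the one callers tend to notice first, so here is a minimal sketch of how its three modes treat the same payload. The three model names are illustrative only.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class IgnoreExtra(BaseModel):
    model_config = ConfigDict(extra="ignore")
    id: int


class AllowExtra(BaseModel):
    model_config = ConfigDict(extra="allow")
    id: int


class ForbidExtra(BaseModel):
    model_config = ConfigDict(extra="forbid")
    id: int


payload = {"id": 1, "nickname": "al"}

ignored = IgnoreExtra(**payload)  # 'nickname' is dropped
allowed = AllowExtra(**payload)   # 'nickname' is kept in model_extra

forbid_error = None
try:
    ForbidExtra(**payload)        # 'nickname' causes a validation error
except ValidationError as e:
    forbid_error = e.errors()[0]["type"]

print(ignored.model_dump(), allowed.model_extra, forbid_error)
```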
Common extra options:
- "ignore" (default): silently discard extra fields
- "forbid": raise a validation error if extra fields are present
- "allow": keep extra fields in the model
Pydantic Settings: Managing Configuration
One of Pydantic's killer features for production apps: pydantic-settings for environment variable management.
pip install pydantic-settings
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    # Required: raises an error if not set
    database_url: str
    secret_key: str
    # Optional with defaults
    debug: bool = False
    allowed_hosts: list[str] = ["localhost"]
    max_connections: int = Field(default=10, ge=1, le=100)
    app_name: str = "My App"

# Loads from environment variables and .env file automatically
settings = Settings()
print(settings.database_url)
Your .env:
DATABASE_URL=postgresql://user:pass@localhost/mydb
SECRET_KEY=your-secret-key-here
DEBUG=true
MAX_CONNECTIONS=20
Use it with a singleton pattern in FastAPI:
from functools import lru_cache

from fastapi import Depends, FastAPI

app = FastAPI()

@lru_cache
def get_settings() -> Settings:
    # Cached, so the environment and .env file are read only once
    return Settings()

# In your FastAPI routes
@app.get("/info")
def app_info(settings: Settings = Depends(get_settings)):
    return {"app": settings.app_name, "debug": settings.debug}
Computed Fields
Need a field derived from other fields?
from pydantic import BaseModel, computed_field
class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height

    @computed_field
    @property
    def perimeter(self) -> float:
        return 2 * (self.width + self.height)

r = Rectangle(width=5.0, height=3.0)
print(r.area) # 15.0
print(r.model_dump()) # includes 'area' and 'perimeter'
Discriminated Unions: Polymorphic Models
When you have multiple model types that share a field used to distinguish them:
from pydantic import BaseModel
from typing import Literal, Union, Annotated
from pydantic import Field
class CreditCardPayment(BaseModel):
    payment_type: Literal["credit_card"]
    card_number: str
    expiry: str
    cvv: str

class BankTransferPayment(BaseModel):
    payment_type: Literal["bank_transfer"]
    account_number: str
    routing_number: str

class MpesaPayment(BaseModel):
    payment_type: Literal["mpesa"]
    phone_number: str
    transaction_id: str

Payment = Annotated[
    Union[CreditCardPayment, BankTransferPayment, MpesaPayment],
    Field(discriminator="payment_type")
]

class Order(BaseModel):
    order_id: int
    amount: float
    payment: Payment

# Pydantic knows exactly which model to use based on payment_type
order = Order(
    order_id=1,
    amount=99.99,
    payment={"payment_type": "mpesa", "phone_number": "+254712345678", "transaction_id": "MPX123"}
)
print(type(order.payment)) # <class 'MpesaPayment'>
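A practical benefit of the discriminator is error quality. Instead of reporting failures for every member of the union, Pydantic reports either an unknown tag or the errors of the one model the tag selected. The sketch below uses two shortened payment models (Card and Mpesa are illustrative names) and a TypeAdapter to validate the union directly.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class Card(BaseModel):  # shortened stand-in for CreditCardPayment
    payment_type: Literal["credit_card"]
    card_number: str


class Mpesa(BaseModel):  # shortened stand-in for MpesaPayment
    payment_type: Literal["mpesa"]
    phone_number: str


Payment = Annotated[Union[Card, Mpesa], Field(discriminator="payment_type")]
adapter = TypeAdapter(Payment)

payment = adapter.validate_python(
    {"payment_type": "mpesa", "phone_number": "+254712345678"}
)

tag_error = None
try:
    adapter.validate_python({"payment_type": "paypal"})  # unknown tag
except ValidationError as e:
    tag_error = e.errors()[0]["type"]

field_errors = []
try:
    adapter.validate_python({"payment_type": "mpesa"})  # known tag, missing field
except ValidationError as e:
    field_errors = e.errors()

print(type(payment).__name__, tag_error, len(field_errors))
```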
Performance: Pydantic v2 Is Fast
Pydantic v2 moved its validation core into Rust (the pydantic-core package) and is dramatically faster than v1:
| Operation | v1 | v2 |
|---|---|---|
| Model validation | baseline | ~5-50x faster |
| JSON serialization | baseline | ~10x faster |
| Schema generation | baseline | ~2-5x faster |
For high-throughput APIs handling thousands of requests per second, this matters.
You can also build a TypeAdapter once and reuse it, so the validator is not rebuilt on every call:
from pydantic import TypeAdapter
# Validate a list of items without defining a full model
adapter = TypeAdapter(list[int])
result = adapter.validate_python(["1", "2", "3"])
print(result) # [1, 2, 3]
# Or validate JSON directly (fast path)
result = adapter.validate_json("[1, 2, 3]")
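TypeAdapter also works with collections of models. A reasonable pattern, sketched here with a hypothetical Point model, is to create the adapter once at module level and reuse it wherever the same type is validated or dumped.

```python
from pydantic import BaseModel, TypeAdapter


class Point(BaseModel):  # hypothetical model, for illustration only
    x: int
    y: int


# Built once at import time, then reused for every call
POINTS = TypeAdapter(list[Point])

# Parse and validate a JSON array straight into a list of models
points = POINTS.validate_json('[{"x": 1, "y": 2}, {"x": "3", "y": 4}]')

# The same adapter serializes the list back to plain Python data
as_dicts = POINTS.dump_python(points)

print(points[1].x, as_dicts)
```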
Common Pitfalls
Mutable defaults
# WORKS, BUT EASY TO MISREAD: unlike a plain Python class attribute,
# Pydantic copies this default for each instance, so the list is not shared
class Model(BaseModel):
    tags: list = []

# BETTER: explicit about intent, and typed
class Model(BaseModel):
    tags: list[str] = Field(default_factory=list)

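It is worth checking how Pydantic actually treats a literal mutable default. In a plain Python class a list attribute is shared between instances, but Pydantic copies the default for each new instance. The quick check below uses two illustrative models to confirm that both spellings give every instance its own list.

```python
from pydantic import BaseModel, Field


class WithLiteralDefault(BaseModel):
    tags: list[str] = []


class WithFactory(BaseModel):
    tags: list[str] = Field(default_factory=list)


a, b = WithLiteralDefault(), WithLiteralDefault()
a.tags.append("x")  # mutate one instance only

c, d = WithFactory(), WithFactory()
c.tags.append("y")

print(a.tags, b.tags)
print(c.tags, d.tags)
```

default_factory is still the clearer choice: it states the intent outright, and it is the only option when the default has to be computed at creation time.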
Modifying a model directly
user = User(id=1, name="Alice", email="alice@example.com")
user.name = "Bob"  # Allowed, but not re-validated unless validate_assignment=True

# For truly immutable models:
class User(BaseModel):
    model_config = ConfigDict(frozen=True)

    id: int
    name: str
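There are two config options for guarding assignments, and they behave differently. The sketch below, with the illustrative names CheckedUser and FrozenUser, contrasts validate_assignment=True (assignments are validated) with frozen=True (assignments are refused).

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class CheckedUser(BaseModel):
    model_config = ConfigDict(validate_assignment=True)
    id: int
    name: str


class FrozenUser(BaseModel):
    model_config = ConfigDict(frozen=True)
    id: int
    name: str


checked = CheckedUser(id=1, name="Alice")
checked.id = "2"  # validated on assignment, so it is coerced to the int 2

assignment_rejected = False
try:
    checked.id = "not a number"
except ValidationError:
    assignment_rejected = True  # the previous value is kept

frozen = FrozenUser(id=1, name="Alice")
frozen_error = None
try:
    frozen.name = "Bob"
except ValidationError as e:
    frozen_error = e.errors()[0]["type"]

print(checked.id, assignment_rejected, frozen.name, frozen_error)
```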
Copying has a similar catch: model_copy() makes a shallow copy by default, so nested mutable values such as lists are shared with the original. If you need a fully independent copy, use model.model_copy(deep=True).
Summary
Pydantic is one of those libraries that, once you learn it properly, changes how you think about data in Python. It brings structure, validation, and type safety to a language that's historically been loose about all three.
The patterns covered here (Field constraints, custom validators, generics, discriminated unions, and Settings management) are production patterns used in real systems.