MEROLINE LIZLENT

Pydantic v2: The Data Validation Library That Makes Python Feel Type-Safe

Pydantic is a full-featured data validation, serialization, and settings management library. Version 2, whose validation core was rewritten in Rust, is also much faster than v1. This article covers everything from the basics to the advanced patterns that will change how you write Python.

What Is Pydantic?

Pydantic lets you define data schemas using Python type hints and then validates data against those schemas at runtime. When validation fails, you get clear, structured error messages instead of cryptic KeyError or AttributeError exceptions buried in your business logic.

pip install pydantic
pip install "pydantic[email]"  # for EmailStr and other extras

The Basics: BaseModel

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True

# Valid data
user = User(id=1, name="Alice", email="alice@example.com")
print(user.id)        # 1
print(user.is_active) # True (default)

# Pydantic converts compatible types
user2 = User(id="42", name="Bob", email="bob@example.com")
print(type(user2.id))  # <class 'int'> — "42" was coerced to 42

Type coercion is one of Pydantic's most useful behaviors. It tries to convert compatible values rather than failing immediately.
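
To make the limits of that coercion concrete, here is a minimal sketch that reuses the User model from above. The sample values are illustrative assumptions. The point is that in the default (lax) mode only unambiguous conversions succeed, while lossy ones are rejected.

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str
    email: str
    is_active: bool = True

# Lossless conversions are accepted: 7.0 becomes 7, "yes" becomes True
ok = User(id=7.0, name="Cara", email="cara@example.com", is_active="yes")

# Lossy or ambiguous values are rejected instead of being guessed at
rejected = []
for bad_id in ["abc", 4.5, None]:
    try:
        User(id=bad_id, name="Dan", email="dan@example.com")
    except ValidationError:
        rejected.append(bad_id)

print(ok.id, ok.is_active)
print(rejected)
```

If you would rather reject "42" as well, strict mode (covered under Model Config below) turns coercion off.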

Validation

from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    name: str
    price: float
    quantity: int

try:
    p = Product(name="Widget", price="not_a_number", quantity=10)
except ValidationError as e:
    print(e.json())

Output:

[
  {
    "type": "float_parsing",
    "loc": ["price"],
    "msg": "Input should be a valid number, unable to parse string as a number",
    "input": "not_a_number",
    "url": "https://errors.pydantic.dev/2.x/v/float_parsing"
  }
]

Structured, machine-readable errors with field location, error type, and the offending input.
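
Because the errors are plain data, they are easy to reshape for an API response or a form. The sketch below uses ValidationError.errors() for that. The collect_errors helper is a hypothetical name introduced only for this illustration.

```python
from pydantic import BaseModel, ValidationError

class Product(BaseModel):
    name: str
    price: float
    quantity: int

def collect_errors(payload: dict) -> dict[str, str]:
    """Map each failing field path to its error type (hypothetical helper)."""
    try:
        Product(**payload)
    except ValidationError as exc:
        return {
            ".".join(str(part) for part in err["loc"]): err["type"]
            for err in exc.errors()
        }
    return {}

# One bad value and one missing field are reported together
errors = collect_errors({"name": "Widget", "price": "not_a_number"})
print(errors)
```

Note that Pydantic reports every failing field in a single pass rather than stopping at the first problem.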

Field: Fine-Grained Validation

The Field function lets you add constraints, defaults, aliases, and metadata to individual fields.

from pydantic import BaseModel, Field
from typing import Optional

class Article(BaseModel):
    title: str = Field(..., min_length=3, max_length=200)
    slug: str = Field(..., pattern=r"^[a-z0-9-]+$")
    body: str = Field(..., min_length=10)
    views: int = Field(default=0, ge=0)
    rating: float = Field(default=5.0, ge=1.0, le=5.0)
    tags: list[str] = Field(default_factory=list, max_length=10)
    author_id: Optional[int] = Field(None, description="Foreign key to users table")

... (Ellipsis) means the field is required. Common constraints:

  • min_length / max_length — for strings and lists
  • ge / le / gt / lt — numeric bounds (≥, ≤, >, <)
  • pattern — regex validation for strings
  • default_factory — callable that produces the default (use this instead of default=[])
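
A quick way to see these constraints at work is to break several at once. This is a trimmed-down sketch of the Article model above, with input values chosen as assumptions so that each constraint is violated.

```python
from pydantic import BaseModel, Field, ValidationError

class Article(BaseModel):
    title: str = Field(..., min_length=3, max_length=200)
    slug: str = Field(..., pattern=r"^[a-z0-9-]+$")
    views: int = Field(default=0, ge=0)
    tags: list[str] = Field(default_factory=list, max_length=10)

try:
    # Title too short, slug has spaces and capitals, negative views, 11 tags
    Article(title="Hi", slug="Not A Slug", views=-1, tags=["t"] * 11)
    failed_fields = []
except ValidationError as exc:
    failed_fields = sorted(err["loc"][0] for err in exc.errors())

# Only the required fields are needed; the rest fall back to defaults
valid = Article(title="Pydantic v2", slug="pydantic-v2")

print(failed_fields)
print(valid.views, valid.tags)
```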

Custom Validators

Field Validators

from pydantic import BaseModel, field_validator

class UserRegistration(BaseModel):
    username: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        return v.lower()  # normalize to lowercase

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isupper() for c in v):
            raise ValueError("Password must contain at least one uppercase letter")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain at least one digit")
        return v
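
Here is a small usage sketch for a field validator. Signup is a hypothetical, cut-down version of UserRegistration that keeps only the username rule. It shows two things: the value a validator returns replaces the input, and a raised ValueError surfaces as a regular validation error.

```python
from pydantic import BaseModel, ValidationError, field_validator

class Signup(BaseModel):  # hypothetical, trimmed-down UserRegistration
    username: str

    @field_validator("username")
    @classmethod
    def username_must_be_alphanumeric(cls, v: str) -> str:
        if not v.isalnum():
            raise ValueError("Username must contain only letters and numbers")
        return v.lower()  # the returned value is what gets stored

normalized = Signup(username="Alice42").username

try:
    Signup(username="alice smith")
    error = None
except ValidationError as exc:
    error = exc.errors()[0]

print(normalized)
print(error["type"], error["loc"], error["msg"])
```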

Model Validators (Cross-Field Validation)

Sometimes you need to validate multiple fields together. Use @model_validator:

from datetime import date
from pydantic import BaseModel, model_validator

class PasswordReset(BaseModel):
    password: str
    confirm_password: str

    @model_validator(mode="after")
    def passwords_must_match(self) -> "PasswordReset":
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        return self

class DateRange(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> "DateRange":
        if self.end_date <= self.start_date:
            raise ValueError("end_date must be after start_date")
        return self
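
As a usage sketch for the DateRange model, the snippet below feeds it one valid and one invalid range (the dates are arbitrary assumptions). One detail worth noticing: an error raised from a model-level validator belongs to the model as a whole, so its location is empty rather than pointing at a field.

```python
from datetime import date
from pydantic import BaseModel, ValidationError, model_validator

class DateRange(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def end_must_be_after_start(self) -> "DateRange":
        if self.end_date <= self.start_date:
            raise ValueError("end_date must be after start_date")
        return self

# ISO date strings are parsed before the model validator runs
ok = DateRange(start_date="2024-01-01", end_date="2024-01-31")

try:
    DateRange(start_date="2024-02-01", end_date="2024-01-01")
    error = None
except ValidationError as exc:
    error = exc.errors()[0]

print(ok.end_date)
print(error["type"], error["loc"])
```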

Types That Do the Heavy Lifting

Pydantic ships with a rich set of built-in types that handle common validation patterns:

from pydantic import BaseModel, EmailStr, HttpUrl, AnyUrl
from pydantic import IPvAnyAddress, SecretStr, PositiveInt, NegativeFloat
from datetime import datetime
from uuid import UUID

class RichModel(BaseModel):
    # Validated email address
    email: EmailStr

    # Validated URLs
    website: HttpUrl
    profile_picture: AnyUrl

    # UUID (accepts strings and UUID objects)
    user_id: UUID

    # Hides value in repr and logs — great for passwords/tokens
    api_key: SecretStr

    # Type shortcuts
    age: PositiveInt          # int > 0
    temperature: NegativeFloat  # float < 0

    # IP address (v4 or v6)
    ip_address: IPvAnyAddress

    # Datetime parsing from strings, timestamps, etc.
    created_at: datetime

data = RichModel(
    email="user@example.com",
    website="https://example.com",
    profile_picture="s3://bucket/image.png",
    user_id="550e8400-e29b-41d4-a716-446655440000",
    api_key="super-secret-key",
    age=25,
    temperature=-3.5,
    ip_address="192.168.1.1",
    created_at="2024-01-15T10:30:00Z",
)

print(data.api_key)          # **********  (hidden!)
print(data.api_key.get_secret_value())  # super-secret-key

Nested Models and Composition

Pydantic models compose naturally:

from pydantic import BaseModel
from typing import Optional

class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: Optional[str] = None

class Company(BaseModel):
    name: str
    address: Address

class Employee(BaseModel):
    name: str
    company: Company
    work_address: Optional[Address] = None

# Nested dict input works seamlessly
data = {
    "name": "Alice",
    "company": {
        "name": "Acme Corp",
        "address": {
            "street": "123 Main St",
            "city": "Nairobi",
            "country": "Kenya"
        }
    }
}

employee = Employee(**data)
print(employee.company.address.city)  # Nairobi

Validation cascades through all nested models automatically.
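
The cascade is easiest to see when something deep in the structure is wrong. In this sketch the payload (an assumption for illustration) leaves out the city, and the error location spells out the full path to the missing field.

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: Optional[str] = None

class Company(BaseModel):
    name: str
    address: Address

class Employee(BaseModel):
    name: str
    company: Company

# "city" is missing two levels down
bad = {
    "name": "Alice",
    "company": {
        "name": "Acme Corp",
        "address": {"street": "123 Main St", "country": "Kenya"},
    },
}

try:
    Employee.model_validate(bad)
    locations = []
except ValidationError as exc:
    locations = [err["loc"] for err in exc.errors()]

print(locations)
```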

Generics: Reusable Response Wrappers

One of Pydantic's most useful patterns for APIs is a generic response wrapper:

from pydantic import BaseModel
from typing import Generic, TypeVar, Optional, List

T = TypeVar("T")

class PaginatedResponse(BaseModel, Generic[T]):
    items: List[T]
    total: int
    page: int
    per_page: int
    has_next: bool

class APIResponse(BaseModel, Generic[T]):
    success: bool
    data: Optional[T] = None
    error: Optional[str] = None

class UserOut(BaseModel):
    id: int
    name: str
    email: str

# Usage
response = APIResponse[UserOut](
    success=True,
    data=UserOut(id=1, name="Alice", email="alice@example.com")
)

paginated = PaginatedResponse[UserOut](
    items=[UserOut(id=1, name="Alice", email="alice@example.com")],
    total=1,
    page=1,
    per_page=10,
    has_next=False,
)

In FastAPI, these generics show up correctly in the OpenAPI schema too.
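
The type parameter is more than documentation: it decides how the wrapped data is validated. Here is a minimal sketch using a shortened UserOut (an assumption, to keep the example small) that contrasts a parametrized wrapper with an unparametrized one.

```python
from typing import Generic, Optional, TypeVar
from pydantic import BaseModel, ValidationError

T = TypeVar("T")

class APIResponse(BaseModel, Generic[T]):
    success: bool
    data: Optional[T] = None
    error: Optional[str] = None

class UserOut(BaseModel):
    id: int
    name: str

# Parametrized: the nested dict is validated and converted into a UserOut
parsed = APIResponse[UserOut].model_validate(
    {"success": True, "data": {"id": "1", "name": "Alice"}}
)

try:
    APIResponse[UserOut](success=True, data={"id": "not-an-int", "name": "Alice"})
    rejected = False
except ValidationError:
    rejected = True

# Unparametrized: T is unconstrained, so the data is accepted as-is
loose = APIResponse(success=True, data={"id": "not-an-int"})

print(type(parsed.data).__name__, parsed.data.id)
print(rejected, loose.data)
```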

Serialization: model_dump and model_dump_json

from pydantic import BaseModel, Field
from datetime import datetime

class Event(BaseModel):
    event_id: int = Field(alias="eventId")
    title: str
    created_at: datetime

event = Event(eventId=1, title="Conference", created_at=datetime.now())

# Convert to dict
print(event.model_dump())
# {'event_id': 1, 'title': 'Conference', 'created_at': datetime(...)}

# Use the alias in output
print(event.model_dump(by_alias=True))
# {'eventId': 1, 'title': 'Conference', 'created_at': datetime(...)}

# Exclude fields
print(event.model_dump(exclude={"created_at"}))

# Include only specific fields
print(event.model_dump(include={"title", "event_id"}))

# Serialize to JSON string (fast Rust-based serializer)
print(event.model_dump_json())
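
Serialization usually comes paired with parsing, so here is a round-trip sketch for the same Event model. The fixed timestamp is an assumption chosen to keep the output reproducible. It also shows mode="json", which returns a dict containing only JSON-friendly values.

```python
from datetime import datetime, timezone
from pydantic import BaseModel, Field

class Event(BaseModel):
    event_id: int = Field(alias="eventId")
    title: str
    created_at: datetime

event = Event(
    eventId=1,
    title="Conference",
    created_at=datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
)

as_python = event.model_dump()                            # keeps the datetime object
as_jsonable = event.model_dump(mode="json", by_alias=True)  # datetime becomes a string
payload = event.model_dump_json(by_alias=True)            # JSON text
restored = Event.model_validate_json(payload)             # and back to a model

print(type(as_python["created_at"]).__name__)
print(as_jsonable)
print(restored.created_at == event.created_at)
```

Dumping with by_alias=True matters for the round trip here: this model only accepts eventId on input, so a payload keyed by event_id would not validate back.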

Aliases: Bridging Different Naming Conventions

APIs often use camelCase while Python uses snake_case. Pydantic handles this elegantly:

from pydantic import BaseModel
from pydantic.alias_generators import to_camel

class UserProfile(BaseModel):
    model_config = {"alias_generator": to_camel, "populate_by_name": True}

    first_name: str
    last_name: str
    phone_number: str

# Accept camelCase input (from a JS frontend)
profile = UserProfile(firstName="Alice", lastName="Smith", phoneNumber="+254712345678")
print(profile.first_name)  # Alice

# Output camelCase (for a JS frontend)
print(profile.model_dump(by_alias=True))
# {'firstName': 'Alice', 'lastName': 'Smith', 'phoneNumber': '+254712345678'}

Model Config: Customizing Behavior

model_config controls how your model behaves:

from pydantic import BaseModel, ConfigDict

class StrictUser(BaseModel):
    model_config = ConfigDict(
        # Reject extra fields instead of ignoring them
        extra="forbid",

        # Don't coerce types — "42" won't become 42
        strict=True,

        # Validate default values too
        validate_default=True,

        # Allow population by field name even when alias is set
        populate_by_name=True,

        # Freeze the model (make it immutable/hashable)
        frozen=True,
    )

    id: int
    name: str

Common extra options:

  • "ignore" (default) — silently discard extra fields
  • "forbid" — raise a validation error if extra fields are present
  • "allow" — keep extra fields in the model
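
The three extra modes can be compared side by side with one payload. The model names and the nickname field below are hypothetical, made up for this sketch.

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class Ignoring(BaseModel):
    model_config = ConfigDict(extra="ignore")
    id: int

class Allowing(BaseModel):
    model_config = ConfigDict(extra="allow")
    id: int

class Forbidding(BaseModel):
    model_config = ConfigDict(extra="forbid")
    id: int

payload = {"id": 1, "nickname": "al"}  # "nickname" is not a declared field

ignored = Ignoring(**payload).model_dump()   # extra key dropped
allowed = Allowing(**payload).model_dump()   # extra key kept

try:
    Forbidding(**payload)
    forbid_error = None
except ValidationError as exc:
    forbid_error = exc.errors()[0]["type"]

print(ignored)
print(allowed)
print(forbid_error)
```

For request bodies coming from outside your system, "forbid" is often the safer choice, since it catches typos in field names instead of silently dropping them.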

Pydantic Settings: Managing Configuration

One of Pydantic's killer features for production apps: pydantic-settings for environment variable management.

pip install pydantic-settings

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
    )

    # Required — will raise error if not set
    database_url: str
    secret_key: str

    # Optional with defaults
    debug: bool = False
    allowed_hosts: list[str] = ["localhost"]
    max_connections: int = Field(default=10, ge=1, le=100)
    app_name: str = "My App"

# Loads from environment variables and .env file automatically
settings = Settings()
print(settings.database_url)

Your .env:

DATABASE_URL=postgresql://user:pass@localhost/mydb
SECRET_KEY=your-secret-key-here
DEBUG=true
MAX_CONNECTIONS=20

Use it with a singleton pattern in FastAPI:

from functools import lru_cache

from fastapi import Depends, FastAPI

app = FastAPI()

@lru_cache
def get_settings() -> Settings:
    # Settings is the class defined above; caching means it is built only once
    return Settings()

# In your FastAPI routes
@app.get("/info")
def app_info(settings: Settings = Depends(get_settings)):
    return {"app": settings.app_name, "debug": settings.debug}

Computed Fields

Need a field derived from other fields?

from pydantic import BaseModel, computed_field

class Rectangle(BaseModel):
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        return self.width * self.height

    @computed_field
    @property
    def perimeter(self) -> float:
        return 2 * (self.width + self.height)

r = Rectangle(width=5.0, height=3.0)
print(r.area)       # 15.0
print(r.model_dump())  # includes 'area' and 'perimeter'

Discriminated Unions: Polymorphic Models

When you have multiple model types that share a field used to distinguish them:

from pydantic import BaseModel
from typing import Literal, Union, Annotated
from pydantic import Field

class CreditCardPayment(BaseModel):
    payment_type: Literal["credit_card"]
    card_number: str
    expiry: str
    cvv: str

class BankTransferPayment(BaseModel):
    payment_type: Literal["bank_transfer"]
    account_number: str
    routing_number: str

class MpesaPayment(BaseModel):
    payment_type: Literal["mpesa"]
    phone_number: str
    transaction_id: str

Payment = Annotated[
    Union[CreditCardPayment, BankTransferPayment, MpesaPayment],
    Field(discriminator="payment_type")
]

class Order(BaseModel):
    order_id: int
    amount: float
    payment: Payment

# Pydantic knows exactly which model to use based on payment_type
order = Order(
    order_id=1,
    amount=99.99,
    payment={"payment_type": "mpesa", "phone_number": "+254712345678", "transaction_id": "MPX123"}
)
print(type(order.payment))  # <class '__main__.MpesaPayment'> when run as a script
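
The discriminator also makes failures more precise. In the sketch below (shortened payment models and a hypothetical first_error_type helper, both assumptions for illustration), an unknown or missing payment_type is reported as a tag problem, and a known tag is checked against only its own model.

```python
from typing import Annotated, Literal, Union
from pydantic import BaseModel, Field, ValidationError

class CardPayment(BaseModel):
    payment_type: Literal["credit_card"]
    card_number: str

class MpesaPayment(BaseModel):
    payment_type: Literal["mpesa"]
    phone_number: str

Payment = Annotated[
    Union[CardPayment, MpesaPayment],
    Field(discriminator="payment_type"),
]

class Order(BaseModel):
    order_id: int
    payment: Payment

def first_error_type(payment: dict) -> str:
    """Return the type of the first validation error, or "ok" (hypothetical helper)."""
    try:
        Order(order_id=1, payment=payment)
    except ValidationError as exc:
        return exc.errors()[0]["type"]
    return "ok"

unknown_tag = first_error_type({"payment_type": "paypal", "email": "a@example.com"})
missing_tag = first_error_type({"phone_number": "+254712345678"})
wrong_fields = first_error_type({"payment_type": "mpesa", "card_number": "4111"})

print(unknown_tag, missing_tag, wrong_fields)
```

Without a discriminator, Pydantic would have to try each member of the union and report errors for all of them, which is noisier to read.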

Performance: Pydantic v2 Is Fast

Pydantic v2's validation core (pydantic-core) is written in Rust, which makes it dramatically faster than v1:

Operation            v1         v2
Model validation     baseline   ~5-50x faster
JSON serialization   baseline   ~10x faster
Schema generation    baseline   ~2-5x faster

For high-throughput APIs handling thousands of requests per second, this matters.

You can also build a TypeAdapter once and reuse it, so the validator is not rebuilt on every call:

from pydantic import TypeAdapter

# Validate a list of items without defining a full model
adapter = TypeAdapter(list[int])
result = adapter.validate_python(["1", "2", "3"])
print(result)  # [1, 2, 3]

# Or validate JSON directly (fast path)
result = adapter.validate_json("[1, 2, 3]")
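
In practice, reuse means creating the adapter once at module level and calling it from your functions. This is a minimal sketch. INT_LIST and parse_ids are hypothetical names, not part of Pydantic.

```python
from pydantic import TypeAdapter, ValidationError

# Built once at import time, then shared by every call
INT_LIST = TypeAdapter(list[int])

def parse_ids(raw: str) -> list[int]:
    """Validate a JSON array of integers (hypothetical helper)."""
    return INT_LIST.validate_json(raw)

ids = parse_ids("[1, 2, 3]")
encoded = INT_LIST.dump_json(ids)  # serializes back to JSON, as bytes

try:
    parse_ids('[1, "x"]')
    bad_loc = None
except ValidationError as exc:
    bad_loc = exc.errors()[0]["loc"]  # index of the offending item

print(ids, encoded, bad_loc)
```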

Common Pitfalls

Mutable defaults

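# Works, but easy to misread: Pydantic copies a mutable default for every
# instance, so the list is not shared the way a plain class attribute would be
class Model(BaseModel):
    tags: list = []

# Preferred: explicit factory and a typed element
class Model(BaseModel):
    tags: list[str] = Field(default_factory=list)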

Modifying a model directly

class User(BaseModel):
    id: int
    name: str

user = User(id=1, name="Alice")
user.name = "Bob"  # Allowed, and not validated unless validate_assignment=True

# For truly immutable models:
class FrozenUser(BaseModel):
    model_config = ConfigDict(frozen=True)
    id: int
    name: str

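Copies deserve the same care. model_copy() is shallow by default, so nested mutable values such as lists are shared between the original and the copy. If you need a fully independent copy, use model.model_copy(deep=True).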
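
The difference between a shallow and a deep copy shows up as soon as a nested list is mutated. Post is a hypothetical model used only for this sketch.

```python
from pydantic import BaseModel, Field

class Post(BaseModel):  # hypothetical model for illustration
    title: str
    tags: list[str] = Field(default_factory=list)

original = Post(title="Hello", tags=["python"])

shallow = original.model_copy()        # new model, same nested list
deep = original.model_copy(deep=True)  # new model, independent nested list

# Mutating through the shallow copy also changes the original
shallow.tags.append("shared")

print(original.tags)
print(deep.tags)
```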

Summary

Pydantic is one of those libraries that, once you learn it properly, changes how you think about data in Python. It brings structure, validation, and type safety to a language that's historically been loose about all three.

The patterns covered here (Field constraints, custom validators, generics, discriminated unions, and Settings management) are production patterns used in real systems.
