DEV Community

Muhammad Ikramullah Khan
Muhammad Ikramullah Khan

Posted on

ORM: The Beginner's Guide (Making Databases Easy)

I wrote SQL queries to save my scraped data. INSERT statements. UPDATE statements. Handling duplicates. Escaping special characters. It was messy and confusing.

cursor.execute("INSERT INTO products (name, price) VALUES (%s, %s)", (name, price))
Enter fullscreen mode Exit fullscreen mode

What if the product name has a quote in it? What if I want to update instead of insert? What about relationships?

Then I discovered ORM. No more SQL. Just Python code:

product = Product(name=name, price=price)
product.save()
Enter fullscreen mode Exit fullscreen mode

Clean. Simple. Python.

Let me show you what ORM is and why it makes databases much easier.


What is ORM? (Super Simple Explanation)

ORM stands for Object-Relational Mapping.

Big fancy term. But the idea is simple:

Without ORM (what we did in last blog):

# Write SQL
cursor.execute("""
    INSERT INTO products (name, price, url)
    VALUES (%s, %s, %s)
""", (product_name, product_price, product_url))

# Remember table names
# Remember column names  
# Write SQL syntax
# Handle database connections
Enter fullscreen mode Exit fullscreen mode

With ORM (the easy way):

# Write Python
product = Product(
    name=product_name,
    price=product_price,
    url=product_url
)
product.save()

# No SQL!
# Just Python objects
# ORM handles everything
Enter fullscreen mode Exit fullscreen mode

Think of ORM like a translator:

You speak: Python
Database speaks: SQL

ORM = Translator between Python and SQL
Enter fullscreen mode Exit fullscreen mode

You write Python code. ORM converts it to SQL automatically.


The Problem ORM Solves

Problem 1: Writing SQL is Hard

Manual SQL:

# Complex query
cursor.execute("""
    SELECT products.name, categories.name, AVG(reviews.rating)
    FROM products
    JOIN categories ON products.category_id = categories.id
    JOIN reviews ON products.id = reviews.product_id
    WHERE products.price < 100
    GROUP BY products.id, categories.id
    HAVING AVG(reviews.rating) > 4
""")
Enter fullscreen mode Exit fullscreen mode

Confusing! Easy to make mistakes!

With ORM:

# Same thing in Python
products = Product.objects.filter(
    price__lt=100,
    reviews__rating__avg__gt=4
).select_related('category')
Enter fullscreen mode Exit fullscreen mode

Much clearer!

Problem 2: SQL Injection Attacks

Dangerous (SQL injection):

# User input: "'; DROP TABLE products; --"
query = f"SELECT * FROM products WHERE name = '{user_input}'"
cursor.execute(query)  # DANGEROUS!
Enter fullscreen mode Exit fullscreen mode

This could delete your entire database!

Safe with ORM:

Product.objects.filter(name=user_input)  # Automatically safe
Enter fullscreen mode Exit fullscreen mode

ORM handles escaping automatically. You can't make this mistake.

Problem 3: Database Changes

Manual SQL:

# You rename 'price' column to 'product_price'
# Now you must find and update EVERY query:
cursor.execute("SELECT price FROM products")  # BROKEN!
cursor.execute("UPDATE products SET price = 50")  # BROKEN!
# 50 places to change!
Enter fullscreen mode Exit fullscreen mode

With ORM:

# Define once
class Product:
    product_price = models.DecimalField()

# Use everywhere
product.product_price  # Works everywhere automatically
Enter fullscreen mode Exit fullscreen mode

Change in one place, works everywhere.

Problem 4: Relationships

Manual SQL (relationships are hard):

# Get product and its reviews
cursor.execute("SELECT * FROM products WHERE id = 1")
product = cursor.fetchone()

cursor.execute("SELECT * FROM reviews WHERE product_id = 1")
reviews = cursor.fetchall()

# Manual work to connect them!
Enter fullscreen mode Exit fullscreen mode

With ORM (relationships are easy):

product = Product.objects.get(id=1)
reviews = product.reviews.all()  # Automatic!
Enter fullscreen mode Exit fullscreen mode

ORM handles relationships for you.


How ORM Works

Step 1: Define Models (Python Classes)

from sqlalchemy import Column, Integer, String, Decimal
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Product(Base):
    __tablename__ = 'products'

    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    price = Column(Decimal(10, 2))
    url = Column(String(500))
Enter fullscreen mode Exit fullscreen mode

This Python class represents a database table.

Step 2: ORM Creates the Table

# ORM automatically creates this SQL:
"""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name VARCHAR(200),
    price DECIMAL(10, 2),
    url VARCHAR(500)
);
"""
Enter fullscreen mode Exit fullscreen mode

You don't write the SQL. ORM does!

Step 3: Use Python Objects

# Create product (like creating an object)
product = Product(
    name="Laptop",
    price=999.99,
    url="https://example.com/laptop"
)

# Save to database
product.save()
Enter fullscreen mode Exit fullscreen mode

Behind the scenes, ORM creates:

INSERT INTO products (name, price, url)
VALUES ('Laptop', 999.99, 'https://example.com/laptop');
Enter fullscreen mode Exit fullscreen mode

Step 4: Query with Python

# Find products
cheap_products = Product.objects.filter(price__lt=50)

# ORM converts to:
# SELECT * FROM products WHERE price < 50;
Enter fullscreen mode Exit fullscreen mode

All SQL is automatic!


Popular Python ORMs

1. SQLAlchemy (Most Popular)

What it is: Powerful, flexible ORM

Pros:

  • Very powerful
  • Works with any database
  • Large community
  • Excellent documentation

Cons:

  • Learning curve
  • Can be complex

When to use:

  • Serious projects
  • Need flexibility
  • Complex queries

This is what we'll use!

2. Django ORM

What it is: ORM built into Django framework

Pros:

  • Very easy
  • Great for beginners
  • Built-in admin interface

Cons:

  • Only with Django
  • Less flexible

When to use:

  • Using Django framework
  • Want simplicity

3. Peewee

What it is: Small, simple ORM

Pros:

  • Simple
  • Lightweight
  • Easy to learn

Cons:

  • Less features
  • Smaller community

When to use:

  • Small projects
  • Want simplicity
  • Don't need advanced features

Installing SQLAlchemy

We'll use SQLAlchemy (most popular with Scrapy).

pip install sqlalchemy
Enter fullscreen mode Exit fullscreen mode

That's it!


Your First ORM Example

Let's create a simple example to understand ORM.

Step 1: Define Model

# models.py
from sqlalchemy import create_engine, Column, Integer, String, Decimal
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

# Base class for models
Base = declarative_base()

# Define Product model
class Product(Base):
    __tablename__ = 'products'

    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    price = Column(Decimal(10, 2))
    url = Column(String(500))

    def __repr__(self):
        return f"<Product(name='{self.name}', price={self.price})>"
Enter fullscreen mode Exit fullscreen mode

Step 2: Create Database Connection

# Create database (SQLite for simplicity)
engine = create_engine('sqlite:///products.db')

# Create all tables
Base.metadata.create_all(engine)

# Create session (like a connection)
Session = sessionmaker(bind=engine)
session = Session()
Enter fullscreen mode Exit fullscreen mode

Step 3: Create Products

# Create products
product1 = Product(
    name="Laptop",
    price=999.99,
    url="https://example.com/laptop"
)

product2 = Product(
    name="Mouse",
    price=29.99,
    url="https://example.com/mouse"
)

# Add to session
session.add(product1)
session.add(product2)

# Save to database
session.commit()

print("Products saved!")
Enter fullscreen mode Exit fullscreen mode

Step 4: Query Products

# Get all products
all_products = session.query(Product).all()
for product in all_products:
    print(product.name, product.price)

# Find products under $50
cheap = session.query(Product).filter(Product.price < 50).all()
for product in cheap:
    print(f"Cheap: {product.name} - ${product.price}")

# Get one product by ID
product = session.query(Product).get(1)
print(f"Product 1: {product.name}")
Enter fullscreen mode Exit fullscreen mode

Step 5: Update Products

# Get product
product = session.query(Product).filter(Product.name == "Laptop").first()

# Update price
product.price = 899.99

# Save changes
session.commit()

print("Price updated!")
Enter fullscreen mode Exit fullscreen mode

Step 6: Delete Products

# Get product
product = session.query(Product).filter(Product.name == "Mouse").first()

# Delete
session.delete(product)
session.commit()

print("Product deleted!")
Enter fullscreen mode Exit fullscreen mode

Complete Working Example

Let's put everything together:

# complete_example.py
from sqlalchemy import create_engine, Column, Integer, String, Decimal
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

# Setup
Base = declarative_base()

class Product(Base):
    __tablename__ = 'products'

    id = Column(Integer, primary_key=True)
    name = Column(String(200), nullable=False)
    price = Column(Decimal(10, 2))
    url = Column(String(500), unique=True)

    def __repr__(self):
        return f"<Product('{self.name}', ${self.price})>"

# Create database
engine = create_engine('sqlite:///products.db', echo=True)  # echo=True shows SQL
Base.metadata.create_all(engine)

# Create session
Session = sessionmaker(bind=engine)
session = Session()

# Create products
products_data = [
    {"name": "Laptop", "price": 999.99, "url": "https://example.com/laptop"},
    {"name": "Mouse", "price": 29.99, "url": "https://example.com/mouse"},
    {"name": "Keyboard", "price": 79.99, "url": "https://example.com/keyboard"},
]

for data in products_data:
    product = Product(**data)
    session.add(product)

session.commit()
print("✓ Products created")

# Query products
print("\n--- All Products ---")
for product in session.query(Product).all():
    print(f"{product.name}: ${product.price}")

print("\n--- Cheap Products (< $50) ---")
cheap = session.query(Product).filter(Product.price < 50).all()
for product in cheap:
    print(f"{product.name}: ${product.price}")

print("\n--- Most Expensive ---")
expensive = session.query(Product).order_by(Product.price.desc()).first()
print(f"{expensive.name}: ${expensive.price}")

# Update
print("\n--- Updating Laptop Price ---")
laptop = session.query(Product).filter(Product.name == "Laptop").first()
laptop.price = 899.99
session.commit()
print(f"New price: ${laptop.price}")

# Count
total = session.query(Product).count()
print(f"\nTotal products: {total}")

# Close session
session.close()
Enter fullscreen mode Exit fullscreen mode

Run this:

python complete_example.py
Enter fullscreen mode Exit fullscreen mode

You'll see all the SQL that ORM generates automatically!


ORM Key Concepts

Concept 1: Models = Tables

class Product(Base):
    __tablename__ = 'products'
Enter fullscreen mode Exit fullscreen mode

One Python class = One database table

Concept 2: Columns = Attributes

class Product(Base):
    name = Column(String(200))
    price = Column(Decimal(10, 2))
Enter fullscreen mode Exit fullscreen mode

Class attributes = Table columns

Concept 3: Objects = Rows

product = Product(name="Laptop", price=999)
Enter fullscreen mode Exit fullscreen mode

One object = One row in table

Concept 4: Session = Connection

session = Session()
session.add(product)
session.commit()
Enter fullscreen mode Exit fullscreen mode

Session manages database operations


Common ORM Operations

Creating Records

# Create object
product = Product(name="Laptop", price=999)

# Add to session
session.add(product)

# Save to database
session.commit()
Enter fullscreen mode Exit fullscreen mode

Reading Records

# Get all
products = session.query(Product).all()

# Get one by ID
product = session.query(Product).get(1)

# Filter
cheap = session.query(Product).filter(Product.price < 50).all()

# Get first match
product = session.query(Product).filter(Product.name == "Laptop").first()
Enter fullscreen mode Exit fullscreen mode

Updating Records

# Get product
product = session.query(Product).get(1)

# Change attribute
product.price = 899.99

# Save
session.commit()
Enter fullscreen mode Exit fullscreen mode

Deleting Records

# Get product
product = session.query(Product).get(1)

# Delete
session.delete(product)

# Save
session.commit()
Enter fullscreen mode Exit fullscreen mode

Filtering (Queries)

Basic Filters

# Exact match
session.query(Product).filter(Product.name == "Laptop")

# Greater than
session.query(Product).filter(Product.price > 100)

# Less than
session.query(Product).filter(Product.price < 50)

# Like (pattern matching)
session.query(Product).filter(Product.name.like("%top%"))
Enter fullscreen mode Exit fullscreen mode

Multiple Filters

# AND (multiple filters)
session.query(Product).filter(
    Product.price > 50,
    Product.price < 200
)

# OR
from sqlalchemy import or_
session.query(Product).filter(
    or_(Product.price < 50, Product.price > 500)
)
Enter fullscreen mode Exit fullscreen mode

Ordering

# Order by price (ascending)
session.query(Product).order_by(Product.price)

# Order by price (descending)
session.query(Product).order_by(Product.price.desc())
Enter fullscreen mode Exit fullscreen mode

Limit

# Get first 10
session.query(Product).limit(10)

# Get first one
session.query(Product).first()
Enter fullscreen mode Exit fullscreen mode

Relationships (Preview)

One of ORM's best features: handling relationships.

Example: Products and Reviews

from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship

class Product(Base):
    __tablename__ = 'products'

    id = Column(Integer, primary_key=True)
    name = Column(String(200))

    # Relationship
    reviews = relationship("Review", back_populates="product")

class Review(Base):
    __tablename__ = 'reviews'

    id = Column(Integer, primary_key=True)
    product_id = Column(Integer, ForeignKey('products.id'))
    rating = Column(Integer)
    text = Column(String)

    # Relationship
    product = relationship("Product", back_populates="reviews")
Enter fullscreen mode Exit fullscreen mode

Using Relationships

# Get product
product = session.query(Product).get(1)

# Get all reviews (automatic!)
for review in product.reviews:
    print(f"Rating: {review.rating}")

# Go backwards
review = session.query(Review).get(1)
print(f"Product: {review.product.name}")
Enter fullscreen mode Exit fullscreen mode

No manual JOINs needed! ORM handles it.

We'll use this in the next blog when scraping products with reviews!


Why ORM is Perfect for Scrapy

Reason 1: Clean Pipelines

Without ORM:

def process_item(self, item, spider):
    cursor.execute("""
        INSERT INTO products (name, price, url)
        VALUES (%s, %s, %s)
        ON CONFLICT (url) DO UPDATE
        SET name = EXCLUDED.name
    """, (item['name'], item['price'], item['url']))
    connection.commit()
Enter fullscreen mode Exit fullscreen mode

Messy!

With ORM:

def process_item(self, item, spider):
    product = Product(**item)
    session.add(product)
    session.commit()
Enter fullscreen mode Exit fullscreen mode

Clean!

Reason 2: Automatic Duplicate Handling

# ORM can handle duplicates elegantly
existing = session.query(Product).filter(Product.url == item['url']).first()

if existing:
    # Update existing
    existing.price = item['price']
else:
    # Create new
    product = Product(**item)
    session.add(product)

session.commit()
Enter fullscreen mode Exit fullscreen mode

Reason 3: Easy Relationships

# Product with reviews
product = Product(name="Laptop", price=999)
session.add(product)

review = Review(product=product, rating=5, text="Great!")
session.add(review)

session.commit()
Enter fullscreen mode Exit fullscreen mode

Relationships handled automatically!


Preparing for Next Blog

In the next blog, we'll integrate ORM with Scrapy. Here's a preview:

What We'll Build

A Scrapy spider that:

  1. Scrapes products
  2. Scrapes reviews for each product
  3. Saves to database using ORM
  4. Handles relationships automatically
  5. Updates existing products
  6. No manual SQL!

The Models We'll Use

class Product(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(200))
    price = Column(Decimal(10, 2))
    url = Column(String(500), unique=True)
    reviews = relationship("Review")

class Review(Base):
    id = Column(Integer, primary_key=True)
    product_id = Column(Integer, ForeignKey('products.id'))
    rating = Column(Integer)
    text = Column(String)
    product = relationship("Product")
Enter fullscreen mode Exit fullscreen mode

The Pipeline We'll Create

class ORMPipeline:
    def process_item(self, item, spider):
        # Save product with ORM
        product = Product(**item)
        session.add(product)
        session.commit()
        return item
Enter fullscreen mode Exit fullscreen mode

Simple and clean!


Summary

What is ORM?
Object-Relational Mapping. Converts Python objects to SQL automatically.

Why use ORM?

  • No SQL needed
  • Safer (no SQL injection)
  • Cleaner code
  • Handles relationships
  • Easy to maintain

SQLAlchemy basics:

# Define model
class Product(Base):
    __tablename__ = 'products'
    id = Column(Integer, primary_key=True)
    name = Column(String(200))

# Create
product = Product(name="Laptop")
session.add(product)
session.commit()

# Read
products = session.query(Product).all()

# Update
product.name = "New Name"
session.commit()

# Delete
session.delete(product)
session.commit()
Enter fullscreen mode Exit fullscreen mode

Key concepts:

  • Models = Tables
  • Columns = Attributes
  • Objects = Rows
  • Session = Connection

Next blog:
We'll use ORM in Scrapy pipelines. No more manual SQL in our spiders!

Remember:

  • ORM makes databases easier
  • Write Python, not SQL
  • Perfect for Scrapy
  • We'll use SQLAlchemy

Practice the examples above. In the next blog, we'll integrate ORM with Scrapy and build a real scraper with database storage!

Happy scraping! 🕷️

Top comments (0)