aykhlf yassir

Posted on Feb 3

Python Internals: __init__ is Not the Constructor

#python

Mastering __new__, __repr__, and __hash__

The Constructor Myth

Pop quiz: What method creates a Python object?

If you answered __init__, you're in good company, and you're wrong!

class Point:
    def __init__(self, x, y):
        print(f"__init__ called with {self}")
        self.x = x
        self.y = y

p = Point(3, 4)
# Output: __init__ called with <__main__.Point object at 0x7f8b4c>

Notice that inside __init__, we already have self. The object already exists. So what actually created it?

The answer is __new__, a method so fundamental that Python calls it automatically, and most developers never even know it exists.

The Truth About Object Creation

Here's what actually happens when you call Point(3, 4):

class Point:
    def __new__(cls, x, y):
        print(f"__new__ called with class {cls}")
        instance = super().__new__(cls)
        print(f"__new__ created {instance}")
        return instance

    def __init__(self, x, y):
        print(f"__init__ called with {self}")
        self.x = x
        self.y = y

p = Point(3, 4)
# Output:
# __new__ called with class <class '__main__.Point'>
# __new__ created <__main__.Point object at 0x7f8b4c>
# __init__ called with <__main__.Point object at 0x7f8b4c>

The execution flow is:

__new__(cls, ...) - The Architect
- Allocates memory for a new instance
- Returns the newly created object
- Receives the class as first parameter, not an instance
__init__(self, ...) - The Interior Decorator
- Receives the instance created by __new__
- Populates it with data
- Returns None (always!)

The Mental Model:

Point(3, 4)
    ↓
__new__(Point, 3, 4) → creates empty object → instance
    ↓
__init__(instance, 3, 4) → populates instance.x, instance.y
    ↓
return instance
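
The flow above can be approximated in plain Python. This is only a simplified model: the helper name `construct` is made up for illustration, and the real logic lives in `type.__call__`, which handles more cases.

```python
def construct(cls, *args, **kwargs):
    """Simplified model of what cls(*args, **kwargs) does."""
    instance = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs if __new__ returned an instance of cls
    if isinstance(instance, cls):
        instance.__init__(*args, **kwargs)
    return instance


class Point:
    def __new__(cls, x, y):
        return super().__new__(cls)

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = construct(Point, 3, 4)
print(p.x, p.y)  # 3 4
```

Note the isinstance check: if __new__ returns something that is not an instance of the class, __init__ is skipped entirely.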

Most of the time, you don't need to touch __new__. Python's default implementation (inherited from object) handles memory allocation for you. But when you need to control whether a new object is created at all, __init__ runs too late, and __new__ is the hook to use.

Deep Dive: The Singleton Pattern

Imagine you're building a database connection pool, a configuration manager, or a logger. You want exactly one instance of the class to exist, no matter how many times someone calls the constructor.

# What we want:
db1 = Database()
db2 = Database()
print(db1 is db2)  # Should be True!

Can we do this with __init__? Let's try:

class Database:
    _instance = None

    def __init__(self):
        if Database._instance is not None:
            # Too late! Memory is already allocated
            # We can't "un-create" this object
            pass
        Database._instance = self

db1 = Database()
db2 = Database()
print(db1 is db2)  # False - we created two objects!

The problem: by the time __init__ runs, __new__ has already allocated memory for a new object. We can't prevent the creation—only configure what's already been created.

The Solution: Intercept at __new__

class Database:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            print("Creating the one true Database instance...")
            cls._instance = super().__new__(cls)
        else:
            print("Returning existing instance...")
        return cls._instance

    def __init__(self):
        print(f"__init__ called on {id(self)}")

db1 = Database()
# Output:
# Creating the one true Database instance...
# __init__ called on 140234567890

db2 = Database()
# Output:
# Returning existing instance...
# __init__ called on 140234567890

print(db1 is db2)  # True!
print(id(db1), id(db2))  # Same memory address

What's happening:

First call: _instance is None, so we call super().__new__(cls) to actually allocate memory
We cache this instance in _instance
Second call: _instance exists, so we return the cached object
__init__ still runs every time (be careful with this!)
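
That last point is easy to overlook. As a small sketch (the `Settings` class and its `values` attribute are hypothetical), here is how a repeated __init__ can silently wipe state on a singleton:

```python
class Settings:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # Runs on every Settings() call, even when the instance is reused
        self.values = {}


first = Settings()
first.values["debug"] = True

second = Settings()     # same object, but __init__ ran again
print(first is second)  # True
print(first.values)     # {}
```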

The Critical Detail: super().__new__(cls)

This line is calling object.__new__(cls), the base implementation that actually talks to Python's memory allocator. You're delegating the "real" work of memory allocation to Python's core object class.

Do NOT do this:

def __new__(cls):
    if cls._instance is None:
        cls._instance = cls()  # RECURSION ERROR!
    return cls._instance

Calling cls() inside __new__ calls __new__ again, which calls __new__ again... infinite recursion.
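
If you want to see the failure safely, a sketch like the following (with a hypothetical `Broken` class) catches the error instead of crashing:

```python
class Broken:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = cls()  # re-enters __new__ before _instance is set
        return cls._instance


try:
    Broken()
except RecursionError as exc:
    error_name = type(exc).__name__

print(error_name)  # RecursionError
```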

Singleton Best Practice

If __init__ shouldn't run multiple times, use a flag:

class Database:
    _instance = None
    _initialized = False

    def __new__(cls, *args, **kwargs):
        # Accept the same arguments as __init__; Python passes them to both
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, connection_string="localhost"):
        if not Database._initialized:
            self.connection_string = connection_string
            Database._initialized = True
            print(f"Connected to {connection_string}")

db1 = Database("prod-server")  # Connected to prod-server
db2 = Database("dev-server")   # (no output - already initialized)
print(db1.connection_string)   # prod-server
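
One detail worth checking in any variant of this pattern: Python hands the constructor arguments to both __new__ and __init__, so an overridden __new__ has to accept whatever __init__ accepts. A minimal sketch, using a hypothetical `Service` class that keeps its "already initialized" marker on the instance rather than on the class:

```python
class Service:
    _instance = None

    def __new__(cls, *args, **kwargs):
        # Accept and ignore the arguments; __init__ receives them too
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # Only initialize the first time
        if not hasattr(self, "name"):
            self.name = name


a = Service("primary")
b = Service("secondary")
print(a is b)  # True
print(a.name)  # primary
```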

The Representation Layer: __str__ vs __repr__

You've built a beautiful class. Now it looks like this in the debugger:

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

m = Money(10, "USD")
print(m)  # <__main__.Money object at 0x7f8b4c>

Useless. Let's fix it.

The Two Faces of Representation

Python has two methods for converting objects to strings:

__str__ - The User-Friendly Version

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __str__(self):
        return f"${self.amount} {self.currency}"

m = Money(10, "USD")
print(m)  # $10 USD
print(str(m))  # $10 USD

__repr__ - The Developer Version

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money({self.amount}, {self.currency})"

m = Money(10, "USD")
print(repr(m))  # Money(10, USD)
print([m])  # [Money(10, USD)] - repr is used in containers!

The Golden Rule of __repr__

Where possible, the output should be a valid Python expression that recreates the object.

This is often stated as: eval(repr(obj)) == obj

m = Money(10, "USD")
code = repr(m)  # "Money(10, USD)"
m2 = eval(code)  # Recreate the object!
print(m2.amount)  # 10

Wait... did that actually work? Let's test it:

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money({self.amount}, {self.currency})"

m = Money(10, "USD")
print(repr(m))  # Money(10, USD)
eval(repr(m))  # NameError: name 'USD' is not defined

The problem: USD without quotes isn't a string—it's treated as a variable name!

The !r Trick

Python's f-strings support a conversion flag, !r, that calls repr() on a value before inserting it:

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money({self.amount!r}, {self.currency!r})"

m = Money(10, "USD")
print(repr(m))  # Money(10, 'USD') - notice the quotes!
m2 = eval(repr(m))  # Works perfectly!

The !r conversion calls repr() on each value, which for strings adds the quotes. This keeps the output valid Python syntax.
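
Note that the eval(repr(obj)) == obj check only passes if the class also defines __eq__; without it, == falls back to identity and the recreated object compares unequal. A short sketch, with __eq__ added as an assumption beyond the code above:

```python
class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money({self.amount!r}, {self.currency!r})"

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return (self.amount, self.currency) == (other.amount, other.currency)


m = Money(10, "USD")
clone = eval(repr(m))  # only eval strings you produced yourself
print(clone == m)      # True
print(clone is m)      # False
```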

Side-by-side comparison:

amount = 10
currency = "USD"

# Without !r
print(f"Money({amount}, {currency})")  # Money(10, USD)

# With !r
print(f"Money({amount!r}, {currency!r})")  # Money(10, 'USD')

When to Use Which

__repr__: Always implement this. It's used by debuggers, logs, and the interactive interpreter. Make it unambiguous.
__str__: Optional. Only implement if you need a user-friendly format. If not defined, Python falls back to __repr__.

class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency

    def __repr__(self):
        return f"Money({self.amount!r}, {self.currency!r})"

    def __str__(self):
        symbols = {"USD": "$", "EUR": "€", "GBP": "£"}
        symbol = symbols.get(self.currency, self.currency)
        return f"{symbol}{self.amount}"

m = Money(10, "USD")
print(str(m))   # $10 (user-friendly)
print(repr(m))  # Money(10, 'USD') (code-like)
print(m)        # $10 (print uses str)
print([m])      # [Money(10, 'USD')] (containers use repr)
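
The fallback from __str__ to __repr__ is easy to confirm. In this sketch, the hypothetical `Temperature` class defines only __repr__:

```python
class Temperature:
    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        return f"Temperature({self.degrees!r})"


t = Temperature(21.5)
print(str(t))   # Temperature(21.5)
print(f"{t}")   # Temperature(21.5)
print(repr(t))  # Temperature(21.5)
```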

The Hashability Contract: Making Objects Dictionary Keys

You've probably used strings and tuples as dictionary keys:

cache = {}
cache["user:123"] = {"name": "Alice"}  # String key - works
cache[(1, 2)] = "point"  # Tuple key - works

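But try this with a class that defines its own equality:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

p = Point(1, 2)
cache = {}
cache[p] = "point"  # TypeError: unhashable type: 'Point'

Why can't we use our custom object as a key? Because it's no longer hashable. A plain class inherits identity-based __eq__ and __hash__ from object, so it does work as a key, but two Points with the same coordinates count as different keys. The moment you define __eq__ without __hash__, Python sets __hash__ to None and instances become unhashable.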

What Does Hashable Mean?

To be used as a dictionary key or stored in a set, an object must:

Have a __hash__ method that returns an integer
Have an __eq__ method to check equality
Follow the hashability contract
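
Python ties these two methods together. As a minimal sketch (both class names are hypothetical): a class that defines neither method inherits identity-based versions from object, while a class that defines __eq__ without __hash__ has its __hash__ set to None.

```python
class Plain:
    pass


class EqOnly:
    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return True


print(isinstance(hash(Plain()), int))  # True: default identity-based hash
print(EqOnly.__hash__)                 # None

try:
    hash(EqOnly())
except TypeError as exc:
    message = str(exc)

print(message)  # unhashable type: 'EqOnly'
```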

The Hashability Contract

Rule 1: Equal objects must have equal hashes

If a == b, then hash(a) MUST equal hash(b)

Rule 2: The hash must never change

Once created, an object's hash must remain constant for its entire lifetime. This is why lists aren't hashable—you can modify them!

# This is why lists fail:
lst = [1, 2, 3]
hash(lst)  # TypeError: unhashable type: 'list'

# But tuples work:
tpl = (1, 2, 3)
hash(tpl)  # an integer, e.g. 529344067295497451 (exact value varies by Python version)

Implementing Hashability

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        return hash((self.x, self.y))

p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = Point(3, 4)

print(p1 == p2)  # True
print(hash(p1) == hash(p2))  # True - contract satisfied!

cache = {p1: "origin"}
print(cache[p2])  # "origin" - found it using p2!
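
The same contract is what lets sets deduplicate by value. A short sketch, repeating the Point class so it runs on its own:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        return hash((self.x, self.y))


points = {Point(1, 2), Point(1, 2), Point(3, 4)}
print(len(points))            # 2
print(Point(3, 4) in points)  # True
```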

Why Delegate to a Tuple?

The line return hash((self.x, self.y)) is the idiomatic way to hash objects. Here's why:

Tuples are immutable - Their hash is guaranteed stable
Python's tuple hash is well-designed - It combines element hashes efficiently
It's simple - You don't have to write your own hash combining logic

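Conceptually, a tuple hash folds the element hashes together in order, something like the toy version below. CPython's real algorithm differs (since 3.8 it is derived from xxHash), but the idea is the same:

# Toy illustration only, not CPython's actual algorithm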
def hash_tuple(items):
    result = 0x345678
    for item in items:
        result = (1000003 * result) ^ hash(item)
    return result

But you don't need to know that—just pack your state into a tuple and let Python handle it.
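
One practical reason to prefer the tuple over a hand-rolled combination: a naive XOR of the field hashes is symmetric, so swapped coordinates always collide. A small sketch (both function names are hypothetical):

```python
def xor_hash(x, y):
    # Naive combination: the order of the fields is lost
    return hash(x) ^ hash(y)


def tuple_hash(x, y):
    # Delegating to the tuple keeps field order significant
    return hash((x, y))


print(xor_hash(1, 2) == xor_hash(2, 1))      # True: guaranteed collision
print(xor_hash(7, 7) == xor_hash(9, 9))      # True: equal fields cancel to 0
print(tuple_hash(1, 2) == tuple_hash(2, 1))  # False on CPython
```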

The NotImplemented Pattern

Notice this line in __eq__:

if not isinstance(other, Point):
    return NotImplemented

Don't return False here! Returning NotImplemented tells Python "I don't know how to compare with this type—ask the other object."

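class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

p = Point(1, 2)
# Python tries p.__eq__(5), then (5).__eq__(p); both return NotImplemented,
# so it falls back to an identity check
print(p == 5)  # False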

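If you returned False instead, you'd be claiming "a Point is definitely not equal to this object," and Python would stop there. Any other type that knows how to compare itself with a Point would never get the chance, because the reflected other.__eq__(p) is only tried after the first operand returns NotImplemented.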
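
To see why this matters, consider a second type that knows how to compare itself with a Point. In this sketch, `Vec` is a hypothetical class invented for the example; because Point returns NotImplemented, Python goes on to try Vec's __eq__.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented  # let the other operand try
        return self.x == other.x and self.y == other.y


class Vec:
    """Hypothetical type that can compare itself with Point."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, (Vec, Point)):
            return NotImplemented
        return self.x == other.x and self.y == other.y


print(Point(1, 2) == Vec(1, 2))  # True: Point defers, Vec answers
print(Point(1, 2) == 5)          # False: both sides defer, identity decides
```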

The Immutability Trap

Remember: hashable objects should be immutable. If you allow modification, weird things happen:

class MutablePoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        return hash((self.x, self.y))

p = MutablePoint(1, 2)
cache = {p: "original"}

print(cache[p])  # "original" - works

# Mutate the object
p.x = 99

# Now the hash changed!
print(cache[p])  # KeyError: <__main__.MutablePoint object at 0x...>

The object is now "lost" in the dictionary because its hash changed. The dictionary is looking in the wrong bucket!
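
The entry has not been deleted; it is just unreachable through normal lookup. A sketch that makes this visible, repeating the MutablePoint class so it runs on its own:

```python
class MutablePoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, MutablePoint):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        return hash((self.x, self.y))


p = MutablePoint(1, 2)
cache = {p: "original"}
p.x = 99  # the hash of p changes after it was stored

print(p in cache)        # False: lookup uses the new hash
print(len(cache))        # 1: the entry is still stored
print(p in list(cache))  # True: a linear scan still finds it
```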

Best practice: If you implement __hash__, make your object immutable, for example with __slots__, read-only properties, and a __setattr__ that refuses assignment:

class ImmutablePoint:
    __slots__ = ['_x', '_y']

    def __init__(self, x, y):
        object.__setattr__(self, '_x', x)
        object.__setattr__(self, '_y', y)

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    def __setattr__(self, name, value):
        raise AttributeError("ImmutablePoint is immutable")

    def __eq__(self, other):
        if not isinstance(other, ImmutablePoint):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def __hash__(self):
        return hash((self.x, self.y))

    def __repr__(self):
        return f"ImmutablePoint({self.x!r}, {self.y!r})"
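
For comparison, the standard library offers a shorter route to similar behavior. This is a swapped-in alternative, not the approach used in this article: a frozen dataclass generates __eq__, __hash__, and __repr__ and blocks attribute assignment. The class name `FrozenPoint` is hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


p = FrozenPoint(1, 2)
print(p)                                # FrozenPoint(x=1, y=2)
print(p == FrozenPoint(1, 2))           # True
print({p: "found"}[FrozenPoint(1, 2)])  # found

try:
    p.x = 99
    blocked = False
except FrozenInstanceError:
    blocked = True

print(blocked)  # True
```

The generated __repr__ uses keyword form, so it differs from the positional style shown above, but it still evaluates back to an equal object.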

Summary: The Professional Object Checklist

Today we've learned the lifecycle methods that make Python objects behave like first-class types:

Creation & Representation

__new__(cls, ...) creates the object; __init__(self, ...) configures it
Use __new__ for Singletons and other creation-control patterns
__repr__ is for developers (make it code-like with !r)
__str__ is for users (optional, human-friendly)

The Hashability Contract

__eq__ defines equality (return NotImplemented for unknown types)
__hash__ enables dictionary/set usage (delegate to tuple)
Rule: If a == b, then hash(a) == hash(b)
Immutability: The hash must never change

The Professional Class Template

class Money:
    __slots__ = ['_amount', '_currency']

    def __init__(self, amount, currency):
        object.__setattr__(self, '_amount', amount)
        object.__setattr__(self, '_currency', currency)

    @property
    def amount(self):
        return self._amount

    @property
    def currency(self):
        return self._currency

    def __setattr__(self, name, value):
        raise AttributeError("Money is immutable")

    def __repr__(self):
        return f"Money({self.amount!r}, {self.currency!r})"

    def __str__(self):
        return f"${self.amount} {self.currency}"

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.amount == other.amount and self.currency == other.currency

    def __hash__(self):
        return hash((self.amount, self.currency))

This class is memory-efficient (__slots__), immutable (read-only properties), debuggable (__repr__), user-friendly (__str__), and can be used in sets and dicts (__eq__ + __hash__).
