Why your Python objects are just dictionaries and when that becomes a problem
Identity vs. Equality:
First, let's clarify what the is operator actually does. Most Python developers learn that == checks equality and is checks identity, but what does "identity" really mean?
x = [1, 2, 3]
y = [1, 2, 3]
print(x == y) # True - same contents
print(x is y) # False - different objects in memory
print(id(x), id(y)) # Different memory addresses
The is operator checks whether two variables point to the exact same object in memory, i.e., the same address.
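Aliasing makes this concrete: assignment never copies an object, so binding a second name to the same list gives you two names for one identity.

```python
x = [1, 2, 3]
z = x             # assignment copies the reference, not the list
print(z is x)     # True - both names point to one object

z.append(4)       # mutate through one name...
print(x)          # [1, 2, 3, 4] - ...and the change shows through the other
```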
The Integer Interning Optimization
Python uses an optimization called interning: it pre-allocates small integer objects (-5 through 256) at startup and reuses them throughout your program's lifetime.
a = 256
b = 256
print(id(a) == id(b)) # True - same memory address
Every time you use the number 42 in your code, you're pointing to the same integer object. This saves memory and improves performance for commonly-used values.
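You can observe the cache boundary directly. Note this is a CPython implementation detail; the sketch below builds the integers at runtime with int() to sidestep the compiler's constant folding, which can otherwise make two identical literals in one file share an object anyway.

```python
# Small integers (-5 through 256) come from CPython's pre-allocated cache,
# so even values constructed at runtime resolve to the same object.
small_a = int("100")
small_b = int("100")
print(small_a is small_b)  # True - both names point to the cached int

# Values outside the cache are allocated fresh each time.
big_a = int("257")
big_b = int("257")
print(big_a is big_b)      # False - two distinct objects
```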
The takeaway: Python manages memory in ways you don't see until you look. And this is just the beginning.
The Core:
Here's the real secret: a standard Python object is essentially a wrapper around a hash map. Let's prove it.
class Player:
    server = "US-East"  # Class variable
    def __init__(self, name):
        self.name = name

p = Player("Alice")
print(p.__dict__)  # {'name': 'Alice'}
Every instance has a __dict__ attribute: a dictionary that holds all its instance variables. When you write p.name, Python is essentially doing p.__dict__['name'].
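You can manipulate that dictionary directly and the object's attributes follow along, a quick sketch:

```python
class Player:
    def __init__(self, name):
        self.name = name

p = Player("Alice")

# Reading through the dict matches normal attribute access
print(p.__dict__["name"])      # Alice

# Writing into the dict creates a real attribute
p.__dict__["level"] = 5
print(p.level)                 # 5

# setattr/getattr route through the same storage
setattr(p, "score", 100)
print(p.__dict__)              # {'name': 'Alice', 'level': 5, 'score': 100}
```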
The Shadowing Experiment
This dictionary model creates interesting behavior with class variables:
class Player:
    server = "US-East"
    def __init__(self, name):
        self.name = name

p1 = Player("Alice")
p2 = Player("Bob")
print(p1.server)     # "US-East" - found on the class
print(p1.__dict__)   # {'name': 'Alice'} - no 'server' key!

# Now shadow the class variable
p1.server = "EU-West"
print(p1.__dict__)   # {'name': 'Alice', 'server': 'EU-West'}
print(p2.server)     # Still "US-East"
print(Player.server) # Still "US-East"
What happened? When you read p1.server, Python checks:
- Instance dictionary (p1.__dict__) → not found
- Class dictionary (Player.__dict__) → found!
But when you write p1.server = "EU-West", Python creates a new entry in the instance dictionary, shadowing the class variable for that specific instance.
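Shadowing is reversible: deleting the attribute removes only the key from the instance dictionary, so lookup falls through to the class again.

```python
class Player:
    server = "US-East"

p1 = Player()
p1.server = "EU-West"   # adds a 'server' key to p1.__dict__
print(p1.server)        # EU-West - instance entry shadows the class

del p1.server           # deletes only the instance dictionary entry
print(p1.server)        # US-East - lookup falls through to the class again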
The Lookup Chain
Python's attribute lookup follows a specific chain:
Instance → Class → Parent Classes (MRO)
This is why you can have class variables that all instances share by default, but can be overridden per-instance. The dictionary model makes this incredibly flexible.
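A minimal inheritance sketch shows the full chain in action, including the __mro__ tuple that Python consults:

```python
class Base:
    greeting = "hello from Base"

class Child(Base):
    pass

c = Child()
print(c.greeting)       # found on Base, two hops up the chain
print(Child.__mro__)    # (<class 'Child'>, <class 'Base'>, <class 'object'>)

c.greeting = "hi"       # per-instance override
print(c.greeting)       # hi - instance dict wins
print(Base.greeting)    # hello from Base - class value untouched
```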
The Cost of Flexibility
This flexibility comes at a steep price: memory.
A Python dictionary is a sparse, resizable hash table. It needs to handle collisions, maintain load factors, and allow dynamic resizing. For a simple Point(x, y) object, this is massive overkill.
Let's measure it:
import sys

class DictPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlotPoint:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

p1 = DictPoint(1, 2)
p2 = SlotPoint(1, 2)
print(f"DictPoint size: {sys.getsizeof(p1) + sys.getsizeof(p1.__dict__)} bytes")
print(f"SlotPoint size: {sys.getsizeof(p2)} bytes")

# Try to access __dict__ on slotted class
try:
    print(p2.__dict__)
except AttributeError as e:
    print(f"SlotPoint error: {e}")
On my machine, DictPoint consumes roughly 344 bytes while SlotPoint uses only 48 bytes. That's a 7x difference for storing two integers!
Now imagine creating a million of these objects. The dictionary-based approach would consume an additional ~100MB of RAM just for the dictionary overhead.
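You can check the at-scale claim yourself with tracemalloc. This is a rough sketch; the exact numbers vary by Python version and platform, but the gap between the two classes is consistent.

```python
import tracemalloc

class DictPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlotPoint:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

def peak_bytes(cls, n=100_000):
    """Peak memory allocated while holding n instances of cls."""
    tracemalloc.start()
    objs = [cls(i, -i) for i in range(n)]  # keep instances alive during measurement
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

print(f"DictPoint: {peak_bytes(DictPoint) / 1e6:.1f} MB")
print(f"SlotPoint: {peak_bytes(SlotPoint) / 1e6:.1f} MB")
```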
What Actually Changed?
At the C level, here's what's happening:
Standard Python Object (Dictionary-Based):
PyObject
↓
ob_refcnt (reference count)
ob_type (pointer to type)
__dict__ → PyDictObject (hash table)
↓
Hash map with keys 'x', 'y'
Slotted Python Object:
PyObject
↓
ob_refcnt
ob_type
x (stored at fixed offset +0)
y (stored at fixed offset +8)
With __slots__, you're essentially telling Python: "I know exactly what attributes this object will have. Don't give me a dictionary; just allocate fixed memory offsets, like a C struct."
The interpreter can now access obj.x directly by offsetting from the object's base address. No hash lookup, no dictionary traversal, just pointer arithmetic.
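In CPython you can see this machinery: each slot name becomes a member descriptor stored on the class, a small object that knows the fixed offset where its value lives.

```python
class SlotPoint:
    __slots__ = ("x", "y")

# Each slot name is a descriptor on the class itself
print(type(SlotPoint.x))        # <class 'member_descriptor'>

p = SlotPoint()
SlotPoint.x.__set__(p, 10)      # same machinery as p.x = 10
print(SlotPoint.x.__get__(p))   # 10
print(p.x)                      # 10
```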
The Trade-Off
But this optimization comes with restrictions:
class SlotPoint:
    __slots__ = ['x', 'y']
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = SlotPoint(1, 2)
p.z = 3  # AttributeError: 'SlotPoint' object has no attribute 'z'
You cannot dynamically add attributes to slotted objects. The attributes are fixed at class definition time. This is the price of performance: you trade flexibility for efficiency.
When to Use __slots__
Use __slots__ when:
- You're creating many instances (thousands+) of the same class
- The attributes are known and fixed
- Memory is a concern (data processing, game engines, embedded systems)
Don't use __slots__ when:
- You need dynamic attributes
- You're creating only a few instances
- The class needs to be subclassed with additional attributes (requires careful design)
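The subclassing caveat deserves a sketch: a subclass that omits __slots__ silently regains a __dict__, undoing the memory savings for its instances.

```python
class SlotPoint:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

class LoosePoint(SlotPoint):
    pass  # no __slots__ here: instances get a __dict__ again

lp = LoosePoint(1, 2)
lp.label = "origin"        # works - the savings are gone for this subclass
print(lp.__dict__)         # {'label': 'origin'} - x and y still live in slots

class TightPoint(SlotPoint):
    __slots__ = ("label",)  # declare slots at every level to keep the win
```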
Summary:
We uncovered three fundamental truths about Python's object model:
- is checks memory address; == checks value → Python interns small integers for performance
- Objects are usually dictionaries → The __dict__ model provides incredible flexibility at a memory cost
- __slots__ turns objects into C structs → Fixed attributes, direct memory access, 60-70% memory savings
The dictionary-based model is what makes Python so dynamic and easy to use. You can add attributes on the fly, monkey-patch classes, and inspect objects at runtime. But when you're building high-performance systems or working with large datasets, knowing how to bypass this flexibility with __slots__ can be the difference between a program that runs and one that crashes.