TL;DR
Subclass collections.UserDict, not dict.
Longer Description
I need a dictionary-like object that I can pass to another function. When the function returns, I need to know whether the dictionary has changed (a key or a value); if it has, I save it to the database. (If this looks like saving sessions on the server side, that is exactly what it is.) Since this is a frequent operation, I want it to be as fast as possible.
The first idea that came to my mind was this:
session_dict = {}
old_dict = {**session_dict}  # shallow copy taken before the call
# now call the function
func(session_dict)
if session_dict != old_dict:
    # a change happened, save to the database
    ...
But this method requires a copy of the dictionary, which costs extra space, and then compares two dictionary objects. The comparison has to walk the keys and values, recursing into nested containers, so its CPU cost grows with the size of the dictionary.
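One more limitation of the copy-and-compare idea is worth a quick sketch: `{**session_dict}` is a shallow copy, so nested values are shared between the two dictionaries. The `cart` key below is a made-up example, and `copy.deepcopy` is shown only as one possible (more expensive) alternative:

```python
import copy

# Hypothetical session content with a nested, mutable value.
session_dict = {"cart": ["apple"]}

shallow = {**session_dict}          # shares the inner list with session_dict
deep = copy.deepcopy(session_dict)  # duplicates the inner list as well

# Mutate the nested list in place, as a called function might.
session_dict["cart"].append("pear")

print(session_dict != shallow)  # False: the change goes unnoticed
print(session_dict != deep)     # True: the deep copy kept the old list
```

So the shallow version only detects changes made at the top level of the dictionary.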
The second idea is to use pickle and compare the pickled bytes:
import pickle
session_dict = {}
finger_print = pickle.dumps(session_dict)  # bytes snapshot before the call
func(session_dict)
if pickle.dumps(session_dict) != finger_print:
    # a change happened, save to the database
    ...
This one uses pickle.dumps to serialize the dictionary to bytes, then serializes it again after the function call and compares the two results. Depending on how large the dictionary is, this may cost even more CPU time.
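A small sketch of how the pickle fingerprint behaves in practice. `pickle.dumps` returns bytes, and the bytes follow the dictionary's insertion order, so removing and re-adding a key changes the fingerprint even though the content is equal. The function `touch` is a hypothetical stand-in for `func`:

```python
import pickle


def touch(session):
    # Hypothetical function: removes a key and puts the same value back.
    value = session.pop("a")
    session["a"] = value


session_dict = {"a": 1, "b": 2}
old_dict = {**session_dict}
fingerprint = pickle.dumps(session_dict)

touch(session_dict)

# The dictionaries hold the same items, only the key order differs.
same_content = session_dict == old_dict
same_bytes = pickle.dumps(session_dict) == fingerprint

print(same_content)  # True
print(same_bytes)    # False
```

For a "save if anything might have changed" check, this kind of false positive only costs an extra save.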
The third idea is to subclass dict and set a change flag whenever __setitem__ or __delitem__ is called (you may also want to see my previous post here on subclassing dict in Python to support dot syntax accessing):
class ChangeDetectableDict(dict):
    def __init__(self, val=None):
        if val is None:
            val = {}
        super().__init__(val)
        self.changed = False

    def __setitem__(self, item, value):
        super().__setitem__(item, value)
        self.changed = True

    def __delitem__(self, item):
        super().__delitem__(item)
        self.changed = True
This works fine when the code sets a key:
session_dict = ChangeDetectableDict()
session_dict["name"] = "bo"
print(session_dict.changed) # True
Or delete a key:
session_dict = ChangeDetectableDict({"name": "bo"})
del session_dict["name"]
print(session_dict.changed) # True
This makes change detection O(1), which is great. But you may notice a problem: if I set a key to the value it already has, changed is still marked as True:
session_dict = ChangeDetectableDict({"name": "bo"})
session_dict["name"] = "bo"
print(session_dict.changed) # True
In my case that is fine: as long as changed is True whenever something really changed, I don't mind a few false positives. The real problem is that if I delete a key with pop instead of del, the __delitem__ method is not called, so changed stays False:
session_dict = ChangeDetectableDict({"name": "bo"})
session_dict.pop("name", None)
print(session_dict.changed) # False
This is unacceptable for my use case. It happens because the built-in dict (in CPython) implements methods such as pop in C, working on its internal storage directly, so they never call an overridden __delitem__.
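The same bypass applies to other built-in dict methods, not only pop. The sketch below repeats the dict subclass and runs a few mutating methods through a hypothetical helper, `changed_after`; the results described are for CPython:

```python
class ChangeDetectableDict(dict):
    def __init__(self, val=None):
        super().__init__({} if val is None else val)
        self.changed = False

    def __setitem__(self, item, value):
        super().__setitem__(item, value)
        self.changed = True

    def __delitem__(self, item):
        super().__delitem__(item)
        self.changed = True


def changed_after(operation):
    # Hypothetical helper: apply one mutation to a fresh dict, report the flag.
    d = ChangeDetectableDict({"name": "bo"})
    operation(d)
    return d.changed


results = {
    "pop": changed_after(lambda d: d.pop("name")),
    "popitem": changed_after(lambda d: d.popitem()),
    "clear": changed_after(lambda d: d.clear()),
    "update": changed_after(lambda d: d.update(age=3)),
    "setdefault": changed_after(lambda d: d.setdefault("age", 3)),
}
print(results)  # on CPython every entry is False
```

Every one of these calls modifies the dictionary, yet none of them goes through the overridden methods.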
So the fourth method I found is to subclass collections.UserDict instead of dict:
from collections import UserDict

class ChangeDetectableDict(UserDict):
    def __init__(self, val=None):
        if val is None:
            val = {}
        super().__init__(val)
        self.changed = False

    def __setitem__(self, item, value):
        super().__setitem__(item, value)
        self.changed = True

    def __delitem__(self, item):
        super().__delitem__(item)
        self.changed = True
Now, whether we use del or pop, the __delitem__ method gets called whenever a key is actually removed, so changed is set to True.
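This works because UserDict gets pop, popitem, clear, update and setdefault from collections.abc.MutableMapping, which implements them in terms of __getitem__, __setitem__ and __delitem__. A quick check, reusing the same hypothetical `changed_after` helper idea:

```python
from collections import UserDict


class ChangeDetectableDict(UserDict):
    def __init__(self, val=None):
        # UserDict.__init__ fills the dict via update(), which calls
        # __setitem__, so the flag is reset only after it returns.
        super().__init__({} if val is None else val)
        self.changed = False

    def __setitem__(self, item, value):
        super().__setitem__(item, value)
        self.changed = True

    def __delitem__(self, item):
        super().__delitem__(item)
        self.changed = True


def changed_after(operation):
    # Hypothetical helper: apply one mutation to a fresh dict, report the flag.
    d = ChangeDetectableDict({"name": "bo"})
    operation(d)
    return d.changed


results = {
    "pop": changed_after(lambda d: d.pop("name")),
    "popitem": changed_after(lambda d: d.popitem()),
    "clear": changed_after(lambda d: d.clear()),
    "update": changed_after(lambda d: d.update(age=3)),
    "setdefault": changed_after(lambda d: d.setdefault("age", 3)),
}
print(results)  # every entry is True

# Popping a missing key with a default removes nothing, so no flag is set.
missing_key_flag = changed_after(lambda d: d.pop("missing", None))
print(missing_key_flag)  # False
```

Two things still slip past the flag: writing to the underlying `data` attribute directly, and mutating a nested value in place (for example appending to a list stored in the session).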
Back to the beginning, now I can write code like this:
session_dict = ChangeDetectableDict()
func(session_dict)
if session_dict.changed:
    # a change happened, save to the database
    ...
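To tie the pieces together, here is a minimal end-to-end sketch. The names `read_only`, `login`, `save_session` and `handle` are hypothetical, and a plain list stands in for the database:

```python
from collections import UserDict


class ChangeDetectableDict(UserDict):
    def __init__(self, val=None):
        super().__init__({} if val is None else val)
        self.changed = False

    def __setitem__(self, item, value):
        super().__setitem__(item, value)
        self.changed = True

    def __delitem__(self, item):
        super().__delitem__(item)
        self.changed = True


saved = []  # stand-in for the database (assumption for this sketch)


def save_session(session):
    # Hypothetical save: store a plain-dict snapshot of the session.
    saved.append(dict(session))


def read_only(session):
    # Hypothetical handler that only reads from the session.
    return session.get("name")


def login(session):
    # Hypothetical handler that writes to the session.
    session["name"] = "bo"


def handle(func, session):
    func(session)
    if session.changed:
        save_session(session)
        session.changed = False  # reset so the next call starts clean


session_dict = ChangeDetectableDict()
handle(read_only, session_dict)
print(saved)  # []
handle(login, session_dict)
print(saved)  # [{'name': 'bo'}]
```

Resetting the flag after saving is a design choice of this sketch; it lets the same session object be reused across several calls.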