DEV Community

Vivek Yadav

Python Learning : Serialization and De-serialization

When building real-world applications, especially those that exchange data, persist it, or communicate with other systems, you often need to convert Python objects into a storable or transferable format and reconstruct them later. This is where serialization and de-serialization come in.


1. What is Serialization?

Serialization is the process of converting a Python object (e.g., dictionary, list, class instance) into a format that can be easily stored on disk, transmitted across a network, or shared between different environments.

  • Python Object → Serialized Format (string/bytes)

2. What is De-serialization?

De-serialization is the reverse process—converting the serialized format back into a Python object.

  • Serialized Format → Python Object
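Both directions can be seen in a few lines using only the standard library. This is a minimal sketch; the `record` dictionary is made-up sample data.

```python
import json
import pickle

record = {"name": "Alice", "age": 30, "skills": ["Python", "ML"]}

# Serialization: Python object -> text (JSON) or bytes (pickle)
as_text = json.dumps(record)
as_bytes = pickle.dumps(record)

print(type(as_text).__name__)   # str
print(type(as_bytes).__name__)  # bytes

# De-serialization: serialized form -> an equal but distinct Python object
from_text = json.loads(as_text)
from_bytes = pickle.loads(as_bytes)

print(from_text == record, from_text is record)    # True False
print(from_bytes == record, from_bytes is record)  # True False
```

The round trip produces a new object with the same contents, not the original object itself.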

3. Serialization Formats in Python

Format | Module / Library | Usage
Pickle | pickle | Native Python serialization (supports almost any Python object, but not human-readable).
JSON | json | Language-independent, human-readable, widely used for APIs and configuration files.
YAML | PyYAML (third-party) | Human-friendly, used in configs (e.g., Kubernetes, Ansible).
MessagePack / Protobuf / Avro | Third-party libraries | More compact and efficient for distributed systems.
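One practical consequence of JSON being language-independent is that it only knows JSON types, while pickle preserves Python types. The sketch below, using made-up sample data, sends the same dictionary through both formats:

```python
import json
import pickle

# A tuple value and an integer key: both are fine in Python, neither exists in JSON
original = {"point": (1, 2), 1: "integer key"}

via_json = json.loads(json.dumps(original))
via_pickle = pickle.loads(pickle.dumps(original))

print(via_json)    # {'point': [1, 2], '1': 'integer key'}
print(via_pickle)  # {'point': (1, 2), 1: 'integer key'}
```

With JSON, the tuple comes back as a list and the integer key as a string, so a JSON round trip is not always lossless for Python data.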

4. Serialization with pickle

import pickle

# Sample Python object
data = {
    "name": "Alice",
    "age": 30,
    "skills": ["Python", "ML", "Data Science"]
}

# Serialization
with open("data.pkl", "wb") as file:
    pickle.dump(data, file)

# De-serialization
with open("data.pkl", "rb") as file:
    loaded_data = pickle.load(file)

print(loaded_data)

Output:

{'name': 'Alice', 'age': 30, 'skills': ['Python', 'ML', 'Data Science']}


5. Serialization with json

import json

data = {
    "name": "Bob",
    "age": 25,
    "skills": ["JavaScript", "React", "Node.js"]
}

# Serialization
json_str = json.dumps(data)   # object → string
print(json_str)

# Save to file
with open("data.json", "w") as file:
    json.dump(data, file)

# De-serialization
with open("data.json", "r") as file:
    loaded_data = json.load(file)

print(loaded_data)

Output:

{"name": "Bob", "age": 25, "skills": ["JavaScript", "React", "Node.js"]}
{'name': 'Bob', 'age': 25, 'skills': ['JavaScript', 'React', 'Node.js']}
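The layout of the JSON text can also be tuned. As a small sketch with sample data, `indent` and `sort_keys` produce a readable, stable layout (handy for config files), while `separators` drops the optional whitespace for a more compact payload:

```python
import json

data = {"name": "Bob", "age": 25, "skills": ["JavaScript", "React"]}

# Readable: one item per line, keys in alphabetical order
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# Compact: no spaces after "," and ":"
compact = json.dumps(data, separators=(",", ":"))
print(compact)  # {"name":"Bob","age":25,"skills":["JavaScript","React"]}
```

Both strings de-serialize to the same dictionary; only the text representation differs.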

6. Serializing Custom Objects

Using pickle

import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p = Person("Eve", 28)

# Serialize
with open("person.pkl", "wb") as f:
    pickle.dump(p, f)

# Deserialize
with open("person.pkl", "rb") as f:
    p2 = pickle.load(f)

print(p2.name, p2.age)

Output:

Eve 28
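Not every attribute can be pickled. When an object holds something like a thread lock, pickle lets the class decide what gets saved through `__getstate__` and `__setstate__`. The `Counter` class below is a hypothetical example, not part of the code above:

```python
import pickle
import threading

class Counter:
    def __init__(self, start=0):
        self.value = start
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def __getstate__(self):
        # Save everything except the lock
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh lock
        self.__dict__.update(state)
        self._lock = threading.Lock()

c = Counter(5)
restored = pickle.loads(pickle.dumps(c))
print(restored.value)  # 5
```

Note that pickle stores a reference to the class rather than its code, so the class definition must be importable wherever the data is loaded.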


Using json with custom encoder/decoder

import json

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def encode_person(obj):
    if isinstance(obj, Person):
        return {"name": obj.name, "age": obj.age, "__person__": True}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_person(dct):
    if "__person__" in dct:
        return Person(dct["name"], dct["age"])
    return dct

# Serialize
p = Person("Alice", 30)
json_str = json.dumps(p, default=encode_person)
print(json_str)

# Deserialize
p2 = json.loads(json_str, object_hook=decode_person)
print(p2.name, p2.age)

Output:

{"name": "Alice", "age": 30, "__person__": true}
Alice 30
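For simple data-holding classes, an alternative that avoids hand-written encoder and decoder functions is a dataclass. This is a sketch of a different approach from the one above: it redefines `Person` as a dataclass and relies on `dataclasses.asdict`, so the JSON carries no `__person__` marker and the caller must know which class to rebuild.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Person:
    name: str
    age: int

p = Person("Alice", 30)

# Serialize: dataclass -> dict -> JSON string
json_str = json.dumps(asdict(p))
print(json_str)  # {"name": "Alice", "age": 30}

# Deserialize: JSON string -> dict -> dataclass
p2 = Person(**json.loads(json_str))
print(p2 == p)  # True
```

The trade-off is that this only works when the receiving code already knows the target type, whereas the `object_hook` approach can detect it from the data.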

7. Advanced Serialization Formats

Using MessagePack

import msgpack  # third-party package: pip install msgpack

data = {"id": 1, "msg": "Hello"}

# Serialize
packed = msgpack.packb(data)
print(packed)

# Deserialize
unpacked = msgpack.unpackb(packed)
print(unpacked)

Output (packed is in bytes):

b'\x82\xa2id\x01\xa3msg\xa5Hello'
{'id': 1, 'msg': 'Hello'}

8. Best Practices

1. Use json when:

  • Interacting with external systems.
  • You need a human-readable format.
  • Working with web APIs.

2. Use pickle only for:

  • Internal Python applications.
  • Short-term persistence.

Never load pickle data from an untrusted source: unpickling can execute arbitrary code.
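If pickled data has to be read from a source you do not fully control, the standard library allows an `Unpickler` subclass to restrict which globals may be loaded by overriding `find_class`. The sketch below follows that pattern; the allow-list contents and the helper name `restricted_loads` are illustrative choices. It reduces risk but is not a guarantee of safety, so a format like JSON remains the better option for untrusted input.

```python
import builtins
import io
import pickle

# Only these builtins may be referenced by the pickle stream (illustrative list)
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global such as a class or function
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Like pickle.loads, but rejects globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

print(restricted_loads(pickle.dumps([1, 2, range(15)])))  # [1, 2, range(0, 15)]

try:
    restricted_loads(pickle.dumps(print))  # a builtin function that is not allow-listed
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```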

3. Use binary formats (MessagePack/Protobuf) when:

  • Performance and efficiency matter.
  • Working with distributed systems.
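To see why binary encodings tend to be smaller, here is a rough sketch using only the standard library. It uses the `struct` module as a stand-in for a binary format; it is not MessagePack or Protobuf, and the sample values are made up. Each float takes exactly 8 bytes in binary, while its JSON text form usually takes more.

```python
import json
import math
import struct

# 1000 floating-point readings (made-up sample data)
readings = [math.sqrt(i) for i in range(1, 1001)]

# Text encoding: each number is written out as decimal digits
as_json = json.dumps(readings).encode("utf-8")

# Binary encoding: 1000 little-endian 8-byte doubles
as_binary = struct.pack(f"<{len(readings)}d", *readings)

print(len(as_binary))                 # 8000
print(len(as_binary) < len(as_json))  # True

# The binary form round-trips to the same values
restored = list(struct.unpack(f"<{len(readings)}d", as_binary))
print(restored == readings)           # True
```

Real binary formats add type and length information on top of this, but the same principle explains most of their size advantage for numeric data.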

9. Real-World Applications

  • Machine Learning: Saving/loading trained models (joblib/pickle).
  • Web Development: JSON payloads via REST APIs.
  • Configuration Management: YAML/JSON for configs.
  • Caching: Storing serialized objects in Redis.
  • Data Pipelines: Kafka/Avro for event streaming.

10. Conclusion

Serialization and de-serialization are fundamental concepts enabling persistence, sharing, and transfer of data.

  • Pickle: Powerful but Python-specific.
  • JSON: Universal and human-readable.
  • MessagePack/Protobuf/Avro: Efficient for distributed and large-scale apps.

Choosing a format that fits who will read the data and how far you trust its source keeps your application efficient, secure, and interoperable.
