What is insecure deserialization?


Getting to know a critical vulnerability that affects Java, Python, and other common programming languages.


As a penetration tester, there are few vulnerabilities that fascinate me more than insecure deserialization.

Insecure deserialization bugs are critical: they often result in remote code execution, which gives attackers a wide range of capabilities on the application.

Defending against deserialization vulnerabilities is also extremely difficult. How an application can defend against these vulnerabilities varies and depends greatly on the programming language, libraries, and serialization formats used. Because of this, there is no one-size-fits-all solution. Today, let’s dive into the ins and outs of insecure deserialization vulnerabilities.

What is object serialization?

In order to understand why “deserialization” can be insecure, you’ll need to first understand how applications serialize and deserialize objects.

Serialization is the process of converting an object in a programming language (say, a Java or Python object) into a format that can be saved to a database or transferred over a network. Deserialization is the opposite: the serialized object is read from a file or the network and converted back into an object.

Basically, when you need to store an object or transfer it over the network, you serialize it to pack it up.

object ---serialization---> transportable format of the object

When you need to use that data again, you deserialize and get the data that you want.

transportable format of the object ---deserialization---> object

Many programming languages support the serialization and deserialization of objects, including Java, PHP, Python, and Ruby.
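The round trip described above can be sketched in Python with the standard pickle module, which this article covers in more detail later. The record and its values are made up for illustration.

```python
import pickle
from datetime import date

# Hypothetical record; the field names and values are illustrative only.
record = {"name": "Vickie", "joined": date(2020, 1, 1)}

blob = pickle.dumps(record)    # serialization: object -> bytes
restored = pickle.loads(blob)  # deserialization: bytes -> object

print(type(blob).__name__)     # the transportable format is a bytes value
print(restored == record)      # same contents as the original
print(restored is record)      # but a newly constructed object
```

Only call pickle.loads on data you created yourself, for reasons the rest of this article explains.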

How deserialization becomes “insecure”

Insecure deserialization is a type of vulnerability that arises when an attacker can manipulate the serialized object and cause unintended consequences in the program’s flow.

Object value manipulation

For example, if a serialized object is used as a cookie for access control, you can try changing the usernames, role names, and other identity markers in the object, then re-serialize it and send it back to the application.

Take a look at this PHP serialized object. It represents a “User” object and contains properties such as “username” and “status”. If the application assumes that serialized objects are safe and uses them without verification, attackers might be able to tamper with the serialized object to manipulate its properties.

O:4:"User":2:{s:8:"username";s:6:"vickie";s:6:"status";s:9:"not admin";}

In this serialized string, you can try to change the value of “status” to “admin”, and see if the application grants you admin privileges.

O:4:"User":2:{s:8:"username";s:6:"vickie";s:6:"status";s:5:"admin";}

Serialization does not provide any form of data integrity protection. It is simply a way of packaging data for transmission.
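To show how little effort this tampering takes, here is a minimal Python sketch that rewrites the “status” property of the PHP serialized string above. The helper name tamper_status is hypothetical, and the regular expression only handles this simple case (a string value with no embedded double quotes).

```python
import re

def tamper_status(serialized: str, new_status: str) -> str:
    """Replace the value of the "status" property and fix its length prefix."""
    # PHP stores strings as s:<byte length>:"<value>"; the length must match the value.
    length = len(new_status.encode("utf-8"))
    replacement = 's:6:"status";s:%d:"%s";' % (length, new_status)
    # A function is used as the replacement so the value is inserted literally.
    return re.sub(r's:6:"status";s:\d+:"[^"]*";', lambda _m: replacement, serialized)

original = 'O:4:"User":2:{s:8:"username";s:6:"vickie";s:6:"status";s:9:"not admin";}'
tampered = tamper_status(original, "admin")
print(tampered)
```

Nothing in the resulting string reveals that it was edited, so the application cannot tell the two versions apart unless it verifies the data some other way.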

Remote code execution (RCE)

If the application does not handle deserialization safely, an attacker might even be able to execute malicious code!

In Python, serialization is done with the pickle module. The following code snippet prints the pickled representation of the Person object “new_person” (this process is called pickling):

import pickle

class Person:
  def __init__(self, name):
    self.name = name

new_person = Person("Vickie")
# dumps() returns the serialized object as bytes
print(pickle.dumps(new_person))

With pickle protocol 3 (the default from Python 3.0 through 3.7), the pickled object would look like this:

b'\x80\x03c__main__\nPerson\nq\x00)\x81q\x01}q\x02X\x04\x00\x00\x00nameq\x03X\x06\x00\x00\x00Vickieq\x04sb.'

Python allows objects to declare how they should be pickled via the __reduce__ method. This method takes no arguments and returns either a string or a tuple. When it returns a tuple, the tuple dictates how the object will be reconstructed during unpickling:

(callable object that will be called to instantiate the new object, a tuple of arguments for that callable object)

This means that if you define a __reduce__ method in an object, the pickled object could be instantiated as something else during unpickling. Now suppose the attacker constructs a malicious object like this:

import base64
import os
import pickle

class Malicious:
  def __reduce__(self):
    # Tells the unpickler to call os.system('cat /etc/shadow')
    return (os.system, ('cat /etc/shadow',))

fake_object = Malicious()
session_cookie = base64.b64encode(pickle.dumps(fake_object))

They can make the victim application execute arbitrary code as soon as it unpickles the fake object:

os.system('cat /etc/shadow')
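The same mechanism can be observed safely by swapping the shell command for a harmless callable. In this sketch the class name Swapped is hypothetical, and len stands in for os.system.

```python
import pickle

class Swapped:
    def __reduce__(self):
        # (callable, args): the unpickler calls len("hello")
        # instead of rebuilding a Swapped instance.
        return (len, ("hello",))

payload = pickle.dumps(Swapped())
result = pickle.loads(payload)  # the call happens here, during unpickling

print(type(result).__name__, result)  # prints "int 5"
```

The unpickled value is whatever the callable returned, not a Swapped object. An attacker only needs the application to call pickle.loads on the payload; no method on the resulting object has to be invoked afterwards.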

You can learn in detail how attackers exploit Java deserialization vulnerabilities here, insecure deserialization vulnerabilities in Python here, and PHP unserialize vulnerabilities here.

Preventing insecure deserialization vulnerabilities

To prevent insecure deserialization, you need to first keep an eye out for patches and keep dependencies up to date. Many insecure deserialization vulnerabilities are introduced via dependencies, so make sure that your third-party code is secure. You can automate this process by employing a software composition analysis (SCA) tool.

If you are implementing the deserialization functionality yourself, do not deserialize any data tainted by user input without proper checks. If deserialization is absolutely necessary, restrict it to a small allowlist of permitted classes. You can learn how to deserialize objects safely to prevent insecure deserialization here. ShiftLeft’s vulnerability fix database contains detailed code samples for safely implementing deserialization in Java, Python, C#, Go, JavaScript, and Scala.
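In Python, one way to apply such an allowlist is to subclass pickle.Unpickler and override find_class, the hook the unpickler uses to resolve every global it is asked to load. The sketch below is a minimal illustration, not a complete defense: the names ALLOWED_GLOBALS, RestrictedUnpickler, restricted_loads, and Probe are hypothetical, and the allowlist contents are an assumption you would replace with the classes your application actually needs.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist of (module, name) pairs the application agrees to load.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a class or function.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")

def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

class Probe:
    def __reduce__(self):
        # Harmless stand-in for an attacker payload: asks the unpickler to call len.
        return (len, ("hello",))

print(restricted_loads(pickle.dumps(OrderedDict(a=1))))  # allowed class loads normally

try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Plain containers such as dicts, lists, and strings never go through find_class, so they still load. Keep the allowlist as short as possible, since any allowed callable can be invoked with attacker-chosen arguments.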

It also helps to use simple data types, like strings and arrays, instead of objects that need to be serialized for transport. To prevent tampering with serialized cookies, keep the session state on the server instead of relying on user input for session information.
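A minimal sketch of the server-side approach: the cookie carries only a random session ID, and the identity data stays in a store the user cannot edit. The in-memory dictionary and the function names create_session and load_session are hypothetical stand-ins; a real application would likely use a database or cache and add expiry.

```python
import secrets

# Hypothetical in-memory session store, keyed by session ID.
_SESSIONS = {}

def create_session(username: str, status: str) -> str:
    """Store the session state server-side and return the ID to put in the cookie."""
    session_id = secrets.token_urlsafe(32)  # unguessable random identifier
    _SESSIONS[session_id] = {"username": username, "status": status}
    return session_id

def load_session(session_id: str):
    """Look up the state for a cookie value; unknown IDs yield None."""
    return _SESSIONS.get(session_id)

cookie = create_session("vickie", "not admin")
print(load_session(cookie))
print(load_session("forged-id"))
```

Because the cookie no longer contains a serialized object, there is nothing for the client to deserialize or tamper with: changing the cookie value can only make the lookup fail.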

What other security concepts do you want to learn about? I’d love to know. Feel free to connect on Twitter @vickieli7.

Continuously scanning your codebase for insecure deserialization issues is the best way to prevent them. ShiftLeft CORE can find insecure deserialization vulnerabilities in your application and protect you from malicious attacks.