DEV Community

BC

Convert string to bytes for both Python 2 and 3 - Python Tips

Python 2 was retired this year (it reached end of life on January 1, 2020). If you have old projects still running on Python 2, you may want to gradually update your code to Python 3.

One big difference between Python 2 and 3 is how they treat the string and bytes types. This affects the parameters of many functions. For example, suppose you have some code like this in Python 2:

import hashlib

s = "abcd"
hashed = hashlib.md5(s).hexdigest()

It will raise an exception if you run it in Python 3:

>>> import hashlib
>>> s = "abcd"
>>> hashed = hashlib.md5(s).hexdigest()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing

The reason is that in Python 3, after s = "abcd", s is a str, which holds Unicode text, and hashlib.md5 only accepts bytes-like objects for hashing.
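To make the difference concrete, here is a minimal sketch that assumes it runs under Python 3. The variable names text, data and digest are only illustrative. It shows that the same value exists in two forms, and that only the encoded form can be hashed:

```python
import hashlib

text = "abcd"                # str: Unicode text in Python 3
data = text.encode("utf-8")  # bytes: the encoded form that md5 accepts

print(type(text).__name__, type(data).__name__)

try:
    hashlib.md5(text)        # passing str raises TypeError in Python 3
except TypeError as exc:
    print("TypeError:", exc)

# Hashing the encoded bytes works
digest = hashlib.md5(data).hexdigest()
print(digest)
```

The exact wording of the TypeError message can differ between Python 3 releases, so it is safer not to match on the text.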

So to make the code compatible with both Python 2 and 3, we need a converter function that turns the parameter into bytes.

import sys

# True when running under Python 2
PY2 = sys.version_info[0] == 2
if PY2:
    unicode_type = unicode  # Python 2 text type
else:
    unicode_type = str  # Python 3 text type

def force_bytes(s, encoding="utf-8", errors="strict"):
    # Encode text to bytes; return anything else unchanged
    if isinstance(s, unicode_type):
        s = s.encode(encoding, errors)
    return s

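This code checks the Python version and picks the matching text type, so a value is encoded only when it is Unicode text. In Python 2, a plain "abcd" is already a byte string and is returned unchanged; in Python 3 it is encoded to bytes. Now we can call force_bytes on the value before passing it to md5: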

import hashlib
s = "abcd"
hashed = hashlib.md5(force_bytes(s)).hexdigest()
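As a quick check of how the helper behaves on different inputs, here is a small sketch. It repeats force_bytes so that it runs on its own, and the sample values (already_bytes, text, non_ascii) are only illustrative:

```python
import hashlib
import sys

# Same helper as above, repeated so this snippet is self-contained.
# The conditional expression never evaluates `unicode` on Python 3.
unicode_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821

def force_bytes(s, encoding="utf-8", errors="strict"):
    if isinstance(s, unicode_type):
        s = s.encode(encoding, errors)
    return s

already_bytes = b"abcd"
text = u"abcd"
non_ascii = u"caf\u00e9"  # "cafe" with an acute accent on the e

# Bytes pass through untouched, so calling the helper twice is harmless
assert force_bytes(already_bytes) is already_bytes
assert force_bytes(force_bytes(text)) == b"abcd"

# Non-ASCII text is encoded with the chosen encoding (UTF-8 by default)
assert force_bytes(non_ascii) == b"caf\xc3\xa9"

# Both forms of the same value now produce the same hash
same_hash = (hashlib.md5(force_bytes(text)).hexdigest()
             == hashlib.md5(already_bytes).hexdigest())
assert same_hash
```

Note that this helper returns values that are neither text nor bytes (for example an int) unchanged, so a wrong type only fails later, inside md5. If you prefer an early error, you could add an explicit type check.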

A second way is to use the six library. It contains lots of utility functions that help your code stay compatible with both Python 2 and 3. For example, it has an ensure_binary function, which is similar to our force_bytes function, so you can call it like this:

import hashlib
import six

s = "abcd"
hashed = hashlib.md5(six.ensure_binary(s)).hexdigest()

If you want a function that converts to Unicode instead, check out six.ensure_text. If you are using StringIO, six also provides StringIO and BytesIO. You can find more functions in the six documentation.
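If you would rather not add a dependency, the reverse direction can be sketched with the standard library only. The name force_text below is hypothetical, chosen to mirror force_bytes; it is not part of six, and it is only a rough stand-in for six.ensure_text:

```python
def force_text(s, encoding="utf-8", errors="strict"):
    # Hypothetical helper: decode bytes to text, return anything else unchanged.
    # In Python 2, bytes is an alias of str, so plain "abcd" literals are decoded.
    if isinstance(s, bytes):
        s = s.decode(encoding, errors)
    return s

decoded = force_text(b"caf\xc3\xa9")  # UTF-8 bytes become Unicode text
unchanged = force_text(u"abcd")       # text passes through untouched

print(decoded == u"caf\u00e9", unchanged == u"abcd")
```

Like force_bytes, this sketch does not validate other input types, so treat it as a starting point rather than a drop-in replacement.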
