
BC

Convert string to bytes for both Python 2 and 3 - Python Tips

Python 2 was retired this year. If you have old projects that are still running on Python 2, you may want to gradually update your code to Python 3.

One big difference between Python 2 and 3 is how they treat the string and bytes types. This affects the parameters of many functions. For example, suppose you have some code like this in Python 2:

import hashlib

s = "abcd"
hashed = hashlib.md5(s).hexdigest()

It will throw an exception if you run it in Python 3:

>>> import hashlib
>>> s = "abcd"
>>> hashed = hashlib.md5(s).hexdigest()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing

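The reason is that in Python 3, after s = "abcd", s is a str, which holds Unicode text, while hashlib.md5 only accepts a bytes object for hashing. In Python 2, the same literal is already a byte string, so the call works.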
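To see the difference concretely, here is a minimal sketch for Python 3. The variable names text and data are only illustrative, and UTF-8 is an assumed encoding (any encoding that covers your text would do):

```python
import hashlib

text = "abcd"                # str: Unicode text in Python 3
data = text.encode("utf-8")  # bytes: the UTF-8 encoded form of the same text

# hashlib.md5 accepts the bytes object without complaint
hashed = hashlib.md5(data).hexdigest()
print(type(text).__name__, type(data).__name__, hashed)
```

Encoding by hand like this fixes the Python 3 case, but the call sites still have to know which type they are holding.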

So, to make the code compatible with both Python 2 and 3, we need to add a converter function that turns the parameter into bytes.

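import sys

# True when running under Python 2
PY2 = sys.version_info[0] == 2
if PY2:
    unicode_type = unicode  # Python 2 text type
else:
    unicode_type = str  # Python 3: str is already Unicode text

def force_bytes(s, encoding="utf-8", errors="strict"):
    # Encode text to bytes; return bytes (and anything else) unchanged
    if isinstance(s, unicode_type):
        s = s.encode(encoding, errors)
    return s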

This code checks the Python version and picks the matching text type, so the conversion only happens when s is Unicode text. Now we can call force_bytes on the value before passing it to md5:

import hashlib
s = "abcd"
hashed = hashlib.md5(force_bytes(s)).hexdigest()
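As a quick sanity check, the sketch below repeats the helper so it can run on its own, then feeds it both text and bytes. The sample values, including the non-ASCII one, are only illustrative:

```python
import hashlib
import sys

PY2 = sys.version_info[0] == 2
if PY2:
    unicode_type = unicode  # only evaluated on Python 2
else:
    unicode_type = str

def force_bytes(s, encoding="utf-8", errors="strict"):
    # Encode text to bytes; leave bytes untouched
    if isinstance(s, unicode_type):
        s = s.encode(encoding, errors)
    return s

# Text and bytes inputs end up as the same bytes, so the digests match
text_digest = hashlib.md5(force_bytes(u"abcd")).hexdigest()
bytes_digest = hashlib.md5(force_bytes(b"abcd")).hexdigest()
print(text_digest == bytes_digest)

# Non-ASCII text is encoded with the chosen encoding (UTF-8 by default)
encoded = force_bytes(u"caf\u00e9")
print(repr(encoded))
```

Because bytes pass through unchanged, it is safe to call force_bytes on a value that has already been converted.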

A second way is to use the six library. It contains lots of utility functions that help your code stay compatible with both Python 2 and 3. For example, it has an ensure_binary function that is similar to our force_bytes function, so you can call it like this:

import hashlib
import six

s = "abcd"
hashed = hashlib.md5(six.ensure_binary(s)).hexdigest()

If you need a function that converts to Unicode text instead, look at six.ensure_text. If you are using StringIO, six also provides StringIO and BytesIO. The six documentation lists more functions.
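If you would rather not add a dependency, the opposite direction can be sketched in the same style as force_bytes. Note that force_text is a hypothetical helper name used here for illustration. It is not part of six and not six's implementation; it only mirrors the idea of converting bytes to text:

```python
def force_text(s, encoding="utf-8", errors="strict"):
    # Hypothetical counterpart to force_bytes: decode bytes to text.
    # On Python 2, bytes is an alias for str, so byte strings are decoded to unicode.
    if isinstance(s, bytes):
        s = s.decode(encoding, errors)
    return s

decoded = force_text(b"caf\xc3\xa9")  # UTF-8 bytes in, Unicode text out
unchanged = force_text(u"abcd")       # text passes through untouched
print(decoded == u"caf\u00e9", unchanged == u"abcd")
```

Unlike this sketch, a library helper may also reject values that are neither text nor bytes, so check the library's behavior before relying on it.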
