In this guide I'll give a brief introduction to cryptography, explain its basic concepts, and show some practical code (in Python) that puts it all to use!
I've recently started diving deep into the cryptography world, and I was inspired enough to share what I've learned, which is where this guide comes from.
I'm by no means an expert on this topic, but I'll try my best to convey what I've learned :)
This guide is pretty loooong, so if you're short on time, you can put it on your reading list and continue later! If you're going for it in one go, buckle up!
As Wikipedia explains it:
Cryptography is the practice and study of techniques for secure communication in the presence of third parties called adversaries.
To put it simply, cryptography is the branch of CS that focuses on securing a piece of data through the use of specific algorithms.
Why specific algorithms? Because if we secure a piece of data with the same method every time, so that the output for the same data is always identical - well, that's not very secure, is it? Anyone watching could tell when the same data was sent twice.
This is why we need algorithms that involve randomness, so that encrypting the same data doesn't always produce the same output. But you might already know that computers, being deterministic machines, can't produce true randomness on their own. So we rely on carefully designed encryption algorithms whose ciphertexts are not truly random but look as random as possible to anyone without the key. So let's discuss some of these algorithms!
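To make the "same data, different output" idea concrete, here is a toy sketch using only Python's standard library. It is NOT a real cipher and should never protect actual secrets; the helpers `toy_keystream` and `toy_encrypt` are names I made up for this illustration. It only shows that mixing a fresh random value (a nonce) into the process changes the output even when the key and the data stay the same.

```python
import hashlib
import secrets


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stretch key + nonce into `length` pseudo-random bytes (toy construction)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return stream[:length]


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; calling it again with the same inputs decrypts."""
    stream = toy_keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(16)
message = b"Hello World"

# Same key, same nonce: the output is identical every time.
fixed_nonce = b"\x00" * 12
same_1 = toy_encrypt(key, fixed_nonce, message)
same_2 = toy_encrypt(key, fixed_nonce, message)

# Same key, fresh random nonce per message: the outputs differ.
nonce_a = secrets.token_bytes(12)
nonce_b = secrets.token_bytes(12)
fresh_1 = toy_encrypt(key, nonce_a, message)
fresh_2 = toy_encrypt(key, nonce_b, message)

print(same_1 == same_2)    # True
print(fresh_1 == fresh_2)  # False, barring an astronomically unlikely collision
```

Real algorithms such as AES do this far more carefully, but the role of the extra random value is the same.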
The algorithms can be divided into 3 sub categories -
However there's another sub category -
Hashing. You might be wondering about that, since hashing is one-way whereas encryption is two-way. That is, what has been encrypted can also be decrypted, but what has been hashed cannot be "dehashed". This
Hashing is indeed different from encryption, however. I'm referring to a Cryptographic Hash function, which maps data to a fixed-size digest and is designed so that the mapping cannot be reversed.
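The one-way nature of hashing is easy to see with Python's built-in `hashlib` module. This is a minimal sketch; SHA-256 is just my choice of example hash function, the guide itself doesn't prescribe one. Notice that there is a call to go from data to digest, but no call to go back.

```python
import hashlib

digest_1 = hashlib.sha256(b"Hello World").hexdigest()
digest_2 = hashlib.sha256(b"Hello World").hexdigest()
digest_3 = hashlib.sha256(b"Hello World!").hexdigest()

print(digest_1 == digest_2)  # True: hashing the same input always gives the same digest
print(digest_1 == digest_3)  # False: a one-character change gives a different digest
print(len(digest_1))         # 64 hex characters (32 bytes), whatever the input length
```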
We'll be focusing on
Symmetric algorithms in this guide, as they are among the most common encryption techniques.
Symmetric encryption is a type of encryption where only one key (a secret key) is used to both encrypt and decrypt electronic information.
Symmetric encryption can be primarily divided into
AES, which stands for
Advanced Encryption Standard, is one of the most popular encryption methods. The Python module
pycryptodome allows us to easily implement this method of encryption. So let's dive into some examples!
First make sure you have
pycryptodome installed with
pip install pycryptodome
To use AES encryption, we import these modules:
from Crypto.Cipher import AES
from Crypto import Random
Note: You only need the
AES module for the main encryption, but to generate secure keys I recommend the
Random module, as it generates cryptographically secure random output.
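To show why the source of randomness matters, here is a small sketch that uses only the standard library. Note the swap: I'm using Python's `secrets` module as a stand-in for pycryptodome's `Random` module (both are meant to give cryptographically secure bytes), and contrasting it with the ordinary `random` module, which is predictable once the seed is known.

```python
import random
import secrets

# Cryptographically secure: draws from the operating system's secure generator.
key_128 = secrets.token_bytes(16)
key_192 = secrets.token_bytes(24)
key_256 = secrets.token_bytes(32)
print(len(key_128), len(key_192), len(key_256))  # 16 24 32

# NOT suitable for keys: the `random` module is a seeded, repeatable generator.
random.seed(1234)
first = random.getrandbits(128)
random.seed(1234)
second = random.getrandbits(128)
print(first == second)  # True: same seed, same "random" value
```

If an attacker can guess or recover the seed, every "random" key made that way can be regenerated, which is why keys should come from a secure generator.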
There are 2 values that you need to generate for AES, plus a third that the cipher hands back to you:-
AES_KEY: This is the primary key that is needed to both encrypt and decrypt a piece of data. This is a secret key and should never be kept alongside the encrypted data. This key can be either
16 bytes, or
24 bytes, or
32 bytes. These are the values
Initialization Vector (IV)/
NONCE: Depending on which AES mode you are using (we'll dive into the modes real soon), you'll also need either an IV or a NONCE. This is a public value, not a secret, so it can be stored alongside the encrypted data. It prevents the same piece of data from always being encrypted into the same output, which only works if you never reuse it with the same key. An IV is
16 bytes in size, reflected by
AES.block_size, whereas the size of a NONCE can range up to
15 bytes inclusive. Hence the max limit is reflected by
Message Authentication Code (MAC): This is a byte string returned by Authenticated Encryption methods to check if the encrypted data was modified before decryption. More on this later! This is also a public value that can be stored alongside the encrypted data. It is
16 bytes in size.
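The idea behind a MAC can be tried out with the standard library alone. One loud caveat: this sketch uses HMAC-SHA256 from Python's `hmac` module purely to illustrate the general "tag that detects tampering" concept. It is not how AES authenticated modes compute their MAC internally, and `is_untouched` is a helper name I made up.

```python
import hashlib
import hmac
import secrets

mac_key = secrets.token_bytes(32)        # secret, shared by sender and receiver
message = b"pretend this is ciphertext"

# The sender computes a tag over the message and ships both together.
tag = hmac.new(mac_key, message, hashlib.sha256).digest()


def is_untouched(key: bytes, received: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, received, hashlib.sha256).digest()
    return hmac.compare_digest(expected, received_tag)


print(is_untouched(mac_key, message, tag))           # True

tampered = bytes([message[0] ^ 0x01]) + message[1:]  # flip a single bit
print(is_untouched(mac_key, tampered, tag))          # False
```

The tag itself can travel in the open; without the secret key, an attacker cannot produce a valid tag for a modified message.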
Now, AES has many modes, and you should never just guess which one to use, since most of them have a specific use case. Modes are denoted by
AES.MODE_(mode name here). Let's focus on the most commonly used modes!
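AES.MODE_CBC: This is a block cipher method that requires data padding. That is, if the size of the data is NOT a multiple of the AES block size (16 bytes, which is not the same thing as the
AES_KEY size), you'll have to pad it (PKCS#7 padding is a common choice) before encrypting. This requires an
IV. The output from this mode is malleable, meaning an attacker can alter the ciphertext in ways that change the decrypted data without being detected.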
AES.MODE_OFB: This is a stream cipher method. It DOES NOT require data padding, and its keystream is independent of the source data. Neither encryption nor decryption can be parallelized. That is, the blocks have to be processed one after another in both directions. A corrupted bit in the output only corrupts the matching bit of the decrypted data. This requires either an
NONCE. The output from this mode is completely malleable.
AES.MODE_CFB: This is a stream cipher method. It DOES NOT require data padding, and its keystream depends on the source data. Encryption cannot be parallelized, but decryption can be. That is, encryption has to process the blocks one after another, whereas decryption can work on several blocks at once. This requires either an
NONCE. The output from this mode is completely malleable.
AES.MODE_OCB: This is an authenticated encryption method. It overcomes the malleability drawback of the previous methods by using a
MAC (Message Authentication Code) to make sure the
encrypted_data has not been modified before decryption. This requires a
NONCE. The output from this mode is not malleable.
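Since CBC needs the data padded to a whole number of blocks, here is a hand-rolled sketch of one common padding scheme, PKCS#7, just to show what padding actually does to the bytes. The function names `pkcs7_pad` and `pkcs7_unpad` are mine. In real code you would more likely reach for the `pad` and `unpad` helpers in pycryptodome's `Crypto.Util.Padding` module rather than writing your own.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes, each with value N, so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Validate and strip PKCS#7 padding."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]


padded = pkcs7_pad(b"Hello World")   # 11 bytes of data
print(len(padded))                   # 16
print(padded[-5:])                   # b'\x05\x05\x05\x05\x05'
print(pkcs7_unpad(padded))           # b'Hello World'
print(len(pkcs7_pad(b"A" * 16)))     # 32: already-aligned data gets a full extra block
```

Because the padding bytes record their own count, the receiver can always tell where the real data ends, which plain zero-filling cannot guarantee when the data itself ends in zeroes.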
I highly recommend reading this Stack Overflow post to learn more about these modes.
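If "malleable" still feels abstract, this toy sketch may help. It is NOT AES: `xor_with_keystream` is a made-up stand-in for a stream-style cipher, built from SHA-256 only so the snippet runs with the standard library. The point is that, without a MAC, someone who never learns the key can still change what the message decrypts to.

```python
import hashlib
import secrets


def xor_with_keystream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key + nonce + counter) output."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(16)
nonce = secrets.token_bytes(12)
ciphertext = xor_with_keystream(key, nonce, b"PAY 100 DOLLARS")

# The attacker does not know the key. They only guess that byte 4 of the
# message is the digit "1", and XOR that ciphertext byte with ("1" ^ "9").
forged = bytearray(ciphertext)
forged[4] ^= ord("1") ^ ord("9")

# The receiver decrypts the forged ciphertext and sees nothing suspicious.
print(xor_with_keystream(key, nonce, bytes(forged)))  # b'PAY 900 DOLLARS'
```

An authenticated mode would reject the forged ciphertext, because the MAC no longer matches.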
So once you've decided on a mode to use, we can move on to the coding part! I'll be using
AES.MODE_OCB for this example.
First we generate the required keys:-
AES_KEY = Random.get_random_bytes(AES.key_size[0])  # AES.key_size is (16, 24, 32), so this is a 16-byte key
NONCE = Random.get_random_bytes(AES.block_size - 1)  # A nonce of 15 bytes
Now we have to convert our source data to a byte string. Remember,
AES modes take byte strings as input and put out byte strings as output. After that we can just encrypt it.
data = 'Hello World'
byte_dat = data.encode()  # You can also just do 'Hello World'.encode()
cipher = AES.new(AES_KEY, AES.MODE_OCB, nonce=NONCE)
ciphertext, MAC = cipher.encrypt_and_digest(byte_dat)  # Returns the encrypted bytes and the MAC
Afterwards, when we want to decrypt this
ciphertext, we'll be needing the
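MAC and of course the
cipher = AES.new(AES_KEY, AES.MODE_OCB, nonce=NONCE)  # A fresh cipher object is needed for decryption
decrypted_string = cipher.decrypt_and_verify(ciphertext, MAC)  # Raises ValueError if the data or MAC was modified
og_string = decrypted_string.decode()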
Now a few words on storing the keys and the encrypted data. If you remember from above, we can store the
MAC and the
NONCE in the same place as the encrypted data. The
AES_KEY, however, should be stored as securely as possible; we DO NOT want unauthorized access to this key. There are many ways of storing byte strings. You can either store them exactly as they are, as byte strings, or you can convert them into a list of integers by using
list(). For instance, using
list() on a
MAC will return a list of 16 integers.
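Here is a quick sketch of that list conversion and how to undo it. The 16 random bytes are only a stand-in for a real MAC, so the snippet runs without any encryption step.

```python
import secrets

mac = secrets.token_bytes(16)   # stand-in for a real 16-byte MAC

as_list = list(mac)             # 16 integers, each between 0 and 255
restored = bytes(as_list)       # bytes() turns the list back into a byte string

print(len(as_list))                         # 16
print(all(0 <= n <= 255 for n in as_list))  # True
print(restored == mac)                      # True
```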
You can also decode these byte strings into strings for storage. The
codec you want to use for the AES output byte strings is
latin-1, since it maps every possible byte value to a character. However, you can also use
b64encode() on the byte string and then use
decode() on the result. I prefer applying base64 encoding, salting etc. on top of the
ciphertext, since the
ciphertext is actually the same length as the original string, and we don't want that.
For encryption storage -
from base64 import b64encode
encoded_string = b64encode(ciphertext).decode()
NONCE_string = NONCE.decode('latin-1')
MAC_string = MAC.decode('latin-1')
For decryption retrieval -
from base64 import b64decode
ciphertext = b64decode(encoded_string)
NONCE = NONCE_string.encode('latin-1')
MAC = MAC_string.encode('latin-1')  # Encode the stored MAC_string back into bytes
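As one possible way to keep the public pieces together, here is a sketch that packs them into a single JSON record and unpacks them again. The record layout and field names are my own invention, and the random bytes merely stand in for real AES output. Note that the secret key is deliberately NOT part of the record.

```python
import json
import secrets
from base64 import b64decode, b64encode

# Stand-ins for real AES output, just so the snippet runs on its own.
ciphertext = secrets.token_bytes(11)
nonce = secrets.token_bytes(15)
mac = secrets.token_bytes(16)

# Pack: every byte string becomes base64 text, which JSON can hold safely.
record = json.dumps({
    "ciphertext": b64encode(ciphertext).decode("ascii"),
    "nonce": b64encode(nonce).decode("ascii"),
    "mac": b64encode(mac).decode("ascii"),
})

# Unpack: reverse each step to get the original byte strings back.
loaded = json.loads(record)
restored = {name: b64decode(text) for name, text in loaded.items()}

print(restored["ciphertext"] == ciphertext)  # True
print(restored["nonce"] == nonce)            # True
print(restored["mac"] == mac)                # True
print(len(b64encode(ciphertext)))            # 16: base64 of 11 bytes
```

Using base64 for all three fields keeps the record plain ASCII, which tends to survive files, databases, and network hops without surprises.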
And that's it! That's how you encrypt data with AES using
I hope this guide helped you learn a thing or two about cryptography. It's a really cool topic, and if your thirst for knowledge hasn't been quenched yet, check out this really cool GitHub repo, which has a LOT of resources about cryptography! If you made it this far, thanks for reading, and I sincerely hope my guide helps you in your future endeavors!