DEV Community

muriuki muriungi erick

Using Python Dictionaries in Data Engineering

Python dictionaries are a powerful data structure that is useful in many data engineering applications. In this post, we'll explore some of the ways you can use Python dictionaries in data engineering, including how to create and manipulate them and how to apply them to data processing tasks.

What is a Python Dictionary?

A Python dictionary is a collection of key-value pairs that allows you to store and retrieve data using a key. Dictionaries are one of the core data structures in Python, and are commonly used in a variety of applications, including data engineering. Here's an example of a simple dictionary in Python:
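A minimal sketch of such a dictionary, using the keys and values described in the text; the variable name `my_dict` is an assumption:

```python
# Hypothetical variable name; the keys and values follow the description in the text.
my_dict = {"key1": "value1", "key2": "value2", "key3": "value3"}

print(my_dict)
```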

In this dictionary, the keys are 'key1', 'key2', and 'key3', and the values are 'value1', 'value2', and 'value3', respectively.

Creating and Accessing Dictionaries

To create a dictionary in Python, write the key-value pairs inside curly braces {}, separating each key from its value with a colon and the pairs from one another with commas. Here's an example:
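One possible version of this example, assuming a hypothetical `person` dictionary whose 'name' and 'age' entries match the values the rest of the post refers to:

```python
# Hypothetical example record: each key is followed by a colon and its value,
# and the pairs are separated by commas.
person = {"name": "Alice", "age": 30}

print(person)
```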

You can also create a dictionary using the dict() constructor, which accepts an iterable of key-value pairs (or keyword arguments). For example:
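A short sketch of both forms of `dict()`; the variable names and sample values are illustrative assumptions:

```python
# Build a dictionary from a list of (key, value) tuples.
person_from_pairs = dict([("name", "Alice"), ("age", 30)])

# Keyword arguments also work when the keys are valid Python identifiers.
person_from_kwargs = dict(name="Alice", age=30)

print(person_from_pairs == person_from_kwargs)  # prints: True
```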

Once you have created a dictionary, you can access its values by using the keys. For example, to access the value associated with the 'name' key in the dictionary above, you can use the following code:
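As a sketch, the lookup could be written as follows, again assuming the hypothetical `person` dictionary. The `.get()` call is an addition not mentioned in the text; it can be handy when a key may be absent:

```python
person = {"name": "Alice", "age": 30}

# Square-bracket lookup raises KeyError if the key does not exist.
print(person["name"])  # prints: Alice

# .get() returns a default value instead of raising when the key is missing.
print(person.get("email", "unknown"))  # prints: unknown
```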

This will output 'Alice'.

Manipulating Dictionaries

Dictionaries are mutable, which means that you can add, delete, and modify key-value pairs in the dictionary. Here are some of the ways that you can manipulate dictionaries in Python:

Adding Key-Value Pairs

To add a new key-value pair to a dictionary, you can simply assign a value to a new key:
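For instance, assuming the same hypothetical `person` dictionary:

```python
person = {"name": "Alice", "age": 30}

# Assigning to a key that does not exist yet creates the entry.
person["email"] = "alice@example.com"

print(person)
```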

This will add a new key 'email' with the value 'alice@example.com' to the dictionary.

Modifying Values

To modify the value associated with a key in a dictionary, you can simply reassign the value:
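A brief sketch of the reassignment, with the hypothetical `person` dictionary redefined so the snippet runs on its own:

```python
person = {"name": "Alice", "age": 30}

# Assigning to an existing key overwrites its value.
person["age"] = 31

print(person["age"])  # prints: 31
```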

This will change the value associated with the 'age' key from 30 to 31.

Deleting Key-Value Pairs

To delete a key-value pair from a dictionary, you can use the del statement:
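Roughly, with the hypothetical `person` dictionary redefined here for completeness:

```python
person = {"name": "Alice", "age": 30}

# del removes the key and its value; it raises KeyError if the key is absent.
del person["age"]

print(person)  # prints: {'name': 'Alice'}
```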

This will remove the 'age' key and its associated value from the dictionary.

Using Dictionaries in Data Engineering

Dictionaries can be used in a variety of data engineering tasks, including data cleaning, data transformation, and data aggregation. Here are some examples of how dictionaries can be used in data engineering:

Data Cleaning

Suppose you have a dataset that contains customer information, and you want to clean up the data by standardizing the state names. You could create a dictionary that maps the abbreviated state names to the full state names, and then use that dictionary to replace the abbreviated state names in the dataset.
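The idea can be sketched in plain Python. Everything here is an assumption made for illustration: the `STATE_NAMES` lookup table (only three states), the sample `customers` records, their field names, and the `standardize_state` helper. A real dataset would more likely be loaded from a file or database.

```python
# Hypothetical lookup table: only a few states are listed for illustration.
STATE_NAMES = {
    "CA": "California",
    "NY": "New York",
    "TX": "Texas",
}

# Hypothetical customer records; the field names are assumptions.
customers = [
    {"name": "Alice", "state": "CA"},
    {"name": "Bob", "state": "ny "},
    {"name": "Carol", "state": "Texas"},
]


def standardize_state(value):
    """Return the full state name, or the trimmed input if it is not a known abbreviation."""
    cleaned = value.strip()
    return STATE_NAMES.get(cleaned.upper(), cleaned)


# Build new records rather than mutating the originals.
cleaned_customers = [
    {**row, "state": standardize_state(row["state"])} for row in customers
]

for row in cleaned_customers:
    print(row)
```

Using `.get()` with the trimmed input as the default leaves values that are not known abbreviations unchanged, so rows that already hold a full state name pass through untouched.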

Happy coding!

Top comments (2)

Chris Greening

Thanks for sharing!

Learning how to leverage dict and dict-like structures in my data prep and ETL pipelines has been a huuuge improvement in code quality and readability - definitely one of my favorite tools to use!!

muriuki muriungi erick

Sure, dicts are vital in both ETL and ELT data pipelines.