Python has a wide range of applications. Its versatility makes it suitable for web development, data analysis, machine learning, and artificial intelligence. Data scientists can use Python to read, clean, visualize, and analyze data. Python also offers a rich ecosystem for training and deploying machine learning models.
Data visualization with Matplotlib
A picture is worth a thousand words. Data professionals visualize data to identify trends and communicate findings. Python has powerful packages for data visualization, among them Matplotlib. It uses the data provided to create desired plots that allow the data to tell its own story.
Matplotlib has a sub-package called pyplot. To work with pyplot, import it:
import matplotlib.pyplot as plt
Matplotlib can create many kinds of visualizations, for example:
- Line Plot: Plots data points and connects them with lines to visualize trends and relationships over a continuous variable.
plt.plot(x_values, y_values)
- Scatter Plot: Displays individual data points as dots to observe patterns or relationships between two variables.
plt.scatter(x_values, y_values)
- Bar Chart: Represents categorical data using rectangular bars to compare values across different categories.
plt.bar(x_values, y_values)
- Histogram: Shows the distribution of a continuous variable by dividing its range into intervals called bins and displaying the count of data points within each bin. The height of each bar is determined by the count of data points within that bin's range.
plt.hist(data, bins=no_of_bins)
- Pie Chart: Displays proportions of different categories as sectors of a circle. It is useful for representing parts of a whole. Here, labels is a list holding the names of the categories.
plt.pie(data, labels=labels)
- Box Plot: Illustrates summary statistics, such as median, quartiles, and outliers, of a numerical variable to understand its distribution and identify potential outliers.
plt.boxplot(data)
- Heatmap: Visualizes a matrix of data using colors to represent values. It is often used for correlation matrices or for showing patterns in two-dimensional data. The cmap argument takes the name of any Matplotlib colormap, for example 'viridis'.
plt.imshow(data, cmap='viridis')
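- Area Plot: Shades the area between two curves (or between a curve and the axis) to emphasize magnitude over a continuous variable such as time. For stacked areas that show the cumulative sum of multiple variables, Matplotlib also provides plt.stackplot.
plt.fill_between(x_values, y_values1, y_values2)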
- Horizontal Bar Chart: Similar to a bar chart but with the bars plotted horizontally, useful for comparing values across different categories.
plt.barh(y_values, x_values)
- Violin Plot: Combines a box plot and a kernel density plot to display the distribution of a variable, providing information about both central tendency and density. Useful for spotting skewed or multi-peaked distributions.
plt.violinplot(data)
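As a rough illustration of how a few of the calls above fit together, here is a minimal sketch. The sample data is invented for this example, and the sketch uses the Axes methods (ax.plot, ax.scatter, ax.hist), which mirror the plt functions listed above. It also assumes an off-screen backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical sample data, invented for illustration only
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 6]
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# One figure with three side-by-side axes
fig, (ax_line, ax_scatter, ax_hist) = plt.subplots(1, 3, figsize=(12, 3))

ax_line.plot(x_values, y_values)        # line plot
ax_scatter.scatter(x_values, y_values)  # scatter plot

# Histogram with 5 bins; hist returns the per-bin counts and the bin edges
counts, bin_edges, _ = ax_hist.hist(data, bins=5)

fig.tight_layout()
```

To view the result, you could call plt.show() with an interactive backend, or save the figure with fig.savefig and a file name of your choice.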
Each visualization can be formatted to include:
- Axis labels
- Chart name/title
- Data labels
- Grid lines
- Legend
- Trendline
- Error bars
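The sketch below shows one possible way to apply several of these formatting options to a single chart. The data and label text are hypothetical, and a trendline is left out because it would need a separate fitting step.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove this line to use an interactive window
import matplotlib.pyplot as plt

# Hypothetical measurements, invented for illustration only
months = [1, 2, 3, 4, 5]
sales = [10, 12, 15, 14, 18]
uncertainty = [1, 1, 2, 1, 2]

fig, ax = plt.subplots()

# Line with markers and vertical error bars
ax.errorbar(months, sales, yerr=uncertainty, marker="o", capsize=3, label="Sales")

ax.set_xlabel("Month")         # axis labels
ax.set_ylabel("Units sold")
ax.set_title("Monthly sales")  # chart title
ax.grid(True)                  # grid lines
ax.legend()                    # legend built from the label= argument above

# Data labels: write each value slightly above its point
for month, value in zip(months, sales):
    ax.annotate(str(value), (month, value),
                textcoords="offset points", xytext=(0, 8), ha="center")
```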
Common Data Structures in Python
A data structure is a format for organizing and storing data, usually chosen so that the data can be accessed efficiently.
This systematic organization allows the data to be managed efficiently in the computer's memory.
You can inspect the identity of the object that a variable refers to by passing the variable name to the built-in id() function. In CPython, this identity is the object's memory address.
print(id(name_of_variable))
A variable is a name that refers to a stored value.
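A small sketch of what id() reveals, using hypothetical variable names: two names can refer to the same stored object, while two objects with equal contents still have different identities.

```python
a = [1, 2, 3]
b = a          # b refers to the same list object as a
c = [1, 2, 3]  # a separate list that happens to hold equal values

same_object = id(a) == id(b)
different_object = id(a) != id(c)

print(same_object, different_object)  # True True
print(a is b, a == c)                 # True True
```

The is operator compares identities, while == compares values, which is why a == c is true even though a and c are different objects.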
Common Data Structures in Python include:
- Lists
- Tuples
- Dictionaries
- Sets
Lists
Say I create a variable x and assign it a sequence of integers enclosed in square brackets:
x = [1, 2, 3, 4, 5, 6]
I can check the data type of x, which is determined by the value it holds, in this case integers enclosed in square brackets:
x_type = type(x)
print(x_type)
The output is <class 'list'>, which means x is a list.
By assigning this value to x, we have created an object of the class "list".
Different data structures offer different methods and capabilities: an object's class determines its behavior.
A list is a mutable data structure. This means it supports the following operations:
- Appending: adding items to the list
x.append(the_value_to_be_added)
- Replacing: replacing items on a list with other items
x[2] = 5
This replaces the item at index 2 (the third item, since indexing starts at 0) with 5.
- Removing: removing items from a list
x.remove(value_to_remove)
The ability to perform these operations without having to create a new list is called mutability.
Lists can hold values of different data types, such as strings, integers, floats, booleans, and other lists.
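The operations above can be sketched on the example list x. The value 7 is an arbitrary choice for illustration, and id() is used only to show that the same list object is modified in place.

```python
x = [1, 2, 3, 4, 5, 6]
identity_before = id(x)

x.append(7)  # x is now [1, 2, 3, 4, 5, 6, 7]
x[2] = 5     # x is now [1, 2, 5, 4, 5, 6, 7]
x.remove(5)  # removes only the first 5: [1, 2, 4, 5, 6, 7]

print(x)                         # [1, 2, 4, 5, 6, 7]
print(id(x) == identity_before)  # True: still the same list object
```

Note that remove() deletes the first matching value, not the item at a given index.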
Tuples
Tuples differ slightly from lists in syntax. They are written with parentheses rather than the square brackets used for lists.
y = (1, 2, 3, 4, 5, 6)
Tuples are immutable: once created, they cannot be altered. They are used when you want to represent a collection of related values that should remain constant, such as coordinates, settings, or database records.
A tuple can hold values of different data types, such as strings, integers, floats, booleans, and lists or other tuples.
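To see immutability in practice, the following sketch tries to replace an item in the tuple y and catches the resulting error. The coordinate pair is a hypothetical example of related values kept together.

```python
y = (1, 2, 3, 4, 5, 6)

try:
    y[2] = 5  # item assignment is not supported on tuples
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# A hypothetical coordinate pair, unpacked into two separate names
point = (36.8, -1.3)
longitude, latitude = point
print(longitude, latitude)
```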
Dictionaries
Take the example of a traditional dictionary. Say you want to know the meaning of the word programmer. You will open your dictionary and look up the word programmer.
Once you find the word in the dictionary, you can read its definition, for example:
programmer: a person who turns the designs created by software developers and engineers into instructions that a computer can follow
The word programmer (which we know) is a key that directs us to its definition (which we do not know).
The definition is the value we are looking for using the key, programmer.
Dictionaries in Python work on a similar principle. You create keys and assign a value to each key, forming key-value pairs. A dictionary is a collection of such pairs.
Dictionaries are enclosed in curly brackets { }.
Syntax:
students_grade = {
    "Martin": "Not yet",
    "Jacob": "Pass",
    "Hellen": "Pass",
    "Joel": "Not yet",
    "Joylenne": "Not yet"
}
To access the grade of any student, I can use their name (which I know) to find out their grade (which I may not know).
students_grade["Hellen"]
This returns "Pass".
Dictionaries are used:
- In JSON files, where data is stored as key-value pairs.
- In data retrieval, by looking up a key to access its value. To view all keys in a dictionary, call the keys() method:
name_of_dictionary.keys()
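The following sketch pulls these uses together with the students_grade example. The name "Amina" and the fallback text "No record" are hypothetical, added only to show a lookup for a missing key. The JSON round trip uses the standard json module.

```python
import json

students_grade = {
    "Martin": "Not yet",
    "Jacob": "Pass",
    "Hellen": "Pass",
    "Joel": "Not yet",
    "Joylenne": "Not yet",
}

hellen_grade = students_grade["Hellen"]  # look up a value by its key

# get() returns a fallback instead of raising KeyError for a missing key
missing_grade = students_grade.get("Amina", "No record")

students_grade["Martin"] = "Pass"  # dictionaries are mutable: update a value

names = list(students_grade.keys())  # all keys, in insertion order
passed = [name for name, grade in students_grade.items() if grade == "Pass"]

# Convert to a JSON string and back again
as_json = json.dumps(students_grade)
restored = json.loads(as_json)

print(names)
print(passed)
```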
As a programmer, you will pick up coding best practices along the way that guide you in choosing data structures and algorithms so that the code you write is:
- More readable
- Faster to run
- Lighter on memory