DEV Community

John

Cracking the Code: A Data Beginner's Guide to Python Programming

Python is a versatile language used for web development, data analysis, machine learning, and artificial intelligence. Data scientists can use Python to read, clean, visualize, and analyze data. Python also offers a rich environment for training and deploying machine learning models.

Data visualization with Matplotlib

A picture is worth a thousand words. Data professionals visualize data to identify trends and communicate findings. Python has powerful packages for data visualization, among them Matplotlib. It uses the data provided to create desired plots that allow the data to tell its own story.
Matplotlib has a sub-package called pyplot. To work with pyplot, import it:

import matplotlib.pyplot as plt

Matplotlib can create a wide range of visualizations, for example:

  • Line Plot: Plots data points and connects them with lines to visualize trends and relationships over a continuous variable.
plt.plot(x_values, y_values)
  • Scatter Plot: Displays individual data points as dots to observe patterns or relationships between two variables.
plt.scatter(x_values, y_values)
  • Bar Chart: Represents categorical data using rectangular bars to compare values across different categories.
plt.bar(x_values, y_values)
  • Histogram: Shows the distribution of a continuous variable by dividing its range into intervals called bins and displaying the count of data points within each bin. The height of each bar is determined by the count of data points within that bin's range.
plt.hist(data, bins=no_of_bins)
  • Pie Chart: Displays proportions of different categories as sectors of a circle. It is useful for representing parts of a whole.
plt.pie(data, labels=labels)  # labels: names of the categories
  • Box Plot: Illustrates summary statistics, such as median, quartiles, and outliers, of a numerical variable to understand its distribution and identify potential outliers.
plt.boxplot(data)
  • Heatmap: Visualizes a matrix of data using colors to represent values. It is often used for correlation matrices or for showing patterns in two-dimensional data.
plt.imshow(data, cmap='viridis')  # cmap accepts any Matplotlib colormap name
  • Area Plot: Shades the area under a line or between two lines. Stacked area plots show the cumulative values of multiple variables over time. fill_between shades the region between two curves; plt.stackplot draws the stacked, cumulative version.
plt.fill_between(x_values, y_values1, y_values2)
  • Horizontal Bar Chart: Similar to a bar chart but with the bars plotted horizontally, useful for comparing values across different categories.
plt.barh(y_values, x_values)
  • Violin Plot: Combines a box plot and a kernel density plot to display the distribution of a variable, providing information about both central tendency and density. It can also help in spotting outliers.
plt.violinplot(data)
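To see a few of these plot types side by side, here is a minimal sketch. The sample data is invented for illustration, and the sketch uses the axes methods (ax.plot, ax.scatter, ax.bar, ax.hist), which mirror the plt functions listed above but let several plots share one figure.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, invented purely for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [10, 12, 9, 15, 18, 21]
categories = ["A", "B", "C"]
counts = [5, 8, 3]
scores = [52, 55, 61, 61, 64, 67, 70, 70, 71, 75, 78, 83, 90]

# One figure with a 2x2 grid of axes, one plot type per cell.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(months, sales)        # line plot
axes[0, 0].set_title("Line plot")

axes[0, 1].scatter(months, sales)     # scatter plot
axes[0, 1].set_title("Scatter plot")

axes[1, 0].bar(categories, counts)    # bar chart
axes[1, 0].set_title("Bar chart")

# hist returns the count in each bin, the bin edges, and the drawn bars.
bin_counts, bin_edges, _ = axes[1, 1].hist(scores, bins=4)
axes[1, 1].set_title("Histogram")

fig.tight_layout()
# Call plt.show() to open a window, or fig.savefig("plots.png") to write a file.
```

Every value in scores falls into exactly one of the four bins, so the bin counts add up to the number of data points.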


Each visualization can be formatted to include:

  • Axis labels
  • Chart name/title
  • Data labels
  • Grid lines
  • Legend
  • Trendline
  • Error bars
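As a rough illustration of some of these formatting options, the sketch below adds axis labels, a title, grid lines, a legend, data labels, and error bars to a single plot. The measurements and the "Sensor A" label are made up for the example.

```python
import matplotlib.pyplot as plt

# Hypothetical measurements, invented for illustration only.
days = [1, 2, 3, 4, 5]
temperature = [20.1, 21.4, 19.8, 22.3, 23.0]
uncertainty = [0.5, 0.4, 0.6, 0.5, 0.3]

fig, ax = plt.subplots()

# errorbar draws the line, the markers, and the error bars in one call.
ax.errorbar(days, temperature, yerr=uncertainty, marker="o", label="Sensor A")

ax.set_xlabel("Day")                    # axis labels
ax.set_ylabel("Temperature (deg C)")
ax.set_title("Daily temperature")       # chart title
ax.grid(True)                           # grid lines
ax.legend()                             # legend built from the label above

# Data labels: write each value just above its point.
for day, temp in zip(days, temperature):
    ax.annotate(f"{temp:.1f}", (day, temp),
                textcoords="offset points", xytext=(0, 8), ha="center")

# Call plt.show() to display the figure.
```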

Common Data Structures in Python

A data structure is a data organization and storage format that is usually chosen for efficient access to data.
The systematic organization allows for efficient management of the data in the computer's memory storage locations.
You can read the identity of the object a variable refers to by calling the id() function on the variable name. In CPython, this identity is the object's address in memory.

print(id(name_of_variable))
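A small sketch of what id() can tell you, using made-up variable names: two names that refer to the same object share an identity, while two separate objects with equal contents do not.

```python
a = [1, 2, 3]
b = a              # no copy is made; b refers to the same object as a
c = [1, 2, 3]      # a separate list that happens to hold equal values

print(id(a) == id(b))   # True: same object
print(id(a) == id(c))   # False: different objects
print(a == c)           # True: equal contents

b.append(4)             # a change made through b is visible through a
print(a)                # [1, 2, 3, 4]
```

The actual numbers returned by id() vary from run to run, so compare them rather than relying on a specific value.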

A variable is a name that refers to a value stored in memory.
Common Data Structures in Python include:

  • Lists
  • Tuples
  • Dictionaries
  • Sets

Lists

Say I initialize a variable x with a few integers as values, enclosed in square brackets:
x = [1, 2, 3, 4, 5, 6]

I can check the data type of the variable x, which is determined by the value it holds, in this case integers enclosed in square brackets:

x_type = type(x)  # avoid naming the variable "type", which would shadow the built-in
print(x_type)

The output is <class 'list'>.
By initializing the variable x, we have created an object of the class "list".
Different data structures offer different methods and capabilities: the class determines the behavior.
A list is a mutable data structure. This means that it supports the following operations:

- Appending: adding items to the list
x.append(the_value_to_be_added)
- Replacing: replacing items on a list with other items
x[2] = 5

This replaces the item at index 2 (the third item, because indexing starts at 0) with 5.

- Removing: removing items from a list
x.remove(value_to_remove)

The ability to perform these operations without having to create a new list is called mutability.
Lists can hold values of different data types, e.g. strings, integers, floats, booleans, and other lists.
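Putting the three operations together, the sketch below checks that the list keeps the same identity throughout, which is one way to observe mutability. The name identity_before and the mixed list are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]
identity_before = id(x)

x.append(7)     # add an item to the end
x[2] = 5        # replace the item at index 2 (the value 3) with 5
x.remove(4)     # remove the first item equal to 4

print(x)                          # [1, 2, 5, 5, 6, 7]
print(id(x) == identity_before)   # True: still the same list object

# A list can mix data types, including other lists.
mixed = ["text", 42, 3.14, True, [1, 2]]
print([type(item).__name__ for item in mixed])
# ['str', 'int', 'float', 'bool', 'list']
```

Note that remove() deletes only the first matching value and raises a ValueError if the value is not in the list.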

Tuple

Tuples differ slightly from lists in syntax. They use parentheses instead of the square brackets used for lists.

y = (1, 2, 3, 4, 5, 6)

Tuples are immutable: once created, they cannot be altered. They are used when you want to represent a collection of related values that should remain constant, such as coordinates, settings, or database records.
A tuple can hold values of different data types, e.g. strings, integers, floats, booleans, and lists or other tuples.
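To make immutability concrete, here is a short sketch using a hypothetical pair of coordinates. Trying to assign to an item raises a TypeError, so "changing" a tuple really means building a new one.

```python
point = (3.5, -1.2)          # hypothetical (x, y) coordinates

try:
    point[0] = 10.0          # item assignment is not supported
except TypeError as error:
    print("Cannot modify a tuple:", error)

# Unpacking reads the values into separate names.
x_coord, y_coord = point

# To "change" a tuple you build a new one; the original is untouched.
moved = (point[0] + 1.0, point[1])
print(point, moved)          # (3.5, -1.2) (4.5, -1.2)
```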

Dictionaries

Take the example of a traditional dictionary. Say you want to know the meaning of the word programmer. You open your dictionary and look up the word programmer.
Once you find the word, you can read its definition, e.g.
programmer: a person who turns the designs created by software developers and engineers into instructions that a computer can follow
The word programmer (which we know) is a key that directs us to its definition (which we do not know).
The definition is the value we are looking for using the key, programmer.
Dictionaries in Python work on a similar principle. You create keys and assign a value to each key. Each key and its value form a key-value pair, and a dictionary is a collection of such pairs.
Dictionaries are enclosed in curly braces { }.
Syntax:

students_grade = {
    "Martin": "Not yet",  # keys are strings, so they need quotes
    "Jacob": "Pass",
    "Hellen": "Pass",
    "Joel": "Not yet",
    "Joylenne": "Not yet"
}

To access the grade of any student, I can use their name (which I know) to find their grade (which I may not know).
students_grade["Hellen"] returns "Pass".
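The lookup can be sketched as follows, with a shortened version of the dictionary. The student "Amina" is a made-up name used to show what happens when a key is missing and how a new pair is added.

```python
students_grade = {
    "Martin": "Not yet",
    "Jacob": "Pass",
    "Hellen": "Pass",
}

print(students_grade["Hellen"])                   # Pass

# Square brackets raise KeyError for a missing key; get() returns a default.
print(students_grade.get("Amina", "No record"))   # No record

# Dictionaries are mutable: add a new pair or update an existing one.
students_grade["Amina"] = "Pass"
students_grade["Martin"] = "Pass"

for name, grade in students_grade.items():
    print(name, grade)
```

Using get() with a default is a common choice when a missing key is expected and should not stop the program.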

Dictionaries are used:

  • In JSON files, which store data as key-value pairs.
  • In data retrieval, by looking up a key to access its value. To view all keys in a dictionary, call the method name_of_dictionary.keys().
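As a sketch of both uses, the example below parses a small piece of JSON text with the standard-library json module and lists its keys. The JSON content stands in for a hypothetical grades file.

```python
import json

# JSON text, standing in for the contents of a hypothetical grades file.
json_text = '{"Jacob": "Pass", "Hellen": "Pass", "Joel": "Not yet"}'

grades = json.loads(json_text)      # JSON object -> Python dict
print(type(grades).__name__)        # dict
print(list(grades.keys()))          # ['Jacob', 'Hellen', 'Joel']
print(list(grades.values()))        # ['Pass', 'Pass', 'Not yet']

# json.dumps converts the dictionary back to JSON text.
print(json.dumps(grades, indent=2))
```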

As a programmer, along the way, you will pick up code best practices that will guide you in structuring data structures and algorithms in a way that makes the code you write:

  • More readable
  • Faster to run
  • Lighter on memory
