GharamElhendy

Data Visualization for Result Communication: Pandas and Matplotlib

Importing and parsing

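import pandas as pd
# IPython/Jupyter magic that enables matplotlib plotting; use `%matplotlib inline` to render plots inside the notebook
%matplotlib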


df = pd.read_csv('file_name.csv')
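
Before going further, it can help to check that the file parsed as expected. The following is a minimal sketch, assuming a tiny inline CSV in place of `file_name.csv`; the column names `age` and `city` are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for file_name.csv
csv_text = "age,city\n22,Cairo\n35,Giza\n30,Cairo\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows, to confirm the columns parsed correctly
print(df.dtypes)   # column types inferred by pandas
print(df.shape)    # (number of rows, number of columns)
```

With a real file, only the argument to `pd.read_csv` changes.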

Creating variables to represent the groups into which you divide your data. For example, if you divide your data into two groups:

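# If the column stores category labels (here the strings '>number' and '<number'), filter by equality
df_group1 = df[df['variable_used_to_divide_data'] == '>number']
df_group2 = df[df['variable_used_to_divide_data'] == '<number']
# If the column is numeric, compare against the threshold instead, e.g. df[col] > number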
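
If the dividing column is numeric rather than a column of labels, the same split can be written with comparison operators. This is a rough sketch under that assumption; the `age` column and the cut-off of 30 are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data; 'age' stands in for variable_used_to_divide_data
df = pd.DataFrame({
    "age": [22, 35, 30, 41, 19],
    "city": ["Cairo", "Giza", "Cairo", "Luxor", "Giza"],
})

threshold = 30  # assumed cut-off, not from the original post

# Boolean masks select the rows on each side of the threshold
df_group1 = df[df["age"] > threshold]
df_group2 = df[df["age"] < threshold]

# Rows exactly equal to the threshold land in neither group;
# use >= (or <=) in one of the masks if they should be kept
print(len(df), len(df_group1), len(df_group2))
```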


Creating charts (in this example, bar charts) to compare the values of the same variable in each group:

Note: Both sets of value counts are indexed with the same category order (taken from the first group) so that the bars appear in the same positions and the charts are easier to compare visually.

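ind = df_group1['comparison_variable'].value_counts().index
df_group1['comparison_variable'].value_counts()[ind].plot(kind='bar');
# reindex keeps the first group's category order and fills categories missing from the second group with 0
df_group2['comparison_variable'].value_counts().reindex(ind, fill_value=0).plot(kind='bar');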
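
One possible way to make the comparison easier is to draw both bar charts side by side on a shared y-axis. This is only a sketch: the `colour` column and its values are invented, and `reindex(ind, fill_value=0)` is used so that a category missing from one group shows up as an empty bar instead of raising a `KeyError`.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical groups; 'colour' stands in for comparison_variable
df_group1 = pd.DataFrame({"colour": ["red", "red", "blue", "green", "red", "blue"]})
df_group2 = pd.DataFrame({"colour": ["blue", "red", "blue", "blue"]})

counts1 = df_group1["colour"].value_counts()
ind = counts1.index  # category order taken from group 1

# Put group 2 in the same order; categories it lacks become 0
counts2 = df_group2["colour"].value_counts().reindex(ind, fill_value=0)

# One row, two columns of axes, with a shared y-axis for a fair comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
counts1.plot(kind="bar", ax=ax1)
counts2.plot(kind="bar", ax=ax2)
ax1.set_title("Group 1")
ax2.set_title("Group 2")
fig.tight_layout()
```

In a notebook the figure renders when the cell finishes; in a plain script, a call to `plt.show()` at the end would be needed.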


Creating pie charts to see which values dominate each group:

ind = df_group1['another_comparison_variable'].value_counts().index
df_group1['another_comparison_variable'].value_counts()[ind].plot(kind='pie', figsize=(8, 8));
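
To compare which values dominate in both groups at once, the pie charts can be placed next to each other. This is a minimal sketch with invented data; the `plan` column and the group titles are assumptions, not part of the original example.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical groups; 'plan' stands in for another_comparison_variable
df_group1 = pd.DataFrame({"plan": ["basic", "basic", "pro", "basic", "team"]})
df_group2 = pd.DataFrame({"plan": ["pro", "pro", "basic", "pro"]})

fig, axes = plt.subplots(1, 2, figsize=(12, 6))
for ax, group, title in zip(axes, (df_group1, df_group2), ("Group 1", "Group 2")):
    counts = group["plan"].value_counts()
    # autopct labels each slice with its percentage share of the group
    counts.plot(kind="pie", ax=ax, autopct="%1.0f%%")
    ax.set_title(title)
    ax.set_ylabel("")  # drop the default y-axis label pandas adds
fig.tight_layout()
```

Percentages are used here rather than raw counts because the two groups may differ in size.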


Creating histograms to plot distributions of each group:

df_group1['third_comparison_variable'].hist();
df_group2['third_comparison_variable'].hist();
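
When both histograms are drawn on the same axes, the second can hide the first. One way around this, sketched below with made-up data, is to make the bars semi-transparent, give each group a label, and use the same value range for both so the bins line up. The `score` column is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical groups; 'score' stands in for third_comparison_variable
df_group1 = pd.DataFrame({"score": [55, 60, 62, 70, 71, 75, 80]})
df_group2 = pd.DataFrame({"score": [40, 45, 52, 58, 60, 66]})

# Shared value range so both histograms use identical bin edges
lo = min(df_group1["score"].min(), df_group2["score"].min())
hi = max(df_group1["score"].max(), df_group2["score"].max())

fig, ax = plt.subplots(figsize=(8, 4))
# alpha makes the bars see-through so overlapping regions stay visible
df_group1["score"].hist(ax=ax, bins=8, range=(lo, hi), alpha=0.5, label="Group 1")
df_group2["score"].hist(ax=ax, bins=8, range=(lo, hi), alpha=0.5, label="Group 2")
ax.set_xlabel("score")
ax.set_ylabel("count")
ax.legend()
```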


Then, viewing summary statistics for each group:

df_group1['third_comparison_variable'].describe()
df_group2['third_comparison_variable'].describe()
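
In a notebook, only the last expression in a cell is displayed, so it can be convenient to place the two summaries in one table. This is a sketch with invented data; the `score` column and the `group1`/`group2` labels are assumptions.

```python
import pandas as pd

# Hypothetical groups; 'score' stands in for third_comparison_variable
df_group1 = pd.DataFrame({"score": [55, 60, 62, 70, 71, 75, 80]})
df_group2 = pd.DataFrame({"score": [40, 45, 52, 58, 60, 66]})

# describe() returns a Series of statistics; two of them form a two-column table
summary = pd.DataFrame({
    "group1": df_group1["score"].describe(),
    "group2": df_group2["score"].describe(),
})

print(summary)
```

Each row of `summary` holds one statistic (count, mean, std, min, quartiles, max), which makes the groups easy to compare line by line.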

