
Muhammet Tan


Most used Pandas features (At least I use often:) ) [PART I]

We can create a Pandas Series from an array, a list, a dict, or a DataFrame.

“In simple words Pandas Series is a one-dimensional labeled array that holds any data type (integers, strings, floating-point numbers, None, Python objects, etc.). The axis labels are collectively referred to as the index”

If you come from an object-oriented programming language, you may be familiar with the term constructor.

# Pandas Series constructor syntax looks like this
pandas.Series(data, index, dtype, name, copy)
# data: the values to store: an ndarray, a list, a dict, or a constant (scalar).
# index: the labels for the values. They must be hashable, but they do not have to be unique. Defaults to 0, 1, ..., n-1 (like np.arange(n)) if no index is passed.
# dtype: the data type of the values. If it is not given, it is inferred from data.
# name: an optional name for the Series.
# copy: whether to copy the input data.
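As a rough sketch of how these parameters fit together, the example below passes each one by keyword. The labels, the values, and the name "demo" are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Every argument is passed by keyword so the order does not matter.
values = np.array([1.5, 2.5, 3.5])
s = pd.Series(
    data=values,            # the values to store
    index=["x", "y", "z"],  # one label per value
    dtype="float64",        # explicit dtype instead of inference
    name="demo",            # hypothetical name, for illustration only
    copy=True,              # work on a copy of `values`
)

# Because copy=True, changing the Series leaves the original array untouched.
s["x"] = 100.0
print(s)
print(values)
```

With copy=True the Series owns its own data, so the NumPy array still holds 1.5 in its first position after the assignment.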
If you found the code above messy, don't worry, it is not important. Let’s start with basic implementations.

Creating an empty Pandas Series:

newserie = pd.Series()
Output: Series([], dtype: float64)

As you can see above, if you create an empty Pandas Series, the default datatype is float64 in pandas versions before 2.0 (from 2.0 onwards it is object). Once a series has values, its default index starts from 0.
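A small sketch of pinning the dtype of an empty Series so that the result does not depend on the installed pandas version. The variable names here are only for illustration.

```python
import pandas as pd

# Passing dtype explicitly gives the same result on every pandas version.
empty_float = pd.Series(dtype="float64")
empty_text = pd.Series(dtype="object")

print(empty_float.dtype)  # float64
print(empty_text.dtype)   # object
print(len(empty_float))   # 0
```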

Basic Example Code:

# import pandas as pd
import pandas as pd

# import numpy as np
import numpy as np

# simple array
data = np.array(['g', 'e', 'e', 'k', 's'])
# providing an index
ser = pd.Series(data, index=[10, 11, 12, 13, 14])  # we specify the index labels ourselves
print(ser)

Output:
10    g
11    e
12    e
13    k
14    s
dtype: object

Creating a series from a NumPy array:

import pandas as pd 
import numpy as np
data = np.array(['python','php','java'])
series = pd.Series(data)
print(series)
Output:
0    python
1       php
2      java
dtype: object

Creating a series from lists (with indexes):

calories = pd.Series([320, 450, 140, 56], ["apple", "banana", "melon", "bread"])

When we print the series, the index labels appear in the first column and the data values appear in the second column.
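To make the two columns concrete, here is a short sketch that prints the calories series and then reads the labels and the values separately.

```python
import pandas as pd

calories = pd.Series([320, 450, 140, 56], ["apple", "banana", "melon", "bread"])

print(calories)                  # labels on the left, values on the right
print(calories["banana"])        # look up one value by its label: 450

labels = calories.index.tolist()
values = calories.values.tolist()
print(labels)                    # ['apple', 'banana', 'melon', 'bread']
print(values)                    # [320, 450, 140, 56]
```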

Second example:

second_list = ['C', 'T']
example = pd.Series(second_list)
print(example)
Output:
0    C
1    T
dtype: object

If we don't define index labels, they will be generated automatically, starting from 0.
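A minimal sketch of what the automatic index looks like, and of replacing it later. The labels "first" and "second" are hypothetical and chosen only for this example.

```python
import pandas as pd

example = pd.Series(["C", "T"])

# With no index given, pandas numbers the rows 0, 1, ...
default_labels = list(example.index)
print(default_labels)      # [0, 1]

# The labels can be replaced after the Series is created.
example.index = ["first", "second"]
print(example["second"])   # T
```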

Creating a series from a dictionary:

grades_dictionary={'A':100,'B':80,'C':70,'D':60}
grades=pd.Series(grades_dictionary)
print(grades)
Output:
A    100
B     80
C     70
D     60
dtype: int64
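When the data is a dictionary, the index argument can also be used to pick and order keys. The sketch below assumes a key 'E' that is not in the dictionary, purely to show what happens to a missing key.

```python
import pandas as pd

grades_dictionary = {'A': 100, 'B': 80, 'C': 70, 'D': 60}

# Only the keys listed in `index` are kept, in the order given.
# 'E' is a hypothetical key that is not in the dictionary, so it becomes NaN.
selected = pd.Series(grades_dictionary, index=['D', 'A', 'E'])
print(selected)
```

Because NaN is a floating-point value, the resulting series is float64 rather than int64.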

Creating a Pandas Series with a list comprehension:

seri = pd.Series(range(1, 15, 5), index=[i for i in 'abc'])
print(seri)
Output:
a     1
b     6
c    11
dtype: int64

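The first parameter of range is the start value.

The second parameter of range is the stop value, which is not included in the result.

The last parameter of range is the step (the amount added each time).

So range(1, 15, 5) gives 1, then 1 + 5 = 6, then 6 + 5 = 11. The next value, 16, is past the stop value, so it is left out.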
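The behaviour of range and of the comprehension over 'abc' can be checked without pandas. This is a small sketch using only built-in Python.

```python
# range(start, stop, step): the stop value itself is never included.
up_to_15 = list(range(1, 15, 5))
up_to_16 = list(range(1, 16, 5))
up_to_17 = list(range(1, 17, 5))

print(up_to_15)   # [1, 6, 11]
print(up_to_16)   # [1, 6, 11]      (16 is the stop value, so it is excluded)
print(up_to_17)   # [1, 6, 11, 16]

# Iterating over a string yields its characters one by one.
letters = [i for i in 'abc']
print(letters)    # ['a', 'b', 'c']
```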

Conditional Operations in Series

populations = pd.Series([1350, 1200, 350, 150], ["China", "India", "USA", "Russia"], name="Populations")
populations[populations> 500]
Output:
China       1350 
India       1200
Name: Populations, dtype: int64

We can also combine conditions with logical operators like this. Each condition must be wrapped in parentheses.

populations[(populations > 200) & (populations < 1000)]
Output:
USA    350
Name: Populations, dtype: int64
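Besides & (and), conditions on a series can be combined with | (or) and negated with ~ (not). The thresholds in this sketch are arbitrary and chosen only for illustration.

```python
import pandas as pd

populations = pd.Series([1350, 1200, 350, 150],
                        ["China", "India", "USA", "Russia"],
                        name="Populations")

# | means "or": keep values below 200 or above 1300.
extremes = populations[(populations < 200) | (populations > 1300)]
print(extremes)    # China 1350, Russia 150

# ~ negates a condition: keep everything that is NOT above 500.
not_large = populations[~(populations > 500)]
print(not_large)   # USA 350, Russia 150
```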

Arithmetic operations in Pandas Series

data = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
data1 = pd.Series([5, 6, 7, 8], index=['d', 'a', 'f', 'g'])
data.add(data1)

Output:
a    7.0
b    NaN
c    NaN
d    9.0
f    NaN
g    NaN
dtype: float64

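As you can see above, some results are shown as NaN. This is because pandas matches values by index label, not by position. The labels ‘a’ and ‘d’ exist in both series, so their values are added (1 + 6 = 7 and 4 + 5 = 9). The labels ‘b’, ‘c’, ‘f’ and ‘g’ exist in only one of the two series, so there is nothing to add them to and the result is NaN.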
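To show that the matching is done by label, here is a sketch that adds two series with the same labels in a different order. The series `shuffled` is made up for this example.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# Same labels and values as `data`, only listed in reverse order.
shuffled = pd.Series([4, 3, 2, 1], index=['d', 'c', 'b', 'a'])

# Each label is matched with itself, so every value is simply doubled
# and no NaN appears, even though the positions differ.
total = data + shuffled
print(total)
```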

We can choose a value to use wherever a label is missing from one of the series. To do that, we pass the fill_value parameter to the add() method like this.

data.add(data1, fill_value=0)

The fill_value parameter of the add method is optional. I used 0, but you can pass any value you want (1, 2, 3, etc.). The value you pass stands in for the missing entry before the addition is done.
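A short sketch of the effect of fill_value on the two series from this section, first with 0 and then with 10 (an arbitrary value chosen only for comparison).

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
data1 = pd.Series([5, 6, 7, 8], index=['d', 'a', 'f', 'g'])

# A label missing from one Series is treated as 0 before adding.
filled_zero = data.add(data1, fill_value=0)
print(filled_zero)   # a 7, b 2, c 3, d 9, f 7, g 8

# With fill_value=10 the missing side counts as 10 instead.
filled_ten = data.add(data1, fill_value=10)
print(filled_ten)    # a 7, b 12, c 13, d 9, f 17, g 18
```

Labels present in both series ('a' and 'd') are not affected by fill_value.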

Datatype Conversions in Pandas

If you check out the official pandas documentation for convert_dtypes, you will see this sentence: “Whether object dtypes should be converted to the best possible types.”

For example :

When you initialize a pandas series like this

serie1= pd.Series([9, 6, 3])
print(serie1)
Output:
0    9
1    6
2    3
dtype: int64

Even though we didn’t define the datatype of the series explicitly, the output datatype became int64. But what if you are asked to specify the datatype or convert it to another one?

We can use the astype() method to do that.

For example:

serie1 = pd.Series([9, 6, 3])
floatserie = serie1.astype(float)
print(floatserie)
Output:
0    9.0
1    6.0
2    3.0
dtype: float64  # we can see that the output datatype is float64
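astype() accepts other target types as well, and convert_dtypes() lets pandas choose a type on its own. The sketch below shows both on the same series.

```python
import pandas as pd

serie1 = pd.Series([9, 6, 3])

# Convert every value to a string.
as_text = serie1.astype(str)
print(as_text.tolist())   # ['9', '6', '3']

# Let pandas pick a dtype that supports missing values.
nullable = serie1.convert_dtypes()
print(nullable.dtype)     # Int64 (capital I: the nullable integer dtype)
```

The original series is not changed by either call. Both return a new series.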

Additional Information
We can easily convert a Pandas Series to a Python list with tolist()

Like this

py_list = serie1.tolist()
print(py_list)
Output:
[9, 6, 3]
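The list produced by tolist() follows the dtype of the series it is called on. This sketch compares the integer series with its float version.

```python
import pandas as pd

serie1 = pd.Series([9, 6, 3])

# An int64 Series gives a list of plain Python ints.
py_list = serie1.tolist()
print(py_list)             # [9, 6, 3]
print(type(py_list[0]))    # <class 'int'>

# The float version gives a list of Python floats.
float_list = serie1.astype(float).tolist()
print(float_list)          # [9.0, 6.0, 3.0]
```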

https://sparkbyexamples.com/python-pandas-tutorial-for-beginners/
https://pandas.pydata.org/docs/reference/api/pandas.Series.convert_dtypes.html
