<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: IftakharRahat</title>
    <description>The latest articles on DEV Community by IftakharRahat (@iftakharrahat).</description>
    <link>https://dev.to/iftakharrahat</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F927207%2Fad00256f-e0b7-4eba-a5ce-c5455e96e963.png</url>
      <title>DEV Community: IftakharRahat</title>
      <link>https://dev.to/iftakharrahat</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/iftakharrahat"/>
    <language>en</language>
    <item>
      <title>Python for data analysis - Pandas</title>
      <dc:creator>IftakharRahat</dc:creator>
      <pubDate>Tue, 20 Sep 2022 14:42:11 +0000</pubDate>
      <link>https://dev.to/iftakharrahat/python-for-data-analysis-pandas-2kf0</link>
      <guid>https://dev.to/iftakharrahat/python-for-data-analysis-pandas-2kf0</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;1. Series&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;A Series is very similar to a NumPy array (in fact it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a number location. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np
import pandas as pd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Creating a Series:
the basic syntax for creating a Series is &lt;strong&gt;pd.Series(data, index)&lt;/strong&gt;.
&lt;em&gt;Here are some examples:&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;labels = ['a','b','c']
my_list = [10,20,30]
arr = np.array([10,20,30])
d = {'a':10,'b':20,'c':30}
pd.Series(data=my_list)
output:
0    10
1    20
2    30
dtype: int64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pd.Series(data=my_list,index=labels)
output:
a    10
b    20
c    30
dtype: int64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;NumPy arrays
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pd.Series(arr)
output:
0    10
1    20
2    30
dtype: int64
pd.Series(arr,labels)
output:
a    10
b    20
c    30
dtype: int64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;dictionary
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pd.Series(d)
output:
a    10
b    20
c    30
dtype: int64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Data in a Series:
a pandas Series can hold a variety of object types&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pd.Series(data=labels)
output:
0    a
1    b
2    c
dtype: object
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Using an index:&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ser1 = pd.Series([1,2,3,4],index = ['USA', 'Germany','USSR', 'Japan'])  
output:
USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

ser2 = pd.Series([1,2,5,4],index = ['USA', 'Germany','Italy', 'Japan'])                                   
output:
USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Operations are then done based on the index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ser1 + ser2
Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;strong&gt;2. DataFrames&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;DataFrames are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index.&lt;br&gt;
&lt;code&gt;import pandas as pd&lt;br&gt;
import numpy as np&lt;/code&gt;&lt;br&gt;
&lt;em&gt;syntax of dataframe:&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;pandas.DataFrame(data, index, columns)&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;The randn() function returns random samples from the standard normal distribution in a specified shape: it creates an array of the given shape and fills it with samples drawn from that distribution.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from numpy.random import randn
np.random.seed(101)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())
        W           X           Y           Z
A    2.706850    0.628133    0.907969    0.503826
B    0.651118   -0.319318   -0.848077    0.605965
C   -2.018168    0.740122    0.528813   -0.589001
D    0.188695   -0.758872   -0.933237    0.955057
E    0.190794    1.978757    2.605967    0.683509
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;randn(5,4) here produces a 2-dimensional array (5 rows and 4 columns)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Selection and indexing
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['W']
A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

df[['W','Z']]

        W               Z
A   2.706850    0.503826
B   0.651118    0.605965
C   -2.018168   -0.589001
D   0.188695    0.955057
E   0.190794    0.683509
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Creating a new column
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['new'] = df['W'] + df['Y']
        W           X           Y           Z         new
A    2.706850    0.628133    0.907969    0.503826    3.614819
B    0.651118   -0.319318   -0.848077    0.605965   -0.196959
C   -2.018168    0.740122    0.528813   -0.589001   -1.489355
D    0.188695   -0.758872   -0.933237    0.955057   -0.744542
E    0.190794    1.978757    2.605967    0.683509    2.796762

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Removing a column with the drop method&lt;/em&gt;
to remove a column we have to use axis=1. The syntax is &lt;strong&gt;df.drop('column name', axis=1)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
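As a minimal sketch (reusing the seeded df with the 'new' column from above), drop returns a copy unless inplace=True is passed:

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(),
                  columns='W X Y Z'.split())
df['new'] = df['W'] + df['Y']

# drop() returns a new DataFrame by default; df keeps the 'new' column
dropped = df.drop('new', axis=1)

# with inplace=True the column is removed from df itself
df.drop('new', axis=1, inplace=True)
```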

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--blSH-9R5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hg1kwc4hatbegvk2ryud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--blSH-9R5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hg1kwc4hatbegvk2ryud.png" alt="Image description" width="469" height="214"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Inplace: The inplace parameter enables you to modify your dataframe directly. Remember: by default, the drop() method produces a new dataframe and leaves the original dataframe unchanged, because the inplace parameter defaults to inplace=False.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--D-tPizls--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kk8www7y5plnvstsf118.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--D-tPizls--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kk8www7y5plnvstsf118.png" alt="Image description" width="491" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dropping rows
&lt;em&gt;to delete rows from a dataframe we have to use axis=0&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;
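A short sketch of the same idea, using the seeded df from earlier (axis=0 is the default, so drop works on row labels):

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4), index='A B C D E'.split(),
                  columns='W X Y Z'.split())

# axis=0 (the default) drops by row label; df itself is unchanged
df_no_e = df.drop('E', axis=0)
```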

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6-Wq_YKZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/konprd2aewrn5oi0q8pl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6-Wq_YKZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/konprd2aewrn5oi0q8pl.png" alt="Image description" width="602" height="383"&gt;&lt;/a&gt;&lt;br&gt;
Or select based on position instead of label&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uuosFN4N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fkgqsfkld468q7nf9bip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uuosFN4N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fkgqsfkld468q7nf9bip.png" alt="Image description" width="382" height="154"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Selecting a subset of rows and columns&lt;/li&gt;
&lt;/ul&gt;
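The selections shown in the screenshots can be sketched like this with the same seeded df, using label-based .loc and position-based .iloc:

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4), index='A B C D E'.split(),
                  columns='W X Y Z'.split())

cell = df.loc['B', 'Y']               # single value by row and column label
sub = df.loc[['A', 'B'], ['W', 'Y']]  # 2x2 sub-DataFrame
row_c = df.iloc[2]                    # by position: same row as df.loc['C']
```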

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--THJxS0yY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cvqyd8bdta33tnhzm1t0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--THJxS0yY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cvqyd8bdta33tnhzm1t0.png" alt="Image description" width="438" height="208"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Conditional selection&lt;/strong&gt;
&lt;em&gt;An important feature of pandas is conditional selection using bracket notation, very similar to NumPy.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;df&lt;br&gt;
output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eXQKUr37--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nmtzjyug665b53zxn9ji.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eXQKUr37--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nmtzjyug665b53zxn9ji.png" alt="Image description" width="343" height="204"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;df&amp;gt;0&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rpgw8oPW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j0ypi49qdl8o0jth9gcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rpgw8oPW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/j0ypi49qdl8o0jth9gcn.png" alt="Image description" width="228" height="167"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Values greater than zero become True.&lt;br&gt;
If we want only the values in the dataframe where the condition is True, we can write df[df&amp;gt;0]&lt;br&gt;
and the output will look like this:&lt;/p&gt;

&lt;p&gt;df[df&amp;gt;0]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kpSBbDSA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/18db0dkf5bdmx70dj1ar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kpSBbDSA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/18db0dkf5bdmx70dj1ar.png" alt="Image description" width="319" height="170"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Again, df itself remains unchanged, because the selection returns a new dataframe rather than modifying the original&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lAjwwe3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jeob87vy5l3a7e5iyj45.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lAjwwe3r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jeob87vy5l3a7e5iyj45.png" alt="Image description" width="365" height="195"&gt;&lt;/a&gt;&lt;br&gt;
df[df['W']&amp;gt;0] keeps only the rows where the value in column W is greater than zero; the entire row is filtered out, not just the value in column W&lt;/p&gt;

&lt;p&gt;df[df['W']&amp;gt;0]&lt;br&gt;
output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q6QIuq8R--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qh6srjfgj848kfc6d5p2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q6QIuq8R--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qh6srjfgj848kfc6d5p2.png" alt="Image description" width="338" height="141"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q6r-o-Vw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/200py1r5c7w0lo8y67ls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q6r-o-Vw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/200py1r5c7w0lo8y67ls.png" alt="Image description" width="261" height="329"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;For two conditions we can use | and &amp;amp; with parentheses. We can't use Python's 'and'/'or', because those operators expect a single True or False value, while comparing a dataframe column produces a whole Series of booleans, so the truth value is ambiguous.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;df[(df['W']&amp;gt;0) &amp;amp; (df['Y'] &amp;gt; 1)]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--92RhZrHC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rk64vhj4rihvl153wajh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--92RhZrHC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rk64vhj4rihvl153wajh.png" alt="Image description" width="315" height="62"&gt;&lt;/a&gt;&lt;/p&gt;
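A minimal sketch of combining conditions with the same seeded df (note each condition must be wrapped in parentheses):

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(),
                  columns='W X Y Z'.split())

# & keeps rows where BOTH conditions hold; | keeps rows where EITHER holds
both = df[(df['W'] > 0) & (df['Y'] > 1)]
either = df[(df['W'] > 0) | (df['Y'] > 1)]
```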

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;More index details:&lt;/em&gt;
&lt;em&gt;Let's discuss some more features of indexing, including resetting the index or setting it to something else&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tCp3LDuP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kykh0k9g6tpkwea1ny4p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tCp3LDuP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/kykh0k9g6tpkwea1ny4p.png" alt="Image description" width="387" height="205"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Pandas reset_index() is a method to reset the index of a dataframe: it sets a list of integers ranging from 0 to the length of the data as the index&lt;/em&gt;. The syntax of this method:&lt;br&gt;
&lt;strong&gt;DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Parameters:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;level: int, string or a list to select and remove passed column from index.&lt;br&gt;
drop: Boolean value, Adds the replaced index column to the data if False.&lt;br&gt;
inplace: Boolean value, make changes in the original data frame itself if True.&lt;br&gt;
col_level: Select in which column level to insert the labels.&lt;br&gt;
col_fill: Object, to determine how the other levels are named.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;df.reset_index()&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iKZoydy7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ex1ofjhngkoew6dkng3g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iKZoydy7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ex1ofjhngkoew6dkng3g.png" alt="Image description" width="382" height="174"&gt;&lt;/a&gt;&lt;br&gt;
newind = 'CA NY WY OR CO'.split()&lt;br&gt;
df['States'] = newind&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--EuqBzxIX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ubr4cchegxgjl2tqdhk7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--EuqBzxIX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ubr4cchegxgjl2tqdhk7.png" alt="Image description" width="389" height="175"&gt;&lt;/a&gt;&lt;br&gt;
df.set_index('States')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uIgVlVlE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0vdgso62gsoobf1qw2s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uIgVlVlE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r0vdgso62gsoobf1qw2s.png" alt="Image description" width="375" height="196"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;But the dataframe 'df' remains unchanged because set_index has the default inplace value of False&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eGfIHR0z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d2giahfwsnjuzqblec9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eGfIHR0z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d2giahfwsnjuzqblec9n.png" alt="Image description" width="378" height="219"&gt;&lt;/a&gt;&lt;br&gt;
With inplace=True the change is applied to df itself:&lt;br&gt;
df.set_index('States',inplace=True)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sN1O-VTU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7dso1573kl5du9n8y0rf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sN1O-VTU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7dso1573kl5du9n8y0rf.png" alt="Image description" width="407" height="219"&gt;&lt;/a&gt;&lt;/p&gt;
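The reset_index / set_index steps shown in the screenshots can be sketched as (reusing the seeded df):

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4), index='A B C D E'.split(),
                  columns='W X Y Z'.split())
df['States'] = 'CA NY WY OR CO'.split()

# reset_index() returns a copy with the old index moved into an 'index' column
reset = df.reset_index()

# set_index() also returns a copy unless inplace=True is passed
df.set_index('States', inplace=True)
```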

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Multi index and index hierarchy&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;outside = ['G1','G1','G1','G2','G2','G2']
inside = [1,2,3,1,2,3]
hier_index = list(zip(outside,inside))
hier_index = pd.MultiIndex.from_tuples(hier_index)
hier_index
MultiIndex(levels=[['G1', 'G2'], [1, 2, 3]],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Yn41KsdJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jes45dm32flkyq19yhdc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Yn41KsdJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/jes45dm32flkyq19yhdc.png" alt="Image description" width="248" height="197"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Now let's show how to index this! For index hierarchy we use df.loc[], if this was on the columns axis, you would just use normal bracket notation df[]. Calling one level of the index returns the sub-dataframe:&lt;/em&gt;&lt;br&gt;
df.loc['G1']&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--63rdIjfn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4q6e2cbliz1807gwgbni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--63rdIjfn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4q6e2cbliz1807gwgbni.png" alt="Image description" width="215" height="127"&gt;&lt;/a&gt;&lt;br&gt;
df.loc['G1'].loc[1]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VyG4o4Hk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ebn8l0y3j5p7f9doe4w1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VyG4o4Hk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ebn8l0y3j5p7f9doe4w1.png" alt="Image description" width="196" height="65"&gt;&lt;/a&gt;&lt;br&gt;
df&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fdFcfNox--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cqhyoqt6o72ecc9wpai1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fdFcfNox--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cqhyoqt6o72ecc9wpai1.png" alt="Image description" width="294" height="224"&gt;&lt;/a&gt;&lt;br&gt;
df.xs('G1')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y9TABUfD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0pbovo7yjhalhfei565l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y9TABUfD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0pbovo7yjhalhfei565l.png" alt="Image description" width="237" height="132"&gt;&lt;/a&gt;&lt;br&gt;
df.xs(['G1',1])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--T7J3yhoz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dez6993pp9trsms7pjc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--T7J3yhoz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6dez6993pp9trsms7pjc.png" alt="Image description" width="244" height="59"&gt;&lt;/a&gt;&lt;br&gt;
to get index 1 from both G1 and G2 we have to name the index level:&lt;br&gt;
df.xs(1,level='Num')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Fd8Ului7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2uryl8gwiecw04jg2ooo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Fd8Ului7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2uryl8gwiecw04jg2ooo.png" alt="Image description" width="230" height="105"&gt;&lt;/a&gt;&lt;/p&gt;
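Putting the multi-index steps together as a runnable sketch (naming the levels 'Group' and 'Num' as in the screenshots, so xs() can refer to them):

```python
import numpy as np
import pandas as pd

outside = ['G1', 'G1', 'G1', 'G2', 'G2', 'G2']
inside = [1, 2, 3, 1, 2, 3]
hier_index = pd.MultiIndex.from_tuples(list(zip(outside, inside)))

np.random.seed(101)
df = pd.DataFrame(np.random.randn(6, 2), index=hier_index, columns=['A', 'B'])
df.index.names = ['Group', 'Num']   # name the levels so xs() can refer to them

g1 = df.loc['G1']              # sub-DataFrame for outer label G1
cross = df.xs(1, level='Num')  # inner index 1 from both G1 and G2
```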

&lt;h2&gt;
  
  
  &lt;strong&gt;3. Missing Data&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let's show a few convenient methods to deal with missing data in pandas:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[1,2,np.nan],
                  'B':[5,np.nan,np.nan],
                  'C':[1,2,3]})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Oa__I0P0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y1mpbt0hn2c39o8ax4d8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Oa__I0P0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y1mpbt0hn2c39o8ax4d8.png" alt="Image description" width="156" height="116"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;dropna:&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Sometimes a csv file has null values, which are displayed as NaN in the dataframe. The dropna() method lets the user analyze and drop rows/columns with null values in different ways.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Syntax -&lt;/strong&gt; DataFrameName.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)&lt;br&gt;
&lt;strong&gt;Parameters&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;axis: takes an int or string value for rows/columns. Input can be 0 or 1 for integer and 'index' or 'columns' for string.&lt;br&gt;
how: takes a string value of two kinds only ('any' or 'all'). 'any' drops the row/column if ANY value is null and 'all' drops only if ALL values are null.&lt;br&gt;
thresh: takes an integer value giving the minimum number of non-NaN values required to keep a row/column.&lt;br&gt;
subset: an array which limits the dropping process to the passed rows/columns.&lt;br&gt;
inplace: a boolean which makes the changes in the dataframe itself if True.&lt;/em&gt;&lt;/p&gt;
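The dropna variants shown in the screenshots can be sketched as:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

no_na_rows = df.dropna()        # keep only rows without any NaN
no_na_cols = df.dropna(axis=1)  # keep only columns without any NaN
thresh2 = df.dropna(thresh=2)   # keep rows with at least 2 non-NaN values
```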

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wn6Rh-k8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/npa6dtgd320s1y7q2jii.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wn6Rh-k8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/npa6dtgd320s1y7q2jii.png" alt="Image description" width="176" height="155"&gt;&lt;/a&gt;&lt;br&gt;
df.dropna()&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dtU2PB86--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/poh7lg9chvk0hz0ols3d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dtU2PB86--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/poh7lg9chvk0hz0ols3d.png" alt="Image description" width="126" height="64"&gt;&lt;/a&gt;&lt;br&gt;
df.dropna(axis=1)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9S0PPVSc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6vzq104bmktslqkwjq7t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9S0PPVSc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6vzq104bmktslqkwjq7t.png" alt="Image description" width="62" height="117"&gt;&lt;/a&gt;&lt;br&gt;
df.dropna(thresh=2)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5Oryj7es--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o2cmz87wanwxu95m8m3g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5Oryj7es--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/o2cmz87wanwxu95m8m3g.png" alt="Image description" width="165" height="85"&gt;&lt;/a&gt;&lt;br&gt;
it removes the row which has two NaN values, since fewer than 2 non-NaN values remain&lt;br&gt;
&lt;strong&gt;fillna&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Just like dropna() manages and removes null values from a dataframe, fillna() manages them by letting the user replace NaN values with a value of their own.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Syntax:&lt;/strong&gt; DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameters:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;value: static, dictionary, array, series or dataframe to fill in instead of NaN.&lt;br&gt;
method: used if the user doesn't pass a value. Pandas has methods like ffill, which propagates the previous valid value forward, and bfill/backfill, which fills with the next valid value.&lt;br&gt;
axis: takes an int or string value for rows/columns. Input can be 0 or 1 for integer and 'index' or 'columns' for string.&lt;br&gt;
inplace: a boolean which makes the changes in the dataframe itself if True.&lt;br&gt;
limit: an integer value which specifies the maximum number of consecutive forward/backward NaN values to fill.&lt;br&gt;
downcast: takes a dict which specifies what dtype to downcast to, like float64 to int64.&lt;br&gt;
**kwargs: any other keyword arguments&lt;/em&gt;&lt;br&gt;
df.fillna(value='FILL VALUE')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--szc3unJG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/20j0gaj5hh50vrxgldzg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--szc3unJG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/20j0gaj5hh50vrxgldzg.png" alt="Image description" width="244" height="110"&gt;&lt;/a&gt;&lt;br&gt;
df['A'].fillna(value=df['A'].mean())&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qhfKxteG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yglv1k301scfzglhtekx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qhfKxteG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yglv1k301scfzglhtekx.png" alt="Image description" width="198" height="80"&gt;&lt;/a&gt;&lt;/p&gt;
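As a sketch with the same small dataframe, both fillna calls from the screenshots:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})

filled = df.fillna(value='FILL VALUE')         # replace every NaN with a constant
mean_a = df['A'].fillna(value=df['A'].mean())  # fill column A's NaN with its mean
```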

&lt;h2&gt;
  
  
  &lt;strong&gt;MERGING, JOINING AND CONCATENATING&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;import pandas as pd&lt;br&gt;
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],&lt;br&gt;
                        'B': ['B0', 'B1', 'B2', 'B3'],&lt;br&gt;
                        'C': ['C0', 'C1', 'C2', 'C3'],&lt;br&gt;
                        'D': ['D0', 'D1', 'D2', 'D3']},&lt;br&gt;
                        index=[0, 1, 2, 3])&lt;br&gt;
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],&lt;br&gt;
                        'B': ['B4', 'B5', 'B6', 'B7'],&lt;br&gt;
                        'C': ['C4', 'C5', 'C6', 'C7'],&lt;br&gt;
                        'D': ['D4', 'D5', 'D6', 'D7']},&lt;br&gt;
                         index=[4, 5, 6, 7]) &lt;br&gt;
df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],&lt;br&gt;
                        'B': ['B8', 'B9', 'B10', 'B11'],&lt;br&gt;
                        'C': ['C8', 'C9', 'C10', 'C11'],&lt;br&gt;
                        'D': ['D8', 'D9', 'D10', 'D11']},&lt;br&gt;
                        index=[8, 9, 10, 11])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sNSS77fs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d39gmutrv1n1jf4r8wv3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sNSS77fs--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d39gmutrv1n1jf4r8wv3.png" alt="Image description" width="191" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XiElvPbc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3wcnmfoha7snd77ftvv1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XiElvPbc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3wcnmfoha7snd77ftvv1.png" alt="Image description" width="188" height="192"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--W9gPdYIx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3vw0xkdzmdgwgqjaov3u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--W9gPdYIx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3vw0xkdzmdgwgqjaov3u.png" alt="Image description" width="213" height="191"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Concatenation&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Concatenation basically glues together DataFrames. Keep in mind that dimensions should match along the axis you are concatenating on. You can use pd.concat and pass in a list of DataFrames to concatenate together.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;The pandas.concat() function does all the heavy lifting of performing concatenation operations along an axis of Pandas objects while performing optional set logic (union or intersection) of the indexes (if any) on the other axes.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Syntax: concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;objs: Series or DataFrame objects&lt;br&gt;
axis: axis to concatenate along; default = 0&lt;br&gt;
join: way to handle indexes on other axis; default = ‘outer’&lt;br&gt;
ignore_index: if True, do not use the index values along the concatenation axis; default = False&lt;br&gt;
keys: sequence to add an identifier to the result indexes; default = None&lt;br&gt;
levels: specific levels (unique values) to use for constructing a MultiIndex; default = None&lt;br&gt;
names: names for the levels in the resulting hierarchical index; default = None&lt;br&gt;
verify_integrity: check whether the new concatenated axis contains duplicates; default = False&lt;br&gt;
sort: sort non-concatenation axis if it is not already aligned when join is ‘outer’; default = False&lt;br&gt;
copy: if False, do not copy data unnecessarily; default = True&lt;/em&gt;&lt;br&gt;
pd.concat([df1,df2,df3])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vueaDt52--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gj90un4bokxaoh1fo4ig.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vueaDt52--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gj90un4bokxaoh1fo4ig.png" alt="Image description" width="230" height="346"&gt;&lt;/a&gt;&lt;br&gt;
pd.concat([df1,df2,df3],axis=1)&lt;/p&gt;
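&lt;p&gt;&lt;em&gt;Two other concat parameters worth knowing, keys and ignore_index, can be sketched with small throwaway frames:&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

# keys= builds a MultiIndex recording which frame each row came from
combined = pd.concat([df1, df2], keys=['first', 'second'])

# ignore_index=True discards the original indexes and renumbers 0..n-1
renumbered = pd.concat([df1, df2], ignore_index=True)
```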

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qbpnZa13--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4cig7gyuimn1du3l81sg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qbpnZa13--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4cig7gyuimn1du3l81sg.png" alt="Image description" width="554" height="382"&gt;&lt;/a&gt;&lt;br&gt;
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],&lt;br&gt;
                     'A': ['A0', 'A1', 'A2', 'A3'],&lt;br&gt;
                     'B': ['B0', 'B1', 'B2', 'B3']})&lt;/p&gt;

&lt;p&gt;right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],&lt;br&gt;
                          'C': ['C0', 'C1', 'C2', 'C3'],&lt;br&gt;
                          'D': ['D0', 'D1', 'D2', 'D3']})    &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4gAvslZ8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1avnyqux9k0029s8ee23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4gAvslZ8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1avnyqux9k0029s8ee23.png" alt="Image description" width="148" height="175"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IpANumJv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fdzyjphimxz9ygjggsfm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IpANumJv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fdzyjphimxz9ygjggsfm.png" alt="Image description" width="169" height="174"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;MERGING&lt;/strong&gt;&lt;br&gt;
The merge function allows you to merge DataFrames together using a similar logic as merging SQL Tables together. For example:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xT8IZUm2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r6f580zilpo7wu2qkiw9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xT8IZUm2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/r6f580zilpo7wu2qkiw9.png" alt="Image description" width="216" height="147"&gt;&lt;/a&gt;&lt;br&gt;
left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],&lt;br&gt;
                     'key2': ['K0', 'K1', 'K0', 'K1'],&lt;br&gt;
                        'A': ['A0', 'A1', 'A2', 'A3'],&lt;br&gt;
                        'B': ['B0', 'B1', 'B2', 'B3']})&lt;/p&gt;

&lt;p&gt;right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],&lt;br&gt;
                               'key2': ['K0', 'K0', 'K0', 'K0'],&lt;br&gt;
                                  'C': ['C0', 'C1', 'C2', 'C3'],&lt;br&gt;
                                  'D': ['D0', 'D1', 'D2', 'D3']})&lt;br&gt;
pd.merge(left, right, on=['key1', 'key2'])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Zu6If96w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2huop524jjmx737mm9i5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Zu6If96w--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2huop524jjmx737mm9i5.png" alt="Image description" width="267" height="116"&gt;&lt;/a&gt;&lt;br&gt;
pd.merge(left, right, how='outer', on=['key1', 'key2'])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lt06_zDN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tpgfkzpduqf61qygz4cj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lt06_zDN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tpgfkzpduqf61qygz4cj.png" alt="Image description" width="290" height="180"&gt;&lt;/a&gt;&lt;br&gt;
pd.merge(left, right, how='right', on=['key1', 'key2'])&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YyvH1UfV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9leh23o0s1ynf089uvq2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YyvH1UfV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9leh23o0s1ynf089uvq2.png" alt="Image description" width="271" height="176"&gt;&lt;/a&gt;&lt;br&gt;
pd.merge(left, right, how='left', on=['key1', 'key2'])&lt;/p&gt;
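&lt;p&gt;&lt;em&gt;One more useful option: indicator=True adds a _merge column recording each row's origin. A small sketch with toy frames (not the ones above):&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
right = pd.DataFrame({'key': ['K1', 'K2', 'K3'], 'B': ['B1', 'B2', 'B3']})

# Outer merge keeps keys from both sides; _merge shows left_only/right_only/both
result = pd.merge(left, right, how='outer', on='key', indicator=True)
```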

&lt;p&gt;&lt;strong&gt;JOINING&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Joining is a convenient method for combining the columns of two potentially differently-indexed DataFrames into a single result DataFrame.&lt;/em&gt;&lt;br&gt;
left = pd.DataFrame({'A': ['A0', 'A1', 'A2'],&lt;br&gt;
                     'B': ['B0', 'B1', 'B2']},&lt;br&gt;
                      index=['K0', 'K1', 'K2']) &lt;/p&gt;

&lt;p&gt;right = pd.DataFrame({'C': ['C0', 'C2', 'C3'],&lt;br&gt;
                    'D': ['D0', 'D2', 'D3']},&lt;br&gt;
                      index=['K0', 'K2', 'K3'])&lt;br&gt;
left.join(right)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lzds9c-m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s432gq79yodwx9hj3r3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lzds9c-m--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s432gq79yodwx9hj3r3e.png" alt="Image description" width="205" height="116"&gt;&lt;/a&gt;&lt;br&gt;
left.join(right, how='outer')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HDEHIAYc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/guy2zvgchueu9mmicxqm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HDEHIAYc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/guy2zvgchueu9mmicxqm.png" alt="Image description" width="201" height="139"&gt;&lt;/a&gt;&lt;/p&gt;
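&lt;p&gt;&lt;em&gt;join also accepts how='inner', which keeps only the index labels present in both frames. A minimal sketch:&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=['K0', 'K1', 'K2'])
right = pd.DataFrame({'B': ['B0', 'B2', 'B3']}, index=['K0', 'K2', 'K3'])

# Only K0 and K2 appear in both indexes
inner = left.join(right, how='inner')
```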

&lt;h2&gt;
  
  
  &lt;strong&gt;OPERATION&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;There are lots of operations with pandas that will be really useful to you, but don't fall into any distinct category&lt;/em&gt;&lt;br&gt;
import pandas as pd&lt;br&gt;
df = pd.DataFrame({'col1':[1,2,3,4],'col2':[444,555,666,444],'col3':['abc','def','ghi','xyz']})&lt;br&gt;
df.head()&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZKyM4m9X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bwwkga14ewq8amhfon57.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZKyM4m9X--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bwwkga14ewq8amhfon57.png" alt="Image description" width="151" height="130"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Info on Unique Values:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;The unique function in pandas is used to find the unique values from a series. A series is a single column of a data frame. We can use the unique function on any possible set of elements in Python. It can be used on a series of strings, integers, tuples, or mixed elements.&lt;/em&gt;&lt;br&gt;
df['col2'].unique()&lt;br&gt;
output: array([444, 555, 666])&lt;br&gt;
&lt;strong&gt;pandas.DataFrame.nunique&lt;/strong&gt;&lt;br&gt;
DataFrame.nunique(axis=0, dropna=True)&lt;br&gt;
Count the number of distinct elements along the specified axis.&lt;/p&gt;

&lt;p&gt;Return Series with number of distinct elements. Can ignore NaN values.&lt;/p&gt;

&lt;p&gt;Parameters&lt;br&gt;
axis : {0 or ‘index’, 1 or ‘columns’}, default 0&lt;br&gt;
The axis to use: 0 or ‘index’ for row-wise, 1 or ‘columns’ for column-wise.&lt;/p&gt;

&lt;p&gt;dropna : bool, default True&lt;br&gt;
Don’t include NaN in the counts.&lt;/p&gt;
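&lt;p&gt;&lt;em&gt;unique, nunique and the related value_counts can be sketched together on the toy frame above:&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col2': [444, 555, 666, 444]})

uniques = df['col2'].unique()        # distinct values, in order of appearance
n = df['col2'].nunique()             # number of distinct values
counts = df['col2'].value_counts()   # frequency of each value
```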

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pxP-dnCm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8y19mrdgdkevgqwvrbib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pxP-dnCm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8y19mrdgdkevgqwvrbib.png" alt="Image description" width="220" height="73"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YcHcEURW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ea6p2f91nrmp1jb3yr30.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YcHcEURW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ea6p2f91nrmp1jb3yr30.png" alt="Image description" width="252" height="139"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;SELECTING DATA&lt;/strong&gt;&lt;br&gt;
newdf = df[(df['col1']&amp;gt;2) &amp;amp; (df['col2']==444)]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VgLsXl_f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lza6urm4lc3q1mcrn2q7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VgLsXl_f--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lza6urm4lc3q1mcrn2q7.png" alt="Image description" width="209" height="101"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;It returns the rows that fulfill both of the criteria above.&lt;/em&gt;&lt;/p&gt;
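&lt;p&gt;&lt;em&gt;The conditional selection from the screenshot, written out on the same toy columns:&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [444, 555, 666, 444]})

# Combine conditions with & (and) or | (or); the parentheses are required
newdf = df[(df['col1'] > 2) & (df['col2'] == 444)]
```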

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TvWcWGS9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s110erf0q76j5dzdbze2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TvWcWGS9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s110erf0q76j5dzdbze2.png" alt="Image description" width="419" height="147"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Applying function&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;def times2(x):&lt;br&gt;
    return x*2&lt;br&gt;
df['col1'].apply(times2)&lt;/p&gt;
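&lt;p&gt;&lt;em&gt;apply also works with lambdas and builtins, so the times2 helper can be written inline (a quick sketch):&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4],
                   'col3': ['abc', 'def', 'ghi', 'xyz']})

doubled = df['col1'].apply(lambda x: x * 2)  # same effect as times2
lengths = df['col3'].apply(len)              # builtins work too
```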

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KsSK1r6y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4fmt3a4slxu8lnhf2ro6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KsSK1r6y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4fmt3a4slxu8lnhf2ro6.png" alt="Image description" width="232" height="92"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rHhCySXk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ulaycghhlng989uo6cyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rHhCySXk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ulaycghhlng989uo6cyh.png" alt="Image description" width="278" height="216"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Permanently Removing a Column&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
del df['col1']&lt;/p&gt;
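&lt;p&gt;&lt;em&gt;A gentler alternative to del is drop, which returns a new frame and leaves the original untouched unless inplace=True is passed (a quick sketch):&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# axis=1 means drop a column rather than a row label
without = df.drop('col1', axis=1)
```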

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1A1Y6nXT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6siphh0o52208coxm3ug.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1A1Y6nXT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6siphh0o52208coxm3ug.png" alt="Image description" width="173" height="185"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Get column and index names:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kq1EoyjN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n7496mjukm87ao6lnhom.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kq1EoyjN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n7496mjukm87ao6lnhom.png" alt="Image description" width="344" height="143"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Sorting and Ordering a DataFrame:&lt;/strong&gt;&lt;/p&gt;
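&lt;p&gt;&lt;em&gt;The sorting shown in the screenshot boils down to sort_values (a quick sketch on a toy frame):&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [444, 555, 666, 444]})

# Sort by a column; each row keeps its original index label
by_col2 = df.sort_values(by='col2')
descending = df.sort_values(by='col2', ascending=False)
```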

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WdMEuGbq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zrrouf0amlu6l4l4s622.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WdMEuGbq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zrrouf0amlu6l4l4s622.png" alt="Image description" width="464" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Find Null Values or Check for Null Values&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qM4FLGGh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eydp2vie5rew7d8hil33.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qM4FLGGh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eydp2vie5rew7d8hil33.png" alt="Image description" width="168" height="169"&gt;&lt;/a&gt;&lt;/p&gt;
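&lt;p&gt;&lt;em&gt;The null checks from the screenshots, written out: isnull marks the NaNs and dropna removes incomplete rows (a toy frame):&lt;/em&gt;&lt;/p&gt;

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, np.nan], 'col2': [np.nan, 555, 666]})

mask = df.isnull()       # boolean frame marking the NaNs
complete = df.dropna()   # drop every row containing any NaN
```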

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--etxqV3fm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w69y9jpw6ij6zeh9n6tt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--etxqV3fm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/w69y9jpw6ij6zeh9n6tt.png" alt="Image description" width="237" height="184"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Filling in NaN values with something else:&lt;/strong&gt;&lt;br&gt;
import numpy as np&lt;br&gt;
df = pd.DataFrame({'col1':[1,2,3,np.nan],&lt;br&gt;
                   'col2':[np.nan,555,666,444],&lt;br&gt;
                   'col3':['abc','def','ghi','xyz']})&lt;br&gt;
df.head()&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Sz9BXOfC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9jayshktkzib3apk9fyy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Sz9BXOfC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9jayshktkzib3apk9fyy.png" alt="Image description" width="187" height="141"&gt;&lt;/a&gt;&lt;br&gt;
df.fillna('FILL')&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--F89TQ1VK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g7wrr24ep2s2wsd5udki.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--F89TQ1VK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g7wrr24ep2s2wsd5udki.png" alt="Image description" width="167" height="138"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;data = {'A':['foo','foo','foo','bar','bar','bar'],&lt;br&gt;
     'B':['one','one','two','two','one','one'],&lt;br&gt;
       'C':['x','y','x','y','x','y'],&lt;br&gt;
       'D':[1,3,2,5,4,1]}&lt;/p&gt;

&lt;p&gt;df = pd.DataFrame(data)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--blCHZUz9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0121ia9i5wnnzuqizqlp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--blCHZUz9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0121ia9i5wnnzuqizqlp.png" alt="Image description" width="185" height="224"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;PIVOT_TABLE&lt;/strong&gt;&lt;br&gt;
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')&lt;br&gt;
&lt;strong&gt;PARAMETERS&lt;/strong&gt;&lt;br&gt;
data : DataFrame&lt;br&gt;
values : column to aggregate, optional&lt;br&gt;
index: column, Grouper, array, or list of the previous&lt;br&gt;
columns: column, Grouper, array, or list of the previous&lt;/p&gt;

&lt;p&gt;aggfunc: function, list of functions, dict, default numpy.mean&lt;br&gt;
-&amp;gt; If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names.&lt;br&gt;
-&amp;gt; If dict is passed, the key is column to aggregate and value is function or list of functions&lt;/p&gt;

&lt;p&gt;fill_value (scalar, default None): value to replace missing values with&lt;br&gt;
margins (boolean, default False): add all rows / columns (e.g. for subtotals / grand totals)&lt;br&gt;
dropna (boolean, default True): do not include columns whose entries are all NaN&lt;br&gt;
margins_name (string, default 'All'): name of the row / column that will contain the totals when margins is True.&lt;/p&gt;

&lt;p&gt;Returns: DataFrame&lt;/p&gt;
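&lt;p&gt;&lt;em&gt;A runnable sketch of pivot_table on the data dictionary defined above, with the default aggfunc='mean' and fill_value=0 for the missing combination:&lt;/em&gt;&lt;/p&gt;

```python
import pandas as pd

data = {'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
        'B': ['one', 'one', 'two', 'two', 'one', 'one'],
        'C': ['x', 'y', 'x', 'y', 'x', 'y'],
        'D': [1, 3, 2, 5, 4, 1]}
df = pd.DataFrame(data)

# Rows indexed by (A, B), one column per value of C, mean of D in each cell
table = df.pivot_table(values='D', index=['A', 'B'], columns=['C'], fill_value=0)
```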

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--macmNoGX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vqr827vnml0wyq65pxv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--macmNoGX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vqr827vnml0wyq65pxv9.png" alt="Image description" width="481" height="213"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;DATA INPUT AND OUTPUT&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CSV&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;A simple way to store big data sets is to use CSV files (comma separated values).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;CSV files contain plain text and are a well-known format that can be read by everyone, including Pandas.&lt;/em&gt;&lt;/p&gt;
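&lt;p&gt;&lt;em&gt;A minimal round trip with to_csv and read_csv (the temp-file path here is just for illustration):&lt;/em&gt;&lt;/p&gt;

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

path = os.path.join(tempfile.mkdtemp(), 'example.csv')
df.to_csv(path, index=False)   # index=False avoids writing the row labels
loaded = pd.read_csv(path)
```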

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--H-mCjoS8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/30pn4b842byzotl68waw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--H-mCjoS8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/30pn4b842byzotl68waw.png" alt="Image description" width="379" height="342"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;EXCEL&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Pandas can read and write Excel files. Keep in mind that this only imports data, not formulas or images; workbooks containing images or macros may cause the read_excel method to crash.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NLwt0kSS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ica249l0zk350pr138av.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NLwt0kSS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ica249l0zk350pr138av.png" alt="Image description" width="484" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>pandas</category>
      <category>beginners</category>
      <category>python</category>
    </item>
    <item>
      <title>Python for data analysis- NumPy (Based on udemy)</title>
      <dc:creator>IftakharRahat</dc:creator>
      <pubDate>Mon, 19 Sep 2022 16:11:56 +0000</pubDate>
      <link>https://dev.to/iftakharrahat/python-for-data-analysis-numpy-based-on-udemy-36b0</link>
      <guid>https://dev.to/iftakharrahat/python-for-data-analysis-numpy-based-on-udemy-36b0</guid>
      <description>&lt;p&gt;&lt;strong&gt;What is Numpy?&lt;/strong&gt;&lt;br&gt;
NumPy (or Numpy) is a Linear Algebra Library for Python; the reason it is so important for Data Science with Python is that almost all of the libraries in the PyData Ecosystem rely on NumPy as one of their main building blocks.&lt;br&gt;
&lt;strong&gt;Using NumPy&lt;/strong&gt;&lt;br&gt;
Once you've installed NumPy you can import it as a library:&lt;/p&gt;

&lt;p&gt;import numpy as np&lt;br&gt;
Numpy has many built-in functions and capabilities. We won't cover them all but instead we will focus on some of the most important aspects of Numpy: vectors,arrays,matrices, and number generation. Let's start by discussing arrays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.Numpy Arrays&lt;/strong&gt;&lt;br&gt;
NumPy arrays are the main way we will use Numpy throughout the course. Numpy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but you should note a matrix can still have only one row or one column).&lt;/p&gt;

&lt;p&gt;Let's begin our introduction by exploring how to create NumPy arrays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating NumPy Arrays&lt;/strong&gt;&lt;br&gt;
From a Python List&lt;br&gt;
We can create an array by directly converting a list or list of lists:&lt;/p&gt;

&lt;p&gt;my_list = [1,2,3]&lt;br&gt;
my_list&lt;br&gt;
[1, 2, 3]&lt;br&gt;
np.array(my_list)&lt;br&gt;
array([1, 2, 3])&lt;br&gt;
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]&lt;br&gt;
my_matrix&lt;br&gt;
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]&lt;br&gt;
np.array(my_matrix)&lt;br&gt;
array([[1, 2, 3],&lt;br&gt;
       [4, 5, 6],&lt;br&gt;
       [7, 8, 9]])&lt;br&gt;
&lt;strong&gt;Built-in Methods&lt;/strong&gt;&lt;br&gt;
There are lots of built-in ways to generate arrays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;arange&lt;/strong&gt;&lt;br&gt;
Return evenly spaced values within a given interval.&lt;/p&gt;

&lt;p&gt;np.arange(0,10)&lt;br&gt;
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])&lt;br&gt;
np.arange(0,11,2)&lt;br&gt;
array([ 0,  2,  4,  6,  8, 10])&lt;br&gt;
&lt;strong&gt;zeros and ones&lt;/strong&gt;&lt;br&gt;
Generate arrays of zeros or ones&lt;/p&gt;

&lt;p&gt;np.zeros(3)&lt;br&gt;
array([ 0.,  0.,  0.])&lt;br&gt;
np.zeros((5,5))&lt;br&gt;
array([[ 0.,  0.,  0.,  0.,  0.],&lt;br&gt;
       [ 0.,  0.,  0.,  0.,  0.],&lt;br&gt;
       [ 0.,  0.,  0.,  0.,  0.],&lt;br&gt;
       [ 0.,  0.,  0.,  0.,  0.],&lt;br&gt;
       [ 0.,  0.,  0.,  0.,  0.]])&lt;br&gt;
np.ones(3)&lt;br&gt;
array([ 1.,  1.,  1.])&lt;br&gt;
np.ones((3,3))&lt;br&gt;
array([[ 1.,  1.,  1.],&lt;br&gt;
       [ 1.,  1.,  1.],&lt;br&gt;
       [ 1.,  1.,  1.]])&lt;br&gt;
&lt;strong&gt;linspace&lt;/strong&gt;&lt;br&gt;
Return evenly spaced numbers over a specified interval.&lt;/p&gt;

&lt;p&gt;np.linspace(0,10,3)&lt;br&gt;
array([  0.,   5.,  10.])&lt;br&gt;
np.linspace(0,10,50)&lt;br&gt;
array([  0.        ,   0.20408163,   0.40816327,   0.6122449 ,&lt;br&gt;
         0.81632653,   1.02040816,   1.2244898 ,   1.42857143,&lt;br&gt;
         1.63265306,   1.83673469,   2.04081633,   2.24489796,&lt;br&gt;
         2.44897959,   2.65306122,   2.85714286,   3.06122449,&lt;br&gt;
         3.26530612,   3.46938776,   3.67346939,   3.87755102,&lt;br&gt;
         4.08163265,   4.28571429,   4.48979592,   4.69387755,&lt;br&gt;
         4.89795918,   5.10204082,   5.30612245,   5.51020408,&lt;br&gt;
         5.71428571,   5.91836735,   6.12244898,   6.32653061,&lt;br&gt;
         6.53061224,   6.73469388,   6.93877551,   7.14285714,&lt;br&gt;
         7.34693878,   7.55102041,   7.75510204,   7.95918367,&lt;br&gt;
         8.16326531,   8.36734694,   8.57142857,   8.7755102 ,&lt;br&gt;
         8.97959184,   9.18367347,   9.3877551 ,   9.59183673,&lt;br&gt;
         9.79591837,  10.        ])&lt;br&gt;
&lt;strong&gt;eye&lt;/strong&gt;&lt;br&gt;
Creates an identity matrix&lt;/p&gt;

&lt;p&gt;np.eye(4)&lt;br&gt;
array([[ 1.,  0.,  0.,  0.],&lt;br&gt;
       [ 0.,  1.,  0.,  0.],&lt;br&gt;
       [ 0.,  0.,  1.,  0.],&lt;br&gt;
       [ 0.,  0.,  0.,  1.]])&lt;br&gt;
&lt;strong&gt;Random&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Numpy also has lots of ways to create random number arrays:&lt;/strong&gt;&lt;/p&gt;
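&lt;p&gt;&lt;em&gt;One practical note before the individual helpers: seeding makes random output reproducible across runs. The newer Generator API (np.random.default_rng, available since NumPy 1.17) is sketched below as an alternative to the np.random.* functions used in this article:&lt;/em&gt;&lt;/p&gt;

```python
import numpy as np

# Seeding the generator makes the "random" numbers reproducible
rng = np.random.default_rng(42)

sample = rng.random((2, 2))      # uniform over [0, 1), like np.random.rand
normal = rng.standard_normal(3)  # standard normal draws, like np.random.randn
```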

&lt;p&gt;&lt;strong&gt;rand&lt;/strong&gt;&lt;br&gt;
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).&lt;/p&gt;

&lt;p&gt;np.random.rand(2)&lt;br&gt;
array([ 0.11570539,  0.35279769])&lt;br&gt;
np.random.rand(5,5)&lt;br&gt;
array([[ 0.66660768,  0.87589888,  0.12421056,  0.65074126,  0.60260888],&lt;br&gt;
       [ 0.70027668,  0.85572434,  0.8464595 ,  0.2735416 ,  0.10955384],&lt;br&gt;
       [ 0.0670566 ,  0.83267738,  0.9082729 ,  0.58249129,  0.12305748],&lt;br&gt;
       [ 0.27948423,  0.66422017,  0.95639833,  0.34238788,  0.9578872 ],&lt;br&gt;
       [ 0.72155386,  0.3035422 ,  0.85249683,  0.30414307,  0.79718816]])&lt;br&gt;
&lt;strong&gt;randn&lt;/strong&gt;&lt;br&gt;
Return a sample (or samples) from the "standard normal" distribution, unlike rand which is uniform:&lt;/p&gt;

&lt;p&gt;np.random.randn(2)&lt;br&gt;
array([-0.27954018,  0.90078368])&lt;br&gt;
np.random.randn(5,5)&lt;br&gt;
array([[ 0.70154515,  0.22441999,  1.33563186,  0.82872577, -0.28247509],&lt;br&gt;
       [ 0.64489788,  0.61815094, -0.81693168, -0.30102424, -0.29030574],&lt;br&gt;
       [ 0.8695976 ,  0.413755  ,  2.20047208,  0.17955692, -0.82159344],&lt;br&gt;
       [ 0.59264235,  1.29869894, -1.18870241,  0.11590888, -0.09181687],&lt;br&gt;
       [-0.96924265, -1.62888685, -2.05787102, -0.29705576,  0.68915542]])&lt;br&gt;
&lt;strong&gt;randint&lt;/strong&gt;&lt;br&gt;
Return random integers from low (inclusive) to high (exclusive).&lt;/p&gt;
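A quick aside that applies to rand, randn, and randint alike: each call draws fresh values, so the outputs shown below will differ run to run. Seeding the generator makes results repeatable; a minimal sketch:

```python
import numpy as np

np.random.seed(42)                 # fix the seed so every run draws the same values
u = np.random.rand(3)              # uniform samples over [0, 1)
n = np.random.randn(3)             # standard normal samples
k = np.random.randint(1, 100, 3)   # integers in [1, 100)

print(u, n, k)
```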

&lt;p&gt;np.random.randint(1,100)&lt;br&gt;
44&lt;br&gt;
np.random.randint(1,100,10)&lt;br&gt;
array([13, 64, 27, 63, 46, 68, 92, 10, 58, 24])&lt;br&gt;
&lt;strong&gt;Array Attributes and Methods&lt;/strong&gt;&lt;br&gt;
Let's discuss some useful attributes and methods of an array:&lt;/p&gt;

&lt;p&gt;arr = np.arange(25)&lt;br&gt;
ranarr = np.random.randint(0,50,10)&lt;br&gt;
arr&lt;br&gt;
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,&lt;br&gt;
       17, 18, 19, 20, 21, 22, 23, 24])&lt;br&gt;
&lt;strong&gt;ranarr&lt;/strong&gt;&lt;br&gt;
array([10, 12, 41, 17, 49,  2, 46,  3, 19, 39])&lt;br&gt;
&lt;strong&gt;Reshape&lt;/strong&gt;&lt;br&gt;
Returns an array containing the same data with a new shape.&lt;/p&gt;

&lt;p&gt;arr.reshape(5,5)&lt;br&gt;
array([[ 0,  1,  2,  3,  4],&lt;br&gt;
       [ 5,  6,  7,  8,  9],&lt;br&gt;
       [10, 11, 12, 13, 14],&lt;br&gt;
       [15, 16, 17, 18, 19],&lt;br&gt;
       [20, 21, 22, 23, 24]])&lt;br&gt;
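One constraint worth knowing: reshape only works when the new shape holds exactly the same number of elements, and you can pass -1 to let NumPy infer one dimension. A minimal sketch:

```python
import numpy as np

arr = np.arange(25)

# total size must match: 5 * 5 == 25
mat = arr.reshape(5, 5)

# -1 lets NumPy infer the missing dimension from the total size
same = arr.reshape(5, -1)

print(mat.shape, same.shape)
```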
&lt;strong&gt;max,min,argmax,argmin&lt;/strong&gt;&lt;br&gt;
These are useful methods for finding the max or min values, or their index locations using argmax and argmin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ranarr&lt;/strong&gt;&lt;br&gt;
array([10, 12, 41, 17, 49,  2, 46,  3, 19, 39])&lt;br&gt;
ranarr.max()&lt;br&gt;
49&lt;br&gt;
ranarr.argmax()&lt;br&gt;
4&lt;br&gt;
ranarr.min()&lt;br&gt;
2&lt;br&gt;
ranarr.argmin()&lt;br&gt;
5&lt;br&gt;
&lt;strong&gt;Shape&lt;/strong&gt;&lt;br&gt;
Shape is an attribute that arrays have (not a method):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;# Vector&lt;/strong&gt;&lt;br&gt;
arr.shape&lt;br&gt;
(25,)&lt;/p&gt;

&lt;h1&gt;
  
  
  Notice the two sets of brackets
&lt;/h1&gt;

&lt;p&gt;arr.reshape(1,25)&lt;br&gt;
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,&lt;br&gt;
        17, 18, 19, 20, 21, 22, 23, 24]])&lt;br&gt;
arr.reshape(1,25).shape&lt;br&gt;
(1, 25)&lt;br&gt;
arr.reshape(25,1)&lt;br&gt;
array([[ 0],&lt;br&gt;
       [ 1],&lt;br&gt;
       [ 2],&lt;br&gt;
       [ 3],&lt;br&gt;
       [ 4],&lt;br&gt;
       [ 5],&lt;br&gt;
       [ 6],&lt;br&gt;
       [ 7],&lt;br&gt;
       [ 8],&lt;br&gt;
       [ 9],&lt;br&gt;
       [10],&lt;br&gt;
       [11],&lt;br&gt;
       [12],&lt;br&gt;
       [13],&lt;br&gt;
       [14],&lt;br&gt;
       [15],&lt;br&gt;
       [16],&lt;br&gt;
       [17],&lt;br&gt;
       [18],&lt;br&gt;
       [19],&lt;br&gt;
       [20],&lt;br&gt;
       [21],&lt;br&gt;
       [22],&lt;br&gt;
       [23],&lt;br&gt;
       [24]])&lt;br&gt;
arr.reshape(25,1).shape&lt;br&gt;
(25, 1)&lt;br&gt;
&lt;strong&gt;dtype&lt;/strong&gt;&lt;br&gt;
You can also grab the data type of the object in the array:&lt;/p&gt;

&lt;p&gt;arr.dtype&lt;br&gt;
dtype('int64')&lt;/p&gt;
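Beyond reading the dtype, you can also choose it at creation time or convert an existing array with astype. A minimal sketch (note the default integer dtype can vary by platform, so int64 above is not guaranteed everywhere):

```python
import numpy as np

a = np.array([1, 2, 3])                    # integer dtype inferred from the data
b = np.array([1, 2, 3], dtype=np.float64)  # dtype requested explicitly
c = a.astype(np.float64)                   # convert an existing array

print(a.dtype, b.dtype, c.dtype)
```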

&lt;p&gt;&lt;strong&gt;2. NumPy Array Indexing and Selection&lt;/strong&gt;&lt;br&gt;
In this section we will discuss how to select elements or groups of elements from an array.&lt;/p&gt;

&lt;p&gt;import numpy as np&lt;/p&gt;

&lt;h1&gt;
  
  
  Creating sample array
&lt;/h1&gt;

&lt;p&gt;arr = np.arange(0,11)&lt;/p&gt;

&lt;h1&gt;
  
  
  Show
&lt;/h1&gt;

&lt;p&gt;arr&lt;br&gt;
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])&lt;br&gt;
&lt;strong&gt;Bracket Indexing and Selection&lt;/strong&gt;&lt;br&gt;
The simplest way to pick one or more elements of an array looks very similar to indexing a Python list:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#Get a value at an index&lt;/strong&gt;&lt;br&gt;
arr[8]&lt;br&gt;
8&lt;/p&gt;

&lt;h1&gt;
  
  
  Get values in a range
&lt;/h1&gt;

&lt;p&gt;arr[1:5]&lt;br&gt;
array([1, 2, 3, 4])&lt;/p&gt;

&lt;h1&gt;
  
  
  Get values in a range
&lt;/h1&gt;

&lt;p&gt;arr[0:5]&lt;br&gt;
array([0, 1, 2, 3, 4])&lt;br&gt;
&lt;strong&gt;Broadcasting&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;NumPy arrays differ from normal Python lists in their ability to broadcast:&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Setting a value with index range (Broadcasting)
&lt;/h1&gt;

&lt;p&gt;arr[0:5]=100&lt;/p&gt;

&lt;h1&gt;
  
  
  Show
&lt;/h1&gt;

&lt;p&gt;arr&lt;br&gt;
array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])&lt;/p&gt;

&lt;h1&gt;
  
  
Reset array; we'll see why we had to reset in a moment
&lt;/h1&gt;

&lt;p&gt;arr = np.arange(0,11)&lt;/p&gt;

&lt;h1&gt;
  
  
  Show
&lt;/h1&gt;

&lt;p&gt;arr&lt;br&gt;
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])&lt;/p&gt;

&lt;h1&gt;
  
  
  Important notes on Slices
&lt;/h1&gt;

&lt;p&gt;slice_of_arr = arr[0:6]&lt;/p&gt;

&lt;h1&gt;
  
  
  Show slice
&lt;/h1&gt;

&lt;p&gt;slice_of_arr&lt;br&gt;
array([0, 1, 2, 3, 4, 5])&lt;/p&gt;

&lt;h1&gt;
  
  
  Change Slice
&lt;/h1&gt;

&lt;p&gt;slice_of_arr[:]=99&lt;/p&gt;

&lt;h1&gt;
  
  
  Show Slice again
&lt;/h1&gt;

&lt;p&gt;slice_of_arr&lt;br&gt;
array([99, 99, 99, 99, 99, 99])&lt;br&gt;
Now note the changes also occur in our original array!&lt;/p&gt;

&lt;p&gt;arr&lt;br&gt;
array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])&lt;br&gt;
Data is not copied, it's a view of the original array! This avoids memory problems!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;#To get a copy, you need to be explicit&lt;/em&gt;&lt;br&gt;
arr_copy = arr.copy()&lt;br&gt;
arr_copy&lt;br&gt;
array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Indexing a 2D array (matrices)&lt;/strong&gt;&lt;br&gt;
The general format is arr_2d[row][col] or arr_2d[row,col]. I recommend usually using the comma notation for clarity.&lt;/p&gt;

&lt;p&gt;arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))&lt;/p&gt;

&lt;h1&gt;
  
  
  Show
&lt;/h1&gt;

&lt;p&gt;arr_2d&lt;br&gt;
array([[ 5, 10, 15],&lt;br&gt;
       [20, 25, 30],&lt;br&gt;
       [35, 40, 45]])&lt;/p&gt;

&lt;h1&gt;
  
  
  Indexing row
&lt;/h1&gt;

&lt;p&gt;arr_2d[1]&lt;br&gt;
array([20, 25, 30])&lt;/p&gt;

&lt;h1&gt;
  
  
  Format is arr_2d[row][col] or arr_2d[row,col]
&lt;/h1&gt;


&lt;h1&gt;
  
  
  Getting individual element value
&lt;/h1&gt;

&lt;p&gt;arr_2d[1][0]&lt;br&gt;
20&lt;/p&gt;

&lt;h1&gt;
  
  
  Getting individual element value
&lt;/h1&gt;

&lt;p&gt;arr_2d[1,0]&lt;br&gt;
20&lt;/p&gt;

&lt;h1&gt;
  
  
  2D array slicing
&lt;/h1&gt;


&lt;h1&gt;
  
  
  Shape (2,2) from top right corner
&lt;/h1&gt;

&lt;p&gt;arr_2d[:2,1:]&lt;br&gt;
array([[10, 15],&lt;br&gt;
       [25, 30]])&lt;/p&gt;

&lt;h1&gt;
  
  
  Shape bottom row
&lt;/h1&gt;

&lt;p&gt;arr_2d[2]&lt;br&gt;
array([35, 40, 45])&lt;/p&gt;

&lt;h1&gt;
  
  
  Shape bottom row
&lt;/h1&gt;

&lt;p&gt;arr_2d[2,:]&lt;br&gt;
array([35, 40, 45])&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fancy Indexing&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Fancy indexing allows you to select entire rows or columns out of order. To show this, let's quickly build out a NumPy array:&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Set up matrix
&lt;/h1&gt;

&lt;p&gt;arr2d = np.zeros((10,10))&lt;/p&gt;

&lt;h1&gt;
  
  
  Length of array
&lt;/h1&gt;

&lt;p&gt;arr_length = arr2d.shape[1]&lt;/p&gt;

&lt;h1&gt;
  
  
  Set up array
&lt;/h1&gt;

&lt;p&gt;for i in range(arr_length):&lt;br&gt;
    arr2d[i] = i&lt;/p&gt;

&lt;p&gt;arr2d&lt;br&gt;
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],&lt;br&gt;
       [ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.],&lt;br&gt;
       [ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.],&lt;br&gt;
       [ 3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.],&lt;br&gt;
       [ 4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.],&lt;br&gt;
       [ 5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.,  5.],&lt;br&gt;
       [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],&lt;br&gt;
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.],&lt;br&gt;
       [ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.],&lt;br&gt;
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.]])&lt;br&gt;
Fancy indexing allows the following&lt;/p&gt;

&lt;p&gt;arr2d[[2,4,6,8]]&lt;br&gt;
array([[ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.],&lt;br&gt;
       [ 4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.],&lt;br&gt;
       [ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],&lt;br&gt;
       [ 8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8.]])&lt;/p&gt;

&lt;h1&gt;
  
  
  Allows in any order
&lt;/h1&gt;

&lt;p&gt;arr2d[[6,4,2,7]]&lt;br&gt;
array([[ 6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.,  6.],&lt;br&gt;
       [ 4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.,  4.],&lt;br&gt;
       [ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.],&lt;br&gt;
       [ 7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.,  7.]])&lt;br&gt;
&lt;strong&gt;More Indexing Help&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Indexing a 2D matrix can be a bit confusing at first, especially when you start to add in step size. Try a Google image search for "NumPy indexing" to find useful reference diagrams.&lt;/em&gt;&lt;/p&gt;
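Fancy indexing also works on columns, by combining a full-row slice with a list of column indices. A minimal sketch that rebuilds the same 10x10 matrix as above:

```python
import numpy as np

# rebuild the 10x10 matrix where each row i is filled with the value i
arr2d = np.zeros((10, 10))
for i in range(arr2d.shape[0]):
    arr2d[i] = i

# fancy indexing on columns: all rows, columns 1 and 3
cols = arr2d[:, [1, 3]]
print(cols.shape)  # (10, 2)
```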

&lt;p&gt;&lt;strong&gt;Selection&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Let's briefly go over how to use brackets for selection based on comparison operators.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;
arr = np.arange(1,11)&lt;br&gt;
arr&lt;br&gt;
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])&lt;br&gt;
arr &amp;gt; 4&lt;br&gt;
array([False, False, False, False,  True,  True,  True,  True,  True,  True], dtype=bool)&lt;br&gt;
bool_arr = arr&amp;gt;4&lt;br&gt;
bool_arr&lt;br&gt;
array([False, False, False, False,  True,  True,  True,  True,  True,  True], dtype=bool)&lt;br&gt;
arr[bool_arr]&lt;br&gt;
array([ 5,  6,  7,  8,  9, 10])&lt;br&gt;
arr[arr&amp;gt;2]&lt;br&gt;
array([ 3,  4,  5,  6,  7,  8,  9, 10])&lt;br&gt;
x = 2&lt;br&gt;
arr[arr&amp;gt;x]&lt;br&gt;
array([ 3,  4,  5,  6,  7,  8,  9, 10])&lt;/p&gt;
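Conditions can also be combined, but you must use the element-wise operators &amp; and | with parentheses around each comparison, since Python's plain and/or do not work element-wise on arrays. A minimal sketch:

```python
import numpy as np

arr = np.arange(1, 11)

# element-wise AND / OR need & and | plus parentheses around each comparison
both = arr[(arr > 3) & (arr < 8)]
either = arr[(arr < 3) | (arr > 8)]

print(both)    # -> [4 5 6 7]
print(either)  # -> [ 1  2  9 10]
```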

&lt;p&gt;&lt;strong&gt;3. NumPy Operations&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Arithmetic&lt;/strong&gt;&lt;br&gt;
You can easily perform array-with-array arithmetic, or scalar-with-array arithmetic. Let's see some examples:&lt;/p&gt;

&lt;p&gt;import numpy as np&lt;br&gt;
arr = np.arange(0,10)&lt;br&gt;
arr + arr&lt;br&gt;
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])&lt;br&gt;
arr * arr&lt;br&gt;
array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])&lt;br&gt;
arr - arr&lt;br&gt;
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])&lt;/p&gt;

&lt;h1&gt;
  
  
  Warning on division by zero, but not an error!
&lt;/h1&gt;

&lt;h1&gt;
  
  
  Just replaced with nan
&lt;/h1&gt;

&lt;p&gt;arr/arr&lt;br&gt;
/Users/marci/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: RuntimeWarning: invalid value encountered in true_divide&lt;br&gt;
  if __name__ == '__main__':&lt;br&gt;
array([ nan,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.,   1.])&lt;/p&gt;

&lt;h1&gt;
  
  
Also a warning, not an error; the result is infinity instead
&lt;/h1&gt;

&lt;p&gt;1/arr&lt;br&gt;
/Users/marci/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: RuntimeWarning: divide by zero encountered in true_divide&lt;br&gt;
  if __name__ == '__main__':&lt;br&gt;
array([        inf,  1.        ,  0.5       ,  0.33333333,  0.25      ,&lt;br&gt;
        0.2       ,  0.16666667,  0.14285714,  0.125     ,  0.11111111])&lt;br&gt;
arr**3&lt;br&gt;
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])&lt;br&gt;
&lt;strong&gt;Universal Array Functions&lt;/strong&gt;&lt;br&gt;
NumPy comes with many universal array functions (ufuncs), which are essentially vectorized mathematical operations applied element-wise across the array. Let's show some common ones:&lt;/p&gt;

&lt;h1&gt;
  
  
  Taking Square Roots
&lt;/h1&gt;

&lt;p&gt;np.sqrt(arr)&lt;br&gt;
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,&lt;br&gt;
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])&lt;/p&gt;

&lt;h1&gt;
  
  
Calculating the exponential (e^x)
&lt;/h1&gt;

&lt;p&gt;np.exp(arr)&lt;br&gt;
array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,&lt;br&gt;
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,&lt;br&gt;
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,&lt;br&gt;
         8.10308393e+03])&lt;br&gt;
np.max(arr) #same as arr.max()&lt;br&gt;
9&lt;br&gt;
np.sin(arr)&lt;br&gt;
array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,&lt;br&gt;
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])&lt;br&gt;
np.log(arr)&lt;br&gt;
/Users/marci/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: RuntimeWarning: divide by zero encountered in log&lt;br&gt;
  if __name__ == '__main__':&lt;br&gt;
array([       -inf,  0.        ,  0.69314718,  1.09861229,  1.38629436,&lt;br&gt;
        1.60943791,  1.79175947,  1.94591015,  2.07944154,  2.19722458])&lt;/p&gt;
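If the divide-by-zero warnings above are expected (log(0) legitimately yields -inf here), they can be silenced for just one block of code with np.errstate, rather than globally. A minimal sketch:

```python
import numpy as np

arr = np.arange(0, 10)

# suppress the divide-by-zero warning only inside this context manager
with np.errstate(divide='ignore'):
    logs = np.log(arr)

print(logs[:3])  # the first entry is -inf, from log(0)
```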

</description>
    </item>
  </channel>
</rss>
