Today I learned four important Pandas DataFrame operations:
Sorting, Merge, Join, and Concat.
These are essential for organizing and combining data efficiently. Let's explore each one.
🔹 1. Sorting
Definition:
Sorting is used to arrange data in ascending or descending order based on one or more columns.
Types of Sorting:
- Ascending Sort: Small to large (default)
- Descending Sort: Large to small
- By Index Sort: Sort based on index values, using sort_index()
- By Column Sort: Sort based on specific column(s), using sort_values()
Example:
import pandas as pd
df = pd.DataFrame({
'Name': ['Ram', 'Anu', 'Kavi'],
'Age': [25, 22, 30]
})
# Sort by Age in ascending order
sorted_df = df.sort_values(by='Age')
# Sort by Age in descending order
sorted_desc = df.sort_values(by='Age', ascending=False)
print(sorted_df)
print(sorted_desc)
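The other two sort types from the list above can be sketched the same way. This is a minimal illustration, not part of the original example: the extra 'Sara' row and the shuffled index labels are made up so that a tie on Age and an out-of-order index actually show up.

```python
import pandas as pd

# Hypothetical data: a repeated Age (22) and a shuffled index, for illustration only
df = pd.DataFrame(
    {'Name': ['Ram', 'Anu', 'Kavi', 'Sara'], 'Age': [25, 22, 30, 22]},
    index=[3, 1, 2, 0],
)

# Sort by Age ascending, then break ties by Name in descending order
by_two_cols = df.sort_values(by=['Age', 'Name'], ascending=[True, False])

# Sort by the index labels instead of by column values
by_index = df.sort_index()

print(by_two_cols)
print(by_index)
```

Passing a list to ascending lets each column in by have its own direction, which is handy when one column is only a tie-breaker.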
🔹 2. Merge
Definition:
The merge() function combines two DataFrames based on common columns or keys, similar to SQL joins.
Types of Merge:
- Inner Join: Returns only the rows whose keys match in both DataFrames (the default)
- Left Join: Returns all rows from the left DataFrame and the matching rows from the right; unmatched rows get NaN
- Right Join: Returns all rows from the right DataFrame and the matching rows from the left; unmatched rows get NaN
- Outer Join: Returns all rows from both DataFrames, filling missing values with NaN
Example:
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Ram', 'Anu', 'Kavi']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Marks': [85, 90, 88]})
merged_df = pd.merge(df1, df2, on='ID', how='inner')
print(merged_df)
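To see how the other join types differ from the inner merge, here is a small sketch using the same two DataFrames. The variable names left_df and outer_df are only illustrative; the indicator=True option adds a _merge column showing where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Ram', 'Anu', 'Kavi']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Marks': [85, 90, 88]})

# Left join: keep every row of df1; ID 3 has no match, so its Marks is NaN
left_df = pd.merge(df1, df2, on='ID', how='left')

# Outer join: keep rows from both sides; indicator=True adds a '_merge'
# column with 'both', 'left_only', or 'right_only' for each row
outer_df = pd.merge(df1, df2, on='ID', how='outer', indicator=True)

print(left_df)
print(outer_df)
```

Note that Marks becomes a float column in both results, because NaN cannot be stored in a plain integer column.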
🔹 3. Join
Definition:
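join() combines two DataFrames on their index by default. With the on parameter, it matches a key column of the calling DataFrame against the index of the other. Unlike merge(), its default is a left join.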
Types of Join:
- Inner Join
- Left Join
- Right Join
- Outer Join
Example:
df1 = pd.DataFrame({'Name': ['Ram', 'Anu', 'Kavi']}, index=[1, 2, 3])
df2 = pd.DataFrame({'Marks': [85, 90, 88]}, index=[1, 2, 3])
joined_df = df1.join(df2)
print(joined_df)
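In the example above both indexes are identical, so every join type gives the same result. The sketch below changes one index label (4 instead of 3) to make the difference visible, and also shows joining on a key column. The names names, marks, and students are made up for illustration.

```python
import pandas as pd

names = pd.DataFrame({'Name': ['Ram', 'Anu', 'Kavi']}, index=[1, 2, 3])
# Index label 4 has no partner in 'names' (changed on purpose for this sketch)
marks = pd.DataFrame({'Marks': [85, 90, 88]}, index=[1, 2, 4])

# Default is how='left': all rows of 'names' are kept, index 3 gets NaN
left_joined = names.join(marks)

# Inner join: only index labels present in both DataFrames
inner_joined = names.join(marks, how='inner')

# Join a key column of the caller ('ID') against the index of 'marks'
students = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Ram', 'Anu', 'Kavi']})
joined_on_key = students.join(marks, on='ID')

print(left_joined)
print(inner_joined)
print(joined_on_key)
```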
🔹 4. Concat
Definition:
concat() is used to combine two or more DataFrames either vertically (rows) or horizontally (columns).
Types of Concat:
- Vertical Concat (axis=0, the default): Stacks rows on top of each other
- Horizontal Concat (axis=1): Places columns side by side, aligned on the index
Example:
df1 = pd.DataFrame({'Name': ['Ram', 'Anu']})
df2 = pd.DataFrame({'Name': ['Kavi', 'Sara']})
# Combine rows
concat_rows = pd.concat([df1, df2], axis=0)
# Combine columns
df3 = pd.DataFrame({'Marks': [85, 90]})
concat_cols = pd.concat([df1, df3], axis=1)
print(concat_rows)
print(concat_cols)
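Two details of concat() are easy to miss, so here is a short sketch of them. Vertical concat keeps the original index labels (so they can repeat), and horizontal concat aligns rows by index rather than by position. The shifted index on df3 is an assumption made just to show the alignment.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Ram', 'Anu']})
df2 = pd.DataFrame({'Name': ['Kavi', 'Sara']})

# Plain vertical concat keeps each frame's own labels: 0, 1, 0, 1
stacked = pd.concat([df1, df2], axis=0)

# ignore_index=True renumbers the result: 0, 1, 2, 3
renumbered = pd.concat([df1, df2], axis=0, ignore_index=True)

# Horizontal concat aligns on index labels; df3's index is shifted on
# purpose, so labels without a partner are filled with NaN
df3 = pd.DataFrame({'Marks': [85, 90]}, index=[1, 2])
side_by_side = pd.concat([df1, df3], axis=1)

print(stacked)
print(renumbered)
print(side_by_side)
```

If the rows are meant to line up by position, resetting the index on both DataFrames before a horizontal concat is one way to avoid the unexpected NaN values.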
💡 Key Takeaway:
These functions help you manipulate, combine, and organize datasets, which are essential skills for every data analyst working with real-world data!
#Day49 #DataAnalytics #Python #Pandas #Sorting #Join #Merge #Concat #100DaysOfCode #LearningJourney