Today marks a milestone: Day 50 of my learning journey in Data Analytics!
Here's what I learned today in Pandas and Spark:
1. Pivot Table in Pandas
A pivot table summarizes data by grouping rows on one or more keys and aggregating a value column for each group.
import pandas as pd
data = {
'Product': ['A', 'A', 'B', 'B', 'C'],
'Category': ['X', 'Y', 'X', 'Y', 'X'],
'Quantity': [10, 15, 20, 25, 30]
}
df = pd.DataFrame(data)
pivot = df.pivot_table(values='Quantity', index='Product', columns='Category', aggfunc='mean')
print(pivot)
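One thing worth noticing in the example above: product C has no rows in category Y, so that cell has nothing to aggregate. Here is a small sketch of how that shows up and how `fill_value` can replace the gap. The choice of `aggfunc='sum'` and a fill of 0 in the second table is my own assumption for illustration.

```python
import pandas as pd

data = {
    'Product': ['A', 'A', 'B', 'B', 'C'],
    'Category': ['X', 'Y', 'X', 'Y', 'X'],
    'Quantity': [10, 15, 20, 25, 30],
}
df = pd.DataFrame(data)

# Product C never appears with category Y, so that cell is NaN by default.
pivot = df.pivot_table(values='Quantity', index='Product',
                       columns='Category', aggfunc='mean')
print(pivot)

# fill_value replaces missing Product/Category combinations with a chosen value.
pivot_filled = df.pivot_table(values='Quantity', index='Product',
                              columns='Category', aggfunc='sum', fill_value=0)
print(pivot_filled)
```

Whether 0 is the right fill depends on the data: for quantities it usually reads as "nothing sold", but for a mean it could be misleading.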
2. Pandas Series
A Series is a one-dimensional labeled array.
import pandas as pd
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)
3. Using range()
numbers = list(range(1, 6))
print(numbers)
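A short sketch of the two details of `range()` that tripped me up: the stop value is excluded, and an optional third argument sets the step. Passing a `range` straight into `pd.Series` is my own addition here, just to connect it to the previous section.

```python
import pandas as pd

# The stop value (6) is not included.
numbers = list(range(1, 6))

# A third argument sets the step between values.
evens = list(range(0, 10, 2))

# A range can be passed directly to pd.Series without converting to a list.
s_range = pd.Series(range(1, 6))

print(numbers)
print(evens)
print(s_range)
```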
4. Accessing Declared Values
print(s['b']) # Access using label
print(s.iloc[2]) # Access using position (s[2] on a string-labeled index is deprecated)
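Label-based and position-based access can give different answers when the index itself is made of integers. The sketch below uses a hypothetical Series `t` (not part of the examples above) whose integer labels are in reverse order, so `.loc` and `.iloc` visibly disagree.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

by_label = s.loc['b']      # look up the row labeled 'b'
by_position = s.iloc[2]    # look up the third row, whatever its label

# Hypothetical Series with integer labels in reverse order.
t = pd.Series([100, 200, 300], index=[2, 1, 0])

label_zero = t.loc[0]      # the row whose label is 0 (the last row)
position_zero = t.iloc[0]  # the first row (its label is 2)

print(by_label, by_position, label_zero, position_zero)
```

Using `.loc` and `.iloc` explicitly avoids having to remember which rule plain `[]` applies.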
5. Spark
Apache Spark is a distributed engine for large-scale data processing; PySpark is its Python API.
from pyspark.sql import SparkSession
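spark = SparkSession.builder.appName("DataExample").getOrCreate() # Start a session, or reuse an existing one
df = spark.read.csv("data.csv", header=True, inferSchema=True) # First row is the header; column types are inferred
df.show() # Print the first 20 rows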
6. Arithmetic Operations
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])
print(s1 + s2) # Addition
print(s1 * s2) # Multiplication
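The example above works element by element because both Series share the default index 0, 1, 2. Pandas actually aligns on index labels, not on position, which matters as soon as the labels differ. A minimal sketch, with labels I made up for illustration:

```python
import pandas as pd

# Hypothetical labels: only 'b' and 'c' appear in both Series.
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([4, 5, 6], index=['b', 'c', 'd'])

# Labels present on only one side produce NaN.
total = s1 + s2

# .add() with fill_value treats a missing side as 0 instead.
total_filled = s1.add(s2, fill_value=0)

print(total)
print(total_filled)
```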
7. Skip Rows While Reading File
df = pd.read_csv("data.csv", skiprows=2)
print(df.head())
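To see what `skiprows` does without needing a real file, here is a sketch that reads from in-memory text. The file contents are invented for the example. With an integer, the first lines are dropped and the next line becomes the header; with a list, only the listed 0-based line numbers are dropped.

```python
import io
import pandas as pd

# Invented contents: two junk lines before the real header.
raw = (
    "Report generated by export tool\n"
    "Source: hypothetical\n"
    "name,qty\n"
    "apple,3\n"
    "pear,5\n"
)

# skiprows=2 drops the first two lines; 'name,qty' is then read as the header.
skipped = pd.read_csv(io.StringIO(raw), skiprows=2)
print(skipped)

# A list skips specific lines by 0-based line number (here, a note on line 1).
raw_with_note = "name,qty\n(units in kg)\napple,3\npear,5\n"
skipped_one = pd.read_csv(io.StringIO(raw_with_note), skiprows=[1])
print(skipped_one)
```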
8. Slicing
print(s[1:3]) # Slice positions 1 and 2 (the end position is excluded)
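One difference I found easy to miss: a positional slice excludes its end, while a label slice with `.loc` includes it. A quick sketch using the same Series as before:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Positional slice: positions 1 and 2, end position 3 is excluded.
by_position = s.iloc[1:3]

# Label slice: from 'b' through 'd', and 'd' IS included.
by_label = s.loc['b':'d']

print(by_position)
print(by_label)
```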
9. Read & Write Files
CSV
df = pd.read_csv("data.csv")
df.to_csv("output.csv", index=False)
Excel
df = pd.read_excel("data.xlsx")
df.to_excel("output.xlsx", index=False)
JSON
df = pd.read_json("data.json")
df.to_json("output.json", orient='records')
XML
df = pd.read_xml("data.xml")
df.to_xml("output.xml")
Text
df = pd.read_csv("data.txt", sep='\t')
df.to_csv("output.txt", sep='\t', index=False)
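The snippets above assume the input files already exist. As a self-contained check, here is a sketch that writes a small made-up DataFrame to a temporary folder and reads it back, for CSV and JSON only. I left Excel and XML out of the sketch because, as far as I understand, they rely on optional packages (such as openpyxl for .xlsx files and lxml for the default XML parser) that may not be installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up data for the round trip.
df = pd.DataFrame({'Product': ['A', 'B'], 'Quantity': [10, 20]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / 'output.csv'
    json_path = Path(tmp) / 'output.json'

    # index=False keeps the row index out of the CSV file.
    df.to_csv(csv_path, index=False)
    # orient='records' writes a list of {column: value} objects.
    df.to_json(json_path, orient='records')

    from_csv = pd.read_csv(csv_path)
    from_json = pd.read_json(json_path)

# Compare the row contents of what came back with the original.
original_rows = df.to_dict('records')
csv_rows = from_csv.to_dict('records')
json_rows = from_json.to_dict('records')
print(csv_rows == original_rows, json_rows == original_rows)
```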
10. Compression While Saving
df.to_csv("compressed_data.csv.gz", compression='gzip')
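To confirm the file really is compressed and can be read back, here is a small sketch with made-up data written to a temporary folder. Reading works without extra arguments because `read_csv` infers the compression from the `.gz` extension by default.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up data for the round trip.
df = pd.DataFrame({'Product': ['A', 'B', 'C'], 'Quantity': [10, 20, 30]})

with tempfile.TemporaryDirectory() as tmp:
    gz_path = Path(tmp) / 'compressed_data.csv.gz'
    df.to_csv(gz_path, index=False, compression='gzip')

    # gzip files start with the two bytes 0x1f 0x8b.
    with open(gz_path, 'rb') as fh:
        magic = fh.read(2)

    # Compression is inferred from the '.gz' extension when reading.
    restored = pd.read_csv(gz_path)

print(magic == b'\x1f\x8b')
print(restored)
```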
Summary
Today's learning covered:
- Pivot Table creation
- Pandas Series and range
- Accessing values and performing arithmetic
- Working with Spark
- Reading & Writing data in multiple formats
- File compression techniques
Every line of code brings me closer to mastering Data Analytics!