1. CSV (Comma Separated Values)
Explanation:
CSV is a simple text format where each row represents a record, and columns are separated by commas. It’s easy to read and widely supported.
Example:
Name,RegisterNo,Subject,Marks
Alice,101,Math,95
Bob,102,Science,88
Charlie,103,English,92
Use case: Quick data exchange, spreadsheets, basic analytics.
2. SQL (Relational Table Format)
Explanation:
SQL stores data in structured tables with rows and columns. You can query this data using SQL commands.
Example Table Creation and Data Insert:
CREATE TABLE Students (
Name VARCHAR(50),
RegisterNo INT,
Subject VARCHAR(50),
Marks INT
);
INSERT INTO Students (Name, RegisterNo, Subject, Marks) VALUES
('Alice', 101, 'Math', 95),
('Bob', 102, 'Science', 88),
('Charlie', 103, 'English', 92);
Use case: Database storage, complex queries, data integrity.
**
- JSON (JavaScript Object Notation)**
Explanation:
JSON is a lightweight data format often used for APIs. It stores data as key-value pairs, making it easy to read for humans and machines.
Example:
[
{"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
{"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
{"Name": "Charlie", "RegisterNo": 103, "Subject": "English", "Marks": 92}
]
Use case: Data exchange between web apps, APIs, and NoSQL databases.
4. Parquet (Columnar Storage Format)
Explanation:
Parquet is a columnar storage format optimized for big data. It stores data by columns instead of rows, improving performance for analytical queries.
Python Example (Creating Parquet):
import pandas as pd
data = {
"Name": ["Alice", "Bob", "Charlie"],
"RegisterNo": [101, 102, 103],
"Subject": ["Math", "Science", "English"],
"Marks": [95, 88, 92]
}
df = pd.DataFrame(data)
df.to_parquet("students.parquet")
Use case: Big data analytics, Spark, Hive, fast columnar queries.
5. XML (Extensible Markup Language)
Explanation:
XML is a markup language that stores data in a hierarchical structure using tags. It’s human-readable and widely used in legacy systems.
Example:
Alice
101
Math
95
Bob
102
Science
88
Charlie
103
English
92
Use case: Config files, legacy systems, document storage.
6. Avro (Row-based Storage Format)
Explanation:
Avro is a row-based storage format often used with Hadoop. It’s compact, efficient, and supports schema evolution (i.e., adding new fields without breaking old data).
Example Schema + Data (JSON representation of Avro):
{
"type": "record",
"name": "Student",
"fields": [
{"name": "Name", "type": "string"},
{"name": "RegisterNo", "type": "int"},
{"name": "Subject", "type": "string"},
{"name": "Marks", "type": "int"}
]
}
Data:
[
{"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
{"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
{"Name": "Charlie", "RegisterNo": 103, "Subject": "English", "Marks": 92}
]
Use case: Big data storage, efficient serialization, Hadoop ecosystem.
Top comments (0)