<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hindu Narmatha</title>
    <description>The latest articles on DEV Community by Hindu Narmatha (@hindu_narmatha_132a576713).</description>
    <link>https://dev.to/hindu_narmatha_132a576713</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3482026%2F62d80144-1ae7-4a5f-b904-88cf3de4e3ca.png</url>
      <title>DEV Community: Hindu Narmatha</title>
      <link>https://dev.to/hindu_narmatha_132a576713</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hindu_narmatha_132a576713"/>
    <language>en</language>
    <item>
      <title>Data Formats Used in Data Analytics</title>
      <dc:creator>Hindu Narmatha</dc:creator>
      <pubDate>Tue, 07 Oct 2025 14:34:58 +0000</pubDate>
      <link>https://dev.to/hindu_narmatha_132a576713/data-formats-used-in-data-analytics-59h8</link>
      <guid>https://dev.to/hindu_narmatha_132a576713/data-formats-used-in-data-analytics-59h8</guid>
      <description>&lt;p&gt;In the world of data analytics, we deal with data in many forms — from simple spreadsheets to complex binary formats. Choosing the right data format can affect performance, storage efficiency, and compatibility.&lt;/p&gt;

&lt;p&gt;In this post, I’ll show you six commonly used data formats (CSV, SQL, JSON, Parquet, XML, and Avro), each with an example that writes the same dataset in that format.&lt;/p&gt;

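&lt;p&gt;Example dataset:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Name&lt;/th&gt;&lt;th&gt;RegisterNo&lt;/th&gt;&lt;th&gt;Subject&lt;/th&gt;&lt;th&gt;Marks&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Alice&lt;/td&gt;&lt;td&gt;101&lt;/td&gt;&lt;td&gt;Math&lt;/td&gt;&lt;td&gt;95&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Bob&lt;/td&gt;&lt;td&gt;102&lt;/td&gt;&lt;td&gt;Science&lt;/td&gt;&lt;td&gt;88&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Carol&lt;/td&gt;&lt;td&gt;103&lt;/td&gt;&lt;td&gt;English&lt;/td&gt;&lt;td&gt;92&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;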

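&lt;p&gt;&lt;strong&gt;1. CSV (Comma-Separated Values)&lt;/strong&gt;&lt;br&gt;
Definition:&lt;br&gt;
CSV is a plain-text format in which each line is one record and the values in a record are separated by commas. It is easy to read and supported by almost every tool, but it carries no type information, so every value is stored as text.&lt;/p&gt;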

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
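&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
df = pd.DataFrame({"Name": ["Alice", "Bob", "Carol"], "RegisterNo": [101, 102, 103], "Subject": ["Math", "Science", "English"], "Marks": [95, 88, 92]})
df.to_csv("data.csv", index=False)
print("✅ CSV file created.")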
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output (contents of data.csv):&lt;/p&gt;

&lt;p&gt;Name,RegisterNo,Subject,Marks&lt;br&gt;
Alice,101,Math,95&lt;br&gt;
Bob,102,Science,88&lt;br&gt;
Carol,103,English,92&lt;/p&gt;
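
&lt;p&gt;One thing worth seeing for yourself is that CSV keeps no type information. The sketch below is not part of the pandas workflow; it uses only Python's standard csv module, and the names rows, to_csv_text, and from_csv_text are hypothetical helpers I made up for illustration. It writes the example dataset to CSV text and reads it back, at which point the marks come back as strings.&lt;/p&gt;

```python
import csv
import io

# Hypothetical in-memory copy of the example dataset (same values as the table).
rows = [
    {"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
    {"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
    {"Name": "Carol", "RegisterNo": 103, "Subject": "English", "Marks": 92},
]


def to_csv_text(records):
    """Serialize a list of dicts to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text back into a list of dicts; every value is a str."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv_text(rows)
parsed = from_csv_text(csv_text)

print(csv_text)
# The integer 95 was written as text, so it is read back as the string "95".
print(type(parsed[0]["Marks"]).__name__)
```

&lt;p&gt;Libraries such as pandas guess the column types again when they load a CSV file, which usually works but is still a guess.&lt;/p&gt;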

&lt;p&gt;&lt;strong&gt;2. SQL&lt;/strong&gt;&lt;br&gt;
Definition:&lt;br&gt;
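SQL (Structured Query Language) is the language used to query relational databases, which store data in structured tables of rows and typed columns. Here the dataset is saved to a SQLite database file and queried back.&lt;/p&gt;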

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
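&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sqlite3
import pandas as pd
# Create (or open) a SQLite database file and write df to a table named Student
conn = sqlite3.connect("students.db")
df.to_sql("Student", conn, if_exists="replace", index=False)
print(pd.read_sql_query("SELECT * FROM Student", conn))
conn.close()  # close the connection when done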
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample Output (SQL Table):&lt;/p&gt;

&lt;p&gt;Name    RegisterNo  Subject Marks&lt;br&gt;
Alice   101 Math    95&lt;br&gt;
Bob 102 Science 88&lt;br&gt;
Carol   103 English 92&lt;/p&gt;
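
&lt;p&gt;To give a feel for the querying side, here is a small sketch that uses only the standard sqlite3 module with an in-memory database. The helper name top_students and the choice of query are my own assumptions for illustration, not part of the example above. It loads the same three rows and asks the database for the highest marks.&lt;/p&gt;

```python
import sqlite3

# Same example dataset, written as tuples in column order.
rows = [
    ("Alice", 101, "Math", 95),
    ("Bob", 102, "Science", 88),
    ("Carol", 103, "English", 92),
]


def top_students(records, limit):
    """Load records into an in-memory table and return the top scorers."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Student "
            "(Name TEXT, RegisterNo INTEGER, Subject TEXT, Marks INTEGER)"
        )
        # Placeholders (?) keep values separate from the SQL text.
        conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)", records)
        cursor = conn.execute(
            "SELECT Name, Marks FROM Student ORDER BY Marks DESC LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


result = top_students(rows, 2)
print(result)  # [('Alice', 95), ('Carol', 92)]
```

&lt;p&gt;The sorting and limiting happen inside the database, so only the rows you asked for come back to Python.&lt;/p&gt;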

&lt;p&gt;&lt;strong&gt;3. JSON (JavaScript Object Notation)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Definition:&lt;br&gt;
JSON represents data as objects made of key-value pairs and as ordered arrays, with basic types such as strings, numbers, and booleans. It’s human-readable and widely used in APIs and web apps.&lt;/p&gt;

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.to_json("data.json", orient="records", indent=4)
print("✅ JSON file created.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample Output (JSON data; with indent=4 the file places each key on its own line, shown compactly here):&lt;/p&gt;

&lt;p&gt;[&lt;br&gt;
    {"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},&lt;br&gt;
    {"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},&lt;br&gt;
    {"Name": "Carol", "RegisterNo": 103, "Subject": "English", "Marks": 92}&lt;br&gt;
]&lt;/p&gt;
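
&lt;p&gt;Unlike CSV, JSON distinguishes numbers from strings. A minimal sketch with the standard json module (the variable names are mine, chosen for illustration) shows the example dataset surviving a round trip with its integer values intact.&lt;/p&gt;

```python
import json

# Same example dataset as a list of dicts (one dict per record).
rows = [
    {"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
    {"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
    {"Name": "Carol", "RegisterNo": 103, "Subject": "English", "Marks": 92},
]

# Serialize to JSON text, then parse that text back into Python objects.
json_text = json.dumps(rows, indent=4)
restored = json.loads(json_text)

print(restored == rows)  # True: the round trip preserved the data
print(type(restored[0]["Marks"]).__name__)  # int: numbers stay numbers
```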

&lt;p&gt;&lt;strong&gt;4. Parquet (Columnar Storage Format)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Definition:&lt;br&gt;
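Parquet is a binary, column-oriented storage format used in big data analytics. Because the values of each column are stored together and compressed, queries that read only a few columns of a large dataset can skip the rest.&lt;/p&gt;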

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
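&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!pip install pyarrow
import pandas as pd
df.to_parquet("data.parquet")
print("✅ Parquet file created.")
# Parquet is binary, so read it back with pandas to inspect it
print(pd.read_parquet("data.parquet"))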
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample Output (Read in Python):&lt;/p&gt;

&lt;p&gt;Name    RegisterNo  Subject Marks&lt;br&gt;
Alice   101 Math    95&lt;br&gt;
Bob 102 Science 88&lt;br&gt;
Carol   103 English 92&lt;/p&gt;
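
&lt;p&gt;The idea behind columnar storage can be sketched in plain Python. To be clear, this is not how Parquet lays out bytes on disk; it only illustrates the difference between keeping data row by row and keeping it column by column. The helper to_columns is a hypothetical name used for this illustration.&lt;/p&gt;

```python
# Row-oriented layout: one dict per record (how CSV, JSON, and Avro think).
rows = [
    {"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
    {"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
    {"Name": "Carol", "RegisterNo": 103, "Subject": "English", "Marks": 92},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {key: [] for key in records[0]}
    for record in records:
        for key, value in record.items():
            columns[key].append(value)
    return columns


# Column-oriented layout: all values of one column sit next to each other.
columns = to_columns(rows)

# An aggregate over Marks only needs to touch the Marks column.
average_marks = sum(columns["Marks"]) / len(columns["Marks"])

print(columns["Marks"])  # [95, 88, 92]
print(round(average_marks, 2))  # 91.67
```

&lt;p&gt;With three rows the difference does not matter, but on a table with many columns and millions of rows, reading one column instead of every record is where the savings come from.&lt;/p&gt;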

&lt;p&gt;&lt;strong&gt;5. XML (Extensible Markup Language)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Definition:&lt;br&gt;
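XML structures data as nested elements marked by opening and closing tags. It is extensible, since you define your own tag names, and readable by both humans and machines, though more verbose than JSON.&lt;/p&gt;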

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!pip install dicttoxml
from dicttoxml import dicttoxml
xml_data = dicttoxml(df.to_dict(orient='records'), custom_root='Students', attr_type=False)
with open("data.xml", "wb") as f:
    f.write(xml_data)

print("✅ XML file created.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample Output (records stored in data.xml, shown as a table):&lt;br&gt;
Name    RegisterNo  Subject Marks&lt;br&gt;
Alice   101 Math    95&lt;br&gt;
Bob 102 Science 88&lt;br&gt;
Carol   103 English 92&lt;/p&gt;
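
&lt;p&gt;If you would rather not install an extra package, a similar file can be built with the standard library's xml.etree.ElementTree. This is an alternative sketch, not what dicttoxml does internally. The function name to_xml_bytes and the Student element name are assumptions of mine for illustration. It builds one element per record, then parses the result back to confirm the structure.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Same example dataset as a list of dicts.
rows = [
    {"Name": "Alice", "RegisterNo": 101, "Subject": "Math", "Marks": 95},
    {"Name": "Bob", "RegisterNo": 102, "Subject": "Science", "Marks": 88},
    {"Name": "Carol", "RegisterNo": 103, "Subject": "English", "Marks": 92},
]


def to_xml_bytes(records, root_tag="Students", item_tag="Student"):
    """Build a root element with one child element per record."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            # XML text is always a string, so numbers are converted here.
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


xml_bytes = to_xml_bytes(rows)

# Parse the bytes back and pull out one field from every record.
parsed_root = ET.fromstring(xml_bytes)
names = [student.findtext("Name") for student in parsed_root]

print(parsed_root.tag)  # Students
print(names)  # ['Alice', 'Bob', 'Carol']
```

&lt;p&gt;To save the result, write xml_bytes to a file opened in binary mode, the same way the dicttoxml example does.&lt;/p&gt;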

&lt;p&gt;&lt;strong&gt;6. Avro (Row-based Storage Format)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Definition:&lt;br&gt;
Avro is a binary, row-based format commonly used in big data pipelines. Each file stores its schema alongside the data, so a reader can decode the records without any outside definition.&lt;/p&gt;

&lt;p&gt;Google Colab Code Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;!pip install fastavro
import fastavro

schema = {
    "type": "record",
    "name": "Student",
    "fields": [
        {"name": "Name", "type": "string"},
        {"name": "RegisterNo", "type": "int"},
        {"name": "Subject", "type": "string"},
        {"name": "Marks", "type": "int"}
    ]
}

records = df.to_dict(orient="records")

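with open("data.avro", "wb") as out:
    fastavro.writer(out, schema, records)

print("✅ Avro file created.")
# Avro is binary, so read the records back to inspect them
with open("data.avro", "rb") as f:
    print(pd.DataFrame(list(fastavro.reader(f))))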
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sample Output (Read in Python):&lt;/p&gt;

&lt;p&gt;Name    RegisterNo  Subject Marks&lt;br&gt;
Alice   101 Math    95&lt;br&gt;
Bob 102 Science 88&lt;br&gt;
Carol   103 English 92&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each format serves different purposes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CSV: Simple, readable, good for small datasets&lt;/li&gt;
&lt;li&gt;SQL: Structured, relational, great for databases&lt;/li&gt;
&lt;li&gt;JSON: Lightweight, perfect for web APIs&lt;/li&gt;
&lt;li&gt;Parquet: Columnar, efficient for analytics on large data&lt;/li&gt;
&lt;li&gt;XML: Extensible, ideal for data exchange&lt;/li&gt;
&lt;li&gt;Avro: Row-based, optimized for big data pipelines&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>NoSQL MongoDB</title>
      <dc:creator>Hindu Narmatha</dc:creator>
      <pubDate>Fri, 05 Sep 2025 14:11:28 +0000</pubDate>
      <link>https://dev.to/hindu_narmatha_132a576713/nosql-mongodb-4cf2</link>
      <guid>https://dev.to/hindu_narmatha_132a576713/nosql-mongodb-4cf2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5ejtdmotewn70uuftni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx5ejtdmotewn70uuftni.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftexl71why1ikfi2qur3f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftexl71why1ikfi2qur3f.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4nmxnct9hbcnmzyteynj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4nmxnct9hbcnmzyteynj.png" alt=" " width="800" height="265"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgoq20wmor62f499ltr8v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgoq20wmor62f499ltr8v.png" alt=" " width="800" height="418"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvnqpnijztnezhc7jhfc4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvnqpnijztnezhc7jhfc4.png" alt=" " width="800" height="421"&gt;&lt;/a&gt;&lt;/p&gt;




</description>
    </item>
  </channel>
</rss>
