<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: DHANYAA R S </title>
    <description>The latest articles on DEV Community by DHANYAA R S  (@dhanyaa_rs).</description>
    <link>https://dev.to/dhanyaa_rs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3456167%2F8bb9f221-5d51-48aa-bb9a-10ecaaff63d4.png</url>
      <title>DEV Community: DHANYAA R S </title>
      <link>https://dev.to/dhanyaa_rs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dhanyaa_rs"/>
    <language>en</language>
    <item>
      <title>6 Common Data Formats in Data Analytics</title>
      <dc:creator>DHANYAA R S </dc:creator>
      <pubDate>Wed, 08 Oct 2025 07:17:03 +0000</pubDate>
      <link>https://dev.to/dhanyaa_rs/6-common-data-formats-in-data-analytics-4f5</link>
      <guid>https://dev.to/dhanyaa_rs/6-common-data-formats-in-data-analytics-4f5</guid>
      <description>&lt;p&gt;In the world of data analytics, information can come in many formats. Each format serves different purposes—some are human-readable, others are optimized for storage or speed. In this article, we’ll explore six popular data formats used in analytics: CSV, SQL, JSON, Parquet, XML, and Avro. We’ll use a simple dataset to demonstrate each format.&lt;br&gt;
&lt;strong&gt;Sample Dataset&lt;/strong&gt;&lt;br&gt;
[{'Name': 'Dhanyaa', 'Register_No': 'KPR23CB007', 'Subject': 'Data Analytics', 'Marks': 92}, {'Name': 'Krishna', 'Register_No': 'KPR23CB009', 'Subject': 'Cloud Computing', 'Marks': 88}, {'Name': 'Aarav', 'Register_No': 'KPR23CB011', 'Subject': 'AI &amp;amp; ML', 'Marks': 95}]&lt;br&gt;
&lt;strong&gt;1. CSV (Comma Separated Values)&lt;/strong&gt;&lt;br&gt;
CSV is one of the simplest and most widely used data formats. It stores data in plain text, where each line represents a record and columns are separated by commas.&lt;br&gt;
Name,Register_No,Subject,Marks&lt;br&gt;
Dhanyaa,KPR23CB007,Data Analytics,92&lt;br&gt;
Krishna,KPR23CB009,Cloud Computing,88&lt;br&gt;
Aarav,KPR23CB011,AI &amp;amp; ML,95&lt;/p&gt;
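As a quick sanity check, the CSV round trip can be sketched with Python's standard csv module (a minimal sketch: the dataset is trimmed to three fields, and an in-memory buffer stands in for a file):

```python
import csv
import io

# The sample dataset, trimmed to three fields for brevity
students = [
    {"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Marks": 92},
    {"Name": "Krishna", "Register_No": "KPR23CB009", "Marks": 88},
    {"Name": "Aarav", "Register_No": "KPR23CB011", "Marks": 95},
]

# Write the records as CSV text (a StringIO buffer stands in for a file)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["Name", "Register_No", "Marks"])
writer.writeheader()
writer.writerows(students)
csv_text = buf.getvalue()

# Read it back; note that every value comes back as a string,
# so Marks is "92", not 92 -- CSV carries no type information
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows)
```

The loss of types on the way back is CSV's main trade-off against formats like JSON or Avro.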

&lt;p&gt;&lt;strong&gt;2. SQL (Relational Table Format)&lt;/strong&gt;&lt;br&gt;
SQL databases store data in tables with defined columns and rows. You can create, read, update, and delete records using SQL queries.&lt;br&gt;
CREATE TABLE students (&lt;br&gt;
    Name VARCHAR(50),&lt;br&gt;
    Register_No VARCHAR(20),&lt;br&gt;
    Subject VARCHAR(50),&lt;br&gt;
    Marks INT&lt;br&gt;
);&lt;/p&gt;

&lt;p&gt;INSERT INTO students VALUES&lt;br&gt;
('Dhanyaa', 'KPR23CB007', 'Data Analytics', 92),&lt;br&gt;
('Krishna', 'KPR23CB009', 'Cloud Computing', 88),&lt;br&gt;
('Aarav', 'KPR23CB011', 'AI &amp;amp; ML', 95);&lt;/p&gt;
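The same table can be exercised end to end with Python's built-in sqlite3 module (a sketch against an in-memory database, with the Subject column dropped for brevity):

```python
import sqlite3

# In-memory SQLite database to try out the students table
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        Name VARCHAR(50),
        Register_No VARCHAR(20),
        Marks INT
    )
""")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("Dhanyaa", "KPR23CB007", 92),
        ("Krishna", "KPR23CB009", 88),
        ("Aarav", "KPR23CB011", 95),
    ],
)

# Read the rows back, highest marks first
top = conn.execute("SELECT Name, Marks FROM students ORDER BY Marks DESC").fetchall()
print(top)
conn.close()
```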

&lt;p&gt;&lt;strong&gt;3. JSON (JavaScript Object Notation)&lt;/strong&gt;&lt;br&gt;
JSON is a lightweight data-interchange format that’s easy for humans to read and machines to parse. It’s widely used in APIs and data transmission.&lt;br&gt;
{&lt;br&gt;
  "students": [&lt;br&gt;
    {"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Subject": "Data Analytics", "Marks": 92},&lt;br&gt;
    {"Name": "Krishna", "Register_No": "KPR23CB009", "Subject": "Cloud Computing", "Marks": 88},&lt;br&gt;
    {"Name": "Aarav", "Register_No": "KPR23CB011", "Subject": "AI &amp;amp; ML", "Marks": 95}&lt;br&gt;
  ]&lt;br&gt;
}&lt;/p&gt;
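Serializing and parsing this structure takes two calls with Python's standard json module (a trimmed sketch of the document above):

```python
import json

payload = {
    "students": [
        {"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Marks": 92},
        {"Name": "Krishna", "Register_No": "KPR23CB009", "Marks": 88},
        {"Name": "Aarav", "Register_No": "KPR23CB011", "Marks": 95},
    ]
}

# Serialize to a JSON string, then parse it back
text = json.dumps(payload, indent=2)
parsed = json.loads(text)

# Unlike CSV, JSON preserves the integer type of Marks through the round trip
print(type(parsed["students"][0]["Marks"]))
```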

&lt;p&gt;&lt;strong&gt;4. Parquet (Columnar Storage Format)&lt;/strong&gt;&lt;br&gt;
Parquet is a columnar storage format optimized for big data processing frameworks like Apache Spark. It stores data by columns instead of rows, making queries faster for analytical workloads.&lt;br&gt;
Example representation (simplified for illustration):&lt;br&gt;
| Column Name | Values                   |&lt;br&gt;
|--------------|--------------------------|&lt;br&gt;
| Name         | Dhanyaa, Krishna, Aarav    |&lt;br&gt;
| Register_No  | KPR23CB007, KPR23CB009, KPR23CB011 |&lt;br&gt;
| Subject      | Data Analytics, Cloud Computing, AI &amp;amp; ML |&lt;br&gt;
| Marks        | 92, 88, 95               |&lt;/p&gt;
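Writing actual Parquet files requires a library such as pyarrow or pandas; the plain-Python sketch below only illustrates the row-to-column pivot that columnar formats perform (it is not real Parquet encoding, which adds compression and binary framing):

```python
# Row-oriented records, as CSV or JSON would store them
rows = [
    {"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Marks": 92},
    {"Name": "Krishna", "Register_No": "KPR23CB009", "Marks": 88},
    {"Name": "Aarav", "Register_No": "KPR23CB011", "Marks": 95},
]

# Pivot to a column-oriented layout: one contiguous list per column.
# This is the shape Parquet stores on disk, which is why a query that
# touches one column never has to read the others.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An analytical query like AVG(Marks) scans a single column
avg_marks = sum(columns["Marks"]) / len(columns["Marks"])
print(columns["Marks"], avg_marks)
```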

&lt;p&gt;&lt;strong&gt;5. XML (Extensible Markup Language)&lt;/strong&gt;&lt;br&gt;
XML uses custom tags to define and structure data. Although more verbose, it’s useful for hierarchical data representation and data exchange.&lt;br&gt;
&amp;lt;students&amp;gt;&lt;br&gt;
  &amp;lt;student&amp;gt;&lt;br&gt;
    &amp;lt;Name&amp;gt;Dhanyaa&amp;lt;/Name&amp;gt;&lt;br&gt;
    &amp;lt;Register_No&amp;gt;KPR23CB007&amp;lt;/Register_No&amp;gt;&lt;br&gt;
    &amp;lt;Subject&amp;gt;Data Analytics&amp;lt;/Subject&amp;gt;&lt;br&gt;
    &amp;lt;Marks&amp;gt;92&amp;lt;/Marks&amp;gt;&lt;br&gt;
  &amp;lt;/student&amp;gt;&lt;br&gt;
  &amp;lt;student&amp;gt;&lt;br&gt;
    &amp;lt;Name&amp;gt;Krishna&amp;lt;/Name&amp;gt;&lt;br&gt;
    &amp;lt;Register_No&amp;gt;KPR23CB009&amp;lt;/Register_No&amp;gt;&lt;br&gt;
    &amp;lt;Subject&amp;gt;Cloud Computing&amp;lt;/Subject&amp;gt;&lt;br&gt;
    &amp;lt;Marks&amp;gt;88&amp;lt;/Marks&amp;gt;&lt;br&gt;
  &amp;lt;/student&amp;gt;&lt;br&gt;
  &amp;lt;student&amp;gt;&lt;br&gt;
    &amp;lt;Name&amp;gt;Aarav&amp;lt;/Name&amp;gt;&lt;br&gt;
    &amp;lt;Register_No&amp;gt;KPR23CB011&amp;lt;/Register_No&amp;gt;&lt;br&gt;
    &amp;lt;Subject&amp;gt;AI &amp;amp;amp; ML&amp;lt;/Subject&amp;gt;&lt;br&gt;
    &amp;lt;Marks&amp;gt;95&amp;lt;/Marks&amp;gt;&lt;br&gt;
  &amp;lt;/student&amp;gt;&lt;br&gt;
&amp;lt;/students&amp;gt;&lt;br&gt;
&lt;/p&gt;
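Python's standard xml.etree.ElementTree can build and walk this hierarchy; the sketch below constructs a students tree programmatically (Subject omitted for brevity), serializes it, and reads it back:

```python
import xml.etree.ElementTree as ET

# Build the students tree element by element
root = ET.Element("students")
for name, reg, marks in [
    ("Dhanyaa", "KPR23CB007", 92),
    ("Krishna", "KPR23CB009", 88),
    ("Aarav", "KPR23CB011", 95),
]:
    student = ET.SubElement(root, "student")
    ET.SubElement(student, "Name").text = name
    ET.SubElement(student, "Register_No").text = reg
    # XML text content is always a string, so numbers must be converted
    ET.SubElement(student, "Marks").text = str(marks)

# Serialize to text, then parse it back and walk the hierarchy
xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)
names = [s.findtext("Name") for s in parsed.findall("student")]
marks = [int(s.findtext("Marks")) for s in parsed.findall("student")]
print(names, marks)
```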

&lt;p&gt;&lt;strong&gt;6. Avro (Row-Based Storage Format)&lt;/strong&gt;&lt;br&gt;
Avro is a binary row-based format developed under Apache Hadoop. It stores data along with its schema, which makes it efficient for serialization.&lt;br&gt;
Schema Example:&lt;br&gt;
{&lt;br&gt;
  "type": "record",&lt;br&gt;
  "name": "Student",&lt;br&gt;
  "fields": [&lt;br&gt;
    {"name": "Name", "type": "string"},&lt;br&gt;
    {"name": "Register_No", "type": "string"},&lt;br&gt;
    {"name": "Subject", "type": "string"},&lt;br&gt;
    {"name": "Marks", "type": "int"}&lt;br&gt;
  ]&lt;br&gt;
}&lt;br&gt;
Data Example (in JSON-like representation):&lt;br&gt;
{"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Subject": "Data Analytics", "Marks": 92}&lt;br&gt;
{"Name": "Krishna", "Register_No": "KPR23CB009", "Subject": "Cloud Computing", "Marks": 88}&lt;br&gt;
{"Name": "Aarav", "Register_No": "KPR23CB011", "Subject": "AI &amp;amp; ML", "Marks": 95}&lt;/p&gt;
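Real Avro serialization needs a library such as fastavro or the official avro package; as a stdlib-only sketch, the schema can at least be loaded with json and used for a naive type check of records against its fields (this illustrates why the embedded schema matters, not how Avro encodes bytes):

```python
import json

# The record schema from above, loaded as plain JSON
schema = json.loads("""
{
  "type": "record",
  "name": "Student",
  "fields": [
    {"name": "Name", "type": "string"},
    {"name": "Register_No", "type": "string"},
    {"name": "Subject", "type": "string"},
    {"name": "Marks", "type": "int"}
  ]
}
""")

# Map Avro primitive type names to Python types for a naive check
PY_TYPES = {"string": str, "int": int}

def matches_schema(record, schema):
    """Return True if the record has exactly the schema's fields, each with the right type."""
    fields = {f["name"]: PY_TYPES[f["type"]] for f in schema["fields"]}
    return set(record) == set(fields) and all(
        isinstance(record[name], t) for name, t in fields.items()
    )

print(matches_schema(
    {"Name": "Dhanyaa", "Register_No": "KPR23CB007", "Subject": "Data Analytics", "Marks": 92},
    schema,
))
```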

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Each data format serves a unique purpose depending on the use case. While CSV and JSON are great for readability, Parquet and Avro are more efficient for large-scale analytics. Understanding these formats helps data professionals choose the right tools for data storage, transfer, and processing.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>analytics</category>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>🚀 My Hilarious Journey Into MongoDB Atlas (with Yelp Reviews, JSON, and “good” vibes)</title>
      <dc:creator>DHANYAA R S </dc:creator>
      <pubDate>Sun, 24 Aug 2025 15:14:09 +0000</pubDate>
      <link>https://dev.to/dhanyaa_rs/my-hilarious-journey-into-mongodb-atlas-with-yelp-reviews-json-and-good-vibes-576j</link>
      <guid>https://dev.to/dhanyaa_rs/my-hilarious-journey-into-mongodb-atlas-with-yelp-reviews-json-and-good-vibes-576j</guid>
      <description>&lt;p&gt;So there I was, innocently sipping chai ☕ when I thought: “Hey, let’s play around with MongoDB Atlas. How hard could it be?” Spoiler alert: it was part comedy, part tragedy, but in the end—success tasted sweeter than Gulab Jamun. 🍯&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Step 1: Logging into MongoDB Atlas&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MongoDB Atlas greeted me like a strict professor: “Welcome, young padawan. Ready to suffer with connection strings?”&lt;br&gt;
I bravely clicked Create Cluster, gave it a free-tier hug, and promised not to blow up the cloud.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Step 2: Building the yelp_demo.reviews Collection&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I created a database called yelp_demo and a collection named reviews. Then came the fun part—manually inserting 10 reviews. Imagine me, typing fake reviews like:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;{&lt;br&gt;
  "business_id": "B003",&lt;br&gt;
  "review": "The biryani here is sooo good!",&lt;br&gt;
  "rating": 5,&lt;br&gt;
  "date": "2025-08-20"&lt;br&gt;
}&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fui3dk58ktld47ilg70q4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fui3dk58ktld47ilg70q4.png" alt=" " width="800" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yes, I felt like an undercover Yelp critic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Step 3: Query Magic 🪄&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Top 5 businesses with highest average rating&lt;br&gt;
Using the Aggregation Pipeline:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;db.reviews.aggregate([&lt;br&gt;
  { $group: { _id: "$business_id", avgRating: { $avg: "$rating" } } },&lt;br&gt;
  { $sort: { avgRating: -1 } },&lt;br&gt;
  { $limit: 5 }&lt;br&gt;
])&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Translation: “Dear MongoDB, please rank these food joints before my stomach makes decisions for me.”&lt;/p&gt;
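For anyone who wants to see what that pipeline is doing, the same group-average-sort-limit logic can be sanity-checked in plain Python over a few hypothetical documents shaped like the reviews collection:

```python
from collections import defaultdict

# Hypothetical documents shaped like the yelp_demo.reviews collection
reviews = [
    {"business_id": "B003", "rating": 5},
    {"business_id": "B003", "rating": 4},
    {"business_id": "B001", "rating": 3},
    {"business_id": "B002", "rating": 5},
]

# Equivalent of $group with $avg: collect ratings per business
ratings = defaultdict(list)
for doc in reviews:
    ratings[doc["business_id"]].append(doc["rating"])

# Equivalent of $sort (descending on the average) plus $limit: 5
top5 = sorted(
    ((biz, sum(r) / len(r)) for biz, r in ratings.items()),
    key=lambda pair: pair[1],
    reverse=True,
)[:5]
print(top5)
```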

&lt;p&gt;Count reviews containing “good”&lt;br&gt;
But first, MongoDB whispered: “Thou shall create a text index.”&lt;/p&gt;

&lt;p&gt;&lt;em&gt;db.reviews.createIndex({ review: "text" })&lt;br&gt;
db.reviews.countDocuments({ $text: { $search: "good" } })&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy34an36dz8js1pybdf8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy34an36dz8js1pybdf8a.png" alt=" " width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Result: Apparently, everyone thinks food is “good.” My dataset looked like it was sponsored by the word “good.” 😂&lt;/p&gt;

&lt;p&gt;Get all reviews for a specific business (B003)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;db.reviews.find({ business_id: "B003" }).sort({ date: -1 })&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Yup, sorted by date, because reviews age faster than bananas. 🍌&lt;/p&gt;

&lt;p&gt;Update a review&lt;br&gt;
&lt;em&gt;db.reviews.updateOne(&lt;br&gt;
  { business_id: "B003" },&lt;br&gt;
  { $set: { review: "Actually, the biryani was legendary!" } }&lt;br&gt;
)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Because sometimes, you realize you were too harsh.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyq8mymh56igjo6lxtd1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdyq8mymh56igjo6lxtd1.png" alt=" " width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Delete a record&lt;/p&gt;

&lt;p&gt;&lt;em&gt;db.reviews.deleteOne({ business_id: "B010" })&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Farewell, random fake café. You shall not be missed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6whjj6z1hwh8cfl9jljr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6whjj6z1hwh8cfl9jljr.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;u&gt;Step 4: The Export Saga 🎭&lt;/u&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I thought: “Cool, I’ll just click Export in Atlas!” But Atlas laughed in my face: there’s no export button in the browser UI.&lt;/p&gt;

&lt;p&gt;So here’s the trick I used:&lt;/p&gt;

&lt;p&gt;Switch to JSON view in Atlas, copy everything, paste into VS Code, save as .json.&lt;/p&gt;

&lt;p&gt;If CSV was needed, I tossed the JSON into an online converter.&lt;br&gt;
Not elegant, but hey—it worked! 🎉&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Moral of the Story&lt;/strong&gt; 🧘&lt;/p&gt;

&lt;p&gt;MongoDB Atlas is like a desi auntie at a wedding—confusing at first, but once you understand her, she’ll feed you endless data love. I inserted, queried, updated, deleted, and even counted “good” vibes, all while laughing at my own mistakes.&lt;/p&gt;

&lt;p&gt;So, if you’re diving into #DataEngineering or #DataAnalysis, don’t be afraid to get your hands messy. MongoDB will test your patience, but trust me, the JSON rewards are worth it.&lt;/p&gt;

&lt;p&gt;Hashtags: #DataEngineering #DataAnalysis #LearningJourney #MongoDB #DevHumor&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
