<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Abrar ahmed</title>
    <description>The latest articles on DEV Community by Abrar ahmed (@abrar_ahmed).</description>
    <link>https://dev.to/abrar_ahmed</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3196900%2Fd078a851-1e2d-41db-a859-2d9a0323a4d6.png</url>
      <title>DEV Community: Abrar ahmed</title>
      <link>https://dev.to/abrar_ahmed</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/abrar_ahmed"/>
    <language>en</language>
    <item>
      <title>Stop Repeating Yourself: How I Built a Reusable “Data Cleaning Playground” in JavaScript</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Mon, 16 Jun 2025 11:13:04 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/stop-repeating-yourself-how-i-built-a-reusable-data-cleaning-playground-in-javascript-42l6</link>
      <guid>https://dev.to/abrar_ahmed/stop-repeating-yourself-how-i-built-a-reusable-data-cleaning-playground-in-javascript-42l6</guid>
      <description>&lt;p&gt;If you’ve ever worked with messy CSV or Excel files, you probably know this feeling:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“This should be a quick cleanup…”&lt;br&gt;
— 6 hours later, 300 lines of code and 2 coffees deep&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I found myself getting exhausted from having to write the same code over and over again just to rename some columns, clear out duplicates, or fix broken dates. So, I created a little in-browser “data playground” using JavaScript to make things easier.&lt;br&gt;
In this post, I’ll walk you through the process of building this project, explaining the hows and whys, and giving you the lowdown on how you can use it for your own messy-data adventures!&lt;/p&gt;
&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Every new freelance project brought a new variation of the same CSV chaos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"name", "full_name", "Full Name", " Name " all in one sheet&lt;/li&gt;
&lt;li&gt;Empty rows and random nulls&lt;/li&gt;
&lt;li&gt;Dates in 5 different formats&lt;/li&gt;
&lt;li&gt;And of course… rows that just say “LOL”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sure, I could whip up a Python notebook or get Pandas going… but sometimes I just want a simple, browser-based way to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preview the raw data&lt;/li&gt;
&lt;li&gt;Write a quick transform function&lt;/li&gt;
&lt;li&gt;Download the cleaned data&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  The Playground Idea
&lt;/h2&gt;

&lt;p&gt;Instead of starting from scratch every time, I built a simple JavaScript setup that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lets me upload CSV, Excel, or JSON files&lt;/li&gt;
&lt;li&gt;Previews both raw and cleaned data instantly&lt;/li&gt;
&lt;li&gt;Lets me write a one-liner function to transform each row&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The best part? It all runs in the browser. No backend. No Python. No installations.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Setup: What I Used
&lt;/h2&gt;

&lt;p&gt;Libraries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PapaParse (for CSV)&lt;/li&gt;
&lt;li&gt;SheetJS (for Excel)&lt;/li&gt;
&lt;li&gt;JSON.parse (native)&lt;/li&gt;
&lt;li&gt;Vanilla JS + simple HTML&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s how the flow works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User selects a file&lt;/li&gt;
&lt;li&gt;Browser parses the file and calls a custom transform(row) function&lt;/li&gt;
&lt;li&gt;Cleaned data is rendered in a table&lt;/li&gt;
&lt;li&gt;Option to export cleaned data to CSV&lt;/li&gt;
&lt;/ol&gt;
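&lt;p&gt;Step 4 of the flow, exporting the cleaned data, can be sketched like this (the &lt;code&gt;toCSV&lt;/code&gt; and &lt;code&gt;downloadCSV&lt;/code&gt; helpers are hypothetical names; this assumes cleaned rows are flat objects):&lt;/p&gt;

```javascript
// Hypothetical export helpers for step 4: serialize cleaned rows to CSV
// and trigger a browser download. Fields with commas/quotes/newlines get quoted.
function toCSV(rows) {
  const headers = Object.keys(rows[0]);
  const escape = (v) => {
    const s = String(v ?? "");
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [headers.map(escape).join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => escape(row[h])).join(","));
  }
  return lines.join("\n");
}

function downloadCSV(rows, filename = "cleaned.csv") {
  const blob = new Blob([toCSV(rows)], { type: "text/csv" });
  const a = document.createElement("a");
  a.href = URL.createObjectURL(blob);
  a.download = filename;
  a.click();
}
```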
&lt;h2&gt;
  
  
  Code Example: Basic CSV Upload + Transform
&lt;/h2&gt;

&lt;p&gt;Here’s the core snippet for uploading a CSV file and transforming each row. First, the HTML:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;input type="file" id="file" /&amp;gt;
&amp;lt;table id="output"&amp;gt;&amp;lt;/table&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the JavaScript:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;document.getElementById('file').addEventListener('change', (e) =&amp;gt; {
  const file = e.target.files[0];
  Papa.parse(file, {
    header: true,
    complete: (results) =&amp;gt; {
      const cleaned = results.data.map(transform);
      displayTable(cleaned);
    }
  });
});

function transform(row) {
  return {
    name: row["Full Name"]?.trim(),
    joined: new Date(row["Join_Date"]),
    isActive: row["Status"] === "Active"
  };
}

function displayTable(data) {
  const table = document.getElementById("output");
  if (!data.length) {
    table.innerHTML = "";
    return;
  }
  const headers = Object.keys(data[0]);
  // Build the markup in one string; assigning innerHTML once avoids
  // re-parsing the whole table on every row.
  let html = `&amp;lt;tr&amp;gt;${headers.map(h =&amp;gt; `&amp;lt;th&amp;gt;${h}&amp;lt;/th&amp;gt;`).join("")}&amp;lt;/tr&amp;gt;`;
  data.forEach(row =&amp;gt; {
    html += `&amp;lt;tr&amp;gt;${headers.map(h =&amp;gt; `&amp;lt;td&amp;gt;${row[h]}&amp;lt;/td&amp;gt;`).join("")}&amp;lt;/tr&amp;gt;`;
  });
  table.innerHTML = html;
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a fully working data transformer in about 30 lines of code. You can add your logic to rename fields, cast values, or remove unwanted rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bonus Features I Added
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Toggle between Raw and Cleaned views&lt;/li&gt;
&lt;li&gt;Support for JSON and Excel files (via SheetJS)&lt;/li&gt;
&lt;li&gt;Export button to download the cleaned file&lt;/li&gt;
&lt;li&gt;Dark mode toggle because… why not?&lt;/li&gt;
&lt;li&gt;“Live preview” of the transformation result&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why This Was Worth It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Saves me hours on every new project&lt;/li&gt;
&lt;li&gt;Helps clients understand their own data mess&lt;/li&gt;
&lt;li&gt;Way faster than booting up a backend or notebook&lt;/li&gt;
&lt;li&gt;Easy to tweak for one-off tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're a freelancer, analyst, or dev working with data, this is a huge time-saver.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Improvements (Pull Requests Welcome!)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Drag-and-drop UI&lt;/li&gt;
&lt;li&gt;Column mapping UI (like Zapier)&lt;/li&gt;
&lt;li&gt;Persistent transformations for repeat use&lt;/li&gt;
&lt;li&gt;Plugin support for merging files, filtering, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Want to Try It?
&lt;/h2&gt;

&lt;p&gt;I’m planning to release the base code as a public template. If you're interested, drop a comment and I’ll share the GitHub repo once it's live.&lt;/p&gt;

&lt;p&gt;Until then, feel free to clone the logic above and customize your own browser-based data wrangler!&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;br&gt;
Let me know how you’re handling messy data workflows — or drop your weirdest CSV horror story in the comments.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>data</category>
      <category>webdev</category>
      <category>csv</category>
    </item>
    <item>
      <title>Build a Universal CSV, Excel &amp; JSON Data Previewer in Node.js</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Sun, 15 Jun 2025 12:56:11 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/build-a-universal-csv-excel-json-data-previewer-in-nodejs-155k</link>
      <guid>https://dev.to/abrar_ahmed/build-a-universal-csv-excel-json-data-previewer-in-nodejs-155k</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Ever received a file from a client and thought: “Is this even readable?”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I used to open every CSV in Excel, every JSON in VS Code, and every XLSX in Google Sheets, just to see the first few rows. It was pretty exhausting!&lt;/p&gt;

&lt;p&gt;So I built something better:&lt;br&gt;
A simple Node.js tool to preview CSV, Excel, or JSON files directly from the terminal — no manual opening, no GUI.&lt;/p&gt;

&lt;p&gt;This tutorial will walk you through how to build your own version of that.&lt;/p&gt;
&lt;h2&gt;
  
  
  What We’re Building
&lt;/h2&gt;
&lt;h5&gt;
  
  
  A terminal command like this:
&lt;/h5&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;node preview.js data.xlsx
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h5&gt;
  
  
  Outputs this:
&lt;/h5&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌───────┬─────────────┬────────────┐
│ Name  │ Email       │ Joined     │
├───────┼─────────────┼────────────┤
│ Alice │ alice@x.com │ 2023-09-01 │
│ Bob   │ bob@x.com   │ 2023-10-12 │
└───────┴─────────────┴────────────┘
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It will support:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CSV&lt;/li&gt;
&lt;li&gt;Excel (.xlsx)&lt;/li&gt;
&lt;li&gt;JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it’ll detect the type automatically — so you don’t need a flag.&lt;/p&gt;
&lt;h2&gt;
  
  
  What You’ll Use
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;fs (built-in)&lt;/li&gt;
&lt;li&gt;path (built-in)&lt;/li&gt;
&lt;li&gt;csv-parser&lt;/li&gt;
&lt;li&gt;xlsx&lt;/li&gt;
&lt;li&gt;cli-table3 (for formatted console output)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;
&lt;h5&gt;
  
  
  Create a folder:
&lt;/h5&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir data-previewer
cd data-previewer
npm init -y
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h5&gt;
  
  
  Install dependencies:
&lt;/h5&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install csv-parser xlsx cli-table3
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;h3&gt;
  
  
  Step 1: File Type Detection
&lt;/h3&gt;

&lt;p&gt;Create a file called &lt;code&gt;preview.js&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require('fs');
const path = require('path');

const filePath = process.argv[2];
if (!filePath) {
  console.error('Please provide a file path.');
  process.exit(1);
}

const ext = path.extname(filePath).toLowerCase();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This lets us detect whether the input file is .csv, .json or .xlsx.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Parse CSV Files
&lt;/h3&gt;

&lt;p&gt;Add this function to preview.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const csv = require('csv-parser');

function parseCSV(filePath, rowLimit = 5) {
  const results = [];
  fs.createReadStream(filePath)
    .pipe(csv())
    .on('data', (data) =&amp;gt; {
      if (results.length &amp;lt; rowLimit) results.push(data);
    })
    .on('end', () =&amp;gt; {
      renderTable(results);
    });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Parse Excel Files
&lt;/h3&gt;

&lt;p&gt;Add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const XLSX = require('xlsx');

function parseExcel(filePath, rowLimit = 5) {
  const wb = XLSX.readFile(filePath);
  const sheet = wb.Sheets[wb.SheetNames[0]];
  const json = XLSX.utils.sheet_to_json(sheet, { defval: '' });
  renderTable(json.slice(0, rowLimit));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Parse JSON Files
&lt;/h3&gt;

&lt;p&gt;Add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function parseJSON(filePath, rowLimit = 5) {
  const raw = fs.readFileSync(filePath, 'utf8');
  const data = JSON.parse(raw);
  const rows = Array.isArray(data) ? data : [data];
  renderTable(rows.slice(0, rowLimit));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Render the Table
&lt;/h3&gt;

&lt;p&gt;Add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const Table = require('cli-table3');

function renderTable(data) {
  if (!data || data.length === 0) {
    console.log('No data found');
    return;
  }

  const headers = Object.keys(data[0]);
  const table = new Table({ head: headers });

  data.forEach(row =&amp;gt; {
    table.push(headers.map(h =&amp;gt; row[h]));
  });

  console.log(table.toString());
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 6: Glue It All Together
&lt;/h3&gt;

&lt;p&gt;At the bottom of preview.js:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;switch (ext) {
  case '.csv':
    parseCSV(filePath);
    break;
  case '.xlsx':
    parseExcel(filePath);
    break;
  case '.json':
    parseJSON(filePath);
    break;
  default:
    console.error('Unsupported file type');
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;

&lt;p&gt;Drop some sample files into your folder:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;customers.csv&lt;/li&gt;
&lt;li&gt;report.xlsx&lt;/li&gt;
&lt;li&gt;data.json&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Run:
&lt;/h5&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;node preview.js customers.csv
node preview.js report.xlsx
node preview.js data.json
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Boom! You now have a simple, universal file preview tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Optional Upgrades
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Limit row count via CLI arg: node preview.js data.csv 10&lt;/li&gt;
&lt;li&gt;Highlight columns with missing values&lt;/li&gt;
&lt;li&gt;Export preview to temp file (Markdown/HTML)&lt;/li&gt;
&lt;li&gt;Add support for TSV or XML (fun challenge!)&lt;/li&gt;
&lt;/ul&gt;
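&lt;p&gt;The first upgrade is tiny. A sketch of the row-limit argument, assuming the &lt;code&gt;preview.js&lt;/code&gt; layout above (&lt;code&gt;parseRowLimit&lt;/code&gt; is a hypothetical helper name):&lt;/p&gt;

```javascript
// Hypothetical helper for the row-limit upgrade: read an optional second
// CLI argument, falling back to 5 when it is missing or not a number.
// argv layout: [node, script, file, limit]
function parseRowLimit(argv, fallback = 5) {
  const n = Number.parseInt(argv[3], 10);
  return Number.isNaN(n) ? fallback : n;
}

// Usage: node preview.js data.csv 10
// then pass parseRowLimit(process.argv) into parseCSV/parseExcel/parseJSON
```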

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This is a great project to build:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your first CLI tool&lt;/li&gt;
&lt;li&gt;Real-world file handling in Node.js&lt;/li&gt;
&lt;li&gt;Practice with CSV, Excel, and JSON formats&lt;/li&gt;
&lt;li&gt;Avoiding boilerplate GUI tools for file checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;If you found this useful, drop a like or bookmark.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Got improvements or want to extend it? Share your ideas below!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>node</category>
      <category>webdev</category>
      <category>csv</category>
    </item>
    <item>
      <title>Why I Stopped Writing “Just Another CSV Script” for Every Project</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Wed, 11 Jun 2025 19:10:45 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/why-i-stopped-writing-just-another-csv-script-for-every-project-54ia</link>
      <guid>https://dev.to/abrar_ahmed/why-i-stopped-writing-just-another-csv-script-for-every-project-54ia</guid>
      <description>&lt;p&gt;Every project starts the same way:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client sends a messy CSV file&lt;/li&gt;
&lt;li&gt;I write a quick script to clean it&lt;/li&gt;
&lt;li&gt;A week later… they send another file, slightly different&lt;/li&gt;
&lt;li&gt;I tweak the script again&lt;/li&gt;
&lt;li&gt;Repeat until I'm buried in tiny, fragile one-off scripts&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sound familiar?&lt;/p&gt;

&lt;p&gt;In the past, I treated CSV cleaning like it was a minor task—just whip up some Node.js, make the necessary fixes, and then get on with my day.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With One-Off Scripts
&lt;/h2&gt;

&lt;p&gt;One-time scripts are fast to write and easy to forget. But they come back to haunt you when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A client changes the column order or headers&lt;/li&gt;
&lt;li&gt;You forget which script handles which format&lt;/li&gt;
&lt;li&gt;Someone else needs to run it—and it only works on your machine&lt;/li&gt;
&lt;li&gt;You end up repeating the same logic across 10 files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I was solving the same problems repeatedly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normalize inconsistent column names&lt;/li&gt;
&lt;li&gt;Convert date formats&lt;/li&gt;
&lt;li&gt;Drop blank or duplicate rows&lt;/li&gt;
&lt;li&gt;Handle different encodings (UTF-8 with BOMs… hello darkness)&lt;/li&gt;
&lt;li&gt;Export the cleaned result&lt;/li&gt;
&lt;/ul&gt;
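&lt;p&gt;The BOM problem in particular is a one-liner once you know it’s there. A minimal sketch, assuming the file has already been read into a string:&lt;/p&gt;

```javascript
// Strip a UTF-8 byte-order mark so the first header doesn't come back
// as "\uFEFFname" and silently break every lookup on that column.
function stripBOM(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}
```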

&lt;p&gt;I didn’t need more scripts. I needed structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Do Now Instead
&lt;/h2&gt;

&lt;p&gt;These days, when a messy new file lands, I don’t start from scratch.&lt;/p&gt;

&lt;p&gt;I’ve settled on an approach that breaks the work into small, testable parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;input parsers (CSV, Excel, JSON)&lt;/li&gt;
&lt;li&gt;a normalization layer (headers, encodings)&lt;/li&gt;
&lt;li&gt;a transformation layer (date formatting, filters, maps)&lt;/li&gt;
&lt;li&gt;an output formatter (CSV, JSON, preview)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn’t a framework. It’s just a mindset:&lt;br&gt;
Write it once → reuse it forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: Simple Modular Cleanup in Node.js
&lt;/h2&gt;

&lt;p&gt;Instead of one giant script, I use small utilities like these:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;parser.js&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require("fs");
const csv = require("csv-parser");

function parseCSV(filePath) {
  return new Promise((resolve, reject) =&amp;gt; {
    const results = [];
    fs.createReadStream(filePath)
      .pipe(csv())
      .on("data", (row) =&amp;gt; results.push(row))
      .on("end", () =&amp;gt; resolve(results))
      .on("error", reject);
  });
}

module.exports = { parseCSV };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cleaner.js&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function cleanRows(data) {
  return data
    .filter(row =&amp;gt; Object.values(row).some(val =&amp;gt; val !== ""))
    .map(row =&amp;gt; ({
      ...row,
      date: new Date(row.date).toISOString().split("T")[0], // Normalize date
      name: row.name?.trim(), // Clean string
    }));
}

module.exports = { cleanRows };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;exporter.js&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { writeFileSync } = require("fs");

function csvEscape(value) {
  const s = String(value ?? "");
  // Quote fields containing commas, quotes, or newlines
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function exportCSV(data, path) {
  const header = Object.keys(data[0]).map(csvEscape).join(",");
  const rows = data.map(obj =&amp;gt; Object.values(obj).map(csvEscape).join(",")).join("\n");
  writeFileSync(path, `${header}\n${rows}`, "utf8");
}

module.exports = { exportCSV };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;main.js&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { parseCSV } = require("./parser");
const { cleanRows } = require("./cleaner");
const { exportCSV } = require("./exporter");

async function runCleanup() {
  const raw = await parseCSV("dirty.csv");
  const cleaned = cleanRows(raw);
  exportCSV(cleaned, "cleaned.csv");
}

runCleanup();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, whenever I receive a new file, I simply adjust my cleaner.js logic—no need to start from square one anymore.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits of Moving Away From “Just Scripts”
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Less copy-paste, more confidence&lt;/li&gt;
&lt;li&gt;Easier to onboard clients or teammates&lt;/li&gt;
&lt;li&gt;Faster debugging (you know where the logic lives)&lt;/li&gt;
&lt;li&gt;Fewer edge-case surprises&lt;/li&gt;
&lt;li&gt;Scales from a 100-row file to 1 million+ rows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now when I get a weird file with 12 columns, 3 date formats, and 2 “LOL” rows… I know my workflow can handle it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways for Devs Handling Messy Data
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Your first script should solve the problem&lt;/li&gt;
&lt;li&gt;Your second should solve the pattern&lt;/li&gt;
&lt;li&gt;Your third should become a system&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're still writing one-off scripts for every client file: no shame — we've all done it. But long term, it's pain on repeat.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you’ve already moved to a modular, testable data-cleaning setup, I’d love to hear how you approached it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cleancode</category>
      <category>automation</category>
      <category>csv</category>
      <category>node</category>
    </item>
    <item>
      <title>From Spreadsheets to Sanity: How I Automate Repetitive Data Tasks With Plain JavaScript</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Tue, 03 Jun 2025 17:44:16 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/from-spreadsheets-to-sanity-how-i-automate-repetitive-data-tasks-with-plain-javascript-1kn0</link>
      <guid>https://dev.to/abrar_ahmed/from-spreadsheets-to-sanity-how-i-automate-repetitive-data-tasks-with-plain-javascript-1kn0</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;If you've ever found yourself copying and pasting the same data across Excel tabs for the umpteenth time in a week… trust me, I get it.&lt;br&gt;
At some point, those spreadsheets that were supposed to make our lives easier start feeling more like an unpaid internship.&lt;/p&gt;

&lt;p&gt;In this post, I’m eager to show you how I escaped the spreadsheet cycle and automated those repetitive data cleanup tasks with plain JavaScript—no frameworks or fancy libraries involved. It’s all about a bit of logic and some Node.js!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Clients would send me data like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Huge CSV files with inconsistent column names (Full Name, full_name, name_full)&lt;/li&gt;
&lt;li&gt;Mixed date formats (DD-MM-YYYY, YYYY/MM/DD)&lt;/li&gt;
&lt;li&gt;Duplicates and empty rows&lt;/li&gt;
&lt;li&gt;Repetitive filtering tasks (like removing inactive users)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I kept doing the same things in Excel until I decided: enough. Let’s script it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Read the File
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;For CSVs, I used csv-parse:
const fs = require('fs');
const parse = require('csv-parse');

fs.createReadStream('input.csv')
  .pipe(parse({ columns: true }))
  .on('data', (row) =&amp;gt; {
    // handle row
  });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Excel files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const XLSX = require('xlsx');
const workbook = XLSX.readFile('data.xlsx');
const sheet = workbook.Sheets[workbook.SheetNames[0]];
const json = XLSX.utils.sheet_to_json(sheet);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Clean the Data
&lt;/h3&gt;

&lt;p&gt;Normalize headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function normalizeHeaders(row) {
  const normalized = {};
  for (let key in row) {
    const newKey = key.trim().toLowerCase().replace(/\s+/g, '_');
    normalized[newKey] = row[key];
  }
  return normalized;
}
data = data.map(normalizeHeaders);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Remove blank rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = data.filter(row =&amp;gt; Object.values(row).some(val =&amp;gt; val !== ''));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
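&lt;p&gt;Duplicate rows, mentioned earlier, can be dropped the same way. A sketch that keys each row on its JSON form (&lt;code&gt;dedupeRows&lt;/code&gt; is a hypothetical helper, and this only catches exact duplicates):&lt;/p&gt;

```javascript
// Drop exact-duplicate rows by keying each row on its serialized form.
// Note: key order matters in JSON.stringify, so rows must share a shape.
function dedupeRows(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```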



&lt;p&gt;Format dates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function formatDate(dateStr) {
  const date = new Date(dateStr);
  return date.toISOString().split('T')[0]; // yyyy-mm-dd
}
data = data.map(row =&amp;gt; ({
  ...row,
  joined_date: formatDate(row.joined_date)
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Export the Cleaned Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { writeFileSync } = require('fs');
const { stringify } = require('csv-stringify/sync');

const output = stringify(data, { header: true });
writeFileSync('cleaned.csv', output);

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Boom — reusable cleanup in under 5 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Plain JavaScript is enough for most data cleanup tasks.&lt;/li&gt;
&lt;li&gt;csv-parse + csv-stringify make CSV parsing easy.&lt;/li&gt;
&lt;li&gt;Write a cleanup script once and you never do it manually again.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ditch repetitive Excel formulas.&lt;/li&gt;
&lt;li&gt;Read CSV/Excel in JS.&lt;/li&gt;
&lt;li&gt;Normalize headers, clean rows, convert formats.&lt;/li&gt;
&lt;li&gt;Export back out — all automated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me know if you've built similar automations or want to share some CSV horror stories&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>productivity</category>
      <category>data</category>
      <category>csv</category>
    </item>
    <item>
      <title>What I Learned Cleaning 1 Million Rows of CSV Data Without Pandas</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Sat, 31 May 2025 12:17:36 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/what-i-learned-cleaning-1-million-rows-of-csv-data-without-pandas-1a01</link>
      <guid>https://dev.to/abrar_ahmed/what-i-learned-cleaning-1-million-rows-of-csv-data-without-pandas-1a01</guid>
      <description>&lt;p&gt;Cleaning a small CSV? Pandas is perfect.&lt;br&gt;
Cleaning up a million rows on a limited machine or using a serverless function? That's when Pandas really struggles.&lt;/p&gt;

&lt;p&gt;That’s exactly the problem I faced.&lt;/p&gt;

&lt;p&gt;In this post, I’ll share:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Why I avoided Pandas&lt;/li&gt;
&lt;li&gt;My Node.js pipeline with csv-parser&lt;/li&gt;
&lt;li&gt;How I handled common data issues: dates, phone numbers, missing fields&lt;/li&gt;
&lt;li&gt;What I’d do differently next time&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;
  
  
  Let’s dive in.
&lt;/h5&gt;
&lt;h2&gt;
  
  
  Why Not Pandas?
&lt;/h2&gt;

&lt;p&gt;Pandas is fantastic, but it does have a downside: it loads the entire file into memory. If you're working with files larger than about 500MB, you might run into some issues.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory errors&lt;/li&gt;
&lt;li&gt;Slow performance&lt;/li&gt;
&lt;li&gt;Crashes in limited environments (e.g., cloud functions, small servers)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my case, I had:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 million+ rows&lt;/li&gt;
&lt;li&gt;Dirty data from multiple sources&lt;/li&gt;
&lt;li&gt;A need to stream and clean data row by row&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  My Setup: Streaming CSV Processing in Node.js
&lt;/h2&gt;

&lt;p&gt;Here’s the core pipeline using csv-parser and Node streams:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require('fs');
const csv = require('csv-parser');

let rowCount = 0;
let errorCount = 0;

fs.createReadStream('bigfile.csv')
  .pipe(csv())
  .on('data', (row) =&amp;gt; {
    rowCount++;

    // Clean data
    row.email = cleanEmail(row.email);
    row.phone = cleanPhone(row.phone);
    row.date = parseDate(row.date);

    // Validate required fields
    if (!row.email || !row.date) {
      errorCount++;
      logError(row);
      return;
    }

    // Save row to DB, another file, or API...
  })
  .on('end', () =&amp;gt; {
    console.log(`✅ Processed ${rowCount} rows`);
    console.log(`⚠️  Found ${errorCount} bad rows`);
  });

function cleanEmail(email) {
  return email?.trim().toLowerCase() || null;
}

function cleanPhone(phone) {
  const digits = phone?.replace(/\D/g, '');
  return digits?.length === 10 ? digits : null;
}

function parseDate(date) {
  const parsed = Date.parse(date);
  return isNaN(parsed) ? null : new Date(parsed).toISOString();
}

function logError(row) {
  fs.appendFileSync('errors.log', JSON.stringify(row) + '\n');
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Data Issues I Ran Into (and How I Fixed Them)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Inconsistent date formats (MM-DD-YYYY vs DD/MM/YYYY) → Used &lt;code&gt;Date.parse()&lt;/code&gt; + fallback logic.&lt;/li&gt;
&lt;li&gt;Phone numbers in weird formats → Removed non-digits, validated length&lt;/li&gt;
&lt;li&gt;Missing fields → Set defaults or marked as null&lt;/li&gt;
&lt;li&gt;Extra columns → Stripped to schema fields&lt;/li&gt;
&lt;li&gt;Encoding problems → Saved CSVs as UTF-8&lt;/li&gt;
&lt;/ul&gt;
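&lt;p&gt;The date fallback mentioned above can be as small as one regex. A sketch that tries &lt;code&gt;Date.parse()&lt;/code&gt; first and assumes day-first for slash-separated dates (&lt;code&gt;parseDateLoose&lt;/code&gt; is a hypothetical name; truly ambiguous dates like 05/06/2025 still need a policy decision):&lt;/p&gt;

```javascript
// Try the engine's parser first; fall back to DD/MM/YYYY, which
// Date.parse reads as MM/DD/YYYY (and rejects when the "month" is over 12).
function parseDateLoose(s) {
  const t = Date.parse(s);
  if (!isNaN(t)) return new Date(t).toISOString();
  const m = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(s);
  if (m) {
    return new Date(Date.UTC(Number(m[3]), Number(m[2]) - 1, Number(m[1]))).toISOString();
  }
  return null;
}
```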

&lt;h2&gt;
  
  
  Pro Tips for Large CSV Cleaning
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Stream, don’t load → Avoid memory issues by processing row by row&lt;/li&gt;
&lt;li&gt;Validate early → Catch bad data before it pollutes your system&lt;/li&gt;
&lt;li&gt;Log errors → Keep a separate file of rejected rows for review&lt;/li&gt;
&lt;li&gt;Test on a small sample → Always test your logic before full-scale runs&lt;/li&gt;
&lt;li&gt;Handle edge cases → Empty cells, extra commas, inconsistent headers—these will happen!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I’d Do Differently Next Time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use a schema definition (like JSON Schema or Zod) to validate and transform rows automatically&lt;/li&gt;
&lt;li&gt;Build a mapping layer for multi-source CSVs (e.g., different column names)&lt;/li&gt;
&lt;li&gt;Consider tools like DuckDB or Polars if I need more advanced queries&lt;/li&gt;
&lt;/ul&gt;
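&lt;p&gt;On the schema idea: even without pulling in JSON Schema or Zod, a hand-rolled version captures the gist: declare per-field checks once, then validate every row against them. A sketch (the names here are illustrative, not a real library API):&lt;/p&gt;

```javascript
// Hand-rolled sketch of row validation: one check per required field.
const schema = {
  email: (v) => typeof v === "string" ? v.includes("@") : false,
  date: (v) => !isNaN(Date.parse(v)),
};

function validateRow(row, schema) {
  return Object.entries(schema).every(([field, check]) => check(row[field]));
}
```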

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Handling big data files involves more than just coding; it’s about crafting durable pipelines that can navigate the complexities and messiness of real-world scenarios.&lt;/p&gt;

&lt;p&gt;If you’re working with CSVs, remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Validate early&lt;/li&gt;
&lt;li&gt;Clean thoughtfully&lt;/li&gt;
&lt;li&gt;Log everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And when in doubt, stream it, don’t load it all at once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have you ever tackled the challenge of cleaning up a huge dataset? What tools or tips have you found to be the most helpful? I’d love to hear your thoughts!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>csv</category>
      <category>node</category>
      <category>backend</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>How to Handle CSV, Excel, and JSON Uploads in Node.js (Without Losing Your Mind)</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Fri, 30 May 2025 11:29:45 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/how-to-handle-csv-excel-and-json-uploads-in-nodejs-without-losing-your-mind-547d</link>
      <guid>https://dev.to/abrar_ahmed/how-to-handle-csv-excel-and-json-uploads-in-nodejs-without-losing-your-mind-547d</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Ever found yourself trying to build a feature for users to upload files, and suddenly you're knee-deep in weird CSV quirks, Excel formats, and complex JSON structures?&lt;br&gt;
I’ve been there too. Here’s a simple guide to help you handle file uploads in Node.js without losing your mind.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step 1: Accept File Uploads (With Multer)
&lt;/h2&gt;

&lt;p&gt;First, use &lt;code&gt;multer&lt;/code&gt; to accept file uploads in Express:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install multer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Basic setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const express = require('express');
const multer = require('multer');
const upload = multer({ dest: 'uploads/' });
const app = express();

app.post('/upload', upload.single('file'), (req, res) =&amp;gt; {
  console.log(req.file);
  res.send('File uploaded!');
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Parse Different File Types
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. CSV Files
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;csv-parser&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install csv-parser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require('fs');
const csv = require('csv-parser');

fs.createReadStream('uploads/file.csv')
  .pipe(csv())
  .on('data', (row) =&amp;gt; {
    console.log(row);
  });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Excel Files
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;xlsx&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install xlsx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const xlsx = require('xlsx');

const workbook = xlsx.readFile('uploads/file.xlsx');
const sheetName = workbook.SheetNames[0];
const data = xlsx.utils.sheet_to_json(workbook.Sheets[sheetName]);

console.log(data);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. JSON Files
&lt;/h3&gt;

&lt;p&gt;Simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require('fs');

const data = JSON.parse(fs.readFileSync('uploads/file.json', 'utf8'));
console.log(data);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Handle Common Data Issues
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Normalize date formats (e.g., with dayjs)&lt;/li&gt;
&lt;li&gt;Remove empty rows&lt;/li&gt;
&lt;li&gt;Deduplicate entries&lt;/li&gt;
&lt;li&gt;Validate column headers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example (normalize phone numbers):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function cleanPhoneNumber(num) {
  return num.replace(/\D/g, '');
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
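&lt;p&gt;The header-validation item can be sketched as a fail-fast check on the first parsed row — the required column names below are placeholders for whatever your upload actually needs:&lt;/p&gt;

```javascript
// Sketch: fail fast when an upload is missing required columns.
// The required column names are placeholders for illustration.
function validateHeaders(row, required) {
  const present = Object.keys(row).map((k) => k.trim().toLowerCase());
  const missing = required.filter((col) => !present.includes(col));
  if (missing.length) {
    throw new Error(`Missing required columns: ${missing.join(', ')}`);
  }
}

// The first parsed row stands in for the header check;
// " Email " passes because keys are trimmed and lowercased.
validateHeaders({ Name: 'Ann', ' Email ': 'a@x.com' }, ['name', 'email']);
```

&lt;p&gt;Throwing here, before any cleaning runs, means a bad file is rejected with a readable message instead of silently producing half-empty rows.&lt;/p&gt;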



&lt;h2&gt;
  
  
  Step 4: Structure Your Code
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Create a separate module for each file type&lt;/li&gt;
&lt;li&gt;Keep upload logic separate from parsing&lt;/li&gt;
&lt;li&gt;Log errors clearly&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Handling messy files is something you’ll encounter while creating real-world apps. But don’t worry! With the &lt;strong&gt;right tools&lt;/strong&gt;, you can easily work with CSV, Excel, and JSON files without losing your mind.&lt;/p&gt;

&lt;p&gt;Got your own tips or tools? Drop them in the comments!&lt;/p&gt;

</description>
      <category>node</category>
      <category>csv</category>
      <category>json</category>
      <category>data</category>
    </item>
    <item>
      <title>#csv</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Thu, 29 May 2025 11:53:06 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/csv-15g2</link>
      <guid>https://dev.to/abrar_ahmed/csv-15g2</guid>
      <description></description>
      <category>csv</category>
      <category>data</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Handle Big Data Transformations Without Pandas (and My Favorite Workarounds)</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Thu, 29 May 2025 11:16:55 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/how-to-handle-big-data-transformations-without-pandas-and-my-favorite-workarounds-3bln</link>
      <guid>https://dev.to/abrar_ahmed/how-to-handle-big-data-transformations-without-pandas-and-my-favorite-workarounds-3bln</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Are you having a tough time dealing with massive CSVs, Excel files, or JSON data that Pandas just can’t seem to manage? Let me share how I tackle huge datasets using Spark, along with some tools I'm checking out to simplify big data machine learning.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why Handling Big Data is Hard
&lt;/h2&gt;

&lt;p&gt;When it comes to handling large datasets — like those with millions of rows and gigabytes of files — you’ve probably experienced this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pandas crashes with an out-of-memory error&lt;/li&gt;
&lt;li&gt;Scikit-learn slows to a crawl&lt;/li&gt;
&lt;li&gt;Even simple &lt;code&gt;.fillna()&lt;/code&gt; or &lt;code&gt;.transpose()&lt;/code&gt; calls become impractical&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my project, I made the choice to move away from Pandas completely. Now, I’m relying on Apache Spark for distributed data processing. But keep in mind, Spark has its own set of limitations as well.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No built-in &lt;code&gt;pct_change()&lt;/code&gt; for percentage differences&lt;/li&gt;
&lt;li&gt;No &lt;code&gt;.transpose()&lt;/code&gt; for wide tables&lt;/li&gt;
&lt;li&gt;Complex data cleaning often requires custom UDFs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I began my search for smarter ways to tackle big data transformations, and here’s what I’ve discovered.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. How to Calculate pct_change() in Spark
&lt;/h3&gt;

&lt;p&gt;Pandas makes it easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['pct_change'] = df['value'].pct_change()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But in Spark, you have to use window functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pyspark.sql import Window
from pyspark.sql.functions import col, lag

window = Window.partitionBy("group").orderBy("timestamp")
df = df.withColumn("prev_value", lag("value").over(window))
df = df.withColumn("pct_change", (col("value") - col("prev_value")) / col("prev_value"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the standard workaround for percentage change in Spark.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Transposing a DataFrame in Spark
&lt;/h3&gt;

&lt;p&gt;Pandas has &lt;code&gt;.T&lt;/code&gt; for transposing data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df.T
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In PySpark, you’ll need to pivot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pivoted = df.groupBy("id").pivot("column_name").agg(first("value"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This can help reshape wide datasets in Spark.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Efficiently Fill Nulls in Big Data
&lt;/h3&gt;

&lt;p&gt;Missing values are a common challenge in big data pipelines. Here’s a fast way to fill nulls in Spark:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df = df.fillna({"age": 0, "name": "Unknown"})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For all numeric columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;numeric_cols = [f.name for f in df.schema.fields if f.dataType.simpleString() == 'int']
df = df.fillna(0, subset=numeric_cols)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean your data before feeding it into big data machine learning models.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Performance Tips for Big Data Pipelines
&lt;/h3&gt;

&lt;p&gt;If you’re working with large datasets in Spark, keep these in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To keep things efficient, try to minimize shuffles—operations like groupBy, repartition, and joins can really slow things down.&lt;/li&gt;
&lt;li&gt;Start filtering early to cut down on the amount of data you're working with.&lt;/li&gt;
&lt;li&gt;Whenever you can, steer clear of UDFs and stick to Spark’s built-in functions instead.&lt;/li&gt;
&lt;li&gt;And don’t forget to sample your data for testing before you scale up!&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Tools That Can Help With Big Data Processing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Dask, which offers a parallel API similar to Pandas for big data tasks.&lt;/li&gt;
&lt;li&gt;Polars, a lightning-fast DataFrame library built with Rust.&lt;/li&gt;
&lt;li&gt;DuckDB, perfect for running SQL analytics on local files, no matter how large they are.&lt;/li&gt;
&lt;li&gt;Custom APIs that let you offload transformations into services for added flexibility.&lt;/li&gt;
&lt;li&gt;I’m also diving into creating data cleaning APIs that can take raw files and transform them into clean, ready-to-use data—this could really revolutionize big data machine learning workflows!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Let’s Share Solutions: How Do You Handle Big Data?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What tools have come to your rescue when Pandas just didn’t cut it?&lt;/li&gt;
&lt;li&gt;Do you have any go-to tips for handling common transformations like pct_change in large datasets?&lt;/li&gt;
&lt;li&gt;Have you discovered any alternatives to Spark for cleaning data on a larger scale?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop your thoughts below — let’s build a resource for devs dealing with big data transformation challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  Big Data Cheatsheet for Developers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Pandas&lt;/th&gt;
&lt;th&gt;Spark/PySpark Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Percentage change (&lt;code&gt;pct_change&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.pct_change()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;lag()&lt;/code&gt; + window functions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transpose&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.T&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pivot()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fill nulls&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.fillna()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;fillna()&lt;/code&gt; with dict or subset&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rolling calculations&lt;/td&gt;
&lt;td&gt;&lt;code&gt;df.rolling()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;UDFs or window functions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Handle massive files&lt;/td&gt;
&lt;td&gt;Pandas&lt;/td&gt;
&lt;td&gt;Dask, Polars, Spark, DuckDB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If this post helped you, feel free to bookmark it or share it with someone working with large datasets.&lt;/p&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>bigdata</category>
      <category>dataengineering</category>
      <category>node</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>csv</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Mon, 26 May 2025 13:26:49 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/csv-4p0g</link>
      <guid>https://dev.to/abrar_ahmed/csv-4p0g</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896" class="crayons-story__hidden-navigation-link"&gt;How to Clean Messy CSV, Excel, and JSON Files in Node.js (Without Pandas)&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/abrar_ahmed" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3196900%2Fd078a851-1e2d-41db-a859-2d9a0323a4d6.png" alt="abrar_ahmed profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/abrar_ahmed" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Abrar ahmed
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Abrar ahmed
                
              
              &lt;div id="story-author-preview-content-2526036" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/abrar_ahmed" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3196900%2Fd078a851-1e2d-41db-a859-2d9a0323a4d6.png" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Abrar ahmed&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;May 26 '25&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896" id="article-link-2526036"&gt;
          How to Clean Messy CSV, Excel, and JSON Files in Node.js (Without Pandas)
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/javascript"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;javascript&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/node"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;node&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/webdev"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;webdev&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/data"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;data&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>javascript</category>
      <category>node</category>
      <category>webdev</category>
      <category>data</category>
    </item>
    <item>
      <title>How to Clean Messy CSV, Excel, and JSON Files in Node.js (Without Pandas)</title>
      <dc:creator>Abrar ahmed</dc:creator>
      <pubDate>Mon, 26 May 2025 13:22:33 +0000</pubDate>
      <link>https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896</link>
      <guid>https://dev.to/abrar_ahmed/how-to-clean-messy-csv-excel-and-json-files-in-nodejs-without-pandas-3896</guid>
      <description>&lt;p&gt;If you're diving into building a Node.js app that deals with CSV or Excel file uploads, you've probably faced the challenge of messy data. Let me share how I tackle issues like broken headers, odd date formats, and duplicates — all using plain JavaScript.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Messy Files Are a Hidden Time Sink for Developers
&lt;/h2&gt;

&lt;p&gt;Uploading files is easy.&lt;br&gt;&lt;br&gt;
&lt;em&gt;Handling them correctly? Not so much.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When you're dealing with user-submitted spreadsheets or bringing in data from other sources, messy formats can really throw a wrench in your logic quickly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"Name"&lt;/code&gt; vs &lt;code&gt;"name"&lt;/code&gt; vs &lt;code&gt;"Full Name"&lt;/code&gt; headers
&lt;/li&gt;
&lt;li&gt;Rows with empty or null values
&lt;/li&gt;
&lt;li&gt;Inconsistent date formats (&lt;code&gt;MM/DD/YYYY&lt;/code&gt;, &lt;code&gt;DD-MM-YYYY&lt;/code&gt;, &lt;code&gt;2024/05/25&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;Duplicates that quietly pass through validations
&lt;/li&gt;
&lt;li&gt;Extra white spaces, strange encodings, BOM issues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you've ever found yourself sifting through a file and getting lost in a maze of strange edge cases — trust me, you're definitely not the only one.&lt;/p&gt;


&lt;h2&gt;
  
  
  How to Clean Messy Data Files in Node.js (Step-by-Step)
&lt;/h2&gt;

&lt;p&gt;Here’s how I handle messy CSVs in &lt;strong&gt;Node.js&lt;/strong&gt;, without switching languages or installing a full data science stack.&lt;/p&gt;

&lt;p&gt;You’ll learn how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parse CSV files
&lt;/li&gt;
&lt;li&gt;Normalize headers
&lt;/li&gt;
&lt;li&gt;Clean empty/null values
&lt;/li&gt;
&lt;li&gt;Standardize date formats
&lt;/li&gt;
&lt;li&gt;Deduplicate entries
&lt;/li&gt;
&lt;li&gt;Combine it all in a reusable utility&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Parse the CSV File in Node.js
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const fs = require('fs');
const csv = require('csv-parser');

function parseCSV(path) {
  return new Promise((resolve, reject) =&amp;gt; {
    const rows = [];
    fs.createReadStream(path)
      .pipe(csv())
      .on('data', (row) =&amp;gt; rows.push(row))
      .on('end', () =&amp;gt; resolve(rows))
      .on('error', reject);
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;This reads the file and returns rows as JavaScript objects.&lt;/p&gt;


&lt;h2&gt;
  
  
  Normalize Column Headers
&lt;/h2&gt;

&lt;p&gt;Messy headers can ruin your logic. Standardize them early:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function normalizeHeaders(row) {
  const cleanRow = {};
  for (const key in row) {
    const newKey = key.trim().toLowerCase().replace(/\s+/g, '_');
    const value = row[key];
    cleanRow[newKey] = typeof value === 'string' ? value.trim() : value;
  }
  return cleanRow;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;"Full Name"&lt;/code&gt; becomes &lt;code&gt;full_name&lt;/code&gt;, &lt;code&gt;" age "&lt;/code&gt; becomes &lt;code&gt;age&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Remove Null or Empty Fields
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function cleanRow(row) {
  const cleaned = {};
  for (const key in row) {
    const val = row[key];
    if (val !== null &amp;amp;&amp;amp; val !== '' &amp;amp;&amp;amp; val !== undefined) {
      cleaned[key] = val;
    }
  }
  return Object.keys(cleaned).length ? cleaned : null;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps your dataset compact and safe for further processing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Standardize Date Formats with &lt;code&gt;dayjs&lt;/code&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm install dayjs

Then use it:

const dayjs = require('dayjs');

function fixDate(value) {
  const parsed = dayjs(value, ['MM-DD-YYYY', 'DD/MM/YYYY', 'YYYY-MM-DD'], true);
  return parsed.isValid() ? parsed.toISOString() : value;
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All dates become ISO strings, e.g. &lt;code&gt;2024-05-25T00:00:00.000Z&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deduplicate Rows in JavaScript
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;function deduplicate(data) {
  const seen = new Set();
  return data.filter(row =&amp;gt; {
    const key = JSON.stringify(row);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Great for catching repeated entries that cause DB constraints or logic errors.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Function: Clean a File from Start to Finish
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;async function cleanFile(path) {
  const raw = await parseCSV(path);
  const cleaned = raw
    .map(normalizeHeaders)
    .map(cleanRow)
    .filter(Boolean);
  return deduplicate(cleaned);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You now get a nice array of consistent, clean objects.&lt;br&gt;
And best of all — no Excel wrangling, no Python, no drama.&lt;/p&gt;




&lt;h2&gt;
  
  
  Here are some bonus tips for tackling real projects:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Make sure to use the &lt;strong&gt;xlsx&lt;/strong&gt; package for seamless Excel support.&lt;/li&gt;
&lt;li&gt;Implement logging for any rows that get skipped; this is super helpful for debugging or reporting purposes.&lt;/li&gt;
&lt;li&gt;Validate that all necessary columns are present right from the start to prevent any silent data loss.&lt;/li&gt;
&lt;li&gt;When dealing with large files, stream them to sidestep memory issues in a production environment.&lt;/li&gt;
&lt;/ul&gt;
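&lt;p&gt;The skipped-row logging tip can be sketched as a variant of the cleaning step above that records what it drops and why — the row shapes here are illustrative:&lt;/p&gt;

```javascript
// Sketch: keep a log of rows dropped during cleaning so they can be
// reported back to the user. Row shapes here are illustrative.
function cleanWithLog(rows) {
  const skipped = [];
  const cleaned = rows.filter((row, index) => {
    const empty = Object.values(row).every((v) => v === '' || v == null);
    if (empty) {
      skipped.push({ index, reason: 'empty row' }); // remember why it was dropped
      return false;
    }
    return true;
  });
  return { cleaned, skipped };
}

const { cleaned, skipped } = cleanWithLog([
  { name: 'Ann' },
  { name: '', email: null },
]);
console.log(skipped); // [ { index: 1, reason: 'empty row' } ]
```

&lt;p&gt;The same pattern extends to other reasons (bad dates, missing columns), and the &lt;code&gt;skipped&lt;/code&gt; array is exactly what you’d surface in an upload preview.&lt;/p&gt;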




&lt;h2&gt;
  
  
  In the Next Post…
&lt;/h2&gt;

&lt;p&gt;I’ll share how I deal with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cleaning Excel &lt;strong&gt;(.xlsx)&lt;/strong&gt; files in &lt;strong&gt;Node.js&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Letting users upload files and preview what’s broken&lt;/li&gt;
&lt;li&gt;How I almost built a Pandas-style library in JavaScript (and why I didn’t)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Join the Conversation
&lt;/h2&gt;

&lt;p&gt;Have you faced any of this?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How do you handle user-uploaded files? Share your secret!&lt;/li&gt;
&lt;li&gt;Do you create unique utilities for every project, or reuse the same one?&lt;/li&gt;
&lt;li&gt;Got a CSV horror story?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Drop a comment — I’m really interested in hearing what everyone else is up to. We all deal with messy data, so let’s swap stories about our struggles and successes!&lt;/p&gt;




&lt;h2&gt;
  
  
  Thanks for reading!
&lt;/h2&gt;

&lt;p&gt;If you found this helpful, don’t hesitate to bookmark it or share it with others! Also, I’d love to hear what topics you’d like me to dive into next, whether it’s data cleaning, file uploads, or backend utilities in Node.js.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR: A Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Code Used&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Parse CSV&lt;/td&gt;
&lt;td&gt;&lt;code&gt;csv-parser&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fast and reliable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Normalize headers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;trim + lowercase&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Avoid mismatch bugs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remove nulls/blanks&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.filter(Boolean)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Keeps data usable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fix inconsistent dates&lt;/td&gt;
&lt;td&gt;&lt;code&gt;dayjs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ISO-standard conversion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deduplicate rows&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Set + JSON.stringify&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Prevents duplicate records&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>javascript</category>
      <category>node</category>
      <category>webdev</category>
      <category>data</category>
    </item>
  </channel>
</rss>
