DEV Community

loading...
Cover image for How to improve database performance for insert queries

How to improve database performance for insert queries

vinamrasareen profile image Vinamra Sareen ・2 min read

Hey all, this is my first blog, I have recently started working as a full stack web developer at Adda52Rummy and I was assigned with a task to insert some records in database, which I thought I handled appropriately. But my manager gave me csv file of 5K records to insert to database. It was synchronous code and took some time.
image

About tech Stack, I was working in Node.js with MariaDB and Fastify framework (performs better than express and other sever frameworks or library). I used Sequelize as ORM library, it is a promise-based Node.js ORM for Postgres, MySQL, MariaDB, SQLite and Microsoft SQL Server. To improve the performance of my operations, I needed to run insert operations in multiple batches and those batches had to run in parallel. Aha! Sound nice, but to how to implement it. After spending ample time of searching I found some solution on how to do it.
Thanks to Evan Siroky.

The solution was to use bulkCreate to insert multiple batches and use asnyc.cargo function to run batches in parallel. After using this I was able to insert data in 2 seconds and guess what, I was able to insert 100K records in 5 ~ 7 seconds.

Let's jump straight to code:

In my controllers, coupon.js

const Coupon = require("../models/coupon");

// exports.createCoupon = async (row) => {
//   await Coupon.sync();
//   return Coupon.create({
//     coupon_code: row.coupon_code,
//   });
// };

// Here I am using bulkCreate
exports.createCoupon = (tasks, inserterCb) => {
  Coupon.bulkCreate(tasks).then(function () {
    inserterCb();
  });
};

Enter fullscreen mode Exit fullscreen mode

I have created a utility folder, where I have created a csv-to-db.js_ file:

const fs = require("fs");
const path = require("path");
const async = require("async");
const csv = require("csv");

function csvToDb(filename, fn) {
  let input = fs.createReadStream(
    path.resolve(__dirname, "../public/assets", filename)
  );
  let parser = csv.parse({
    columns: true,
    relax: true,
  });

  let inserter = async.cargo(function (tasks, inserterCb) {
    fn(tasks, inserterCb);
  }, 1000 // -> task size); 

/* you can specify how many task you want to handle, I have set it to 
1000 tasks at a time parallel, I can say for 100K let's handle 
50K - 50K to take much less time, but it doesn't work that way,
 it will take maybe two or more seconds to execute input of size 50K,
 as it is large input. So take a size which execute faster but handles
 many task as possible. */

  parser.on("readable", function () {
    while ((line = parser.read())) {
      inserter.push(line);
    }
  });

  parser.on("error", function (err) {
    throw err;
  });

  parser.on("end", function (count) {
    inserter.drain();
  });

  input.pipe(parser);
}

module.exports = csvToDb;

Enter fullscreen mode Exit fullscreen mode

Now, when I call this route 'http::/localhost:3000/coupon', this is the code that gets executed.

const Coupon = require("../controllers/coupon");

module.exports = function routes(fastify, options, done) {
  fastify.get("/coupon", async (req, res) => {
    csvToDb("filename.csv", Coupon.createCoupon);
    return { statusCode: 200, message: "sucess" };
  });

  done();
}

Enter fullscreen mode Exit fullscreen mode

Thanks to read out.

All references:

Discussion (0)

pic
Editor guide