DEV Community

Cover image for Optimize multiple INSERTs (for Oracle)
Michel
Michel

Posted on • Originally published at blog.pagesd.info

Optimize multiple INSERTs (for Oracle)

In order to check the integrity of some data, I had to copy the contents of a large customer table from a DB2 database into an Oracle database. I know SQL*Loader exists, but this is not the time I'm gonna learn how to use it.

One INSERT command for each row to process

To begin with, I made tests with a subset of the data and tried to make it as simple as possible. So simple that I generated and executed one INSERT query per row to be inserted.

INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_1', 'Two_1', 'Three_1');
then
INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_2', 'Two_2', 'Three_2');
then
INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_3', 'Two_3', 'Three_3');
etc...
Enter fullscreen mode Exit fullscreen mode

=> 12.07 seconds to insert 4579 lines (i.e. 379 requests per second).

INSERT commands inside one BEGIN / END

Rather than making a connection.Execute() for each line, I grouped the INSERT commands into a BEGIN ... INSERT expression (a bit like a stored procedure) so as to hit the server for only one connection.Execute().

BEGIN
  INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_1', 'Two_1', 'Three_1');
  INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_2', 'Two_2', 'Three_2');
  INSERT INTO MyTable (Column1, Column2, Column3) VALUES ('One_3', 'Two_3', 'Three_3');
  etc...
END;
Enter fullscreen mode Exit fullscreen mode

=> 6.37 seconds to insert 4579 lines => already twice as fast.

Note: with Dapper (and ADO.NET in general I suppose), Oracle doesn't support line breaks in the SQL command: I get an error @@PLS-00103: Encountered the symbol "" when expecting one of the following@.

To avoid this problem, just don't add a line break when generating the big query :

BEGIN INSERT INTO MyTable (...) VALUES (...); INTO MyTable (...) VALUES (...); ...; END;
Enter fullscreen mode Exit fullscreen mode

An INSERT command from several SELECTs

This time, instead of making 1 access to the server to still make several INSERTs, I optimized even more by sending a single INSERT request. To do this, I turned "INSERT ... VALUES ..." into "INSERT ... SELECT FROM ..." :)

INSERT INTO MyTable (Column1, Column2, Column3)
SELECT ('One_1', 'Two_1', 'Three_1') FROM DUAL
UNION SELECT ('One_2', 'Two_2', 'Three_2') FROM DUAL
UNION SELECT ('One_3', 'Two_3', 'Three_3') FROM DUAL
etc...
Enter fullscreen mode Exit fullscreen mode

=> 2.84 seconds to insert 4579 lines (i.e. 1612 requests per second) => 4 times faster.

Test results

It was useful to run a few tests, since it allows to go from 379 inserts in one second to more than 1600! And as I have nearly 300,000 lines to process, it will take 3 minutes and not 15 minutes.

Implementation

Actually, I copy the entire data with block of 500 customers to avoid overloading memory:

public int SaveCustomers(IEnumerable<Customer> customers)
{
    var sql = @"UNION SELECT '{0}', '{1}', '{2}', '{3}', '{4}' FROM DUAL ";

    int count = 0;
    try
    {
        connexion.Open();
        var batch = new StringBuilder();
        foreach (var c in customers)
        {
            batch.Append(string.Format(sql, c.Agency, c.Code, c.Siret, c.Type));
            count++;
            if ((count % 500) == 0)
            {
                connexion.Execute(Sql_FromSelect(batch));
                batch = new StringBuilder();
            }
        }
        connexion.Execute(Sql_FromSelect(batch));
    }
    catch (Exception ex)
    {
        throw ex;
    }
    finally
    {
        connexion.Close();
    }

    return count;
}

private string Sql_FromSelect(StringBuilder batch)
{
    // Au départ :
    // "UNION SELECT '...', '...', '...', '...', '...' FROM DUAL UNION SELECT ... "

    batch.Remove(0, 5);
    // => " SELECT '...', '...', '...', '...', '...' FROM DUAL UNION SELECT ... "

    var start = @"BEGIN INSERT INTO Customers (Agency, Code, Siret, Type) ";
    batch.Insert(0, start);
    // => "BEGIN INSERT INTO Customers (...) SELECT '...', '...', ... "

    batch.Append("; END;");
    // => "BEGIN INSERT INTO Customers (...) SELECT ... FROM DUAL; END;"

    return batch.ToString();
}
Enter fullscreen mode Exit fullscreen mode

This post was originally published on blog.pagesd.info.
Cover image : Rusty old truck in a pumpkin patch - Priscilla Du Preez.

Latest comments (4)

Collapse
 
mohan734 profile image
Mohan

Good One. it really helped to boost performance of an INSERT operation. is there any way for UPDATE as well similar to it?

Collapse
 
michelc profile image
Michel

Yes, it works for everything. The BEGIN / END block can contain all kinds of queries:

BEGIN
  UPDATE MyTable SET Column4 = Column4 + 1 WHERE Column1 = 'One_1';
  UPDATE MyTable SET Column4 = Column4 + 2 WHERE Column1 = 'One_2';
  UPDATE MyTable SET Column4 = Column4 + 3 WHERE Column1 = 'One_3';
  etc...
END;
Enter fullscreen mode Exit fullscreen mode
Collapse
 
yyvess profile image
yyvess • Edited

Good trick but why not simply use jdbc batch update?
It should must faster and easily to use.
Note that in general you should commit each 1'000 - 10'000 otherwise performance decrease

docs.oracle.com/cd/E11882_01/java....

Collapse
 
michelc profile image
Michel

Indeed :)
But it's C# code and I used what I know best: ADO.NET and Dapper.
Otherwise, it's a translation of an old post. At this time, the batch update had to go through DataAdapter that I didn't use.