I'm using Python with MySQL Connector. I have nearly 67 million entries (14 GB) in a table. When I do a bulk insert of 2K rows at a time, it takes very long:
Inserted 2000 rows in 23 Seconds
Inserted 2000 rows in 25 Seconds
Inserted 2000 rows in 29 Seconds
Inserted 2000 rows in 28 Seconds
For another table (with less data), the insertion speed is fine (2-4 seconds).
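For reference, each 2K batch is inserted roughly like the sketch below. The table name, column names, and connection details are placeholders, not the real ones; this is a minimal sketch assuming mysql-connector-python:

```python
import time

import mysql.connector

# Hypothetical connection details; the real Aurora endpoint/credentials differ.
conn = mysql.connector.connect(
    host="aurora-endpoint", user="app_user", password="secret", database="app_db"
)
cursor = conn.cursor()

# Placeholder table/column names; the real table has 18 columns.
insert_sql = (
    "INSERT INTO readings (customer, serial_number, batch, data) "
    "VALUES (%s, %s, %s, %s)"
)

# Dummy batch of 2000 rows for illustration.
rows = [("cust-1", f"SN-{i}", "batch-42", f"payload-{i}") for i in range(2000)]

start = time.time()
cursor.executemany(insert_sql, rows)  # the connector rewrites this into a multi-row INSERT
conn.commit()
print(f"Inserted {len(rows)} rows in {time.time() - start:.0f} Seconds")
```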
After wrapping each 2K batch in an explicit transaction (a sketch of that follows these timings):
Inserted 2000 rows in 21 Seconds
Inserted 2000 rows in 20 Seconds
Inserted 2000 rows in 20 Seconds
Inserted 2000 rows in 18 Seconds
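The transaction-wrapped variant looks roughly like this. It assumes the same connection, cursor, statement, and batch as in the sketch above; since autocommit is off by default in mysql-connector-python, `start_transaction()` mostly just makes the transaction boundaries explicit:

```python
import mysql.connector


def insert_batch(conn, cursor, insert_sql, rows):
    """Insert one ~2000-row batch inside an explicit transaction."""
    conn.start_transaction()
    try:
        cursor.executemany(insert_sql, rows)
        conn.commit()  # one commit per batch
    except mysql.connector.Error:
        conn.rollback()
        raise
```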
How can I improve the speed?
I'm using AWS RDS, Aurora MySQL version 5.7.12 (db.t3.medium), with CPU usage between 4% and 8%. My objective is to insert around 50K rows into a table that already holds nearly 67 million rows (14 GB). The data must be inserted as soon as possible; this near-real-time data is very important for the client.

The table has 18 columns: id (PK, auto-increment), customer, serial_number, batch, data, and some others. The indexes are: (customer, serial_number) to make the combination unique, batch for searching, and data (unique). All are BTREE indexes by default.

Inserting 50K rows should take less than 1 minute, but it currently takes around 15 minutes. I've tried inserting into an empty table, and there 50K rows go in within just 5-7 seconds. As the number of entries in the table grows, the insertion time increases.
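From the column and index description, the table presumably looks something like the DDL below. The table name, column types, and lengths are guesses for illustration only, not the actual schema:

```python
# Hypothetical reconstruction of the table described above (types/lengths guessed).
# It could be created with cursor.execute(ddl) on a connection like the one shown earlier.
ddl = """
CREATE TABLE readings (
    id            BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer      VARCHAR(64)  NOT NULL,
    serial_number VARCHAR(64)  NOT NULL,
    batch         VARCHAR(64)  NOT NULL,
    data          VARCHAR(255) NOT NULL,
    -- ...plus 13 more columns, 8 of them with default values...
    PRIMARY KEY (id),
    UNIQUE KEY uq_customer_serial (customer, serial_number),
    UNIQUE KEY uq_data (data),
    KEY idx_batch (batch)
) ENGINE=InnoDB
"""
```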
Is upgrading the MySQL version going to speed up the insertion process at all? Is splitting or partitioning the table the last option? I cannot consolidate the data because every record is important, especially the last 2 years of data. Please help.
My table schema already has default values in 8 of the columns, and these rows are never updated later. There are not many read/write operations going on: almost 2, or in some cases 3, SELECTs per second, according to the RDS monitor.