I'm not sure what the references to multi-process/thread in the title mean, but you are running out of memory while trying to build a big string with JSON.stringify. I have seen that before with large arrays. I have no idea how you are writing these 10K records to the DB, and I don't know the size of each object in the array, so it's difficult to give advice. You are blowing the memory limit at ~1.4GB.
Maybe try going down to 1K or even 100 records at a time to see if that fixes the issue.
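A minimal sketch of that batching idea: the `chunk` helper is plain JavaScript, and the commented usage assumes a better-sqlite3 `db` handle and an `items(name, value)` table, both of which are hypothetical placeholders here.

```javascript
// Split a large record array into fixed-size batches instead of building
// one giant string/statement from the whole thing at once.
function chunk(rows, size) {
  const batches = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Usage sketch (better-sqlite3 style; table and columns are made up):
// const stmt = db.prepare('INSERT INTO items (name, value) VALUES (?, ?)');
// const insertBatch = db.transaction((batch) => {
//   for (const row of batch) stmt.run(row.name, row.value);
// });
// for (const batch of chunk(allRows, 100)) insertBatch(batch);
```

Dropping the batch size from 10K to 1K or 100 is then a one-constant change, which makes it easy to find where the memory blow-up starts.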
From the Node.js docs: Workers (threads) are useful for performing CPU-intensive JavaScript operations. They will not help much with I/O-intensive work. Node.js's built-in asynchronous I/O operations are more efficient than Workers can be.
If each thread is writing, then they will block each other regardless of what mode you use. I don't know why you started using worker threads; maybe you have some expensive request handling before writing to the DB, but they are not going to help you write to the DB faster.
The good news is that you clearly have an easy method of testing this. Great! I suggest you use that to better understand your system. My recommendation would be to remove all of the worker thread stuff and then run your test against the system under a profiler. That should give you the truth about where the work is actually being done in your system. Only then should you look at applying techniques, whether worker threads or otherwise, to improve performance.
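For the profiling step, one common approach is V8's built-in sampling profiler; this is just a sketch, with `server.js` standing in for your real entry point.

```shell
# Start the server with V8's sampling profiler enabled, drive your existing
# load test against it, then stop the process.
node --prof server.js

# Post-process the generated isolate log into a readable summary showing
# where CPU time is actually being spent.
node --prof-process isolate-0x*.log > profile.txt
```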
Key uniqueness is an entirely different problem. I don't use SQL DBs that much, so I cannot offer advice on the technical options you list. Though I expect that before anyone could, you would need to share more of the use case: what you are actually trying to accomplish.
When trying to manage key uniqueness there are three fundamental approaches you can take:

1. Assume the data is already unique, and do nothing apart from throwing errors when this assumption turns out to be incorrect.
2. Make it the writer's responsibility to ensure uniqueness. This generally involves querying the DB to find out about the existing keys, but it doesn't have to: if you have access to all of the data outside of the DB before it is inserted, you can create unique keys based on that before inserting.
3. Define the DB schema so that uniqueness is automatic. Auto-incrementing numbers or GUIDs work here.
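Approach 1 can be sketched as "attempt the insert and treat a constraint violation as a duplicate". The `SQLITE_CONSTRAINT*` error-code naming here follows better-sqlite3's convention, and `stmt` is any prepared-statement-like object; both are assumptions, not your actual code.

```javascript
// Sketch of approach 1: just try the insert and let the DB enforce
// uniqueness, reporting a duplicate instead of crashing.
function tryInsert(stmt, row) {
  try {
    stmt.run(row.id, row.name);
    return true;                  // inserted
  } catch (err) {
    if (typeof err.code === 'string' && err.code.startsWith('SQLITE_CONSTRAINT')) {
      return false;               // duplicate key: the assumption was wrong
    }
    throw err;                    // anything else is a real error
  }
}
```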
From the sound of it, I'm assuming that your request handler is not just inserting thousands of records but is inserting records into multiple tables, hence your reference to foreign keys. So the crux of the problem isn't necessarily key uniqueness, but finding the keys of the records you just inserted in order to refer to them in the other records you are inserting. If you go with option #3 above, you have to query the DB to find the values of the keys that the DB chose for the inserted records. If you go with option #2, you can create the key values yourself before the insert, so you do not need to query afterwards, but that means you are responsible for keeping them unique. GUIDs are the simplest approach; I'm not sure why you don't like that one. If you went with GUIDs, your system would already be working.
I fixed it with `worker_threads` or `threads.js`, but some threads die due to the SQLite write lock (even if I enable WAL mode). I am using a web server with POST requests / WebSockets, so I assume that I cannot really control concurrency?
Thanks, I will try to extract the said steps.
The real issue is PK/FK/UNIQUE. How do I enforce that? My original solution was to mix reads and writes, but that is probably a bad idea. How about:

- `SELECT MAX(PK) + 1`
- `new Set()` / `new Map()`, where keys are `JSON.stringify(Array)`
- `BEGIN TRANSACTION` -- `COMMIT` + write only, no reads at the time of writing
- `PRAGMA foreign_keys=off` temporarily? Or maybe I don't need to do that? Not really sure how I do this in Node.js.

better-sqlite3 is sync by default. Not really sure if this is a good or bad idea. The only way to go async is to use a Worker or a fork.