I know there are multiple ways.
- Python/CLI script, piping stdin/stdout (which could probably be made long-running as well)
- ZeroMQ
- Full-blown web server (HTTP); Falcon looks nice (a rough sketch follows this list)
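For the last option, here is a rough idea of what a minimal Falcon app might look like (the `/tokenize` route and the placeholder handler are just illustrative, assuming Falcon 3):

```python
# Minimal Falcon app sketch (pip install falcon). The route and the
# fake "tokenizer" are hypothetical; real code would call MeCab here.
import falcon

class TokenizeResource:
    def on_post(self, req, resp):
        text = req.media.get('text', '')        # JSON body: {"text": "..."}
        resp.media = {'tokens': text.split()}   # placeholder, not real tokenizing

app = falcon.App()
app.add_route('/tokenize', TokenizeResource())

# Serve with any WSGI server, e.g.: gunicorn yourmodule:app
```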
I'm also worried about the startup time of Python scripts.
What I want is roughly the following. Actually, I have tried MeCab directly (in Docker), but there is a little complication:
```js
const { spawn } = require('child_process')

async function main () {
  const p = spawn('mecab')

  // Log the surface form (first tab-separated field) of each parsed token
  p.stdout.on('data', (data) => {
    const s = data.toString().split('\n').map(row => row.split('\t')[0])
    console.log(s)
  })

  p.stdin.write('日本語です')
  p.stdin.write('\n')

  // Wait for the first response before writing the next sentence
  await new Promise(resolve => p.stdout.once('data', resolve))

  p.stdin.write('すもももももももものうち')
  p.stdin.end()
}

main()
```
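As for the first option, the Python side of the pipe could be a long-lived, line-oriented worker along these lines (the `.upper()` call is just a stand-in for the real work):

```python
# Long-running Python worker sketch: one request per stdin line,
# one response per stdout line. The .upper() call is a placeholder.
import sys

for line in sys.stdin:
    text = line.rstrip('\n')
    sys.stdout.write(text.upper() + '\n')
    sys.stdout.flush()  # flush so the parent process sees each reply immediately
```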
Top comments (3)
Spawning child processes often works, but if you plan to scale, things can go totally wrong. Just check the mecab-lite node module implementation. It's as easy as typing `npm i mecab-lite`.
Consider using WebSockets: they give you two-way communication. You could start with a Python client implementation; implementing the WebSocket server in Node can be even easier.
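For example, a minimal Python client sketch using the `websockets` package (the port and the echo-style exchange are illustrative):

```python
# Minimal WebSocket client sketch (pip install websockets).
# The URL/port and the message are illustrative, not a fixed API.
import asyncio
import websockets

async def main():
    async with websockets.connect('ws://localhost:8765') as ws:
        await ws.send('日本語です')
        print(await ws.recv())  # the server's reply

asyncio.run(main())
```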
I use ZeroMQ quite consistently for interfacing different languages or different programs. On the one hand, it makes working with sockets a breeze; on the other, if you are sharing a lot of data very quickly, you have no control over queue sizes and the like, so you can't balance the load or pause the publisher. I am using it to stream data from two sCMOS cameras, and it can very quickly choke the computer.
If your Python program is short-lived but you need to run it very often, be careful with startup and teardown times. Perhaps it would be better to have a long-lived process that is always waiting for tasks, something like the sketch below.
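A minimal sketch of that pattern with ZeroMQ's REQ/REP sockets, assuming pyzmq (the port and the `.upper()` placeholder are illustrative):

```python
# Long-lived Python worker sketch using ZeroMQ REP (pip install pyzmq).
# It blocks waiting for requests, so there is no per-task startup cost.
import zmq

context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind('tcp://127.0.0.1:5555')   # illustrative port

while True:
    text = socket.recv_string()       # block until a request arrives
    socket.send_string(text.upper())  # placeholder for the real work
```

A Node client can then connect a REQ socket to the same address (e.g. with the zeromq npm package) and send requests and await replies one at a time.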
What I like about ZMQ is that you have implementations in a myriad of languages, and therefore it can be easily extended in the future, if you ever need to add another language, or if you go distributed.