I'm a developer-turned-business owner who loves to explore the right tools for the job. I enjoy writing and documenting my journey. I use code as one of the tools to solve real problems.
The entire application infrastructure, both web server and processing servers, is built on top of PHP. Go took over part of the processing server's responsibilities. The PHP application now sends raw data to the Go application, which runs calculations on that data and sends the results back to PHP to be integrated into the existing pipelines. So the start and end of the process are still all PHP, with Go replacing one step in between.
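A rough sketch of what the Go side of such a hand-off could look like. The thread does not say how the two applications talk to each other, so HTTP with JSON is purely an assumption here, and the `rawData` and `result` shapes, the `/calculate` path, and the sum/mean calculation are hypothetical placeholders for the real ones.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// rawData is a hypothetical payload shape sent by the PHP application.
type rawData struct {
	Values []float64 `json:"values"`
}

// result is a hypothetical response shape returned to PHP.
type result struct {
	Sum  float64 `json:"sum"`
	Mean float64 `json:"mean"`
}

// calculate stands in for the real calculations, which the thread does not describe.
func calculate(in rawData) result {
	var sum float64
	for _, v := range in.Values {
		sum += v
	}
	res := result{Sum: sum}
	if len(in.Values) > 0 {
		res.Mean = sum / float64(len(in.Values))
	}
	return res
}

// handleCalculate decodes the raw data, runs the calculation,
// and writes the result back as JSON for PHP to pick up.
func handleCalculate(w http.ResponseWriter, r *http.Request) {
	var in rawData
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
		http.Error(w, "invalid JSON payload", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(calculate(in))
}

func main() {
	// Simulate one request from the PHP side without opening a real port.
	body := strings.NewReader(`{"values":[1,2,3,4]}`)
	req := httptest.NewRequest(http.MethodPost, "/calculate", body)
	rec := httptest.NewRecorder()
	handleCalculate(rec, req)
	fmt.Print(rec.Body.String())
}
```

In a real deployment the handler would be registered with `http.HandleFunc` and served with `http.ListenAndServe`, and the PHP side would POST to it and decode the JSON response before continuing its own pipeline.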
I have tried parallel processing in PHP as well. I had some trouble with it because the process still needs to run on the same hardware: spawning more than two child processes would exhaust the server's resources. The other problem I've had, which I saw some solutions for in the libraries you mentioned, is that I need the data those parallel processes produce. It's quite simple to spawn a few processes, let them do their tasks asynchronously, and kill each one once its task is complete. But I need the resulting data from these processes.
This is where the channels in Go came in for me. I could let it spawn as many workers as there are threads (8 in this case) and once the task was done combine all the resulting data into a single slice. I've been struggling to do this in PHP for a long time and Go has this built-in, which is why I was sold on it for this specific task. That and the fact that the server resource usage is so low that I can actually do multiple things at once.
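A minimal sketch of that fan-out/fan-in pattern, not the actual production code: a fixed number of workers read jobs from one channel, send results to another, and the results are combined into a single slice. `processChunk` and `processAll` are hypothetical names, and squaring an integer stands in for the real calculation.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processChunk is a hypothetical stand-in for the real calculation.
func processChunk(n int) int {
	return n * n
}

// processAll fans the inputs out to a fixed number of workers
// and collects every result into one slice.
func processAll(inputs []int, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				results <- processChunk(n)
			}
		}()
	}

	// Feed the jobs, then close the channel so the workers exit their loops.
	go func() {
		for _, n := range inputs {
			jobs <- n
		}
		close(jobs)
	}()

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Combine all results into a single slice.
	combined := make([]int, 0, len(inputs))
	for r := range results {
		combined = append(combined, r)
	}
	return combined
}

func main() {
	// runtime.NumCPU reports the logical CPUs of the machine (8 on the server discussed here).
	out := processAll([]int{1, 2, 3, 4, 5}, runtime.NumCPU())
	fmt.Println(len(out), "results collected")
}
```

Results arrive in whatever order the workers finish, so the combined slice is not guaranteed to match the input order. If order matters, each job can carry its index and the result can be written to that position instead.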
Which PHP library (if any) would you recommend for something like this?
Understood. In this case, I would recommend any asynchronous processing library combined with a math library capable of performing the calculations faster.
I've tried a few libraries, extensions, and approaches, including:
ReactPHP components
AmpPHP components
Swoole PHP Extension
Asynchronous jobs on a Redis queue
Asynchronous jobs from the database
Multiple processing servers
They all worked, but only very temporarily. Take the jobs in the Redis queue for example: at a certain point, there were so many queued jobs that I was never able to work through all of them and the queue actually got bigger every day until it ran out of memory on day four.
I'd really like to revisit doing parallel processing in PHP though because it's much easier to implement something in the language the application is built with.
If I do find a nice solution to do everything in PHP, I'll definitely write about it. Finding a solution would be a great end of a very long journey and a great start for a new one!
That's interesting to know. It seems the PHP memory limit could be tuned if that is what the issue is related to.
There are so many solutions in PHP that I often find it difficult to decide between them, which is what led me to Go (from PHP (pure and Laravel) to Swift to Node.js to Ruby to Crystal to Go). It's been a long journey, and Go's minimalistic approach actually… works. I assume it could help you avoid running out of memory if you experiment with it; my finding is that Go consumes less memory and performs better in various benchmarks found on the web.
In this case, how much of a speed improvement did you estimate? Do you have benchmark data to back up your recommendation?
No, it's just a personal tip based on a quick search.
Parallel processing is not new in PHP, so there must be several other specific libraries.
Personally, I would try at least half a dozen of them before venturing into a new language.
I understand. Maybe it would be interesting for you to compare them and see whether there are any performance gains.
Surely Go is faster than PHP, just as compiled languages are often faster than interpreted ones.
Statically typed languages like Go are also often faster than dynamically typed ones, and Go's strong type system helps a lot too.
Go is more of a competitor to Java than to PHP or Python, IMHO.