From Arrays to GPU - how the PHP ecosystem is moving toward real ML

Samuel Akopyan
Why PHP arrays are a poor fit for math, how Tensor and NDArray emerged, and why RubixML eventually moved toward GPU.


Whenever I talk to someone or write about "machine learning in PHP", the reaction is usually predictable: people either smirk, point to Python, or downvote (except for those who are genuinely interested).

The general conclusion is simple - this is just not what the language is for.

And honestly, there is some truth in that. But…

For several years now, I've been trying to push the idea of "ML in PHP". Why? I'm not entirely sure. I've asked myself that question many times and still don't have a clear answer - even for myself.

I'm not crazy, and I'm not obsessed with some fixed idea.

I'm just a regular developer who enjoys writing PHP.
This language has paid my bills for years (and still does). It's not perfect, but overall I like how it evolves and where it's heading.

So it feels like the very idea of "ML in PHP" is a kind of challenge that keeps me up at night - trying to understand math formulas or implement yet another ML algorithm in PHP. And honestly - I enjoy it.

Of course, PHP was never designed for numerical computing. It has no built-in support for vector operations, no real control over memory, and no access to low-level optimizations that are standard in scientific computing.

And yet - machine learning in PHP exists.

Not as an experiment, but as part of real systems:

  • inference directly inside web applications
  • data processing inside SaaS
  • automation and embedded analytics

And every month - the use of ML in PHP keeps growing.

If you're skeptical, take a look here:
👉 https://github.com/apphp/awesome-php-ml

This is one of the most complete curated lists of machine learning, AI, NLP, LLM, and data analysis libraries for PHP - over 140 resources. The PHP ML ecosystem is not loud, but it's fairly mature. It consists of four layers:

  • classical ML libraries
  • mathematical foundation
  • integration tools with modern ML systems
  • integration with external ML services

In this article, we'll focus on the first two layers, and only briefly touch the other two.

Despite the ecosystem being alive and growing, if you look deeper, it becomes clear: it evolved not thanks to the language, but despite its limitations - driven by a handful of enthusiasts (I'll try to mention as many as I know).

Let me repeat a cliché - this article is not a library overview and not a comparison of PHP vs Python.

It's a story of how the PHP community gradually arrived at the same conclusion:

the problem is not the algorithms - it's the runtime model itself.

At first, vectors and matrices were implemented as plain PHP arrays. Then came optimization attempts. Then native structures written in C and Rust. And at some point it became obvious: even that is not enough.

The next step was inevitable - moving computation to the GPU.

We'll walk through this path step by step: from naive implementations to modern solutions like Tensor, NDArray, and GPU backends in RubixML.

Along the way, we'll see how not only the API changed, but also the understanding of where PHP ends - and where a real computational system or orchestration layer begins.

In short: this is not a story about how to do ML in PHP, but why it had to be done differently.


2. First attempt: when a matrix is just an array

If we ignore theory, a matrix is just a table of numbers. And since PHP already has arrays, you can represent a matrix as an array of arrays:

$matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
];

This idea was the foundation of the first ML libraries in PHP.

Everything was written in pure PHP - no extensions, no native code (meaning all computations were done by the interpreter, without low-level optimizations).

From a developer's perspective, this was extremely convenient: no compilation required, easy debugging with familiar tools, and fully transparent code.

What it looked like in code

Even operations like matrix multiplication were implemented very directly - using three nested loops:

function matmul(array $a, array $b): array {
    $result = [];

    $rowsA = count($a);
    $colsA = count($a[0]);
    $colsB = count($b[0]);

    for ($i = 0; $i < $rowsA; $i++) {
        for ($j = 0; $j < $colsB; $j++) {
            $sum = 0.0;
            for ($k = 0; $k < $colsA; $k++) {
                $sum += $a[$i][$k] * $b[$k][$j];
            }
            $result[$i][$j] = $sum;
        }
    }

    return $result;
}

Or, for example, a dot product:

function dot(array $a, array $b): float {
    $sum = 0.0;

    for ($i = 0, $n = count($a); $i < $n; $i++) {
        $sum += $a[$i] * $b[$i];
    }

    return $sum;
}

From an algorithmic point of view - everything is correct.
From a PHP point of view - everything is "by the book".

For a human - it's clear and readable. And most importantly, it works.

In fact, this approach never really disappeared.

Even today, you can find fresh examples where neural networks are implemented in pure PHP - without external libraries or extensions. Usually not for production, but for learning: to understand how things work "under the hood" and how complex algorithms are built from simple operations.

For example (manual forward pass of a neural network layer):

function forward(array $inputs, array $weights, array $biases): array {
    $outputs = [];

    foreach ($weights as $i => $neuronWeights) {
        $sum = 0.0;

        foreach ($neuronWeights as $j => $weight) {
            $sum += $inputs[$j] * $weight;
        }

        $sum += $biases[$i];
        $outputs[$i] = 1 / (1 + exp(-$sum));
    }
    return $outputs;
}

This is an important point.

On one hand, it shows that:

  • the approach is still understandable
  • it's still useful for learning and experimentation

On the other hand - its limitations become obvious very quickly when you try to scale or move to production.

If you're interested in looking ahead, I recommend watching Alexey Nechaev's relatively recent talk (it's in Russian, but I promise you'll have a blast watching it).

It walks you through the implementation of a neural network step by step, directly in PHP, using the same concepts: arrays, loops, and basic mathematical operations.

Why it was convenient

The main advantage - simplicity.

If you're motivated, you can implement in one evening:

  • k-NN
  • simple linear regression
  • a basic classifier
  • etc.

All without leaving PHP.

For example, a piece of logistic regression:

function sigmoid(float $x): float {
    return 1 / (1 + exp(-$x));
}

function predict(array $weights, array $features): float {
    return sigmoid(dot($weights, $features));
}

As I like to say - "everything is neat and old-school".

On small datasets, this works fine.
For learning, experiments, and prototypes - more than enough.
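
For instance, the k-NN item from that list fits in a couple of functions. Here is a minimal 1-nearest-neighbor sketch; the training data and function names are made up for illustration:

```php
<?php
// Minimal nearest-neighbor classifier in pure PHP (k = 1).
// $train is a list of [features, label] pairs.
function euclidean(array $a, array $b): float {
    $sum = 0.0;
    foreach ($a as $i => $v) {
        $d = $v - $b[$i];
        $sum += $d * $d;
    }
    return sqrt($sum);
}

function nearestNeighbor(array $train, array $query): string {
    $bestLabel = '';
    $bestDist = INF;
    foreach ($train as [$features, $label]) {
        $d = euclidean($features, $query);
        if ($d < $bestDist) {
            $bestDist = $d;
            $bestLabel = $label;
        }
    }
    return $bestLabel;
}

$train = [
    [[1.0, 1.0], 'red'],
    [[9.0, 9.0], 'blue'],
];
echo nearestNeighbor($train, [2.0, 1.5]); // red
```

Arrays, loops, sqrt() - nothing beyond the language core.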

Problems start a bit later.


3. Why it stops working

The problems don't appear immediately.

At first everything looks fine. Then the data grows - and the code starts slowing down. First a little, then dramatically, and eventually it becomes unusable.

The first instinct is: "we need to optimize the loops". But very quickly it becomes clear - the issue is not the algorithms or the code itself. Loop optimizations don't help much.

The problem is deeper - in how PHP works internally.

A number is not just a number

In PHP, a number is not a primitive - it's a zval structure that stores:

  • type
  • value
  • metadata (e.g., reference count)

So instead of compact 4–8 bytes like in C, each element takes much more memory.

If you have a million elements - that's a million such structures.

An array is a hash table

A PHP array is not contiguous memory - it's typically a hash table (except for packed arrays).

That means elements are not stored next to each other. Access is not just "index lookup", but a sequence of operations:

  • key lookup
  • traversal of internal structures
  • extraction of the zval

For example:

$value = $matrix[10][5];

Under the hood, this is a series of operations, not a direct memory access.

This immediately impacts several things:

  • CPU cache efficiency drops
  • data access becomes more expensive
  • no automatic SIMD vectorization at the PHP level

In numerical computing, this is critical.

Copy-on-write

There is another less obvious issue - copy-on-write.

PHP may implicitly copy arrays when modified:

$b = $a;
$b[0][0] = 42;

You don't always control when the copy happens.

In matrix-heavy workloads, this can:

  • significantly increase memory usage
  • introduce extra allocations
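
The effect is easy to demonstrate with memory_get_usage(). A sketch - the exact byte counts depend on the PHP version:

```php
<?php
// Sketch: copy-on-write in action. The assignment itself is cheap;
// the first write to the copy triggers duplication of the whole array.
$a = array_fill(0, 1_000_000, 1.0);

$before = memory_get_usage();
$b = $a;                  // no copy yet: shared storage, refcount++
$afterAssign = memory_get_usage();

$b[0] = 42.0;             // this write forces the actual copy
$afterWrite = memory_get_usage();

printf("assignment: +%d bytes, first write: +%d bytes\n",
    $afterAssign - $before,
    $afterWrite - $afterAssign);
```

The assignment costs almost nothing; the first modification suddenly doubles the memory for that array.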

No vectorization

All operations are done via loops:

for (...) {
    $c[$i] = $a[$i] + $b[$i];
}

Whereas in NumPy this is a single operation executed in C using SIMD (Single Instruction, Multiple Data).
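
To be fair, PHP does have array-at-once helpers, but they are not vectorization: array_map still invokes the callback once per element inside the interpreter, so the cost model is the same as the explicit loop. A quick sketch:

```php
<?php
// Sketch: array_map looks "vectorized" but is not - the closure
// runs once per element, one zval at a time.
$a = [1.0, 2.0, 3.0];
$b = [10.0, 20.0, 30.0];

$c = array_map(fn (float $x, float $y): float => $x + $y, $a, $b);
// $c === [11.0, 22.0, 33.0]
```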

So you end up in a strange situation: the algorithm is correct, the code is simple - but every operation is too expensive.

And that's why everything suddenly breaks when data grows.

What it looks like in practice

Let's compare a simple operation - matrix multiplication.

Option 1: pure PHP

function matmul(array $a, array $b): array {
    $result = [];

    $rowsA = count($a);
    $colsA = count($a[0]);
    $colsB = count($b[0]);

    for ($i = 0; $i < $rowsA; $i++) {
        for ($j = 0; $j < $colsB; $j++) {
            $sum = 0.0;
            for ($k = 0; $k < $colsA; $k++) {
                $sum += $a[$i][$k] * $b[$k][$j];
            }
            $result[$i][$j] = $sum;
        }
    }

    return $result;
}

Option 2: Tensor

(https://github.com/RubixML/Tensor)

use Tensor\Matrix;

$a = Matrix::rand(500, 500);
$b = Matrix::rand(500, 500);

$c = $a->matmul($b);

Option 3: NumPower (with optional GPU backend)

(https://github.com/RubixML/numpower)

$c = NumPower::multiply($a, $b);
// or
$c = $a * $b;

Results (approximate)

Time to multiply two 500x500 matrices, by approach:

  • PHP arrays: ~10–20 sec
  • Tensor (CPU): ~0.3–0.8 sec
  • NumPower (GPU): ~0.05–0.2 sec

The exact numbers depend on hardware, but the order of magnitude remains.
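
If you want to check the pure-PHP row yourself, here is a minimal harness. A sketch: matmul() is the same triple-loop function shown above, randMatrix() is a helper introduced here, and the size is reduced to 100x100 so it completes quickly:

```php
<?php
// Timing the pure-PHP triple-loop multiplication on a 100x100 case.
function matmul(array $a, array $b): array {
    $result = [];
    $rowsA = count($a);
    $colsA = count($a[0]);
    $colsB = count($b[0]);
    for ($i = 0; $i < $rowsA; $i++) {
        for ($j = 0; $j < $colsB; $j++) {
            $sum = 0.0;
            for ($k = 0; $k < $colsA; $k++) {
                $sum += $a[$i][$k] * $b[$k][$j];
            }
            $result[$i][$j] = $sum;
        }
    }
    return $result;
}

// Helper (not in the article): fill a matrix with random floats.
function randMatrix(int $rows, int $cols): array {
    $m = [];
    for ($i = 0; $i < $rows; $i++) {
        for ($j = 0; $j < $cols; $j++) {
            $m[$i][$j] = mt_rand() / mt_getrandmax();
        }
    }
    return $m;
}

$a = randMatrix(100, 100);
$b = randMatrix(100, 100);

$start = hrtime(true);
$c = matmul($a, $b);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("100x100 matmul in pure PHP: %.1f ms\n", $elapsedMs);
```

Scaling the size up to 500 multiplies the work by 125x - which is exactly where the table's numbers come from.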

The key point is not absolute values - it's the orders of magnitude difference.

And this is where it becomes clear: the issue is not that PHP is "slow" or "bad" - it's that we're using the wrong tool for the job.


4. Optimization attempts - and why they don't save it

For example, you can remove repeated count() calls and reduce array access:

$count = count($a);

for ($i = 0; $i < $count; $i++) {
    $ai = $a[$i];
    $bi = $b[$i];
    $sum += $ai * $bi;
}

This:

  • calls count() once instead of every iteration
  • reduces array access
  • uses local variables, which are faster

It does give a small speedup, especially in large loops. But the gain is modest and depends heavily on the code and PHP version.

You quickly hit the limits of such micro-optimizations.

The core issue is that this is optimization at the syntax level.

The fundamental limitations remain:

  • no memory control
  • no contiguous storage
  • no SIMD
  • no BLAS

The bottleneck is not the loop - it's PHP's memory model.

Still, optimizations didn't stop there.

Optimization: transposition

A common trick is to transpose the second matrix in advance.

The issue with naive multiplication in PHP is frequent column access:

$sum += $a[$i][$k] * $b[$k][$j];

Accessing $a[$i][$k] is fine – it's row-wise.
But $b[$k][$j] is column access – inefficient in PHP arrays.

As a result, we are constantly jumping around in memory.

👉 For a fun way to pass the time, I highly recommend watching the video of Andrew DalPino (author of the RubixML library).

The solution here is dead simple - transpose the second matrix in advance:

function transpose(array $matrix): array {
    $result = [];

    foreach ($matrix as $i => $row) {
        foreach ($row as $j => $value) {
            $result[$j][$i] = $value;
        }
    }
    return $result;
}

Then:

function matmulOptimized(array $a, array $b): array {
    $bT = transpose($b);
    $result = [];

    $rowsA = count($a);
    $colsA = count($a[0]);
    $colsB = count($bT);
    for ($i = 0; $i < $rowsA; $i++) {
        for ($j = 0; $j < $colsB; $j++) {
            $sum = 0.0;
            for ($k = 0; $k < $colsA; $k++) {
                $sum += $a[$i][$k] * $bT[$j][$k];
            }
            $result[$i][$j] = $sum;
        }
    }
    return $result;
}

Now we always work with rows:

$bT[$j][$k] // instead of $b[$k][$j]

Why it helps:

  • fewer memory "jumps"
  • better CPU cache usage
  • more predictable access

This can noticeably improve performance even in pure PHP.

But - there's a catch.

We still:

  • use zval
  • use hash tables
  • run inside the interpreter

So we're making things less bad - but not truly efficient.

At some point, you have to admit: even advanced PHP tricks won't replace a proper data model.

PHP was not designed for high-performance numerical computing.

Optimization: CPU cache

Another idea - improve cache locality.

In low-level languages and libraries (C, C++, NumPy), performance heavily depends on how data is laid out in memory.

If data is contiguous, the CPU can read it in blocks (cache lines):

for ($i = 0; $i < $n; $i++) {
    $sum += $a[$i] * $b[$i];
}

This is fast in C not because it's simple, but because:

  • data is contiguous
  • CPU reads it in blocks
  • access is predictable

So naturally, you try:

  • row-based access
  • minimize randomness
  • use linear structures

Where it breaks in PHP

Even with a "flat" array:

$a = [1.0, 2.0, 3.0, 4.0];

In memory this is:

  • not a sequence of floats
  • but an array of zval
  • still backed by internal structures

So:

  • data is not compact
  • there is overhead between elements
  • CPU cache is not used efficiently

You end up optimizing as if cache matters - but getting little benefit.
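
One structure from that era worth mentioning is SplFixedArray from the SPL: it removes the hash-table machinery, but every slot still holds a zval, so the data is still not a contiguous block of floats. A sketch (the size here is arbitrary):

```php
<?php
// SplFixedArray: integer-indexed, fixed-size storage from the SPL.
// It drops the hash-table overhead, but each slot is still a zval.
$n = 100_000;

$fixed = new SplFixedArray($n);
for ($i = 0; $i < $n; $i++) {
    $fixed[$i] = $i * 1.0;
}

$sum = 0.0;
for ($i = 0; $i < $n; $i++) {
    $sum += $fixed[$i];
}
```

It saves some memory compared to a plain array, but the cache-locality problem stays.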

Optimization: packed arrays

PHP 7+ introduced packed arrays for simple cases:

$a = [1, 2, 3, 4, 5];

These are more compact and faster.
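
The difference is easy to observe with memory_get_usage(). A sketch - absolute numbers vary by PHP version, but the ordering does not:

```php
<?php
// Sketch: memory cost of a packed list vs. a string-keyed hash.
$n = 100_000;

$before = memory_get_usage();
$packed = [];
for ($i = 0; $i < $n; $i++) {
    $packed[] = $i;           // sequential integer keys: stays packed
}
$packedBytes = memory_get_usage() - $before;

$before = memory_get_usage();
$hash = [];
for ($i = 0; $i < $n; $i++) {
    $hash['k' . $i] = $i;     // string keys: full hash table + key storage
}
$hashBytes = memory_get_usage() - $before;

printf("packed: %d bytes, string-keyed: %d bytes\n",
    $packedBytes, $hashBytes);
```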

But for matrices:

$matrix = [
    [1, 2, 3],
    [4, 5, 6],
];

You still get an array of arrays - not truly contiguous.

And:

  • each element is still a zval
  • no real contiguous memory

Plus, packed arrays are fragile:

$a = [1, 2, 3];
$a[10] = 4; // breaks packing

In ML workloads (filtering, reshaping, indexing), this happens all the time.


If you combine everything - transposition, cache tricks, packed arrays - you see the same pattern:

we can speed things up a bit - but we can't change the foundation.

And again we arrive at the key insight:

the problem is not how we write loops –
the problem is how data is represented in memory.

A subtle but important point

There's another aspect that's rarely discussed.

Many libraries from that era simply stopped evolving. The internet is full of abandoned PHP ML pet projects.

Not because the idea was bad - quite the opposite. It was logical.

But developers quickly hit the same ceiling.

They could:

  • rewrite internals
  • add more optimizations
  • improve APIs

But it didn't lead to fundamental gains.

At some point, authors realized:

no matter how much you optimize PHP arrays, they won't become suitable for numerical computing.

And then there were only two options:

  • rewrite everything in C / Rust
  • lose motivation

Often, it was the second.

Enthusiasm fades when each step brings diminishing returns, while the core problem remains.

And this is one of the clearest signals:

  • the issue was not in the libraries –
  • it was in the model itself.

5. The turning point: when matrices stop being arrays

The next stage was simple - stop treating matrices as PHP arrays.

Libraries and extensions appeared that store data outside PHP structures.

For example: Tensor (implemented in C) and NDArray (implemented in Rust).

Now data is stored in contiguous memory, like in NumPy. This is a completely different level.

The code looks similar:

use Tensor\Matrix;

$a = Matrix::quick([
    [1, 2],
    [3, 4],
]);
$b = Matrix::quick([
    [5, 6],
    [7, 8],
]);
$c = $a->matmul($b);

But internally everything changes:

  • no zval per element
  • no hash table
  • tightly packed data
  • efficient CPU usage

Rust plays an interesting role here. In NDArray, it provides performance without typical C memory pitfalls. It's a good example of PHP delegating heavy work to other languages.

And it works - speedups can be significant.

But even this is not the end.

Architecture shift

Simplified evolution:
PHP arrays
    ↓
(realization of limitations)
    ↓
Native structures (Tensor, ndarray)
    ↓
C / Rust extensions

More broadly - this is a shift in roles:

  • before: PHP did computations
  • now: PHP orchestrates

And interestingly, code becomes not just faster - but less "PHP-like" and closer to actual math.

Higher-level operations

$c = $a->add($b);
$c = $a->divide($a->sum());

This reads almost like formulas.

But something still feels missing.

The API is still object-oriented, not fully mathematical.

NumPower and a new abstraction level

NumPower (originally created by Henrique Borba, now part of RubixML) goes further:

$c = NumPower::multiply($a, $b);
// or
$c = $a * $b;

Even:

$y = $a / $b;

This is not just syntax sugar - it's making math expressions native to the language.


6. A new ceiling: when CPU is not enough

As workloads grow, it becomes clear - the problem is not just PHP.
Even with perfect data structures and fast native code, CPU has limits: cores and bandwidth.

ML tasks are essentially massive matrix operations.

At some scale, even optimized CPU implementations struggle.

So we move to the next logical step.


7. Why everything leads to GPU

GPU is built for parallel computation: thousands of threads, high memory bandwidth - perfect for linear algebra.

Today it's obvious: squeezing more out of CPU is a dead end for many ML workloads.

You need to change the architecture.

What's happening now

RubixML v3 is moving toward GPU via projects like NumPower:

👉 https://github.com/RubixML/numpower

This is a paradigm shift:

  • PHP → orchestration
  • Tensor / NumPower → computation
  • GPU → heavy math

This is still driven by enthusiasts and under active development.

If you're interested - this is a great time to get involved:

👉 https://github.com/RubixML/ML/tree/3.0

This is no longer about "optimizing computation". It's about changing the model: PHP becomes an orchestrator, delegating heavy work to native layers.


8. What actually happened

Looking at the evolution, the picture is quite logical.

First, PHP was used as a compute environment. Then it became clear it's not suited for that. Then computation moved outward: first to native structures, then extensions, and now GPU.

Today, PHP in ML is mostly an orchestration layer. It connects components, manages processes, handles data - while heavy math happens elsewhere:

  • GPU
  • Rust
  • native libraries

And that's probably the key takeaway.

A practical example

We have projects like:

👉 https://github.com/CodeWithKyrian/transformers-php (by Kyrian Obikwelu)

This library gives access to modern models (like Hugging Face transformers) in PHP.

But under the hood, PHP is not doing heavy computation.

It:

  • manages the pipeline
  • loads models
  • processes results

Computation happens outside PHP - via native libraries or external runtimes.

Example:

use function Codewithkyrian\Transformers\Pipelines\pipeline;

// Allocate a pipeline for sentiment analysis
$classifier = pipeline('sentiment-analysis');

$out = $classifier(['I love transformers!']);
echo print_r($out, true) . PHP_EOL;

$out = $classifier(['I hate transformers!']);
echo print_r($out, true);

This shows how PHP's role changed:

before:

// we implement the mathematics ourselves
$sum += $inputs[$i] * $weights[$i];

now:

// we use a ready-made ML infrastructure
$result = $pipeline($text);

Similarly, libraries like:

👉 https://github.com/llphant/llphant (by Franco Lombardo, Fabrizio Balliano and others)

LLPhant moves even higher - working with full LLM scenarios:

  • text generation
  • embeddings
  • retrieval (RAG)
  • chat interfaces

Example:

use LLPhant\Chat\Enums\ChatRole;
use LLPhant\Chat\Message;
use LLPhant\Chat\OpenAIChat;
use LLPhant\OpenAIConfig;

$chat = new OpenAIChat(new OpenAIConfig(...));

$message = new Message();
$message->role = ChatRole::User;
$message->content = 'What is the capital of France? ';

$response = $chat->generateText((string)$message);
echo $response;

Embeddings:

use LLPhant\Embeddings\OpenAI\EmbeddingGenerator;

$generator = new EmbeddingGenerator();

$embedding = $generator->embed("PHP and machine learning");



Now we no longer think about:

  • matrices
  • loops
  • memory

We think about:

  • data
  • requests
  • system behavior

And the transformation doesn't stop here.

Higher-level tools are emerging where PHP is not just a wrapper, but an environment for building AI logic.

For example:

👉 https://github.com/neuron-core/neuron-ai (by Valerio Barbera)

Neuron AI focuses on:

  • agents
  • chains
  • LLM integrations

Example:

$message = MyAgent::make()
    ->chat(new UserMessage("Hi!"))
    ->getMessage();

echo "Q1: Hi!\n";
echo "A: " . $message->getContent() . "\n\n";
echo "Q2: Explain who you are?\n";
echo "A: ". MyAgent::make()
    ->chat(new UserMessage("Explain who you are?"))
    ->getMessage()
    ->getContent();

Here it's especially clear how PHP's role evolved:

  • before - we implemented math
  • then - we moved it to native structures
  • now - we describe system behavior

9. Conclusion

Looking at the full journey over the last ten years, it becomes clear: this is not just an evolution of tools - it's a gradual displacement of PHP from computation.

  • First, we honestly multiply matrices in arrays.
  • Then we optimize and hit a ceiling.
  • Then we move to native structures.
  • Then to C / Rust.
  • And eventually… to GPU.

PHP arrays
    ↓
(realization of limitations)
    ↓
Native structures (Tensor, ndarray)
    ↓
C / Rust extensions
    ↓
GPU (NumPower, RubixML v3)

And here's a somewhat uncomfortable conclusion:

PHP is not becoming faster at ML.
It's just stopping doing the math.

It's no longer a compute engine - it's a dispatcher, an orchestrator, glue between components.

And maybe that's exactly its chance to stay relevant.

But the question remains:

Is the game still going on?
Or has PHP been playing a different game for a while now?

If you're interested in the topic, you can dive deeper in my free book:
https://apphp.gitbook.io/ai-for-php-developers/

And explore interactive examples:
https://aiwithphp.org/books/ai-for-php-developers/examples/
