Refactoring #3: Optimizing Eloquent queries & console commands

Geni Jaho

We have a console command at OpenLitterMap that takes all the litter locations in the world and groups them into clusters for viewing on the global map. It's a heavy command in terms of memory consumption, database strain, and execution time, which is why it's only run sparingly. Let's see how we can make it run daily without sacrificing time or memory.

It's going to be a bit long, so bear with me here. For brevity, we're only showing the handle method. Since the method is lengthy, it's hard to find a place to start refactoring, so let's go easy and extract some smaller methods out of it. This will help us understand the problem better and improve the command's readability.

<?php
...
public function handle()
{
    $photos = Photo::select('lat', 'lon')->get();

    $features = [];

    foreach ($photos as $photo) {
        $feature = [
            'type' => 'Feature',
            'geometry' => [
                'type' => 'Point',
                'coordinates' => [$photo->lon, $photo->lat]
            ]
        ];

        array_push($features, $feature);
    }

    unset($photos); // free up memory

    $features = json_encode($features, JSON_NUMERIC_CHECK);

    Storage::put('/data/features.json', $features);

    if (app()->environment() === 'local') {
        $prefix = config('app.root_dir');
    } else if (app()->environment() === 'staging') {
        $prefix = '/home/***/olmdev.***';
    } else {
        $prefix = '/home/***/openlittermap.com';
    }

    Cluster::truncate();

    $zoomLevels = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];

    foreach ($zoomLevels as $zoomLevel) {
        exec('node app/Node/supercluster-php ' . $prefix . ' ' . $zoomLevel);

        $clusters = json_decode(Storage::get('/data/clusters.json'));

        foreach ($clusters as $cluster) {
            if (isset($cluster->properties)) {
                Cluster::create([
                    'lat' => $cluster->geometry->coordinates[1],
                    'lon' => $cluster->geometry->coordinates[0],
                    'zoom' => $zoomLevel
                ]);
            }
        }
    }
}

Refactor into smaller methods

This command does two main things. First, it generates a JSON file with all the latitude & longitude values of litter, hence the generateFeatures method. Second, using this JSON file, a node command (yes, a node command) is executed through PHP's exec function, which writes the clusters to a clusters.json file; we then populate our clusters database table from that generated JSON, which is what the generateClusters method does. I know, lengthy explanation.

<?php
...
public function handle()
{
    $this->generateFeatures();
    $this->generateClusters();
}

protected function generateFeatures()
{
    $photos = Photo::select('lat', 'lon')->get();

    $features = [];

    foreach ($photos as $photo) {
        $feature = [
            'type' => 'Feature',
            'geometry' => [
                'type' => 'Point',
                'coordinates' => [$photo->lon, $photo->lat]
            ]
        ];

        array_push($features, $feature);
    }

    unset($photos); // free up memory

    $features = json_encode($features, JSON_NUMERIC_CHECK);

    Storage::put('/data/features.json', $features);
}

protected function generateClusters()
{
    if (app()->environment() === 'local') {
        $prefix = config('app.root_dir');
    } else if (app()->environment() === 'staging') {
        $prefix = '/home/***/olmdev.***';
    } else {
        $prefix = '/home/***/openlittermap.com';
    }

    Cluster::truncate();

    $zoomLevels = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];

    foreach ($zoomLevels as $zoomLevel) {
        exec('node app/Node/supercluster-php ' . $prefix . ' ' . $zoomLevel);

        $clusters = json_decode(Storage::get('/data/clusters.json'));

        foreach ($clusters as $cluster) {
            if (isset($cluster->properties)) {
                Cluster::create([
                    'lat' => $cluster->geometry->coordinates[1],
                    'lon' => $cluster->geometry->coordinates[0],
                    'zoom' => $zoomLevel
                ]);
            }
        }
    }
}

Use cursor() instead of get()

The memory issue with the line Photo::select('lat', 'lon')->get(); is that it loads all the photo models into memory at once, which is a big problem with the 250k+ photos currently in the database. To fix that, Laravel provides a nice query helper called cursor. By using $photos = Photo::select('lat', 'lon')->cursor();, only one Eloquent model is kept in memory at any given time while iterating over the cursor. For that reason, we don't even need the unset($photos); call at all.

From the docs: although the cursor method uses far less memory than a regular query, it will still eventually run out of memory. If you're dealing with a very large number of Eloquent records, consider using the lazy method instead.
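To make the change concrete, here's a minimal sketch of how generateFeatures could look with cursor(); everything else is lifted from the command above, only the get() call (and the unset) goes away.

<?php
...
protected function generateFeatures()
{
    // cursor() keeps only one Photo model in memory at a time
    $photos = Photo::select('lat', 'lon')->cursor();

    $features = [];

    foreach ($photos as $photo) {
        $features[] = [
            'type' => 'Feature',
            'geometry' => [
                'type' => 'Point',
                'coordinates' => [$photo->lon, $photo->lat]
            ]
        ];
    }

    // no unset($photos) needed anymore

    Storage::put('/data/features.json', json_encode($features, JSON_NUMERIC_CHECK));
}

And if the photos table keeps growing, swapping cursor() for lazy() here should be essentially a drop-in change, as the docs suggest.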

Use batches instead of individual inserts

Looking at the loop that iterates over all the clusters, it's hammering the database with thousands of individual MySQL insert queries to store them. It would be much better to use the Cluster::insert() method instead of Cluster::create(). This way, we only execute one query to insert all the records.

<?php
...
$clusters = json_decode(Storage::get('/data/clusters.json'));

$result = [];

foreach ($clusters as $cluster) {
    if (isset($cluster->properties)) {
        $result[] = [
            'lat' => $cluster->geometry->coordinates[1],
            'lon' => $cluster->geometry->coordinates[0],
            'zoom' => $zoomLevel
        ];
    }
}

Cluster::insert($result);

Now this will of course fail, because inserting many megabytes of data in a single query is a big no-no for the database. Let's see how we can insert the data in chunks instead.

<?php
...
collect(json_decode(Storage::get('/data/clusters.json')))
    ->filter(function ($cluster) {
        return isset($cluster->properties);
    })
    ->map(function ($cluster) use ($zoomLevel) {
        return [
            'lat' => $cluster->geometry->coordinates[1],
            'lon' => $cluster->geometry->coordinates[0],
            'zoom' => $zoomLevel
        ];
    })
    ->chunk(1000)
    ->each(function ($chunk) {
        Cluster::insert($chunk->toArray());
    });

We're doing a couple of things here. First, we're wrapping the clusters in a Laravel Collection instance, so the code becomes more readable and we get some nice helpers. Using the filter() method on the collection, we move the isset() check out of the mapping (or looping). The call to map() then transforms the $cluster objects into simple arrays, preparing them for insertion.

Now here's the fun part. The call to ->chunk(1000) splits the collection items into multiple chunks of 1,000 each. The call to each() after that iterates over all the chunks and inserts them separately. This lets us execute a much smaller number of queries without sacrificing performance.
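To put rough numbers on it: if a zoom level produced, say, 50,000 clusters, chunking at 1,000 would mean about 50 insert statements for that level instead of 50,000 individual creates. The real counts will vary per zoom level, but the saving scales the same way.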

Bonus small improvements

There are three small improvements I'd like to mention. Just to feel... complete.

<?php
// 1.
// Pushing an element to an array
array_push($features, $feature);

// Becomes shorter, and
// the PHPStorm IDE even suggests it as a refactoring
$features[] = $feature;

// 2.
// Getting a range of numbers
$zoomLevels = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];

// Becomes like this;
// the range() function can be used for letters too
$zoomLevels = range(2, 16);

// 3.
// And probably the most important refactoring
if (app()->environment() === 'local') {
    $prefix = config('app.root_dir');
} else if (app()->environment() === 'staging') {
    $prefix = '/home/***/olmdev.***';
} else {
    $prefix = '/home/***/openlittermap.com';
}

// Becomes simply...
// There are countless Laravel helpers out there.
// This one gives you the full path where your project is located.
$prefix = base_path();
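Putting it all together, here's a rough sketch of what generateClusters could end up looking like once the chunked inserts, range(), and base_path() changes are applied. The paths, the Node call, and the Cluster columns are the same ones from the original command; only the pieces discussed in this post have changed.

<?php
...
protected function generateClusters()
{
    Cluster::truncate();

    foreach (range(2, 16) as $zoomLevel) {
        // base_path() replaces the environment-specific prefix logic
        exec('node app/Node/supercluster-php ' . base_path() . ' ' . $zoomLevel);

        collect(json_decode(Storage::get('/data/clusters.json')))
            ->filter(function ($cluster) {
                return isset($cluster->properties);
            })
            ->map(function ($cluster) use ($zoomLevel) {
                return [
                    'lat' => $cluster->geometry->coordinates[1],
                    'lon' => $cluster->geometry->coordinates[0],
                    'zoom' => $zoomLevel
                ];
            })
            ->chunk(1000)
            ->each(function ($chunk) {
                // one insert query per 1,000 clusters instead of one per cluster
                Cluster::insert($chunk->toArray());
            });
    }
}

Combined with the cursor() change in generateFeatures, the command should now be light enough on memory and the database to run daily.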

The code used for illustration is taken from the OpenLitterMap project. They're doing a great job creating the world's most advanced open database on litter, brands & plastic pollution. The project is open source, and they'd love your contributions, both as users and developers.


Originally published at https://genijaho.dev.
