Parallel file downloading with cURL

#php

This is a consequence of integrating cURL fully into the EventLoop. The option CURLOPT_FILE => $fp tells #cURL to write the response directly to a file, and that write now goes through the async engine.
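
To see what CURLOPT_FILE does on its own, here is a minimal sketch in plain PHP, without the async engine. It is only an illustration: a file:// URL and temp files stand in for a real download (both are assumptions of this sketch, not part of the script below), and it assumes the file protocol is enabled in your cURL build.

```php
<?php
// Sketch: CURLOPT_FILE streams the response body straight into an open file handle.
// A file:// URL stands in for a real download, so no network is needed.

$source = tempnam(sys_get_temp_dir(), 'src');
$target = tempnam(sys_get_temp_dir(), 'dst');
file_put_contents($source, "hello from cURL\n");

// Build a file:// URL that works with both Unix and Windows paths
$url = 'file:///' . ltrim(str_replace('\\', '/', $source), '/');

$fp = fopen($target, 'wb');
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FILE, $fp);   // body goes to $fp, not to stdout
$ok = curl_exec($ch);                  // true on success, false on failure
fclose($fp);                           // flush and release the handle

$written = file_get_contents($target);
echo $ok
    ? 'Wrote ' . strlen($written) . " bytes\n"
    : 'Transfer failed: ' . curl_error($ch) . "\n";

// Clean up the temp files
unlink($source);
unlink($target);
```

Nothing is buffered in a PHP string here: cURL hands each chunk to the file handle as it arrives, which is exactly the write path the async engine takes over.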

Windows does not handle file I/O descriptors properly, so there the code runs via IOCP plus a thread pool. As a result, all 20 files are written not just concurrently but truly in parallel, hence the unusual performance boost.

On Linux, #io_uring alone would be enough for this, but that support still needs some work.

<?php

use function Async\spawn;
use function Async\await_all;

// Directory where downloaded files will be stored
$downloadDir = __DIR__ . '/downloads';

// Create directory if it does not exist
if (!is_dir($downloadDir)) {
    mkdir($downloadDir, 0755, true);
}

// Prepare list of files to download (WordPress plugins)
$files = array_map(fn($slug) => [
    'url'      => "https://downloads.wordpress.org/plugin/{$slug}.zip",
    'filename' => "{$slug}.zip",
], [
    'classic-editor',
    'akismet',
    'contact-form-7',
    'jetpack',
    'woocommerce',
    'wordfence',
    'elementor',
    'yoast-seo',
    'wpforms-lite',
    'really-simple-ssl',
    'all-in-one-seo-pack',
    'updraftplus',
    'litespeed-cache',
    'w3-total-cache',
    'duplicate-post',
    'mailchimp-for-wp',
    'regenerate-thumbnails',
    'redirection',
    'cookie-law-info',
    'wp-super-cache',
]);

/**
 * Downloads a file using cURL and saves it to disk
 */
function downloadFile(string $url, string $savePath): array
{
    // Open file for writing
    $fp = fopen($savePath, 'wb');
    if ($fp === false) {
        return ['success' => false, 'error' => "Failed to open file: $savePath"];
    }

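    // Initialize cURL session (clean up the empty file if that fails)
    $ch = curl_init($url);
    if ($ch === false) {
        fclose($fp);
        unlink($savePath);
        return ['success' => false, 'error' => "Failed to initialize cURL for: $url"];
    }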

    // Configure cURL options
    curl_setopt_array($ch, [
        CURLOPT_FILE           => $fp,              // Write response directly to file
        CURLOPT_FOLLOWLOCATION => true,             // Follow redirects
        CURLOPT_MAXREDIRS      => 5,                // Max redirects
        CURLOPT_TIMEOUT        => 120,              // Max total transfer time, in seconds
        CURLOPT_CONNECTTIMEOUT => 10,               // Connection timeout, in seconds
        CURLOPT_USERAGENT      => 'PHP CURL Downloader/1.0',
        CURLOPT_FAILONERROR    => true,             // Fail on HTTP errors (4xx, 5xx)
    ]);

    // Execute request
    $ok    = curl_exec($ch);
    $error = curl_error($ch);
    $info  = curl_getinfo($ch);

    fclose($fp);

    // Handle failure: remove partial file
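    if ($ok === false) {
        unlink($savePath);
        // Include the filename so the failing download can be identified in the output
        $reason = $error ?: "HTTP {$info['http_code']}";
        return ['success' => false, 'error' => basename($savePath) . ": $reason"];
    }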

    // Return download metadata
    return [
        'success'  => true,
        'filename' => basename($savePath),
        'bytes'    => $info['size_download'],
        'speed'    => round($info['speed_download'] / 1024, 1) . ' KB/s',
    ];
}

$startTime = microtime(true);

// Start parallel downloads
echo "Starting " . count($files) . " downloads in parallel...\n\n";

$coroutines = [];
foreach ($files as $file) {
    $savePath = $downloadDir . '/' . $file['filename'];

    // Log start of download
    echo "↓ Start: {$file['filename']}\n";

    // Spawn coroutine for each download
    $coroutines[] = spawn(fn() => downloadFile($file['url'], $savePath));
}

echo "\n";

// Wait for all coroutines to complete
[$results, $exceptions] = await_all($coroutines);

// Process successful/failed results
foreach ($results as $result) {
    if ($result['success']) {
        $kb = round($result['bytes'] / 1024, 1);
        echo "✓ {$result['filename']}: {$kb} KB  [{$result['speed']}]\n";
    } else {
        echo "✗ Error: {$result['error']}\n";
    }
}

// Handle coroutine-level exceptions
foreach ($exceptions as $i => $e) {
    echo "✗ Coroutine $i failed: {$e->getMessage()}\n";
}

// Print total execution time
$elapsed = round(microtime(true) - $startTime, 2);
echo "\nDone in {$elapsed}s. Files saved to: $downloadDir\n";
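
For contrast, here is a rough sketch of what stock PHP offers without the async engine: curl_multi runs several transfers concurrently, but one loop on one thread drives them all, so the file writes take turns rather than happening in parallel. The file:// URLs, temp directory, and names below are assumptions made for the illustration, not part of the script above.

```php
<?php
// Sketch: concurrent transfers in stock PHP with curl_multi (no async engine).
// All transfers are driven by one loop on one thread.

$dir = sys_get_temp_dir() . '/curl_multi_demo_' . uniqid();
mkdir($dir);

$mh   = curl_multi_init();
$jobs = [];

foreach (['a', 'b', 'c'] as $name) {
    $src = "$dir/$name.src";
    $dst = "$dir/$name.out";
    file_put_contents($src, "payload $name");

    // One easy handle per transfer, each writing to its own file
    $fp = fopen($dst, 'wb');
    $ch = curl_init('file:///' . ltrim(str_replace('\\', '/', $src), '/'));
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_multi_add_handle($mh, $ch);

    $jobs[$name] = ['ch' => $ch, 'fp' => $fp, 'src' => $src, 'dst' => $dst];
}

// Drive every transfer from a single loop until none are still running
do {
    $status = curl_multi_exec($mh, $running);
    if ($running > 0) {
        curl_multi_select($mh, 1.0);   // wait for activity, up to 1 second
    }
} while ($running > 0 && $status === CURLM_OK);

// Collect results and clean up
$copied = [];
foreach ($jobs as $name => $job) {
    curl_multi_remove_handle($mh, $job['ch']);
    fclose($job['fp']);
    $copied[$name] = file_get_contents($job['dst']);
    unlink($job['src']);
    unlink($job['dst']);
}
rmdir($dir);

print_r($copied);
```

The coroutine version above reads like ordinary blocking code, while this one needs an explicit loop and bookkeeping per handle.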

DEV Community