Benyamin Khalife

Generating Sitemaps is Easy

I've been building PHP projects for a while, and one of those annoying little tasks that keeps coming up is generating sitemaps. Not because it's technically hard — it's just tedious. You either roll your own XML string builder (and inevitably forget an edge case), or you pull in a heavy package that does ten things you don't need.
So I wrote a small library called webrium/sitemap to handle it cleanly. No bloat, just a focused tool that covers the real-world cases: images, videos, hreflang for multilingual sites, gzip, and automatic splitting for large sites.


Getting Started

composer require webrium/sitemap

That's it. The library needs PHP 8.1+ and the xmlwriter extension (which is almost always enabled by default).


The Basics

use Webrium\Sitemap\Sitemap;

$sitemap = new Sitemap('https://example.com');

$sitemap->addUrl('/',       changefreq: Sitemap::FREQ_DAILY,   priority: 1.0); // → https://example.com
$sitemap->addUrl('/about',  changefreq: Sitemap::FREQ_MONTHLY, priority: 0.8);
$sitemap->addUrl('/blog',   changefreq: Sitemap::FREQ_DAILY,   priority: 0.9);

echo $sitemap->generate();

One thing worth noting: passing '/' for the homepage correctly produces https://example.com — no trailing slash appended. Clean URLs matter, especially when Google is reading them.
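To make the trailing-slash behaviour concrete, here is a minimal sketch of how a base URL and a path could be joined so that '/' maps to the bare domain. joinUrl() is a hypothetical helper written for this illustration; it is not part of webrium/sitemap and may differ from what the library does internally.

```php
<?php
// Hypothetical helper (not the library's code): join a base URL and a path.
function joinUrl(string $base, string $path): string
{
    $base = rtrim($base, '/');   // drop any trailing slash on the base
    $path = ltrim($path, '/');   // drop the leading slash on the path

    // An empty path means "homepage": return the base with nothing appended.
    return $path === '' ? $base : $base . '/' . $path;
}

echo joinUrl('https://example.com', '/'), PHP_EOL;       // https://example.com
echo joinUrl('https://example.com/', '/about'), PHP_EOL; // https://example.com/about
```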


Full URL Options

Each URL can carry a lastmod, changefreq, and priority. Named arguments make this readable even when you're using all of them:

use DateTime;

$sitemap->addUrl(
    path:       '/blog/my-first-post',
    lastmod:    new DateTime('2025-04-15'),
    changefreq: Sitemap::FREQ_WEEKLY,
    priority:   0.7
);

changefreq tells crawlers how often the content changes. Available values:


Constant                 Value
Sitemap::FREQ_ALWAYS     always
Sitemap::FREQ_HOURLY     hourly
Sitemap::FREQ_DAILY      daily
Sitemap::FREQ_WEEKLY     weekly
Sitemap::FREQ_MONTHLY    monthly
Sitemap::FREQ_YEARLY     yearly
Sitemap::FREQ_NEVER      never

priority is a float between 0.0 and 1.0. Your homepage is typically 1.0, blog posts somewhere around 0.6–0.8. Search engines don't treat this as a hard rule, but it's still worth setting thoughtfully.

lastmod accepts any DateTimeInterface, so you can pass a DateTime, DateTimeImmutable, or Carbon instance from your models directly.
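As a rough illustration of why typing against DateTimeInterface is convenient, the sketch below formats both a DateTime and a DateTimeImmutable with the same function. formatLastmod() is a made-up name, and the W3C format used here is an assumption about a reasonable lastmod format, not a statement of what the library actually emits.

```php
<?php
// Hypothetical helper (not a webrium/sitemap method): turn any
// DateTimeInterface into a string suitable for a <lastmod> element.
function formatLastmod(DateTimeInterface $date): string
{
    // DATE_W3C is the built-in constant for "Y-m-d\TH:i:sP".
    return $date->format(DATE_W3C);
}

$utc = new DateTimeZone('UTC');

// Mutable and immutable dates go through the same code path.
echo formatLastmod(new DateTime('2025-04-15', $utc)), PHP_EOL;
// 2025-04-15T00:00:00+00:00
echo formatLastmod(new DateTimeImmutable('2025-04-15 08:30:00', $utc)), PHP_EOL;
// 2025-04-15T08:30:00+00:00
```

Carbon extends PHP's own date classes, so a Carbon instance would pass the same type check.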


Adding URLs in Bulk

If you're pulling pages from a database, addUrls() is cleaner than looping manually:

$sitemap->addUrls([
    ['path' => '/services',  'priority' => 0.9],
    ['path' => '/portfolio', 'changefreq' => Sitemap::FREQ_MONTHLY],
    ['path' => '/contact',   'lastmod' => new DateTime('2025-01-10'), 'priority' => 0.5],
]);

Duplicate URLs are silently ignored — you don't need to deduplicate before passing.
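For the database case, one way to prepare the input is to map rows into the array shape shown above before handing them to addUrls(). This is only a sketch: the rowsToEntries() function, the slug and updated_at column names, and the /blog/ prefix are all invented for the example.

```php
<?php
// Hypothetical mapper: converts database-style rows into the
// ['path' => ..., 'lastmod' => ..., 'priority' => ...] shape used by addUrls().
function rowsToEntries(array $rows): array
{
    $entries = [];
    foreach ($rows as $row) {
        $entries[] = [
            'path'     => '/blog/' . $row['slug'],                  // assumed URL scheme
            'lastmod'  => new DateTimeImmutable($row['updated_at']), // assumed column name
            'priority' => 0.7,
        ];
    }

    return $entries;
}

// Stand-in for a query result.
$rows = [
    ['slug' => 'my-first-post', 'updated_at' => '2025-04-15'],
    ['slug' => 'second-post',   'updated_at' => '2025-05-02'],
];

$entries = rowsToEntries($rows);
// $sitemap->addUrls($entries); // pass the result to the library

echo count($entries), ' entries, first path: ', $entries[0]['path'], PHP_EOL;
// 2 entries, first path: /blog/my-first-post
```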


Images

Google's image sitemap extension lets you associate images with pages, which can improve image search visibility:

$sitemap->addUrl(
    path:   '/portfolio/project-alpha',
    images: [
        [
            'loc'     => 'https://example.com/images/project-alpha-hero.jpg',
            'title'   => 'Project Alpha — hero shot',
            'caption' => 'The main dashboard view of Project Alpha.',
        ],
        [
            'loc'   => 'https://example.com/images/project-alpha-mobile.jpg',
            'title' => 'Project Alpha — mobile view',
        ],
    ]
);

loc is required. title and caption are optional but recommended.


Videos

Same idea for videos, with a few more required fields:

$sitemap->addUrl(
    path:   '/tutorials/getting-started',
    videos: [
        [
            'thumbnail_loc' => 'https://example.com/thumbs/getting-started.jpg',
            'title'         => 'Getting Started in 5 Minutes',
            'description'   => 'A quick walkthrough of the core features.',
            'duration'      => 320, // seconds
        ],
    ]
);

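thumbnail_loc, title, and description are required. duration is optional. Google's video sitemap format also expects each video to carry a content_loc or player_loc, so include one of those if the library supports it.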


Multilingual Sites (Hreflang)

If your site serves content in multiple languages, you can attach hreflang alternates to each URL:

$sitemap->addUrl(
    path:      '/about',
    hreflangs: [
        ['lang' => 'en',        'url' => 'https://example.com/en/about'],
        ['lang' => 'fa',        'url' => 'https://example.com/fa/about'],
        ['lang' => 'x-default', 'url' => 'https://example.com/about'],
    ]
);

The x-default entry tells Google which version to show when no language preference matches.
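Writing the alternates by hand gets repetitive once you have more than a couple of languages. A small helper could build the array instead. This sketch assumes the locale-prefixed URL scheme from the example above (/en/about, /fa/about); buildHreflangs() is a hypothetical function, not something the library provides.

```php
<?php
// Hypothetical helper: build the hreflangs array for one path,
// assuming URLs look like {base}/{locale}{path}.
function buildHreflangs(string $baseUrl, string $path, array $locales): array
{
    $baseUrl    = rtrim($baseUrl, '/');
    $alternates = [];

    foreach ($locales as $locale) {
        $alternates[] = ['lang' => $locale, 'url' => $baseUrl . '/' . $locale . $path];
    }

    // x-default points at the unprefixed URL, as in the example above.
    $alternates[] = ['lang' => 'x-default', 'url' => $baseUrl . $path];

    return $alternates;
}

$hreflangs = buildHreflangs('https://example.com', '/about', ['en', 'fa']);
// $sitemap->addUrl(path: '/about', hreflangs: $hreflangs);

foreach ($hreflangs as $alt) {
    echo $alt['lang'], ' => ', $alt['url'], PHP_EOL;
}
// en => https://example.com/en/about
// fa => https://example.com/fa/about
// x-default => https://example.com/about
```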


Saving to File

// Plain XML
$sitemap->saveToFile('/var/www/public/sitemap.xml');

// Gzip compressed — just add .gz to the filename
$sitemap->saveToFile('/var/www/public/sitemap.xml.gz');

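Gzip is worth enabling in production. Sitemap XML is highly repetitive, so the file typically ends up 5–10x smaller, and the sitemap protocol explicitly allows gzipped files, so the major crawlers handle them fine.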
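If you want to see the effect on your own data, a quick experiment with PHP's zlib functions does the job. The snippet below is independent of the library: it builds a sitemap-shaped string by hand and compresses it with gzencode(), assuming the zlib extension is available. The exact ratio depends on your URLs, so no figures are claimed here.

```php
<?php
// Build a repetitive XML string shaped like a small sitemap.
$xml = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
for ($i = 1; $i <= 1000; $i++) {
    $xml .= "  <url><loc>https://example.com/blog/post-$i</loc></url>\n";
}
$xml .= "</urlset>\n";

// Compress at the highest level (9) and compare sizes.
$gz = gzencode($xml, 9);

printf("plain: %d bytes, gzip: %d bytes\n", strlen($xml), strlen($gz));

// Round-trip check: decompressing gives back the original XML.
var_dump(gzdecode($gz) === $xml);
```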


Large Sites — Auto Splitting

The sitemap protocol caps each file at 50,000 URLs and 50 MB uncompressed. For bigger sites, splitAndSave() handles the splitting automatically and generates a sitemap index:

$sitemap = new Sitemap('https://example.com');

foreach ($allPages as $page) {
    $sitemap->addUrl($page->path, $page->updatedAt);
}

$indexXml = $sitemap->splitAndSave(
    directory:   '/var/www/public/sitemaps',
    baseFileUrl: 'https://example.com/sitemaps',
    prefix:      'sitemap',
    gzip:        true
);

file_put_contents('/var/www/public/sitemap-index.xml', $indexXml);

This produces sitemap-1.xml.gz, sitemap-2.xml.gz, and so on — plus a sitemap index file that points to all of them. Submit the index URL to Google Search Console and you're done.
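Conceptually, the URL-count half of the splitting is just chunking. The sketch below shows that idea with array_chunk() and the file naming described above. planSitemapFiles() is a hypothetical function for illustration only; it ignores the 50 MB size limit and is not how splitAndSave() is necessarily implemented.

```php
<?php
// Hypothetical planner: group URLs into files of at most $limit entries
// and name them {prefix}-{n}.xml, with a .gz suffix when gzip is on.
function planSitemapFiles(array $urls, string $prefix, bool $gzip, int $limit = 50000): array
{
    $plan = [];
    foreach (array_chunk($urls, $limit) as $index => $chunk) {
        $name        = sprintf('%s-%d.xml%s', $prefix, $index + 1, $gzip ? '.gz' : '');
        $plan[$name] = $chunk;
    }

    return $plan;
}

// 120,000 fake paths need three files.
$urls = array_map(fn (int $n): string => "/page-$n", range(1, 120000));

foreach (planSitemapFiles($urls, 'sitemap', true) as $file => $chunk) {
    echo $file, ': ', count($chunk), ' URLs', PHP_EOL;
}
// sitemap-1.xml.gz: 50000 URLs
// sitemap-2.xml.gz: 50000 URLs
// sitemap-3.xml.gz: 20000 URLs
```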


Namespace Cleanliness

One small detail I care about: the library only adds XML namespaces that are actually used. If your sitemap has no images or videos, the output won't carry xmlns:image or xmlns:video. It's a minor thing, but it keeps the XML clean and honest.
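Here is roughly what "only declare what you use" can look like with PHP's built-in XMLWriter. renderUrlset() and its input shape (a loc plus an optional list of image URLs) are invented for this sketch and are not the library's code; the two namespace URIs are the standard sitemap and Google image sitemap ones.

```php
<?php
// Hypothetical renderer. Each entry: ['loc' => string, 'images' => string[] (optional)].
function renderUrlset(array $entries): string
{
    // Decide up front whether any entry carries images.
    $hasImages = false;
    foreach ($entries as $entry) {
        if (!empty($entry['images'])) {
            $hasImages = true;
            break;
        }
    }

    $w = new XMLWriter();
    $w->openMemory();
    $w->setIndent(true);
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('urlset');
    $w->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    if ($hasImages) {
        // Only declared when at least one <image:image> will be written.
        $w->writeAttribute('xmlns:image', 'http://www.google.com/schemas/sitemap-image/1.1');
    }

    foreach ($entries as $entry) {
        $w->startElement('url');
        $w->writeElement('loc', $entry['loc']);
        foreach ($entry['images'] ?? [] as $imageUrl) {
            $w->startElement('image:image');
            $w->writeElement('image:loc', $imageUrl);
            $w->endElement(); // image:image
        }
        $w->endElement(); // url
    }

    $w->endElement(); // urlset
    $w->endDocument();

    return $w->outputMemory();
}

// No images in the input, so the output has no xmlns:image attribute.
echo renderUrlset([['loc' => 'https://example.com/about']]);
```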


Wrapping Up

Nothing revolutionary here — it's a sitemap generator. But it handles the real cases correctly and gets out of your way. If you've been copy-pasting sitemap XML by hand or using a bloated package, give it a try.

GitHub / Packagist: composer require webrium/sitemap

