DEV Community

smail hachami


Building a Serverless SEO Metadata Analyzer at the Edge

When you dig into how search engines rank pages, reading theory only gets you so far; building your own tools teaches more. I wanted an automated system to test and understand on-page ranking factors. This isn't about spamming or exploiting algorithms; it's a legitimate, automated way to see a page's markup roughly the way a search engine crawler sees it.

By deploying this analyzer on serverless edge infrastructure, we can extract and analyze metadata with low latency and without the overhead of heavy web scrapers.

Here is how the architecture comes together.

The Core Architecture

To keep this lightweight and easy to scale, the stack relies on three main components:

  • Compute: A serverless edge environment (like Cloudflare Workers). The code runs in a data center close to whoever calls the API, which keeps round-trip latency to the caller low.
  • Parsing: The HTMLRewriter API. Instead of loading an entire DOM into memory (which is slow and expensive), this parses the HTML stream as it arrives.
  • Routing: A lightweight web framework to handle the incoming GET requests.
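As a rough illustration of the streaming approach, the sketch below defines handlers in the shape HTMLRewriter expects (objects with `element` and `text` callbacks). The `SeoCollector` class, its field names, and the trimmed-down `ElementLike` and `TextChunkLike` types are hypothetical and not taken from the original project. The wiring to `HTMLRewriter` appears only in a comment, since that global exists only in the Workers runtime.

```typescript
// Minimal structural types covering only the parts of the HTMLRewriter
// callbacks used here (assumption: trimmed down for illustration).
interface ElementLike {
  tagName: string;
  getAttribute(name: string): string | null;
}
interface TextChunkLike {
  text: string;
}

// Hypothetical collector: accumulates metadata as the HTML streams through.
class SeoCollector {
  title = "";
  description: string | null = null;
  canonical: string | null = null;
  headingCounts: Record<string, number> = {};

  // <title> text can arrive in several chunks, so append rather than assign.
  titleHandler = {
    text: (chunk: TextChunkLike): void => {
      this.title += chunk.text;
    },
  };

  descriptionHandler = {
    element: (el: ElementLike): void => {
      this.description = el.getAttribute("content");
    },
  };

  canonicalHandler = {
    element: (el: ElementLike): void => {
      this.canonical = el.getAttribute("href");
    },
  };

  // Counts headings per tag name, e.g. { h1: 1, h2: 4 }.
  headingHandler = {
    element: (el: ElementLike): void => {
      const tag = el.tagName.toLowerCase();
      this.headingCounts[tag] = (this.headingCounts[tag] ?? 0) + 1;
    },
  };
}

// Inside a Worker, the handlers would be attached roughly like this:
//   const seo = new SeoCollector();
//   let rewriter = new HTMLRewriter()
//     .on("title", seo.titleHandler)
//     .on('meta[name="description"]', seo.descriptionHandler)
//     .on('link[rel="canonical"]', seo.canonicalHandler);
//   for (const tag of ["h1", "h2", "h3", "h4", "h5", "h6"]) {
//     rewriter = rewriter.on(tag, seo.headingHandler);
//   }
//   const rewritten = rewriter.transform(await fetch(targetUrl));
//   await rewritten.arrayBuffer(); // drain the stream so the handlers run
```

Keeping the handlers as plain objects means they can be exercised with fake elements in a unit test, without the Workers runtime.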

What We Are Extracting

To understand a page's SEO footprint, the API automatically pulls the most critical on-page elements:

  1. <title> tags (checking for optimal character length).
  2. <meta name="description"> tags.
  3. Canonical URLs to check for duplicate content issues.
  4. Heading hierarchies (H1 through H6) to ensure the content is structured logically.
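The length checks in this list could be sketched as follows. This is a minimal example under stated assumptions: the 120-160 description range comes from the sample output in this post, while the 30-60 title range is a common rule of thumb and not a value the original states. The `checkLength` helper and the `LengthCheck` type are hypothetical names.

```typescript
interface LengthCheck {
  text: string;
  length: number;
  optimal: boolean;
  warning?: string;
}

interface LengthRange {
  min: number;
  max: number;
}

// Assumed thresholds (see the note above); adjust to taste.
const TITLE_RANGE: LengthRange = { min: 30, max: 60 };
const DESCRIPTION_RANGE: LengthRange = { min: 120, max: 160 };

// Normalizes whitespace, measures the text, and flags it when it falls
// outside the given range.
function checkLength(label: string, raw: string, range: LengthRange): LengthCheck {
  const text = raw.trim().replace(/\s+/g, " ");
  const length = text.length;
  const result: LengthCheck = {
    text,
    length,
    optimal: length >= range.min && length <= range.max,
  };
  if (length < range.min) {
    result.warning = `${label} is under the recommended ${range.min}-${range.max} character limit.`;
  } else if (length > range.max) {
    result.warning = `${label} is over the recommended ${range.min}-${range.max} character limit.`;
  }
  return result;
}
```

For example, `checkLength("Description", text, DESCRIPTION_RANGE)` marks a 61-character description as not optimal and attaches an "under the recommended" warning.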

The JSON Response

When you send a GET request to the worker with a target URL, it processes the stream and returns a clean, structured analysis ready for any dashboard.
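Before fetching anything, the worker needs to read and validate the target URL from the incoming request. The sketch below assumes the target is passed in a `url` query parameter; the parameter name and the `parseTargetUrl` helper are hypothetical, since the original does not say how the target is supplied.

```typescript
// Hypothetical helper: reads the target from a `url` query parameter
// (the parameter name is an assumption) and accepts only http(s) targets.
function parseTargetUrl(requestUrl: string): URL | null {
  const raw = new URL(requestUrl).searchParams.get("url");
  if (raw === null) {
    return null; // parameter missing
  }
  let target: URL;
  try {
    target = new URL(raw);
  } catch {
    return null; // not a valid absolute URL
  }
  const isHttp = target.protocol === "http:" || target.protocol === "https:";
  return isHttp ? target : null;
}
```

A request handler could respond with a 400 status when this returns `null`, and fetch `target.href` otherwise.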

Here is an example of the output:


```json
{
  "target_url": "https://example.com",
  "status_code": 200,
  "seo_metrics": {
    "title": {
      "text": "Example Domain - High Performance Hosting",
      "length": 41,
      "optimal": true
    },
    "description": {
      "text": "The best hosting solutions for high-concurrency environments.",
      "length": 61,
      "optimal": false,
      "warning": "Description is under the recommended 120-160 character limit."
    },
    "canonical": "https://example.com",
    "headers": {
      "h1_count": 1,
      "h2_count": 4
    }
  }
}
```
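For a dashboard written in TypeScript, the response shape above can be described with a type. The sketch below is inferred from the sample output only; the `SeoReport` and `TextMetric` names, the nullable `canonical`, and the `collectWarnings` helper are assumptions for illustration.

```typescript
interface TextMetric {
  text: string;
  length: number;
  optimal: boolean;
  warning?: string; // present only when the metric is flagged
}

interface SeoReport {
  target_url: string;
  status_code: number;
  seo_metrics: {
    title: TextMetric;
    description: TextMetric;
    canonical: string | null; // assumption: null when the page has no canonical link
    headers: Record<string, number>; // e.g. { h1_count: 1, h2_count: 4 }
  };
}

// Hypothetical helper: gathers every warning string present in a report.
function collectWarnings(report: SeoReport): string[] {
  const { title, description } = report.seo_metrics;
  return [title, description]
    .map((metric) => metric.warning)
    .filter((warning): warning is string => warning !== undefined);
}
```

Run against the sample response, `collectWarnings` returns a single entry: the description warning.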
