I recently shipped a static local-business site with ~112 pages, all generated from a single Python script and deployed on Cloudflare Pages. Here is the architecture, along with the SEO pitfalls I had to fix.
The stack
- Generator: one `generate_site.py` that renders every page from data dicts (one entry per neighborhood/product).
- Hosting: Cloudflare Pages, deployed via `wrangler pages deploy`.
- Clean URLs: a `_redirects` file maps `/*.html -> /:splat 301` so URLs stay extensionless.
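To make the "one script, data dicts in, pages out" idea concrete, here is a minimal sketch of that pattern. It is not the actual `generate_site.py`: the `PAGES` dict, the template, and the `render_page` / `build_site` names are placeholders I made up for illustration.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical data: one dict per page, keyed by slug (placeholder content).
PAGES = {
    "neighborhood-a": {"title": "Neighborhood A", "body": "Placeholder copy for A."},
    "neighborhood-b": {"title": "Neighborhood B", "body": "Placeholder copy for B."},
}

# Assumed minimal template; a real site would carry meta tags, nav, etc.
PAGE_TEMPLATE = Template("""<!doctype html>
<html lang="en">
<head><meta charset="utf-8"><title>$title</title></head>
<body><h1>$title</h1><p>$body</p></body>
</html>
""")


def render_page(entry: dict) -> str:
    """Render one data dict to an HTML string, escaping the text fields."""
    return PAGE_TEMPLATE.substitute(
        title=html.escape(entry["title"]),
        body=html.escape(entry["body"]),
    )


def build_site(pages: dict, out_dir: Path) -> list[Path]:
    """Write one <slug>.html file per entry and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, entry in sorted(pages.items()):
        path = out_dir / f"{slug}.html"
        path.write_text(render_page(entry), encoding="utf-8")
        written.append(path)
    return written
```

Calling `build_site(PAGES, Path("dist"))` would write `dist/neighborhood-a.html` and `dist/neighborhood-b.html`, and that output directory is what gets handed to the deploy step.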
The duplicate-content trap
The first version generated dozens of near-identical neighborhood pages (~72% similarity). Google's response was brutal: Search Console reported them as "Discovered – currently not indexed". The fix:
- inject genuinely unique local data per page (landmarks, transit, real street names);
- deterministically rotate sentence variants by slug so boilerplate differs page to page;
- vary section order.
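The second and third points can be sketched roughly like this. The variant strings, section names, and function names are invented for the example. The one detail that matters is deriving the seed with `hashlib` rather than the built-in `hash()`, which is salted per process for strings and would reshuffle every page on each build.

```python
import hashlib
import random

# Hypothetical boilerplate variants; {name} is filled in per page.
INTRO_VARIANTS = [
    "We deliver across {name}.",
    "{name} is inside our delivery area.",
    "Ordering from {name}? We cover it.",
]

# Hypothetical section identifiers.
SECTIONS = ["landmarks", "transit", "streets", "faq"]


def slug_seed(key: str) -> int:
    """Stable integer derived from a string, identical on every build."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_variant(slug: str, variants: list[str], salt: str = "intro") -> str:
    """Pick one variant per slug; the salt decorrelates different slots."""
    return variants[slug_seed(f"{slug}:{salt}") % len(variants)]


def section_order(slug: str, sections: list[str]) -> list[str]:
    """Shuffle a copy of the sections with a slug-seeded RNG."""
    rng = random.Random(slug_seed(f"{slug}:order"))
    ordered = list(sections)
    rng.shuffle(ordered)
    return ordered
```

Because everything is keyed on the slug, rebuilding the site does not churn the content of pages that have already been crawled.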
Similarity dropped from ~72% to ~18%.
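For anyone who wants to track a number like this in their own build, one common way is Jaccard similarity over word shingles. That is an assumption on my part for this sketch, not necessarily the metric behind the figures above, and the function names and shingle size `k` are arbitrary.

```python
import itertools
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Set of overlapping k-word windows from the lowercased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def mean_pairwise_similarity(pages: dict[str, str], k: int = 3) -> float:
    """Average Jaccard similarity over every pair of page texts."""
    pairs = list(itertools.combinations(pages.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b, k) for a, b in pairs) / len(pairs)
```

Run it on the visible text of each page (tags stripped), otherwise the shared HTML shell inflates the score.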
Internal linking matters more than people think
Near-orphan pages (linked only from one hub) barely get crawled. I made the "related pages" block contextual (geographic neighbors + rotation) so every page receives 8-12 internal links.
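Here is one way the "geographic neighbors + rotation" idea could look, as a sketch under my own assumptions rather than the production code. Each page links to its nearest pages by coordinates, then to the next few slugs in a sorted ring. The ring part is what guarantees inbound links: every page is linked from the `n_ring` pages that precede it. The `coords` mapping, the function names, and the 5 + 5 split are all hypothetical.

```python
import math


def nearest(slug: str, coords: dict[str, tuple[float, float]], n: int) -> list[str]:
    """The n closest other pages, using plain Euclidean distance on
    (lat, lon). Crude, but adequate at city scale for ranking neighbors."""
    lat0, lon0 = coords[slug]
    others = [s for s in coords if s != slug]
    others.sort(key=lambda s: (math.hypot(coords[s][0] - lat0,
                                          coords[s][1] - lon0), s))
    return others[:n]


def related_links(slug: str, coords: dict[str, tuple[float, float]],
                  n_near: int = 5, n_ring: int = 5) -> list[str]:
    """Nearest neighbors first, then successors in the sorted slug ring."""
    ring = sorted(coords)
    i = ring.index(slug)
    links = nearest(slug, coords, n_near)
    step = 1
    # step < len(ring) means the candidate can never be the page itself.
    while len(links) < n_near + n_ring and step < len(ring):
        candidate = ring[(i + step) % len(ring)]
        if candidate not in links:
            links.append(candidate)
        step += 1
    return links
```

With more than `n_near + n_ring` pages, each page gets exactly 10 outbound related links and at least `n_ring` inbound ones, no matter how unevenly the geographic neighbors cluster.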
You can see the live result here: livraison-alcool-toulouse.com (a night delivery service in Toulouse, France).
Takeaways
- Programmatic SEO works, but near-duplicate templates get ignored.
- A custom `404.html` is mandatory on CF Pages, otherwise unknown paths soft-404 with a 200.
- Internal link distribution is a real ranking/crawl signal.
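For the `404.html` point, a small guard in the build step keeps the file from going missing. This is only a sketch: `ensure_404` and the fallback markup are hypothetical, and it assumes the file belongs at the top level of the directory you deploy.

```python
from pathlib import Path

# Assumed fallback markup; swap in the site's real 404 design.
NOT_FOUND_HTML = """<!doctype html>
<html lang="en">
<head><meta charset="utf-8"><title>Page not found</title></head>
<body><h1>Page not found</h1><p><a href="/">Back to the homepage</a></p></body>
</html>
"""


def ensure_404(out_dir: Path) -> Path:
    """Make sure out_dir contains a top-level 404.html; never overwrite one."""
    target = out_dir / "404.html"
    if not target.exists():
        out_dir.mkdir(parents=True, exist_ok=True)
        target.write_text(NOT_FOUND_HTML, encoding="utf-8")
    return target
```

After deploying, it is worth requesting a path that does not exist and confirming the response status really is 404.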
Happy to share the generator pattern if anyone's interested.