Our SEO Journey: From "Crawled - Not Indexed" to Search Visibility

Building a beautiful Single Page Application (SPA) is one thing. Getting Google to actually index it? That's an entirely different challenge.
This is the story of how we transformed our Kanaeru AI website from a client-side rendered React app that search engines couldn't properly index, to a fully optimized Next.js site with comprehensive SEO that ranks well on Google.
The Problem: Beautiful But Invisible
When we first launched our marketing website, we chose Lovable.dev as our starting point. Lovable uses Vite + React under the hood and gave us a well-designed base template with rapid initial development speed. We designed our entire site through Lovable's AI interface, then migrated the code to GitHub where we continued development entirely via Claude Code.
The result looked perfect to human visitors. The animations were smooth, the design was polished, and the content was compelling.
But there was a problem: Google couldn't see most of it.
Our Google Search Console was showing a frustrating pattern:
- Pages marked as "Crawled - currently not indexed"
- Blog posts returning the homepage HTML to crawlers
- Duplicate content issues across pages
- Missing structured data for rich snippets
The root cause? SPAs render content with JavaScript. Search engine crawlers, while improving, still struggle with JavaScript-heavy pages. When Googlebot visited our blog posts, it saw the same generic homepage HTML for every URL.
Phase 1: Foundation Work (October 2025)
Comprehensive SEO Infrastructure
Our first major fix addressed the fundamentals:
1. Sitemap Generation
We created a dynamic sitemap generator that runs on every build:
// scripts/generate-sitemap.mjs
const routes = [
  { url: '/', changefreq: 'weekly', priority: 1.0 },
  { url: '/platform', changefreq: 'monthly', priority: 0.8 },
  { url: '/team', changefreq: 'monthly', priority: 0.7 },
  { url: '/blog', changefreq: 'daily', priority: 0.9 },
  // ... blog posts dynamically added
];
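For completeness, here is a minimal sketch of how a routes array like this can be serialized into sitemap XML at build time; the helper names and output path are illustrative rather than our exact production script:

// Sketch: turning the routes array into public/sitemap.xml (import sits at the top of the file)
import { writeFileSync } from 'node:fs';

const BASE_URL = 'https://www.kanaeru.ai';

const urlEntries = routes
  .map((route) => `  <url>
    <loc>${BASE_URL}${route.url}</loc>
    <changefreq>${route.changefreq}</changefreq>
    <priority>${route.priority}</priority>
  </url>`)
  .join('\n');

const xml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urlEntries}
</urlset>`;

// Vite copies everything in public/ to the build output root
writeFileSync('public/sitemap.xml', xml);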
2. robots.txt for Modern Crawlers
We updated our robots.txt to explicitly allow both search engines and LLM crawlers:
User-agent: Googlebot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://www.kanaeru.ai/sitemap.xml
3. JSON-LD Structured Data
We added Organization, WebSite, and Service schemas to our homepage:
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Kanaeru AI",
  "url": "https://www.kanaeru.ai",
  "logo": "https://www.kanaeru.ai/logo.png",
  "sameAs": [
    "https://github.com/kanaerulabs",
    "https://www.linkedin.com/company/kanaeru-ai"
  ]
}
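Because a Vite + React SPA has no metadata API, the schema has to be emitted as a script tag from a component. A minimal sketch of the idea (the JsonLd component name and props are ours for illustration, not our exact code):

// Hypothetical JsonLd helper for the Vite/React site (sketch)
type JsonLdProps = { schema: Record<string, unknown> };

export function JsonLd({ schema }: JsonLdProps) {
  return (
    <script
      type="application/ld+json"
      // React needs dangerouslySetInnerHTML to emit raw JSON inside a script tag
      dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }}
    />
  );
}

Rendered into the document head (for example via react-helmet-async), a component like this produces the Organization block shown above.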
Blog Post Pre-rendering
The game-changer was implementing static HTML generation for blog posts. Instead of serving the same SPA shell to every request, we pre-rendered each blog post with:
- Complete meta tags (title, description, Open Graph, Twitter Cards)
- Full article content for crawlers
- Proper canonical URLs
- BlogPosting JSON-LD structured data
// scripts/prerender-blog.ts
import { writeFile } from 'node:fs/promises';

// BlogPost and generateBlogPostingSchema are defined elsewhere in this script
async function prerenderBlogPost(post: BlogPost) {
  const html = `<!DOCTYPE html>
<html lang="${post.locale}">
  <head>
    <title>${post.title}</title>
    <meta name="description" content="${post.excerpt}">
    <link rel="canonical" href="https://www.kanaeru.ai/blog/${post.slug}">
    <script type="application/ld+json">
      ${JSON.stringify(generateBlogPostingSchema(post))}
    </script>
  </head>
  <body>
    <article>${post.htmlContent}</article>
  </body>
</html>`;

  await writeFile(`public/prerendered/blog/${post.slug}.html`, html);
}
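Both the sitemap generator and the pre-render step have to run before the bundler copies public/ into the output. A sketch of how that can be wired up in package.json (script names and the tsx runner are illustrative, not necessarily our exact setup):

{
  "scripts": {
    "generate:sitemap": "node scripts/generate-sitemap.mjs",
    "prerender:blog": "tsx scripts/prerender-blog.ts",
    "build": "npm run generate:sitemap && npm run prerender:blog && vite build"
  }
}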
Phase 2: Fixing Critical Indexing Issues (October 2025)
After the foundation work, we still had issues. Google Search Console showed "Crawled - currently not indexed" for our blog posts. Investigation revealed several problems:
1. Wrong Canonical URLs
Our blog posts were pointing their canonical URL to the homepage instead of their own URL. This told Google "don't index me, index the homepage instead."
Fix: Updated the SEO library to generate correct canonical URLs for each page type.
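Conceptually, the fix meant deriving the canonical from each page's own path instead of defaulting to the site root. A simplified sketch (the helper and type names are illustrative, not the actual library API):

// Simplified sketch of per-page canonical generation (illustrative)
const SITE_URL = 'https://www.kanaeru.ai';

type PageType = 'home' | 'page' | 'blogPost';

function canonicalFor(type: PageType, slug?: string): string {
  switch (type) {
    case 'home':
      return `${SITE_URL}/`;
    case 'blogPost':
      // Each post canonicalizes to its own URL, never to the homepage
      return `${SITE_URL}/blog/${slug}`;
    default:
      return `${SITE_URL}/${slug}`;
  }
}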
2. Missing BlogPosting Schema
Generic Organization schema wasn't enough. Blog posts need specific BlogPosting structured data:
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Article Title",
  "datePublished": "2025-10-13",
  "dateModified": "2025-10-15",
  "author": {
    "@type": "Person",
    "name": "Shreyas Shinde"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Kanaeru AI"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://www.kanaeru.ai/blog/article-slug"
  }
}
3. Empty Image Fields
Schema.org requires images. We were leaving image fields empty, which caused validation failures.
Fix: Added fallback logic to use default images when post-specific images weren't available.
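The fallback itself is simple; a sketch of the idea, with an illustrative default image path:

// Sketch of the image fallback used when building BlogPosting schema (illustrative)
const DEFAULT_OG_IMAGE = 'https://www.kanaeru.ai/og-default.png';

function schemaImageFor(post: { coverImage?: string }): string {
  // Never emit an empty "image" field - fall back to the site-wide default
  return post.coverImage && post.coverImage.length > 0
    ? post.coverImage
    : DEFAULT_OG_IMAGE;
}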
Phase 3: Performance Optimization (October 2025)
SEO isn't just about content - Core Web Vitals directly impact rankings. Our PageSpeed Insights scores were suffering on several fronts, detailed below.
(Screenshot: desktop scores after optimization. Mobile performance is still a work in progress.)
Render-Blocking Resources
Google Fonts loaded via CSS @import blocked rendering for 1.6+ seconds.
Fix: Switched to async font loading:
<link rel="preload" href="https://fonts.googleapis.com/css2?family=Inter"
as="style" onload="this.onload=null;this.rel='stylesheet'">
<noscript>
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Inter">
</noscript>
Unused JavaScript
Targeting ES5 for broad compatibility bloated our bundles unnecessarily.
Fix: Updated to ES2020 target with better code splitting:
// vite.config.ts
import { defineConfig } from 'vite';

export default defineConfig({
  // ...plugins and other options omitted
  build: {
    target: 'es2020',
    rollupOptions: {
      output: {
        manualChunks: {
          'react-vendor': ['react', 'react-dom'],
          'router': ['react-router-dom'],
          'i18n': ['i18next', 'react-i18next'],
          'markdown': ['marked', 'prismjs']
        }
      }
    }
  }
});
Cache Headers
Static assets weren't being cached properly, causing repeat visitors to re-download everything.
Fix: Added aggressive cache headers via vercel.json:
{
  "headers": [
    {
      "source": "/assets/(.*)",
      "headers": [
        { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
      ]
    }
  ]
}
Phase 4: Off-Page SEO & Backlink Building (October-November 2025)
On-page SEO is only half the battle. Search engines also evaluate your site's authority based on external signals - primarily backlinks from other reputable websites.
Cross-Publishing with Growth Kit
In October, we built Growth Kit, a Claude Code plugin that automatically transforms our blog posts into platform-specific content for:
- LinkedIn - Professional articles with proper formatting
- Medium - Long-form content with canonical URLs pointing back to our site
- Dev.to - Technical content for the developer community
- X/Twitter - Thread summaries with links to full articles
Each cross-published article includes a canonical URL back to our original post, ensuring:
- No duplicate content penalties - Search engines know where the original lives
- Backlink juice flows back - Links from Medium, Dev.to, and LinkedIn boost our domain authority
- Wider reach - Content reaches audiences on multiple platforms
- Brand consistency - Same message, optimized for each platform
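On Dev.to, for instance, the canonical is declared in the article's front matter, which the cross-publishing step fills in from the original post. A sketch of such a generator (the helper is illustrative; canonical_url is Dev.to's actual front matter field):

// Sketch: building Dev.to front matter that points the canonical back to our site
function devtoFrontMatter(post: { title: string; slug: string }): string {
  return [
    '---',
    `title: ${post.title}`,
    'published: true',
    `canonical_url: https://www.kanaeru.ai/blog/${post.slug}`,
    '---',
  ].join('\n');
}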
Directory Submissions
In November, we submitted our site to startup and product directories to build initial backlinks:
- RankingPublic - Startup directory with do-follow links
- TinyLaunch - Product launch platform for early-stage startups
- Product Hunt - For product launches and visibility
- Various AI directories - Niche-specific listings for AI companies
These directories provide legitimate backlinks that signal to search engines: "This is a real business that others are talking about."
Why Backlinks Matter
Domain Authority (DA) and Page Authority (PA) are metrics that predict how well a site will rank. They're heavily influenced by:
- Quality of linking domains - A link from a DA 80 site is worth more than 100 links from DA 10 sites
- Relevance - Links from tech/AI sites matter more for an AI company
- Diversity - Links from many different domains signal broad recognition
- Natural growth - Sudden spikes in backlinks can trigger spam filters
Our strategy focuses on creating genuinely useful content that earns links organically, supplemented by strategic directory submissions and cross-platform publishing.
Phase 5: Addressing Ahrefs Audit (December 2025)
As our traffic grew, we invested in Ahrefs for deeper SEO analysis. Their Site Audit revealed issues GSC couldn't show:
Orphan Pages
Several pages had no internal links pointing to them, making them nearly invisible to crawlers.
Fix: Created a FeaturedArticles component for the homepage that links to key blog posts:
<section className="py-16">
  <h2>Featured Articles</h2>
  <div className="grid grid-cols-3 gap-6">
    {featuredPosts.map(post => (
      <Link key={post.slug} href={`/blog/${post.slug}`}>
        <ArticleCard post={post} />
      </Link>
    ))}
  </div>
</section>
Duplicate Metadata
Our SPA was returning identical HTML shells for different URLs. While the JavaScript would eventually render unique content, crawlers saw duplicates.
Fix: Implemented crawler-targeted prerendering using User-Agent detection in Vercel:
{
  "rewrites": [
    {
      "source": "/blog/:slug",
      "has": [
        { "type": "header", "key": "user-agent", "value": ".*bot.*" }
      ],
      "destination": "/prerendered/blog/:slug.html"
    }
  ]
}
301 Redirects for Old URLs
When we changed our URL structure (adding date prefixes to blog slugs), old URLs started returning 404s.
Fix: Added permanent redirects in vercel.json:
{
  "redirects": [
    {
      "source": "/blog/old-slug",
      "destination": "/blog/2025-10-13-new-slug",
      "permanent": true
    }
  ]
}
Phase 6: Next.js Migration (December 2025)
All our workarounds worked, but they were brittle. We were fighting against React's client-side rendering nature instead of working with it.
The solution? Migrate to Next.js 16 with App Router.
Why Next.js?
- Native SSR/SSG: Pages render server-side by default
- Built-in metadata API: No more manual meta tag injection
- Automatic sitemap generation: app/sitemap.ts just works (see the sketch below)
- Image optimization: Next/Image handles responsive images automatically
- Better developer experience: Less configuration, more building
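The sitemap point alone replaced an entire custom script. A minimal app/sitemap.ts looks roughly like this; the getAllBlogPosts helper and the post fields are assumptions standing in for our own data layer:

// app/sitemap.ts - Next.js picks this file up automatically (sketch)
import type { MetadataRoute } from 'next';
import { getAllBlogPosts } from '@/lib/blog'; // hypothetical helper

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllBlogPosts();
  return [
    { url: 'https://www.kanaeru.ai/', changeFrequency: 'weekly', priority: 1.0 },
    { url: 'https://www.kanaeru.ai/blog', changeFrequency: 'daily', priority: 0.9 },
    ...posts.map((post) => ({
      url: `https://www.kanaeru.ai/blog/${post.slug}`,
      lastModified: post.updatedAt,
      changeFrequency: 'monthly' as const,
      priority: 0.7,
    })),
  ];
}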
The Migration
Moving from Vite React to Next.js 16 was a significant undertaking:
- 166 files changed in the migration PR
- Converted all pages to App Router conventions
- Moved components to use 'use client' where needed
- Implemented proper metadata exports for each page
- Set up internationalization with next-intl
Results
After the migration, our SEO setup became dramatically simpler:
// app/[locale]/blog/[slug]/page.tsx
import type { Metadata } from 'next';

export async function generateMetadata(
  { params }: { params: Promise<{ slug: string }> }
): Promise<Metadata> {
  // Route params are async in recent Next.js versions, so they must be awaited
  const { slug } = await params;
  const post = await getBlogPost(slug); // our own data loader
  return {
    title: post.title,
    description: post.excerpt,
    openGraph: {
      title: post.title,
      description: post.excerpt,
      type: 'article',
      publishedTime: post.publishedAt,
      authors: [post.author.name],
    },
  };
}
No more pre-rendering scripts. No more crawler detection. No more duplicate content issues.
Phase 7: Final Polish (December 2025)
With Next.js handling the heavy lifting, we focused on final refinements:
ProfilePage Structured Data
For our team pages, we added proper ProfilePage schema with the required mainEntity field:
{
  "@context": "https://schema.org",
  "@type": "ProfilePage",
  "mainEntity": {
    "@type": "Person",
    "name": "Shreyas Shinde",
    "jobTitle": "CEO and Founder",
    "worksFor": {
      "@type": "Organization",
      "name": "Kanaeru Labs"
    }
  }
}
Canonical URL Consistency
We removed unnecessary /en prefixes from canonical URLs, ensuring clean URLs like https://www.kanaeru.ai/blog/article-slug instead of https://www.kanaeru.ai/en/blog/article-slug.
Open Graph Image Paths
Fixed OG image URLs that were pointing to wrong paths, ensuring social shares show correct preview images.
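Both of these fixes now live in the metadata export rather than in hand-written tags. A sketch of the relevant fields, with illustrative URLs:

// Sketch of the extra fields returned from generateMetadata (values illustrative)
import type { Metadata } from 'next';

export async function generateMetadata(): Promise<Metadata> {
  return {
    alternates: {
      // Clean canonical, no /en locale prefix
      canonical: 'https://www.kanaeru.ai/blog/article-slug',
    },
    openGraph: {
      // Absolute URL so social crawlers resolve the preview image correctly
      images: [{ url: 'https://www.kanaeru.ai/og/article-slug.png' }],
    },
  };
}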
Lessons Learned
1. SPAs Need Special Attention
If you're building an SPA, plan for SEO from day one. Pre-rendering, dynamic meta tags, and sitemap generation should be part of your initial architecture.
2. Use the Right Tool for the Job
Fighting against your framework's nature is exhausting. If SEO is critical (and for a marketing site, it always is), use a framework with native SSR support.
3. Multiple Data Sources Are Essential
Google Search Console shows what Google sees. Ahrefs shows what's crawlable. PageSpeed Insights shows performance. You need all three.
4. Structured Data Matters
JSON-LD isn't just nice-to-have. Rich snippets can dramatically improve click-through rates, and proper schema validation prevents indexing issues.
5. Internal Linking Is Underrated
Every page needs at least one internal link pointing to it. Orphan pages might as well not exist.
The Results
After implementing all these changes:
- Blog posts are indexed within days of publishing
- Rich snippets appear in search results with proper article markup
- Core Web Vitals pass all thresholds
- Ahrefs Site Health Score improved significantly
- Organic traffic is steadily growing
What's Next?
SEO is never "done." We're continuing to:
- Monitor GSC for new crawl issues
- Run monthly Ahrefs audits
- Optimize content for target keywords
- Build more internal links through related posts
- Expand structured data coverage
The journey from "Crawled - Not Indexed" to proper search visibility took about two months of focused work. But now we have a solid foundation that will serve us for years to come.
Quick Reference: SEO Checklist for SPAs
For anyone facing similar challenges, here's our condensed checklist:
Foundation
- Dynamic sitemap.xml generation
- robots.txt with explicit allow rules
- Canonical URLs on every page
- hreflang tags for multi-language sites
Structured Data
- Organization schema on homepage
- BlogPosting schema on articles
- ProfilePage schema on team pages
- Validate with Google's Rich Results Test
Performance
- Async font loading
- Code splitting and lazy loading
- Image optimization
- Cache headers for static assets
Content Accessibility
- Pre-render critical pages for crawlers
- 301 redirects for URL changes
- Internal linking strategy
- No orphan pages
Monitoring
- Google Search Console
- Ahrefs or similar SEO tool
- PageSpeed Insights
- Regular audits
Have questions about SPA SEO or our migration process? Book a free consultation with our team.
Originally published at Kanaeru AI



