Our SEO Journey: From "Crawled - Not Indexed" to Search Visibility

Building a beautiful Single Page Application (SPA) is one thing. Getting Google to actually index it? That's an entirely different challenge.
This is the story of how we transformed our Kanaeru AI website from a client-side rendered React app that search engines couldn't properly index, to a fully optimized Next.js site with comprehensive SEO that ranks well on Google.
The Problem: Beautiful But Invisible
When we first launched our marketing website, we chose Lovable.dev as our starting point. Lovable uses Vite + React under the hood and gave us a well-designed base template with rapid initial development speed. We designed our entire site through Lovable's AI interface, then migrated the code to GitHub where we continued development entirely via Claude Code.
The result looked perfect to human visitors. The animations were smooth, the design was polished, and the content was compelling.
But there was a problem: Google couldn't see most of it.
Our Google Search Console was showing a frustrating pattern:
- Pages marked as "Crawled - currently not indexed"
- Blog posts returning the homepage HTML to crawlers
- Duplicate content issues across pages
- Missing structured data for rich snippets
The root cause? SPAs render content with JavaScript. Search engine crawlers, while improving, still struggle with JavaScript-heavy pages. When Googlebot visited our blog posts, it saw the same generic homepage HTML for every URL.
Phase 1: Foundation Work (October 2025)
Comprehensive SEO Infrastructure
Our first major fix addressed the fundamentals:
1. Sitemap Generation
We created a dynamic sitemap generator that runs on every build:
// scripts/generate-sitemap.mjs
const routes = [
  { url: '/', changefreq: 'weekly', priority: 1.0 },
  { url: '/platform', changefreq: 'monthly', priority: 0.8 },
  { url: '/team', changefreq: 'monthly', priority: 0.7 },
  { url: '/blog', changefreq: 'daily', priority: 0.9 },
  // ... blog posts dynamically added
];
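For completeness, here is a minimal sketch of how a routes array like this can be serialized into sitemap XML at build time; the helper names and output path are illustrative rather than our exact production script:

// Sketch: turning the routes array into public/sitemap.xml (import sits at the top of the file)
import { writeFileSync } from 'node:fs';

const BASE_URL = 'https://www.kanaeru.ai';

const urlEntries = routes
  .map((route) => `  <url>
    <loc>${BASE_URL}${route.url}</loc>
    <changefreq>${route.changefreq}</changefreq>
    <priority>${route.priority}</priority>
  </url>`)
  .join('\n');

const xml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urlEntries}
</urlset>`;

// Vite copies everything in public/ to the build output root
writeFileSync('public/sitemap.xml', xml);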
2. robots.txt for Modern Crawlers
We updated our robots.txt to explicitly allow both search engines and LLM crawlers:
User-agent: Googlebot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-Web
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://www.kanaeru.ai/sitemap.xml
3. JSON-LD Structured Data
We added Organization, WebSite, and Service schemas to our homepage:
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Kanaeru AI",
  "url": "https://www.kanaeru.ai",
  "logo": "https://www.kanaeru.ai/logo.png",
  "sameAs": [
    "https://github.com/kanaerulabs",
    "https://www.linkedin.com/company/kanaeru-ai"
  ]
}
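Because a Vite + React SPA has no metadata API, the schema has to be emitted as a script tag from a component. A minimal sketch of the idea (the JsonLd component name and props are ours for illustration, not our exact code):

// Hypothetical JsonLd helper for the Vite/React site (sketch)
type JsonLdProps = { schema: Record<string, unknown> };

export function JsonLd({ schema }: JsonLdProps) {
  return (
    <script
      type="application/ld+json"
      // React needs dangerouslySetInnerHTML to emit raw JSON inside a script tag
      dangerouslySetInnerHTML={{ __html: JSON.stringify(schema) }}
    />
  );
}

Rendered into the document head (for example via react-helmet-async), a component like this produces the Organization block shown above.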
Blog Post Pre-rendering
The game-changer was implementing static HTML generation for blog posts. Instead of serving the same SPA shell to every request, we pre-rendered each blog post with:
- Complete meta tags (title, description, Open Graph, Twitter Cards)
- Full article content for crawlers
- Proper canonical URLs
- BlogPosting JSON-LD structured data
// scripts/prerender-blog.ts
import { writeFile } from 'node:fs/promises';

// BlogPost and generateBlogPostingSchema are defined elsewhere in this script
async function prerenderBlogPost(post: BlogPost) {
  const html = `<!DOCTYPE html>
<html lang="${post.locale}">
  <head>
    <title>${post.title}</title>
    <meta name="description" content="${post.excerpt}">
    <link rel="canonical" href="https://www.kanaeru.ai/blog/${post.slug}">
    <script type="application/ld+json">
      ${JSON.stringify(generateBlogPostingSchema(post))}
    </script>
  </head>
  <body>
    <article>${post.htmlContent}</article>
  </body>
</html>`;

  await writeFile(`public/prerendered/blog/${post.slug}.html`, html);
}
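Both the sitemap generator and the pre-render step have to run before the bundler copies public/ into the output. A sketch of how that can be wired up in package.json (script names and the tsx runner are illustrative, not necessarily our exact setup):

{
  "scripts": {
    "generate:sitemap": "node scripts/generate-sitemap.mjs",
    "prerender:blog": "tsx scripts/prerender-blog.ts",
    "build": "npm run generate:sitemap && npm run prerender:blog && vite build"
  }
}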
Phase 2: Fixing Critical Indexing Issues (October 2025)
After the foundation work, we still had issues. Google Search Console showed "Crawled - currently not indexed" for our blog posts. Investigation revealed several problems:
1. Wrong Canonical URLs
Our blog posts were pointing their canonical URL to the homepage instead of their own URL. This told Google "don't index me, index the homepage instead."
Fix: Updated the SEO library to generate correct canonical URLs for each page type.
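Conceptually, the fix meant deriving the canonical from each page's own path instead of defaulting to the site root. A simplified sketch (the helper and type names are illustrative, not the actual library API):

// Simplified sketch of per-page canonical generation (illustrative)
const SITE_URL = 'https://www.kanaeru.ai';

type PageType = 'home' | 'page' | 'blogPost';

function canonicalFor(type: PageType, slug?: string): string {
  switch (type) {
    case 'home':
      return `${SITE_URL}/`;
    case 'blogPost':
      // Each post canonicalizes to its own URL, never to the homepage
      return `${SITE_URL}/blog/${slug}`;
    default:
      return `${SITE_URL}/${slug}`;
  }
}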
2. Missing BlogPosting Schema
Generic Organization schema wasn't enough. Blog posts need specific BlogPosting structured data:
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Article Title",
  "datePublished": "2025-10-13",
  "dateModified": "2025-10-15",
  "author": {
    "@type": "Person",
    "name": "Shreyas Shinde"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Kanaeru AI"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://www.kanaeru.ai/blog/article-slug"
  }
}
3. Empty Image Fields
Schema.org requires images. We were leaving image fields empty, which caused validation failures.
Fix: Added fallback logic to use default images when post-specific images weren't available.
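The fallback itself is simple; a sketch of the idea, with an illustrative default image path:

// Sketch of the image fallback used when building BlogPosting schema (illustrative)
const DEFAULT_OG_IMAGE = 'https://www.kanaeru.ai/og-default.png';

function schemaImageFor(post: { coverImage?: string }): string {
  // Never emit an empty "image" field - fall back to the site-wide default
  return post.coverImage && post.coverImage.length > 0
    ? post.coverImage
    : DEFAULT_OG_IMAGE;
}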
Phase 3: Performance Optimization (October 2025)
SEO isn't just about content - Core Web Vitals directly impact rankings. Our PageSpeed Insights scores were suffering on several fronts, detailed below.
(Screenshot: desktop scores after optimization. Mobile performance is still a work in progress.)
Render-Blocking Resources
Google Fonts loaded via CSS @import blocked rendering for 1.6+ seconds.
Fix: Switched to async font loading:
<link rel="preload" href="https://fonts.googleapis.com/css2?family=Inter"
as="style" onload="this.onload=null;this.rel='stylesheet'">
<noscript>
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Inter">
</noscript>
Unused JavaScript
Targeting ES5 for broad compatibility bloated our bundles unnecessarily.
Fix: Updated to ES2020 target with better code splitting:
// vite.config.ts
import { defineConfig } from 'vite';

export default defineConfig({
  // ...plugins and other options omitted
  build: {
    target: 'es2020',
    rollupOptions: {
      output: {
        manualChunks: {
          'react-vendor': ['react', 'react-dom'],
          'router': ['react-router-dom'],
          'i18n': ['i18next', 'react-i18next'],
          'markdown': ['marked', 'prismjs']
        }
      }
    }
  }
});
Cache Headers
Static assets weren't being cached properly, causing repeat visitors to re-download everything.
Fix: Added aggressive cache headers via vercel.json:
{
  "headers": [
    {
      "source": "/assets/(.*)",
      "headers": [
        { "key": "Cache-Control", "value": "public, max-age=31536000, immutable" }
      ]
    }
  ]
}
Phase 4: Off-Page SEO & Backlink Building (October-November 2025)
On-page SEO is only half the battle. Search engines also evaluate your site's authority based on external signals - primarily backlinks from other reputable websites.
Cross-Publishing with Growth Kit
In October, we built Growth Kit, a Claude Code plugin that automatically transforms our blog posts into platform-specific content for:
- LinkedIn - Professional articles with proper formatting
- Medium - Long-form content with canonical URLs pointing back to our site
- Dev.to - Technical content for the developer community
- X/Twitter - Thread summaries with links to full articles
Each cross-published article includes a canonical URL back to our original post, ensuring:
- No duplicate content penalties - Search engines know where the original lives
- Backlink juice flows back - Links from Medium, Dev.to, and LinkedIn boost our domain authority
- Wider reach - Content reaches audiences on multiple platforms
- Brand consistency - Same message, optimized for each platform
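On Dev.to, for instance, the canonical is declared in the article's front matter, which the cross-publishing step fills in from the original post. A sketch of such a generator (the helper is illustrative; canonical_url is Dev.to's actual front matter field):

// Sketch: building Dev.to front matter that points the canonical back to our site
function devtoFrontMatter(post: { title: string; slug: string }): string {
  return [
    '---',
    `title: ${post.title}`,
    'published: true',
    `canonical_url: https://www.kanaeru.ai/blog/${post.slug}`,
    '---',
  ].join('\n');
}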
Directory Submissions
In November, we submitted our site to startup and product directories to build initial backlinks:
- RankingPublic - Startup directory with do-follow links
- TinyLaunch - Product launch platform for early-stage startups
- Product Hunt - For product launches and visibility
- Various AI directories - Niche-specific listings for AI companies
These directories provide legitimate backlinks that signal to search engines: "This is a real business that others are talking about."
Why Backlinks Matter
Domain Authority (DA) and Page Authority (PA) are metrics that predict how well a site will rank. They're heavily influenced by:
- Quality of linking domains - A link from a DA 80 site is worth more than 100 links from DA 10 sites
- Relevance - Links from tech/AI sites matter more for an AI company
- Diversity - Links from many different domains signal broad recognition
- Natural growth - Sudden spikes in backlinks can trigger spam filters
Our strategy focuses on creating genuinely useful content that earns links organically, supplemented by strategic directory submissions and cross-platform publishing.
Phase 5: Addressing Ahrefs Audit (December 2025)
As our traffic grew, we invested in Ahrefs for deeper SEO analysis. Their Site Audit revealed issues GSC couldn't show:
Orphan Pages
Several pages had no internal links pointing to them, making them nearly invisible to crawlers.
Fix: Created a FeaturedArticles component for the homepage that links to key blog posts:
<section className="py-16">
  <h2>Featured Articles</h2>
  <div className="grid grid-cols-3 gap-6">
    {featuredPosts.map(post => (
      <Link key={post.slug} href={`/blog/${post.slug}`}>
        <ArticleCard post={post} />
      </Link>
    ))}
  </div>
</section>
Duplicate Metadata
Our SPA was returning identical HTML shells for different URLs. While the JavaScript would eventually render unique content, crawlers saw duplicates.
Fix: Implemented crawler-targeted prerendering using User-Agent detection in Vercel:
{
  "rewrites": [
    {
      "source": "/blog/:slug",
      "has": [
        { "type": "header", "key": "user-agent", "value": ".*bot.*" }
      ],
      "destination": "/prerendered/blog/:slug.html"
    }
  ]
}
301 Redirects for Old URLs
When we changed our URL structure (adding date prefixes to blog slugs), old URLs started returning 404s.
Fix: Added permanent redirects in vercel.json:
{
  "redirects": [
    {
      "source": "/blog/old-slug",
      "destination": "/blog/2025-10-13-new-slug",
      "permanent": true
    }
  ]
}
Phase 6: Next.js Migration (December 2025)
All our workarounds worked, but they were brittle. We were fighting against React's client-side rendering nature instead of working with it.
The solution? Migrate to Next.js 16 with App Router.
Why Next.js?
- Native SSR/SSG: Pages render server-side by default
- Built-in metadata API: No more manual meta tag injection
- Automatic sitemap generation: app/sitemap.ts just works (see the sketch below)
- Image optimization: Next/Image handles responsive images automatically
- Better developer experience: Less configuration, more building
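The sitemap point alone replaced an entire custom script. A minimal app/sitemap.ts looks roughly like this; the getAllBlogPosts helper and the post fields are assumptions standing in for our own data layer:

// app/sitemap.ts - Next.js picks this file up automatically (sketch)
import type { MetadataRoute } from 'next';
import { getAllBlogPosts } from '@/lib/blog'; // hypothetical helper

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllBlogPosts();
  return [
    { url: 'https://www.kanaeru.ai/', changeFrequency: 'weekly', priority: 1.0 },
    { url: 'https://www.kanaeru.ai/blog', changeFrequency: 'daily', priority: 0.9 },
    ...posts.map((post) => ({
      url: `https://www.kanaeru.ai/blog/${post.slug}`,
      lastModified: post.updatedAt,
      changeFrequency: 'monthly' as const,
      priority: 0.7,
    })),
  ];
}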
The Migration
Moving from Vite React to Next.js 16 was a significant undertaking:
- 166 files changed in the migration PR
- Converted all pages to App Router conventions
- Moved components to use 'use client' where needed
- Implemented proper metadata exports for each page
- Set up internationalization with next-intl
Results
After the migration, our SEO setup became dramatically simpler:
// app/[locale]/blog/[slug]/page.tsx
import type { Metadata } from 'next';

export async function generateMetadata(
  { params }: { params: Promise<{ slug: string }> }
): Promise<Metadata> {
  // Route params are async in recent Next.js versions, so they must be awaited
  const { slug } = await params;
  const post = await getBlogPost(slug); // our own data loader
  return {
    title: post.title,
    description: post.excerpt,
    openGraph: {
      title: post.title,
      description: post.excerpt,
      type: 'article',
      publishedTime: post.publishedAt,
      authors: [post.author.name],
    },
  };
}
No more pre-rendering scripts. No more crawler detection. No more duplicate content issues.
Phase 7: Final Polish (December 2025)
With Next.js handling the heavy lifting, we focused on final refinements:
ProfilePage Structured Data
For our team pages, we added proper ProfilePage schema with the required mainEntity field:
{
  "@context": "https://schema.org",
  "@type": "ProfilePage",
  "mainEntity": {
    "@type": "Person",
    "name": "Shreyas Shinde",
    "jobTitle": "CEO and Founder",
    "worksFor": {
      "@type": "Organization",
      "name": "Kanaeru Labs"
    }
  }
}
Canonical URL Consistency
We removed unnecessary /en prefixes from canonical URLs, ensuring clean URLs like https://www.kanaeru.ai/blog/article-slug instead of https://www.kanaeru.ai/en/blog/article-slug.
Open Graph Image Paths
Fixed OG image URLs that were pointing to wrong paths, ensuring social shares show correct preview images.
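Both of these fixes now live in the metadata export rather than in hand-written tags. A sketch of the relevant fields, with illustrative URLs:

// Sketch of the extra fields returned from generateMetadata (values illustrative)
import type { Metadata } from 'next';

export async function generateMetadata(): Promise<Metadata> {
  return {
    alternates: {
      // Clean canonical, no /en locale prefix
      canonical: 'https://www.kanaeru.ai/blog/article-slug',
    },
    openGraph: {
      // Absolute URL so social crawlers resolve the preview image correctly
      images: [{ url: 'https://www.kanaeru.ai/og/article-slug.png' }],
    },
  };
}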
Lessons Learned
1. SPAs Need Special Attention
If you're building an SPA, plan for SEO from day one. Pre-rendering, dynamic meta tags, and sitemap generation should be part of your initial architecture.
2. Use the Right Tool for the Job
Fighting against your framework's nature is exhausting. If SEO is critical (and for a marketing site, it always is), use a framework with native SSR support.
3. Multiple Data Sources Are Essential
Google Search Console shows what Google sees. Ahrefs shows what's crawlable. PageSpeed Insights shows performance. You need all three.
4. Structured Data Matters
JSON-LD isn't just nice-to-have. Rich snippets can dramatically improve click-through rates, and proper schema validation prevents indexing issues.
5. Internal Linking Is Underrated
Every page needs at least one internal link pointing to it. Orphan pages might as well not exist.
The Results
After implementing all these changes:
- Blog posts are indexed within days of publishing
- Rich snippets appear in search results with proper article markup
- Core Web Vitals pass all thresholds
- Ahrefs Site Health Score improved significantly
- Organic traffic is steadily growing
What's Next?
SEO is never "done." We're continuing to:
- Monitor GSC for new crawl issues
- Run monthly Ahrefs audits
- Optimize content for target keywords
- Build more internal links through related posts
- Expand structured data coverage
The journey from "Crawled - Not Indexed" to proper search visibility took about two months of focused work. But now we have a solid foundation that will serve us for years to come.
Quick Reference: SEO Checklist for SPAs
For anyone facing similar challenges, here's our condensed checklist:
Foundation
- Dynamic sitemap.xml generation
- robots.txt with explicit allow rules
- Canonical URLs on every page
- hreflang tags for multi-language sites
Structured Data
- Organization schema on homepage
- BlogPosting schema on articles
- ProfilePage schema on team pages
- Validate with Google's Rich Results Test
Performance
- Async font loading
- Code splitting and lazy loading
- Image optimization
- Cache headers for static assets
Content Accessibility
- Pre-render critical pages for crawlers
- 301 redirects for URL changes
- Internal linking strategy
- No orphan pages
Monitoring
- Google Search Console
- Ahrefs or similar SEO tool
- PageSpeed Insights
- Regular audits
Have questions about SPA SEO or our migration process? Book a free consultation with our team.
Originally published at Kanaeru AI



