
Khôi Nguyễn Vũ


Algolia v5 + Algolia Agent Studio + Next.js 16: Building a Startup Validator

This is a submission for the Algolia Agent Studio Challenge: Consumer-Facing Conversational Experiences


What I Built

Startup Roast is an AI-powered startup idea validator that provides brutally honest feedback by analyzing your idea against a database of 2,500 real YC startups and 403 failed companies.

Before you quit your job, raise money from friends and family, or spend months building something nobody wants - get a reality check in seconds, not months.

The Problem

Every year, thousands of entrepreneurs start companies based on "gut feelings" and encouragement from supportive friends who don't want to hurt their feelings. Meanwhile, there are decades of startup data available showing patterns of success and failure that nobody consults until it's too late.

The Solution

Startup Roast combines conversational AI with retrieval from thousands of real companies to provide:

  • Survival Probability Score - Multi-factor algorithm (Growth 35%, Market 25%, Team 20%, Funding 15%, Trend 5%)
  • Market Saturation Analysis - Know if you're entering a crowded space
  • Funding Likelihood - Realistic assessment based on similar companies
  • Graveyard Insights - See similar companies that failed and WHY they failed
  • Pivot Suggestions - Actionable alternatives when your concept needs refinement
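A minimal sketch of how the weighted score could be computed from the five factors. The weights come from the list above, and the factor values are borrowed from the sample `survival_breakdown` record in the indexing section of this post. How the penalty is applied is an assumption (subtracted after weighting, then clamped to 0-100), and `survivalScore` is a hypothetical name.

```typescript
// Weights from the Survival Probability Score description; they sum to 1.0.
const WEIGHTS = { growth: 0.35, market: 0.25, team: 0.2, funding: 0.15, trend: 0.05 } as const;

type Factor = keyof typeof WEIGHTS;
type FactorScores = Record<Factor, number>; // each factor is scored 0-100

// Assumption: the penalty is subtracted after weighting and the result is clamped to 0-100.
function survivalScore(scores: FactorScores, penalty = 0): number {
  const weighted = (Object.keys(WEIGHTS) as Factor[]).reduce(
    (sum, factor) => sum + WEIGHTS[factor] * scores[factor],
    0,
  );
  return Math.min(100, Math.max(0, Math.round(weighted - penalty)));
}

// Factor values from the sample "Martini" record: 4.9 + 25 + 20 + 9 + 5 = 63.9
const martini = survivalScore({ growth: 14, market: 100, team: 100, funding: 60, trend: 100 });
console.log(martini); // 64
```

With these weights a weak growth signal costs little when market, team, and trend are all at 100, which is why the sample record still lands at 64.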

The Experience

  1. Search for any startup or describe your own idea
  2. AI retrieves relevant context from 2,900+ companies using RAG
  3. Get structured, visual feedback with actionable insights
  4. Explore pivot suggestions with one click

Demo

Live URL: https://ai-roasting-algolia.vercel.app

Video Demo: https://youtu.be/TuSimU_864U

Repository: https://github.com/VuKhoiGVM/ai-roasting-algolia.git


How I Used Algolia Agent Studio

Data Indexed

I created two Algolia indices with rich, structured data:

1. Startups Index (2,500 records)

Data Structure:

{
  "objectID": "yc_31306",
  "name": "Martini",
  "description": "Collaborative AI-native filmmaking for professionals",
  "long_description": "Martini is a collaborative, AI-native platform...",
  "batch": "W26",
  "status": "Active",
  "tags": ["Generative AI", "Entertainment", "Design Tools"],
  "location": "San Francisco",
  "year_founded": 2025,
  "team_size": 2,
  "website": "https://martini.film",
  "is_hiring": false,
  "open_jobs": 0,
  "category": "Generative AI",
  "survival_score": 64,
  "survival_breakdown": {
    "total": 64,
    "growth": 14,
    "market": 100,
    "team": 100,
    "funding": 60,
    "trend": 100,
    "penalty": 0
  },
  "saturation": "Medium"
}

Searchable Attributes: name, description, long_description, category, sector, tags, batch, location

Custom Ranking: Higher survival_score → higher rank, then hiring companies get boost

Facets: category, status, sector, batch, is_hiring, saturation, year_founded

2. Graveyard Index (403 records)

Data Structure:

{
  "objectID": "fail_Health_Care_0",
  "name": "Aira Health",
  "sector": "Health Care",
  "category": "Health Care",
  "years_of_operation": "2015-2019",
  "what_they_did": "Personalized asthma/allergy app",
  "how_much_raised": "$12M",
  "raised_amount": 12000000,
  "why_they_failed": "Small user base and cash shortage",
  "takeaway": "Niche apps need big audiences",
  "year_founded": 2015,
  "year_closed": 2019,
  "lost_to_giants": true,
  "no_budget": true,
  "competition": true,
  "poor_market_fit": true
}

Searchable Attributes: name, what_they_did, why_they_failed, takeaway, category, sector

Custom Ranking: Higher raised_amount → higher rank (bigger failures are more educational), then more recent failures

Facets: 40+ failure reason flags (lost_to_giants, no_budget, competition, poor_market_fit, monetization_failure, etc.)

Index Configuration

I configured both indices with optimized settings:

Custom Ranking (business-signal tie-breaker, applied after Algolia's textual relevance criteria):

// Startups: Prioritize high-quality, active companies
customRanking: [
  'desc(survival_score)',  // 🔥 Primary signal: companies with higher survival potential
  'desc(is_hiring)',       // 📈 Active growth signal: hiring = expanding
  'desc(batch)',           // 🆕 Recency bias: newer batches first (W26 > W25); string order, so W24 also sorts above S25
  'asc(name)'              // 📝 Alphabetical tie-breaker for consistency
]

// Graveyard: Most educational failures first
customRanking: [
  'desc(raised_amount)',   // 💰 Raised more $$ = more expensive lesson = higher priority
  'desc(year_closed)',     // 📅 Recent failures = more relevant to current market
  'asc(name)'              // 📝 Alphabetical tie-breaker
]
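One caveat with `desc(batch)`: assuming `batch` is stored as a string such as "W26" and compared lexicographically, the season letter dominates the year, so "W24" would sort above "S25". A sketch of one way around this is to index a numeric sort key next to the batch code. The `batchSortKey` helper and the season ordering below are assumptions for illustration, not part of the project.

```typescript
// Assumed season order within a calendar year; adjust to the batch codes in your data.
const SEASON_ORDER: Record<string, number> = { W: 0, X: 1, S: 2, F: 3 };

// Converts a batch code such as "W26" into a number that grows with recency.
// Returns null for codes that do not match the <letter><two digits> shape.
function batchSortKey(batch: string): number | null {
  const match = /^([A-Z])(\d{2})$/.exec(batch.trim().toUpperCase());
  if (!match) return null;
  const season = SEASON_ORDER[match[1]];
  if (season === undefined) return null;
  const year = 2000 + Number(match[2]);
  return year * 10 + season;
}

console.log(batchSortKey("W26")); // 20260
console.log(batchSortKey("S25")); // 20252
console.log("W24" > "S25"); // true: plain string comparison ranks W24 above S25
```

Storing the result in a numeric attribute and ranking on that attribute instead of `batch` would keep the recency intent intact across seasons.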

Why This Ranking Works:

  • survival_score first ensures the best companies surface when browsing
  • is_hiring as boost rewards actively growing companies
  • batch recency gives newer YC companies visibility (they need it more!)
  • raised_amount for graveyard shows the most dramatic failures ($3.5B Faraday Future story > $50K failure)

Searchable Attributes (determines which fields are searched; earlier attributes carry more weight):

searchableAttributes: [
  'name',           // Exact company name matches rank highest
  'description',    // Then description content
  'long_description',
  'category',       // Category matches
  'tags',           // Tag matches
  'batch',          // YC batch
  'location'
]

Typo Tolerance: 1 typo allowed for words of 4+ characters, 2 typos for words of 8+ characters
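Put together, the startups index configuration described above could look roughly like the object below. The setting names (`searchableAttributes`, `customRanking`, `attributesForFaceting`, `minWordSizefor1Typo`, `minWordSizefor2Typos`) are Algolia index settings; the `IndexSettingsSketch` type and the exact values are a sketch assembled from this post, not a dump of the real index.

```typescript
// Local type covering only the settings used here.
interface IndexSettingsSketch {
  searchableAttributes: string[];
  customRanking: string[];
  attributesForFaceting: string[];
  minWordSizefor1Typo: number;
  minWordSizefor2Typos: number;
}

const startupsSettings: IndexSettingsSketch = {
  searchableAttributes: ["name", "description", "long_description", "category", "tags", "batch", "location"],
  customRanking: ["desc(survival_score)", "desc(is_hiring)", "desc(batch)", "asc(name)"],
  attributesForFaceting: ["category", "status", "sector", "batch", "is_hiring", "saturation", "year_founded"],
  minWordSizefor1Typo: 4, // one typo tolerated from 4 characters
  minWordSizefor2Typos: 8, // two typos tolerated from 8 characters
};

// With the v5 client this object would be passed as:
//   await client.setSettings({ indexName: "startups", indexSettings: startupsSettings })
// using a key that is allowed to edit settings, never the search-only key.
console.log(JSON.stringify(startupsSettings, null, 2));
```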

Frontend Integration

Algolia JavaScript SDK v5

Using Algolia JavaScript SDK v5.35.0 with the new search API:

import { algoliasearch } from 'algoliasearch'

// Initialize client with search-only key (safe for frontend)
const client = algoliasearch(
  process.env.NEXT_PUBLIC_ALGOLIA_APP_ID,
  process.env.NEXT_PUBLIC_ALGOLIA_SEARCH_KEY
)

// v5 Breaking Changes: initIndex() removed, use search() with requests array
const { results } = await client.search({
  requests: [
    {
      type: 'default',
      indexName: 'startups',
      query: 'AI healthcare',
      hitsPerPage: 20,
      filters: 'survival_score >= 40',  // Only good survival rates
      attributesToHighlight: ['name', 'description', 'category'],
      highlightPreTag: '<em>',
      highlightPostTag: '</em>',
    }
  ]
})

const hits = results[0].hits

Key Search Functions Implemented:

// 1. Unified Search - both indices in parallel
searchAll(query: string) → Promise<(Startup | FailedStartup)[]>

// 2. Single index search with filters
searchStartups(query, { category, hitsPerPage, filters, facetFilters })

// 3. Top performers
getTopStartups() → Top 10 by survival_score (filtered >= 40)
getTopGraveyardEntries() → Top 10 by raised_amount

// 4. Faceting for filters
getCategories() → Category names with counts
getBatchFacets() → YC batches with counts (sorted by recency)
getAllFacets() → All facets in parallel using Promise.all()

// 5. Category filtering
searchStartupsByCategory(category)
searchGraveyardByCategory(category)
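A rough sketch of what a unified search like `searchAll` could look like. To keep it runnable without credentials, the Algolia call is hidden behind a hypothetical `SearchIndex` function type, and the `graveyard` index name and the trimmed-down record types are assumptions.

```typescript
interface Startup { objectID: string; name: string; survival_score: number }
interface FailedStartup { objectID: string; name: string; why_they_failed: string }

// Hypothetical abstraction over the Algolia client: one index, one query, typed hits.
type SearchIndex = <T>(indexName: string, query: string) => Promise<T[]>;

// Queries both indices concurrently and returns startups first, then graveyard entries.
async function searchAll(search: SearchIndex, query: string): Promise<(Startup | FailedStartup)[]> {
  const trimmed = query.trim();
  if (trimmed === "") return []; // skip the network round trip for empty input
  const [startups, failures] = await Promise.all([
    search<Startup>("startups", trimmed),
    search<FailedStartup>("graveyard", trimmed),
  ]);
  return [...startups, ...failures];
}

// Stand-in for the real client so the sketch runs offline.
const fakeSearch: SearchIndex = async <T>(indexName: string): Promise<T[]> =>
  (indexName === "startups"
    ? [{ objectID: "yc_31306", name: "Martini", survival_score: 64 }]
    : [{ objectID: "fail_Health_Care_0", name: "Aira Health", why_they_failed: "Small user base and cash shortage" }]
  ) as unknown as T[];

searchAll(fakeSearch, "health").then((hits) => console.log(hits.map((hit) => hit.name)));
```

With the real v5 client, both lookups can also travel in a single `client.search({ requests: [...] })` call, since the `requests` array accepts entries for different indices.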

Search Response with Highlights:

{
  "objectID": "yc_31306",
  "name": "Martini",
  "description": "Collaborative AI-native filmmaking...",
  "category": "Generative AI",
  "_highlightResult": {
    "name": { "value": "Martini", "matchLevel": "none" },
    "description": { "value": "Collaborative <em>AI</em>-native...", "matchLevel": "full" },
    "category": { "value": "Generative <em>AI</em>", "matchLevel": "full" }
  }
}

Vercel AI SDK + Agent Studio

Using Vercel AI SDK v6 with direct Agent Studio transport (no backend needed!):

import { useChat } from "@ai-sdk/react"
import { DefaultChatTransport } from "ai"

const transport = new DefaultChatTransport({
  api: `https://${appId}.algolia.net/agent-studio/1/agents/${agentId}/completions?compatibilityMode=ai-sdk-5`,
  headers: {
    "x-algolia-application-id": appId,
    "x-algolia-api-key": searchKey,  // Search-only key (safe for client)
  },
})

const chat = useChat({ transport })

// Send message
chat.sendMessage({ text: userInput })

Response Parsing:

// Extract metrics from structured AI response using regex
const survivalMatch = text.match(/\*\*Survival Probability:\*\*\s*(\d+)%?/i)
const saturationMatch = text.match(/\*\*Market Saturation:\*\*\s*(Low|Medium|High)/i)
const fundingMatch = text.match(/\*\*Funding Likelihood:\*\*\s*(\d+)%?/i)

// Parse graveyard entries
const graveyardMatch = text.match(/\*\*💀[^*]*:\*\*([\s\S]*?)(?=\*\*🔄|\*\*The Roast|$)/i)

// Parse pivot suggestions
const pivotMatch = text.match(/\*\*🔄[^*]*:\*\*([\s\S]*?)(?=\*\*|$)/i)
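The regexes above can be wrapped into one function that returns a typed object, which keeps the rendering components free of string handling. This is a sketch built on those same patterns; `parseRoast`, `RoastMetrics`, and the helper names are hypothetical, and the sample text is made up to follow the agent's response format.

```typescript
type Saturation = "Low" | "Medium" | "High";

interface RoastMetrics {
  survival: number | null;
  saturation: Saturation | null;
  funding: number | null;
  graveyard: string[];
  pivots: string[];
}

// Collects the "- item" lines of a section; returns [] when the section is missing.
function bulletLines(section: string | undefined): string[] {
  return (section ?? "")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("- "))
    .map((line) => line.slice(2).trim());
}

// The /i flag also accepts "high" or "HIGH", so normalize to the capitalized label.
function toSaturation(raw: string | undefined): Saturation | null {
  if (!raw) return null;
  return (raw[0].toUpperCase() + raw.slice(1).toLowerCase()) as Saturation;
}

function parseRoast(text: string): RoastMetrics {
  const survival = text.match(/\*\*Survival Probability:\*\*\s*(\d+)%?/i);
  const saturation = text.match(/\*\*Market Saturation:\*\*\s*(Low|Medium|High)/i);
  const funding = text.match(/\*\*Funding Likelihood:\*\*\s*(\d+)%?/i);
  const graveyard = text.match(/\*\*💀[^*]*:\*\*([\s\S]*?)(?=\*\*🔄|\*\*The Roast|$)/i);
  const pivots = text.match(/\*\*🔄[^*]*:\*\*([\s\S]*?)(?=\*\*|$)/i);
  return {
    survival: survival ? Number(survival[1]) : null,
    saturation: toSaturation(saturation?.[1]),
    funding: funding ? Number(funding[1]) : null,
    graveyard: bulletLines(graveyard?.[1]),
    pivots: bulletLines(pivots?.[1]),
  };
}

const sample = [
  "**Survival Probability:** 20%",
  "**Market Saturation:** High",
  "**Funding Likelihood:** 15%",
  "",
  "**💀 The Graveyard (similar failures):**",
  "- Woo: lost to Tinder",
  "",
  "**🔄 Pivot Suggestions:**",
  "- Divorced Parents Dating App",
  "",
  "**The Roast:**",
  "The dating app market is crowded.",
].join("\n");

const parsed = parseRoast(sample);
console.log(parsed.survival, parsed.saturation, parsed.funding); // 20 High 15
console.log(parsed.graveyard, parsed.pivots);
```

Returning `null` and empty arrays for missing sections means the same function can run on a partially streamed response without throwing.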

Rendering Components:

  • Survival probability → colored progress bar (green ≥70%, yellow 40-69%, red <40%)
  • Market saturation → visual meter with emoji indicators (🔥 Low, ⚠️ Medium, 🚫 High)
  • Funding likelihood → percentage meter with confidence level
  • Graveyard section → cards showing failed companies with reasons
  • Pivot suggestions → clickable chips that re-analyze the new direction
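The survival thresholds in the list above map onto a small helper. This is only a sketch; `survivalColor` and `MeterColor` are hypothetical names, and the real component may use different class names or tokens.

```typescript
type MeterColor = "green" | "yellow" | "red";

// green at 70 and above, yellow from 40 to 69, red below 40
function survivalColor(percent: number): MeterColor {
  if (percent >= 70) return "green";
  if (percent >= 40) return "yellow";
  return "red";
}

console.log(survivalColor(64)); // "yellow"
console.log(survivalColor(20)); // "red"
```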

Real-time Search UI:

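import { useEffect, useState } from 'react'

// Debounced search as user types (inside the search component)
const [results, setResults] = useState([])

useEffect(() => {
  let cancelled = false  // ignore responses that arrive after the query changed
  const timer = setTimeout(async () => {
    const hits = await searchAll(query)  // Searches both indices
    if (!cancelled) setResults(hits)
  }, 300)
  return () => {
    cancelled = true
    clearTimeout(timer)
  }
}, [query])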

Query Rules

I implemented query rules for both indices to enhance search experience:

Startups Index Query Rules

AI/ML Category Boost:

// Searching "ai", "ml", "llm", "gpt" → Surface AI/ML companies
// (filters narrows results to these categories; optionalFilters would boost without excluding others)
{
  condition: { pattern: 'ai', anchoring: 'contains' },
  consequence: {
    filterPromotes: true,
    params: {
      filters: 'category:"Artificial Intelligence" OR category:"Machine Learning" OR category:AI'
    }
  }
}

Dev Tools Category Boost:

// Searching "dev tools", "developer tools" → Boost Dev Tools companies to top
{
  condition: { pattern: 'dev', anchoring: 'contains' },
  consequence: {
    filterPromotes: true,
    params: {
      filters: 'category:"Developer Tools"'
    }
  }
}

Fintech Category Boost:

// Searching "fintech", "financial" → Boost Fintech companies to top
{
  condition: { pattern: 'fintech', anchoring: 'contains' },
  consequence: {
    filterPromotes: true,
    params: {
      filters: 'category:Fintech'
    }
  }
}
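Category names with spaces ("Developer Tools", "Artificial Intelligence") need to be wrapped in double quotes inside an Algolia `filters` string. A small helper can build these strings so the quoting is not done by hand. `facetOrFilter` is a hypothetical name, and the sketch rejects values containing a double quote rather than guessing at an escaping rule.

```typescript
// Builds an OR filter over one facet attribute, quoting every value so spaces are safe.
function facetOrFilter(attribute: string, values: string[]): string {
  return values
    .map((value) => {
      if (value.includes('"')) {
        throw new Error(`Unsupported facet value (contains a double quote): ${value}`);
      }
      return `${attribute}:"${value}"`;
    })
    .join(" OR ");
}

console.log(facetOrFilter("category", ["Artificial Intelligence", "Machine Learning", "AI"]));
// category:"Artificial Intelligence" OR category:"Machine Learning" OR category:"AI"
```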

Graveyard Index Query Rules

// Flag queries containing 'failed' so the UI can show graveyard companies first
// (the same pattern applies to 'bankrupt' and 'shutdown')
{
  enabled: true,
  conditions: [{
    pattern: 'failed',
    anchoring: 'contains'
  }],
  consequence: {
    filterPromotes: false,
    userData: { showGraveyardFirst: true }
  },
  description: 'Show graveyard for "failed" queries'
},
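On the frontend, the `showGraveyardFirst` flag set by this rule can be read from the `userData` field of the search response. The sketch below assumes `userData` arrives as an array with one entry per triggered rule and also tolerates a single object; `SearchResponseSketch`, `wantsGraveyardFirst`, and `orderSections` are hypothetical names.

```typescript
// Only the response fields used here; a real search response carries many more.
interface SearchResponseSketch<T> {
  hits: T[];
  userData?: unknown; // assumed: array of rule userData objects, or a single object
}

// True when any triggered rule attached { showGraveyardFirst: true }.
function wantsGraveyardFirst(response: SearchResponseSketch<unknown>): boolean {
  const raw = response.userData;
  const entries = raw == null ? [] : Array.isArray(raw) ? raw : [raw];
  return entries.some(
    (entry) =>
      typeof entry === "object" &&
      entry !== null &&
      (entry as { showGraveyardFirst?: unknown }).showGraveyardFirst === true,
  );
}

// Puts graveyard hits ahead of startup hits when the flag is set.
function orderSections<S, G>(startups: S[], graveyard: G[], graveyardFirst: boolean): (S | G)[] {
  return graveyardFirst ? [...graveyard, ...startups] : [...startups, ...graveyard];
}

const flagged = wantsGraveyardFirst({ hits: [], userData: [{ showGraveyardFirst: true }] });
console.log(orderSections(["Martini"], ["Aira Health"], flagged)); // graveyard entry comes first
```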

Agent Configuration

In Algolia Agent Studio, I configured a custom agent:

Tools Configuration

Both indices enabled for retrieval:

  • startups index search - For successful company analysis
  • graveyard index search - For failure pattern analysis

LLM Selection

Google Gemini 2.5 Flash - Chosen for:

  • Fast response times (<10 seconds)
  • Strong reasoning capabilities
  • Good with structured output

System Prompt Engineering

You are Startup Roast - a brutally honest startup advisor. Analyze ideas against real data.

Your response MUST follow this exact format:

**Survival Probability:** X%
**Market Saturation:** [Low/Medium/High]
**Funding Likelihood:** X%

**💀 The Graveyard (similar failures):**
- [Company Name]: [brief failure reason]

**🔄 Pivot Suggestions:**
- [Specific pivot idea with reasoning]

**The Roast:**
[Brutally honest analysis with specific references to similar companies from the retrieved data.
Be direct but constructive. Reference actual companies when making comparisons.]

Key Prompt Techniques Used:

  1. Structured output format - Enables frontend parsing for visual metrics
  2. Specific section markers - **💀 The Graveyard:** for regex parsing
  3. Retrieval grounding - "Reference actual companies from retrieved data"
  4. Tone setting - "Brutally honest but constructive"

Why This Approach

Compared with building a custom RAG pipeline (vector search plus hand-rolled prompt engineering), Algolia Agent Studio gave me:

| Feature | Custom RAG Pipeline | Algolia Agent Studio |
| --- | --- | --- |
| Infrastructure | Vector DB, embedding API, backend server | Zero infra needed |
| Setup Time | Days to weeks | Hours |
| Retrieval Speed | 2-5 seconds | <100ms |
| Cost | Multiple API costs | Single platform |
| Maintenance | High | Low |
| Grounding | Manual prompt engineering | Built-in RAG |

Why Fast Retrieval Matters

The Speed Expectation

In a conversational interface, users expect responses in seconds, not minutes. When someone types "Uber for dog walking," they want feedback now, not after 30 seconds of loading spinners.

What Would Happen with Slow Retrieval

Without Algolia's millisecond-level retrieval:

  1. User abandonment - Google has reported that 53% of mobile site visits are abandoned when a page takes longer than 3 seconds to load
  2. Thin context - Slow APIs limit how many relevant examples can be retrieved within an acceptable response time
  3. Cold starts - Every query needs new retrieval, no caching benefits
  4. Cascading delays - Slow retrieval → delayed LLM call → delayed generation → frustrated user

How Algolia's Speed Improves Experience

| Metric | With Slow Search | With Algolia |
| --- | --- | --- |
| First response time | 5-10 seconds | <1 second |
| Relevant examples retrieved | 5-10 | 20-50 |
| Context quality | Sparse | Rich & diverse |
| User engagement | Drop off after 1 query | Continue exploring pivots |

Real Example

When a user types "dating app with AI matching for people over 30":

Algolia retrieves instantly, and the agent derives its metrics from the results:

  • 500+ dating/tech companies across both indices
  • Failed companies like Woo ($1.5M raised, lost to Tinder)
  • Market saturation: High (dominated by Match Group, Bumble)
  • Survival Probability: 20%

Structured Metrics Displayed:

Survival Probability: 20% (Danger zone 💀)
Funding Likelihood: 15% (Better bring cash 💸)
Market Saturation: High (Crowded market 👥)

Graveyard Section:

💀 Woo ($1.5M) - Dating app for professionals, lost to Tinder. The dating app market is dominated by giants.

The Roast (with real data):

"The dating app market is a brutal graveyard for new entrants. You're entering a 'High' saturation market dominated by Match Group (Tinder, Hinge, OKCupid) and Bumble, who have massive user bases and deep pockets for marketing. Your differentiation, 'AI suggestions for matching,' is a feature, not a unique product. Woo raised $1.5M and couldn't compete with Tinder."

Pivot Suggestions:

  • "Divorced Parents Dating App" → 40% survival score
  • "Travel Enthusiasts Over 30" → Medium saturation

After clicking a pivot:
The user clicks "Divorced Parents Dating App" and the AI instantly re-analyzes:

  • Survival Probability improves to 40%
  • Market Saturation drops to Medium
  • New pivot suggestions appear

This shows the full power: instant retrieval → brutal honesty → actionable pivots. Users can iterate 5-6 times in one session, each time getting specific feedback grounded in real company data.


Tech Stack

  • Frontend: Next.js 16.1.6 (App Router + Turbopack), React 19
  • AI: Algolia Agent Studio, Vercel AI SDK v6, Google Gemini 2.5 Flash
  • Search: Algolia JavaScript SDK v5.35.0
  • Styling: Tailwind CSS v4, shadcn/ui
  • Data: 2,500 YC startups + 403 failed companies

What's Next

Potential improvements I'd love to add:

  • [ ] User accounts to save favorite roasts
  • [ ] Export analysis as PDF
  • [ ] Industry trend visualization
  • [ ] Investor perspective mode
  • [ ] Co-founder matching based on complementary skills

Built with 🔥 for the Algolia Agent Studio Challenge 2026

Demo: https://youtu.be/TuSimU_864U | Live: https://ai-roasting-algolia.vercel.app


💬 Got questions? Drop them in the comments below!

I'm happy to discuss:

  • Algolia Agent Studio setup and configuration
  • Survival score algorithm tuning
  • Working with Algolia v5 SDK
  • Startup graveyard data sourcing
  • Or anything else about the project!
