Ben Stone

Posted on Jan 20

Building ClipSearch: AI-Powered Product Sourcing for Interior Designers

#nextjs #ai #webdev #typescript

I just launched ClipSearch, a product sourcing tool for interior designers. This post covers the technical architecture, interesting challenges, and what I learned building a Chrome extension + web app with AI-powered search.

The Problem Space

Interior designers browse hundreds of products across dozens of websites every week. They need to:

Save products from any e-commerce site
Organize items by project/client
Find products later (often by vague descriptions)
Present curated selections to clients

Traditional solutions (bookmarks, screenshots, Pinterest) break down at scale.

Tech Stack

Frontend:

Next.js 14 (App Router)
TypeScript
Tailwind CSS
Shadcn/ui components

Backend:

Supabase (PostgreSQL + Auth)
pgvector extension for semantic search
OpenAI API (text-embedding-3-small model)

Chrome Extension:

Manifest V3
Content scripts for product scraping
Background service worker for API calls

Deployment:

Fly.io (Next.js app)
Supabase cloud (database)
Chrome Web Store (extension)

Architecture Overview

┌─────────────────┐
│ Chrome Extension│ ──┐
└─────────────────┘   │
                      ├──> Next.js API Routes
┌─────────────────┐   │         │
│   Web App       │ ──┘         │
└─────────────────┘             ▼
                         ┌──────────────┐
                         │  Supabase    │
                         │  PostgreSQL  │
                         │  + pgvector  │
                         └──────────────┘
                                │
                                ▼
                         ┌──────────────┐
                         │  OpenAI API  │
                         │  Embeddings  │
                         └──────────────┘

Key Features & Implementation

1. Chrome Extension Product Clipping

The extension uses content scripts to extract product information from any website:

// content-script.ts
function extractProductInfo(): ProductData {
  // Try Open Graph meta tags first
  // Type the query as HTMLMetaElement so .content is available
  const ogImage = document.querySelector<HTMLMetaElement>('meta[property="og:image"]')?.content;
  const ogTitle = document.querySelector<HTMLMetaElement>('meta[property="og:title"]')?.content;
  const ogPrice = document.querySelector<HTMLMetaElement>('meta[property="og:price:amount"]')?.content;

  // Fallback: heuristic detection
  const priceElement = document.querySelector('[class*="price"], [id*="price"]');
  const titleElement = document.querySelector('h1, [class*="product-title"]');

  return {
    url: window.location.href,
    title: ogTitle || titleElement?.textContent || document.title,
    price: ogPrice || extractPriceFromText(priceElement?.textContent),
    imageUrl: ogImage || findLargestImage(),
    source: new URL(window.location.href).hostname
  };
}
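
The snippet above calls extractPriceFromText without showing it. A minimal sketch of what such a helper could look like, assuming US-style formatting (comma thousands separators, dot decimals); the body is my assumption, not the production implementation:

```typescript
// Hypothetical helper: the post names extractPriceFromText but does not show it.
// Returns the first price-like number in the text, or null if there is none.
// Assumes US-style formatting; "1.299,00" would be misread.
function extractPriceFromText(text: string | null | undefined): number | null {
  if (!text) return null;

  // First alternative: digits with thousands separators ("1,299.00").
  // Second alternative: plain digits with an optional decimal part ("450", "1299.99").
  const match = text.match(/\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?/);
  if (!match) return null;

  const value = Number(match[0].replace(/,/g, ""));
  return Number.isFinite(value) ? value : null;
}
```

Returning a number (rather than the raw string) makes later price comparisons less error-prone, though the Open Graph value would then need the same parsing.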

Challenge: E-commerce sites structure their HTML differently.

Solution: Multi-layered extraction (Open Graph → schema.org → heuristics). The snippet above shows only the Open Graph and heuristic layers.
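
The schema.org layer is not shown in the post. One common way to read it is from JSON-LD script tags, so here is a rough sketch under that assumption. The StructuredProduct type and extractFromJsonLd name are hypothetical, and real pages often nest products inside @graph or use array-valued @type, which this sketch does not handle:

```typescript
// Hypothetical shape for the fields we care about
interface StructuredProduct {
  title?: string;
  price?: string;
  imageUrl?: string;
}

// Takes the text content of each <script type="application/ld+json"> tag and
// returns the first top-level schema.org Product found, or null.
function extractFromJsonLd(jsonLdBlocks: string[]): StructuredProduct | null {
  for (const block of jsonLdBlocks) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(block);
    } catch {
      continue; // Skip malformed JSON-LD
    }

    // A block may hold a single object or an array of objects
    const candidates = Array.isArray(parsed) ? parsed : [parsed];
    for (const item of candidates) {
      if (typeof item !== "object" || item === null) continue;
      const node = item as Record<string, any>;
      if (node["@type"] !== "Product") continue;

      // offers and image may each be a single value or an array
      const offer = Array.isArray(node.offers) ? node.offers[0] : node.offers;
      const image = Array.isArray(node.image) ? node.image[0] : node.image;

      return {
        title: typeof node.name === "string" ? node.name : undefined,
        price: offer?.price != null ? String(offer.price) : undefined,
        imageUrl: typeof image === "string" ? image : undefined,
      };
    }
  }
  return null;
}

// Assumed wiring in the content script:
// const blocks = Array.from(
//   document.querySelectorAll('script[type="application/ld+json"]'),
//   (s) => s.textContent ?? ""
// );
// const structured = extractFromJsonLd(blocks);
```

Keeping the parser free of DOM calls makes it easy to unit test against saved page fixtures.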

2. AI Semantic Search

Users can search by describing items ("find mid-century credenzas under $2000") rather than exact keywords.

Implementation:

// app/api/search/route.ts
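export async function POST(req: Request) {
  // Note: in production, derive the user ID from the authenticated session
  // rather than trusting a value sent in the request body
  const { query, userId } = await req.json();

  // Generate embedding for search query
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  // pgvector similarity search
  const { data, error } = await supabase.rpc('search_clips', {
    query_embedding: embedding.data[0].embedding,
    match_threshold: 0.7,
    match_count: 20,
    user_id: userId
  });

  // Surface database errors instead of returning null data
  if (error) {
    return Response.json({ error: error.message }, { status: 500 });
  }

  return Response.json(data);
}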

Database function:

CREATE FUNCTION search_clips(
  query_embedding vector(1536),
  match_threshold float,
  match_count int,
  user_id uuid
)
RETURNS TABLE (
  id uuid,
  title text,
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    clips.id,
    clips.title,
    1 - (clips.embedding <=> query_embedding) as similarity
  FROM clips
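  -- Qualify the parameter with the function name: a bare user_id is ambiguous with the column
  WHERE clips.user_id = search_clips.user_id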
    AND 1 - (clips.embedding <=> query_embedding) > match_threshold
  ORDER BY clips.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

Why pgvector?

Native PostgreSQL extension (no separate vector DB)
Scales to millions of vectors
Cosine distance operator (<=>) is fast, especially with an index
Works seamlessly with existing relational data
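
For readers new to vector search: <=> returns cosine distance, and the SQL function turns it into a similarity score with 1 - distance. A small TypeScript sketch of the same arithmetic, purely for illustration (the function names are mine, and the zero-vector handling is a choice made here rather than a statement about pgvector's behavior):

```typescript
// Cosine similarity: 1 means same direction, 0 means orthogonal, -1 means opposite
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }

  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  // Choice for this sketch: treat a zero vector as having no similarity
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance is the complement, so it ranges from 0 (identical) to 2 (opposite)
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

With this framing, a match_threshold of 0.7 keeps rows whose cosine distance to the query is below 0.3.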

3. Visual Image Search

Users can upload inspiration photos and find similar items they've saved.

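// Note: text-embedding-3-small only accepts text, so passing a base64-encoded
// image would embed the characters of the string rather than the picture.
// One option that stays in the same 1536-dimension vector space is to caption
// the image first and embed the caption. describeImage is a hypothetical
// helper (for example, a call to a vision-capable model).
async function visualSearch(imageFile: File, userId: string) {
  // Convert image to a text description, then to an embedding
  const description = await describeImage(imageFile);
  const imageEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: description,
  });

  // Same pgvector search, but with the embedding derived from the image
  return supabase.rpc('search_clips', {
    query_embedding: imageEmbedding.data[0].embedding,
    match_threshold: 0.6, // Lower threshold for visual similarity
    match_count: 30,
    user_id: userId
  });
}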

4. Extension <-> Web App Authentication

Challenge: Share authentication state between Chrome extension and web app.

Solution: Token-based flow

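// Extension: popup.tsx
async function authenticate() {
  // Open the web app so the user can sign in there
  await chrome.tabs.create({
    url: `${WEB_APP_URL}/extension-auth`
  });
}

// Extension: background service worker
// The popup closes as soon as the new tab takes focus, so the listener lives here.
// Messages sent from a web page arrive on onMessageExternal, and the web app's
// origin must be listed under "externally_connectable" in manifest.json.
chrome.runtime.onMessageExternal.addListener((message, sender) => {
  if (message.type === 'AUTH_TOKEN') {
    // Store token in extension storage
    chrome.storage.local.set({ authToken: message.token });

    // Close the auth tab once the token is stored
    if (sender.tab?.id) chrome.tabs.remove(sender.tab.id);
  }
});

// Web app: app/extension-auth/page.tsx
'use client';

import { useEffect } from 'react';

// EXTENSION_ID is an assumed constant holding the extension's ID
export default function ExtensionAuth() {
  useEffect(() => {
    // useEffect callbacks cannot be async, so wrap the work in a function
    async function sendToken() {
      const { data: { session } } = await supabase.auth.getSession();
      if (!session) return;

      // Send token to extension
      chrome.runtime.sendMessage(EXTENSION_ID, {
        type: 'AUTH_TOKEN',
        token: session.access_token
      });
    }

    sendToken();
  }, []);

  return <p>Connecting to the extension...</p>;
}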

5. Price Tracking

Background job checks saved products for price changes:

// Cron job (runs daily)
async function checkPriceChanges() {
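  // Supabase returns { data, error }, not the rows directly
  const { data: clips, error } = await supabase
    .from('clips')
    .select('*')
    .not('url', 'is', null);

  if (error) throw error;

  for (const clip of clips ?? []) {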
    const currentPrice = await scrapePrice(clip.url);

    if (currentPrice && currentPrice !== clip.price) {
      await supabase.from('price_changes').insert({
        clip_id: clip.id,
        old_price: clip.price,
        new_price: currentPrice,
        changed_at: new Date().toISOString()
      });

      // Update clip price
      await supabase
        .from('clips')
        .update({ price: currentPrice })
        .eq('id', clip.id);
    }
  }
}
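
One thing to watch in the comparison above: if the stored price is a string like "1299.00" and the scraped price is the number 1299, a strict !== reports a change that never happened. A hedged sketch of a comparison helper that normalizes both sides first (detectPriceChange and PriceChange are hypothetical names, not part of the ClipSearch codebase):

```typescript
interface PriceChange {
  oldPrice: number;
  newPrice: number;
  percentChange: number; // Negative means the price dropped
}

// Converts a stored or scraped value to a finite number, or null if unusable
function toNumber(value: number | string | null | undefined): number | null {
  if (value === null || value === undefined || value === "") return null;
  const n = typeof value === "number" ? value : Number(value);
  return Number.isFinite(n) ? n : null;
}

// Returns a change record only when the two prices differ by at least one cent
function detectPriceChange(
  stored: number | string | null | undefined,
  scraped: number | string | null | undefined
): PriceChange | null {
  const oldPrice = toNumber(stored);
  const newPrice = toNumber(scraped);
  if (oldPrice === null || newPrice === null) return null;

  // Compare in whole cents to sidestep floating-point noise
  if (Math.round(oldPrice * 100) === Math.round(newPrice * 100)) return null;

  const percentChange =
    oldPrice === 0 ? 0 : ((newPrice - oldPrice) * 100) / oldPrice;
  return { oldPrice, newPrice, percentChange };
}
```

The percentChange field is an extra that could be used to notify users only about meaningful drops.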

Interesting Challenges

1. Embedding Generation at Scale

Problem: Generating embeddings for every clipped product is expensive (API costs).

Solution:

Cache embeddings aggressively
Batch embedding generation (up to 100 texts per API call)
Only regenerate if title/description changes significantly

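import chunk from 'lodash/chunk'; // Assumption: the post does not show where chunk comes from

async function generateEmbeddings(clips: Clip[]) {
  // Batch up to 100 clips
  const batches = chunk(clips, 100);

  for (const batch of batches) {
    // Fall back to an empty string so a missing description is not embedded as "undefined"
    const texts = batch.map(c => `${c.title} ${c.description ?? ''}`.trim());

    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: texts,
    });

    // Update all clips in batch
    const { error } = await supabase.from('clips').upsert(
      batch.map((clip, i) => ({
        ...clip,
        embedding: response.data[i].embedding
      }))
    );
    if (error) throw error;
  }
}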
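
The "only regenerate if the text changes" rule is not shown above. One simple way to approximate it is to store a hash of the normalized embedding input next to each clip and skip clips whose hash is unchanged. This is a sketch under that assumption: the embeddingHash field is hypothetical, and hashing detects any change after normalization rather than measuring how significant the change is.

```typescript
import { createHash } from "node:crypto";

// Hypothetical subset of the clip fields relevant to embeddings
interface EmbeddableClip {
  title: string;
  description?: string | null;
  embeddingHash?: string | null; // Assumed extra column, not in the original schema
}

// Normalize so whitespace-only or case-only edits do not trigger a new embedding
function normalizedText(clip: EmbeddableClip): string {
  return `${clip.title} ${clip.description ?? ""}`
    .trim()
    .replace(/\s+/g, " ")
    .toLowerCase();
}

// SHA-256 of the normalized text, hex-encoded
function contentHash(clip: EmbeddableClip): string {
  return createHash("sha256").update(normalizedText(clip)).digest("hex");
}

// True when the clip has never been embedded or its text has changed since
function needsEmbedding(clip: EmbeddableClip): boolean {
  return clip.embeddingHash !== contentHash(clip);
}
```

Filtering with clips.filter(needsEmbedding) before batching would keep unchanged clips out of the API call entirely.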

2. Chrome Extension Content Security Policy

Problem: Manifest V3's strict CSP blocks inline scripts and eval().

Solution:

Move all logic to service workers
Use message passing for communication
Pre-compile templates (no runtime JSX in content scripts)
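
To make the message-passing point concrete, here is a rough sketch of a typed handler the service worker could register. The message shapes and the saveClip callback are hypothetical; the chrome.runtime wiring is shown in comments so the core logic stays testable outside the browser.

```typescript
// Hypothetical message contract between content script and service worker
type ClipPayload = { url: string; title: string };
type ExtensionMessage =
  | { type: "SAVE_CLIP"; payload: ClipPayload }
  | { type: "PING" };
type ExtensionResponse = { ok: true } | { ok: false; error: string };

// Runtime check, since messages cross a trust boundary and arrive untyped
function isExtensionMessage(value: unknown): value is ExtensionMessage {
  if (typeof value !== "object" || value === null) return false;
  const msg = value as {
    type?: unknown;
    payload?: { url?: unknown; title?: unknown } | null;
  };
  if (msg.type === "PING") return true;
  return (
    msg.type === "SAVE_CLIP" &&
    typeof msg.payload?.url === "string" &&
    typeof msg.payload?.title === "string"
  );
}

// Core handler: validates the message, then delegates the API call to saveClip
async function handleMessage(
  message: unknown,
  saveClip: (clip: ClipPayload) => Promise<void>
): Promise<ExtensionResponse> {
  if (!isExtensionMessage(message)) {
    return { ok: false, error: "Unknown message" };
  }
  if (message.type === "PING") return { ok: true };

  try {
    await saveClip(message.payload);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Assumed wiring in the service worker (saveClipViaApi is hypothetical):
// chrome.runtime.onMessage.addListener((message, _sender, sendResponse) => {
//   handleMessage(message, saveClipViaApi).then(sendResponse);
//   return true; // Keep the channel open for the async response
// });
//
// And in the content script:
// const response = await chrome.runtime.sendMessage({ type: "SAVE_CLIP", payload });
```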

3. Handling Authentication Across Contexts

Problem: Supabase session cookies don't work across extension and web app.

Solution: Token-based auth with secure storage:

Web app uses cookie-based sessions (httpOnly, secure)
Extension uses token-based auth (stored in chrome.storage.local)
Tokens are short-lived (1 hour) with refresh mechanism
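
The refresh mechanism is not shown in the post. Since Supabase access tokens are JWTs, the extension can read the exp claim to decide when to refresh. A minimal sketch, with hypothetical function names and an assumed 60-second safety margin; it decodes the payload without verifying the signature, so it is only suitable for scheduling a refresh, never for trusting the token's contents:

```typescript
// Reads the exp claim (seconds since epoch) from a JWT, or null if unreadable
function jwtExpiry(token: string): number | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;

  try {
    // JWT segments are base64url; convert to standard base64 and restore padding
    const base64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
    const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, "=");
    const payload = JSON.parse(atob(padded));
    return typeof payload.exp === "number" ? payload.exp : null;
  } catch {
    return null;
  }
}

// True when the token is expired, about to expire, or cannot be parsed
function needsRefresh(
  token: string,
  nowMs: number = Date.now(),
  skewSeconds: number = 60
): boolean {
  const exp = jwtExpiry(token);
  if (exp === null) return true;
  return nowMs / 1000 >= exp - skewSeconds;
}
```

The service worker could call needsRefresh on the stored token before each API request and fetch a new one when it returns true.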

Performance Optimizations

Database Indexing

-- Critical indexes for performance
CREATE INDEX idx_clips_user_id ON clips(user_id);
CREATE INDEX idx_clips_embedding ON clips USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX idx_clips_created_at ON clips(created_at DESC);

Next.js Optimizations

Server Components for initial page loads
Dynamic imports for heavy components
Image optimization with next/image
Route caching with revalidate

// app/dashboard/page.tsx
export const revalidate = 60; // Cache for 60 seconds

export default async function Dashboard() {
  const clips = await getClips(); // Server-side fetch

  return (
    <Suspense fallback={<ClipsSkeleton />}>
      <ClipGrid clips={clips} />
    </Suspense>
  );
}

Lessons Learned

1. Start with Constraints

I initially wanted to support image recognition for all products on the web. Scoping down to "help designers organize what they already find" made the MVP achievable.

2. pgvector is Production-Ready

I was skeptical about using PostgreSQL for vector search instead of Pinecone/Weaviate. pgvector has been rock-solid and eliminates operational complexity.

3. Chrome Extension Distribution is Slow

The Chrome Web Store review process takes 3-7 days. Plan accordingly for updates.

4. Pricing is Hard

Choosing $24/month was based on competitive analysis, but I'm still not sure if it's optimal. B2B tools can charge more; productivity tools need volume.

What's Next

Planned features:

Bulk import from Pinterest/bookmarks
Shared folders for team collaboration
Integration with design software (Mood boards, CAD tools)
Mobile app (React Native)

Scaling considerations:

Move to dedicated vector DB if >1M clips
Add caching layer (Redis) for search results
Implement rate limiting for API routes

Try It Out

ClipSearch is live at https://designshelf.biz

Free tier includes:

150 clips/month
AI semantic search
Visual image search
Folder organization

Would love feedback from the dev community, especially on:

The extension architecture
Vector search performance
Pricing strategy

Code Snippets & Resources

Full tech stack:

Questions? Drop them in the comments or reach out! Happy to discuss any of the technical decisions in more detail.

Built with ☕ and TypeScript
