SIKOUTRIS

Posted on Feb 28 • Edited on Mar 9 • Originally published at aimarketingcompare.com

Building an AI Tool Comparison Platform: Data Architecture and Scoring System

#ai #webdev #startup #tutorial


Building a comparison platform for AI tools seems simple: show features side by side, done. Except it's not. You need to handle products that change constantly, quality assessments that are subjective, and preferences that differ from user to user.

Here's an architecture that addresses all three.

Core Data Model

interface Tool {
  id: string;
  name: string;
  category: 'email' | 'content' | 'social' | 'analytics' | 'personalization';
  pricing: PricingModel;
  features: Feature[];
  metrics: ToolMetrics;
  lastUpdated: Date;
  dataSource: 'api' | 'manual' | 'user_report';
}

interface Feature {
  id: string;
  name: string;
  category: 'core' | 'integration' | 'analytics' | 'automation';
  availability: 'free' | 'starter' | 'pro' | 'enterprise';
  description: string;
  verified: boolean;
}

interface ToolMetrics {
  avgRating: number;
  reviewCount: number;
  adoptionScore: number;
  pricePerformanceRatio: number;
  lastVerified: Date;
}

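interface PricingModel {
  tier: 'free' | 'freemium' | 'paid';
  basePrice: number; // expressed in `currency`
  currency: string;  // ISO 4217 code, e.g. 'USD'
  billingCycle: 'monthly' | 'annual';
  tiers: PricingTier[];
}

// Assumed minimal shape; mirrors the tier_name and base_price columns of the pricing table.
interface PricingTier {
  name: string;
  price: number;
}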

The key field is dataSource, which records where the information came from. A user-reported feature should not carry the same weight as an API-verified one.
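One way to act on that distinction is to turn the source into a trust weight. The sketch below is illustrative only: the `SOURCE_TRUST` values, the `SourcedFeature` type (which attaches a source to each feature rather than to the whole tool), and the `featureConfidence` helper are assumptions, not part of the model above.

```typescript
type DataSource = 'api' | 'manual' | 'user_report';

// Hypothetical trust weights; tune them against your own moderation data.
const SOURCE_TRUST: Record<DataSource, number> = {
  api: 1.0,
  manual: 0.8,
  user_report: 0.5,
};

// Assumed variant of Feature that records the source per feature.
interface SourcedFeature {
  name: string;
  verified: boolean;
  dataSource: DataSource;
}

// Returns a 0..1 confidence for a single feature claim.
// A verified claim is trusted fully, whatever its source.
function featureConfidence(feature: SourcedFeature): number {
  return feature.verified ? 1.0 : SOURCE_TRUST[feature.dataSource];
}

const reported: SourcedFeature = { name: 'A/B testing', verified: false, dataSource: 'user_report' };
const confirmed: SourcedFeature = { ...reported, verified: true };
```

A confidence value like this could be shown next to each feature or used to discount unverified claims when scoring.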

Dynamic Scoring System

Scoring is the core of the platform. Listing features isn't enough; users need guidance on which tool fits them.

function calculateToolScore(tool: Tool, userPreferences: UserPreferences): Score {
  // Raw weights; the emphasis shifts with what the user told us.
  const rawWeights = {
    pricePerformance: userPreferences.budget ? 0.3 : 0.1,
    featureCompleteness: 0.25,
    integrations: userPreferences.stack.length > 0 ? 0.25 : 0.15,
    userSatisfaction: 0.2,
    adoptionVelocity: 0.1
  };
  // The raw weights sum to 0.8-1.1 depending on the branches taken, so normalize by the total.
  const totalWeight = Object.values(rawWeights).reduce((sum, w) => sum + w, 0);

  // Every component score is assumed to be in the 0..1 range.
  const scores = {
    pricePerformance: calculatePricePerformance(tool, userPreferences.budget),
    featureCompleteness: calculateFeatureCoverage(tool, userPreferences.requiredFeatures),
    integrations: calculateIntegrationScore(tool, userPreferences.stack),
    userSatisfaction: tool.metrics.avgRating / 5, // ratings are 1-5
    adoptionVelocity: calculateAdoptionVelocity(tool)
  };

  const finalScore = (Object.keys(rawWeights) as (keyof typeof rawWeights)[]).reduce(
    (sum, key) => sum + scores[key] * (rawWeights[key] / totalWeight),
    0
  );

  return { finalScore: finalScore * 100, breakdown: scores };
}

Users see not just a score, but why that tool scored that way. Transparency matters.
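To make the "why" concrete, the breakdown can be expressed as points contributed per criterion. This is a minimal sketch under two assumptions: every component score is already scaled to 0..1, and the weights are rescaled to sum to 1. The helper names `normalizeWeights` and `contributions` are hypothetical.

```typescript
type Criterion =
  | 'pricePerformance'
  | 'featureCompleteness'
  | 'integrations'
  | 'userSatisfaction'
  | 'adoptionVelocity';
type PerCriterion = Record<Criterion, number>;

// Rescale weights so they sum to 1 while keeping their relative proportions.
function normalizeWeights(weights: PerCriterion): PerCriterion {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  if (total <= 0) throw new Error('weights must sum to a positive number');
  const result = { ...weights };
  for (const key of Object.keys(result) as Criterion[]) {
    result[key] = weights[key] / total;
  }
  return result;
}

// Points (out of 100) that each criterion contributes to the final score.
function contributions(scores: PerCriterion, weights: PerCriterion): PerCriterion {
  const normalized = normalizeWeights(weights);
  const result = { ...scores };
  for (const key of Object.keys(result) as Criterion[]) {
    result[key] = scores[key] * normalized[key] * 100;
  }
  return result;
}

// Weights for a user who set a budget and has an existing stack (raw sum: 1.1).
const exampleWeights: PerCriterion = {
  pricePerformance: 0.3,
  featureCompleteness: 0.25,
  integrations: 0.25,
  userSatisfaction: 0.2,
  adoptionVelocity: 0.1,
};
const perfectScores: PerCriterion = {
  pricePerformance: 1,
  featureCompleteness: 1,
  integrations: 1,
  userSatisfaction: 1,
  adoptionVelocity: 1,
};
const examplePoints = contributions(perfectScores, exampleWeights);
```

Because the weights are normalized first, the contributions of a tool that scores 1 on every criterion add up to 100, so each entry can be displayed directly as "points out of 100".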

Database Schema

CREATE TABLE tools (
  id UUID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  category VARCHAR(50) NOT NULL,
  official_website VARCHAR(255),
  created_at TIMESTAMP,
  updated_at TIMESTAMP,
  data_source VARCHAR(20) CHECK (data_source IN ('api', 'manual', 'user_report')),
  UNIQUE(name, category)
);

CREATE TABLE features (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  feature_name VARCHAR(255),
  category VARCHAR(50),
  available_tier VARCHAR(50),
  verified BOOLEAN DEFAULT false,
  UNIQUE(tool_id, feature_name)
);

CREATE TABLE pricing (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  tier_name VARCHAR(50),
  base_price DECIMAL(10, 2),
  currency VARCHAR(3),
  billing_cycle VARCHAR(20),
  last_verified TIMESTAMP
);

CREATE TABLE reviews (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  user_id UUID NOT NULL,
  rating INT CHECK (rating BETWEEN 1 AND 5),
  review_text TEXT,
  verified_user BOOLEAN,
  created_at TIMESTAMP
);

CREATE TABLE tool_integrations (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  integration_name VARCHAR(255),
  integration_type VARCHAR(20) CHECK (integration_type IN ('native', 'api', 'zapier')),
  verified BOOLEAN DEFAULT false
);

Index strategy:

tool_id on features, pricing, reviews (frequent lookups)
category on tools (filtering)
verified on features/integrations (quality signals)

Handling Changing Tool Data

Tools change. Pricing can shift from one week to the next. New features launch. Without a process for updates, your data goes stale.

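async function updateToolData(toolId: string, source: DataSource) {
  const latestData = await source.fetch(toolId);

  // Track what changed.
  // Assumes compareWithStored returns a list of { field, before, after } entries.
  const changes = await compareWithStored(toolId, latestData);
  if (changes.length === 0) return; // nothing changed: skip the write and the log entry

  // Update only changed fields
  const changedFields = Object.fromEntries(changes.map(c => [c.field, c.after]));
  await db.tools.update(toolId, changedFields, {
    updatedAt: new Date()
  });

  // Log changes for audit trail
  await db.changeLog.insert({
    toolId,
    changes,
    timestamp: new Date(),
    source: source.name
  });

  // Invalidate caches
  cache.invalidate(`tool:${toolId}`);
}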

Never overwrite everything. Track changes.
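The comparison step could look something like the sketch below. It is an assumption about how a helper such as compareWithStored might work once the stored record has been loaded: a shallow, field-by-field diff that returns before/after pairs suitable for an audit log. The `FieldChange` type, the `diffFields` name, and the sample tool are hypothetical.

```typescript
interface FieldChange {
  field: string;
  before: unknown;
  after: unknown;
}

// Shallow diff: reports each top-level field in `latest` whose value differs
// from `stored`. Values are compared via JSON.stringify, which is enough for
// plain data (numbers, strings, arrays, objects with a stable key order).
// Fields missing from `latest` are not reported as removals.
function diffFields(
  stored: Record<string, unknown>,
  latest: Record<string, unknown>
): FieldChange[] {
  const changes: FieldChange[] = [];
  for (const field of Object.keys(latest)) {
    if (JSON.stringify(stored[field]) !== JSON.stringify(latest[field])) {
      changes.push({ field, before: stored[field], after: latest[field] });
    }
  }
  return changes;
}

// Made-up record: only the price differs between the two versions.
const exampleChanges = diffFields(
  { name: 'ExampleMailer', basePrice: 29, currency: 'USD' },
  { name: 'ExampleMailer', basePrice: 39, currency: 'USD' }
);
```

Keeping the old value alongside the new one is what makes the change log useful later, for example when measuring how often a tool's price moves.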

User Preference-Based Filtering

Different users care about different things:

interface UserPreferences {
  budget: {
    min: number;
    max: number;
    currency: string;
  };
  requiredFeatures: string[];
  stack: string[]; // Tools they already use
  teamSize: number;
  useCase: Tool['category']; // same set of values as Tool.category
}

function filterByPreferences(tools: Tool[], prefs: UserPreferences): Tool[] {
  return tools
    .filter(tool => {
      // Budget filter (assumes prices are already in prefs.budget.currency)
      if (tool.pricing.basePrice > prefs.budget.max) return false;

      // Feature filter: every required feature must match some feature name
      const hasAllRequired = prefs.requiredFeatures.every(feature =>
        tool.features.some(f => f.name.toLowerCase().includes(feature.toLowerCase()))
      );
      if (!hasAllRequired) return false;

      // Integration filter: the tool must work with at least one tool in the user's stack
      const worksWithStack = prefs.stack.some(integration =>
        hasIntegration(tool, integration)
      );
      if (prefs.stack.length > 0 && !worksWithStack) return false;

      return true;
    })
    // Score each tool once; scoring inside the comparator would recompute it on every comparison.
    .map(tool => ({ tool, score: calculateToolScore(tool, prefs).finalScore }))
    .sort((a, b) => b.score - a.score)
    .map(({ tool }) => tool);
}
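The filter relies on a hasIntegration helper that the article does not show. A minimal sketch, assuming each tool carries an `integrations` array loaded from the tool_integrations table (the Tool interface above does not declare one) and that names should match exactly after normalization:

```typescript
interface Integration {
  name: string;
  type: 'native' | 'api' | 'zapier';
  verified: boolean;
}

// Lowercase and strip everything except letters and digits,
// so "Hub-Spot" and "hubspot" compare equal.
function normalizeName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, '');
}

// True when the tool lists an integration whose normalized name matches exactly.
function hasIntegration(tool: { integrations: Integration[] }, wanted: string): boolean {
  const target = normalizeName(wanted);
  return tool.integrations.some(i => normalizeName(i.name) === target);
}

const exampleTool = {
  integrations: [{ name: 'HubSpot', type: 'native' as const, verified: true }],
};
```

Exact matching on a normalized name avoids the false positives that substring matching can produce, such as "hub" matching "HubSpot".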

API Design

GET /api/tools
  ?category=email
  &features=automation,reporting
  &maxPrice=100
  &integrations=salesforce,hubspot
  &sort=score,price
  &limit=10

Returns paginated, scored results matching user criteria.
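On the server, those query parameters need to be turned into typed filter criteria. The following is a sketch using the standard URLSearchParams class; the `ToolQuery` shape, the default limit of 10, and the cap of 100 are assumptions rather than part of the API described above.

```typescript
interface ToolQuery {
  category?: string;
  features: string[];
  integrations: string[];
  maxPrice?: number;
  sort: string[];
  limit: number;
}

// Split a comma-separated parameter into trimmed, non-empty values.
function splitList(value: string | null): string[] {
  return (value ?? '').split(',').map(v => v.trim()).filter(v => v.length > 0);
}

function parseToolQuery(queryString: string): ToolQuery {
  const params = new URLSearchParams(queryString);
  const rawPrice = params.get('maxPrice');
  const maxPrice = rawPrice !== null && rawPrice.trim() !== '' ? Number(rawPrice) : NaN;
  const limit = Number(params.get('limit'));
  return {
    category: params.get('category') ?? undefined,
    features: splitList(params.get('features')),
    integrations: splitList(params.get('integrations')),
    // Drop a missing or non-numeric price instead of filtering on NaN.
    maxPrice: Number.isFinite(maxPrice) ? maxPrice : undefined,
    sort: splitList(params.get('sort')),
    // Assumed defaults: 10 results, capped at 100.
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, 100) : 10,
  };
}

const exampleQuery = parseToolQuery(
  'category=email&features=automation,reporting&maxPrice=100&integrations=salesforce,hubspot&sort=score,price&limit=10'
);
```

Validating and capping values at the edge keeps malformed requests from reaching the filtering and scoring code.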

GET /api/tools/:toolId/comparison
  ?vs=tool2,tool3

Side-by-side comparison endpoint.

POST /api/tools/:toolId/report

Users report outdated information (pricing changed, feature added, etc.).
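The article does not specify the request body for this endpoint. As a rough sketch, a report might carry a kind, a free-text description, and an optional evidence link; the `ToolReport` shape, the list of kinds, and the 10-character minimum are all assumptions.

```typescript
type ReportKind = 'pricing_changed' | 'feature_added' | 'feature_removed' | 'other';

interface ToolReport {
  kind: ReportKind;
  details: string;
  evidenceUrl?: string;
}

const REPORT_KINDS: ReportKind[] = ['pricing_changed', 'feature_added', 'feature_removed', 'other'];

// Returns a list of problems; an empty list means the report is acceptable.
function validateReport(body: unknown): string[] {
  if (typeof body !== 'object' || body === null) return ['body must be a JSON object'];
  const report = body as Partial<ToolReport>;
  const errors: string[] = [];
  if (!REPORT_KINDS.includes(report.kind as ReportKind)) {
    errors.push('unknown report kind');
  }
  if (typeof report.details !== 'string' || report.details.trim().length < 10) {
    errors.push('details must be at least 10 characters');
  }
  if (report.evidenceUrl !== undefined && !/^https?:\/\//.test(String(report.evidenceUrl))) {
    errors.push('evidenceUrl must be an http(s) URL');
  }
  return errors;
}
```

Reports that pass validation would still go into a moderation queue rather than straight into the tool record.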

Keeping Data Fresh

Without scheduled refreshes, your comparison platform becomes a graveyard of outdated info.

// Refresh a batch one tool at a time; a failure on one tool must not stop the rest.
async function refreshTools(tools: Tool[]) {
  for (const tool of tools) {
    try {
      await updateToolData(tool.id, getSourceFor(tool));
    } catch (err) {
      console.error(`Refresh failed for tool ${tool.id}`, err);
    }
  }
}

// Daily refresh for top 50 tools
schedule.daily('0 2', async () => {
  await refreshTools(await db.tools.topBy('reviews', 50));
});

// Weekly refresh for all tools
schedule.weekly('sunday 3:00', async () => {
  await refreshTools(await db.tools.findAll());
});
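Fixed schedules can be complemented by an age check on the lastVerified timestamps, so that stale records are flagged in the UI or picked up earlier. A minimal sketch; the helper names and the choice of whole days as the unit are assumptions.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days elapsed since the record was last verified.
function daysSince(lastVerified: Date, now: Date = new Date()): number {
  return Math.floor((now.getTime() - lastVerified.getTime()) / DAY_MS);
}

// A record is stale once it is older than the allowed age.
function isStale(lastVerified: Date, maxAgeDays: number, now: Date = new Date()): boolean {
  return daysSince(lastVerified, now) > maxAgeDays;
}

// Fixed dates keep the example deterministic.
const verifiedOn = new Date('2024-03-01T00:00:00Z');
const checkedOn = new Date('2024-03-10T00:00:00Z');
const ageInDays = daysSince(verifiedOn, checkedOn);
```

Passing `now` explicitly keeps the functions easy to test; in production code the default of the current time applies.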

Metrics That Matter

Track:

How many users updated which tool's data (crowdsourced accuracy)
Which features get compared most often (real needs vs marketing claims)
Which tools users actually choose after comparing (outcome data)
Price change frequency (volatility indicator)

This data improves your platform over time.
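As an example, price change frequency can be derived from the change log by counting price changes per tool over a trailing window. The `PriceChange` shape and the `priceChangeCounts` helper below are hypothetical; a real change log would need to be filtered down to pricing fields first.

```typescript
interface PriceChange {
  toolId: string;
  changedAt: Date;
}

// Number of recorded price changes per tool within the trailing window.
function priceChangeCounts(
  log: PriceChange[],
  windowDays: number,
  now: Date = new Date()
): Map<string, number> {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const counts = new Map<string, number>();
  for (const entry of log) {
    if (entry.changedAt.getTime() >= cutoff) {
      counts.set(entry.toolId, (counts.get(entry.toolId) ?? 0) + 1);
    }
  }
  return counts;
}

// Made-up log entries with fixed dates.
const sampleLog: PriceChange[] = [
  { toolId: 'tool-a', changedAt: new Date('2024-03-05T00:00:00Z') },
  { toolId: 'tool-a', changedAt: new Date('2024-03-20T00:00:00Z') },
  { toolId: 'tool-a', changedAt: new Date('2024-01-01T00:00:00Z') },
  { toolId: 'tool-b', changedAt: new Date('2024-03-10T00:00:00Z') },
];
const recentCounts = priceChangeCounts(sampleLog, 30, new Date('2024-03-31T00:00:00Z'));
```

Tools with a high count are candidates for more frequent refreshes than the default schedule.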

The Real Challenge

Building the architecture is straightforward. The hard part is keeping data accurate while scaling. Crowdsourced updates help, but they need moderation. API access helps for structured data, but you'll always have gaps.

AI comparison platforms work best when they embrace this: perfect data is impossible, but being transparent about uncertainty is better than claiming accuracy you don't have.
