Building an AI Tool Comparison Platform: Data Architecture and Scoring System
Building a comparison platform for AI tools seems simple: show features side-by-side, done. Except it's not. You need to handle constantly changing products, subjective quality assessments, and user-specific preferences.
Here's an architecture that handles all three.
Core Data Model
interface Tool {
  id: string;
  name: string;
  category: 'email' | 'content' | 'social' | 'analytics' | 'personalization';
  pricing: PricingModel;
  features: Feature[];
  metrics: ToolMetrics;
  lastUpdated: Date;
  dataSource: 'api' | 'manual' | 'user_report';
}

interface Feature {
  id: string;
  name: string;
  category: 'core' | 'integration' | 'analytics' | 'automation';
  availability: 'free' | 'starter' | 'pro' | 'enterprise';
  description: string;
  verified: boolean;
}

interface ToolMetrics {
  avgRating: number;
  reviewCount: number;
  adoptionScore: number;
  pricePerformanceRatio: number;
  lastVerified: Date;
}

interface PricingModel {
  tier: 'free' | 'freemium' | 'paid';
  basePrice: number;
  currency: string;
  billingCycle: 'monthly' | 'annual';
  features: PricingTier[];
}
The key: dataSource tracks where info came from. User-reported features aren't the same as API-verified features.
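One way to act on that distinction is to turn provenance into a rough confidence value that the UI or the scoring can use. The sketch below is only an illustration: the `SOURCE_CONFIDENCE` numbers, the `FeatureClaim` shape, and the `featureConfidence` helper are hypothetical and would need tuning against real data.

```typescript
// Hypothetical confidence per data source; the numbers are illustrative only.
type DataSource = 'api' | 'manual' | 'user_report';

const SOURCE_CONFIDENCE: Record<DataSource, number> = {
  api: 1.0,
  manual: 0.8,
  user_report: 0.5,
};

interface FeatureClaim {
  name: string;
  verified: boolean;
}

// Combine where the data came from with whether the feature was verified.
// Unverified claims are discounted by half (an assumed penalty).
function featureConfidence(source: DataSource, feature: FeatureClaim): number {
  const base = SOURCE_CONFIDENCE[source];
  return feature.verified ? base : base * 0.5;
}
```

A verified API-sourced feature keeps full confidence, while an unverified user report drops to a quarter of it, which is enough to drive a "needs verification" badge.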
Dynamic Scoring System
The magic happens in scoring. You can't just show features—users need guidance.
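function calculateToolScore(tool: Tool, userPreferences: UserPreferences): Score {
  // Raw weights shift with user preferences, so they do not always sum to 1
  const rawWeights = {
    pricePerformance: userPreferences.budget ? 0.3 : 0.1,
    featureCompleteness: 0.25,
    integrations: userPreferences.stack.length > 0 ? 0.25 : 0.15,
    userSatisfaction: 0.2,
    adoptionVelocity: 0.1
  };
  const totalWeight = Object.values(rawWeights).reduce((sum, w) => sum + w, 0);

  // Every component is assumed to be on a 0-1 scale so the weights are comparable
  const scores: Record<keyof typeof rawWeights, number> = {
    pricePerformance: calculatePricePerformance(tool, userPreferences.budget),
    featureCompleteness: calculateFeatureCoverage(tool, userPreferences.requiredFeatures),
    integrations: calculateIntegrationScore(tool, userPreferences.stack),
    userSatisfaction: tool.metrics.avgRating / 5, // ratings are on a 1-5 scale
    adoptionVelocity: calculateAdoptionVelocity(tool)
  };

  // Weighted average, divided by the total weight so the result stays within 0-100
  const weighted = (Object.keys(rawWeights) as (keyof typeof rawWeights)[]).reduce(
    (sum, key) => sum + scores[key] * rawWeights[key],
    0
  );
  return { finalScore: (weighted / totalWeight) * 100, breakdown: scores };
}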
Users see not just a score, but why that tool scored that way. Transparency matters.
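To make the "why" concrete, the breakdown can be turned into per-component contributions that a UI could render as a bar chart. This is a minimal sketch, assuming each component score is on a 0 to 1 scale and the weights sum to 1; the `explainScore` helper and the `Contribution` shape are hypothetical names introduced for illustration.

```typescript
interface Contribution {
  component: string;
  score: number;   // component score, assumed to be in the 0-1 range
  weight: number;  // weight applied to that component
  points: number;  // contribution to a 0-100 final score
}

// Turn a score breakdown into a list sorted by how much each part contributed.
function explainScore(
  scores: Record<string, number>,
  weights: Record<string, number>,
): Contribution[] {
  return Object.keys(weights)
    .map((component) => {
      const score = scores[component] ?? 0; // a missing score counts as 0
      const weight = weights[component];
      return { component, score, weight, points: score * weight * 100 };
    })
    .sort((a, b) => b.points - a.points);
}

const explained = explainScore(
  { pricePerformance: 0.5, featureCompleteness: 1 },
  { pricePerformance: 0.5, featureCompleteness: 0.5 },
);
// The largest contributor comes first: featureCompleteness with 50 points,
// followed by pricePerformance with 25.
```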
Database Schema
CREATE TABLE tools (
  id UUID PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  category VARCHAR(50) NOT NULL,
  official_website VARCHAR(255),
  created_at TIMESTAMP,
  updated_at TIMESTAMP,
  -- CHECK constraint instead of inline ENUM(...), which is MySQL-specific
  -- syntax that PostgreSQL rejects
  data_source VARCHAR(20) CHECK (data_source IN ('api', 'manual', 'user_report')),
  UNIQUE(name, category)
);

CREATE TABLE features (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  feature_name VARCHAR(255),
  category VARCHAR(50),
  available_tier VARCHAR(50),
  verified BOOLEAN DEFAULT false,
  UNIQUE(tool_id, feature_name)
);

CREATE TABLE pricing (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  tier_name VARCHAR(50),
  base_price DECIMAL(10, 2),
  currency VARCHAR(3),
  billing_cycle VARCHAR(20),
  last_verified TIMESTAMP
);

CREATE TABLE reviews (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  user_id UUID NOT NULL,
  rating INT CHECK (rating BETWEEN 1 AND 5),
  review_text TEXT,
  verified_user BOOLEAN,
  created_at TIMESTAMP
);

CREATE TABLE tool_integrations (
  id UUID PRIMARY KEY,
  tool_id UUID NOT NULL REFERENCES tools(id),
  integration_name VARCHAR(255),
  integration_type VARCHAR(20) CHECK (integration_type IN ('native', 'api', 'zapier')),
  verified BOOLEAN DEFAULT false
);
Index strategy:
- tool_id on features, pricing, reviews (frequent lookups)
- category on tools (filtering)
- verified on features/integrations (quality signals)
Handling Changing Tool Data
Tools change. Pricing can shift from one week to the next. New features launch. Your data goes stale.
async function updateToolData(toolId: string, source: DataSource) {
  const latestData = await source.fetch(toolId);

  // Track what changed
  // (assumes compareWithStored returns only the fields that differ)
  const changes = await compareWithStored(toolId, latestData);
  if (Object.keys(changes).length === 0) return; // nothing to write or log

  // Update only changed fields
  await db.tools.update(toolId, changes, {
    updatedAt: new Date()
  });

  // Log changes for audit trail
  await db.changeLog.insert({
    toolId,
    changes,
    timestamp: new Date(),
    source: source.name
  });

  // Invalidate caches
  cache.invalidate(`tool:${toolId}`);
}
Never overwrite everything. Track changes.
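The change-tracking step can be as simple as a field-by-field diff. The following is a rough sketch of what a helper like `compareWithStored` might do internally once the stored record has been loaded; `diffFields` and `FieldChange` are hypothetical names, and the JSON-based comparison is an assumption that only suits plain data.

```typescript
interface FieldChange {
  field: string;
  oldValue: unknown;
  newValue: unknown;
}

// Compare two plain records field by field. Nested values are compared via
// JSON serialization, which is order-sensitive and only suits plain data.
// Fields present only in `stored` are ignored here.
function diffFields(
  stored: Record<string, unknown>,
  latest: Record<string, unknown>,
): FieldChange[] {
  const changes: FieldChange[] = [];
  for (const field of Object.keys(latest)) {
    const oldValue = stored[field];
    const newValue = latest[field];
    if (JSON.stringify(oldValue) !== JSON.stringify(newValue)) {
      changes.push({ field, oldValue, newValue });
    }
  }
  return changes;
}

const priceDiff = diffFields(
  { name: 'ExampleTool', basePrice: 29 },
  { name: 'ExampleTool', basePrice: 39 },
);
// priceDiff contains a single entry for basePrice (29 -> 39).
```

Keeping both the old and the new value in each entry is what makes the change log usable as an audit trail later.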
User Preference-Based Filtering
Different users care about different things:
interface UserPreferences {
  budget: {
    min: number;
    max: number;
    currency: string;
  };
  requiredFeatures: string[];
  stack: string[]; // Tools they already use
  teamSize: number;
  useCase: 'email' | 'content' | 'analytics';
}

function filterByPreferences(tools: Tool[], prefs: UserPreferences): Tool[] {
  return tools
    .filter(tool => {
      // Budget filter (assumes prices are already in the user's currency)
      if (tool.pricing.basePrice > prefs.budget.max) return false;

      // Feature filter: every required feature must match a feature name
      const hasAllRequired = prefs.requiredFeatures.every(feature =>
        tool.features.some(f => f.name.toLowerCase().includes(feature.toLowerCase()))
      );
      if (!hasAllRequired) return false;

      // Integration filter: at least one tool from the user's stack must be supported
      const hasStackIntegration = prefs.stack.some(integration =>
        hasIntegration(tool, integration)
      );
      if (prefs.stack.length > 0 && !hasStackIntegration) return false;

      return true;
    })
    // Score each tool once instead of re-scoring on every sort comparison
    .map(tool => ({ tool, score: calculateToolScore(tool, prefs).finalScore }))
    .sort((a, b) => b.score - a.score)
    .map(({ tool }) => tool);
}
API Design
GET /api/tools
?category=email
&features=automation,reporting
&maxPrice=100
&integrations=salesforce,hubspot
&sort=score,price
&limit=10
Returns paginated, scored results matching user criteria.
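On the server side, those query parameters need to be parsed and sanity-checked before they reach the filter. Here is a minimal sketch using the standard `URLSearchParams` API; the `ToolQuery` shape, the `parseToolQuery` helper, and the default and maximum limits are assumptions made for illustration.

```typescript
interface ToolQuery {
  category?: string;
  features: string[];
  maxPrice?: number;
  integrations: string[];
  sort: string[];
  limit: number;
}

const DEFAULT_LIMIT = 20; // assumed default page size
const MAX_LIMIT = 100;    // assumed upper bound to protect the API

// Split a comma-separated parameter into trimmed, non-empty values.
function splitList(value: string | null): string[] {
  return value ? value.split(',').map((s) => s.trim()).filter(Boolean) : [];
}

function parseToolQuery(queryString: string): ToolQuery {
  const params = new URLSearchParams(queryString);
  const maxPriceRaw = params.get('maxPrice');
  const maxPrice = maxPriceRaw ? Number(maxPriceRaw) : NaN;
  const limit = Number(params.get('limit') ?? DEFAULT_LIMIT);

  return {
    category: params.get('category') ?? undefined,
    features: splitList(params.get('features')),
    // Drop the price filter when the value is missing, negative, or not a number.
    maxPrice: Number.isFinite(maxPrice) && maxPrice >= 0 ? maxPrice : undefined,
    integrations: splitList(params.get('integrations')),
    sort: splitList(params.get('sort')),
    // Fall back to the default for invalid limits and cap large ones.
    limit: Number.isInteger(limit) && limit > 0 ? Math.min(limit, MAX_LIMIT) : DEFAULT_LIMIT,
  };
}
```

Invalid values fall back to safe defaults here rather than returning an error; a stricter API might prefer to respond with a 400 instead.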
GET /api/tools/:toolId/comparison
?vs=tool2,tool3
Side-by-side comparison endpoint.
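The response for a comparison like this is essentially a matrix with one row per feature and one column per tool. Below is a small sketch of how that matrix could be built; `ComparedTool`, `FeatureRow`, and `buildFeatureMatrix` are hypothetical names, and matching features by exact name is a simplifying assumption.

```typescript
type Tier = 'free' | 'starter' | 'pro' | 'enterprise';

interface ComparedTool {
  id: string;
  features: { name: string; availability: Tier }[];
}

// One row per feature: tool id -> tier where it becomes available,
// or null when the tool does not list the feature at all.
type FeatureRow = Record<string, Tier | null>;

function buildFeatureMatrix(tools: ComparedTool[]): Map<string, FeatureRow> {
  const matrix = new Map<string, FeatureRow>();
  for (const tool of tools) {
    for (const feature of tool.features) {
      let row = matrix.get(feature.name);
      if (!row) {
        // First time this feature is seen: start with null for every tool.
        row = {};
        for (const t of tools) row[t.id] = null;
        matrix.set(feature.name, row);
      }
      row[tool.id] = feature.availability;
    }
  }
  return matrix;
}
```

Explicit `null` cells make gaps visible in the UI, so a missing feature reads as "not offered" rather than silently disappearing from the table.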
POST /api/tools/:toolId/report
Users report outdated information (pricing changed, feature added, etc.).
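Because these reports come from anonymous or semi-trusted users, the payload is worth validating before it reaches a moderation queue. The sketch below assumes a hypothetical request body with `kind`, `details`, and an optional `evidenceUrl`; none of these field names or limits come from a real API.

```typescript
const REPORT_KINDS = ['pricing_changed', 'feature_added', 'feature_removed', 'other'] as const;
type ReportKind = (typeof REPORT_KINDS)[number];

interface ToolReport {
  kind: ReportKind;
  details: string;
  evidenceUrl?: string;
}

type ValidationResult =
  | { ok: true; report: ToolReport }
  | { ok: false; errors: string[] };

// Validate an untrusted request body and collect all problems at once.
function validateReport(body: unknown): ValidationResult {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, errors: ['body must be a JSON object'] };
  }
  const { kind, details, evidenceUrl } = body as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof kind !== 'string' || !(REPORT_KINDS as readonly string[]).includes(kind)) {
    errors.push(`kind must be one of: ${REPORT_KINDS.join(', ')}`);
  }
  if (typeof details !== 'string' || details.trim().length === 0 || details.length > 2000) {
    errors.push('details must be a non-empty string of at most 2000 characters');
  }
  if (
    evidenceUrl !== undefined &&
    (typeof evidenceUrl !== 'string' || !/^https?:\/\//.test(evidenceUrl))
  ) {
    errors.push('evidenceUrl must be an http(s) URL when provided');
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    report: {
      kind: kind as ReportKind,
      details: (details as string).trim(),
      evidenceUrl: evidenceUrl as string | undefined,
    },
  };
}
```

Accepted reports would still be stored with a `user_report` data source and left unverified until a moderator or an API check confirms them.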
Keeping Data Fresh
Without this, your comparison platform becomes a graveyard of outdated info.
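// Daily refresh for top 50 tools
schedule.daily('0 2', async () => {
  const topTools = await db.tools.topBy('reviews', 50);
  for (const tool of topTools) {
    try {
      await updateToolData(tool.id, getSourceFor(tool));
    } catch (err) {
      // One failing source should not abort the rest of the batch
      console.error(`Refresh failed for tool ${tool.id}`, err);
    }
  }
});

// Weekly refresh for all tools
schedule.weekly('sunday 3:00', async () => {
  const allTools = await db.tools.findAll();
  for (const tool of allTools) {
    try {
      await updateToolData(tool.id, getSourceFor(tool));
    } catch (err) {
      console.error(`Refresh failed for tool ${tool.id}`, err);
    }
  }
});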
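As the catalog grows, refreshing every tool on a fixed schedule gets expensive. One possible refinement is to refresh only the tools whose data is older than some threshold, oldest first. This is a sketch under that assumption; `TrackedTool`, `selectStaleTools`, and the age threshold are hypothetical and not part of the schedule shown above.

```typescript
interface TrackedTool {
  id: string;
  lastVerified: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return the tools last verified more than `maxAgeDays` ago, oldest first,
// so the most outdated entries are refreshed before the rest.
function selectStaleTools(
  tools: TrackedTool[],
  now: Date,
  maxAgeDays: number,
): TrackedTool[] {
  const cutoff = now.getTime() - maxAgeDays * MS_PER_DAY;
  return tools
    .filter((tool) => tool.lastVerified.getTime() < cutoff)
    .sort((a, b) => a.lastVerified.getTime() - b.lastVerified.getTime());
}
```

Passing `now` in as a parameter, rather than calling `new Date()` inside the function, keeps the selection deterministic and easy to test.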
Metrics That Matter
Track:
- How many users updated which tool's data (crowdsourced accuracy)
- Which features get compared most often (real needs vs marketing claims)
- Which tools users actually choose after comparing (outcome data)
- Price change frequency (volatility indicator)
This data improves your platform over time.
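As an example of the last metric, price change frequency can be derived from the change log. The sketch below assumes a hypothetical `PriceChangeEvent` record per logged price change and normalizes the count to a 30-day month; both the shape and the normalization are illustrative choices.

```typescript
interface PriceChangeEvent {
  toolId: string;
  timestamp: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Average number of price changes per 30-day month for one tool
// within the window [windowStart, windowEnd).
function priceChangesPerMonth(
  events: PriceChangeEvent[],
  toolId: string,
  windowStart: Date,
  windowEnd: Date,
): number {
  const start = windowStart.getTime();
  const end = windowEnd.getTime();
  const days = (end - start) / MS_PER_DAY;
  if (days <= 0) return 0; // empty or inverted window

  const count = events.filter((event) => {
    const ts = event.timestamp.getTime();
    return event.toolId === toolId && ts >= start && ts < end;
  }).length;

  return count / (days / 30);
}
```

A tool with three logged price changes over a 90-day window scores 1 change per month, which can then be bucketed into a volatility label.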
The Real Challenge
Building the architecture is straightforward. The hard part is keeping data accurate while scaling. Crowdsourced updates help, but need moderation. API access helps for structured data, but you'll always have gaps.
AI comparison platforms work best when they embrace this: perfect data is impossible, but being transparent about uncertainty is better than claiming accuracy you don't have.