DEV Community

SIKOUTRIS

How We Built a Scoring Algorithm for Comparing 200+ AI Business Tools

The AI tools market for businesses is a mess. Every week another startup launches with claims of "revolutionary AI" that will "transform your workflow." When we started building AI Business Compare, the technical challenge was not just listing tools — it was making the comparisons actually useful.

Here is how we approached the engineering side of building a B2B AI tool comparison platform.

The Data Problem

Our first prototype was a spreadsheet with 50 tools and 20 columns. It took about a week before that became unmaintainable. The core issue: AI business tools span wildly different categories (writing, analytics, customer support, code generation, HR automation) and comparing them requires category-specific criteria.

A writing assistant and a data analytics platform both use AI, but comparing them on the same axes makes no sense.

Category-Aware Scoring Architecture

We settled on a hierarchical scoring system. Each tool belongs to one or more categories, and each category defines its own scoring dimensions:

class ScoringEngine {
    private array $categoryWeights = [
        "writing" => [
            "output_quality" => 0.35,
            "language_support" => 0.20,
            "integration_count" => 0.15,
            "pricing_value" => 0.20,
            "api_availability" => 0.10
        ],
        "analytics" => [
            "data_connectors" => 0.25,
            "visualization" => 0.20,
            "real_time" => 0.20,
            "pricing_value" => 0.20,
            "api_availability" => 0.15
        ]
    ];

    public function score(Tool $tool, string $category): float {
        if (!isset($this->categoryWeights[$category])) {
            throw new InvalidArgumentException("Unknown category: {$category}");
        }
        $weights = $this->categoryWeights[$category];
        $total = 0;
        foreach ($weights as $dimension => $weight) {
            $total += $weight * $tool->getNormalizedScore($dimension);
        }
        return round($total, 2);
    }
}

The normalization step is critical. Raw values like "supports 15 languages" or "has 42 integrations" mean nothing until normalized against the category average. A tool with 42 integrations might be exceptional in HR automation but average in marketing.
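The article does not spell out the normalization scheme, so here is a minimal sketch assuming min-max scaling against category peers (one reasonable choice among several):

```python
def normalize(value: float, peer_values: list[float]) -> float:
    """Scale a raw metric to 0..1 relative to tools in the same category."""
    lo, hi = min(peer_values), max(peer_values)
    if hi == lo:
        return 0.5  # all peers identical; treat as average
    return (value - lo) / (hi - lo)

# 42 integrations is mid-pack among these (made-up) marketing tools,
# but best-in-class among the HR automation peers:
marketing_integrations = [10, 42, 80, 60, 25]
hr_integrations = [5, 12, 42, 8, 15]

print(normalize(42, marketing_integrations))
print(normalize(42, hr_integrations))
```

The same raw number produces very different scores depending on the peer group, which is exactly why cross-category comparison on raw values fails.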

Pricing Normalization: Harder Than It Sounds

Every SaaS vendor has a different pricing model. Some charge per seat, some per API call, some per output token, some have flat monthly fees with usage caps. Making these comparable required building a "pricing scenario" engine:

CREATE TABLE pricing_scenarios (
    id INT PRIMARY KEY,
    tool_id INT,
    scenario_name VARCHAR(100),  -- e.g., "solo_freelancer", "team_10", "enterprise_100"
    monthly_cost DECIMAL(10,2),
    usage_limit VARCHAR(200),
    overage_rate DECIMAL(10,4),
    last_verified DATE
);

We define standard personas (solo freelancer, 10-person team, 100-person enterprise) and calculate what each tool would actually cost for that persona. This turns "starting at $29/mo" into something genuinely comparable.
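A sketch of the persona cost calculation. The field names (`flat_fee`, `per_seat_price`, `included_usage`, `overage_rate`) are illustrative, not the platform's actual schema:

```python
def monthly_cost(pricing: dict, seats: int, expected_usage: int) -> float:
    """Estimate what a persona would actually pay per month."""
    cost = pricing.get("flat_fee", 0.0)
    cost += pricing.get("per_seat_price", 0.0) * seats
    # Charge overage only for usage beyond the plan's included cap.
    overage = max(0, expected_usage - pricing.get("included_usage", 0))
    cost += overage * pricing.get("overage_rate", 0.0)
    return round(cost, 2)

# "Starting at $29/mo" for a 10-person team that exceeds the usage cap:
tool_pricing = {"flat_fee": 29.0, "included_usage": 50_000, "overage_rate": 0.001}
print(monthly_cost(tool_pricing, seats=10, expected_usage=120_000))  # 99.0
```

Running the same personas through every tool's pricing model is what turns a marketing headline price into a comparable number.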

The Freshness Challenge

AI tools update their pricing and features constantly. We needed a system that could detect when our data was stale.

The solution: a combination of scheduled scraping and change detection.

import hashlib

def check_pricing_page(tool):
    # MD5 is fine here: we only need change detection, not security.
    current_hash = hashlib.md5(
        fetch_page(tool.pricing_url).encode()
    ).hexdigest()

    if current_hash != tool.last_pricing_hash:
        flag_for_review(tool, "pricing_page_changed")
        tool.last_pricing_hash = current_hash
        tool.save()

When a pricing page changes, we do not automatically update the data — that would be fragile. Instead, we flag it for manual review. A human verifies the new pricing and updates the structured data. This gives us accuracy over speed, which matters more for purchase decisions.

Feature Matrix Rendering

Comparison tables with 30+ rows and 5+ columns are a UX nightmare on mobile. We went through three iterations:

V1: Standard HTML table with horizontal scroll. Functional but ugly.

V2: Accordion-style collapsible sections grouped by feature category. Better on mobile, but users lost the side-by-side visual.

V3 (current): A hybrid approach. On desktop, full comparison table with sticky headers. On mobile, a swipeable card interface where you see two tools at a time with a feature-by-feature breakdown.

The mobile implementation uses CSS scroll-snap:

.comparison-cards {
    display: flex;
    overflow-x: auto;
    scroll-snap-type: x mandatory;
    -webkit-overflow-scrolling: touch;
}

.tool-card {
    min-width: 85vw;
    scroll-snap-align: center;
    flex-shrink: 0;
}

No JavaScript framework needed. Pure CSS handles the interaction, and it performs well even on older devices.

API Design for Embeddable Widgets

Some B2B blogs wanted to embed our comparison widgets. We built a simple read-only API:

GET /api/compare?tools=jasper,copy-ai,writesonic&category=writing

Returns a JSON payload with normalized scores, pricing for each persona, and a pre-rendered HTML snippet they can iframe. Rate-limited to 100 requests per hour per API key, which is generous enough for most embed use cases.

Infrastructure Decisions

We run on shared hosting behind Cloudflare. The entire site is effectively a static site that regenerates every 6 hours via cron. PHP renders the pages, but the output is cached at the Cloudflare edge.

Why not a static site generator? Because our data model has enough complexity (multi-category tools, pricing scenarios, dynamic filtering) that building it as a true SSG would mean a complex build pipeline. PHP with aggressive caching gives us the same performance with less tooling overhead.

What I Would Do Differently

If starting over today:

  1. Start with fewer categories. We launched with 12 categories and struggled to maintain data quality across all of them. Starting with 3-4 well-maintained categories would have been smarter.
  2. Build the change detection system first. Stale data erodes trust faster than missing data.
  3. Invest in structured data earlier. We added JSON-LD schema markup six months in. Should have been day one — it significantly improved our search visibility.

The Business Angle

From a technical perspective, comparison sites are deceptively simple to prototype and genuinely hard to maintain well. The moat is not the code — it is the data accuracy and freshness. Anyone can build a comparison table. Keeping it current across 200+ tools that each update quarterly is the real engineering challenge.

That ongoing maintenance is what makes aibusinesscompare.com valuable. The code is maybe 20% of the effort. The other 80% is the data pipeline and verification process.


If you are building comparison tools for any niche, happy to discuss architecture decisions — drop a comment below.
