Yugant Hadiyal

Posted on May 27

I Ranked 171 AI Agents by Trust — Here's What I Found About Safety and Transparency

#ai #opensource #security #showdev

AI agents are everywhere. LangChain, AutoGPT, CrewAI, Dify, n8n — there are hundreds of open-source agent frameworks now, and the list keeps growing.

But here's the question nobody is asking: which ones can you actually trust?

I spent the last month building HVTracker, an open trust registry that scores 171 AI agents across five dimensions: Activity, Adoption, Transparency, Safety, and Identity.

This post shares what I found.

The Problem

Most developers pick an AI agent framework based on GitHub stars and vibes. Stars measure popularity, not trustworthiness. A repo with 100K stars can still have:

No security policy
No signed commits
No supply chain provenance
A failing OSSF Scorecard
No license at all

Stars tell you what's trendy. They don't tell you what's safe to deploy in production.

How HVTrust Scoring Works

Every agent gets a composite trust score from 0–100 across five dimensions:

| Dimension | Max Points | What It Measures |
| --- | --- | --- |
| Activity | 25 | Recent commits, release freshness |
| Adoption | 20 | GitHub stars, npm/PyPI downloads |
| Transparency | 20 | License, docs, OSSF Scorecard |
| Safety | 20 | OSSF score, provenance, signed commits |
| Identity | 15 | Verification status, evidence coverage |
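To make the arithmetic concrete, here is a minimal sketch of how the five dimensions could roll up into the composite. It assumes the composite is a plain sum of per-dimension points, each clamped to its cap from the table. The function name, the clamping, and the one-decimal rounding are my illustration here, not a copy of the HVTracker source, and how each dimension earns its points is out of scope.

```python
from typing import Mapping

# Per-dimension caps, taken from the table above (they sum to 100).
MAX_POINTS = {
    "activity": 25,
    "adoption": 20,
    "transparency": 20,
    "safety": 20,
    "identity": 15,
}


def composite_score(points: Mapping[str, float]) -> float:
    """Sum the dimension points, clamping each one to [0, cap].

    Assumption: a missing dimension contributes 0 points.
    """
    total = 0.0
    for name, cap in MAX_POINTS.items():
        value = points.get(name, 0.0)
        total += min(max(value, 0.0), cap)
    return round(total, 1)
```

Under these assumptions an agent that maxes out every dimension lands on exactly 100, and an out-of-range input for one dimension cannot inflate the total past that dimension's cap.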

Each agent also gets an Evidence Grade (A through D) based on how many independent signal types we could verify:

Grade A: 4+ signal types (GitHub + downloads + scorecard + provenance)
Grade B: 3 signal types
Grade C: 2 signal types
Grade D: GitHub only
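The grade thresholds above map directly to a small lookup. This is a sketch, not the project's actual code: the function name and the signal-type labels are hypothetical, and I'm assuming anything with fewer than two verified signal types falls into Grade D.

```python
from typing import Iterable


def evidence_grade(signal_types: Iterable[str]) -> str:
    """Map the number of distinct verified signal types to a grade A-D."""
    count = len(set(signal_types))  # duplicates count once
    if count >= 4:
        return "A"
    if count == 3:
        return "B"
    if count == 2:
        return "C"
    return "D"  # assumption: GitHub only (or nothing verified)
```

For example, `evidence_grade(["github", "downloads", "scorecard"])` returns `"B"`.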

Surprising Findings

A high star count doesn't mean high trust. Several agents with 100K+ stars scored below 50/100 on trust because they lack basic security hygiene.

Transparency is the weakest dimension across the board. Most agents have a license and README, but very few have an OSSF Scorecard. Signed commits and provenance attestations, which feed the Safety score, are just as rare.

Smaller projects sometimes score higher on safety. Projects that adopted Sigstore, SLSA provenance, or GitHub's artifact attestations early tend to outperform larger projects that grew before these tools existed.

Only a handful of agents achieve Grade A evidence. Most sit at Grade B or C — meaning we can only partially verify their trust signals from independent sources.

What Signals We Track

HVTracker pulls data from multiple independent sources, with a refresh batch running every 4 hours:

GitHub API — stars, forks, commits, license, last push date
npm / PyPI — weekly downloads, provenance attestations
OSSF Scorecard (via deps.dev) — security practices score
GitHub Search API — fingerprint-based public actions
Algolia HN API — Hacker News mentions in the last 30 days

All signals refresh automatically via staggered GitHub Actions cron jobs — 6 batches per day, full cycle in 24 hours.
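As a rough illustration of the staggering, here is one way to split the registry into 6 batches spaced evenly over 24 hours. The round-robin assignment, the function names, and the start hours are assumptions for this sketch. The real schedule lives in the GitHub Actions workflows and may partition agents differently.

```python
from typing import Iterable, List


def assign_batches(agent_ids: Iterable[str], n_batches: int = 6) -> List[List[str]]:
    """Spread agents across batches round-robin, in a stable sorted order."""
    batches: List[List[str]] = [[] for _ in range(n_batches)]
    for index, agent_id in enumerate(sorted(agent_ids)):
        batches[index % n_batches].append(agent_id)
    return batches


def batch_start_hours(n_batches: int = 6, cycle_hours: int = 24) -> List[int]:
    """Evenly spaced start hours, so one full cycle fits in cycle_hours."""
    step = cycle_hours // n_batches
    return [i * step for i in range(n_batches)]
```

With 171 agents and 6 batches, this gives three batches of 29 and three of 28, starting 4 hours apart, so each agent is refreshed once per 24-hour cycle.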

It's Fully Open

The full dataset is CC BY 4.0: hvtracker.net/data/latest.json
The scoring methodology is documented: hvtracker.net/methodology
The source code is on GitHub: github.com/YugantM/hvtracker
Every agent has an individual profile page with all raw signals

There's no login, no tracking, no backend — it's a static site on GitHub Pages.

Embeddable Trust Badges

Example badges for LangChain: HVTrust 85.0, Grade B

See them live: hvtracker.net/badge/langchain.svg

Embed them in your README:

[![HVTrust](https://hvtracker.net/badge/YOUR-AGENT.svg)](https://hvtracker.net/agents/YOUR-AGENT)
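If you maintain several agents, you can generate the snippet instead of hand-editing it. This small helper is a sketch that just fills the URL pattern shown above. The function name is mine, and it assumes the slug you pass matches the one in your agent's profile URL.

```python
def badge_markdown(slug: str) -> str:
    """Build the README badge snippet for an agent slug such as "langchain"."""
    badge_url = f"https://hvtracker.net/badge/{slug}.svg"
    profile_url = f"https://hvtracker.net/agents/{slug}"
    return f"[![HVTrust]({badge_url})]({profile_url})"


if __name__ == "__main__":
    print(badge_markdown("langchain"))
```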

What's Next

I'm working on:

Agent comparison tool (compare 2–3 agents side by side)
7-day trust trend indicators
Agent submission via GitHub Issues
Reputation event history (track trust changes over time)

Try It

Browse the registry: hvtracker.net

Find your favorite agent. Check its trust score. You might be surprised.

I'd love feedback on the scoring methodology — especially whether the dimension weights feel right to you. Drop a comment or open an issue on GitHub.

Built solo as an open-source project. If you find it useful, a star on GitHub would mean a lot.
