AI agents are everywhere. LangChain, AutoGPT, CrewAI, Dify, n8n — there are hundreds of open-source agent frameworks now, and the list keeps growing.
But here's the question nobody is asking: which ones can you actually trust?
I spent the last month building HVTracker, an open trust registry that scores 171 AI agents across five dimensions: Activity, Adoption, Transparency, Safety, and Identity.
This post shares what I found.
The Problem
Most developers pick an AI agent framework based on GitHub stars and vibes. Stars measure popularity, not trustworthiness. A repo with 100K stars can still have:
- No security policy
- No signed commits
- No supply chain provenance
- A failing OSSF Scorecard
- No license at all
Stars tell you what's trendy. They don't tell you what's safe to deploy in production.
How HVTrust Scoring Works
Every agent gets a composite trust score from 0–100 across five dimensions:
| Dimension | Max Points | What It Measures |
|---|---|---|
| Activity | 25 | Recent commits, release freshness |
| Adoption | 20 | GitHub stars, npm/PyPI downloads |
| Transparency | 20 | License, docs, OSSF Scorecard |
| Safety | 20 | OSSF score, provenance, signed commits |
| Identity | 15 | Verification status, evidence coverage |
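To make the arithmetic concrete, here is a minimal sketch of how a composite score could be assembled from the table above. It assumes the composite is a plain sum of per-dimension points, each clamped to its maximum; the function name `composite_score` and the dictionary keys are hypothetical, and how each dimension's points are derived from raw signals is not covered here (see the methodology page for that).

```python
# Maximum points per dimension, copied from the table above.
MAX_POINTS = {
    "activity": 25,
    "adoption": 20,
    "transparency": 20,
    "safety": 20,
    "identity": 15,
}


def composite_score(points: dict) -> float:
    """Sum per-dimension points, clamping each to the range [0, max].

    Assumption: the composite is a simple sum. Missing dimensions count as 0.
    """
    total = 0.0
    for name, cap in MAX_POINTS.items():
        value = points.get(name, 0.0)
        total += min(max(value, 0.0), cap)
    return round(total, 1)


# Made-up dimension values for illustration only (not a real agent's breakdown).
example = {
    "activity": 22.5,
    "adoption": 18,
    "transparency": 15,
    "safety": 14.5,
    "identity": 15,
}
print(composite_score(example))  # 85.0
```

Because the five maxima add up to 100, an agent that maxes out every dimension lands exactly at 100 under this assumption.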
Each agent also gets an Evidence Grade (A through D) based on how many independent signal types we could verify:
- Grade A: 4+ signal types (GitHub + downloads + scorecard + provenance)
- Grade B: 3 signal types
- Grade C: 2 signal types
- Grade D: GitHub only
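The grade thresholds above map to a small lookup. This is a sketch under stated assumptions: the signal-type labels (`"github"`, `"downloads"`, and so on) and the function name `evidence_grade` are hypothetical, and an agent with no verified signals at all is treated as Grade D, which the list above does not spell out.

```python
def evidence_grade(signal_types: set) -> str:
    """Map the number of independently verified signal types to a grade."""
    count = len(signal_types)
    if count >= 4:
        return "A"
    if count == 3:
        return "B"
    if count == 2:
        return "C"
    # One signal type (GitHub only). Zero is also mapped here by assumption.
    return "D"


print(evidence_grade({"github", "downloads", "scorecard"}))  # B
```

Using a set means a signal type that shows up twice (for example, both npm and PyPI downloads under one `"downloads"` label) is only counted once.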
Surprising Findings
A high star count doesn't mean high trust. Several agents with 100K+ stars scored below 50/100 on trust because they lack basic security hygiene.
Transparency is the weakest dimension across the board. Most agents have a license and a README, but very few have an OSSF Scorecard, and signed commits and provenance attestations (which count toward Safety) are just as rare.
Smaller projects sometimes score higher on safety. Projects that adopted Sigstore, SLSA provenance, or GitHub's artifact attestations early tend to outperform larger projects that grew before these tools existed.
Only a handful of agents achieve Grade A evidence. Most sit at Grade B or C — meaning we can only partially verify their trust signals from independent sources.
What Signals We Track
HVTracker pulls data from multiple independent sources every 4 hours:
- GitHub API — stars, forks, commits, license, last push date
- npm / PyPI — weekly downloads, provenance attestations
- OSSF Scorecard (via deps.dev) — security practices score
- GitHub Search API — fingerprint-based public actions
- Algolia HN API — Hacker News mentions in the last 30 days
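As one example of these signals, a 30-day Hacker News mention count can be requested from the public Algolia HN Search API with a `created_at_i` numeric filter. The sketch below only builds the request URL. The helper name `hn_mentions_url` and the choice to count stories only (`tags=story`) are assumptions, not necessarily what HVTracker does.

```python
import time
from typing import Optional
from urllib.parse import urlencode

HN_SEARCH_URL = "https://hn.algolia.com/api/v1/search"


def hn_mentions_url(query: str, days: int = 30, now: Optional[float] = None) -> str:
    """Build a search URL for stories mentioning `query` in the last `days` days."""
    now = time.time() if now is None else now
    cutoff = int(now) - days * 86_400  # Unix timestamp `days` days ago
    params = {
        "query": query,
        "tags": "story",  # assumption: count stories, not comments
        "numericFilters": f"created_at_i>{cutoff}",
    }
    return f"{HN_SEARCH_URL}?{urlencode(params)}"


print(hn_mentions_url("langchain"))
```

Fetching that URL returns JSON whose `nbHits` field holds the number of matching items, which is the figure a mention counter would read.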
All signals refresh automatically via staggered GitHub Actions cron jobs: one batch every 4 hours, 6 batches per day, so every agent is refreshed once per 24-hour cycle.
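The staggering can be pictured as splitting the registry into six groups and running one group per cron slot. This is a minimal sketch, assuming round-robin assignment over sorted agent IDs and slots starting at midnight UTC; the real batch assignment and cron times are not stated in this post, and the agent IDs below are placeholders.

```python
def assign_batches(agent_ids: list, n_batches: int = 6) -> list:
    """Distribute agents round-robin so batch sizes differ by at most one."""
    batches = [[] for _ in range(n_batches)]
    for index, agent_id in enumerate(sorted(agent_ids)):
        batches[index % n_batches].append(agent_id)
    return batches


def batch_start_hours(n_batches: int = 6) -> list:
    """Evenly spaced start hours across a 24-hour day (assumed to begin at 00:00)."""
    interval = 24 // n_batches
    return [slot * interval for slot in range(n_batches)]


# Placeholder IDs standing in for the 171 agents in the registry.
agents = [f"agent-{n:03d}" for n in range(171)]
print([len(batch) for batch in assign_batches(agents)])  # [29, 29, 29, 28, 28, 28]
print(batch_start_hours())  # [0, 4, 8, 12, 16, 20]
```

Spreading the work this way keeps each run small, which helps when the upstream APIs are rate limited.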
It's Fully Open
- The full dataset is CC BY 4.0: hvtracker.net/data/latest.json
- The scoring methodology is documented: hvtracker.net/methodology
- The source code is on GitHub: github.com/YugantM/hvtracker
- Every agent has an individual profile page with all raw signals
There's no login, no tracking, no backend — it's a static site on GitHub Pages.
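Since the dataset is a plain JSON file, it can be pulled with nothing but the standard library. The sketch below deliberately avoids guessing the schema: it assumes only that the URL above is served over HTTPS and returns valid JSON, and the helper names `fetch_dataset` and `describe` are hypothetical.

```python
import json
from urllib.request import urlopen

DATASET_URL = "https://hvtracker.net/data/latest.json"


def fetch_dataset(url: str = DATASET_URL, timeout: float = 10.0):
    """Download and parse the dataset. Requires network access."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def describe(payload) -> str:
    """Summarize the top-level shape of a parsed JSON value."""
    if isinstance(payload, dict):
        return "object with keys: " + ", ".join(sorted(payload))
    if isinstance(payload, list):
        return f"array with {len(payload)} items"
    return f"scalar of type {type(payload).__name__}"


# Typical use (not run here because it needs the network):
#   print(describe(fetch_dataset()))
```

Inspecting the top-level shape first is a cheap way to find the field names before writing any analysis code against them.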
Embeddable Trust Badges
Example badges for LangChain show its trust score (HVTrust: 85.0) and its evidence grade (Grade: B).
See them live: hvtracker.net/badge/langchain.svg
Embed them in your README:
[![HVTrust badge](https://hvtracker.net/badge/YOUR-AGENT.svg)](https://hvtracker.net/agents/YOUR-AGENT)
What's Next
I'm working on:
- Agent comparison tool (compare 2–3 agents side by side)
- 7-day trust trend indicators
- Agent submission via GitHub Issues
- Reputation event history (track trust changes over time)
Try It
Browse the registry: hvtracker.net
Find your favorite agent. Check its trust score. You might be surprised.
I'd love feedback on the scoring methodology — especially whether the dimension weights feel right to you. Drop a comment or open an issue on GitHub.
Built solo as an open-source project. If you find it useful, a star on GitHub would mean a lot.