TrustStar

Posted on Jun 2

We Scanned 100 AI Repos on GitHub. Here's What We Found.

#opensource #ai #security #github

A drone firmware project with 3× more stars than the real one. A crypto protocol that turned GitHub into a points farm. A README with 6,289 stars and 2 commits.

As a developer turned architect, I used to treat GitHub stars as a proxy for trust. More stars meant more legitimate, fewer reasons to question before cloning. That instinct got me thinking. So I built TrustStar, audited hundreds of repos, and found that some people had figured out that instinct before me.

Here's what the data showed.

Case 1: The Airdrop Farm (QuipNetwork)

🔴 DANGEROUS

Repository	Stars	Forks	Fork/Star ratio
hashsigs-py	11,200	9	0.0008
hashsigs-rs	11,300	42	0.0037
hashsigs-ts	11,300	31	0.0027
hashsigs-solidity	11,300	33	0.003
quip-protocol	11,645	159	0.014
ethereum-sdk	~11,400	72	0.006
cpp-sdk	~11,300	44	0.004

Seven repos spanning Python, Rust, TypeScript, Solidity, and C++, all landing within a few hundred stars of 11,300. Projects with genuinely different audiences don't do that.
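The numbers in the table can be rechecked in a few lines. The sketch below recomputes the fork/star ratios and adds one extra measure, the relative spread of star counts across the organization. That spread metric is an illustrative assumption, not something TrustStar is stated to use, and the "~" values are taken at face value.

```python
from statistics import mean, pstdev

# (stars, forks) copied from the table above; "~" values taken at face value.
repos = {
    "hashsigs-py": (11_200, 9),
    "hashsigs-rs": (11_300, 42),
    "hashsigs-ts": (11_300, 31),
    "hashsigs-solidity": (11_300, 33),
    "quip-protocol": (11_645, 159),
    "ethereum-sdk": (11_400, 72),
    "cpp-sdk": (11_300, 44),
}


def fork_star_ratio(stars: int, forks: int) -> float:
    """Forks per star; 0.0 when the repo has no stars."""
    return forks / stars if stars else 0.0


def relative_spread(values: list[int]) -> float:
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


ratios = {name: fork_star_ratio(s, f) for name, (s, f) in repos.items()}
star_spread = relative_spread([s for s, _ in repos.values()])

for name, ratio in ratios.items():
    print(f"{name:18s} {ratio:.4f}")
print(f"relative spread of star counts: {star_spread:.3f}")
```

A spread this small across repos in unrelated languages is the convergence described above, expressed as a single number.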

The mechanism was on their own website: "Each GitHub repo star earns 5 QUIP points."

QuipNetwork launched a crypto airdrop in early February 2026. Users who wanted QUIP tokens starred every repo in the organization. 11,000 stars in 48 hours, after five months of zero activity.

The tell: dashboard.quip.network has 2 stars. nodes.quip.network has 2 stars. The repos they forgot to include in the airdrop show the real numbers.

As far as we can tell, this is the first documented instance of a crypto airdrop using GitHub as a gamification layer. These aren't bots. They're real users who just wanted tokens.

Case 2: The Typosquat (ShlkOfTheRa/scarab-osd)

🔴 DANGEROUS. The most dangerous case in this dataset.

ShikOfTheRa/scarab-osd is a legitimate drone flight controller firmware project. 468 stars, built over 10 years.

ShlkOfTheRa/scarab-osd, one character different, was created March 3, 2026. Byte-for-byte identical code. Twelve days later, 1,485 stars purchased in a 90-minute window.

From the API:

88.9% lockstep ratio on page 1 (100 stars in 6.5 minutes)
91.9% on page 2
Bot accounts confirmed: alborto8alalfsdfddfg, abdalyafei20233-prog
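The post does not define "lockstep ratio", so the following is a minimal sketch under an assumed definition: the share of consecutive stars that arrive within a few seconds of the previous one. The 5-second default mirrors the "under 5 seconds" figure used in Case 3 and is otherwise arbitrary. The input shape follows GitHub's stargazers endpoint when the `application/vnd.github.star+json` media type is requested, which adds a `starred_at` timestamp to each entry; the sample page itself is synthetic.

```python
from datetime import datetime


def parse_starred_at(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2026-01-01T00:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def lockstep_ratio(page: list[dict], max_gap_seconds: float = 5.0) -> float:
    """Share of consecutive star pairs at most `max_gap_seconds` apart.

    ASSUMPTION: this definition and the 5-second default are illustrative;
    the post does not state how TrustStar computes the figure.
    """
    times = sorted(parse_starred_at(item["starred_at"]) for item in page)
    if len(times) < 2:
        return 0.0
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return sum(gap <= max_gap_seconds for gap in gaps) / len(gaps)


# Synthetic page (made-up users and times) shaped like a stargazers response.
sample_page = [
    {"starred_at": "2026-01-01T00:00:00Z", "user": {"login": "user-a"}},
    {"starred_at": "2026-01-01T00:00:02Z", "user": {"login": "user-b"}},
    {"starred_at": "2026-01-01T00:00:03Z", "user": {"login": "user-c"}},
    {"starred_at": "2026-01-01T00:10:00Z", "user": {"login": "user-d"}},
    {"starred_at": "2026-01-01T00:10:04Z", "user": {"login": "user-e"}},
]

print(lockstep_ratio(sample_page))  # 3 of the 4 gaps are within 5 seconds
```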

The typosquat now has 3× more stars than the 10-year-old original. It shows up first in GitHub search. No backdoor in the code. The attack is subtler: a developer finds the typosquat, trusts it because of the stars, and installs firmware on their drone from an account created 8 weeks ago.

Three more typosquats in the same cluster (dRoninFlight/dRonin, INAVFlights/inav, MultiWiii/baseflight), all created in the same 48-hour window, all targeting drone firmware.
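The one-character difference between ShikOfTheRa and ShlkOfTheRa can be caught mechanically. Here is a minimal sketch using Levenshtein edit distance to compare an unfamiliar owner name against one you already trust. The helper names and the `max_distance` threshold are hypothetical, and nothing here is claimed to be how TrustStar detects typosquats.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def looks_like_typosquat(candidate: str, trusted: str, max_distance: int = 2) -> bool:
    """True when the names differ, but only by a few edits (case-insensitive).

    ASSUMPTION: max_distance=2 is an arbitrary illustrative threshold.
    """
    distance = edit_distance(candidate.lower(), trusted.lower())
    return 0 < distance <= max_distance


print(edit_distance("ShikOfTheRa", "ShlkOfTheRa"))        # 1
print(looks_like_typosquat("ShlkOfTheRa", "ShikOfTheRa"))  # True
```

A near-match on the owner name is only a prompt to look closer. Account age and commit history are what separate a rename or a fork from a squat.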

This is a supply chain attack vector.

Case 3: The PPT Factory (op7418/guizang-ppt-skill)

🔴 DANGEROUS. Created April 23, 2026. 12,919 stars.

Raw timestamps from the GitHub API, page 2 of stargazers:
01:56:09 Leungggggg → 01:56:10 quhalamatt (1 second)
01:56:10 quhalamatt → 01:56:10 atopsnow (0 seconds) ← simultaneous
01:55:48 WSGsety → 01:55:49 KarenD006 (1 second)
01:58:34 zhan55-png → 01:58:35 chrisq47 (1 second)

Two accounts starred the same repo in the same second, which points to a multi-threaded bot rather than human traffic. In that window, 27.3% of consecutive star pairs were under 5 seconds apart.
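Same-second stars are easy to surface once the timestamps are in hand. The sketch below groups the events listed above by timestamp and keeps only the seconds shared by more than one account. The function name is hypothetical, and the input is limited to the handful of pairs quoted in this post, not the full stargazer page.

```python
from collections import defaultdict

# (time, login) events taken from the timestamp pairs listed above.
events = [
    ("01:56:09", "Leungggggg"),
    ("01:56:10", "quhalamatt"),
    ("01:56:10", "atopsnow"),
    ("01:55:48", "WSGsety"),
    ("01:55:49", "KarenD006"),
    ("01:58:34", "zhan55-png"),
    ("01:58:35", "chrisq47"),
]


def same_second_groups(events: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each timestamp to its logins, keeping only timestamps shared by 2+ accounts."""
    by_second = defaultdict(list)
    for timestamp, login in events:
        by_second[timestamp].append(login)
    return {ts: logins for ts, logins in by_second.items() if len(logins) > 1}


collisions = same_second_groups(events)
print(collisions)  # {'01:56:10': ['quhalamatt', 'atopsnow']}
```

Reproducing the 27.3% figure would need every star in the window, which this excerpt does not include.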

The owner is a legitimate designer with a real portfolio. This isn't deception for its own sake. It's buying visibility in a market where stars are the primary discovery mechanism.

What the rest of the dataset shows

All 10 CAUTION repos have 0 commits per week. Growth without activity is the clearest signal of artificial inflation.
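"Growth without activity" reduces to a simple check once weekly star gains and weekly commit counts are available. Star gains can be bucketed from `starred_at` timestamps, and GitHub's repository statistics endpoints expose weekly commit counts. This is a sketch only: the function name and the 100-star floor are hypothetical and are not TrustStar parameters.

```python
def growth_without_activity(
    weekly_new_stars: list[int],
    weekly_commits: list[int],
    min_new_stars: int = 100,
) -> bool:
    """True when the window gained at least `min_new_stars` stars but had no commits.

    ASSUMPTION: both lists cover the same weeks, and the 100-star floor is
    a hypothetical threshold chosen for illustration.
    """
    if len(weekly_new_stars) != len(weekly_commits):
        raise ValueError("both series must cover the same weeks")
    return sum(weekly_new_stars) >= min_new_stars and sum(weekly_commits) == 0


# Made-up four-week windows: a star burst with no commits, then the same
# burst alongside ordinary development.
print(growth_without_activity([0, 4000, 2000, 10], [0, 0, 0, 0]))    # True
print(growth_without_activity([0, 4000, 2000, 10], [3, 12, 7, 5]))   # False
```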

6 of those 10 had bursts in May 2026, the month of this analysis. This isn't historical. It's happening now.

Legitimate repos score cleanly: huggingface/transformers 88, open-webui 92, ray-project/ray 93, langchain 88. The signal is specific.

Methodology

TrustStar scores repos across four weighted dimensions: Account Quality (26%), Temporal Behavior (23%), Project Health (26%), Authenticity (25%). The approach builds on He et al., "Six Million (Suspected) Fake Stars in GitHub", ICSE 2026, arXiv:2412.13459.
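The four weights sum to 100%, so one plausible reading is a weighted average of per-dimension sub-scores. The sketch below assumes exactly that: each dimension is scored 0 to 100 and combined linearly. The post gives only the weights, so the key names, the 0 to 100 scale, and the linear combination are all assumptions, and the sub-scores in the example are made up.

```python
# Weights in percent, as stated in the post. Key names are hypothetical.
WEIGHTS = {
    "account_quality": 26,
    "temporal_behavior": 23,
    "project_health": 26,
    "authenticity": 25,
}


def composite_score(sub_scores: dict[str, float]) -> float:
    """Weighted average of four sub-scores, each expected in the range 0-100.

    ASSUMPTION: a linear combination; the post does not say how TrustStar
    actually aggregates the dimensions.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS) / 100


# Made-up sub-scores, for illustration only.
example = {
    "account_quality": 90,
    "temporal_behavior": 80,
    "project_health": 100,
    "authenticity": 70,
}
print(composite_score(example))  # 85.3
```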

All DANGEROUS labels were verified directly from the GitHub API. No case relies on secondary sources. Full dataset at truststar.co. GitHub repo coming soon. API available.

Audit any repo on TrustStar →

All data is publicly available and reproducible. No affiliation with the repositories mentioned. DANGEROUS labels required a minimum of 8 convergent signals.

Top comments (1)

Mike Czerwinski • Jun 25

The convergent-signals threshold (minimum 8) is the load-bearing piece, because it's what flips this from "narrative pattern-matching" into a falsifiable test. Any single signal can be argued. Eight signals authored by GitHub's own API, all pointing the same direction, can't be rewritten by the repo owner without rewriting GitHub itself. That's the audit equivalent of a re-checkable artifact. The evidence doesn't belong to the entity being evaluated.

The scarab-osd case is the one I keep coming back to. Star count as discovery proxy is a wall when it's load-bearing and paint when it isn't. For drone firmware, "first in search results" is load-bearing, which is exactly the surface a typosquat optimizes for. The defender has to know the discriminator (commit history, account age, fork/star ratio) before the install command runs. Once the firmware is on the drone, the trust decision has already been made.

Curious whether the same methodology ports to other discovery-trust surfaces. npm downloads, PyPI installs, Hugging Face likes. All have the same shape: aggregated signal authored partly by gameable behavior, used as proxy for legitimacy by downstream consumers.