DEV Community

DropThe

How We Score 440,000 Coffee Shops Using Data Completeness

Every directory faces the same problem: how do you rank places when you have no user reviews yet?

When we built CoffeeTrove, a coffee discovery platform indexing 440K+ cafes worldwide, we needed a scoring system that works from day one -- before a single user rates anything.

The Golden Drop Score

Our approach: score data completeness, not opinions.

Every cafe starts at 0 and earns points for each verified data field:

Data Field                  | Points | Rationale
----------------------------|--------|---------------------------
Has name + coordinates      | 10     | Baseline existence
Opening hours present       | 8      | Actionable for visitors
Phone or website            | 5      | Contactable
Photos available            | 7      | Visual confirmation
Wheelchair accessible noted | 5      | Accessibility matters
Internet speed data         | 5      | Nomad-critical
Specialty coffee tagged     | 5      | Enthusiast signal
Independent (not chain)     | 10     | Bonus for local businesses

Max possible: 45 points from data fields + the 10-point independent bonus = 55 total.
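
To make the point table concrete, here is a minimal Python sketch of the same arithmetic. The dictionary keys (`opening_hours`, `photos`, `lon`, and so on) are hypothetical field names chosen for illustration, not CoffeeTrove's actual schema, and the rule for what counts as "present" is an assumption.

```python
from typing import Any, Mapping

# Points for single-field checks, copied from the table above.
# The keys are hypothetical field names, not the real schema.
FIELD_POINTS = {
    "opening_hours": 8,
    "photos": 7,
    "wheelchair_accessible": 5,
    "internet_speed": 5,
    "specialty_coffee": 5,
}
INDEPENDENT_BONUS = 10


def is_present(value: Any) -> bool:
    """Assumption: a field counts when it is not None and not empty."""
    return value not in (None, "", [], ())


def golden_drop_score(cafe: Mapping[str, Any]) -> int:
    """Sum the points for every rule the cafe record satisfies."""
    score = 0
    # Baseline existence: a name plus both coordinates.
    if is_present(cafe.get("name")) and cafe.get("lat") is not None and cafe.get("lon") is not None:
        score += 10
    # Contactable: either a phone number or a website is enough.
    if is_present(cafe.get("phone")) or is_present(cafe.get("website")):
        score += 5
    for field, points in FIELD_POINTS.items():
        if is_present(cafe.get(field)):
            score += points
    # Independent bonus: no chain detected (mirrors chain_type IS NULL).
    if cafe.get("chain_type") is None:
        score += INDEPENDENT_BONUS
    return score
```

A fully populated independent cafe reaches 55 under these rules, and the same record tagged as a chain tops out at 45.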

Chain Detection

We built a three-tier badge system:

  • Global Chain (Starbucks, Costa, etc.) -- 11 brands detected
  • Local Chain (Blue Bottle, Intelligentsia) -- 12 regional brands
  • Independent -- everyone else, gets a +10 score bonus

The independent bonus reflects a real pattern: independent cafes correlate with higher specialty coffee quality.
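
One way the three tiers could be assigned is a simple name-prefix lookup. The sketch below is an assumption about the matching logic, not the production detector, and the brand sets contain only the four names mentioned above rather than the full 11- and 12-brand lists.

```python
from typing import Optional

# Only the brands named in this post; the complete lists are not reproduced here.
GLOBAL_CHAINS = {"starbucks", "costa"}
LOCAL_CHAINS = {"blue bottle", "intelligentsia"}


def classify_chain(name: str) -> Optional[str]:
    """Return "global", "local", or None for an independent cafe."""
    # Lowercase and collapse whitespace so "  Blue   Bottle " still matches.
    normalized = " ".join(name.lower().split())
    for brand in GLOBAL_CHAINS:
        if normalized.startswith(brand):
            return "global"
    for brand in LOCAL_CHAINS:
        if normalized.startswith(brand):
            return "local"
    # None lines up with the chain_type IS NULL check used for the bonus.
    return None
```

Prefix matching is deliberately naive: an unrelated cafe whose name happens to start with a brand word would be misclassified, so a real detector would likely need stricter matching or a manual allowlist.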

The SQL

Scoring runs as a single UPDATE across all 440K rows:

UPDATE cafes SET score = (
  CASE WHEN name IS NOT NULL AND lat IS NOT NULL THEN 10 ELSE 0 END +      -- baseline existence
  CASE WHEN opening_hours IS NOT NULL THEN 8 ELSE 0 END +                  -- opening hours
  CASE WHEN phone IS NOT NULL OR website IS NOT NULL THEN 5 ELSE 0 END +   -- contactable
  CASE WHEN chain_type IS NULL THEN 10 ELSE 0 END                          -- independent bonus
  -- ... more fields
);

Runs in under 3 seconds on PostgreSQL 17. No per-row API calls, no external scoring service.
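
To try the statement without a PostgreSQL instance, the sketch below runs the four CASE expressions shown above against an in-memory SQLite database from Python. SQLite is a stand-in here, the table definition is guessed from the column names in the query, and the two sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the columns the UPDATE above refers to.
conn.execute(
    "CREATE TABLE cafes (name TEXT, lat REAL, opening_hours TEXT, "
    "phone TEXT, website TEXT, chain_type TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO cafes (name, lat, opening_hours, phone, website, chain_type) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        # Made-up sample rows: one independent cafe, one global chain.
        ("Corner Roasters", 52.52, "Mo-Fr 08:00-17:00", None, "https://example.com", None),
        ("Starbucks", 40.71, None, None, None, "global"),
    ],
)

# One set-based UPDATE scores every row; no per-row loop in application code.
conn.execute("""
    UPDATE cafes SET score = (
      CASE WHEN name IS NOT NULL AND lat IS NOT NULL THEN 10 ELSE 0 END +
      CASE WHEN opening_hours IS NOT NULL THEN 8 ELSE 0 END +
      CASE WHEN phone IS NOT NULL OR website IS NOT NULL THEN 5 ELSE 0 END +
      CASE WHEN chain_type IS NULL THEN 10 ELSE 0 END
    )
""")

scores = dict(conn.execute("SELECT name, score FROM cafes"))
print(scores)
```

The independent row earns 10 + 8 + 5 + 10 = 33, while the chain row earns only the 10 baseline points. Timing on SQLite says nothing about the PostgreSQL figure quoted above.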

What We Learned

  1. Data completeness is a surprisingly good proxy for quality. Cafes that bother to list hours, upload photos, and maintain a website tend to be better.
  2. Rewarding independents is fair. Chain consistency is valuable, but discovery tools should surface the unexpected.
  3. Score transparency builds trust. Every cafe page on CoffeeTrove shows exactly how the score was computed. No black box.
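
The transparency point can be sketched as a per-rule breakdown rendered next to the total. This is an illustrative helper, not the code behind CoffeeTrove's cafe pages; the function name and the output format are assumptions.

```python
from typing import List, Tuple


def explain_score(rules: List[Tuple[str, int, bool]]) -> str:
    """Render one line per rule (label, max points, passed) plus the total."""
    lines = []
    total = 0
    for label, points, passed in rules:
        awarded = points if passed else 0
        total += awarded
        lines.append(f"{label}: {awarded}/{points}")
    lines.append(f"Total: {total}")
    return "\n".join(lines)


print(explain_score([
    ("Has name + coordinates", 10, True),
    ("Opening hours present", 8, True),
    ("Photos available", 7, False),
]))
```

Showing the rules that scored zero alongside the ones that scored is what lets a cafe owner see which missing field would raise the number.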

This approach works for any directory vertical. We use similar data-completeness scoring on DropThe for ranking entities across movies, companies, and crypto.


We're building CoffeeTrove as a free, open coffee discovery tool. Check it out at coffeetrove.com.