At ZSky AI, we offer 50 free AI image generations per day without requiring users to create an account. This creates a fascinating engineering challenge: how do you enforce per-user rate limits when you don't know who the user is?
This article walks through our approach, the alternatives we considered, and the practical code patterns that make it work.
## Why No Authentication?
Before diving into the technical solution, let's address the obvious question: why not just require signup?
We tested both approaches with real traffic:
| Metric | With Signup Wall | Without Signup Wall |
|---|---|---|
| Visitor → First Generation | 12% | 67% |
| First Generation → Paid Conversion | 4.2% | 3.1% |
| Net Paid Conversion (Visitor → Paid) | 0.50% | 2.08% |
Removing the signup wall roughly quadrupled our effective conversion rate (0.50% → 2.08%). People who experience the product first are more likely to pay for it later. The authentication barrier kills more revenue than abuse costs.
## The Problem Space
Without authentication, we need to answer one question for every request: "Has this specific human used more than 50 generations today?"
This seems simple until you consider the edge cases:
- Multiple users behind a corporate NAT (same IP, different people)
- Users clearing cookies (resetting their count)
- VPN users (changing IP addresses)
- Bot traffic (automated scraping of generations)
- Privacy-conscious users blocking fingerprinting
- Multiple devices per user (phone + laptop)
No single signal perfectly identifies an anonymous user. Our approach uses multiple signals weighted by reliability.
## The Multi-Signal Identity System
### Signal 1: IP Address
The most obvious identifier, but also the least reliable on its own.
```python
def get_ip_signal(request):
    """Extract the real client IP, handling proxies."""
    # Trust Cloudflare's CF-Connecting-IP header
    cf_ip = request.headers.get('CF-Connecting-IP')
    if cf_ip:
        return cf_ip
    # Fallback to X-Forwarded-For (first IP in chain)
    xff = request.headers.get('X-Forwarded-For')
    if xff:
        return xff.split(',')[0].strip()
    return request.remote_addr
```
Reliability: Medium. Works well for residential users but fails for shared networks (offices, universities, mobile carriers using CGNAT).
Weight in our scoring: 30%
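The precedence order matters: a client-spoofable `X-Forwarded-For` is only consulted when Cloudflare's header is absent. The sketch below exercises the three paths with a hypothetical `FakeRequest` stub standing in for the framework's request object:

```python
# Minimal stub mirroring the header-precedence logic above.
# FakeRequest is an illustration, not our framework's request class.
class FakeRequest:
    def __init__(self, headers, remote_addr):
        self.headers = headers
        self.remote_addr = remote_addr

def get_ip_signal(request):
    """Extract the real client IP, handling proxies."""
    cf_ip = request.headers.get('CF-Connecting-IP')
    if cf_ip:
        return cf_ip
    xff = request.headers.get('X-Forwarded-For')
    if xff:
        return xff.split(',')[0].strip()
    return request.remote_addr

# Cloudflare's header wins even when X-Forwarded-For is also present
assert get_ip_signal(FakeRequest(
    {'CF-Connecting-IP': '203.0.113.7', 'X-Forwarded-For': '198.51.100.2'},
    '10.0.0.1')) == '203.0.113.7'
# Otherwise, the first hop of the X-Forwarded-For chain is used
assert get_ip_signal(FakeRequest(
    {'X-Forwarded-For': '198.51.100.2, 10.0.0.5'}, '10.0.0.1')) == '198.51.100.2'
# A direct connection falls back to the socket address
assert get_ip_signal(FakeRequest({}, '192.0.2.9')) == '192.0.2.9'
```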
### Signal 2: Browser Fingerprint
We compute a lightweight fingerprint from browser characteristics. We deliberately avoid invasive tracking — no canvas fingerprinting of rendered images, no WebGL renderer detection, no audio context fingerprinting.
```javascript
// Client-side fingerprint computation
function computeFingerprint() {
  const signals = [
    screen.width,
    screen.height,
    screen.colorDepth,
    navigator.language,
    navigator.platform,
    Intl.DateTimeFormat().resolvedOptions().timeZone,
    navigator.hardwareConcurrency || 'unknown',
    // Font detection via CSS measurement (no canvas)
    detectFontSet(['Arial', 'Helvetica', 'Courier', 'Georgia', 'Verdana'])
  ];
  return sha256(signals.join('|'));
}
```
The fingerprint is hashed client-side and sent as a header. We never see the raw signals — only the hash.
Reliability: Medium-high. Unique enough to distinguish most users, but can collide for users with identical system configurations.
Weight in our scoring: 35%
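Because the header comes from the client, the server should treat it as untrusted input. One way such a sanity check might look (a sketch under our own assumption of a strict 64-character hex rule, not ZSky's exact code):

```python
import re

# A SHA-256 digest is exactly 64 hex characters; anything else is treated
# as "no fingerprint" rather than trusted as an identity signal.
FP_PATTERN = re.compile(r'^[0-9a-f]{64}$')

def sanitize_fingerprint(header_value):
    """Return the fingerprint if it looks like a SHA-256 hex digest, else ''."""
    value = (header_value or '').strip().lower()
    return value if FP_PATTERN.fullmatch(value) else ''

assert sanitize_fingerprint('a' * 64) == 'a' * 64   # well-formed digest passes
assert sanitize_fingerprint('not-a-hash') == ''     # junk is discarded
assert sanitize_fingerprint(None) == ''             # missing header is safe
```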
### Signal 3: Signed Cookie Token
When a user first visits, we set an encrypted, signed cookie containing a unique session identifier and their current daily count.
```python
import hmac
import json
import os
import time
from base64 import b64encode, b64decode

SECRET_KEY = os.environ['COOKIE_SECRET']

def create_token(user_hash):
    """Create a signed token with daily count."""
    payload = {
        'id': user_hash,
        'count': 0,
        'date': time.strftime('%Y-%m-%d'),
        'created': time.time()
    }
    data = json.dumps(payload)
    signature = hmac.new(
        SECRET_KEY.encode(),
        data.encode(),
        'sha256'
    ).hexdigest()
    return b64encode(f"{data}|{signature}".encode()).decode()

def verify_token(token):
    """Verify and decode a signed token."""
    try:
        decoded = b64decode(token).decode()
        data, signature = decoded.rsplit('|', 1)
        expected = hmac.new(
            SECRET_KEY.encode(),
            data.encode(),
            'sha256'
        ).hexdigest()
        if not hmac.compare_digest(signature, expected):
            return None
        payload = json.loads(data)
        # Reset count if date changed
        if payload['date'] != time.strftime('%Y-%m-%d'):
            payload['count'] = 0
            payload['date'] = time.strftime('%Y-%m-%d')
        return payload
    except Exception:
        return None
```
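These helpers can be exercised end to end. The sketch below restates them compactly with a stand-in secret so it runs standalone; the behavior matches the code above:

```python
import hmac
import json
import time
from base64 import b64encode, b64decode

SECRET_KEY = 'example-secret'  # stand-in for os.environ['COOKIE_SECRET']

def _sign(data):
    return hmac.new(SECRET_KEY.encode(), data.encode(), 'sha256').hexdigest()

def create_token(user_hash):
    data = json.dumps({'id': user_hash, 'count': 0,
                       'date': time.strftime('%Y-%m-%d'), 'created': time.time()})
    return b64encode(f"{data}|{_sign(data)}".encode()).decode()

def verify_token(token):
    try:
        data, signature = b64decode(token).decode().rsplit('|', 1)
        if hmac.compare_digest(signature, _sign(data)):
            return json.loads(data)
        return None
    except Exception:
        return None

# Happy path: a fresh token round-trips with count 0
token = create_token('user-abc')
payload = verify_token(token)
assert payload['id'] == 'user-abc' and payload['count'] == 0

# Tampering: editing the payload without re-signing breaks verification
tampered = b64encode(
    b64decode(token).replace(b'"count": 0', b'"count": 9')).decode()
assert verify_token(tampered) is None
```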
Reliability: High for honest users, easily bypassed by clearing cookies.
Weight in our scoring: 25%
### Signal 4: Behavioral Pattern
We track request patterns that suggest abuse:
```python
def compute_behavior_score(request_history):
    """Score recent request behavior. Lower = more suspicious."""
    score = 1.0
    # Rapid-fire requests (< 3 seconds apart)
    if request_history.avg_interval < 3:
        score *= 0.5
    # Identical prompts repeated (guard against an empty history)
    if request_history.prompts:
        unique_ratio = len(set(request_history.prompts)) / len(request_history.prompts)
        if unique_ratio < 0.3:
            score *= 0.6
    # Cookie cleared but fingerprint matches
    if request_history.cookie_resets > 2:
        score *= 0.4
    return score
```
Weight in our scoring: 10%
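The penalties multiply, so a client that trips every check drops to 0.5 × 0.6 × 0.4 = 0.12. For illustration, here is the scorer run against two hypothetical histories, using a minimal `RequestHistory` dataclass whose shape is assumed from the attributes the function reads:

```python
from dataclasses import dataclass

# Hypothetical container for the fields the scorer reads; the real
# request-history object's shape is inferred, not confirmed.
@dataclass
class RequestHistory:
    avg_interval: float   # mean seconds between requests
    prompts: list         # recent prompt strings
    cookie_resets: int    # times the cookie vanished while the fingerprint matched

def compute_behavior_score(h):
    """Score recent request behavior. Lower = more suspicious."""
    score = 1.0
    if h.avg_interval < 3:
        score *= 0.5
    if h.prompts:
        unique_ratio = len(set(h.prompts)) / len(h.prompts)
        if unique_ratio < 0.3:
            score *= 0.6
    if h.cookie_resets > 2:
        score *= 0.4
    return score

casual = RequestHistory(avg_interval=45.0, prompts=['a cat', 'a dog', 'a boat'],
                        cookie_resets=0)
scripted = RequestHistory(avg_interval=1.2, prompts=['x'] * 10, cookie_resets=5)

assert compute_behavior_score(casual) == 1.0                       # no penalties
assert abs(compute_behavior_score(scripted) - 0.12) < 1e-9         # all three apply
```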
## The Identity Resolution Algorithm
Each request goes through our identity resolver:
```python
def resolve_identity(request):
    """Determine the most likely user identity and their daily count."""
    ip = get_ip_signal(request)
    fingerprint = request.headers.get('X-ZSky-FP', '')
    cookie = verify_token(request.cookies.get('zsky_token'))
    # Look up all matching identity records
    candidates = []
    if cookie and cookie.get('id'):
        candidates.append({
            'source': 'cookie',
            'id': cookie['id'],
            'confidence': 0.25,
            'count': cookie.get('count', 0)
        })
    if fingerprint:
        fp_record = redis.get(f"fp:{fingerprint}:{today()}")
        if fp_record:
            candidates.append({
                'source': 'fingerprint',
                'id': fingerprint,
                'confidence': 0.35,
                'count': int(fp_record)
            })
    ip_record = redis.get(f"ip:{ip}:{today()}")
    if ip_record:
        candidates.append({
            'source': 'ip',
            'id': ip,
            'confidence': 0.30,
            'count': int(ip_record)
        })
    if not candidates:
        return {'count': 0, 'identity': generate_new_identity()}
    # Weight by confidence and take the candidate with the highest weighted count
    best = max(candidates, key=lambda c: c['count'] * c['confidence'])
    behavior = compute_behavior_score(get_request_history(best['id']))
    return {
        'count': best['count'],
        'identity': best['id'],
        'confidence': best['confidence'] * behavior,
    }
```
## The Rate Limiting Decision
Once we have an identity and count, the decision is straightforward:
```python
DAILY_LIMIT = 50
SOFT_LIMIT = 55  # Small buffer for edge cases

def check_rate_limit(identity):
    """Determine if a generation request should proceed."""
    if identity['count'] >= DAILY_LIMIT:
        if identity['confidence'] > 0.7:
            # High confidence this user hit the limit
            return {
                'allowed': False,
                'reason': 'daily_limit',
                'suggestion': 'create_account'  # For guaranteed tracking
            }
        elif identity['count'] >= SOFT_LIMIT:
            # Lower confidence but very high count
            return {
                'allowed': False,
                'reason': 'daily_limit',
                'suggestion': 'create_account'
            }
        else:
            # Low confidence, might be a different user
            # Allow but log for analysis
            return {
                'allowed': True,
                'flagged': True
            }
    return {'allowed': True}
```
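The branch structure amounts to a small decision table. Restated standalone (same constants and logic as above) so each case can be checked directly:

```python
DAILY_LIMIT = 50
SOFT_LIMIT = 55

def check_rate_limit(identity):
    """Allow, block, or allow-but-flag based on count and confidence."""
    if identity['count'] >= DAILY_LIMIT:
        if identity['confidence'] > 0.7:
            return {'allowed': False, 'reason': 'daily_limit',
                    'suggestion': 'create_account'}
        elif identity['count'] >= SOFT_LIMIT:
            return {'allowed': False, 'reason': 'daily_limit',
                    'suggestion': 'create_account'}
        else:
            return {'allowed': True, 'flagged': True}
    return {'allowed': True}

# Under the limit: always allowed
assert check_rate_limit({'count': 10, 'confidence': 0.9})['allowed'] is True
# At the limit with high confidence: blocked
assert check_rate_limit({'count': 50, 'confidence': 0.9})['allowed'] is False
# At the limit with low confidence: allowed but flagged for analysis
r = check_rate_limit({'count': 52, 'confidence': 0.4})
assert r['allowed'] is True and r['flagged'] is True
# Past the soft limit, even low confidence is blocked
assert check_rate_limit({'count': 56, 'confidence': 0.4})['allowed'] is False
```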
The key insight: we'd rather give a few extra free generations than wrongly block a legitimate user. The cost of one extra generation (~$0.002) is far less than the cost of losing a potential customer to frustration.
## Server-Side Storage
We use Redis for fast lookups with automatic expiration:
```python
def increment_usage(identity):
    """Increment the daily usage counter for all identity signals."""
    pipe = redis.pipeline()
    # Set keys with TTL of 26 hours (covers timezone edge cases)
    ttl = 26 * 3600
    for signal_type, signal_value in identity['signals'].items():
        key = f"{signal_type}:{signal_value}:{today()}"
        pipe.incr(key)
        pipe.expire(key, ttl)
    pipe.execute()
```
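To make the fan-out concrete, the same loop can be run against a tiny in-memory stand-in for Redis — a test double for illustration, not production code:

```python
import time

class FakeRedis:
    """In-memory stand-in for the two Redis commands the pipeline uses."""
    def __init__(self):
        self.store = {}
        self.ttls = {}
    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
    def expire(self, key, ttl):
        self.ttls[key] = ttl

def increment_usage(redis_client, signals):
    """Bump today's counter for every identity signal, with a 26h TTL."""
    ttl = 26 * 3600
    day = time.strftime('%Y-%m-%d')
    for signal_type, signal_value in signals.items():
        key = f"{signal_type}:{signal_value}:{day}"
        redis_client.incr(key)
        redis_client.expire(key, ttl)

r = FakeRedis()
signals = {'ip': '203.0.113.7', 'fp': 'abc123', 'cookie': 'sess-1'}
increment_usage(r, signals)
increment_usage(r, signals)

day = time.strftime('%Y-%m-%d')
assert r.store[f"ip:203.0.113.7:{day}"] == 2      # every signal counted twice
assert len(r.store) == 3                          # one key per signal per day
assert all(t == 26 * 3600 for t in r.ttls.values())
```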
Redis was the obvious choice here:
- In-memory for fast lookups (<1ms)
- Automatic key expiration handles daily resets
- Atomic increment operations prevent race conditions
- Low memory footprint (each key is ~100 bytes)
At peak usage, our Redis instance for rate limiting uses less than 50MB of RAM.
## Nginx Layer: The First Line of Defense
Before requests even reach the application, Nginx handles basic rate limiting:
```nginx
# Rate limit zone: 2 requests/second per IP
limit_req_zone $binary_remote_addr zone=api_generate:10m rate=2r/s;

server {
    location /api/generate {
        limit_req zone=api_generate burst=5 nodelay;
        limit_req_status 429;
        proxy_pass http://app_backend;
    }
}
```
This catches automated scripts and bots before they consume application resources. A human can't meaningfully use more than 2 generations per second, so this has zero impact on legitimate users.
## Results and Metrics
After 6 months of operation:
- 98.3% of users never hit the daily limit
- 1.2% hit the limit through normal usage
- 0.5% attempt to circumvent the limit
- 0.02% successfully circumvent it consistently
The cost of the 0.02% who game the system? About $15/month in extra GPU compute. The cost of implementing perfect enforcement through mandatory authentication? Estimated 4x reduction in conversion rate, or roughly $2,000+/month in lost revenue.
## Privacy Considerations
We take privacy seriously:
- Fingerprints are hashed client-side; we never see raw device information
- IP addresses are stored as hashed keys with 26-hour TTL
- No personal information is collected or stored for free-tier users
- All rate limiting data is ephemeral (auto-expires daily)
- We don't correlate identity signals across days
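The hashed-IP keys mentioned above might be built like this — a sketch under our own assumption of an HMAC with a server-side secret, not necessarily ZSky's exact scheme. Rotating the secret daily would also enforce the no-cross-day-correlation property mechanically:

```python
import hashlib
import hmac
import time

# Hypothetical server-side secret; a real deployment would load it from the
# environment and could rotate it daily to prevent cross-day correlation.
HASH_KEY = b'example-rotating-secret'

def ip_storage_key(ip):
    """Build a Redis key from an HMAC of the IP so raw addresses never hit storage."""
    digest = hmac.new(HASH_KEY, ip.encode(), hashlib.sha256).hexdigest()[:16]
    day = time.strftime('%Y-%m-%d')
    return f"ip:{digest}:{day}"

key = ip_storage_key('203.0.113.7')
assert '203.0.113.7' not in key        # the raw address is never stored
assert key.startswith('ip:')           # same key shape as the counters above
```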
## What We'd Do Differently
Start simpler. Our initial implementation was just IP + cookies. The fingerprinting and behavioral analysis were added later when we saw specific abuse patterns. Don't over-engineer from day one.
Monitor false positives aggressively. We track how often users see the rate limit message, and we alarm if the rate exceeds 3% of total users. False positives are more damaging than false negatives.
Consider geographic patterns. Users from regions with heavy CGNAT (mobile networks, some countries) need different treatment than residential broadband users.
## Try It Yourself
If you want to see the system in action: zsky.ai. 50 free image generations per day, no signup required.
And if you're building something similar and want to discuss approaches, drop a comment or reach out. Anonymous rate limiting is a fascinating problem space with no perfect solution — only trade-offs.