DEV Community

Biricik Biricik

Posted on • Originally published at zsky.ai

How We Handle 50 Free Credits/Day Without User Authentication

At ZSky AI, we offer 50 free AI image generations per day without requiring users to create an account. This creates a fascinating engineering challenge: how do you enforce per-user rate limits when you don't know who the user is?

This article walks through our approach, the alternatives we considered, and the practical code patterns that make it work.

Why No Authentication?

Before diving into the technical solution, let's address the obvious question: why not just require signup?

We tested both approaches with real traffic:

Metric                                   With Signup Wall   Without Signup Wall
Visitor → First Generation               12%                67%
First Generation → Paid Conversion       4.2%               3.1%
Net Paid Conversion (Visitor → Paid)     0.50%              2.08%

Removing the signup wall 4x'd our effective conversion rate. People who experience the product first are more likely to pay for it later. The authentication barrier kills more revenue than abuse costs.
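The funnel arithmetic behind that table is worth sanity-checking directly, using the percentages from the table above:

```python
# Net conversion = (visitor -> first generation) * (first generation -> paid)
with_wall = 0.12 * 0.042      # signup wall in place
without_wall = 0.67 * 0.031   # signup wall removed

print(f"{with_wall:.2%}")     # 0.50%
print(f"{without_wall:.2%}")  # 2.08%
print(f"{without_wall / with_wall:.1f}x")  # ~4.1x improvement
```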

The Problem Space

Without authentication, we need to answer one question for every request: "Has this specific human used more than 50 generations today?"

This seems simple until you consider the edge cases:

  • Multiple users behind a corporate NAT (same IP, different people)
  • Users clearing cookies (resetting their count)
  • VPN users (changing IP addresses)
  • Bot traffic (automated scraping of generations)
  • Privacy-conscious users blocking fingerprinting
  • Multiple devices per user (phone + laptop)

No single signal perfectly identifies an anonymous user. Our approach uses multiple signals weighted by reliability.

The Multi-Signal Identity System

Signal 1: IP Address

The most obvious identifier, but also the least reliable on its own.

def get_ip_signal(request):
    """Extract the real client IP, handling proxies."""
    # Trust Cloudflare's CF-Connecting-IP header
    cf_ip = request.headers.get('CF-Connecting-IP')
    if cf_ip:
        return cf_ip

    # Fallback to X-Forwarded-For (first IP in chain)
    xff = request.headers.get('X-Forwarded-For')
    if xff:
        return xff.split(',')[0].strip()

    return request.remote_addr

Reliability: Medium. Works well for residential users but fails for shared networks (offices, universities, mobile carriers using CGNAT).

Weight in our scoring: 30%

Signal 2: Browser Fingerprint

We compute a lightweight fingerprint from browser characteristics. We deliberately avoid invasive tracking — no canvas fingerprinting of rendered images, no WebGL renderer detection, no audio context fingerprinting.

// Client-side fingerprint computation
function computeFingerprint() {
    const signals = [
        screen.width,
        screen.height,
        screen.colorDepth,
        navigator.language,
        navigator.platform,
        Intl.DateTimeFormat().resolvedOptions().timeZone,
        navigator.hardwareConcurrency || 'unknown',
        // Font detection via CSS measurement (no canvas)
        detectFontSet(['Arial', 'Helvetica', 'Courier', 'Georgia', 'Verdana'])
    ];

    return sha256(signals.join('|'));
}

The fingerprint is hashed client-side and sent as a header. We never see the raw signals — only the hash.

Reliability: Medium-high. Unique enough to distinguish most users, but can collide for users with identical system configurations.

Weight in our scoring: 35%

Signal 3: Signed Cookie Token

When a user first visits, we set an encrypted, signed cookie containing a unique session identifier and their current daily count.

import hmac
import json
import os
import time
from base64 import b64encode, b64decode

SECRET_KEY = os.environ['COOKIE_SECRET']

def create_token(user_hash):
    """Create a signed token with daily count."""
    payload = {
        'id': user_hash,
        'count': 0,
        'date': time.strftime('%Y-%m-%d'),
        'created': time.time()
    }
    data = json.dumps(payload)
    signature = hmac.new(
        SECRET_KEY.encode(),
        data.encode(),
        'sha256'
    ).hexdigest()
    return b64encode(f"{data}|{signature}".encode()).decode()

def verify_token(token):
    """Verify and decode a signed token."""
    if not token:
        return None
    try:
        decoded = b64decode(token).decode()
        data, signature = decoded.rsplit('|', 1)
        expected = hmac.new(
            SECRET_KEY.encode(),
            data.encode(),
            'sha256'
        ).hexdigest()
        if hmac.compare_digest(signature, expected):
            payload = json.loads(data)
            # Reset count if date changed
            today = time.strftime('%Y-%m-%d')
            if payload['date'] != today:
                payload['count'] = 0
                payload['date'] = today
            return payload
    except Exception:
        pass
    # Malformed token or invalid signature
    return None

Reliability: High for honest users, easily bypassed by clearing cookies.

Weight in our scoring: 25%
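A quick round trip through sign-and-verify logic like the above makes the tamper-resistance concrete. This is a condensed, self-contained version with a hard-coded demo key (in practice the key comes from the environment, as shown earlier):

```python
import hmac
import json
import time
from base64 import b64encode, b64decode

SECRET_KEY = "demo-secret"  # illustrative only; never hard-code a real key

def create_token(user_hash):
    payload = {'id': user_hash, 'count': 0,
               'date': time.strftime('%Y-%m-%d'), 'created': time.time()}
    data = json.dumps(payload)
    sig = hmac.new(SECRET_KEY.encode(), data.encode(), 'sha256').hexdigest()
    return b64encode(f"{data}|{sig}".encode()).decode()

def verify_token(token):
    try:
        data, sig = b64decode(token).decode().rsplit('|', 1)
        expected = hmac.new(SECRET_KEY.encode(), data.encode(), 'sha256').hexdigest()
        if hmac.compare_digest(sig, expected):
            return json.loads(data)
    except Exception:
        pass
    return None

# A valid token round-trips cleanly.
token = create_token("abc123")
assert verify_token(token)['id'] == 'abc123'

# Editing the payload (count 0 -> 9) breaks the signature.
tampered = b64encode(
    b64decode(token).replace(b'"count": 0', b'"count": 9')
).decode()
assert verify_token(tampered) is None
```

The signature covers the whole payload, so a user can read their own count but cannot raise or reset it without the server key.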

Signal 4: Behavioral Pattern

We track request patterns that suggest abuse:

def compute_behavior_score(request_history):
    """Score recent request behavior. Lower = more suspicious."""
    score = 1.0

    # Rapid-fire requests (< 3 seconds apart)
    if request_history.avg_interval < 3:
        score *= 0.5

    # Identical prompts repeated
    if request_history.prompts:
        unique_ratio = len(set(request_history.prompts)) / len(request_history.prompts)
        if unique_ratio < 0.3:
            score *= 0.6

    # Cookie cleared but fingerprint matches
    if request_history.cookie_resets > 2:
        score *= 0.4

    return score

Weight in our scoring: 10%

The Identity Resolution Algorithm

Each request goes through our identity resolver:

def resolve_identity(request):
    """Determine the most likely user identity and their daily count."""
    ip = get_ip_signal(request)
    fingerprint = request.headers.get('X-ZSky-FP', '')
    cookie = verify_token(request.cookies.get('zsky_token'))

    # Look up all matching identity records
    candidates = []

    if cookie and cookie.get('id'):
        candidates.append({
            'source': 'cookie',
            'id': cookie['id'],
            'confidence': 0.25,
            'count': cookie.get('count', 0)
        })

    if fingerprint:
        fp_record = redis.get(f"fp:{fingerprint}:{today()}")
        if fp_record:
            candidates.append({
                'source': 'fingerprint',
                'id': fingerprint,
                'confidence': 0.35,
                'count': int(fp_record)
            })

    ip_record = redis.get(f"ip:{ip}:{today()}")
    if ip_record:
        candidates.append({
            'source': 'ip',
            'id': ip,
            'confidence': 0.30,
            'count': int(ip_record)
        })

    # No signals matched: treat this as a brand-new identity
    if not candidates:
        return {'count': 0, 'identity': generate_new_identity()}

    # Weight by confidence and take the max count
    best = max(candidates, key=lambda c: c['count'] * c['confidence'])

    behavior = compute_behavior_score(get_request_history(best['id']))

    return {
        'count': best['count'],
        'identity': best['id'],
        'confidence': best['confidence'] * behavior,
    }

The Rate Limiting Decision

Once we have an identity and count, the decision is straightforward:

DAILY_LIMIT = 50
SOFT_LIMIT = 55  # Small buffer for edge cases

def check_rate_limit(identity):
    """Determine if a generation request should proceed."""
    if identity['count'] >= DAILY_LIMIT:
        if identity['confidence'] > 0.7:
            # High confidence this user hit the limit
            return {
                'allowed': False,
                'reason': 'daily_limit',
                'suggestion': 'create_account'  # For guaranteed tracking
            }
        elif identity['count'] >= SOFT_LIMIT:
            # Lower confidence but very high count
            return {
                'allowed': False,
                'reason': 'daily_limit',
                'suggestion': 'create_account'
            }
        else:
            # Low confidence, might be a different user
            # Allow but log for analysis
            return {
                'allowed': True,
                'flagged': True
            }

    return {'allowed': True}

The key insight: we'd rather give a few extra free generations than wrongly block a legitimate user. The cost of one extra generation (~$0.002) is far less than the cost of losing a potential customer to frustration.

Server-Side Storage

We use Redis for fast lookups with automatic expiration:

def increment_usage(identity):
    """Increment the daily usage counter for all identity signals.

    `identity['signals']` maps signal type to value,
    e.g. {'ip': '203.0.113.7', 'fp': '<hash>', 'cookie': '<session id>'}.
    """
    pipe = redis.pipeline()

    # Set keys with TTL of 26 hours (covers timezone edge cases)
    ttl = 26 * 3600

    for signal_type, signal_value in identity['signals'].items():
        key = f"{signal_type}:{signal_value}:{today()}"
        pipe.incr(key)
        pipe.expire(key, ttl)

    pipe.execute()

Redis was the obvious choice here:

  • In-memory for fast lookups (<1ms)
  • Automatic key expiration handles daily resets
  • Atomic increment operations prevent race conditions
  • Low memory footprint (each key is ~100 bytes)

At peak usage, our Redis instance for rate limiting uses less than 50MB of RAM.
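A rough capacity check shows why the footprint stays so small. Assuming ~100 bytes per key (as above) and up to three keys per active user (IP, fingerprint, cookie ID — my assumption, since each signal gets its own counter):

```python
BYTES_PER_KEY = 100           # from the estimate above
KEYS_PER_USER = 3             # assumed: one counter per signal type
BUDGET = 50 * 1024 * 1024     # 50 MB

users_supported = BUDGET // (BYTES_PER_KEY * KEYS_PER_USER)
print(users_supported)  # ~174,000 daily active users within 50 MB
```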

Nginx Layer: The First Line of Defense

Before requests even reach the application, Nginx handles basic rate limiting:

# Rate limit zone: 2 requests/second per IP
limit_req_zone $binary_remote_addr zone=api_generate:10m rate=2r/s;

server {
    location /api/generate {
        limit_req zone=api_generate burst=5 nodelay;
        limit_req_status 429;

        proxy_pass http://app_backend;
    }
}

This catches automated scripts and bots before they consume application resources. A human can't meaningfully submit more than 2 generation requests per second, so this has effectively zero impact on legitimate users.

Results and Metrics

After 6 months of operation:

  • 98.3% of users never hit the daily limit
  • 1.2% hit the limit through normal usage
  • 0.5% attempt to circumvent the limit
  • 0.02% successfully circumvent it consistently

The cost of the 0.02% who game the system? About $15/month in extra GPU compute. The cost of implementing perfect enforcement through mandatory authentication? Estimated 4x reduction in conversion rate, or roughly $2,000+/month in lost revenue.
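Working backwards from the figures quoted above, the $15/month in abuse corresponds to roughly 7,500 extra generations per month, or about 250 per day:

```python
COST_PER_GENERATION = 0.002   # dollars, from the figure quoted earlier
MONTHLY_ABUSE_COST = 15.0     # dollars

extra_per_month = round(MONTHLY_ABUSE_COST / COST_PER_GENERATION)
print(extra_per_month)        # 7500 generations/month
print(extra_per_month // 30)  # 250 per day
```

A few hundred stolen generations a day is noise next to the conversion revenue the open funnel brings in.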

Privacy Considerations

We take privacy seriously:

  • Fingerprints are hashed client-side; we never see raw device information
  • IP addresses are stored as hashed keys with 26-hour TTL
  • No personal information is collected or stored for free-tier users
  • All rate limiting data is ephemeral (auto-expires daily)
  • We don't correlate identity signals across days
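The last two properties depend on never keying storage by raw signal values. A minimal sketch of hashing a signal with a per-day salt before using it as a Redis key (the salt, key format, and truncation length are illustrative, not ZSky's actual scheme):

```python
import hashlib
import time

DAILY_SALT_SECRET = "rotate-me"  # illustrative; use a real secret in practice

def hashed_key(signal_type, signal_value):
    """Hash a raw signal (e.g. an IP address) with a date-scoped salt.

    Including the date in the hash input means the same IP produces a
    different key tomorrow, so records cannot be correlated across days.
    """
    day = time.strftime('%Y-%m-%d')
    digest = hashlib.sha256(
        f"{DAILY_SALT_SECRET}|{day}|{signal_value}".encode()
    ).hexdigest()
    return f"{signal_type}:{digest[:16]}:{day}"

print(hashed_key("ip", "203.0.113.7"))  # e.g. "ip:<16 hex chars>:2025-01-01"
```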

What We'd Do Differently

  1. Start simpler. Our initial implementation was just IP + cookies. The fingerprinting and behavioral analysis were added later when we saw specific abuse patterns. Don't over-engineer from day one.

  2. Monitor false positives aggressively. We track how often users see the rate limit message, and we alarm if the rate exceeds 3% of total users. False positives are more damaging than false negatives.

  3. Consider geographic patterns. Users from regions with heavy CGNAT (mobile networks, some countries) need different treatment than residential broadband users.

Try It Yourself

If you want to see the system in action: zsky.ai. 50 free image generations per day, no signup required.

And if you're building something similar and want to discuss approaches, drop a comment or reach out. Anonymous rate limiting is a fascinating problem space with no perfect solution — only trade-offs.
