
Hagicode

Posted on • Originally published at docs.hagicode.com

Building a Low-Cost Download Distribution Station with Cheap Cloud Servers


Cloud storage egress traffic is ridiculously expensive, cross-border access is frustratingly slow, and CDN prices are enough to make you think twice... If you're in the distribution business, you know these pains all too well. This article shares a low-cost solution we developed for the HagiCode project. Cloud server + Nginx caching layer—costs cut in half, speeds improved. Small comfort, perhaps.

Background

On the internet, download speed and stability ultimately boil down to user experience. Whether open source or commercial, you need to provide users with a reliable download method.

Downloading files directly from cloud storage (like Azure Blob Storage or AWS S3) might seem simple, but there are actually quite a few issues:

Network Latency: Cross-region access is slow enough to make you want to smash your keyboard. Users wait forever—how can the experience be good?

Bandwidth Costs: Cloud storage egress traffic is painfully expensive. Azure Blob Storage accessed from mainland China costs about ¥0.5 per GB, which works out to roughly ¥500 for 1TB per month. For a small team that's not a fortune, but money doesn't grow on trees.

Access Restrictions: In certain regions, access to foreign cloud services is spotty, sometimes completely unavailable. Users can't even download—pretty helpless situation.

CDN Costs: Commercial CDNs can indeed solve the problem, but the prices are equally impressive. Small teams can't afford them.

So is there a way to save money while still being effective? Actually, yes. Cloud server + reverse proxy + caching layer—simple and crude. Costs cut by about half, speeds improved. Some small comfort.

About HagiCode

This solution wasn't conjured out of thin air—it's experience we developed through working on the HagiCode project.

HagiCode is an AI code assistant that needs to provide download services for both server-side and desktop clients. Since it's a tool for developers, it's important that global users can download quickly and reliably. This is also why we had to figure out a low-cost solution—after all, money doesn't grow on trees.

If you find this solution valuable, it shows our engineering skills are decent... In that case, HagiCode itself is worth paying attention to, right?

Architecture Design

Overall Architecture Concept

Let's first look at the overall architecture design:

User Request
    ↓
DNS Resolution
    ↓
┌──────────────────────────────────────┐
│ Reverse Proxy (Traefik / Bunker Web) │ ← SSL termination, routing, security
├──────────────────────────────────────┤
│ Port: 80/443                         │
│ Features: auto Let's Encrypt certs,  │
│           host-based routing         │
└──────────────────────────────────────┘
    ↓
┌──────────────────────────────────────┐
│ Caching Layer (Nginx)                │ ← File caching, Gzip compression
├──────────────────────────────────────┤
│ Port: 8080 (server) / 8081 (desktop) │
│ Cache policy:                        │
│   - index.json: 1 hour               │
│   - other files: 7 days              │
│ Cache size: 1 GB                     │
└──────────────────────────────────────┘
    ↓
┌──────────────────────────────────────┐
│ Origin (Azure Blob Storage)          │ ← File storage
└──────────────────────────────────────┘

The core idea of this architecture is simply: add a caching layer between users and cloud storage.

User requests first hit the reverse proxy layer on the cloud server, then the Nginx caching layer takes over. File in cache? Serve directly to user. Not there? Fetch from cloud storage and store a local copy. Next time the same file is accessed, no need to bother cloud storage. It's like memory—once you remember something, you don't have to work hard to recall it...

Why Choose This Architecture?

Cloud Server Advantages:

  • Cost controllable: Cloud providers like Alibaba Cloud offer cheap servers (1-2 core 2GB config costs about ¥50-100/month)
  • Flexible deployment: Freely configure reverse proxy and caching policies
  • Geographic flexibility: Choose server regions close to users
  • Scalable: Can upgrade config based on traffic needs

Reverse Proxy + Caching Architecture:

  • Reduce origin pressure: Cache hot files, reduce cloud storage access
  • Lower costs: Cloud server traffic fees are far lower than cloud storage egress
  • Improve speed: Nearby access, server bandwidth usually better than cloud storage

Why Choose Nginx as the Caching Layer?

This wasn't an arbitrary choice—Nginx has its reasons:

  1. High performance: Nginx reverse proxy performance is well-known in the industry
  2. Mature caching: Built-in proxy_cache functionality, stable and reliable
  3. Low resource usage: Can run on 256MB memory, server-friendly
  4. Flexible config: Different file types can have different cache policies

Reverse Proxy Layer: Traefik vs Bunker Web

HagiCode's deployment solution actually supports two reverse proxies—each with its own characteristics:

| Solution   | Features                              | Use Case                                        |
|------------|---------------------------------------|-------------------------------------------------|
| Traefik    | Lightweight, auto SSL, simple config  | Basic deployments, low-traffic scenarios        |
| Bunker Web | Built-in WAF, anti-DDoS, anti-crawler | High security requirements, high-traffic scenarios |

Traefik: Lightweight Choice

Traefik is a modern HTTP reverse proxy and load balancer—the biggest feature is simple configuration and automatic Let's Encrypt certificates.

For initial deployment or low-traffic scenarios, Traefik is a good choice:

  • Low resource usage (1.5 CPU/512MB memory is enough)
  • Automatic SSL certificate configuration
  • Docker label-based routing, convenient

Bunker Web: High Security Scenarios

Bunker Web is an Nginx-based web application firewall with more comprehensive security protection.

When might you consider switching to Bunker Web? Probably in these situations:

  • Suffering DDoS attacks (though no one hopes for this)
  • Need ModSecurity protection
  • Want anti-crawler functionality
  • Higher security requirements
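For a taste of what "more comprehensive security" means in practice, here is a sketch of the kind of BunkerWeb settings involved. The setting names come from BunkerWeb's documentation; the values and image tag are illustrative assumptions, not HagiCode's actual config.

```yaml
services:
  bunkerweb:
    image: bunkerity/bunkerweb:1.5
    environment:
      SERVER_NAME: "server.dl.hagicode.com desktop.dl.hagicode.com"
      USE_MODSECURITY: "yes"   # ModSecurity WAF rules
      USE_ANTIBOT: "captcha"   # challenge suspected crawlers
      USE_LIMIT_REQ: "yes"     # per-client request rate limiting
```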

HagiCode provides the switch-deployment.sh script for quick switching between the two:

# Switch to Bunker Web
./switch-deployment.sh bunkerweb

# Switch back to Traefik
./switch-deployment.sh traefik

# Check current status
./switch-deployment.sh status

The script automatically does pre-checks, health checks, and can auto-rollback. The switching process is safe and reliable—it won't just crash.

Nginx Caching Layer Configuration

The caching layer is the core of the entire architecture, and how you configure Nginx makes a huge difference in cache effectiveness.

Cache Path Configuration

# Cache path configuration
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=azure_cache:10m
                   max_size=1g inactive=7d use_temp_path=off;

Parameter explanation:

  • levels=1:2: Cache directory hierarchy, 2-level structure for efficient file access
  • keys_zone=azure_cache:10m: Cache key storage area, 10MB enough for many keys
  • max_size=1g: Maximum cache size 1GB
  • inactive=7d: Cache files deleted after 7 days of inactivity
  • use_temp_path=off: Write directly to cache directory for better performance

Tiered Caching Strategy

Different file types need different caching strategies:

# Server download service
server {
    listen 8080;

    # index.json short-term cache (for timely updates)
    location /index.json {
        proxy_cache azure_cache;
        proxy_cache_valid 200 1h;
        proxy_cache_key "$scheme$server_port$request_uri";
        add_header X-Cache-Status $upstream_cache_status;
        add_header Cache-Control "public, max-age=3600";

        # Reverse proxy to Azure OSS
        proxy_pass https://${SERVER_DL_HOST}/${SERVER_DL_CONTAINER}$uri?${SERVER_DL_SAS_TOKEN};
        proxy_ssl_server_name on;
        proxy_ssl_protocols TLSv1.2 TLSv1.3;
    }

    # Installation packages and other static files long-term cache
    location / {
        proxy_cache azure_cache;
        proxy_cache_valid 200 7d;
        proxy_cache_key "$scheme$server_port$request_uri";
        add_header X-Cache-Status $upstream_cache_status;
        add_header Cache-Control "public, max-age=604800";

        proxy_pass https://${SERVER_DL_HOST}/${SERVER_DL_CONTAINER}$uri?${SERVER_DL_SAS_TOKEN};
        proxy_ssl_server_name on;
        proxy_ssl_protocols TLSv1.2 TLSv1.3;
    }
}

Why Design It This Way?

index.json is the version check file and needs timely updates, so cache time is set to 1 hour. This way after releasing a new version, users can detect the update within 1 hour—not too long.

Static files like installation packages change infrequently, so caching for 7 days greatly reduces origin access. When updates are needed, just manually clear the cache—not too troublesome.
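If you'd rather refresh one file than wipe the whole cache, nginx's `proxy_cache_bypass` offers a per-request escape hatch. This is a sketch, not part of the config above; the `X-Cache-Refresh` header name is our own choice:

```nginx
location / {
    proxy_cache azure_cache;
    proxy_cache_valid 200 7d;

    # Any non-empty X-Cache-Refresh header skips the cache lookup;
    # the fresh origin response is still written back into the cache,
    # so this request effectively refreshes the stored copy.
    proxy_cache_bypass $http_x_cache_refresh;

    proxy_pass https://${SERVER_DL_HOST}/${SERVER_DL_CONTAINER}$uri?${SERVER_DL_SAS_TOKEN};
}
```

Then `curl -H "X-Cache-Refresh: 1" https://server.dl.hagicode.com/app.zip -o /dev/null` re-fetches just that file. Keep the header name secret, or anyone can force origin traffic.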

X-Cache-Status Response Header:

This response header lets you check cache hit status:

  • HIT: Cache hit
  • MISS: Cache miss, fetched from origin
  • EXPIRED: Cache expired, re-fetched from origin
  • BYPASS: Cache bypassed

To check, request the same file twice and watch the header:

curl -sI https://server.dl.hagicode.com/app.zip | grep -i x-cache-status
# first request: MISS; repeat it and you should see HIT

Cost Analysis

Assuming 1TB monthly download traffic, let's do the math:

| Solution                                 | Traffic Cost | Server Cost | Total     |
|------------------------------------------|--------------|-------------|-----------|
| Direct Azure OSS                         | ~¥500        | ¥0          | ~¥500     |
| Cloud server + OSS (80% cache hit rate)  | ¥100 + ¥80   | ¥60         | ¥240      |
| Commercial CDN                           | ¥300-500     | ¥0          | ¥300-500  |

Conclusion: Caching layer saves about 50% of distribution costs.

This calculation assumes an 80% cache hit rate. In practice, if files update infrequently, the hit rate can be even higher.
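The table's numbers fall out of a simple model you can tweak for your own hit rate. A sketch, where the ¥0.5/GB Azure egress, ¥0.1/GB server egress, and ¥60 rent are this article's assumptions, not provider quotes:

```shell
#!/bin/sh
# Monthly distribution cost for ~1 TB of downloads, in yuan.
# Integer math is close enough for a back-of-the-envelope estimate.
TRAFFIC_GB=1000
HIT_RATE=80   # percent of requests served from the Nginx cache

# Direct download: every GB leaves Azure at ~¥0.5/GB
DIRECT=$(( TRAFFIC_GB * 5 / 10 ))

# With cache: only misses leave Azure; hits leave the cloud server
# at ~¥0.1/GB, plus ~¥60/month server rent
MISS_GB=$(( TRAFFIC_GB * (100 - HIT_RATE) / 100 ))
HIT_GB=$(( TRAFFIC_GB - MISS_GB ))
CACHED=$(( MISS_GB * 5 / 10 + HIT_GB / 10 + 60 ))

echo "direct: ¥${DIRECT}/month, with cache: ¥${CACHED}/month"
# prints: direct: ¥500/month, with cache: ¥240/month
```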

Deployment Practice

Environment Preparation

First configure environment variables:

cd /path/to/hagicode_aliyun_deployment/docker
cp .env.example .env
vi .env  # Fill in Azure OSS SAS URL, Lark Webhook URL

Important: The .env file contains sensitive information (SAS Token, Webhook URL)—never commit it to version control. This is critical.

DNS Configuration

Add the following DNS A records—don't forget this step:

  • server.dl.hagicode.com → Server IP
  • desktop.dl.hagicode.com → Server IP

Initialize Server

Use Ansible to automatically initialize the server:

cd /path/to/hagicode_aliyun_deployment
ansible-playbook -i ./ansible/inventory/hosts.yml ./ansible/playbooks/init.yml

This playbook will automatically:

  • Create deployment user
  • Install Docker and Docker Compose
  • Configure SSH keys
  • Set up firewall rules

Not too complex—automation saves time and effort.
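For orientation, the inventory file the playbook reads might look roughly like this. A minimal sketch: the host alias, IP, and user are placeholders, not the project's real values.

```yaml
# ansible/inventory/hosts.yml (sketch)
all:
  hosts:
    dl-station:
      ansible_host: 203.0.113.10   # your server's public IP
      ansible_user: root
      ansible_port: 22
```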

Deploy Services

./deploy.sh

The deployment script will:

  • Check environment configuration
  • Pull latest code
  • Start Docker containers
  • Execute health checks
  • Send deployment notifications (Lark)

One command to handle it all—convenient.

Verify Deployment

# Check container status
docker ps

# Test download domains
curl -I https://server.dl.hagicode.com/index.json
curl -I https://desktop.dl.hagicode.com/index.json

Operations Tips

Cache Management

Cache needs occasional maintenance:

Check cache disk usage:

docker volume inspect docker_nginx-cache
du -sh /var/lib/docker/volumes/docker_nginx-cache/_data

Manually clear cache:

./clear-cache.sh

Or do it by hand, which is clumsier but still works:

docker exec nginx sh -c "rm -rf /var/cache/nginx/*"
docker restart nginx

Resource Limits

On a 1-core 2GB server, resource limits are configured as follows:

services:
  traefik:
    deploy:
      resources:
        limits:
          cpus: '1.50'
          memory: 512M

  nginx:
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 256M

Monitor resource usage occasionally:

docker stats

SAS Token Security Practices

SAS Token is the credential for accessing Azure Blob Storage—leaking it is no joke:

  • .env file not committed to version control (already in .gitignore)
  • SAS Token set with appropriate expiration (recommend 1 year)
  • Limit SAS Token permissions (read-only)
  • Regularly rotate SAS Token

Monitoring Alerts

HagiCode integrates Lark/Feishu Webhook notifications for:

  • Deployment success/failure
  • Cache clearing status
  • Service anomalies

Notifications include server info, timestamps, and error details for quick problem identification.

High Availability Scaling

When a single server can't handle the load, consider:

  1. Horizontal scaling: Deploy multiple nodes with DNS round-robin or load balancer
  2. CDN support: Add CDN in front of cloud server for further speed improvement
  3. Cache warming: Use scripts to preload popular files into cache

Important Notes

A few reminders—nobody wants surprises:

  1. SSL certificates: Let's Encrypt has rate limits, don't switch deployments too frequently or you might not get certificates
  2. Cache clearing: Remember to clear cache after updating important files, or users might not get the new version
  3. Log management: Regularly clean Docker logs or disk will fill up
  4. Backup strategy: Backup Traefik acme.json, Bunker Web config
  5. Monitoring alerts: Configure Lark notifications for timely deployment status and quick reaction to problems
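On point 3, the simplest fix is to cap container log size at the daemon level. A sketch, using Docker's standard `json-file` driver options:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

Put this in /etc/docker/daemon.json and restart the Docker daemon; note that existing containers keep their old log settings until they are recreated.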

Summary

Cloud server + Nginx caching layer—simple as that. HagiCode uses this solution with reasonable costs (server fees about ¥60-100/month) and good results. Core advantages:

  • Cost controllable: About 50% cheaper than direct cloud storage or commercial CDN
  • Flexible deployment: Traefik or Bunker Web—your choice
  • Scalable: Can scale horizontally or add CDN as needed
  • Simple operations: Shell scripts + Ansible for easy automated deployment

For small teams and individual developers needing file distribution, this solution is worth trying.

HagiCode has been running this architecture in production for a while with no major issues for global users. If you're looking for a similar solution, give it a try—it might help.

Tech Stack Summary

Finally, let's organize the technologies used:

| Component        | Choice               | Purpose                            |
|------------------|----------------------|------------------------------------|
| Cloud server     | Alibaba Cloud ECS    | Base runtime environment           |
| Reverse proxy    | Traefik / Bunker Web | SSL termination, routing, security |
| Caching layer    | Nginx                | Reverse proxy cache, Gzip compression |
| File storage     | Azure Blob Storage   | File origin                        |
| Containerization | Docker Compose       | Service orchestration              |
| Automation       | Ansible              | Server configuration management    |
| Notifications    | Lark/Feishu Webhook  | Deployment status notifications    |

That about covers it. Hope this solution helps you—figuring these things out isn't easy... If you have good ideas too, let's discuss. Technology, after all, is about everyone progressing together.

Original Article & License

Thanks for reading. If this article helped, consider liking, bookmarking, or sharing it.
This article was created with AI assistance and reviewed by the author before publication.
