DEV Community

Tenbyte Cloud

How I Reduced Latency by 40% Using Regional Cloud Setup

The Performance Problem

Our e-commerce platform served 50,000 daily users across Bangladesh. The infrastructure was hosted in the AWS Asia Pacific (Singapore) region. Users reported slow page loads. Checkout abandonment rate: 28%. Customer support tickets about "website lag" increased 300% month over month.

Initial Performance Metrics (October 2024):

  • Average page load time: 3.8 seconds
  • Time to First Byte (TTFB): 1,200ms
  • Database query latency: 450ms
  • API response time: 890ms
  • Bounce rate: 42%
  • Conversion rate: 2.1%

Network analysis revealed the root cause: geographic distance. Singapore to Dhaka: 2,600 km physical distance. Minimum theoretical latency: 13ms one way (light in fiber travels at roughly 200,000 km/s). Real-world measurements: 85-120ms round-trip time.
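
The 13ms figure can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming light in fiber covers about 200,000 km/s and the path is a straight line; the function names are mine, not from any library.

```python
# Propagation-delay floor for a fiber path (a sketch, not a network model).
# Assumption: light in fiber covers about 200,000 km/s (roughly two thirds of c).
FIBER_SPEED_KM_PER_S = 200_000


def one_way_delay_ms(distance_km: float) -> float:
    """Lower bound on one-way latency over a straight fiber run."""
    return distance_km * 1000 / FIBER_SPEED_KM_PER_S


def round_trip_floor_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for the same path."""
    return 2 * one_way_delay_ms(distance_km)


singapore_dhaka_km = 2_600  # distance quoted in the article
print(one_way_delay_ms(singapore_dhaka_km))     # 13.0
print(round_trip_floor_ms(singapore_dhaka_km))  # 26.0
```

The measured 85-120ms sits well above the 26ms floor because real routes are longer than the straight line and add switching and queuing delay at every hop.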

Every HTTP request sequence:

  1. DNS resolution: 45ms
  2. TCP handshake (1.5 RTT): 128ms
  3. TLS negotiation (1 RTT): 95ms
  4. HTTP request/response: 890ms application processing + 85ms network
  5. Total: 1,243ms before first byte

Multiply across resources: 45 HTTP requests per page × 100ms average = 4,500ms of additional latency if fetched one after another. Browsers fetch in parallel, which hides part of that cost, so pages loaded in 3.8 seconds: tolerable, but never fast.
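
The budget above is plain addition, which makes it easy to re-run with your own measurements. A small sketch using only the figures listed in this section (the dictionary keys are labels I made up):

```python
# Per-request latency budget, using the figures listed above (all in ms).
phases_ms = {
    "dns_resolution": 45,
    "tcp_handshake": 128,
    "tls_negotiation": 95,
    "app_processing": 890,
    "request_response_network": 85,
}

ttfb_ms = sum(phases_ms.values())
network_only_ms = ttfb_ms - phases_ms["app_processing"]

# Worst case for a page: every resource fetched one after another.
requests_per_page = 45
avg_latency_per_request_ms = 100
sequential_penalty_ms = requests_per_page * avg_latency_per_request_ms

print(ttfb_ms)                # 1243
print(network_only_ms)        # 353
print(sequential_penalty_ms)  # 4500
```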

Infrastructure Audit and Measurement

Application Architecture (Before):

User (Dhaka) → 85ms → Singapore AWS
  ↓
Application Load Balancer (Singapore)
  ↓
EC2 t3.medium × 4 (Web tier)
  ↓ 2ms local network
EC2 m5.large × 3 (Application tier)
  ↓ 1ms local network
RDS PostgreSQL db.m5.xlarge (Database)
  ↓
ElastiCache Redis cluster (Cache)

Cost Structure (Monthly):

  • EC2 web tier: 4 × $30.37 = $121.48
  • EC2 app tier: 3 × $69.98 = $209.94
  • RDS database: $208.80
  • ElastiCache Redis: $45.36
  • Application Load Balancer: $23.76 + $12 LCU = $35.76
  • NAT Gateway: $32.85 + $45 data processing = $77.85
  • Data transfer: 8 TB × $0.09/GB = $737.28
  • EBS storage: 500 GB × $0.10 = $50
  • Total: $1,441.11 monthly (excluding ElastiCache; $1,486.47 with it)

Latency Breakdown by Component:

Measured using New Relic APM and custom instrumentation:

DNS Resolution (Route 53):

  • Dhaka → Singapore authoritative NS: 42-58ms
  • Cache miss penalty: 45ms average

TCP Connection Establishment:

  • SYN → SYN-ACK → ACK: 1.5 × 85ms RTT = 127.5ms
  • Connection pool reuse: 0ms (when available)
  • Connection pool exhaustion: Forces new connections during traffic spikes

TLS Handshake:

  • TLS 1.2 (2-RTT): 170ms
  • TLS 1.3 (1-RTT): 85ms
  • Session resumption (when ticket valid): 0ms extra only with TLS 1.3 0-RTT early data; a TLS 1.2 ticket still costs 1 RTT (85ms)
  • Session resumption rate: 62% (38% full handshake)

Application Response Time:

  • Simple page (homepage): 245ms processing + 85ms network = 330ms
  • Product listing (database query): 680ms + 85ms = 765ms
  • Checkout page (multiple API calls): 1,450ms + 85ms = 1,535ms

Database Query Latency:

  • SELECT queries: 180-350ms (includes 85ms network RTT)
  • Same query on local network: 12-28ms actual execution
  • Network penalty: 168-322ms wasted on geographic distance

Regional Migration Strategy

Target Infrastructure Decision:

Evaluated options:

  • AWS Mumbai (instead of AWS Singapore): 40ms latency improvement, but still about 1,900 km from Dhaka
  • Dedicated servers in Bangladesh: 80ms latency improvement, but added operational complexity
  • Regional cloud provider (Tenbyte): 85ms+ latency improvement, managed services included

Selected Tenbyte Cloud for:

  1. Dhaka data center: <15km from 65% of user base
  2. Transparent pricing: No separate NAT gateway or data transfer fees
  3. Similar managed services: Load balancing, VPC networking included
  4. Local ISP peering: BDIX connectivity with Grameenphone, Robi, Banglalink

New Architecture (Tenbyte Cloud - https://www.tenbyte.io/cloud-vm):

User (Dhaka) → 15ms → Dhaka Data Center
  ↓
Load Balancer (included)
  ↓
VM 4 vCPU, 8 GB × 4 (Web tier)
  ↓ 0.2ms local network
VM 8 vCPU, 16 GB × 3 (Application tier)
  ↓ 0.2ms local network
VM 8 vCPU, 32 GB × 1 (PostgreSQL Primary)
VM 8 vCPU, 32 GB × 1 (PostgreSQL Replica)
  ↓
VM 4 vCPU, 16 GB × 1 (Redis)

New Cost Structure (Monthly):

  • Web tier: 4 × $58.40 = $233.60
  • App tier: 3 × $87.60 = $262.80
  • Database primary: $160.60
  • Database replica: $160.60
  • Redis: $87.60
  • Load balancer: Included ($0)
  • NAT gateway: Included ($0)
  • Data transfer: 8 TB included ($0)
  • Storage: 500 GB × $0.08 = $40
  • Total: $945.20 monthly

Cost savings: $1,441.11 - $945.20 = $495.91 monthly (34% reduction)
Annual savings: $5,950.92

Migration Execution Timeline

Week 1 - Infrastructure Provisioning:

Day 1-2: Account setup, VPC design

  • Created VPC: 10.0.0.0/16 CIDR block
  • Subnets: Public (10.0.1.0/24), App (10.0.10.0/24), Database (10.0.20.0/24)
  • Security groups: Web (80, 443 inbound), App (8080 from web), DB (5432 from app)

Day 3-4: VM deployment via Terraform

resource "tenbyte_vm" "web" {
  count              = 4
  name               = "web-${count.index + 1}"
  plan               = "medium" # 4 vCPU, 8 GB RAM
  image              = "ubuntu-22.04-lts"
  vpc_id             = tenbyte_vpc.main.id
  subnet_id          = tenbyte_subnet.public.id
  security_group_ids = [tenbyte_security_group.web.id]
}

resource "tenbyte_vm" "database" {
  name      = "db-primary"
  plan      = "xlarge" # 8 vCPU, 32 GB RAM
  image     = "ubuntu-22.04-lts"
  vpc_id    = tenbyte_vpc.main.id
  subnet_id = tenbyte_subnet.database.id

  # Dedicated data volume for PostgreSQL
  volume {
    size = 200 # GB
    type = "ssd"
  }
}

Day 5: Database setup

  • PostgreSQL 14 installation
  • Replication configured from Singapore RDS (logical replication, since RDS does not offer physical streaming replication to servers outside RDS)
  • Initial data transfer via pg_dump: 45 GB database, 6 hours transfer time
  • Replication established: <2 second lag

Week 2 - Application Deployment:

Day 1-2: Application server configuration

  • Node.js 18 runtime environment
  • PM2 process manager for application clustering
  • Environment variables: Database connection strings, Redis endpoints, API keys
  • Health check endpoint: GET /health returns 200 OK

Day 3-4: Testing and validation

  • Load testing: JMeter 1,000 concurrent users
  • Database connection pooling: 20 connections per app server
  • Redis cache warming: Pre-populate product catalog (2.3 GB data)
  • Application response time: 85% of requests <200ms

Day 5: DNS preparation

  • Reduced TTL on www.example.com from 3600s to 300s (enables quick rollback)
  • Created CNAME record: www-new.example.com → tenbyte load balancer
  • Smoke testing via new hostname

Week 3 - Gradual Migration:

Monday: 10% traffic shift

  • Updated DNS: 10% weight to Tenbyte, 90% to AWS
  • Monitoring: New Relic, CloudWatch, Tenbyte dashboard
  • Error rate: Stable at 0.2%
  • Response time: 10% traffic seeing 1,100ms average (down from 1,850ms)

Wednesday: 25% traffic

  • Increased DNS weight to 25/75 split
  • Response time: 25% cohort averaging 950ms
  • Database replication lag: <1 second
  • No customer complaints

Friday: 50% traffic

  • DNS weight: 50/50 split
  • Cost monitoring: AWS data transfer fees decreasing proportionally
  • Performance: the 50% of users on Tenbyte experiencing 880ms average page load

Week 4 - Full Cutover:

Monday: 75% traffic to Tenbyte

  • DNS weight: 75/25
  • AWS traffic decreasing, RDS connections dropping
  • Application logs: No errors related to migration

Wednesday: 100% traffic migration

  • DNS weight: 100/0, full cutover to Tenbyte infrastructure
  • AWS infrastructure: Left running 24 hours for rollback capability
  • Monitoring: All metrics green

Friday: AWS decommission

  • Database final export for archival backup
  • EC2 instances terminated
  • RDS database deleted (final snapshot retained)
  • EBS volumes deleted
  • Migration complete
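
To get a feel for what each DNS weight means in practice, the staged split can be simulated. This is a rough sketch, assuming each lookup is answered independently according to the configured weight; it ignores resolver caching, which in reality makes the observed traffic share lag behind the weights. The function name and the lookup count are illustrative only.

```python
import random

# Percent of DNS answers that point at the new site in each migration stage.
STAGES = [10, 25, 50, 75, 100]


def simulate_split(new_weight_pct: int, lookups: int = 100_000, seed: int = 42) -> float:
    """Return the fraction of simulated lookups answered with the new site."""
    rng = random.Random(seed)
    answers = rng.choices(
        ["new", "old"],
        weights=[new_weight_pct, 100 - new_weight_pct],
        k=lookups,
    )
    return answers.count("new") / lookups


for pct in STAGES:
    print(f"{pct:>3}% weight -> {simulate_split(pct):.1%} of lookups hit the new site")
```

Rolling back a stage is the same operation in reverse: set the weight back and wait out the TTL.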

Performance Results

Latency Measurements (After Migration):

DNS Resolution:

  • Before: 45ms (Singapore authoritative nameservers)
  • After: 8ms (local DNS resolvers, Tenbyte nameservers in region)
  • Improvement: 82%

TCP Connection:

  • Before: 127.5ms (1.5 RTT × 85ms)
  • After: 22ms (1.5 RTT × 14.7ms local latency)
  • Improvement: 83%

TLS Handshake:

  • Before: 85-170ms (TLS 1.3/1.2)
  • After: 15-30ms
  • Improvement: 82%

Application Response:

  • Before: 890ms average
  • After: 245ms average (includes database query, cache lookup, rendering)
  • Improvement: 72%

Database Query:

  • Before: 450ms (end to end, including network round trips)
  • After: 18ms (local network 0.2ms, actual query execution time)
  • Improvement: 96%

Page Load Time Comparison:

Homepage:

  • Before: 2.1 seconds
  • After: 0.8 seconds
  • Improvement: 62%

Product Listing:

  • Before: 3.8 seconds
  • After: 1.4 seconds
  • Improvement: 63%

Checkout Page:

  • Before: 5.2 seconds
  • After: 2.1 seconds
  • Improvement: 60%

Overall Platform Metrics (30 days post migration):

Average Page Load Time:

  • Before: 3.8 seconds
  • After: 1.6 seconds
  • Improvement: 58%

Time to First Byte:

  • Before: 1,200ms
  • After: 185ms
  • Improvement: 85%

API Response Time (p95):

  • Before: 1,450ms
  • After: 380ms
  • Improvement: 74%

Bounce Rate:

  • Before: 42%
  • After: 28%
  • Improvement: 33% reduction

Conversion Rate:

  • Before: 2.1%
  • After: 3.4%
  • Improvement: 62% increase

Customer Complaints:

  • Before: 180 tickets/month about slowness
  • After: 12 tickets/month
  • Improvement: 93% reduction
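
All the percentages in this section come from the same two formulas. A minimal sketch for checking them against your own before/after numbers (helper names are mine):

```python
def improvement_pct(before: float, after: float) -> float:
    """Relative reduction in percent, for metrics where lower is better."""
    return (before - after) / before * 100


def increase_pct(before: float, after: float) -> float:
    """Relative growth in percent, for metrics where higher is better."""
    return (after - before) / before * 100


# Figures taken from the lists above.
print(round(improvement_pct(1200, 185)))  # TTFB in ms -> 85
print(round(improvement_pct(3.8, 1.6)))   # page load in s -> 58
print(round(improvement_pct(42, 28)))     # bounce rate -> 33
print(round(increase_pct(2.1, 3.4)))      # conversion rate -> 62
```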

Business Impact Analysis

Revenue Impact:

Conversion rate improvement: 2.1% → 3.4% (+1.3 percentage points)

Monthly transactions:

  • Before: 50,000 visitors × 2.1% = 1,050 transactions
  • After: 50,000 visitors × 3.4% = 1,700 transactions
  • Additional: 650 transactions monthly

Average order value: BDT 2,500 (approximately $23 USD)
Additional monthly revenue: 650 × BDT 2,500 = BDT 1,625,000 ($15,000 USD)
Annual revenue increase: BDT 19,500,000 ($180,000 USD)

Cost-Benefit Analysis:

Infrastructure cost reduction: $495.91 monthly savings
Revenue increase: $15,000 monthly additional revenue
Total monthly benefit: $15,495.91

Migration costs:

  • Engineer time: 160 hours × $50/hour = $8,000
  • Testing and validation: $2,000
  • Overlap period (running both): $1,441.11 × 0.5 months = $720.56
  • Total migration cost: $10,720.56

ROI calculation:

  • Payback period: $10,720.56 / $15,495.91 = 0.69 months
  • First year benefit: ($15,495.91 × 12) - $10,720.56 = $175,230.36
  • ROI: roughly 1,635% first year return
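
The ROI arithmetic is short enough to script, which helps when the inputs change. A sketch that recomputes payback, first-year net benefit and ROI from the inputs listed above (variable names are mine):

```python
# Inputs taken from the cost-benefit figures above (USD).
monthly_savings = 495.91
monthly_revenue_gain = 15_000.00
migration_cost = 8_000.00 + 2_000.00 + 720.56  # engineering + testing + overlap

monthly_benefit = monthly_savings + monthly_revenue_gain
payback_months = migration_cost / monthly_benefit
first_year_net = monthly_benefit * 12 - migration_cost
roi_pct = first_year_net / migration_cost * 100

print(f"Monthly benefit: ${monthly_benefit:,.2f}")
print(f"Payback period:  {payback_months:.2f} months")
print(f"First-year net:  ${first_year_net:,.2f}")
print(f"First-year ROI:  {roi_pct:.0f}%")
```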

Operational Improvements:

Deployment speed:

  • Before: 45 minutes average (Singapore region, slower network)
  • After: 8 minutes average (local network, faster instance provisioning)
  • Improvement: 82% faster deployments

Developer productivity:

  • Local development mirrors production latency characteristics
  • Faster testing cycles (no 85ms penalty on each API call)
  • Improved debugging (network timeout issues eliminated)

Customer support:

  • 93% reduction in slowness related tickets
  • Support team redirected to higher value activities
  • Customer satisfaction score: 6.8 → 8.4 (out of 10)

Technical Lessons Learned

DNS TTL Management:

Critical for safe migration. Reduced TTL from 3600s to 300s one week before cutover. Enabled rapid rollback capability if issues emerged. Post-migration, gradually increased back to 1800s for caching efficiency while maintaining rollback window.

Database Replication Strategy:

PostgreSQL replication from Singapore to Dhaka worked reliably. Lag remained <2 seconds throughout the migration. Post-cutover, the Dhaka replica VM took over as standby, and the final RDS snapshot was retained for disaster recovery once the RDS instance was deleted. Cross-region replication cost: minimal compared to benefits.

Application Connection Pooling:

Essential for database performance. Configured PgBouncer with:

  • Pool mode: Transaction (releases connection after each transaction)
  • Max client connections: 200 per app server
  • Database pool size: 20 connections
  • Pool timeout: 30 seconds

Without proper pooling: Database connection exhaustion occurred during load testing.
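
Why pooling matters shows up in simple connection arithmetic. A sketch under two assumptions the article does not state: one PgBouncer per app server, and PostgreSQL left at its default max_connections of 100.

```python
APP_SERVERS = 3                # application tier size from the architecture
MAX_CLIENTS_PER_SERVER = 200   # PgBouncer client limit per app server
POOL_SIZE_PER_SERVER = 20      # server-side connections per pool
PG_MAX_CONNECTIONS = 100       # PostgreSQL default; an assumption here


def backend_connections(app_servers: int, per_server: int) -> int:
    """Connections PostgreSQL would have to hold open."""
    return app_servers * per_server


without_pooling = backend_connections(APP_SERVERS, MAX_CLIENTS_PER_SERVER)
with_pooling = backend_connections(APP_SERVERS, POOL_SIZE_PER_SERVER)

print(without_pooling, without_pooling > PG_MAX_CONNECTIONS)  # 600 True
print(with_pooling, with_pooling > PG_MAX_CONNECTIONS)        # 60 False
```

In transaction mode the 200 clients share the 20 server connections, so the database sees a stable load even during traffic spikes.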

Load Balancer Health Checks:

Configuration:

  • Protocol: HTTP
  • Path: /health
  • Interval: 10 seconds
  • Timeout: 5 seconds
  • Healthy threshold: 2 consecutive successes
  • Unhealthy threshold: 3 consecutive failures

Mistake: Initially configured a 30-second interval, so instance failures took 90 seconds to detect (3 × 30s). Reducing the interval to 10 seconds cut failure detection to 30 seconds (3 × 10s).
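
The threshold logic behind those numbers can be modeled in a few lines. This is a simplified sketch of consecutive-success/failure tracking, not the load balancer's actual implementation; the class and function names are hypothetical.

```python
from dataclasses import dataclass


def detection_time_s(interval_s: int, unhealthy_threshold: int) -> int:
    """Approximate time to mark an instance unhealthy after it starts failing."""
    return interval_s * unhealthy_threshold


@dataclass
class HealthTracker:
    healthy_threshold: int = 2    # consecutive successes to return to service
    unhealthy_threshold: int = 3  # consecutive failures to leave service
    healthy: bool = True
    _streak: int = 0              # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed one check result and return the resulting health state."""
        if check_passed == self.healthy:
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= needed:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


print(detection_time_s(30, 3))  # 90
print(detection_time_s(10, 3))  # 30
```

Shorter intervals detect failures sooner but add check traffic, so the interval is a trade-off rather than something to minimize blindly.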

Monitoring and Alerting:

Implemented comprehensive monitoring:

  • Application metrics: Response time, error rate, throughput (requests/sec)
  • Infrastructure metrics: CPU utilization, memory usage, disk I/O
  • Database metrics: Connection count, query time, replication lag
  • Network metrics: Bandwidth usage, packet loss, latency

Critical alerts:

  • Response time p95 >500ms: Page alert
  • Error rate >1%: Page alert
  • Database replication lag >5 seconds: Email alert
  • CPU >85% for 5 minutes: Email alert
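
The alert thresholds above translate directly into a rule table. A minimal sketch, assuming metrics arrive as a flat dictionary; the metric key names are hypothetical, and the "for 5 minutes" condition is modeled by feeding in the minimum CPU seen over the last 5 minutes.

```python
# (metric key, threshold, notification channel)
RULES = [
    ("response_time_p95_ms", 500, "page"),
    ("error_rate_pct", 1.0, "page"),
    ("replication_lag_s", 5, "email"),
    ("cpu_pct_min_over_5m", 85, "email"),  # sustained CPU: lowest value in the window
]


def evaluate(metrics: dict) -> list:
    """Return (metric, channel) pairs for every rule whose threshold is exceeded."""
    return [
        (key, channel)
        for key, threshold, channel in RULES
        if metrics.get(key, 0) > threshold
    ]


sample = {
    "response_time_p95_ms": 380,
    "error_rate_pct": 0.2,
    "replication_lag_s": 6,
    "cpu_pct_min_over_5m": 40,
}
print(evaluate(sample))  # [('replication_lag_s', 'email')]
```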

Cost Optimization Strategies:

Rightsize VMs based on actual utilization:

  • Initial: Sized VMs to cover the AWS instance types they replaced
  • After 30 days: Analyzed CPU/memory utilization
  • Result: Reduced app tier from 8 vCPU to 4 vCPU on 2 instances (underutilized)
  • Additional monthly savings: $58.40

Storage optimization:

  • Enabled automated snapshots: Daily at 2 AM, 7-day retention
  • Cost: 500 GB volume × 20% change rate × 7 days × $0.04/GB-month = $28 monthly
  • Versus: AWS snapshots at $0.05/GB-month came to $175 monthly for similar retention (3,500 GB of stored snapshot data)
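
The snapshot estimate follows the incremental model in the formula above: only changed data is stored for each retained day. A sketch of that calculation (function and parameter names are mine; actual billing depends on the provider):

```python
def snapshot_cost(
    volume_gb: float,
    daily_change_rate: float,
    retention_days: int,
    price_per_gb_month: float,
) -> float:
    """Monthly cost of retained incremental snapshots, per the formula above."""
    stored_gb = volume_gb * daily_change_rate * retention_days
    return stored_gb * price_per_gb_month


cost = snapshot_cost(500, 0.20, 7, 0.04)
print(f"${cost:.2f} per month")  # $28.00 per month
```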

Geographic Performance Distribution

User Location Analysis (Google Analytics):

Dhaka users (65% of traffic):

  • Before: 3.9 seconds average page load
  • After: 1.2 seconds average
  • Improvement: 69%

Chittagong users (18% of traffic):

  • Before: 4.1 seconds average
  • After: 1.8 seconds average (network path Chittagong → Dhaka 25ms)
  • Improvement: 56%

Sylhet users (8% of traffic):

  • Before: 4.3 seconds average
  • After: 2.1 seconds average (longer fiber route)
  • Improvement: 51%

International users (9% of traffic):

  • Before: 2.8 seconds average (already closer to Singapore)
  • After: 3.2 seconds average (now farther from Dhaka)
  • Degradation: 14% slower

Trade-off analysis: 91% of users experienced dramatic improvement. The 9% of international users experienced minor degradation. Business decision: Optimize for the primary market (Bangladesh users).
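
A traffic-weighted average puts the regional numbers into one figure and makes the trade-off explicit. A sketch using the shares and load times listed above:

```python
# (segment, share of traffic, seconds before, seconds after)
SEGMENTS = [
    ("Dhaka", 0.65, 3.9, 1.2),
    ("Chittagong", 0.18, 4.1, 1.8),
    ("Sylhet", 0.08, 4.3, 2.1),
    ("International", 0.09, 2.8, 3.2),
]


def weighted_average(index: int) -> float:
    """Average page load weighted by traffic share (index 2 = before, 3 = after)."""
    return sum(segment[1] * segment[index] for segment in SEGMENTS)


before = weighted_average(2)
after = weighted_average(3)
print(f"before: {before:.2f}s, after: {after:.2f}s")  # before: 3.87s, after: 1.56s
```

The weighted result lines up with the platform-wide averages reported earlier (3.8 and 1.6 seconds), and it shows that the slower international segment barely moves the overall number.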

Future consideration: CDN implementation for static assets would serve international users from nearby edge locations while maintaining database in Dhaka for primary market.

Recommendations for Similar Migrations

Identify Geographic User Concentration:

Use analytics to determine user distribution:

  • Google Analytics: Audience → Geo → Location
  • Cloudflare Analytics: Traffic distribution by country
  • Application logs: Parse IP addresses, geolocate via MaxMind database

If more than 70% of users are concentrated in a specific region distant from the current infrastructure, a regional migration is likely beneficial.

Measure Current Latency Baseline:

Tools for measurement:

  • Pingdom: Multi-location synthetic monitoring
  • WebPageTest: Waterfall analysis from target locations
  • New Relic: Real User Monitoring from actual user devices
  • Custom: curl -w "@curl-format.txt" -o /dev/null -s https://example.com

Calculate network latency component:

  • Total response time - Application processing time = Network overhead
  • If network overhead >50% of total time: Geographic distance likely culprit
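
curl reports cumulative timings, so each phase is the difference between two counters. A sketch, assuming the format file emits curl's standard write-out variables (time_namelookup, time_connect, time_appconnect, time_starttransfer, time_total) in seconds; the sample values and the processing time passed to the overhead check are hypothetical.

```python
def phase_breakdown_ms(t: dict) -> dict:
    """Convert curl's cumulative timings (seconds) into per-phase durations (ms)."""
    return {
        "dns": t["time_namelookup"] * 1000,
        "tcp_connect": (t["time_connect"] - t["time_namelookup"]) * 1000,
        "tls": (t["time_appconnect"] - t["time_connect"]) * 1000,
        "wait_first_byte": (t["time_starttransfer"] - t["time_appconnect"]) * 1000,
        "download": (t["time_total"] - t["time_starttransfer"]) * 1000,
    }


def network_overhead(total_ms: float, app_processing_ms: float):
    """Return (overhead in ms, overhead share of total, True if above the 50% rule of thumb)."""
    overhead = total_ms - app_processing_ms
    share = overhead / total_ms
    return overhead, share, share > 0.5


# Hypothetical sample in curl's units (seconds, cumulative from request start).
sample = {
    "time_namelookup": 0.045,
    "time_connect": 0.173,
    "time_appconnect": 0.268,
    "time_starttransfer": 1.243,
    "time_total": 1.300,
}
for phase, ms in phase_breakdown_ms(sample).items():
    print(f"{phase:>16}: {ms:7.1f} ms")

# Hypothetical request: 400ms total, of which 150ms is server-side processing.
print(network_overhead(400, 150))  # (250, 0.625, True)
```

The application processing time has to come from server-side instrumentation (APM or access logs), since curl only sees the client side.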

Evaluate Provider Options:

Criteria for regional cloud selection:

  1. Data center location: <100km from user concentration = <10ms latency
  2. ISP peering: Direct connections with major local carriers
  3. Pricing transparency: All inclusive versus hidden fee model
  4. Compliance: Data residency requirements for regulated industries
  5. Support: Local timezone coverage, language, technical expertise

Tenbyte advantages for Bangladesh/Malaysia:

  • Data centers in Dhaka, Chittagong, Kuala Lumpur, Cyberjaya
  • BDIX, MyIX peering with Grameenphone, Robi, Banglalink, Time, Maxis
  • Transparent pricing: https://www.tenbyte.io/
  • 24/7 support in regional timezones

Plan Gradual Migration:

Never do a big-bang cutover on production systems. Phased approach:

  1. Provision parallel infrastructure (Week 1-2)
  2. Establish data replication (Week 2-3)
  3. Deploy application, validate functionality (Week 3)
  4. Gradual DNS traffic shift: 10% → 25% → 50% → 75% → 100% (Week 4)
  5. Monitor intensively during each phase
  6. Maintain rollback capability until 100% stable

Optimize Post Migration:

The migration ends at cutover. Optimization never stops:

  • Monitor resource utilization weekly
  • Rightsize VMs based on actual usage
  • Implement caching aggressively (Redis, CDN)
  • Database query optimization (indexes, query plans)
  • Enable compression (gzip, Brotli for text content)
  • Image optimization (WebP, responsive sizing, lazy loading)

Each 10% performance improvement compounds business impact.
