HelperX

Posted on Jun 7 • Originally published at helperx.app

Residential Proxies for Web Automation: What We Learned Running 500+ Accounts

#webdev #devops #infrastructure #automation

If you're building automation for platforms with anti-abuse systems, proxy infrastructure is the unglamorous foundation that determines whether your system works or gets every account flagged.

At HelperX, every X account runs through its own residential proxy. After managing 500+ concurrent proxy connections, here's what we've learned about proxy types, rotation, failure handling, and cost optimization.

Why residential proxies

Three types of proxies exist. Only two of them hold up for social media automation.

Datacenter proxies ($1-3/month)
IPs allocated to data centers (AWS, DigitalOcean, Hetzner). Platforms maintain blocklists of datacenter IP ranges. X blocks most of them within hours. Fast and cheap, but useless for accounts you want to keep.

Residential proxies ($5-15/GB)
IPs assigned to real residential internet connections (ISPs like Comcast, Vodafone, BT). They look like real users because they are real user IPs — routed through opt-in networks. Platforms can't bulk-block them without blocking real users.

Mobile proxies ($20-50/month)
IPs from mobile carriers (4G/5G). Shared among hundreds of real users via carrier-grade NAT. Extremely hard to block — carriers rotate IPs constantly. The gold standard, but expensive and sometimes slow.

Our choice: residential proxies for the balance of cost, reliability, and detection resistance. Mobile for high-value accounts where the budget allows it.

One proxy per account

This is non-negotiable. Every slot in HelperX requires its own proxy. No sharing.

Why? X correlates activity by IP. If two accounts:

Log in from the same IP within the same hour
Perform similar actions (replies, follows) with similar timing
Target the same types of accounts

...X links them as coordinated accounts. Both get flagged.

Even if the accounts are owned by different people, managed by different operators, with completely different content — same IP = coordination signal.

function validateProxy(slotId, proxyAddress) {
  // Check if any other active slot uses this proxy
  const existing = db.prepare(
    'SELECT slot_id FROM slots WHERE proxy_address = ? AND slot_id != ? AND active = 1'
  ).get(proxyAddress, slotId);

  if (existing) {
    throw new Error(
      `Proxy already in use by slot ${existing.slot_id}. Each slot needs its own proxy.`
    );
  }
}

We enforce this at the system level. Any attempt to assign a proxy that another active slot already uses is rejected.

Proxy verification

Before a slot can start any module, we verify the proxy:

async function verifyProxy(proxyConfig) {
  const checks = {
    connectivity: false,
    externalIp: null,
    speed: null,
    type: null,
    geo: null
  };

  // 1. Can we connect at all?
  try {
    const response = await fetchViaProxy('https://httpbin.org/ip', proxyConfig);
    checks.connectivity = true;
    checks.externalIp = response.origin;
  } catch (e) {
    return { valid: false, reason: 'connection_failed', checks };
  }

  // 2. Response time acceptable?
  const start = Date.now();
  try {
    await fetchViaProxy('https://httpbin.org/get', proxyConfig);
  } catch (e) {
    // The proxy dropped between the two requests
    return { valid: false, reason: 'connection_failed', checks };
  }
  checks.speed = Date.now() - start;

  if (checks.speed > 10000) {
    return { valid: false, reason: 'too_slow', checks };
  }

  // 3. Is it actually residential?
  const ipInfo = await lookupIp(checks.externalIp);
  checks.type = ipInfo.type; // residential, datacenter, mobile
  checks.geo = ipInfo.country;

  if (checks.type === 'datacenter') {
    // Warn, but don't block: the proxy stays valid and the operator sees the message
    return {
      valid: true,
      warning: 'datacenter_ip_detected',
      checks,
      message: 'Datacenter IPs are flagged by X. Use a residential proxy.'
    };
  }

  return { valid: true, checks };
}

We check three things:

Connectivity: does the proxy actually work?
Speed: is it fast enough for real-time automation? (a response slower than 10 s fails the check)
Type: is it residential/mobile, or datacenter?

We warn (but don't block) if a datacenter proxy is detected. Some operators have specific setups where datacenter proxies work for their use case. But the default recommendation is residential only.

Connection handling

Residential proxies are less reliable than direct connections. ISP routes change, residential gateways go offline, provider pools get exhausted. You need robust connection handling.

Retry with backoff

async function fetchWithProxy(url, proxyConfig, options = {}) {
  const maxRetries = options.retries || 3;
  const baseDelay = options.baseDelay || 2000;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await makeRequest(url, proxyConfig, {
        timeout: options.timeout || 15000
      });
    } catch (error) {
      if (!isProxyError(error)) {
        throw error; // Non-proxy error, don't retry
      }
      // Back off only when another attempt follows; sleeping after the
      // final failure would just delay the error below
      if (attempt < maxRetries - 1) {
        await sleep(baseDelay * Math.pow(2, attempt));
      }
    }
  }

  throw new ProxyConnectionError(
    `Failed after ${maxRetries} attempts via ${proxyConfig.host}`
  );
}
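The retry loop relies on two helpers, `sleep` and `isProxyError`, that are not shown. The sketch below is one plausible shape for them, not the HelperX implementation: the list of error codes, the HTTP statuses, and the `backoffDelay` function with its optional jitter are all assumptions made for illustration.

```javascript
// Hypothetical helpers; names match the calls in the retry loop above.

// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Node.js system error codes that usually point at the network path
// (and therefore the proxy) rather than at the request itself.
const RETRYABLE_CODES = new Set([
  'ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'EHOSTUNREACH', 'EPIPE'
]);

// Assumption: errors carry either a Node-style `code` or an HTTP `status`.
function isProxyError(error) {
  if (error && RETRYABLE_CODES.has(error.code)) return true;
  // 502 Bad Gateway and 504 Gateway Timeout typically come from the proxy hop
  return [502, 504].includes(error && error.status);
}

// Exponential backoff: baseDelay * 2^attempt. `jitter` (0..1) adds up to that
// fraction of the delay at random so many slots don't retry in lockstep.
function backoffDelay(attempt, baseDelay = 2000, jitter = 0, random = Math.random) {
  const delay = baseDelay * Math.pow(2, attempt);
  return Math.round(delay + delay * jitter * random());
}
```

With the default 2-second base and no jitter, attempt indices 0, 1 and 2 map to waits of 2 s, 4 s and 8 s.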

Fail-safe: stop, don't switch

When a proxy fails permanently, the module stops. It does not:

Fall back to a direct connection (exposing the server IP)
Borrow another slot's proxy (creating IP correlation)
Retry indefinitely (wasting resources)

async function handleProxyFailure(slotId, error) {
  // Stop all modules for this slot
  await stopAllModules(slotId);

  // Log the failure
  log(slotId, 'proxy_failure', {
    error: error.message,
    action: 'modules_stopped'
  });

  // Notify the operator
  await notifyOperator(slotId,
    'Proxy connection failed. Modules stopped. Please check your proxy settings.'
  );
}

This is a safety-over-convenience decision. If the proxy fails and we fall back to a direct connection, the account suddenly appears from a different IP, potentially a datacenter IP on our server. That's worse than stopping.

Supported proxy formats

Operators use various proxy providers with different URL formats. We normalize them:

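const DEFAULT_PORTS = { http: 80, https: 443 };

function parseProxy(input) {
  // Format: protocol://user:pass@host:port
  // Format: host:port:user:pass
  // Format: host:port (no auth)
  const value = input.trim();

  // URL format. Require "://" explicitly: new URL('proxy.example.com:8080')
  // does not throw, it treats "proxy.example.com" as the scheme.
  if (value.includes('://')) {
    try {
      const url = new URL(value);
      const protocol = url.protocol.replace(':', '');
      return {
        protocol,
        host: url.hostname,
        // url.port is '' when the port is omitted or is the scheme default
        port: url.port ? parseInt(url.port, 10) : (DEFAULT_PORTS[protocol] ?? null),
        // URL keeps credentials percent-encoded, so decode them
        username: url.username ? decodeURIComponent(url.username) : null,
        password: url.password ? decodeURIComponent(url.password) : null
      };
    } catch (e) {
      throw new Error('Unrecognized proxy format');
    }
  }

  // Colon-separated formats: host:port or host:port:user:pass
  const parts = value.split(':');
  if (parts.length === 2 || parts.length === 4) {
    const port = parseInt(parts[1], 10);
    if (!Number.isNaN(port)) {
      return {
        protocol: 'http',
        host: parts[0],
        port,
        username: parts[2] || null,
        password: parts[3] || null
      };
    }
  }

  throw new Error('Unrecognized proxy format');
}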

Supporting multiple formats reduces support tickets by ~30%. Operators copy-paste from their provider's dashboard and it just works.
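Once every input is reduced to the same `{ protocol, host, port, username, password }` shape, the rest of the system only needs one serializer. As a rough sketch, a hypothetical `toProxyUrl` helper (not part of the code above) could turn that shape back into a URL string for an HTTP client, percent-encoding the credentials so characters like `@` or `:` in a password don't break the URL:

```javascript
// Hypothetical helper: serialize a normalized proxy config to a URL string.
// Assumes the object shape returned by parseProxy.
function toProxyUrl({ protocol, host, port, username, password }) {
  let auth = '';
  if (username) {
    // Credentials must be percent-encoded inside a URL
    auth = encodeURIComponent(username);
    if (password) auth += ':' + encodeURIComponent(password);
    auth += '@';
  }
  return `${protocol}://${auth}${host}:${port}`;
}

console.log(toProxyUrl({
  protocol: 'http', host: '203.0.113.7', port: 8080,
  username: 'user', password: 'p@ss'
}));
// prints "http://user:p%40ss@203.0.113.7:8080"
```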

Proxy credential security

Proxy credentials (username/password) are sensitive — they grant access to paid proxy bandwidth and can be used to identify the operator.

We encrypt proxy credentials with the same AES-256-GCM scheme we use for auth tokens:

Encrypted at rest in the database
Decrypted at runtime when making requests
Never logged in plaintext
Stored per-slot, isolated from other slots
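For readers who haven't used AES-256-GCM from Node.js, the following is a minimal sketch of the encrypt-at-rest, decrypt-at-runtime cycle using the built-in `crypto` module. It is an illustration, not HelperX's actual scheme: the function names, the storage layout (nonce, tag and ciphertext concatenated and base64-encoded), and the way the 32-byte key is supplied are all assumptions.

```javascript
const nodeCrypto = require('crypto');

// Assumption: `key` is a 32-byte Buffer loaded from a secret store,
// never from the same database that holds the ciphertext.
function encryptCredential(plaintext, key) {
  const iv = nodeCrypto.randomBytes(12); // 96-bit nonce, fresh for every encryption
  const cipher = nodeCrypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16-byte authentication tag
  // Store nonce + tag + ciphertext together as one base64 string
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

function decryptCredential(encoded, key) {
  const data = Buffer.from(encoded, 'base64');
  const iv = data.subarray(0, 12);
  const tag = data.subarray(12, 28);
  const ciphertext = data.subarray(28);
  const decipher = nodeCrypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the ciphertext or tag has been modified
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Because GCM is authenticated, a corrupted or tampered database value fails loudly on decryption instead of yielding garbage credentials.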

Cost optimization

Residential proxies are billed per GB of traffic. Social media automation is text-heavy (low bandwidth) but connection-heavy (many small requests).

Our traffic profile per slot:

Average daily bandwidth: 15-30 MB
Monthly bandwidth per slot: 0.5-1 GB
At $5-10/GB: $2.50-$10/month per slot
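The arithmetic behind those figures is simple enough to script when comparing providers. This is a small sketch with hypothetical function names, assuming a 30-day month and 1 GB = 1000 MB (which is why 15-30 MB/day comes out at 0.45-0.9 GB, rounded above to 0.5-1 GB):

```javascript
// Assumptions: 30-day month, 1 GB = 1000 MB.
function monthlyGb(dailyMb, days = 30) {
  return (dailyMb * days) / 1000;
}

// Cost of one slot for a month at a given per-GB price.
function monthlyCost(gb, pricePerGb) {
  return gb * pricePerGb;
}

console.log(monthlyGb(15), monthlyGb(30));            // prints "0.45 0.9"
console.log(monthlyCost(0.5, 5), monthlyCost(1, 10)); // prints "2.5 10"
```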

Optimization techniques:

Minimize media downloads — don't load images in API responses unless needed
Reuse connections — keep-alive connections reduce TLS handshake overhead
Compress where possible — accept gzip/brotli encoding
Cache static data — user profiles don't change every request

These optimizations cut bandwidth by ~40% compared to a naive implementation.
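Of the four techniques, caching is the easiest to show in isolation. The sketch below is a minimal time-to-live cache, assuming profile data can tolerate being a few minutes stale. The class, the `getProfile` wrapper and the `fetchProfile` loader are hypothetical names, and the clock is injectable only so the expiry logic can be checked without waiting.

```javascript
// Minimal TTL cache. `now` defaults to Date.now and can be replaced in tests.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Go through the cache first; only a miss costs proxy bandwidth.
// `fetchProfile` is a hypothetical async loader supplied by the caller.
async function getProfile(cache, userId, fetchProfile) {
  const cached = cache.get(userId);
  if (cached !== undefined) return cached;
  const profile = await fetchProfile(userId);
  cache.set(userId, profile);
  return profile;
}
```

Each slot would hold its own cache instance, which keeps the per-slot isolation described earlier.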

Lessons from 500+ concurrent connections

1. Proxy quality varies by provider and time of day.
The same provider can be great at 2 PM and terrible at 2 AM when their residential pool shrinks. Monitor response times and success rates per provider.

2. Geo-matching matters.
An account that's "located" in New York but routes through a London proxy is suspicious. Match proxy geography to the account's claimed location.

3. Sticky sessions are worth the premium.
Rotating IPs on every request looks more suspicious than a consistent IP. Use sticky sessions (same IP for hours/days) when your provider offers them.

4. Have a backup provider.
Residential proxy providers have outages. When your primary goes down, you need a secondary that's already configured and tested. We don't auto-failover (see fail-safe above), but we make it easy for operators to switch.

5. Monitor before your users complain.
A proxy that returns 200 but with 8-second latency is technically working but practically broken. Set latency alerts, not just availability alerts.
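Lessons 1 and 5 both come down to tracking success rate and latency per provider. One way to do that is a rolling window of recent request outcomes, sketched below. The class name, window size and alert thresholds are illustrative assumptions, not values from our production setup.

```javascript
// Rolling window of recent request outcomes for one provider.
class ProviderStats {
  constructor(windowSize = 100) {
    this.windowSize = windowSize;
    this.samples = []; // { ok: boolean, latencyMs: number }
  }

  record(ok, latencyMs) {
    this.samples.push({ ok, latencyMs });
    if (this.samples.length > this.windowSize) this.samples.shift();
  }

  successRate() {
    if (this.samples.length === 0) return 1;
    return this.samples.filter((s) => s.ok).length / this.samples.length;
  }

  // Nearest-rank percentile over successful requests only
  latencyPercentile(p) {
    const sorted = this.samples
      .filter((s) => s.ok)
      .map((s) => s.latencyMs)
      .sort((a, b) => a - b);
    if (sorted.length === 0) return null;
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  // Thresholds are assumptions; tune them per provider.
  alerts({ minSuccessRate = 0.95, maxP95Ms = 5000 } = {}) {
    const result = [];
    if (this.successRate() < minSuccessRate) result.push('low_success_rate');
    const p95 = this.latencyPercentile(95);
    if (p95 !== null && p95 > maxP95Ms) result.push('high_latency');
    return result;
  }
}
```

A provider whose requests all succeed but take 8 seconds would pass an availability check and still raise `high_latency` here, which is the case lesson 5 warns about.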

HelperX requires one residential proxy per X account. Proxy credentials are AES-256-GCM encrypted, verified on setup, and isolated per slot. Safe automation starts with clean infrastructure.
