DEV Community

Todd H. Gardner for TrackJS


The Hidden Cost of Silent API Failures in Production

The checkout flow worked perfectly in staging. All the tests passed. The team celebrated shipping on time. Three weeks later, you're in an emergency meeting explaining how you lost $50,000 in revenue because nobody knew the payment API was returning HTML error pages instead of JSON.

This is a true story, with the numbers rounded to protect the team involved.

The Perfect Storm Nobody Saw Coming

Here's what happened: the team's payment processor started having intermittent API issues. Nothing major, just occasional 503 errors under high load. The kind of thing that happens to every API eventually.

But instead of returning a JSON error response, the payment gateway's load balancer served its default HTML maintenance page. The frontend code tried to parse this HTML as JSON and threw an Unexpected token '<' error, which a poorly written try-catch block silently swallowed.

// The code that cost $50,000
try {
  const response = await fetch('/api/process-payment');
  const data = await response.json(); // 💥 Dies here when HTML returned

  if (data.success) {
    showSuccessMessage();
  } else {
    showErrorMessage(data.error);
  }
} catch (e) {
  // Developer assumed this would only catch network errors
  console.log('Network error, will retry...');
  // But it also caught JSON parsing errors
  // Customer sees nothing, assumes payment worked
}

Customers would click "Complete Purchase," see a spinner, then... nothing. No error message. No success message. Many assumed it worked and left. Others tried multiple times, creating duplicate charges when the API recovered. Some gave up and bought from a competitor.

The worst part? This ran for three weeks before anyone noticed.

Why Silent Failures Are Your Biggest Threat

Loud failures are easy. Database down? Customers see a maintenance page. Everyone knows something's wrong.

Silent failures are insidious. They look like success. But revenue is quietly bleeding out.

1. Mismatched Content Types

Your code expects JSON, but gets:

  • HTML error pages from load balancers
  • XML from legacy systems
  • Plain text from misconfigured endpoints
  • HTML login pages from expired auth
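One way to make these cases visible is to bucket the Content-Type header before any parsing happens, so unexpected formats can be counted and logged. This is a minimal sketch; `classifyContentType` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: bucket a Content-Type header value so that
// unexpected formats can be logged and counted instead of parsed blindly.
function classifyContentType(headerValue) {
  if (!headerValue) return 'missing';

  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mediaType = headerValue.split(';')[0].trim().toLowerCase();

  if (mediaType === 'application/json' || mediaType.endsWith('+json')) return 'json';
  if (mediaType === 'text/html') return 'html';
  if (
    mediaType === 'application/xml' ||
    mediaType === 'text/xml' ||
    mediaType.endsWith('+xml')
  ) {
    return 'xml';
  }
  if (mediaType === 'text/plain') return 'text';
  return 'other';
}

console.log(classifyContentType('application/json; charset=utf-8')); // json
console.log(classifyContentType('text/html')); // html
```

Anything other than `json` is a signal worth recording, even if the request "succeeded."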

2. Overly Optimistic Error Handling

// What junior devs write
try {
  return await apiCall();
} catch {
  return defaultValue; // "It'll be fine"
}

// What senior devs write after being burned
let response; // declared outside try so the catch block can read it
try {
  response = await apiCall();

  // Validate EVERYTHING
  if (!response.ok) {
    throw new Error(`API returned ${response.status}`);
  }

  const contentType = response.headers.get('content-type');
  if (!contentType?.includes('application/json')) {
    throw new Error(`Expected JSON, got ${contentType}`);
  }

  return await response.json();
} catch (error) {
  // Log to monitoring service
  errorReporter.captureException(error, {
    endpoint: '/api/endpoint',
    status: response?.status,
    contentType: response?.headers.get('content-type')
  });

  // User sees actual error
  showUserError('Payment processing temporarily unavailable');
  throw error; // Re-throw to prevent silent failure
}

3. Missing Observability Layers

Most teams monitor infrastructure metrics but miss application-level failures:

// Infrastructure says everything is fine
 Server: 200 OK
 Response time: 145ms  
 Memory usage: 62%

// But the actual response was:
<!DOCTYPE html>
<html>
<head><title>503 Service Unavailable</title></head>
<body>Maintenance in progress</body>
</html>

// Which caused:
 JSON parsing failed
 Payment flow broken
 Customer charged but no order created
 $50,000 in lost revenue
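An application-level check looks at what actually came back, not just whether the server answered. Here's a rough sketch of that idea; `findResponseProblems` and its input shape are assumptions made up for illustration.

```javascript
// Hypothetical application-level check: a healthy status code alone
// is not treated as success.
function findResponseProblems({ status, contentType, body }) {
  const problems = [];

  if (status < 200 || status >= 300) {
    problems.push(`unexpected status ${status}`);
  }
  if (!contentType || !contentType.toLowerCase().includes('json')) {
    problems.push(`unexpected content-type ${contentType || 'none'}`);
  }
  // Catches bodies that are markup even when the headers claim otherwise.
  if (/^\s*</.test(body)) {
    problems.push('body looks like markup, not JSON');
  }
  return problems;
}

// A "200 OK" that is really a maintenance page is flagged twice.
const problems = findResponseProblems({
  status: 200,
  contentType: 'text/html',
  body: '<!DOCTYPE html><html>...</html>'
});
console.log(problems.length); // 2
```

Feeding the length of that list into your monitoring gives you a metric that infrastructure dashboards won't show.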

The True Cost of Silent Failures

Let's do the math on that $50,000 loss:

const impactAnalysis = {
  directLoss: {
    failedTransactions: 823,
    averageOrderValue: 47.50,
    lostRevenue: 39_092.50
  },

  indirectLoss: {
    customerServiceHours: 120,
    hourlyCost: 35,
    laborCost: 4_200
  },

  reputationDamage: {
    negativeReviews: 47,
    estimatedLifetimeValueLost: 8_900
  },

  technicalDebt: {
    emergencyFixHours: 40,
    developerHourlyCost: 150,
    rushDeploymentCost: 6_000
  },

  totalImpact: 58_192.50
};

console.log(`Actual cost: $${impactAnalysis.totalImpact.toLocaleString()}`);
// "But our monitoring showed 99.9% uptime!" 🤡
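If you want to run the same math on your own incident, it helps to derive the totals from the inputs instead of hardcoding them. A small sketch using the figures above; the function and parameter names are hypothetical.

```javascript
// Inputs are the figures from the breakdown above; names are illustrative.
function calculateImpact({
  failedTransactions,
  averageOrderValue,
  supportHours,
  supportHourlyCost,
  lifetimeValueLost,
  fixHours,
  developerHourlyCost
}) {
  const lostRevenue = failedTransactions * averageOrderValue;
  const laborCost = supportHours * supportHourlyCost;
  const rushDeploymentCost = fixHours * developerHourlyCost;

  return {
    lostRevenue,
    laborCost,
    rushDeploymentCost,
    total: lostRevenue + laborCost + lifetimeValueLost + rushDeploymentCost
  };
}

const impact = calculateImpact({
  failedTransactions: 823,
  averageOrderValue: 47.5,
  supportHours: 120,
  supportHourlyCost: 35,
  lifetimeValueLost: 8900,
  fixHours: 40,
  developerHourlyCost: 150
});
console.log(impact.total); // 58192.5
```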

Building a Defense System

You can't prevent all API failures, but you can prevent them from being silent. Here's your defensive playbook:

Layer 1: Paranoid Parsing

Never trust external data:

class APIClient {
  async safeFetch(url, options = {}) {
    const response = await fetch(url, {
      ...options,
      headers: {
        'Accept': 'application/json',
        ...options.headers
      }
    });

    // Log non-JSON responses BEFORE parsing
    const contentType = response.headers.get('content-type');
    if (!contentType?.startsWith('application/json')) {
      // Capture the actual response for debugging
      const text = await response.text();

      this.logError({
        message: 'Non-JSON response received',
        url,
        status: response.status,
        contentType,
        preview: text.substring(0, 200)
      });

      throw new Error(
        `API returned ${contentType || 'unknown'} instead of JSON`
      );
    }

    return response;
  }
}
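Headers can be wrong too: a misconfigured proxy can label an HTML page as JSON. One option is to read the body as text and parse it defensively, keeping a preview for the logs. This is a sketch under that assumption; `parseJsonSafely` is a hypothetical helper.

```javascript
// Hypothetical helper: parse a response body without letting a
// SyntaxError escape unrecorded. Returns a tagged result instead of throwing.
function parseJsonSafely(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (error) {
    return {
      ok: false,
      error: error.message,
      // Keep a short preview so logs show what actually came back.
      preview: String(text).slice(0, 200)
    };
  }
}

const good = parseJsonSafely('{"success":true}');
const bad = parseJsonSafely('<!DOCTYPE html><html></html>');
console.log(good.ok, bad.ok); // true false
```

The caller is then forced to handle the `ok: false` branch explicitly, which is the whole point.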

Layer 2: Circuit Breakers with Metrics

Track failure patterns, not just failures:

class MonitoredCircuitBreaker {
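  constructor(name, options = {}) {
    this.name = name;
    this.failures = [];
    this.threshold = options.threshold || 5;
    this.timeout = options.timeout || 60000;
    // Assumed: a metrics client exposing circuitOpen, success,
    // htmlResponse, and timeout, passed in by the caller
    this.metrics = options.metrics;
    this.openedAt = null; // timestamp of the last trip, or null when closed
  }

  // Open means tripped less than `timeout` ms ago; after that, calls flow again
  isOpen() {
    return this.openedAt !== null && Date.now() - this.openedAt < this.timeout;
  }

  trip() {
    this.openedAt = Date.now();
  }

  reset() {
    this.failures = [];
    this.openedAt = null;
  }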

  async execute(fn) {
    if (this.isOpen()) {
      this.metrics.circuitOpen(this.name);
      throw new Error(`Circuit breaker ${this.name} is open`);
    }

    try {
      const start = Date.now();
      const result = await fn();

      this.metrics.success(this.name, Date.now() - start);
      this.reset();
      return result;

    } catch (error) {
      this.recordFailure(error);

      // Track the TYPE of failure. JSON parse failures surface as SyntaxError;
      // the message text varies by engine, so check the type instead
      if (error instanceof SyntaxError) {
        this.metrics.htmlResponse(this.name);
      } else if (error.message.includes('timeout')) {
        this.metrics.timeout(this.name);
      }

      throw error;
    }
  }

  recordFailure(error) {
    this.failures.push({
      timestamp: Date.now(),
      error: error.message,
      stack: error.stack
    });

    // Keep only recent failures
    const cutoff = Date.now() - this.timeout;
    this.failures = this.failures.filter(f => f.timestamp > cutoff);

    if (this.failures.length >= this.threshold) {
      this.trip();
    }
  }
}
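The breaker above calls `this.metrics`, so it needs some object with those four methods. For local testing, an in-memory stand-in is enough. This is a hypothetical sketch, not a real metrics client; in production you'd forward these calls to whatever metrics backend you use.

```javascript
// Hypothetical in-memory metrics sink exposing the four methods the
// breaker calls: circuitOpen, success, htmlResponse, timeout.
function createInMemoryMetrics() {
  const counts = {};
  const lastDurationMs = {};

  const bump = (kind, name) => {
    const key = `${name}.${kind}`;
    counts[key] = (counts[key] || 0) + 1;
  };

  return {
    counts,
    lastDurationMs,
    circuitOpen: (name) => bump('circuit_open', name),
    success: (name, durationMs) => {
      bump('success', name);
      lastDurationMs[name] = durationMs;
    },
    htmlResponse: (name) => bump('html_response', name),
    timeout: (name) => bump('timeout', name)
  };
}

const metrics = createInMemoryMetrics();
metrics.htmlResponse('payments');
metrics.htmlResponse('payments');
metrics.success('payments', 145);
console.log(metrics.counts['payments.html_response']); // 2
```

One way to wire it in is to assign it to the breaker's `metrics` property before calling `execute`, for example `breaker.metrics = createInMemoryMetrics()`.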

Layer 3: User-Visible Degradation

When things fail, fail loudly to the user:

// Instead of silent failure
function SilentCheckout() {
  const [processing, setProcessing] = useState(false);

  async function handleCheckout() {
    setProcessing(true);
    try {
      await processPayment();
      // Success path
    } catch {
      // User sees nothing! 
    }
    setProcessing(false);
  }
}

// Explicit failure states
function ResilientCheckout() {
  const [state, setState] = useState('idle');
  const [error, setError] = useState(null);

  async function handleCheckout() {
    setState('processing');
    setError(null);

    try {
      await processPayment();
      setState('success');

    } catch (err) {
      // Named `err` so it does not shadow the `error` state above
      setState('error');

      // User ALWAYS sees something went wrong.
      // A SyntaxError here means the response body was not valid JSON
      if (err instanceof SyntaxError) {
        setError('Payment service is temporarily unavailable. Please try again in a few minutes.');

        // Track for analytics
        analytics.track('payment_api_html_response', {
          error: err.message
        });

      } else {
        setError('Payment failed. Please check your information and try again.');
      }

      // Log for developers
      errorReporter.captureException(err);
    }
  }

  return (
    <div>
      {state === 'error' && (
        <Alert severity="error">
          {error}
          <Button onClick={handleCheckout}>Retry Payment</Button>
        </Alert>
      )}
    </div>
  );
}
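The error-to-message mapping is easier to test when it lives outside the component. Here's a sketch of that extraction; `userMessageFor` is a hypothetical name, and the connection-failure wording is an added example, not from the incident.

```javascript
// Hypothetical mapping from a caught error to the message a customer sees.
function userMessageFor(error) {
  // JSON.parse and response.json() throw SyntaxError on non-JSON bodies;
  // the exact message text differs between JavaScript engines.
  if (error instanceof SyntaxError) {
    return 'Payment service is temporarily unavailable. Please try again in a few minutes.';
  }
  // fetch() rejects with a TypeError when the request itself fails.
  // Assumption: other TypeErrors (plain bugs) are acceptable to report the same way.
  if (error instanceof TypeError) {
    return 'We could not reach the payment service. Please check your connection and try again.';
  }
  return 'Payment failed. Please check your information and try again.';
}

let parseError;
try {
  JSON.parse('<!DOCTYPE html>');
} catch (e) {
  parseError = e;
}
console.log(userMessageFor(parseError));
```

Whatever the branch, the function always returns a message, so the customer never sees nothing.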

Layer 4: Proactive Monitoring

Don't wait for customers to complain:

// Synthetic monitoring - run every 5 minutes
async function syntheticCheckoutTest() {
  const testCard = '4111111111111111';

  try {
    const response = await fetch('/api/process-payment', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        amount: 1.00,
        card: testCard,
        test: true
      })
    });

    // Validate response format, not just status
    const contentType = response.headers.get('content-type');
    if (!contentType?.includes('json')) {
      await alertOncall({
        severity: 'critical',
        message: 'Payment API returning HTML',
        contentType,
        response: await response.text()
      });
    }

  } catch (error) {
    await alertOncall({
      severity: 'critical', 
      message: 'Synthetic checkout test failed',
      error: error.message
    });
  }
}
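Running a check "every 5 minutes" has a catch: if the payment API hangs, runs can pile up on top of each other. One possible guard is a wrapper that skips a run while the previous one is still in flight. This is a sketch; `makeNonOverlappingRunner` is a hypothetical helper.

```javascript
// Hypothetical wrapper: skip a scheduled run if the previous one has not
// finished, and never let a thrown error become an unhandled rejection.
function makeNonOverlappingRunner(task) {
  let running = false;

  return async function run() {
    if (running) return 'skipped';
    running = true;
    try {
      await task();
      return 'completed';
    } catch (error) {
      console.error('Scheduled check threw:', error);
      return 'failed';
    } finally {
      running = false;
    }
  };
}

// Usage sketch, assuming the syntheticCheckoutTest function shown earlier:
// const run = makeNonOverlappingRunner(syntheticCheckoutTest);
// setInterval(run, 5 * 60 * 1000);
```

A rising count of `'skipped'` results is itself worth alerting on, since it means the check is taking longer than its interval.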

The Morning After Fix

After the $50,000 incident, here's what the team implemented:

  1. Mandatory response validation - Every API call validates content-type before parsing
  2. Error budgets - If JSON parsing errors exceed 0.1%, alerts fire
  3. Customer-facing error messages - No more silent failures
  4. Replay capability - Failed transactions can be retried when service recovers
  5. Real-time revenue monitoring - Sudden drops trigger immediate investigation
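The error budget in item 2 boils down to a ratio check. A minimal sketch, assuming the 0.1% is measured against total requests in some time window; the function name is hypothetical.

```javascript
// Hypothetical check for the error budget: alert when the share of
// requests that failed JSON parsing exceeds the budget (0.1% by default).
// Assumption: the budget is measured against total requests in a window.
function parseErrorBudgetExceeded(parseErrors, totalRequests, budget = 0.001) {
  if (totalRequests === 0) return false; // no traffic, nothing to judge
  return parseErrors / totalRequests > budget;
}

console.log(parseErrorBudgetExceeded(5, 10_000));  // false (0.05%)
console.log(parseErrorBudgetExceeded(25, 10_000)); // true (0.25%)
```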

Most importantly, they learned that Unexpected token errors aren't just annoying console noise - they're canaries in the coal mine warning you about API contract violations.

Your Action Items

Stop reading and go check your production logs right now. Search for:

  • Unexpected token <
  • JSON.parse errors
  • SyntaxError
  • Empty catch blocks in payment/checkout code

If you find any, you might be losing money right now.
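If your logs are available as plain lines of text, a quick count of those patterns can tell you how bad things are. Here's a rough sketch; the function name and sample log lines are made up for illustration, and empty catch blocks need a code search rather than a log search.

```javascript
// Hypothetical scanner: count log lines matching the search terms above.
const SIGNALS = [
  { name: 'unexpected_token', pattern: /Unexpected token/ },
  { name: 'json_parse', pattern: /JSON\.parse/ },
  { name: 'syntax_error', pattern: /SyntaxError/ }
];

function countSignals(logLines) {
  const counts = Object.fromEntries(SIGNALS.map((s) => [s.name, 0]));
  for (const line of logLines) {
    for (const { name, pattern } of SIGNALS) {
      if (pattern.test(line)) counts[name] += 1;
    }
  }
  return counts;
}

// Sample lines invented for this example.
const sample = [
  "SyntaxError: Unexpected token '<', \"<!DOCTYPE \"... is not valid JSON",
  'GET /api/cart 200 12ms',
  'SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data'
];
console.log(countSignals(sample));
```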

Then implement:

  1. Today: Add content-type validation to your API client
  2. This week: Set up monitoring for JSON parsing errors
  3. This sprint: Add circuit breakers to critical paths
  4. This quarter: Run chaos engineering tests with HTML injection

Because the next silent failure might not just cost $50,000. It might cost you a customer who never comes back.


Remember: In production, the scariest errors aren't the ones that throw exceptions - they're the ones that don't.
