DEV Community

Caleb Nkunze
Caleb Nkunze

Posted on • Originally published at Medium on

The Day the Web Stood Still: A Firewall Configuration Catastrophe


World Wide Where (thanks to DALL-E)

“Welcome to the Great Firewall Fiasco of 2024 , where our beloved website transformed from the ‘World Wide Web’ into the ‘World Wide Where?’. On a seemingly ordinary Tuesday morning — or as we now refer to it, ‘The Day of Digital Silence’ — our team found itself in an unexpected game of digital hide-and-seek with our own website. The culprit? A mischievous firewall rule that decided to play the overzealous bouncer, turning away all the Internet traffic like unwanted party crashers.

Imagine this: One minute, you’re sipping your artisan coffee, basking in the glow of your smoothly running website. The next minute, you’re staring into the abyss of a 500 error, wondering if the Internet gods have forsaken you. Yes, folks, we had successfully turned our website into the world’s most exclusive online club — so exclusive that no one could get in, not even ourselves!

Join me on this epic tale of confusion, revelation, and redemption as we unravel the mystery behind the digital blackout that left 70% of our users pondering if our website had taken an impromptu vacation. Spoiler alert: It involves a firewall rule with a mind of its own and a team of IT detectives who’ve watched one too many episodes of ‘Sherlock’.”

Issue Summary

  • Outage Duration: January 10, 2024, 10:00 AM — 11:30 AM UTC
  • Impact: The main website was down, resulting in a complete service outage. Approximately 70% of our users were unable to access our services.
  • Root Cause: A misconfigured firewall rule blocked incoming traffic to the web server.


Firewall Stop (Thanks to DALL-E)

Timeline

  • 10:00 AM: Issue detected by our monitoring system alert.
  • 10:05 AM: Initial investigation by the IT team suggested a server overload.
  • 10:20 AM: Further analysis revealed normal server loads and memory usage.
  • 10:35 AM: Customer support reported multiple user complaints.
  • 10:50 AM: Network team joined the investigation, suspecting network issues.
  • 11:00 AM: Discovered a recent firewall rule update was incorrectly applied.
  • 11:15 AM: Firewall rule corrected to allow web traffic.
  • 11:30 AM: Service fully restored; monitoring confirmed.


I.T Team work (Thanks to DALL-E)

Root Cause and Resolution

  • Root Cause: The outage was caused by a new firewall rule that inadvertently blocked incoming HTTP and HTTPS traffic to our web server.
  • Resolution: The firewall rule was corrected to allow traffic on ports 80 and 443, restoring normal service.

Corrective and Preventative Measures

Improvements:

  • Implement a review process for all firewall rule changes.
  • Enhance monitoring to include network-level checks.

Tasks:

  • Task 1: Update the firewall management procedure to require peer review.
  • Task 2: Configure additional network traffic monitoring on firewall.
  • Task 3: Conduct a team training session on network security best practices.

In the end, our computer whizzes fixed the grumpy firewall with some quick finger-tapping and teamwork. The internet started working again, and all the little data bits went zooming through happily. It just goes to show, a few laughs and some good friends make even the biggest computer boo-boos better!

Top comments (0)