DEV Community

Arina Cholee

Saving the System in 7 Days: How SafeLine WAF Rescued Our E-commerce Platform

As a freshly minted IT operations engineer, I found myself in a situation most seasoned professionals would dread: on the seventh day of my new job, the e-commerce order system collapsed and the customer support line was overwhelmed. After frantic troubleshooting and several failed attempts, I turned to a free Web Application Firewall (WAF), and it stopped the attack. This article recaps how I recovered the system with SafeLine WAF and shares key lessons for other new Ops engineers facing similar incidents.

The Crisis: The Order System Crashed

It was just another hectic morning when I logged in to work. Almost immediately, the monitoring system sent a flurry of alerts: all 6 Apache servers were maxed out with 99% CPU usage. The response time for the order API went from 200ms to 3 seconds, and users were getting an error message saying "Payment Failed" despite the money being deducted.

To make matters worse, I couldn’t pinpoint the root cause. The senior Ops engineer was on leave, and I had to figure this out on my own. After digging through the logs with the command tail -f /var/log/httpd/access_log, I was shocked. There were over 1,000 requests per minute from the same IP address, all hitting the /api/order/pay endpoint with suspicious parameters like amount=-100, which was an obviously invalid payment amount.
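For anyone repeating this triage, here is a minimal sketch of how the per-IP counts can be pulled out of the log instead of eyeballing tail output. It assumes the default Apache combined log format, where the first field is the client IP and the seventh is the request path. The function name top_ips is my own and does not come from any tool.

```shell
#!/usr/bin/env bash
# Count requests per client IP for one endpoint in a combined-format access log.
# Assumption: field 1 is the client IP and field 7 is the request path.
top_ips() {
    local endpoint="$1" logfile="$2"
    awk -v ep="$endpoint" '
        index($7, ep) == 1 { hits[$1]++ }              # path starts with the endpoint
        END { for (ip in hits) print hits[ip], ip }    # "count ip" per line
    ' "$logfile" | sort -rn | head -n 10
}

# Example (path taken from the article):
#   top_ips /api/order/pay /var/log/httpd/access_log
```

Each output line is a request count followed by the IP, with the busiest clients first, which makes a single abusive address stand out immediately.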

I tried blocking the IP address, but within minutes, new attack IPs kept flooding the system. I couldn’t keep up, and the pressure was mounting as management was asking for updates. I was sweating, but then I remembered something from my onboarding training—a mention of SafeLine WAF.

The Turning Point: SafeLine WAF to the Rescue

In desperation, I searched for SafeLine WAF and was relieved to find that it could protect an Apache cluster, was free, easy to integrate, and came with a one-click deployment script. With no time to waste, I decided to give it a shot.

The Solution: How SafeLine WAF Saved the Day

Our infrastructure consisted of CentOS 7.6 with Apache 2.4 running on 6 nodes, and an NGINX load balancer. I followed the steps to deploy SafeLine WAF and, to my surprise, the whole process took less than an hour, and the results were immediate.

Step 1: Setting Up the Primary WAF Node (15 Minutes)

First, I set up one of the servers as the primary WAF node, which would handle the control panel and rule management, while the other 5 servers would act as proxy nodes that would sync rules from the primary node.

Here’s how I deployed the primary WAF:

  1. Install Docker and NFS (for rule sharing across the nodes):
   yum install -y nfs-utils && curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
  2. Start services and set them to auto-start on boot:
   systemctl start docker nfs-server && systemctl enable docker nfs-server
  3. Create a shared directory for WAF rules:
   mkdir -p /data/waf/rules && chmod 777 /data/waf/rules
  4. Deploy SafeLine WAF (using Docker):
   docker run -d --name safeline-master -p 80:80 -p 443:443 -v /data/waf/rules:/etc/safeline/rules --restart=always safeline/waf:community
  5. Configure WAF: I logged into the SafeLine control panel via the server’s IP and added the website. I configured specific rules for the /api/order/pay endpoint, such as blocking any amount less than 0 and limiting requests per IP to 20 per minute.
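To get a feel for what those two rules catch, the access log can be replayed offline. The sketch below is not SafeLine rule syntax. It is a hypothetical helper of my own (would_block) that approximates the same logic with awk, assuming the combined log format and that the amount parameter appears in the query string.

```shell
#!/usr/bin/env bash
# Replay an access log and print the requests the two rules would reject.
# Hypothetical offline check; this is NOT how SafeLine expresses its rules.
would_block() {
    local logfile="$1" limit="${2:-20}"
    awk -v limit="$limit" '
        index($7, "/api/order/pay") != 1 { next }       # other endpoints: ignore
        {
            # Rule 1: amount with a leading minus sign, e.g. amount=-100
            if ($7 ~ /[?&]amount=-[0-9]/) { print "amount", $1, $7; next }
            # Rule 2: more than `limit` requests per IP within one clock minute.
            # $4 looks like "[01/Jan/2024:00:00:01"; keep it up to the minute.
            minute = substr($4, 2, 17)
            if (++count[$1 SUBSEP minute] > limit) print "rate", $1, $7
        }
    ' "$logfile"
}

# Example:
#   would_block /var/log/httpd/access_log 20
```

Lines starting with amount correspond to the negative-amount rule, and lines starting with rate are requests beyond the per-minute limit. Running this against a quiet day of traffic is a cheap way to check that a threshold of 20 would not also hit legitimate customers.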

Step 2: Deploying the Proxy Nodes (10 Minutes per Node)

Next, I set up the remaining 5 Apache servers as proxy nodes. They would pull the latest rule configurations from the primary node, reducing the need for repeated manual configurations.

  1. Install Docker and NFS Client:
   yum install -y nfs-utils && curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
  2. Start Docker and mount the shared directory:
   systemctl start docker && systemctl enable docker
   mount 192.168.0.10:/data/waf/rules /data/waf/rules  # 192.168.0.10 is the primary node IP
  3. Deploy the WAF Proxy on each node:
   docker run -d --name safeline-slave -p 80:80 -p 443:443 -v /data/waf/rules:/etc/safeline/rules --restart=always safeline/waf:community --proxy-mode
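The mount command above only works if the primary node actually exports the directory over NFS, and a plain mount does not survive a reboot. The commands listed do not cover either point, so here is a hedged sketch of the two config lines involved. The 192.168.0.0/24 subnet and the rw option are my assumptions (inferred from the node IPs and the chmod 777 on the primary), and the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the /etc/exports entry for the primary node.
# Assumption: proxies need write access; use "ro" if they only read rules.
exports_line() {
    local dir="$1" subnet="$2"
    printf '%s %s(rw,sync,no_subtree_check)\n' "$dir" "$subnet"
}

# Print the /etc/fstab entry for a proxy node so the mount persists.
# _netdev delays mounting until the network is up.
fstab_line() {
    local primary="$1" dir="$2"
    printf '%s:%s %s nfs defaults,_netdev 0 0\n' "$primary" "$dir" "$dir"
}

# Usage (as root):
#   primary:  exports_line /data/waf/rules 192.168.0.0/24 >> /etc/exports && exportfs -ra
#   proxies:  fstab_line 192.168.0.10 /data/waf/rules >> /etc/fstab && mount -a
```

On a proxy node, showmount -e 192.168.0.10 is a quick way to confirm the export is visible before starting the container.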

Step 3: Configuring the Load Balancer (3 Minutes)

I modified the NGINX load balancer configuration to route traffic through the SafeLine WAF cluster instead of directly to Apache:

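upstream waf_cluster {
    server 192.168.0.10:80;  # Primary WAF node
    server 192.168.0.11:80;  # Proxy Node 1
    server 192.168.0.12:80;  # Proxy Node 2
    server 192.168.0.13:80;  # Proxy Node 3
    server 192.168.0.14:80;  # Proxy Node 4
    server 192.168.0.15:80;  # Proxy Node 5
}

server {
    listen 80;
    server_name order.xxx.com;
    location / {
        proxy_pass http://waf_cluster;
        proxy_set_header Host $host;
        # Forward the real client address. Without these headers every request
        # appears to come from the load balancer, which defeats per-IP limits.
        # The WAF may also need to be configured to trust these headers.
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}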

After applying the changes with nginx -s reload, the entire protection system was live and fully operational.
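Before trusting the reload, it helps to validate the config with nginx -t and then probe each WAF node directly, bypassing the load balancer. The loop below is a minimal sketch under that assumption. check_nodes is a hypothetical helper of mine, and the hostname and node IPs are the ones used in this article.

```shell
#!/usr/bin/env bash
# Probe each WAF node directly and print "<node> <http status>".
# A status of 000 means the node could not be reached within the timeout.
check_nodes() {
    local host="$1"; shift
    local node code
    for node in "$@"; do
        code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' \
                    -H "Host: $host" "http://$node/") || code="000"
        echo "$node $code"
    done
}

# Usage (bash brace expansion lists 192.168.0.10 through 192.168.0.15):
#   nginx -t && nginx -s reload && check_nodes order.xxx.com 192.168.0.1{0..5}
```

Any node that reports 000 or a 5xx status can be commented out of the upstream block until it is fixed, so one bad proxy does not take a share of the live traffic.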

The Results: System Restored in 5 Minutes

Five minutes after deployment, the system was up and running again:

  • CPU usage dropped from 99% to 38%.
  • All abnormal requests were blocked (100% interception rate), and the logs were filled with messages like “Amount error blocked” and “High-frequency request blocked.”
  • The order API response time dropped back to 190ms, and users were able to complete their orders without issues. The phone stopped ringing off the hook.

When the boss came by for a check, he was impressed with how quickly the system was stabilized. When I explained that I used SafeLine WAF, a free tool, he was surprised—he hadn’t expected such a robust solution from a free service.

Key Takeaways: What I Learned

As a new Ops engineer, here are the three main lessons I learned from this experience:

  1. Stay Calm and Check Logs First: In the heat of the moment, logs are your best friend. Identifying abnormal IPs and request patterns is crucial.
  2. Use Tools that Simplify Setup: Tools like SafeLine that support rule synchronization from a primary node to proxy nodes can save you time and effort in a clustered environment.
  3. Don’t Overcomplicate Things: Even free tools can be incredibly powerful. SafeLine’s protection was a lifesaver and far more reliable than trying to block attacking IPs by hand.

Since then, I’ve documented this setup, and now it’s the go-to solution for any new projects. If you’re an Ops newbie facing a similar challenge, try SafeLine—it might just save your system and your sanity.

Official website: https://safepoint.cloud/landing/safeline
