Fixing A Disk Read Loop In A PHP Script
The Server Status
I am a site administrator. I manage Linux servers. I have 15 years of experience. I do my work every day. I sit at my desk. I open my computer. I open my terminal program. I connect to a client server. I use the SSH protocol. I type my username. I type my password. I press the enter key. The server accepts my password. The screen shows a command prompt.
I check the routine system status. This is my daily habit. I type the uptime command. I press the enter key. The command prints a line of text. The text shows the server run time. The text shows the load average. The load average has three numbers. The numbers represent one minute, five minutes, and fifteen minutes. The one-minute load average is 8.5. The server has four CPU cores. A load average of 8.5 on a four-core server is high. The server is doing too much work. I need to find the reason. I do not guess the reason. I look at the system data.
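The same comparison can be scripted. The following is a minimal Python sketch, assuming a Unix-like host. The helper name load_per_core is hypothetical. Treating a ratio above 1.0 as "busy" is a rule of thumb, not a hard limit.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one_minute, five_minutes, fifteen_minutes = os.getloadavg()
    cores = os.cpu_count() or 1
    ratio = load_per_core(one_minute, cores)
    print(f"1-minute load per core on this host: {ratio:.2f}")
    # The figures from this server: a load of 8.5 on four cores.
    print(f"load per core for 8.5 on 4 cores: {load_per_core(8.5, 4)}")
```

With the numbers from this server, the ratio is above 2. More than twice as many tasks are running or waiting as there are cores.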
The client owns this server. The client runs a business. The client has a website. The client updated the website yesterday. The client installed Monni - A Creative Multi-Concept Theme for Agencies and Freelancers. The theme changed the website appearance. The server load increased after this update. So, I start my investigation here.
The Diagnostic Path
Checking The System Resources
I need to see the active processes. I type the top command. I press the enter key. The program starts. The program clears the terminal screen. The program draws a table. The table updates every three seconds. I look at the top rows. The top rows show CPU statistics. I read the numbers. The user CPU time is 5%. The system CPU time is 2%. The wait CPU time is 45%.
The wait CPU time is the problem. The wait CPU time is the I/O wait. I/O means input and output. The CPU is fast. The disk is slow. The CPU wants data. The disk is reading the data. The CPU waits for the disk. The CPU does nothing while it waits. This causes the high load average. I know the server has a read or write issue.
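The wait figure in top comes from kernel counters. The sketch below shows one way to compute it from two samples of the aggregate cpu line in /proc/stat. It is an illustration in Python, not the exact method top uses. The function names are hypothetical. It assumes the Linux field order user, nice, system, idle, iowait, irq, softirq, steal.

```python
def read_cpu_line(path="/proc/stat"):
    """Return the aggregate 'cpu' line from /proc/stat (Linux only)."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.readline()


def parse_cpu_line(line):
    """Parse the first eight counters of the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Order: user, nice, system, idle, iowait, irq, softirq, steal
    return [int(value) for value in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [new - old for old, new in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter
```

To use it, call read_cpu_line() twice, a few seconds apart, and pass both lines to iowait_percent(). A value near 45 would match what top shows here.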
I look at the process list in the table. I look at the command column. I see the php-fpm process. I see many php-fpm processes. They change positions. They use very little CPU. But they exist in the list. I press the Q key. The top program stops. The command prompt returns.
Profiling The Kernel
I need more specific data. I want to see what the kernel is doing. I use the perf tool. The perf tool is a Linux profiler. It reads performance counters. I type perf record -a -g. I press the enter key. The tool starts. The -a flag tells the tool to watch all CPUs. The -g flag tells the tool to record call graphs. Call graphs show the function paths.
I wait for fifteen seconds. I watch the blinking cursor. I press the CTRL key and the C key. This stops the tool. The tool writes the data to a file. The file name is perf.data. The tool prints a summary. The summary says it recorded many events.
I need to read the data. I type perf report. I press the enter key. The screen changes. The screen shows a list of functions. I look at the top function. The function takes 30% of the recorded time. The function name is vfs_read. The vfs_read function is a kernel function. The virtual file system uses this function. It reads data from files on the disk.
I press the right arrow key. The tool expands the call graph. I see the path. The path goes from vfs_read to sys_read. The path goes from sys_read to the PHP process. The php-fpm process calls the read function constantly. I press the Q key. The tool closes. I know PHP is reading files too much.
Inspecting Network Traffic
I want to rule out outside factors. Sometimes bad traffic causes server load. I check the network packets. I use the tcpdump tool. The tcpdump tool captures network packets. I type tcpdump -i eth0 -c 100 port 80. I press the enter key. The -i flag selects the network interface. The interface is eth0. The -c 100 flag limits the capture to 100 packets. The port 80 filter selects web traffic. The filter goes last, after the flags.
The packets scroll on the screen. The scrolling stops. I read the text. I look at the source IP addresses. I look at the destination IP addresses. I look at the TCP flags. I see SYN flags. I see ACK flags. I see PSH flags. The traffic is normal web traffic. The server receives HTTP GET requests. The server sends HTTP 200 OK responses. I do not see any strange patterns. The network is not the cause. The problem is inside the server.
Tracing Open Files
I need to know which file PHP is reading. I use the lsof tool. The lsof tool lists open files. I need a process ID. I type pgrep php-fpm. I press the enter key. The command prints a list of numbers. These are the process IDs. I pick the first number. The number is 4092.
I type lsof -p 4092. I press the enter key. The command prints a list. The list shows all files used by process 4092. I look at the NAME column. I see system libraries. I see PHP extension files. I see the Nginx socket file. I look at the bottom of the list. I see a website file. The file path is /var/www/html/wp-content/themes/monni/assets/data/locations.json.
I need to confirm this. I run the lsof command again. I use a different process ID. I type lsof -p 4095. I press the enter key. I look at the list. I see the exact same file. Every PHP process opens this .json file.
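Checking workers one at a time works for two processes. For many workers, the comparison can be automated. The sketch below is a Python illustration, assuming a Linux host with /proc mounted. The function names open_files and files_open_in_all are hypothetical. Reading another user's /proc/PID/fd usually needs root.

```python
import os


def open_files(pid):
    """Return the set of paths a process has open, read from /proc."""
    fd_dir = f"/proc/{pid}/fd"
    paths = set()
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink to the open file, socket or pipe.
            paths.add(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            continue
    return paths


def files_open_in_all(open_files_by_pid):
    """Return the paths that every listed process has open."""
    sets = [set(paths) for paths in open_files_by_pid.values()]
    if not sets:
        return set()
    return set.intersection(*sets)
```

Given a dictionary built with open_files() for each worker PID, files_open_in_all() returns only the shared paths. A short-lived read can be missed by any single snapshot, so repeated sampling gives a more reliable picture.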
Web developers build many tools. They create layouts. They add features. Users download WordPress themes for these features. The themes contain PHP scripts. The scripts execute on the server. If a script has bad logic, the server suffers. I suspect this .json file is part of bad logic.
The Code Review
Examining The Target File
I need to look at the .json file. I change my directory. I type cd /var/www/html/wp-content/themes/monni/assets/data/. I press the enter key. I list the files. I type ls -lh. I press the enter key. The l flag shows details. The h flag shows human-readable sizes.
I look at the output. I see locations.json. I look at the file size. The size is 12 megabytes. This is a very large JSON file. A text file of 12 megabytes contains a lot of data.
I need to find the PHP code. The PHP code reads this file. I change my directory. I go to the theme root folder. I type cd /var/www/html/wp-content/themes/monni/. I press the enter key.
I search for the file name in the code. I use the grep tool. I type grep -rn "locations.json" . and I press the enter key. The command ends with a single dot. The r flag searches all folders. The n flag shows the line number. The single dot specifies the current folder.
The command prints one line. The line shows a match. The match is in a file. The file name is functions.php. The line number is 450.
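The same search can be sketched in Python. This is an illustration only, and the function name find_in_tree is hypothetical. Unlike grep, it matches a literal substring, not a regular expression.

```python
import os


def find_in_tree(root, needle):
    """Return (path, line_number, line) for each line containing needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    # Line numbers start at 1, as in grep -n.
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            matches.append((path, number, line.rstrip("\n")))
            except OSError:
                # Skip files that cannot be opened or read.
                continue
    return matches
```

Calling find_in_tree(".", "locations.json") from the theme root would list every file and line that mentions the data file.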
Analyzing The PHP Logic
I open the functions.php file. I use the vim text editor. I type vim functions.php. I press the enter key. The editor opens. The screen fills with code. I type :450. I press the enter key. The cursor moves to line 450.
I read the code. The code defines a custom function. The function generates a map for the website footer. The map needs location data. The code calls the file_get_contents function. The file_get_contents function targets the locations.json file.
I look at the surrounding code. The code has a foreach loop. The loop iterates through website categories. The website has 40 categories. The custom function is inside the loop.
I understand the sequence. A visitor requests a page. Nginx passes the request to PHP. PHP runs the theme code. The code starts the loop. The loop runs 40 times. In each loop, PHP calls file_get_contents. PHP opens the 12-megabyte locations.json file. PHP reads the 12-megabyte file. PHP closes the file. PHP repeats this 40 times.
One page load causes 480 megabytes of disk read. Ten concurrent visitors cause 4,800 megabytes of disk read. The solid-state drive is fast. But it cannot handle this volume constantly. This creates the I/O wait. This causes the high load average. The logic is inefficient.
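The arithmetic is simple enough to check in a few lines. The sketch below is in Python, and the function name read_volume_mb is hypothetical. It counts the data the code requests. It does not model any caching by the operating system.

```python
FILE_SIZE_MB = 12       # size of locations.json
LOOP_ITERATIONS = 40    # one read per website category


def read_volume_mb(file_size_mb, reads_per_request, concurrent_requests=1):
    """Return the megabytes requested from storage for the given load."""
    return file_size_mb * reads_per_request * concurrent_requests


if __name__ == "__main__":
    print(read_volume_mb(FILE_SIZE_MB, LOOP_ITERATIONS))        # one page load
    print(read_volume_mb(FILE_SIZE_MB, LOOP_ITERATIONS, 10))    # ten visitors
    print(read_volume_mb(FILE_SIZE_MB, 1, 10))                  # one read per request
```

With a single read per request, ten visitors request 120 megabytes instead of 4,800.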
The Resolution
Modifying The Code
I must fix the code logic. I stay in the vim editor. I move the cursor. I use the arrow keys. I go to line 448. This is above the foreach loop.
I press the i key. The editor enters insert mode. I type two new lines of code. The first line is $location_data = file_get_contents( get_template_directory() . '/assets/data/locations.json' ); and I press the enter key. The second line is $parsed_locations = json_decode( $location_data, true ); and it ends with a semicolon.
I move the cursor down. I go inside the loop. I delete the old file_get_contents line. I press the ESC key first, because the dd shortcut works in normal mode. I use the dd keyboard shortcut. I change the variable in the loop. The loop now reads the $parsed_locations array from RAM.
This change is basic. The code now reads the disk one time per request. The code stores the data in the server RAM. The decoded array takes more RAM than the 12 megabytes of raw text. The loop runs 40 times. The loop accesses the RAM 40 times. RAM access takes nanoseconds. A solid-state drive read takes microseconds at best, and longer under heavy load. The disk does not work during the loop.
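The pattern is not specific to PHP. The sketch below is a Python analogue of the change, not the theme's actual code. The names CountingReader, render_slow and render_fast are hypothetical. The reader counts how often the file is opened, so the difference is measurable.

```python
import json


class CountingReader:
    """Read a file from disk and count how many times it was read."""

    def __init__(self, path):
        self.path = path
        self.reads = 0

    def read(self):
        self.reads += 1
        with open(self.path, "r", encoding="utf-8") as handle:
            return handle.read()


def render_slow(reader, categories):
    """Original pattern: read and decode the file inside the loop."""
    output = []
    for category in categories:
        locations = json.loads(reader.read())  # one disk read per category
        output.append((category, len(locations)))
    return output


def render_fast(reader, categories):
    """Fixed pattern: read and decode once, then reuse the data."""
    locations = json.loads(reader.read())      # one disk read per request
    return [(category, len(locations)) for category in categories]
```

Both functions return the same result for the same input. With 40 categories, render_slow reads the file 40 times and render_fast reads it once.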
I save the file. I press the ESC key. The editor leaves insert mode. I type :wq. I press the enter key. The editor writes the changes to the disk. The editor closes. The command prompt returns.
PHP handles memory allocation and array storage internally, through the Zend Engine. The script does not manage this memory by hand. The Zend Engine keeps the array in RAM for the duration of the request.
Verifying The Fix
I must confirm the server status. I type the systemctl reload php8.1-fpm command. I press the enter key. The PHP service reloads the workers. The new code takes effect.
I check the load average. I type uptime. I press the enter key. I read the numbers. The one-minute load average is 6.0. It is dropping. I wait one minute. I type uptime again. I press the enter key. The one-minute load average is 2.1. The load is normal.
I check the CPU metrics. I type top. I press the enter key. I look at the wait CPU time. The wait CPU time is 0.5%. The I/O wait is gone. The disk is idle. The server responds quickly. I press the Q key. I stop the top program. I type exit. I press the enter key. The SSH connection closes.