We are currently undergoing one of the most impacting pandemics in human history. With the outbreak of the COVID-19, thousands of people are getting infected worldwide and thousands more are dying.
One of the major fronts in reducing the damage cause by this pandemic is to prevent its spread, "flattening the curve" to prevent overwhelming the health system capacity. But what is this curve?
One of the simplest mathematical models of the spread of diseases is the SIR model. It groups a total population of hosts into three categories and defines the flow of people between them.
The model gets its name after its three groups: Susceptible, people who haven't come in contact with the disease yet; Infected, all people who are
carriers, which may or may not have symptoms; and the Recovered group, a somewhat euphemistic blanket term for people who either have gotten better or have died due to complications.
The model also defines the flow of people going from the susceptible to infected groups and from infected to recovered.
The first flow is proportional to the size of the susceptible group (since there are more available hosts), the size of the infected group (since more people are spreading the disease) and a fixed rate we are calling the "infection rate".
This infection rate is a collection of parameters that includes the disease's infectivity, preventive measures taken by people, amongst many others, all condensed into a single parameter.
There's another flow of people that go from the infected to the recovered group. This includes both people who got better (and are not further infecting others) and people who are dying because of the disease.
This flow is also simplified to be only proportional to the number of infected people and a fixed "recovery rate".
Before we can calculate and graph this curve, we will need the
matplotlib libraries, respectively:
pip install scipy matplotlib
Starting our Python file, we define the initial parameters: the simulation length, the percentage of initially infected people (from 0.0 to 1.0) and the infection and recovery rates mentioned before:
MAX_TIME = 20.0 infected_start = 0.01 infection_rate = 2.5 recovery_rate = 0.3
We'll assign the remaining percentage of the population to the susceptible group, leaving the recovered group initially empty:
susceptible_start = 1.0 - infected_start recovered_start = 0.0
In order to calculate our curve, we will need to borrow a technique from calculus. We will integrate a set of ordinary differential equations. All this does is apply the rate of change we defined over a set period of time. Luckily,
scipy can do all the hard work for us! In order to calculate it, we only need to define an initial state of our model and the expressions that define its rate of change.
The initial state of our model is the size of each of the groups:
initial_state = [susceptible_start, infected_start, recovered_start]
To define the rate of change we need a function of the current state and time.
Since the SIR model doesn't change with time, we will ignore the time parameter.
The rate change of each group is declared sequentially:
state_change_rate = lambda state, _time: [ -state * state * infection_rate, state * state * infection_rate - state * recovery_rate, state * recovery_rate, ]
Then we define the simulation length, using
linspace function to create 100 regularly spaced samples between 0 and
import numpy as np time_series = np.linspace(0, MAX_TIME, 100)
The following step is where the magic happens. By having the initial state, the rate of change and the time series, the
scipy library is capable of a numerical integration for the defined amount of time:
from scipy.integrate import odeint state_series = odeint(state_change_rate, initial_state, time_series)
We are done! The
state_series variable now holds our state change over time.
We can now extract the three timeseries for each of the population groups:
susceptible_series = state_series[:, 0] infected_series = state_series[:, 1] recovered_series = state_series[:, 2]
Finally, if we want to see the graph, we can use
matplotlib to execute that in a few simple steps.
First we plot the three curves:
import matplotlib.pyplot as plt plt.plot(time_series, susceptible_series, label="Susceptible") plt.plot(time_series, infected_series, label="Infected") plt.plot(time_series, recovered_series, label="Recovered")
Label both axis and enable legend:
plt.xlabel("Time") plt.ylabel("Population") plt.legend()
And at last, render our graph:
We can see that at one point, over 60% of the population was infected at the same time. This is much more than doctors, nurses, and all the people in the health system are able to handle simultaneously.
The two main parameters we can change in this simulation are the infection rate and the recovery rate.
On a real world example, methods such as social distancing, the usage of masks and gloves, and home quarantines are all actively trying to reduce the infection rate.
The recovery rate may not be easily controllable, since it depends on various factors such as the disease's characteristics and the infrastructure of the health system.
We can go back to the code change the infection rate from 2.5 down to 0.75:
infection_rate = 0.75
And see how that reflects on the final curve:
We can see that the infected group maximum is much less critical and also the susceptible group never drops all the way to zero, which means that there is a certain percentage of the population that never comes across the disease.
Although this is a simplified model of a real world phenomenon, it captures most of the dynamics that governs this system and also highlights the importance of the preventive measurements currently in place.
Stay safe, fellow earthlings!