COVID-19 Modeling

Part 0: Project Background

Why bother? During this time of quarantining and job searching, I decided to take a break from vehicle dynamics modeling and try my hand at a data science project. With the current quarantine, it seems only fitting (haha, get it? Fitting, because we’re going to be fitting data) to work on infectious disease modeling.

WARNING #1: I am by no means an expert at modeling infectious diseases and am not trying to convince you that my model can perfectly predict COVID-19 cases all over the world. Instead I’ll derive the equations, show a few simple models, and hopefully get some good correlation at the end with some actual coronavirus data. My goals for this project are to gain more experience using Python’s scientific computing packages (SciPy, Matplotlib, NumPy, and Pandas) through solving ODEs, plotting the results, and fitting data to my model. It’s important to note too that I am not the first one to create these models. Below are several links to already existing models which I used for inspiration.

Interactive Models:

WARNING #2: I am certain I have made mistakes in this article, but I’ll do my best to correct them as quickly as possible. If you see any problems, please reach out to me.

Now that we have that stuff out of the way…let’s get started.

Part 1: Model Background

To model an infectious disease, we must first understand how it spreads. Once we know that, we need to find out how quickly it spreads, what fraction of the population it infects, what fraction dies, and so on. A very simple approach to modeling this scenario is a compartmental model, which breaks the population up into several compartments such as:

Susceptible (can be infected but are healthy)
Infected
Recovered (were previously infected and cannot be infected again)

This specific model is an SIR model. There are several more complex versions of this model, but for now let’s use this simpler model to derive the system equations and gain an understanding of the system dynamics.

First let’s introduce the most important variables and their definitions. Then we’ll get into a simple example to show how they interact.

N: total population
S(t): number of people susceptible on day t
I(t): number of people infected on day t
R(t): number of people recovered on day t
β: expected number of people an infected person infects per day
D: number of days an infected person has and can spread the disease
γ: the proportion of infected people recovering per day (γ = 1 / D)
Ro: the total number of people an infected person infects over the course of their infection (Ro = β / γ)

Here is a simple example created by Henri Froese. Let’s say we have a population of N = 1000 people and we know that 400 people are infected at time t = 7 (that is, 7 days after the outbreak of the disease). This is denoted by I(7) = 400. Given a few initial parameters, the SIR model lets us compute S(t), I(t), and R(t) for all days t.

Let’s say we have a new disease called disease X. For this disease, the probability that an infected person infects a healthy person they meet is 20%. The average number of people a person is in contact with per day is 5. So, per day, an infected individual meets 5 people and infects each with 20% probability. We expect this individual to infect 1 person (20% * 5 = 1) per day. This is β, the expected number of people an infected person infects per day.

From this it is clear that D, the number of days that an infected person has and can spread the disease, is very important. If D = 7, an infected person walks around for 7 days spreading the disease and infects 1 person per day (because β = 1). Thus, we can expect an infected person to infect 1 * 7 = 7 other people (1 person per day for 7 days). This is how we arrive at the basic reproduction number Ro, the total number of people an infected person infects. The last thing to note is γ. This can be thought of as the recovery rate, or the proportion of infected people recovering per day.
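The disease X numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names (p_transmission, contacts_per_day, and R0 for Ro) are my own choices for illustration.

```python
# Disease X parameters taken from the example above.
p_transmission = 0.20   # probability that a contact with an infected person leads to infection
contacts_per_day = 5    # average number of people a person meets per day
D = 7                   # days an infected person has and can spread the disease

beta = p_transmission * contacts_per_day  # expected infections per infected person per day
gamma = 1 / D                             # proportion of infected people recovering per day
R0 = beta / gamma                         # basic reproduction number (same as beta * D)

print(f"beta = {beta:.2f}, gamma = {gamma:.3f}, R0 = {R0:.1f}")
# prints: beta = 1.00, gamma = 0.143, R0 = 7.0
```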

Now we need to determine the number of susceptible, infected, and recovered people in our population. Say we have 60 people infected on day t (i.e., I(t) = 60), the total population is 100 (N = 100), and 30 people are still susceptible (S(t) = 30). That leaves R(t) = 100 - 60 - 30 = 10 recovered people. Easy enough.

What about the next day? We have 60 infected people and each of them infects 1 person per day. However, only 30/100 = 30% of the population are still susceptible and can be infected (S(t) / N). So these 60 people infect 60 * 1 * 30/100 = 18 people. If we replace the numbers with the variables, we have our first formula:

Change of S(t) to the next day: S’(t) = - β * I(t) * S(t) / N
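As a quick sanity check, the formula can be evaluated with the numbers from the example (β = 1, I(t) = 60, S(t) = 30, N = 100). The function name new_infections is just a label I made up for this sketch.

```python
def new_infections(beta, infected, susceptible, population):
    """Expected number of people moving from S to I in one day."""
    return beta * infected * susceptible / population

newly_infected = new_infections(beta=1.0, infected=60, susceptible=30, population=100)
dS = -newly_infected  # S shrinks by exactly the number of newly infected people

print(newly_infected, dS)
# prints: 18.0 -18.0
```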

Now for the infected people. We just saw that some new people are infected, and we know that exactly the number of people that “leave” S(t) “arrive” at I(t). We already have a formula to represent this; we just need to change the sign. However, one thing is missing: some of the infected people recover. We already have a metric for this: γ. Remember, it is the proportion of infected people recovering per day. Say we have 60 infected people and γ = 1/3 (that is, D = 3 days for this part of the example). Then 1/3 of the 60 people recover, which is 20 people.

Change of I(t) to the next day: I’(t) = β * I(t) * S(t) / N - γ * I(t)

The last part of this is the change in recovered people. Again, this is pretty straightforward, since the newly recovered people are the 20 people we just determined. There are no people leaving the “recovered” compartment. Recall that once you recover, you stay immune. Letting people become infected again after they recover is another variation we’ll explore later.

Change of R(t) to the next day: R’(t) = γ * I(t)
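Putting the three formulas together, here is a minimal sketch that evaluates them for the running example (S = 30, I = 60, R = 10, β = 1, γ = 1/3) and takes a single one-day step. Treating the derivative as the change over a whole day is a crude approximation that matches the "change to the next day" wording above; it is not how an ODE solver would integrate the system. The function name sir_derivatives is my own.

```python
def sir_derivatives(S, I, R, beta, gamma, N):
    """Right-hand side of the three SIR equations (change per day)."""
    new_infections = beta * I * S / N   # people moving from S to I
    new_recoveries = gamma * I          # people moving from I to R
    dS = -new_infections
    dI = new_infections - new_recoveries
    dR = new_recoveries
    return dS, dI, dR

# Values from the worked example above.
N = 100
S, I, R = 30.0, 60.0, 10.0
beta, gamma = 1.0, 1 / 3

dS, dI, dR = sir_derivatives(S, I, R, beta, gamma, N)

# One-day step (assumption: the daily change equals the derivative).
S_next, I_next, R_next = S + dS, I + dI, R + dR

print(f"dS = {dS:.1f}, dI = {dI:.1f}, dR = {dR:.1f}")
# prints: dS = -18.0, dI = -2.0, dR = 20.0
```

Note that the three changes add up to zero, so the total population N stays constant: everyone who leaves one compartment arrives in another.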

This is fantastic. Now we have created a solid understanding of how this system works and derived the necessary equations to represent the system. Huge thank you to Henri Froese for creating such an elegant example to help explain this system.

Part 1a: Programming the Model