In a previous chapter, I covered how to simulate hundreds of coin flips. Now, we are familiar with many additional topics, including functions, conditional statements, and loops. In this chapter, I will show a somewhat complicated simulation that combines several of these concepts. This simulation will simulate the spread of a disease in a population. It will have a lot of moving parts, so it will be a good exercise in combining the concepts we have learned so far.
The code below is selectable, so you can copy and paste it into RStudio to look it over and run it. I will split the code into sections: setup, functions, and simulation. The setup section will contain the parameters of the simulation, such as the number of individuals in the population and the probabilities of infection and recovery. The functions section will contain the functions that will be used in the simulation. The simulation section will contain the code that actually runs the simulation.
Although it will be quite the project, put in the effort to understand the code below. Refer to the variable names to try to get a grasp of what is happening. It will be a good exercise in understanding how to combine the concepts, and will help you to start thinking creatively about coding.
# Here is an example of a simulation in R. It is an
# epidemiological simulation that uses a lattice network (think
# chessboard where your neighbors are N, S, E or W of you) to
# describe how individuals can spread disease.
### Simulation parameters
set.seed(250)
lattice.width <- 15
total.timesteps <- 500
frame.duration <- 0.1 # in seconds
number.of.individuals <- lattice.width ^ 2
center <- round(lattice.width / 2)
# probabilities
infection.probability <- 0.1
recovery.probability <- 0.1
loss.of.resistance.probability <- 0.02
# change this to 0.01 to see no cycles
# printed characters
susceptible.symbol <- " "
infected.symbol <- "@"
resistant.symbol <- "_"
### Functions
clearScreen <- function () {cat("\f")}
eventOccurs <- function (prob) {runif(1) < prob}
makePopulationMatrix <- function () {
population.matrix <- matrix(
rep(susceptible.symbol, number.of.individuals),
ncol=lattice.width
)
population.matrix[center, center] <- infected.symbol
return (population.matrix)
}
makeGraph <- function () {
plot(
NULL,
xlim=c(0, total.timesteps),
ylim=c(0, number.of.individuals),
xlab="Time step",
ylab="Count"
)
legend(
"topright",
legend=c("susceptible", "infected", "recovered"),
col=1:3,
pch=1
)
stats <- c(number.of.individuals - 1, 1, 0)
for (i in 1:length(stats)) {
points(1, stats[i], col=i)
}
}
changeIndex <- function (index, delta) {
new.index <- index + delta
if (new.index == 0) {
return (lattice.width)
} else if (new.index > lattice.width) {
return (1)
} else {
return (new.index)
}
}
getPopulationStats <- function (population.matrix) {
stats <- c()
for (char in c(susceptible.symbol, infected.symbol, resistant.symbol)) {
stats <- c(stats, sum(population.matrix == char))
}
return (stats)
}
updateGraphAndPrintMatrix <- function (population.matrix, current.time) {
stats <- getPopulationStats(population.matrix)
susceptible.count <- stats[1]
infected.count <- stats[2]
resistant.count <- stats[3]
clearScreen()
# update graph every 5th point
if (current.time %% 5 == 0) {
for (i in 1:length(stats)) {
points(current.time, stats[i], col=i)
}
}
# print time step information
cat(paste0(
"\n",
"Time step: ", current.time, " / ", total.timesteps, "\n"
))
# print population matrix
for (row.number in 1:lattice.width) {
cat(paste(population.matrix[row.number,], collapse=" "))
cat("\n")
}
# print current stats
cat(paste0(
"\n",
"# susceptible: ", susceptible.count, "\n",
"# infected: ", infected.count, "\n",
"# resistant: ", resistant.count, "\n"
))
}
getPopulationNextTimestep <- function (population.matrix) {
# make a copy so that population.matrix can be a reference for the past time step
next.timestep.matrix <- population.matrix
# check each individual's state
for (row.number in 1:lattice.width) {
for (column.number in 1:lattice.width) {
current.individual <- population.matrix[row.number, column.number]
# handle sick individuals
if (current.individual == infected.symbol) {
# get each neighbor index
up <- c(changeIndex(row.number, -1), column.number)
down <- c(changeIndex(row.number, 1), column.number)
left <- c(row.number, changeIndex(column.number, 1))
right <- c(row.number, changeIndex(column.number, -1))
# check if neighbors get infected
for (indices in list(up, down, left, right)) {
neighbor.row <- indices[1]
neighbor.column <- indices[2]
neighbor.is.susceptible <- population.matrix[neighbor.row, neighbor.column] == susceptible.symbol
if (neighbor.is.susceptible & eventOccurs(infection.probability)) {
next.timestep.matrix[neighbor.row, neighbor.column] <- infected.symbol
}
}
# check if sick individual recovers
if (eventOccurs(recovery.probability)) {
next.timestep.matrix[row.number, column.number] <- resistant.symbol
}
# handle resistant individuals
} else if (current.individual == resistant.symbol & eventOccurs(loss.of.resistance.probability)) {
next.timestep.matrix[row.number, column.number] <- susceptible.symbol
}
}
}
return (next.timestep.matrix)
}
### Simulation
population.matrix <- makePopulationMatrix()
makeGraph()
for (timestep in 1:total.timesteps) {
updateGraphAndPrintMatrix(population.matrix, timestep)
population.matrix <- getPopulationNextTimestep(population.matrix)
Sys.sleep(frame.duration)
}
In the code above, there are a lot of moving parts. However, the names of variables are selected to be descriptive, so you should be able to understand some of the structure of the code. Try to walk through the entirety of the code, making observations about what functions reference each other. When you see a bit of code that you don't understand, do some research online to get a grasp of what is going on.
When you run this code, what kind of trends do you see in the simulated data? What happens when you change the probabilities of infection and recovery? What happens when you change the loss of resistance probability?
Try making your own simulation that answers a question of interest to you. For example, you might do a simulation that models the outcomes of a Galton board, or the spread of a rumor amongst a friend group, or the spread of a wildfire. For your simulation, you will need to think about the following ideas:
Remember that writing pseudo-code here is pivotal! It will help you break down the big problem into smaller problems, and will help you to think about how to structure your code.