This tutorial demonstrates methods for dynamic visualization of the
spread of infectious disease over networks using the ndtv
package for R. ndtv
is part of the larger
Statnet
suite of software for the representation, modeling,
and analysis of network data. It has been designed to work with
network-based epidemic modeling data from EpiModel
.
We need to start by installing the ndtv
package. This
package version has just been updated on CRAN. You can first start by
trying:
install.packages("ndtv")
If that does not work (because binaries for ndtv
may not
yet be available on CRAN), you can try with:
# install this package in case you do not already have it installed
install.packages("remotes")
remotes::install_github("statnet/ndtv", subdir = "ndtv")
Then check the package version with, which should match the version number below.
library("ndtv")
packageVersion("ndtv")
## [1] '0.13.2'
We start by loading EpiModel
. The
htmlwidgets
library additionally allows for embedding
ndtv
movies into RMarkdown HTML pages like this (you do not
need to install/load it for this course but we need it to build this
page).
library("EpiModel")
library("htmlwidgets")
In general, one primary goal of dynamic network visualization is to gain insight into how the structure of networks over time can influence transmission dynamics. This is different from estimating epidemic outcomes such as disease incidence or prevalence over time. The latter will typically require large networks (thousands of nodes) over large numbers of time steps (decades of time) with many hundreds or thousands of simulations. This allows for more accurate and stable estimation of outcome averages and spread. Dynamic visualization, on the other hand, usually only requires a single simulation of a small network size over a short number of time steps. That is because the visualization is meant to demonstrate the mechanisms rather than estimate the effects.
We start by setting a seed to ensure reproducible results. You can skip this to get different results from the tutorial page.
set.seed(1234)
This shows an SI epidemic in a closed population where the infection probability is 1, and we will start to build up complexity to the network model step-by-step. We start with a network of 100 nodes, with an edges-only model in the formation and dissolution.
nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)
For the epidemic model, we will replicate the poker chip example in which the transmission probability per act is 100%. We could modify this to be a lower (more relistic for most diseases) value, but we choose a high value here to show a speedy epidemic spread. We simulate the model over 25 time steps with a single simulation.
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)
ndtv
is an extension package in Statnet
that allows for dynamic visualization for networks over time. Because
this involves animation processing, this may require some heavy duty
computation.
First we need to extract the network objects from the larger
netsim
object. This was one of the key uses of this object
type first discussed in Session 3 today. Note that this object has
details on the distinct time changes.
nw <- get_network(sim)
nw
## NetworkDynamic properties:
## distinct change times: 27
## maximal time range: -Inf until Inf
##
## Dynamic (TEA) attributes:
## Vertex TEAs: testatus.active
##
## Includes optional net.obs.period attribute:
## Network observation period info:
## Number of observation spells: 2
## Maximal time range observed: 1 until 26
## Temporal mode: discrete
## Time unit: step
## Suggested time increment: 1
##
## Network attributes:
## vertices = 100
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## net.obs.period: (not shown)
## total edges= 104
## missing edges= 0
## non-missing edges= 104
##
## Vertex attribute names:
## active status testatus.active vertex.names
##
## Edge attribute names:
## active
Next, we need to add a time-varying nodal attribute to each network
that we will animate. This will allow ndtv
to color the
nodes by disease status over time. See the help documentation for
color_tea
to see some of the options and details for this
step.
nw <- color_tea(nw, verbose = FALSE)
Before we start making a movie with this updated network object,
recall the types of static visualizations of the co-evolving dynamic
network and epidemic data that are possible with EpiModel
.
One approach is to extract and work with the transmission matrix within
the netsim
object.
tm <- get_transmat(sim)
tm
## at sus inf infDur transProb actRate finalProb
## 1 2 13 64 16 1 1 1
## 2 2 47 64 16 1 1 1
## 3 3 72 47 1 1 1 1
## 4 4 67 64 18 1 1 1
## 5 8 81 67 4 1 1 1
## 6 9 1 81 1 1 1 1
## 7 10 29 67 6 1 1 1
## 8 10 68 1 1 1 1 1
## 9 11 57 68 1 1 1 1
## 10 11 16 68 1 1 1 1
## 11 12 48 16 1 1 1 1
## 12 15 32 81 7 1 1 1
## 13 17 45 72 14 1 1 1
## 14 18 78 45 1 1 1 1
## 15 18 54 45 1 1 1 1
## 16 19 53 47 17 1 1 1
## 17 19 91 78 1 1 1 1
## 18 20 35 54 2 1 1 1
## 19 20 49 1 11 1 1 1
## 20 20 85 72 17 1 1 1
## 21 21 4 35 1 1 1 1
## 22 21 89 54 3 1 1 1
## 23 21 96 49 1 1 1 1
## 24 21 92 35 1 1 1 1
## 25 21 94 85 1 1 1 1
## 26 22 27 94 1 1 1 1
## 27 22 82 89 1 1 1 1
## 28 22 69 96 1 1 1 1
## 29 22 74 89 1 1 1 1
## 30 22 46 16 11 1 1 1
## 31 23 95 69 1 1 1 1
## 32 23 76 74 1 1 1 1
## 33 23 97 27 1 1 1 1
## 34 24 9 97 1 1 1 1
## 35 24 11 76 1 1 1 1
## 36 24 8 97 1 1 1 1
## 37 25 31 49 5 1 1 1
## 38 25 65 11 1 1 1 1
## 39 25 20 9 1 1 1 1
Next, there are three built-in visualizations for the transmission
matrix, as shown here. Previously we converted the transmission matrix
into a phylo
class object and plotted a phylogram with the
ape
package. Here, we are showing that we can also plot
this transmat
object directly as a phylogram, and also as
two related formats. Each show the time-directed chain of transmissions
over time, starting with an initial seed. The middle plot is a directed
network (showing who infected whom, here in a single large component
because there was one seed). The right plot is a “transmission timeline”
that plots the temporal time (the 25 time steps in our model) against
generation time for transmissions.
par(mfrow = c(1, 3), mar = c(3, 3, 2, 1))
plot(tm, style = "phylo")
plot(tm, style = "network", displaylabels = TRUE)
plot(tm, style = "transmissionTimeline")
Finally, with the updated ndtv
network object from
above, we can plot a “proximity timeline” that shows the proximity of
the nodes in directed temporal graph space, colored by disease status.
Here the network is the full contact network, not just the transmission
network of infected nodes. This helps identify the micro-outbreaks of
infection that are occurring within sub-clusters of the network with
greater closeness.
par(mfrow = c(1, 1), mar = c(3, 3, 2, 1))
proximity.timeline(nw,
vertex.col = "ndtvcol",
spline.style = "color.attribute",
mode = "sammon",
default.dist = 10,
chain.direction = "reverse")
There are lots of possibilities for how to model our networks over
time. Here we animate the first 25 time steps of the simulation. To find
out all the options, consult the main vignette for the ndtv
package.
slice.par <- list(start = 1, end = 25, interval = 1,
aggregate.dur = 1, rule = "any")
render.par <- list(tween.frames = 10, show.time = FALSE)
plot.par <- list(mar = c(0, 0, 0, 0))
This step figures out where the nodes should be displayed in each animation frame. There are several dynamic layout options available, but we use the defaults here.
compute.animation(nw, slice.par = slice.par, verbose = TRUE)
Finally, we render the animation and save it out to an HTML file
using d3 Javascript. This makes for nice-looking animations within
webpages. See the ndtv
help for all the potential options
in this step.
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
To embed movies into an RMarkdown document like this, we can set the
output.mode
accordingly.
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
output.mode = "htmlWidget")
Let’s return to the core network model and experiment with extra
terms, and then run the movie again. Next we will add
concurrent
term in the model to control the number of nodes
with overlapping partnerships. For reference, the expected number of
concurrent nodes is a function of a Poisson distribution with the
parameter equal to the mean degree (which is
2 x edges/N
):
ppois(1, lambda = 0.8, lower.tail = FALSE) * 100
## [1] 19.12079
One helpful ERGM term is concurrent
. This is the number
of nodes with degree of 2 or higher. From the target stats above, next
we can add a value for concurrent
above or below the
expected value of 19.1, and see how that plays out for our network
movie. You may find that your model does not fit if the
concurrent
target stat is too high. That’s because you
can’t have a huge number of concurrent persons if the mean degree
(related to concurrency) is 0.8. In other words, the model terms and
associated target statistics must be compatible.
In this example, we select an extreme concurrency statistic of 0 (no nodes exhibit concurrency).
nw <- network_initialize(n = 100)
formation <- ~edges + concurrent
target.stats <- c(40, 0)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)
We use the same epidemic parameters as above, but you can experiment with these too now.
set.seed(12345)
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)
Let’s just look at what the static transmission outputs look like without network concurrency but with the same epidemic parameters. This network with much more sparse connectivity results in a transmission tree that is also sparser.
par(mfrow = c(1, 3), mar = c(3, 3, 2, 1))
tm <- get_transmat(sim)
plot(tm, style = "phylo")
plot(tm, style = "network", displaylabels = TRUE)
plot(tm, style = "transmissionTimeline")
We follow the same steps to make the movie. Notice here that nearly all nodes are connected within a dyad, and therefore that no concurrency implies few isolates given this relatively high mean degree (0.8). In fact, the logical maximum mean degree in a network with no concurrency would be 1.0. Notice how this “serial monogamy” constrains the speed of epidemic spread.
nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
One feature of network models is to represent persistent relationships, in which there are repeated contacts between the same set of persons over time. This involves modeling the incident formation and persistence of edges over time. We can also model the extremes of relational duration to see its impact on epidemic potential.
Here we use the same model as Model 1, but set the duration to 100,000 time steps. This effectively makes all edges fixed over the 25 simulation time steps.
nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 1e5)
est <- netest(nw, formation, target.stats, coef.diss)
We use the same epidemic parameters as above.
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)
We follow the same steps to make the movie. Nothing much happening, even with a transmission probability of 100%.
nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
Here we use the same model as Model 1, but set the duration to 2 time steps. This makes edges randomly disolve with high probabilities at each time step.
nw <- network_initialize(n = 100)
formation <- ~edges
target.stats <- 40
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 2)
est <- netest(nw, formation, target.stats, coef.diss)
We use the same epidemic parameters as above.
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 1)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)
Yikes! There is not only massive turnover in the edges at each time step (which makes the visualization a jumble of moving nodes), but the epidemic quickly reaches a saturation point.
nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
Next let’s try another model where we add extra “triangles”: clusters
of three people connected together in a local group. The
gwesp
term allows us to do this without problems of model
degeneracy (see Day 2 materials). The null value for the target
statistic is less than 1, so let’s ramp it up a bit and see what
happens. We will recycle the same epidemic parameters from the model
above.
nw <- network_initialize(n = 100)
formation <- ~edges + gwesp(0, TRUE)
target.stats <- c(40, 25)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges), duration = 20)
est <- netest(nw, formation, target.stats, coef.diss)
sim <- netsim(est, param, init, control)
We follow the same steps to make the movie. Notice the distinct triangle formation in the network edges. Are triangles associated with faster or slower epidemic spread?
nw <- get_network(sim)
nw <- color_tea(nw, verbose = FALSE)
compute.animation(nw, slice.par = slice.par)
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = 0.9,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
Finally, we show an example of how to work with additional vertex
attributes within the model. Unlike in today’s Session 5 tutorial, which
used the special group
attribute, here we show how to use
other (non-special) attributes in combination.
In this model, we want to parameterize a network model in which age and race/ethnicity attributes are represented. Race will be a single binary variable (0 and 1), whereas age will be an integer in years between 18 and 50. We use R’s built-in random sampling functions to generate values for these attributes and then set them on the network object.
nw <- network_initialize(n = 100, directed = FALSE)
nw <- set_vertex_attribute(nw, "race", rbinom(100, 1, 0.5))
nw <- set_vertex_attribute(nw, "age", sample(18:50, 100, TRUE))
nw
## Network attributes:
## vertices = 100
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 0
## missing edges= 0
## non-missing edges= 0
##
## Vertex attribute names:
## age race vertex.names
##
## No edge attributes
This will be the most complex TERGM formula that we have seen so far.
It introduces a couple of new ERGM terms for Day 3 and chains them all
together in a single model. We start with an edges
term
with an associated target statistic of 50 (so an overall mean degree of
1). Next is a nodematch("race")
term that specifies that 40
of those 50 edges are within the same race group. After that the
nodefactor("race")
term allows the mean degree to vary by
race. We will work on parameterizing this statistic further in Day 4,
but with a target statistic of 70, this means that the
race=1
group has a mean degree of 70/50=1.4
(the nodefactor
term has a default reference group that is
the lowest value of the referenced attribute). Next, the
absdiff
term is a mixing term, like nodematch
,
but for a continuous attribute (like age). The target statistic of 100
represents the sum of the absolute differences in the age attribute
across all dyads in the network; with 50 edges, this means that dyads
are on average 2 years apart in age (in either direction). Finally, we
include a concurrent
term to parameterize network degree.
In the dissolution model, we specify that the average duration of edges
is 20 time steps in relations not matched on race, and 10 time
steps in relations matched on race.
formation <- ~edges + nodematch("race") + nodefactor("race") +
absdiff("age") + concurrent
target.stats <- c(50, 40, 70, 100, 30)
coef.diss <- dissolution_coefs(dissolution = ~offset(edges) + offset(nodematch("race")),
duration = c(20, 10))
est <- netest(nw, formation, target.stats, coef.diss)
Phew! The epidemic model is simple, and parameterized as before.
param <- param.net(inf.prob = 1)
init <- init.net(i.num = 10)
control <- control.net(type = "SI", nsteps = 25, nsims = 1)
sim <- netsim(est, param, init, control)
The first steps of making the network movie are the same as before.
nw <- get_network(sim)
nw <- color_tea(nw)
slice.par <- list(start = 1, end = 25, interval = 1,
aggregate.dur = 1, rule = "any")
render.par <- list(tween.frames = 10, show.time = FALSE)
plot.par <- list(mar = c(0, 0, 0, 0))
compute.animation(nw, slice.par = slice.par, verbose = TRUE)
But to visualize race and age in the movie, we will extract the attributes back from the network object. And then scale them for appropriate visuals. We convert race into two numbers, 4 and 50, the will represent the number of sides in the node (so: a square and a circle). We scale age by a factor of 1/25 to allow the node size to be a simple function of age.
race <- get_vertex_attribute(nw, "race")
race.shape <- ifelse(race == 1, 4, 50)
age <- get_vertex_attribute(nw, "age")
age.size <- age/25
And here’s our final movie!
render.d3movie(
nw,
render.par = render.par,
plot.par = plot.par,
vertex.cex = age.size,
vertex.sides = race.shape,
vertex.col = "ndtvcol",
edge.col = "darkgrey",
vertex.border = "lightgrey",
displaylabels = FALSE,
filename = paste0(getwd(), "/movie.html"))
Last updated: 2022-07-07 with EpiModel v2.3.0