Last updated 2024-06-23
This tutorial is a joint product of the Statnet Development Team:
Pavel N. Krivitsky (University of New South Wales)
Martina Morris (University of Washington)
Mark S. Handcock (University of California, Los Angeles)
Carter T. Butts (University of California, Irvine)
David R. Hunter (Penn State University)
Steven M. Goodreau (University of Washington)
Chad Klumb (University of Washington)
Skye Bender-deMoll (Oakland, CA)
Michał Bojanowski (Kozminski University, Poland)
All Statnet packages are open-source, written for the R computing environment, and published on CRAN. The source repositories are hosted on GitHub. Our website is statnet.org
Need help? For general questions and comments, please email the Statnet users group at statnet_help@uw.edu. You’ll need to join the listserv if you’re not already a member. You can do that here: Statnet_help listserve.
Found a bug in our software? Please let us know by filing an issue in the appropriate package GitHub repository, with a reproducible example.
Want to request new functionality? We welcome suggestions – you can make a request by filing an issue on the appropriate package GitHub repository. The chances that this functionality will be developed are substantially improved if the requests are accompanied by some proposed code (we are happy to review pull requests).
For all other issues, please email us at contact@statnet.org.
This workshop and tutorial cover advanced topics in modeling network
data with Exponential family Random Graph Models (ERGMs) using
statnet
software. It assumes a familiarity with the
ergm
package at least at the level of the introductory
workshop entitled ERGMs
using statnet. This online tutorial is also designed for self-study,
with example code and self-contained data.
Some of the material in this workshop is drawn from a pair of recent articles by Krivitsky et al.: ergm 4: New features and ergm 4: Computational Improvements.
This workshop assumes basic familiarity with R,
experience with network concepts and data, familiarity with the ERGM
modeling framework, and proficiency using the ergm
package.
Minimally, you will need to install the latest version of
R (available
here) and the statnet
packages ergm.multi
and ggrepel
, which will in turn pull in the rest of the
packages needed to run the code presented here.
The workshops are conducted using the free version of
Rstudio
(available
here).
If you have not already downloaded the statnet
packages
for this workshop, the quickest way to install these (and the other most
commonly used packages from the statnet
suite), is to open
an R session and type:
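A minimal sketch of those two lines, assuming installation from CRAN (the exact call in the workshop may differ slightly):
install.packages(c("ergm.multi", "ggrepel"))  # ergm.multi pulls in ergm, network, and other dependencies
update.packages()                             # refresh any packages that are already installed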
The first line above will install all three required packages, since
ergm.multi depends on both ergm and network
. The second line ensures that any already-installed
packages, including ergm
and network
, are
updated to their latest versions.
Now load ergm.multi
, which will also load
ergm
and others.
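For example:
library(ergm.multi)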
Loading required package: ergm
Loading required package: network
'network' 1.18.1 (2023-01-24), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information
'ergm' 4.5.0 (2023-05-27), part of the Statnet Project
* 'news(package="ergm")' for changes since last version
* 'citation("ergm")' for citation information
* 'https://statnet.org' for help, support, and other information
'ergm' 4 is a major update that introduces some backwards-incompatible
changes. Please type 'news(package="ergm")' for a list of major
changes.
'ergm.multi' 0.2.0 (2023-05-29), part of the Statnet Project
* 'news(package="ergm.multi")' for changes since last version
* 'citation("ergm.multi")' for citation information
* 'https://statnet.org' for help, support, and other information
Attaching package: 'ergm.multi'
The following object is masked from 'package:ergm':
snctrl
You can check the version number with:
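For instance, a call along these lines (the exact function used here is an assumption) prints the installed ergm version:
packageVersion("ergm")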
[1] '4.6.0'
Throughout, we will set a random seed via set.seed()
for
commands in the tutorial that require simulating random values. This is not
necessary, but it ensures that you will get the same results as the
online tutorial.
The general form of an ERGM can be written as \[ P(\mathbf{Y}=\mathbf{y})=\frac{\exp({\boldsymbol{\theta}}^\top \mathbf{g}(\mathbf{y}))}{k({\boldsymbol{\theta}})}, \] where
\(\mathbf{Y}\) is the random variable for the state of the network (with realization \(\mathbf{y}\)),
\(\mathbf{g}(\mathbf{y})\) is a \(p\)-dimensional vector of model statistics for network \(\mathbf{y}\),
\({\boldsymbol{\theta}}\) is the \(p\)-dimensional vector of coefficients for those statistics, and
\(k({\boldsymbol{\theta}})\) represents the quantity in the numerator summed over all possible networks (typically constrained to be all networks with the same node set as \(\mathbf{y}\)).
In particular, the model implies that the probability attached to a network \(y\) only depends on the network via the vector of statistics \(g(y)\). Among other things, this means that maximum likelihood estimation may be carried out even if we don’t observe the network itself, as long as we know the observed value of \(g(y)\). We will see an example of this procedure in Section 2 of this tutorial.
The statistics \(\mathbf{g}(\mathbf{y})\) can be thought of as the “covariates” in the model. In the network modeling context, these represent network features like density, homophily, triads, etc. In one sense, they are like covariates you might use in other statistical models. But they are different in one important respect: these \(\mathbf{g}(\mathbf{y})\) statistics are functions of the network itself—each is defined by the frequency of a specific configuration of dyads observed in the network—so they are not measured by a question you include in a survey (e.g., the income of a node), but instead need to be computed on the specific network you have, after you have collected the data.
As a result, every term in an ERGM must have an associated algorithm
for computing its value for your network. The ergm
package
in statnet
includes about 150 term-computing algorithms.
You can get an up-to-date list of all available terms, and the syntax
for using them, by typing ?ergmTerm
. When using RStudio, it
is possible to press the tab key after starting a line with
?ergm
to view the wide range of possible help options
beginning with the letters ergm
.
To obtain help for a specific term, use either
help("[name]-ergmTerm")
or the shorthand version
ergmTerm?[name]
, where [name]
is the name of
the term.
One key categorization of model terms is worth keeping in mind: terms are either dyad independent or dyad dependent. Dyad independent terms (like nodal homophily terms) imply no dependence between dyads—the presence or absence of a tie may depend on nodal attributes, but not on the state of other ties. Dyad dependent terms (like degree terms, or triad terms), by contrast, imply dependence between dyads. Dyad dependent terms have very different effects, and much of what is different about network models comes from these terms. They introduce complex cascading effects that can often lead to counter-intuitive and highly non-linear outcomes. In addition, a model with at least one dyad dependent term requires a different estimation algorithm, so when we use these terms below you will see some different components in the output.
The ERGM expression for the probability of the entire graph shown above can be re-expressed in terms of the conditional log-odds (that is, the logit of the conditional probability) of a single tie between two actors: \[ \operatorname{logit}{P(Y_{ij}=1|\mathbf{y}^{c}_{ij})={\boldsymbol{\theta}}^\top\boldsymbol{\delta}_{ij}(\mathbf{y})}, \] where
\(Y_{ij}\) is the random variable for the state of the actor pair \(i,j\) (with realization \(y_{ij}\)), and
\(\mathbf{y}^{c}_{ij}\) signifies the complement of \(y_{ij}\), i.e. the entire network \(\mathbf{y}\) except for \(y_{ij}\).
\(\boldsymbol{\delta}_{ij}(\mathbf{y})\) is a vector of the “change statistics” for each model term. The change statistic records how the \(\mathbf{g}(\mathbf{y})\) term changes if the \(y_{ij}\) tie is toggled from off to on while fixing the rest of the network. So \[ \boldsymbol{\delta}_{ij}(\mathbf{y}) = g(\mathbf{y}^{+}_{ij})-g(\mathbf{y}^{-}_{ij}), \] where
\(\mathbf{y}^{+}_{ij}\) is defined as \(\mathbf{y}^{c}_{ij}\) along with \(y_{ij}\) set to 1, and
\(\mathbf{y}^{-}_{ij}\) is defined as \(\mathbf{y}^{c}_{ij}\) along with \(y_{ij}\) set to 0.
So \(\boldsymbol{\delta}_{ij}(\mathbf{y})\) equals the value of \(\mathbf{g}(\mathbf{y})\) when \(y_{ij}=1\) minus the value of \(\mathbf{g}(\mathbf{y})\) when \(y_{ij}=0\), but all other dyads are as in \(\mathbf{y}\). When this vector of change statistics is multiplied by the vector of coefficients \({\boldsymbol{\theta}}\), the equation above shows that this dot product is the log-odds of the tie between \(i\) and \(j\), conditional on all other dyads remaining the same.
In other words, for an individual statistic, its change value for \(Y_{ij}\) times its corresponding coefficient can be interpreted as that term’s contribution to the log-odds of that tie, conditional on all other dyads remaining the same.
Many applications take the sample space \(\mathcal{Y}\) to be the power set \(2^\mathbb{Y}\) of (possibly a subset of) all potential relationships. Yet it is sometimes desirable to restrict the sample space by placing constraints on which relationships \((i,j)\) are allowed in \(\mathbb{Y}\) and further which networks \(\mathbf{y}\in 2^\mathbb{Y}\) are allowed in \(\mathcal{Y}\).
For example, a bipartite network allows only edges connecting nodes from one subset, or mode, to nodes from its complement. Alternatively, we may wish to allow edges only within subsets of the node set, a situation often called a block-diagonal constraint. As still another example, some applications impose a cap on the degree of any node, which constrains the sample space to include only those networks in which every node has a permitted degree.
Correct statistical inference for ERGMs depends on correctly
incorporating constraints into the fitting process. They are specified
using the constraints
argument, a one-sided formula whose
terms specify the constraints on the sample space.
For example, suppose we wish to constrain the sample space to only
those networks with a particular edge density. To accomplish this,
constraints = ~ edges
specifies \(\mathcal{Y}^{\text{edges}}(\mathbf{y})=\{
\mathbf{y}'\in \mathcal{Y}: |\mathbf{y}'|=|\mathbf{y}|
\}\), where \(\mathbf{y}\) is
the observed network specified on the left-hand side. Here is a simple
example in which we generate a random network according to a fitted ERGM
and conditional on fixing the number of edges:
data("faux.mesa.high")
fmh.fit <- ergm(faux.mesa.high ~ edges + nodematch("Grade"))
newnw <- simulate(fmh.fit, constraints = ~edges, nsim = 10)
rbind(faux.mesa.high = summary(faux.mesa.high ~ edges),
newnw = summary(newnw ~ edges))
edges
faux.mesa.high 203
203
203
203
203
203
203
203
203
203
203
The number of edges in the randomly-simulated newnw
above will always be the same as in faux.mesa.high
due to
the constraint.
A full list of currently implemented constraints is obtained via
?ergmConstraint
, and a specific constraint called
[name]
can be looked up with
help("[name]-ergmConstraint")
or
ergmConstraint?[name]
. The handling of various constraints
by MCMC proposals in the ergm
package is addressed in Krivitsky et al (2022b).
Dyad-independent constraints, which affect \(\mathcal{Y}\) only through \(\mathbb{Y}\) and do not induce stochastic dependencies among the dyad states, may be combined arbitrarily because they do not require special Metropolis–Hastings proposal algorithms for efficient sampling.
In the remainder of this section, we illustrate some of
ergm
’s constraints using a dataset due to Coleman (1964).
These data are self-reported friendship ties among 73 boys measured at
two time points during the 1957–1958 academic year. They are included as
a \(2\times73\times73\) array and
documented in the sna
package.
Here, we use the Coleman data to create a network
object
with \(2\times73\) nodes:
library("sna")
data("coleman")
cole <- matrix(0, 2 * 73, 2 * 73)
# Upper left block:
cole[1 : 73, 1 : 73] <- coleman[1, , ]
# Lower right block:
cole[73 + (1 : 73), 73 + (1 : 73) ] <- coleman[2, , ]
coleNW <- network(cole)
coleNW %v% "Semester" <- rep(c("Fall", "Spring"), each = 73)
coleNW
Network attributes:
vertices = 146
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 506
missing edges= 0
non-missing edges= 506
Vertex attribute names:
Semester vertex.names
No edge attributes
By construction, the coleNW
network includes the Fall
1957 semester data and the Spring 1958 data as the upper left \(73\times73\) and lower right \(73\times73\) blocks, respectively. To see
this structure, we create a simple function to plot a binary adjacency
matrix so that the zeros and ones are different colors:
BinaryMatrixPlot <- function(x) {
  n <- dim(x)[1]
  # Set up an empty n-by-n plotting region; ylim is reversed so row 1 appears at the top
  plot(0:n, 0:n, type="n", xaxt="n", yaxt="n", xlab="", ylab="", bty="n", ylim=c(n,0))
  # Row and column indices of the nonzero entries
  therows <- row(x)[x>0]
  thecols <- col(x)[x>0]
  # Draw a unit square for each nonzero entry, then the outer border
  rect(thecols-1, therows-1, thecols, therows)
  rect(0,0,n,n)
}
Now we can visualize the constrained structure of the
coleNW
network, where the upper left corner depicts the
relationships measured in the fall and the lower right corner depicts
the spring relationships. Relationships are impossible in the upper
right and lower left, a fact we must take into account when calculating
statistical estimates.
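For instance, passing the block-diagonal matrix built above to this helper might look like:
BinaryMatrixPlot(cole)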
The Dyads(fix=NULL, vary=NULL)
operator takes one or two
ergm
formulas that may contain only dyad-independent terms.
For the terms in the fix=
formula, dyads that affect the
network statistic (i.e., have nonzero change statistic) for any
the terms will be fixed at their current values. For the terms in the
vary=
formula, only those that change at least one
of the terms will be allowed to vary, and all others will be fixed. A
formula passed without an argument name will default to
fix=
, for consistency with other constraints’
semantics.
The key to our treatment of the coleNW
network using the
Dyads
operator is the Semester
vertex
attribute:
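A sketch of the tabulation producing the counts below:
table(coleNW %v% "Semester")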
Fall Spring
73 73
In particular, the nodematch("Semester")
term has a
change statistic equal to one for exactly those dyads representing boys
measured during the same semester, and this change statistic is zero
otherwise. Therefore, in our 146-node directed network there are \(146\times 145\), or \(21{,}170\), total dyads, of which \(2\times73\times72\), or \(10{,}512\), have nonzero change statistics
for nodematch("Semester")
.
We can easily see exactly how many total edges there are and verify that they all match on the “Semester” variable:
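A sketch of such a summary call:
summary(coleNW ~ edges + nodematch("Semester"))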
edges nodematch.Semester
506 506
Another way to verify that no edges exist between Fall semester and
Spring semester nodes uses the mm
(for mixingmatrix)
term:
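A sketch of the call, assuming all cells of the mixing matrix are requested via levels2 = TRUE:
summary(coleNW ~ mm("Semester", levels2 = TRUE))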
mm[Semester=Fall,Semester=Fall] mm[Semester=Spring,Semester=Fall]
243 0
mm[Semester=Fall,Semester=Spring] mm[Semester=Spring,Semester=Spring]
0 263
If we ignore the constraints entirely, the edges
coefficient is the log-odds, or logit, of 506 / 21170:
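A sketch of the comparison, assuming a simple logit() helper (the same helper is used in the constrained example below; it may already be provided in your session):
logit <- function(p) log(p / (1 - p))
cbind(logit(506 / 21170),
      coef(ergm(coleNW ~ edges)))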
[,1] [,2]
edges -3.709612 -3.709612
Next, we use the Dyads
constraint allowing only those
dyads with nonzero change statistics for
nodematch("Semester")
to vary, and verify that we now
obtain the log-odds of 506 / 10512:
cbind(logit(506 / 10512),
coef(ergm(coleNW ~ edges,
constraints = ~ Dyads(vary = ~ nodematch("Semester")))))
[,1] [,2]
edges -2.984404 -2.984404
A significant limitation of this particular constraint is that its initialization requires testing every possible dyad and therefore takes up time and memory in proportion to the square of the number of nodes.
We may reproduce the example above using the blocks
operator, which constrains changes to any dyads that involve certain
pairs of categories defined by a particular nodal covariate. First,
consider the full complement of statistics produced by the
nodemix
model term:
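A sketch of the call producing the statistics below (the levels and levels2 settings are discussed next):
summary(coleNW ~ nodemix("Semester", levels = TRUE, levels2 = TRUE))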
mix.Semester.Fall.Fall mix.Semester.Spring.Fall
243 0
mix.Semester.Fall.Spring mix.Semester.Spring.Spring
0 263
The levels = TRUE
argument ensures that nodemix
considers every value of "Semester"
in constructing a mixing
matrix of possible dyad combinations. The levels2 = TRUE
argument ensures that, from the full complement of such possible
combinations, every one is included as a statistic. By default,
levels = TRUE
whereas levels2 = -1
, since we
frequently want to exclude at least one possible mixing combination to
avoid collinearity in a model that also includes the edges
term.
We may now use levels2
in conjunction with
blocks
to select exactly which of the nodemix
combinations should be constrained as fixed, namely, the second and
third statistics from the full mixing matrix:
# This coefficient estimate should match the example above
coef(ergm(coleNW ~ edges,
constraints = ~ blocks("Semester", levels2 = c(2, 3))))
edges
-2.984404
Additional examples using levels2
, among other nodal
attribute features, are contained in the nodal_attributes
vignette within the ergm
package.
Recent versions of ergm
allow for more control over
various aspects of the ERGM fitting process, such as the
Metropolis-Hastings proposal distribution and various algorithmic
control parameters. Here, we illustrate some of these features using an
example from Section 5 of Krivitsky et al (2022) in
which we estimate the parameters of an ERGM for a hypothetical network
on 50,000 nodes.
Our network will be based on the cohab
dataset in the
ergm
package, which consists of three R
objects, none of them a network object. These objects are based on
aggregated statistics from the National Survey of Family
Growth (NSFG).
cohab_PopWts
: A set of NSFG demographic
attributes—sex, age, and race/ethnicity/immigration status—along with
weights that have been adjusted to match the demographics of King County
in Washington State.
cohab_TargetStats
: A vector of expected statistics
for a 15-term ERGM applied to a network of 50,000 nodes, in which a
tie represents a heterosexual cohabitation relationship between two
nodes.
cohab_MixMat
: A mixing matrix on the ‘race’
variable, giving the number of cohabiting male-female pairs of each
possible combination of ‘race’ values. The five values of this variable,
which captures some aspects of race/ethnicity/immigration status, are
Black, Black immigrant, Hispanic, Hispanic immigrant, and White. The
\(5\times5\) mixing matrix is, like
cohab_PopWts
, based on NSFG data reweighted to match King
County demographics.
Additional information about the cohab
dataset is
available by typing help(cohab)
.
The column names of cohab_PopWts
tell us which nodal
attributes we need to set for our 50,000-node network:
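For example:
names(cohab_PopWts)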
[1] "weight" "age" "race" "sex" "sex.ident"
[6] "othr.net.deg" "agesq" "sqrt.age.adj"
The first column of cohab_PopWts
is the set of weights,
proportional to the probabilities we will use to construct our random
set of 50,000 nodes so that it matches the demographics of King County.
The value of agesq
is simply the square of
age
, and sqrt.age.adj
is the square root of
age
with a small upward adjustment for females.
Letting nw
denote our 50,000-node network, the ERGM formula
we wish to use is the same one used to generate the
cohab_TargetStats
object according to
?cohab
:
cohab.formula <-
(nw ~ edges + nodefactor("sex.ident", levels = 3)
+ nodecov("age") + nodecov("agesq")
+ nodefactor("race", levels = -5)
+ nodefactor("othr.net.deg", levels = -1)
+ nodematch("race", diff = TRUE) + absdiff("sqrt.age.adj"))
Recall that we do not actually need to observe the nw
network to fit an ERGM, although we do need the nodes in the network to
be specified. Thus, our code first creates a network containing no edges
in which the nodal attributes are selected according to the weights
specified by cohab_PopWts
.
set.seed(4321)
net_size <- 50000
nw <- network.initialize(net_size, directed = FALSE)
inds <- sample(seq_len(NROW(cohab_PopWts)),
net_size, TRUE, cohab_PopWts$weight)
set.vertex.attribute(nw, names(cohab_PopWts)[-1],
cohab_PopWts[inds,-1])
In this network, a tie is by definition a heterosexual cohabitation
relationship and we assume that no individual may have more than one
such relationship. We wish to constrain the set of allowable networks
according to these assumptions, which we may accomplish using the
bd
(bounded degree) and blocks
constraints:
Thus, the following formula will be passed to the ergm
function via the constraints=
argument:
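A sketch of such a formula (the exact specification is an assumption), restricting each node to at most one tie via bd() and fixing same-sex dyads via blocks() on the sex attribute:
constraint.formula <- ~ bd(maxout = 1) + blocks(attr = ~sex, levels2 = diag(TRUE, 2))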
We may now fit the model. This takes a few minutes, so we’ll start it and describe some of the control parameters as it runs:
set.seed(1234)
fit <- ergm(cohab.formula,
target.stats = cohab_TargetStats,
eval.loglik = FALSE,
constraints = constraint.formula,
control = snctrl(
MCMC.prop = ~strat(attr = ~race, empirical = TRUE) + sparse,
init.method = "MPLE",
init.MPLE.samplesize = 5e7,
MPLE.constraints.ignore = TRUE,
MCMLE.effectiveSize = 64,
SAN.nsteps = 5e7,
SAN.prop=~strat(attr = ~race, pmat = cohab_MixMat) + sparse))
In RStudio, typing ?snctrl
shows all of the control
arguments that may be used. The improvements to the fitting algorithm
are not merely due to an expanded list of control arguments; coding
efficiencies have also resulted in quicker fitting. According to Krivitsky et al (2022), the
example we just considered took anywhere from 1.3 hours to 22.64 hours
to fit using ergm
version 3.10, depending on the settings
used.
Finally, we may examine the set of estimated coefficients.
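A sketch of the call producing the table below:
coef(summary(fit))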
Estimate Std. Error MCMC % z value
edges 0.785277529 0.4839693005 0 1.6225772
nodefactor.sex.ident.msmf -1.509318824 0.2043290557 0 -7.3867068
nodecov.age -0.401146744 0.0143411971 0 -27.9716360
nodecov.agesq 0.006836269 0.0002244601 0 30.4564931
nodefactor.race.B 1.398141315 0.1564859095 0 8.9346147
nodefactor.race.BI 1.303563897 0.1645346127 0 7.9227336
nodefactor.race.H 2.748098856 0.1416603826 0 19.3992054
nodefactor.race.HI 1.430449585 0.1051492120 0 13.6039972
nodefactor.othr.net.deg.1 -4.542462604 0.1292577459 0 -35.1426723
nodematch.race.B 3.258740036 0.2550856025 0 12.7750841
nodematch.race.BI 3.914819910 0.2603545458 0 15.0364953
nodematch.race.H 0.050088295 0.1998123726 0 0.2506766
nodematch.race.HI 2.883540053 0.1490205662 0 19.3499470
nodematch.race.W 3.003038059 0.1347766112 0 22.2815964
absdiff.sqrt.age.adj -3.143347800 0.0571064224 0 -55.0436828
Pr(>|z|)
edges 1.046798e-01
nodefactor.sex.ident.msmf 1.505100e-13
nodecov.age 3.597251e-172
nodecov.agesq 9.827730e-204
nodefactor.race.B 4.085994e-19
nodefactor.race.BI 2.323451e-15
nodefactor.race.H 7.837344e-84
nodefactor.race.HI 3.791330e-42
nodefactor.othr.net.deg.1 1.504186e-270
nodematch.race.B 2.259033e-37
nodematch.race.BI 4.233789e-51
nodematch.race.H 8.020641e-01
nodematch.race.HI 2.040520e-83
nodematch.race.W 5.573561e-110
absdiff.sqrt.age.adj 0.000000e+00
ergm
4 introduces a new way to augment an
ergm
function call that we call a term operator,
or simply operator. In mathematics, an operator is a function,
like differentiation, that takes functions as its inputs; analogously, a
term operator takes one or more ERGM formulas as input and transforms
them by modifying their inputs and/or outputs.
Most ergm
operators have the form
X(formula, ...)
where X
is the name of the
operator, typically capitalized, formula
is a one-sided
formula specifying the network statistics to be evaluated, and the
remaining arguments control the transformation applied to the network
before formula
is evaluated and/or to the transformation
applied to the network statistics obtained by evaluating
formula
. Operators are documented alongside other terms,
accessible as help("[name]-ergmTerm")
or
ergmTerm?[name]
The operator F(formula, filter)
evaluates the terms in
formula
on a filtered network, with filtering specified by
filter
. Here, filter
is the right-hand side of
a formula that must contain one binary dyad-independent
ergm
term, having exactly one statistic with a dyadwise
contribution of 0 for a 0-valued dyad. That is, the term must be
expressible as \[\begin{equation}\label{eq:DyadIndependent}
g(\mathbf{y}) = \sum_{{{(i,j)}\in\mathbb{Y}}} f_{i,j}(y_{i,j}),
\end{equation}\] where for all possible \((i,j)\), \(f_{i,j}(0)=0\). One may verify that this
condition implies that an ERGM containing the single term \(g(\mathbf{y})\) has the property that the
dyads \(Y\!_{i,j}\) are jointly
independent, which is why such a term is called “dyad-independent”.
Examples of such terms include nodemix
,
nodematch
, nodefactor
, nodecov
,
and edgecov
. Then, formula
will be evaluated
on a network constructed by taking \(\mathbf{y}\) and keeping only those edges
for which \(f_{i,j}(y_{i,j}) \ne 0\).
This predicate can be modified slightly by very simple comparison or
logical expressions in the filter
formula. In particular,
placing !
in front of the term negates it (i.e., keep \((i,j)\) only if \(f_{i,j}(y_{i,j}) = 0\)) and comparison
operators (==
, <
, etc.) allow comparing
\(f_{i,j}(y_{i,j})\) to values other
than 0.
Sampson’s Monks can provide illustrative examples. ergm
includes a version of these data reporting cumulative liking nominations
over the three time periods at which Sampson asked a group of monks to identify
those they liked. We may plot this directed, 18-node network, where in
the labels we use “L”, “O”, or “T” to indicate Sampson’s categorization
of each monk as a Loyalist, Outcast, or Young Turk:
set.seed(123)
data("sampson")
lab <- paste0(1:18, " ", substr(samplike %v% "group", 1, 1),
": ", samplike %v% "vertex.names")
plot(samplike, displaylabels=TRUE, label = lab)
As an example of the F
operator, the code below uses
four different methods to summarize the number of ties between pairs of
nodes in the Turks
group:
summary(
(samplike ~ nodematch("group", diff = TRUE, levels = "Turks")
+ F( ~ nodematch("group"), ~ nodefactor("group", levels = "Turks"))
+ F( ~ edges, ~ nodefactor("group", levels = "Turks") == 2)
+ F( ~ edges, ~ !nodefactor( ~ group != "Turks"))))
nodematch.group.Turks
30
F(nodefactor("group",levels="Turks"))~nodematch.group
30
F(nodefactor("group",levels="Turks")==2)~edges
30
F(!nodefactor(~group!="Turks"))~edges
30
Here, the third method works because this particular \(f_{i,j}(y_{i,j})\) counts how many of the two nodes \(i\) and \(j\) are Turks, and so equals 2 if and only if both are; and the fourth method works because the new \(f_{i,j}(y_{i,j})\) is 0 exactly when neither \(i\) nor \(j\) is a Turk.
It is also possible to filter on a quantitative variable. For
instance, an alternative way to count the number of edges in
faux.mesa.high
that match on "Grade"
is to
report total edges after filtering by node pairs whose absolute
difference on the "Grade"
variable is less than 1:
cbind(summary(faux.mesa.high ~ nodematch("Grade")),
summary(
faux.mesa.high ~ F( ~ edges, ~ absdiff("Grade") < 1)))
[,1] [,2]
nodematch.Grade 163 163
While filter
must be dyad-independent,
formula
can have dyad-dependent terms as well. For
instance, we may count the transitive triples—i.e., triples \((i,j,k)\) where \(y_{i,j} = y_{j,k} = y_{i,k}=1\)—in the
samplike
network, then perform the same count on the
subnetwork consisting only of those edges connecting two monks not in
attendance in the minor seminary of Cloisterville before coming to the
monastery:
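A sketch of the corresponding call, assuming the filter keeps only dyads in which neither monk attended Cloisterville:
summary(samplike ~ ttriple
        + F( ~ ttriple, ~ nodefactor("cloisterville") == 0))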
ttriple
154
F(nodefactor("cloisterville")==0)~ttriple
12
The operator Symmetrize(formula, rule)
evaluates the
terms in formula
on an undirected network constructed by
symmetrizing the underlying directed network according to
rule
. The possible values of rule
, which match
the terminology of the symmetrize
function of the
sna
package, are (a) “weak”, (b) “strong”, (c) “upper”, and
(d) “lower”. These four values result in an undirected tie between \(i\) and \(j\), given any \(i<j\), if and only if:
weak: either \(y_{i,j}\) or \(y_{j,i}\) equals 1;
strong: both \(y_{i,j}\) and \(y_{j,i}\) equal 1;
upper: \(y_{i,j}=1\);
lower: \(y_{j,i}=1\)
For example, we can compute the number of node pairs \(i<j\) with reciprocated edges, equivalent to mutuality:
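A sketch of the comparison:
summary(samplike ~ Symmetrize( ~ edges, "strong") + mutual)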
Symmetrize(strong)~edges mutual
28 28
We can also compute the number of node pairs in which at least one edge is present; adding this value to the number of mutually connected node pairs (so that the latter are counted twice each) yields the total number of directed edges:
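A sketch of the comparison:
summary(samplike ~ Symmetrize( ~ edges, "weak") + edges)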
Symmetrize(weak)~edges edges
60 88
The operator S(formula, attrs)
evaluates the terms in
formula
on an induced subgraph constructed from vertices
identified by attrs
, a formula that can be one-sided or, to
make the induced subgraph bipartite, two-sided. For instance, suppose
that we wish to model the density and mutuality dynamics within the
group “Young Turks” as different from those of the rest of the
network:
coef(summary(
ergm(
(samplike ~ edges + mutual
+ S( ~ edges + mutual, ~ (group == "Turks"))),
control = snctrl(seed = 123))))
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -2.007074 0.2377493 0 -8.441979 3.120095e-17
mutual 2.351613 0.4996500 0 4.706519 2.519819e-06
S((group=="Turks"))~edges 2.812378 0.8650057 0 3.251282 1.148857e-03
S((group=="Turks"))~mutual -2.165222 1.1965294 0 -1.809585 7.036009e-02
Thus, the density within the group is statistically significantly higher, whereas the reciprocation within the group is lower, though not statistically significantly so at the 5% level.
The attrs
argument of the S(formula, attrs)
operator either takes a value as explained in the
?nodal_attributes
online help or, to obtain a bipartite
network, a two-sided formula with the left-hand side specifying the
tails and the right-hand side specifying the heads.
As another example, consider the directed edges from non-Young Turks to Young Turks. Creating the induced subgraph from these edges results in a bipartite network, shown in the figure.
The emphasized edges in the figure are those involved in a 4-cycle. We can define the bipartite network, and count the 4-cycles in two ways: If only directed edges from non-Turks to Turks (black in the figure) are viewed as bipartite edges, there are three 4-cycles.
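A sketch of how this count might be obtained, assuming a two-sided attrs formula (non-Turks as tails, Turks as heads) and the cycle(4) term; the exact call is an assumption:
summary(samplike ~ S( ~ cycle(4), (group != "Turks") ~ (group == "Turks")))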
S((group!="Turks"),(group=="Turks"))~cycle4
3
If we also include edges from Turks to non-Turks (dotted red in the
figure) via Symmetrize
and the “weak” rule, we get two
more.
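A sketch of the nested call, under the same assumptions as above:
summary(samplike ~ Symmetrize( ~ S( ~ cycle(4), (group != "Turks") ~ (group == "Turks")), "weak"))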
Symmetrize(weak)~S((group!="Turks"),(group=="Turks"))~cycle4
5
The last example also illustrates that term operators may be nested arbitrarily.
Finally, we illustrate a common use case in which
Symmetrize
is used to analyze mutuality in a directed
network as a function of a predictor. The faux.dixon.high
dataset is a directed friendship network of seventh through twelfth
graders. Suppose we wish to check how strongly the tendency toward
mutuality in friendships is affected by students’ closeness in grade
level.
data("faux.dixon.high")
FDHfit <- ergm(faux.dixon.high ~ edges + mutual + absdiff("grade")
+ Symmetrize( ~ absdiff("grade"), "strong"),
control = snctrl(seed=321))
coef(summary(FDHfit) )
Estimate Std. Error MCMC % z value
edges -3.2468082 0.05110162 0 -63.536313
mutual 3.2407587 0.12095858 0 26.792301
absdiff.grade -0.9145735 0.04309196 0 -21.223763
Symmetrize(strong)~absdiff.grade -0.4237874 0.18035755 0 -2.349707
Pr(>|z|)
edges 0.000000e+00
mutual 3.972740e-158
absdiff.grade 5.762326e-100
Symmetrize(strong)~absdiff.grade 1.878819e-02
After correcting for the overall network density, the propensity for friendships to be reciprocated, and the predictive effect of grade difference on friendship formation, the difference in grade level has a statistically significant negative effect on the tendency to form mutual friendships (\(p\)-value = 0.019).
For binary ERGMs, interactions between dyad-independent
ergm
terms can be specified in a manner similar to
lm
and glm
via the :
and
*
operators.
Let us first consider the colon (:
) operator. Generally,
if term A
creates \(p_A\)
statistics and term B
creates \(p_B\) statistics, then A:B
will create \(p_A \times p_B\) new
statistics. If A
and B
are dyad-independent
terms, expressed for \(a=1,\dotsc,p_A\)
and \(b=1,\dotsc, p_B\) as \[
g_{a}(\mathbf{y})=\sum_{{{(i,j)}\in\mathbb{Y}}} u^{a}_{i,j}y_{i,j}
\ \text{and}\
h_{b}(\mathbf{y})=\sum_{{{(i,j)}\in\mathbb{Y}}} v^{b}_{i,j}y_{i,j}
\] for appropriate covariate matrices \(U^{a}\) and \(V^{b}\), then the corresponding interaction
term is \[\begin{equation}\label{InteractionTerm}
g_{a:b}(\mathbf{y})=\sum_{{{(i,j)}\in\mathbb{Y}}}
u^{a}_{i,j}v^{b}_{i,j}y_{i,j}.
\end{equation}\]
As an example, consider the Grade
and Sex
effects, expressed as model terms via nodefactor
, in the
faux.mesa.high
dataset:
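A sketch of the call producing the statistics below:
summary(faux.mesa.high ~ nodefactor("Grade", levels = TRUE) : nodefactor("Sex"))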
nodefactor.Grade.7:nodefactor.Sex.M nodefactor.Grade.8:nodefactor.Sex.M
70 99
nodefactor.Grade.9:nodefactor.Sex.M nodefactor.Grade.10:nodefactor.Sex.M
63 46
nodefactor.Grade.11:nodefactor.Sex.M nodefactor.Grade.12:nodefactor.Sex.M
38 26
In the call above, we deliberately include all
Grade
-factor levels via levels=TRUE
, whereas
we employ the default behavior of nodefactor
for the
Sex
factor, which leaves out one level. Thus, the 6-level
Grade
factor and the 2-level Sex
factor, with
one level of the latter omitted, produce \(6\times 1\) interaction terms in this
example.
The *
operator, by contrast, produces all interactions
in addition to the main effects or statistics. Therefore, in the
scenario described above, A*B
will add \(p_{A} + p_{B} + p_{A} \times p_{B}\)
statistics to the model. Below, we use the default behavior of
nodefactor
on both the 6-level Grade
factor
and the 2-level Sex
factor, together with an additional
edges
term, to produce a model with \(1+ 5 + 1 + 5\times 1\) terms:
m <- ergm(faux.mesa.high ~ edges + nodefactor("Grade") * nodefactor("Sex"))
print(summary(m), digits = 3)
Call:
ergm(formula = faux.mesa.high ~ edges + nodefactor("Grade") *
nodefactor("Sex"))
Maximum Likelihood Results:
Estimate Std. Error MCMC % z value
edges -3.028 0.173 0 -17.53
nodefactor.Grade.8 -1.424 0.263 0 -5.41
nodefactor.Grade.9 -1.166 0.229 0 -5.10
nodefactor.Grade.10 -1.633 0.357 0 -4.58
nodefactor.Grade.11 -0.328 0.237 0 -1.38
nodefactor.Grade.12 -0.794 0.324 0 -2.45
nodefactor.Sex.M -1.764 0.240 0 -7.36
nodefactor.Grade.8:nodefactor.Sex.M 1.386 0.202 0 6.86
nodefactor.Grade.9:nodefactor.Sex.M 1.012 0.211 0 4.79
nodefactor.Grade.10:nodefactor.Sex.M 1.347 0.264 0 5.11
nodefactor.Grade.11:nodefactor.Sex.M 0.419 0.240 0 1.75
nodefactor.Grade.12:nodefactor.Sex.M 1.059 0.290 0 3.65
Pr(>|z|)
edges < 1e-04 ***
nodefactor.Grade.8 < 1e-04 ***
nodefactor.Grade.9 < 1e-04 ***
nodefactor.Grade.10 < 1e-04 ***
nodefactor.Grade.11 0.16714
nodefactor.Grade.12 0.01429 *
nodefactor.Sex.M < 1e-04 ***
nodefactor.Grade.8:nodefactor.Sex.M < 1e-04 ***
nodefactor.Grade.9:nodefactor.Sex.M < 1e-04 ***
nodefactor.Grade.10:nodefactor.Sex.M < 1e-04 ***
nodefactor.Grade.11:nodefactor.Sex.M 0.08074 .
nodefactor.Grade.12:nodefactor.Sex.M 0.00026 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 28987 on 20910 degrees of freedom
Residual Deviance: 2189 on 20898 degrees of freedom
AIC: 2213 BIC: 2308 (Smaller is better. MC Std. Err. = 0)
Interactions involving dyad-dependent terms are not straightforward,
so ergm
’s default behavior if :
or
*
is used with a dyad-dependent term is to return an error.
This default behavior may be changed by setting the
interact.dependent
option; see
help("ergm-options")
for more details.
Since interactions are defined by multiplying change statistics dyadwise and then summing over all dyads, interactions of terms are not necessarily the same as terms derived from products. Here is an example using the undirected Florentine marriage dataset:
data("florentine")
summary(flomarriage ~ nodecov("wealth") : nodecov("wealth")
+ nodecov( ~ wealth ^ 2))
nodecov.wealth:nodecov.wealth nodecov.wealth^2
284058 187814
The reason the two statistics above are not the same is that
nodecov("wealth")
is expressed as \[
g(\mathbf{y})=\sum_{{{(i,j)}\in\mathbb{Y}}} u_{i,j}y_{i,j}
\] by setting \(u_{i,j} =
\mbox{wealth}_i + \mbox{wealth}_j\). Thus, in the interaction
term, each \(y_{i,j}\) is multiplied by
\((\mbox{wealth}_i +
\mbox{wealth}_j)^2\), whereas in the wealth ^ 2
term, each \(y_{i,j}\) is multiplied by
only \(\mbox{wealth}_i^2 +
\mbox{wealth}_j^2\).
The term operator Sum(formulas, label)
allows arbitrary
linear combinations of existing statistics to be added to the model.
Suppose \(\mathbf{g}_1(\mathbf{y}),\dotsc,\mathbf{g}_K(\mathbf{y})\)
is a set of \(K\) vector-valued network
statistics, each corresponding to one or more ergm
terms
and of arbitrary dimension. Also suppose that \(A_1,\dotsc,A_K\) is a set of known constant
matrices all having the same number of rows such that each matrix
multiplication \(A_k\mathbf{g}_k(\mathbf{y})\) is
well-defined. Then it is now possible to define the statistic \[\mathbf{g}_\text{Sum}(\mathbf{y})=\sum_{k=1}^K
A_k\mathbf{g}_k(\mathbf{y}).\]
The first argument to Sum
is a formula or a list of
\(K\) formulas, each representing a
vector statistic. If a formula has a left-hand side, the left-hand side
will be used to define the corresponding \(A_k\) matrix: if it is a scalar or a
vector, \(A_k\) will be a diagonal
matrix with those values on the diagonal, so that each statistic is multiplied by its corresponding element; and
if it is a matrix, it will be used
as \(A_k\) directly. When no left-hand side is given, \(A_k\) is taken to be 1. To simplify this
function for some common cases, if the left-hand side is
"sum"
or "mean"
, the sum (or mean) of the
statistics in the formula is calculated.
As an example, consider a vector of statistics consisting of the numbers of friendship ties received by each subgroup of Sampson’s monks:
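A sketch of the call producing these counts, assuming all three group levels are requested:
summary(samplike ~ nodeifactor("group", levels = TRUE))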
nodeifactor.group.Loyal nodeifactor.group.Outcasts
29 13
nodeifactor.group.Turks
46
We may create a single statistic equal to the friendship ties
received by both groups of non-Outcasts by adding the first and third
components of the nodefactor
vector, either by
left-multiplying by \(\begin{bmatrix} 1 &
0 & 1 \end{bmatrix}\) or by deselecting the second component
at the nodeifactor
level and summing the remaining two:
summary(samplike ~
Sum(cbind(1, 0, 1) ~ nodeifactor("group", levels = TRUE), "nf.L_T")
+ Sum("sum" ~ nodeifactor("group", levels = -2), "nf.L_T"))
Sum~nf.L_T Sum~nf.L_T
75 75
Whereas the Sum
operator calculates linear combinations
of network statistics, the Prod
operator calculates the
products of their powers. As of this writing, it is implemented for
nonnegative statistics only, by first applying the Log
operator (which returns the natural logarithm, log
in
R
, of the statistics passed to it), then the
Sum
operator, and finally the Exp
operator
(which returns the exponential function, exp
in
R
). As a simple illustration, we may verify that the
Sum
and Prod
operators do in fact produce
network statistics as expected if we simply use each with a list of
formulas having no left hand side:
summary(faux.dixon.high ~ edges + mutual
+ Sum(list( ~ edges, ~ mutual), "EdgesAndMutual")
+ Prod(list( ~ edges, ~ mutual), "EdgesAndMutual"))
edges mutual Sum~EdgesAndMutual
1197 219 1416
Exp~Sum~EdgesAndMutual
262143
The For()
operator can be used to parametrize other
terms. This operator evaluates the formula given to it substituting the
specified loop counter variable with each element in a sequence.
Provide:
An ergm
formula in an unnamed argument containing
placeholders to be substituted.
One or more named arguments in the form
<placeholder> = <sequence>
, indicating what to
substitute. The sequence may be given directly as a list, as a function of the network that returns a list, or as a formula whose right-hand side expression returns a list; each of these forms is illustrated below.
The following are equivalent ways to compute differential homophily:
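For reference, the distinct group labels shown below could be listed with a call along these lines (the exact call is an assumption):
sort(unique(samplike %v% "group"))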
[1] "Loyal" "Outcasts" "Turks"
The “normal” way:
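A sketch:
summary(samplike ~ nodematch("group", diff = TRUE))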
nodematch.group.Loyal nodematch.group.Outcasts nodematch.group.Turks
23 10 30
One element at a time, specifying a list:
nodematch.group.Loyal nodematch.group.Outcasts nodematch.group.Turks
23 10 30
One element at a time, specifying a function that returns a list:
summary(samplike ~ For(~nodematch("group", levels=., diff=TRUE),
. = function(nw) sort(unique(nw%v%"group"))))
nodematch.group.Loyal nodematch.group.Outcasts nodematch.group.Turks
23 10 30
One element at a time, specifying a formula whose RHS expression returns a list:
nodematch.group.Loyal nodematch.group.Outcasts nodematch.group.Turks
23 10 30
Here, absdiff()
is being computed for each combination
of attribute and power, the “normal” way:
data(florentine)
summary(flomarriage ~ absdiff("wealth", pow=1) + absdiff("priorates", pow=1) +
absdiff("wealth", pow=2) + absdiff("priorates", pow=2) +
absdiff("wealth", pow=3) + absdiff("priorates", pow=3))
absdiff.wealth absdiff.priorates absdiff2.wealth absdiff2.priorates
1146 638 91570 26756
absdiff3.wealth absdiff3.priorates
8588334 1289504
and with a loop, with attribute (a
) iterated within
power (.
):
absdiff.wealth absdiff.priorates absdiff2.wealth absdiff2.priorates
1146 638 91570 26756
absdiff3.wealth absdiff3.priorates
8588334 1289504
The ergm.multi
package provides support for modeling
multiple networks, as we illustrate in this section using datasets
supplied in that package.
First, load the ergm.multi
package along with some
helper packages. You might need to use install.packages
first if any of these is not yet installed:
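A sketch of the loading step, assuming the helper packages used by the code below (purrr, dplyr, and knitr):
library(ergm.multi)
library(purrr)   # keep(), discard(), map()
library(dplyr)   # %>%, bind_rows(), group_by(), summarize()
library(knitr)   # kable()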
The list of networks studied by Goeyvaerts et al (2018) is included in this package:
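A sketch:
data(Goeyvaerts)
length(Goeyvaerts)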
[1] 318
An explanation of the networks, including a list of their network
(%n%
) and vertex (%v%
) attributes, can be
obtained via ?Goeyvaerts
. A total of 318 complete networks
were collected, two of which were then excluded due to “nonstandard” family
composition:
[[1]]
# A tibble: 4 × 5
vertex.names age gender na role
<int> <int> <chr> <lgl> <chr>
1 1 32 F FALSE Mother
2 2 48 F FALSE Grandmother
3 3 32 M FALSE Father
4 4 10 F FALSE Child
[[2]]
# A tibble: 3 × 5
vertex.names age gender na role
<int> <int> <chr> <lgl> <chr>
1 1 29 F FALSE Mother
2 2 28 F FALSE Mother
3 3 0 F FALSE Child
To reproduce the analysis, exclude them as well:
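A sketch of one way to drop them, treating the two household compositions shown above (a grandparent in one, two mothers in the other) as the filter; this exact rule is an assumption:
G <- Goeyvaerts %>%
  discard(~ "Grandmother" %in% (. %v% "role") | sum((. %v% "role") == "Mother") > 1)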
Obtain weekday indicator, network size, and density for each network, and summarize them as in Goeyvaerts et al (2018), Table 1:
G %>% map(~list(weekday = . %n% "weekday",
n = network.size(.),
d = network.density(.))) %>% bind_rows() %>%
group_by(weekday, n = cut(n, c(1,2,3,4,5,9))) %>%
summarize(nnets = n(), p1 = mean(d==1), m = mean(d)) %>% kable()
|weekday |n     | nnets|        p1|         m|
|:-------|:-----|-----:|---------:|---------:|
|FALSE   |(1,2] |     3| 1.0000000| 1.0000000|
|FALSE   |(2,3] |    19| 0.7368421| 0.8771930|
|FALSE   |(3,4] |    48| 0.8541667| 0.9618056|
|FALSE   |(4,5] |    18| 0.7777778| 0.9500000|
|FALSE   |(5,9] |     3| 1.0000000| 1.0000000|
|TRUE    |(1,2] |     9| 1.0000000| 1.0000000|
|TRUE    |(2,3] |    53| 0.9056604| 0.9622642|
|TRUE    |(3,4] |   111| 0.7747748| 0.9279279|
|TRUE    |(4,5] |    39| 0.6410256| 0.8974359|
|TRUE    |(5,9] |    13| 0.4615385| 0.8454212|
We now reproduce the ERGM fits. First, we extract the weekday networks:
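A sketch, paralleling the weekend extraction further below:
G.wd <- G %>% keep(`%n%`, "weekday")
length(G.wd)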
[1] 225
Next, we specify the multi-network model using the
N(formula, lm)
operator. This operator will evaluate the
ergm
formula formula
on each network, weighted
by the predictors passed in the one-sided lm
formula, which
is interpreted the same way as that passed to the built-in
lm()
function, with its “data
” being the table
of network attributes.
Since different networks may have different compositions, to have a consistent model, we specify a consistent list of family roles.
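A sketch of one way to construct that list (the name roleset is referenced in the formula below):
roleset <- sort(unique(unlist(lapply(G, `%v%`, "role"))))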
We now construct the formula object, which will be passed directly to
ergm()
:
# Networks() function tells ergm() to model these networks jointly.
f.wd <- Networks(G.wd) ~
# This N() operator adds three edge counts:
N(~edges,
~ # one total for all networks (intercept implicit as in lm),
I(n<=3)+ # one total for only small households, and
I(n>=5) # one total for only large households.
) +
# This N() construct evaluates each of its terms on each network,
# then sums each statistic over the networks:
N(
# First, mixing statistics among household roles, including only
# father-mother, father-child, and mother-child counts.
# Since tail < head in an undirected network, in the
# levels2 specification, it is important that tail levels (rows)
# come before head levels (columns). In this case, since
# "Child" < "Father" < "Mother" in alphabetical order, the
# row= and col= categories must be sorted accordingly.
~mm("role", levels = I(roleset),
levels2=~.%in%list(list(row="Father",col="Mother"),
list(row="Child",col="Father"),
list(row="Child",col="Mother"))) +
# Second, the nodal covariate effect of age, but only for
# edges between children.
F(~nodecov("age"), ~nodematch("role", levels=I("Child"))) +
# Third, 2-stars.
kstar(2)
) +
# This N() adds one triangle count, totalled over all households
# with at least 6 members.
N(~triangles, ~I(n>=6))
See ergmTerm?mm
for documentation on the mm
term used above. Now, we can fit the model:
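A sketch:
fit.wd <- ergm(f.wd)
summary(fit.wd)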
Call:
ergm(formula = f.wd)
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error
N(1)~edges 0.84277 0.53315
N(I(n <= 3)TRUE)~edges 1.48525 0.43480
N(I(n >= 5)TRUE)~edges -0.80143 0.20811
N(1)~mm[role=Child,role=Father] -0.63765 0.48484
N(1)~mm[role=Child,role=Mother] 0.13614 0.52994
N(1)~mm[role=Father,role=Mother] 0.25377 0.58752
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age -0.07154 0.01695
N(1)~kstar2 -0.25767 0.21225
N(1)~triangle 2.05168 0.31374
N(I(n >= 6)TRUE)~triangle -0.28102 0.10850
MCMC % z value Pr(>|z|)
N(1)~edges 0 1.581 0.113936
N(I(n <= 3)TRUE)~edges 0 3.416 0.000636
N(I(n >= 5)TRUE)~edges 0 -3.851 0.000118
N(1)~mm[role=Child,role=Father] 0 -1.315 0.188450
N(1)~mm[role=Child,role=Mother] 0 0.257 0.797254
N(1)~mm[role=Father,role=Mother] 0 0.432 0.665792
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age 0 -4.220 < 1e-04
N(1)~kstar2 0 -1.214 0.224754
N(1)~triangle 0 6.539 < 1e-04
N(I(n >= 6)TRUE)~triangle 0 -2.590 0.009600
N(1)~edges
N(I(n <= 3)TRUE)~edges ***
N(I(n >= 5)TRUE)~edges ***
N(1)~mm[role=Child,role=Father]
N(1)~mm[role=Child,role=Mother]
N(1)~mm[role=Father,role=Mother]
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age ***
N(1)~kstar2
N(1)~triangle ***
N(I(n >= 6)TRUE)~triangle **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 1975 on 1425 degrees of freedom
Residual Deviance: 611 on 1415 degrees of freedom
AIC: 631 BIC: 683.6 (Smaller is better. MC Std. Err. = 0.6565)
Similarly, we can extract the weekend network, and fit it to a
smaller model. We only need one N()
operator, since all
statistics are applied to the same set of networks, namely, all of
them.
G.we <- G %>% discard(`%n%`, "weekday")
fit.we <- ergm(Networks(G.we) ~
N(~edges +
mm("role", levels=I(roleset),
levels2=~.%in%list(list(row="Father",col="Mother"),
list(row="Child",col="Father"),
list(row="Child",col="Mother"))) +
F(~nodecov("age"), ~nodematch("role", levels=I("Child"))) +
kstar(2) +
triangles))
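The summary shown below can then be obtained with, e.g.:
summary(fit.we)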
Call:
ergm(formula = Networks(G.we) ~ N(~edges + mm("role", levels = I(roleset),
levels2 = ~. %in% list(list(row = "Father", col = "Mother"),
list(row = "Child", col = "Father"), list(row = "Child",
col = "Mother"))) + F(~nodecov("age"), ~nodematch("role",
levels = I("Child"))) + kstar(2) + triangles))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error
N(1)~edges 2.1574 1.4273
N(1)~mm[role=Child,role=Father] -1.1795 1.4697
N(1)~mm[role=Child,role=Mother] 0.1391 1.5732
N(1)~mm[role=Father,role=Mother] -0.7507 1.5343
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age -0.1765 0.0519
N(1)~kstar2 -0.8488 0.3592
N(1)~triangle 3.5582 0.7602
MCMC % z value Pr(>|z|)
N(1)~edges 0 1.511 0.130662
N(1)~mm[role=Child,role=Father] 0 -0.803 0.422257
N(1)~mm[role=Child,role=Mother] 0 0.088 0.929571
N(1)~mm[role=Father,role=Mother] 0 -0.489 0.624649
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age 0 -3.400 0.000674
N(1)~kstar2 0 -2.363 0.018127
N(1)~triangle 0 4.680 < 1e-04
N(1)~edges
N(1)~mm[role=Child,role=Father]
N(1)~mm[role=Child,role=Mother]
N(1)~mm[role=Father,role=Mother]
N(1)~F(nodematch("role",levels=I("Child")))~nodecov.age ***
N(1)~kstar2 *
N(1)~triangle ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 802.7 on 579 degrees of freedom
Residual Deviance: 133.2 on 572 degrees of freedom
AIC: 147.2 BIC: 177.8 (Smaller is better. MC Std. Err. = 0.7987)
Perform diagnostic simulation (Krivitsky et al, 2022c), summarize the residuals, and make residuals vs. fitted and scale-location plots:
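A sketch of these steps, assuming ergm.multi's gofN() with the dyad-dependent statistics as the targets (the GOF= specification is an assumption):
gof.we <- gofN(fit.we, GOF = ~ edges + kstar(2) + triangles)
summary(gof.we)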
Constructing simulation model(s).
Constructing GOF model.
Simulating unconstrained sample.
Collating the simulations.
Summarizing.
$`Observed/Imputed values`
edges kstar2 triangle
Min. : 1.000 Min. : 0.00 Min. : 0.000
1st Qu.: 3.000 1st Qu.: 3.00 1st Qu.: 1.000
Median : 6.000 Median :12.00 Median : 4.000
Mean : 5.778 Mean :13.55 Mean : 4.324
3rd Qu.: 6.000 3rd Qu.:12.00 3rd Qu.: 4.000
Max. :18.000 Max. :78.00 Max. :23.000
NA's :9 NA's :9
$`Fitted values`
edges kstar2 triangle
Min. : 0.860 Min. : 2.620 Min. : 0.700
1st Qu.: 2.950 1st Qu.: 7.622 1st Qu.: 2.348
Median : 5.570 Median :10.610 Median : 3.390
Mean : 5.754 Mean :13.409 Mean : 4.277
3rd Qu.: 5.840 3rd Qu.:11.473 3rd Qu.: 3.743
Max. :14.920 Max. :59.360 Max. :19.680
NA's :9 NA's :9
$`Pearson residuals`
edges kstar2 triangle
Min. :-5.498132 Min. :-4.2209 Min. :-4.20054
1st Qu.: 0.215984 1st Qu.: 0.2233 1st Qu.: 0.19607
Median : 0.364671 Median : 0.3825 Median : 0.39214
Mean : 0.000977 Mean : 0.0052 Mean : 0.01461
3rd Qu.: 0.477977 3rd Qu.: 0.5123 3rd Qu.: 0.54272
Max. : 1.752754 Max. : 2.0089 Max. : 1.93910
NA's :9 NA's :9
$`Variance of Pearson residuals`
$`Variance of Pearson residuals`$edges
[1] 1.153315
$`Variance of Pearson residuals`$kstar2
[1] 1.091188
$`Variance of Pearson residuals`$triangle
[1] 1.028553
$`Std. dev. of Pearson residuals`
$`Std. dev. of Pearson residuals`$edges
[1] 1.073925
$`Std. dev. of Pearson residuals`$kstar2
[1] 1.044599
$`Std. dev. of Pearson residuals`$triangle
[1] 1.014176
Variances of Pearson residuals substantially greater than 1 suggest unaccounted-for heterogeneity.
(Six diagnostic plots follow: residuals vs. fitted values and scale-location plots for the edges, kstar2, and triangle statistics, each with a loess smooth; the nine networks with undefined residuals are omitted from the affected panels.)
The plots don’t look unreasonable.
Also make plots of residuals vs. square root of fitted and vs. network size:
(Plots of residuals vs. the square root of the fitted values, and boxplots of residuals vs. network size, follow for each statistic; the nine networks with undefined residuals are again omitted.)
It looks like network-size effects are probably accounted for.
It is quite common that network data are incomplete in various ways.
The ergm
package includes the capability to handle missing
edge data, whereas other types of missingness such as missing nodal
information are not addressed.
We illustrate using the samplike
dataset used earlier.
Consider a simple model with edges, mutuality (reciprocated dyads),
transitive ties, and cyclical ties. For the sake of comparison, we first
fit the model assuming no missing edge data. First, verify that
samplike
has no missing edges:
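Printing the network object shows its edge counts, e.g.:
print(samplike)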
Network attributes:
vertices = 18
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
total edges= 88
missing edges= 0
non-missing edges= 88
Vertex attribute names:
cloisterville group vertex.names
Edge attribute names:
nominations
Now fit the model:
summary(full.fit <-
          ergm(samplike ~ edges + mutual + transitiveties + cyclicalties,
               eval.loglik = TRUE, control = snctrl(seed = 321)))
Call:
ergm(formula = samplike ~ edges + mutual + transitiveties + cyclicalties,
eval.loglik = TRUE)
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -1.8942 0.3648 0 -5.192 <1e-04 ***
mutual 2.4790 0.4375 0 5.666 <1e-04 ***
transitiveties 0.5225 0.3076 0 1.699 0.0894 .
cyclicalties -0.4546 0.2493 0 -1.823 0.0682 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 424.2 on 306 degrees of freedom
Residual Deviance: 329.5 on 302 degrees of freedom
AIC: 337.5 BIC: 352.4 (Smaller is better. MC Std. Err. = 0.4547)
Next, suppose that Monk #1 (John Bosco) refused to respond during all three waves, rendering his replies missing:
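A sketch of one way to do this, assuming missing out-dyads are introduced by assigning NA to the first row of the sociomatrix:
samplike1 <- samplike
samplike1[1, ] <- NA  # mark all of monk 1's outgoing dyads as missing
samplike1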
Network attributes:
vertices = 18
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
total edges= 99
missing edges= 17
non-missing edges= 82
Vertex attribute names:
cloisterville group vertex.names
Edge attribute names:
nominations
If we pass this modified object to ergm
, it will
automatically calculate the MLE under the assumption that the monk’s
refusal is unrelated to his choice of relations, i.e., that the data are
ignorably missing with respect to the specified model:
summary(m1.fit <-
          ergm(samplike1 ~ edges + mutual + transitiveties + cyclicalties,
               eval.loglik = TRUE, control = snctrl(seed = 321)))
Call:
ergm(formula = samplike1 ~ edges + mutual + transitiveties +
cyclicalties, eval.loglik = TRUE)
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -2.0219 0.3918 0 -5.161 <1e-04 ***
mutual 2.4080 0.4734 0 5.087 <1e-04 ***
transitiveties 0.4585 0.4021 0 1.140 0.254
cyclicalties -0.2777 0.3701 0 -0.750 0.453
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 400.6 on 289 degrees of freedom
Residual Deviance: 313.7 on 285 degrees of freedom
AIC: 321.7 BIC: 336.4 (Smaller is better. MC Std. Err. = 0.6715)
The degrees of freedom associated with the missing data fit have decreased because unobserved dyads do not carry information.
For the theoretical grounding of the ergm
package’s
algorithm for networks with missing edges, see Krivitsky et al (2022d),
which is based on the framework developed by Handcock and Gile (2010).
For more on the ignorability assumption for edge variables, see Handcock
and Gile (2010).
The estimation approach above can be extended to other types of incomplete network observation. Karwa et al. (2017) applied it to fit arbitrary ERGMs to networks whose dyad values had been stochastically perturbed (ties added and removed at random, with known probabilities) in order to preserve privacy. Another use case is multiple imputation for networks with missing data, in which multiple random versions of the full network are constructed by inserting values for unobserved dyads at random, with probabilities determined by, say, a logistic regression model.
These mechanisms may be invoked by passing an
obs.constraints
formula, specifying how the network of
interest was observed. Of particular interest are the following
constraints:
observed
restricts the proposal to changing only those dyads that are recorded as missing.
egocentric(attr = NULL, direction = c("both", "out", "in"))
restricts the proposal to changing only those dyads that would not be
observed in an egocentric sample. That is, dyads incident on vertices for
which the attribute specification attr has value TRUE (or, if attr is NULL,
for which the vertex attribute "na" has value FALSE) cannot be modified.
For directed networks, direction="out" preserves only the out-dyads of those
actors, and direction="in" preserves their in-dyads.
dyadnoise(p01, p10)
Unlike the others, this is a soft constraint that adjusts the sampled
distribution for dyad-level noise with known perturbation probabilities,
which can arise in a variety of contexts. The observed LHS network is
assumed to be a noisy observation of some unobserved true network, with
p01 giving the dyadwise probability of erroneously observing a tie where
the true network had a non-tie and p10 giving the dyadwise probability of
erroneously observing a non-tie where the true network had a tie. p01 and
p10 can either both be scalars or both be adjacency matrices of the same
dimension as the LHS network, giving these probabilities.
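For example, a network believed to have had ties spuriously recorded with probability 0.01 and true ties missed with probability 0.05 could be handled as follows (an illustrative sketch only; this model is not fit in this tutorial):
noisy.fit <- ergm(samplike ~ edges + mutual,
                  obs.constraints = ~ dyadnoise(0.01, 0.05))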
We may use the obs.constraints
argument to re-fit the
model above:
samplike2 <- samplike
samplike2[1,] <- 0 # Careful! Zeros are not the same as missing, but the egocentric constraint below tells ergm which dyads were actually unobserved.
samplike2 %v% "refused" <-
rep(c(TRUE,FALSE),c(1,17)) # This new nodal covariate labels who is missing
print(samplike2)
Network attributes:
vertices = 18
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
total edges= 82
missing edges= 0
non-missing edges= 82
Vertex attribute names:
cloisterville group refused vertex.names
Edge attribute names:
nominations
summary(m2.fit <-
ergm(samplike2 ~ edges + mutual + transitiveties + cyclicalties,
obs.constraints = ~ egocentric( ~ !refused, "out"),
control = snctrl(seed = 123) ) )
Call:
ergm(formula = samplike2 ~ edges + mutual + transitiveties +
cyclicalties, obs.constraints = ~egocentric(~!refused, "out"),
control = snctrl(seed = 123))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -2.0149 0.4104 0 -4.910 <1e-04 ***
mutual 2.4407 0.4792 0 5.093 <1e-04 ***
transitiveties 0.4519 0.4335 0 1.042 0.297
cyclicalties -0.2832 0.3903 0 -0.726 0.468
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 400.6 on 289 degrees of freedom
Residual Deviance: 314.0 on 285 degrees of freedom
AIC: 322 BIC: 336.6 (Smaller is better. MC Std. Err. = 0.5995)
Finally, the observational process can be included as a part of the
network dataset using the %ergmlhs%
operation. If we do
this, we do not need to explicitly pass the obs.constraints
argument to the ergm
function:
samplike2 %ergmlhs% "obs.constraints" <- ~ egocentric( ~ !refused, "out")
summary(m3.fit <-
ergm(samplike2 ~ edges + mutual + transitiveties + cyclicalties,
control = snctrl(seed = 231)))
Call:
ergm(formula = samplike2 ~ edges + mutual + transitiveties +
cyclicalties)
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
edges -2.0409 0.3773 0 -5.409 <1e-04 ***
mutual 2.4716 0.4719 0 5.238 <1e-04 ***
transitiveties 0.4148 0.4128 0 1.005 0.315
cyclicalties -0.2314 0.3691 0 -0.627 0.531
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 400.6 on 289 degrees of freedom
Residual Deviance: 313.1 on 285 degrees of freedom
AIC: 321.1 BIC: 335.8 (Smaller is better. MC Std. Err. = 0.6085)
Also known as multiplex, multirelational, or multivariate networks, multilayer networks allow a pair of actors to have multiple simultaneous relations of different types. For example, in the well-known Lazega law-firm data, each pair of lawyers in the firm can have an advice relationship, a coworking relationship, a friendship relationship, or any combination thereof. Application of ERGMs to multilayer networks has a long history (e.g., Pattison and Wasserman 1999, Lazega and Pattison 1999), and a number of R packages exist for analysing and estimating them.
ergm.multi
implements the general approach of Krivitsky,
Koehly, and Marcum (2020) for specifying multilayer ERGMs, including
Layer Logic and the various cross-layer specifications. Its advantages
include seamless integration with ergm()
, number of layers
limited only by computing power, ability to incorporate any ERGM effects
into the framework, handling of networks that have both directed and
undirected layers, and experimental multimode/multilevel support.
To keep things simple (and fast), we will use the Florentine dataset
included with ergm
. It comprises two networks of relations
among Florentine families, one of marriages, the other of business
relations. Although the following examples only include two layers,
ergm.multi
supports an arbitrary number.
data(florentine)
# Method 1: list of networks
flo <- Layer(list(m = flomarriage, b = flobusiness))
# Method 2: networks as arguments
flo <- Layer(m = flomarriage, b = flobusiness)
# Method 3: edge attributes:
flo2 <- flomarriage | flobusiness # superset of all edges in any layer
# set attributes
flo2[,, names.eval="m"] <- as.matrix(flomarriage)
flo2[,, names.eval="b"] <- as.matrix(flobusiness)
flo2
Network attributes:
vertices = 16
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 27
missing edges= 0
non-missing edges= 27
Vertex attribute names:
vertex.names
Edge attribute names:
b m
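The second printout below is the Layer() object itself (presumably produced by printing flo):
flo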
Combined 2 networks on '.LayerID'/'.LayerName':
1: n = 16, directed = FALSE, bipartite = FALSE, loops = FALSE
2: n = 16, directed = FALSE, bipartite = FALSE, loops = FALSE
Network attributes:
vertices = 32
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
ergm:
Length Class Mode
constraints 2 formula call
total edges= 35
missing edges= 0
non-missing edges= 35
Vertex attribute names:
.bipartite .LayerID .LayerName .undirected vertex.names
Edge attribute names:
b m
flo
is now a network with some additional metadata
set.
Incidentally, we can extract the network defined by a particular edge
attribute using the network_view()
helper function:
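For example, one can check that the layer defined by the "m" attribute reproduces the marriage network (a sketch; the exact call producing the output below is an assumption):
all(as.matrix(network_view(flo2, "m")) == as.matrix(flomarriage))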
[1] TRUE
We will also use the Lazega Lawyers data to illustrate available
techniques. This dataset is included with the ergm.multi
package:
Network attributes:
vertices = 71
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 1574
missing edges= 0
non-missing edges= 1574
Vertex attribute names:
age gender office practice school seniority status vertex.names yrs_frm
Edge attribute names not shown
This network has a number of vertex attributes, including each lawyer's demographic information, status in the firm, and professional history. It also has three non-NA edge attributes, each representing a type of relationship collected:
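The attribute names and the first few values of each relationship indicator can be listed as follows (a sketch of the chunk that presumably produced the output below):
list.edge.attributes(Lazega)
head(Lazega %e% "advice")
head(Lazega %e% "coworker")
head(Lazega %e% "friendship")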
[1] "advice" "coworker" "friendship" "na"
[1] 1 1 1 1 1 0
[1] 0 0 0 0 0 0
[1] 0 0 0 0 0 1
Note that the edges of the network itself are a superset of all layers' edges:
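One way to verify this, assuming the layers are stored as 0/1 edge attributes (a sketch):
all((as.matrix(Lazega, attrname = "advice") |
     as.matrix(Lazega, attrname = "coworker") |
     as.matrix(Lazega, attrname = "friendship")) <= as.matrix(Lazega))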
[1] TRUE
We can extract the individual layers using a helper function
network_view()
:
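Here we extract each relation into its own network for use in the plots and models below; the calls are a sketch that assumes network_view() takes the combined network and an edge-attribute name:
L.a <- network_view(Lazega, "advice")
L.c <- network_view(Lazega, "coworker")
L.f <- network_view(Lazega, "friendship")
L.a  # the advice layer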
Network attributes:
vertices = 71
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 892
missing edges= 0
non-missing edges= 892
Vertex attribute names:
age gender office practice school seniority status vertex.names yrs_frm
Edge attribute names not shown
library(sna)
# Plot the advice network
coord <- gplot(L.a,
# Colour the vertex according to the practice.
vertex.col=as.numeric(as.factor(Lazega%v%"practice"))+1,
# Size the vertex according to status (2 if associate, 3 if partner):
vertex.cex=2+(Lazega%v%"status"=="partner"),
# Number of sides according to gender (2+1 for Female, 2+2 for Male):
vertex.sides=as.numeric(as.factor(Lazega%v%"gender"))+2,
displayisolates=FALSE)
text(coord, label=substr(Lazega%v%"office",1,1))
# Also, add a legend for colour, shape, plotting symbol, and letter:
legend("topleft",legend=levels(as.factor(Lazega%v%"practice")),
col=2:3,pch=19, title="Practice")
legend("topright",legend=levels(as.factor(Lazega%v%"gender")),
pch=c(2,5), title="Gender")
legend("bottomright",legend=levels(factor(Lazega%v%"status")),
pt.cex=c(2,3), pch=1, title="Status")
legend("bottomleft",legend=levels(factor(Lazega%v%"office")),
pch=substr(levels(factor(Lazega%v%"office")),1,1),title="Location")
# Plot the coworker network
coord <- gplot(L.c,
# Colour the vertex according to the practice.
vertex.col=as.numeric(as.factor(Lazega%v%"practice"))+1,
# Size the vertex according to status (2 if associate, 3 if partner):
vertex.cex=2+(Lazega%v%"status"=="partner"),
# Number of sides according to gender (2+1 for Female, 2+2 for Male):
vertex.sides=as.numeric(as.factor(Lazega%v%"gender"))+2,
displayisolates=FALSE)
text(coord, label=substr(Lazega%v%"office",1,1))
# Also, add a legend for colour, shape, plotting symbol, and letter:
legend("topleft",legend=levels(as.factor(Lazega%v%"practice")),
col=2:3,pch=19, title="Practice")
legend("topright",legend=levels(as.factor(Lazega%v%"gender")),
pch=c(2,5), title="Gender")
legend("bottomright",legend=levels(factor(Lazega%v%"status")),
pt.cex=c(2,3), pch=1, title="Status")
legend("bottomleft",legend=levels(factor(Lazega%v%"office")),
pch=substr(levels(factor(Lazega%v%"office")),1,1),title="Location")
# Plot the friendship network
coord <- gplot(L.f,
# Colour the vertex according to the practice.
vertex.col=as.numeric(as.factor(Lazega%v%"practice"))+1,
# Size the vertex according to status (2 if associate, 3 if partner):
vertex.cex=2+(Lazega%v%"status"=="partner"),
# Number of sides according to gender (2+1 for Female, 2+2 for Male):
vertex.sides=as.numeric(as.factor(Lazega%v%"gender"))+2,
displayisolates=FALSE)
text(coord, label=substr(Lazega%v%"office",1,1))
# Also, add a legend for colour, shape, plotting symbol, and letter:
legend("topleft",legend=levels(as.factor(Lazega%v%"practice")),
col=2:3,pch=19, title="Practice")
legend("topright",legend=levels(as.factor(Lazega%v%"gender")),
pch=c(2,5), title="Gender")
legend("bottomright",legend=levels(factor(Lazega%v%"status")),
pt.cex=c(2,3), pch=1, title="Status")
legend("bottomleft",legend=levels(factor(Lazega%v%"office")),
pch=substr(levels(factor(Lazega%v%"office")),1,1),title="Location")
To model it, we need to tell ergm
to interpret it as a
multilayer network, which we do using the Layer()
function
(not an ERGM term). In this case, the most convenient way to do
so is by referencing edge attributes:
However, an optionally named list of networks, or networks passed as named arguments, is also accepted. We will use the latter to shorten the layer names:
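A sketch of both constructions (the exact chunk is not shown; the attribute-based form in particular is an assumption):
# By edge attributes (assumed form):
# LLazega <- Layer(Lazega, c("advice", "coworker", "friendship"))
# By named networks, shortening the layer names to those used in the output below:
LLazega <- Layer(a = L.a, c = L.c, f = L.f)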
More generally, Layer()
will produce the “least common
denominator” network: if any unipartite layers are present, bipartite
layers will be coerced to unipartite (with some additional metadata to
ensure that disallowed edges are never formed), and if any directed
layers are present, all layers will be coerced to directed. It is
possible to use Symmetrize()
and S()
(Subgraph) operators to evaluate undirected and bipartite effects on
these layers.
Given the Layer()
construct on the LHS of an
ergm()
formula, we can use layer-aware terms on
the RHS. By convention, layer-aware terms have capital L
appended to them. For example, mutualL
is a layer-aware
generalization of mutual
. These terms have one or more
explicit (usually optional) layer specification arguments. By
convention, an argument that requires one layer specification is named
L=
and one that requires a list of specifications (constructed by list() or just c()) is named Ls=; and a specification of the form ~. is a placeholder for all observed layers.
We can get a list of the layer-aware terms currently visible to ergm:
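This is presumably done with ergm's term search helper; the exact keyword argument used here is an assumption:
search.ergmTerms(keywords = "layer-aware")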
Found 10 matching ergm terms:
CMBL(Ls=~.) (binary)
Conway--Maxwell-Binomial dependence among layers
ddspL(d, type="OTP", Ls.path=NULL, L.in_order=FALSE) (binary)
dspL(d, type="OTP", Ls.path=NULL, L.in_order=FALSE) (binary)
Dyadwise shared partners on layers
despL(d, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
espL(d, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
Edgewise shared partners on layers
dgwdspL(decay, fixed=FALSE, cutoff=30, type="OTP", Ls.path=NULL, L.in_order=FALSE) (binary)
gwdspL(decay, fixed=FALSE, cutoff=30, type="OTP", Ls.path=NULL, L.in_order=FALSE) (binary)
Geometrically weighted dyadwise shared partner distribution on layers
dgwespL(decay, fixed=FALSE, cutoff=30, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
gwespL(decay, fixed=FALSE, cutoff=30, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
Geometrically weighted edgewise shared partner distribution on layers
dgwnspL(decay, fixed=FALSE, cutoff=30, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
gwnspL(decay, fixed=FALSE, cutoff=30, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
Geometrically weighted non-edgewise shared partner distribution on layers
dnspL(d, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
nspL(d, type="OTP", L.base=NULL, Ls.path=NULL, L.in_order=FALSE) (binary)
Non-edgewise shared partners and paths on layers
L(formula, Ls=~.) (binary)
Evaluation on layers
mutualL(same=NULL, diff=FALSE, by=NULL, keep=NULL, Ls=NULL) (binary)
Mutuality
twostarL(Ls, type, distinct=TRUE) (binary)
Multilayer two-star
But how do we specify layers?
Layer Logic is an approach to specifying layers, their transformations, and interactions between two or more layers by constructing logical layers, which are layers constructed by evaluating logical expressions on observed layers.
Each formula’s right-hand side describes an observed layer or some “logical” layer, whose ties are a function of corresponding ties in observed layers. (Krivitsky et al. 2020)
The observed layers can be referenced either by name or by number
(i.e., order in which they were passed to Layer
). When
referencing by number, enclose the number in quotation marks (e.g., "1") or backticks (e.g., `1`).
Standard arithmetical, comparison, and logical operations can be used, as well as some mathematical functions such as abs(), round(), and sign(). Standard operator precedence applies, so use of parentheses is recommended to ensure that the logical expression means what it appears to mean.
For example, if the LHS is Layer(A=nwA, B=nwB), both ~"2" and ~B refer to nwB, while ~A&!B refers to a logical layer whose ties are those in nwA but not in nwB.
The transpose function t() applied to a directed layer will reverse the direction of all relations (transposing the sociomatrix). Unlike the others, it can only be applied to an observed layer directly. For example, ~t(A)&B is valid but ~t(A&B) is not.
At this time, logical expressions that produce complete graphs from
empty graph inputs (e.g., A==B
or !A
) are not
supported.
The L(formula, Ls) operator
This is probably the most frequently used layer-aware term: it takes
an arbitrary binary ergm
formula and evaluates it on each
observed or logical layer specified in Ls
, and then adds up
the results elementwise. A very common usage is to write
L(~[TERMS], ~.)
to fit an effect of [TERMS]
homogeneous over all layers.
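The statistics below can presumably be reproduced with summary() calls along these lines (a sketch; the original chunks are not shown):
summary(LLazega ~ L(~edges, ~f))         # edge count on the friendship layer only
summary(L.f ~ edges)                     # the same count on the friendship network alone
summary(LLazega ~ L(~edges, c(~f, ~a)))  # friendship and advice layers, summed
summary(LLazega ~ L(~edges, ~.))         # all three layers
summary(LLazega ~ L(~edges, ~f | a))     # logical layer: ties in friendship OR advice
summary((L.f | L.a) ~ edges)             # the same count on the single union network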
L(f)~edges
575
edges
575
L((f,a))~edges
1467
edges
1467
L(.)~edges
2571
edges
2571
L(f|a)~edges
1109
edges
1109
However, note that while the statistics are the same, the MLEs are different:
Call:
ergm(formula = Layer(f = L.f, a = L.a) ~ L(~edges, ~f | a))
Monte Carlo Maximum Likelihood Coefficients:
L(f|a)~edges
-2.342
Call:
ergm(formula = (L.f | L.a) ~ edges)
Maximum Likelihood Coefficients:
edges
-1.247
This is because the sample space for the first model is all possible two-layer networks, whereas the sample space for the second is all possible one-layer networks; thus, the normalizing constants differ.
CMBL(Ls): Conway–Maxwell-binomial distribution: By “default”, if the \(m\) layers in Ls are homogeneous, the number of layers with an edge on a given dyad should be \(\mbox{Binomial}(m, p)\), where \(p\) is the density of each layer. CMB replaces the “\(\binom{m}{y_{i,j}}\)” in the binomial density with “\(\binom{m}{y_{i,j}}^{\theta_{CMB}}\)”: a positive coefficient implies positive dependence among layers and a negative one induces negative dependence.
Is there an overall positive association between nominations?
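Reconstructed from the Call shown in the output below, the model pools an edges term over all layers and adds an overall CMBL term:
summary(ergm(LLazega ~ L(~edges, ~.) + CMBL(~.)))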
Call:
ergm(formula = LLazega ~ L(~edges, ~.) + CMBL(~.))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L(.)~edges -0.82841 0.01878 0 -44.12 <1e-04 ***
CMBL(~.) 1.42167 0.03479 0 40.87 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 20670 on 14910 degrees of freedom
Residual Deviance: 12290 on 14908 degrees of freedom
AIC: 12294 BIC: 12309 (Smaller is better. MC Std. Err. = 3.847)
Yes!
What if we focus on giving advice rather than receiving?
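Reconstructed from the Call below: reversing the advice layer with t() so that the term counts advice given rather than received:
summary(ergm(LLazega ~ L(~edges, c(~t(a), ~c, ~f)) + CMBL(c(~t(a), ~c, ~f))))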
Call:
ergm(formula = LLazega ~ L(~edges, c(~t(a), ~c, ~f)) + CMBL(c(~t(a),
~c, ~f)))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L((t(a),c,f))~edges -0.89666 0.02148 1 -41.75 <1e-04 ***
CMBL(list(~t(a),~c,~f)) 1.21809 0.03489 0 34.91 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 20670 on 14910 degrees of freedom
Residual Deviance: 12732 on 14908 degrees of freedom
AIC: 12736 BIC: 12751 (Smaller is better. MC Std. Err. = 4.932)
Still yes.
Consider a two-layer model for a dyad \((i,j)\), \[\Pr(Y_{i,j,A} = y_{i,j,A} \land Y_{i,j,B} = y_{i,j,B}) \propto \exp(\theta_{1} y_{i,j,A} + \theta_{2} y_{i,j,B} + \theta_{3} y_{i,j,A\square B}),\] for some logical operation \(\square\).
What happens if we fix \(y_{i,j,B}\) and look at the conditional probability \(\Pr(Y_{i,j,A}=1)\)?
a&b: Conjunction: \({\mbox{logit}}^{-1}(\theta_{1} + \theta_{3} y_{i,j,B})\), so \(\theta_{3} > 0 \implies\) presence of an edge in layer b will increase the probability of the edge in layer a.
xor(a,b): Mutual exclusivity: \({\mbox{logit}}^{-1}(\theta_{1} + \theta_{3} (-1)^{y_{i,j,B}})\), so \(\theta_{3} > 0 \implies\) presence of an edge in layer b will decrease the probability of the edge in layer a.
a|b: Substitutability: \({\mbox{logit}}^{-1}(\theta_{1} + \theta_{3} (1-y_{i,j,B}))\), so \(\theta_{3} > 0 \implies\) presence of an edge in layer b will undo a “default” increase in the probability of the edge in layer a.
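The three fits below (reconstructed from the Calls in their output) apply these parameterizations to the advice and coworker layers:
summary(ergm(Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges, ~c) + L(~edges, ~a & c)))
summary(ergm(Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges, ~c) + L(~edges, ~xor(a, c))))
summary(ergm(Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges, ~c) + L(~edges, ~a | c)))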
Call:
ergm(formula = Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges,
~c) + L(~edges, ~a & c))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L(a)~edges -2.43931 0.05655 0 -43.14 <1e-04 ***
L(c)~edges -1.92166 0.04613 0 -41.66 <1e-04 ***
L(a&c)~edges 2.54346 0.08299 0 30.65 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 13780 on 9940 degrees of freedom
Residual Deviance: 8951 on 9937 degrees of freedom
AIC: 8957 BIC: 8979 (Smaller is better. MC Std. Err. = 3.07)
Call:
ergm(formula = Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges,
~c) + L(~edges, ~xor(a, c)))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L(a)~edges -1.16375 0.04221 0 -27.57 <1e-04 ***
L(c)~edges -0.64265 0.04132 0 -15.55 <1e-04 ***
L(xor(a,c))~edges -1.27716 0.04052 0 -31.52 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 13780 on 9940 degrees of freedom
Residual Deviance: 8951 on 9937 degrees of freedom
AIC: 8957 BIC: 8978 (Smaller is better. MC Std. Err. = 3.54)
Call:
ergm(formula = Layer(a = L.a, c = L.c) ~ L(~edges, ~a) + L(~edges,
~c) + L(~edges, ~a | c))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L(a)~edges 0.11157 0.05773 0 1.933 0.0533 .
L(c)~edges 0.63421 0.06919 0 9.166 <1e-04 ***
L(a|c)~edges -2.55271 0.07640 0 -33.413 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 13780 on 9940 degrees of freedom
Residual Deviance: 8950 on 9937 degrees of freedom
AIC: 8956 BIC: 8978 (Smaller is better. MC Std. Err. = 2.848)
mutualL(Ls): Layer-aware mutuality: Given Ls=c(~a, ~b), counts the number of ordered pairs \((i,j)\) for which \(y_{i,j,A}=1 \land y_{j,i,B}=1\).
L(~..., ~a&t(b)): Reversal layer logic: another way of expressing the above.
Net of within-layer density and mutuality, and of co-occurrence, does advice reciprocate coworking?
summary(ergm(Layer(a=L.a,c=L.c)~L(~edges + mutual, ~a) + L(~edges + mutual, ~c) +
L(~edges, ~a&c) + mutualL(Ls=c(~a,~c))))
Call:
ergm(formula = Layer(a = L.a, c = L.c) ~ L(~edges + mutual, ~a) +
L(~edges + mutual, ~c) + L(~edges, ~a & c) + mutualL(Ls = c(~a,
~c)))
Monte Carlo Maximum Likelihood Results:
Estimate Std. Error MCMC % z value Pr(>|z|)
L(a)~edges -2.55765 0.06282 0 -40.711 <1e-04 ***
L(a)~mutual 0.34352 0.15616 0 2.200 0.0278 *
L(c)~edges -2.79888 0.06573 0 -42.584 <1e-04 ***
L(c)~mutual 2.54717 0.14507 0 17.558 <1e-04 ***
L(a&c)~edges 2.07234 0.10850 0 19.100 <1e-04 ***
L(a,c)~mutual 0.64414 0.12039 0 5.351 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 13780 on 9940 degrees of freedom
Residual Deviance: 8116 on 9934 degrees of freedom
AIC: 8128 BIC: 8172 (Smaller is better. MC Std. Err. = 2.295)
It appears so.
Equivalently, we can write:
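Presumably by replacing the mutualL term with an L() term on the conjunction of one layer and the transpose of the other (a sketch; the exact chunk is not shown):
summary(ergm(Layer(a = L.a, c = L.c) ~ L(~edges + mutual, ~a) + L(~edges + mutual, ~c) +
               L(~edges, ~a & c) + L(~edges, ~a & t(c))))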
twostarL(Ls, type, distinct): Cross-layer two-star or two-path: Ls is a list of the two layers of interest (e.g., list(~a,~b)), and type is one of the following:
"any" (undirected): instances where node \(i\) has an edge in layer a and an edge in layer b. (Whether these edges may be coincident depends on distinct.)
"out"/"in" (directed): instances where node \(i\) has an out-edge/in-edge in a and an out-edge/in-edge in b. (Whether these edges may be coincident depends on distinct.)
"path" (directed): instances where \(i\) has an in-edge in a and an out-edge in b. (Whether these edges may be reciprocal depends on distinct.)
For illustration purposes, I will set distinct=FALSE. In your case, it should be driven by substantive or theoretical considerations.
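The counts below were presumably obtained with summary() calls along these lines; the mapping of type values to the displayed statistic names is an assumption based on the hand computations that follow:
summary(LLazega ~ twostarL(c(~c, ~f), type = "out",  distinct = FALSE))
summary(LLazega ~ twostarL(c(~c, ~f), type = "in",   distinct = FALSE))
summary(LLazega ~ twostarL(c(~c, ~f), type = "path", distinct = FALSE))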
twostarL(c<>f)
10563
# Equivalently...
sum(
rowSums(as.matrix(L.c)) # Outdegree of each node in layer c
*
rowSums(as.matrix(L.f)) # Outdegree of each node in layer f
)
[1] 10563
twostarL(c><f)
10118
# Equivalently...
sum(
colSums(as.matrix(L.c)) # Indegree of each node in layer c
*
colSums(as.matrix(L.f)) # Indegree of each node in layer f
)
[1] 10118
twostarL(c>>f)
9890
# Equivalently...
sum(
colSums(as.matrix(L.c)) # Indegree of each node in layer c
*
rowSums(as.matrix(L.f)) # Outdegree of each node in layer f
)
[1] 9890
Triadic effects are generalized by gwdspL, gwespL, and gwnspL, which take the standard arguments of the gw*sp and dgw*sp effects, including type, and, in addition, L.base, Ls.path, and L.in_order, specifying the layer of the base (for gwespL and gwnspL), the layer of the two-path, and whether, in the directed case, the order of the two-path matters.
For a general orientation to the statnet
packages, the
best place to start is the special volume of the Journal of
Statistical Software (JSS) devoted to statnet
: https://www.jstatsoft.org/issue/view/v024.
The nine papers in this volume cover a wide range of theoretical and
practical topics related to ERGMs, and their implementation in
statnet
.
However, this volume was written in 2008. The statnet code base has evolved considerably since that time, and with the release of ergm version 4.0, the most current papers describing the capabilities of the ergm package are the following:
Krivitsky, P. N., Hunter, D. R., Morris, M., and Klumb, C. (2023). ergm 4: New Features for Analyzing Exponential-Family Random Graph Models. Journal of Statistical Software, 105(6), 1–44. doi: 10.18637/jss.v105.i06
Krivitsky, P. N., Hunter, D. R., Morris, M., and Klumb, C. (2022b). ergm 4.0: Computational Improvements. arXiv: 2203.08198.
For social scientists, a good introductory application paper is:
Goodreau, S., J. Kitts and M. Morris (2009). Birds of a Feather, or Friend of a Friend? Using Statistical Network Analysis to Investigate Adolescent Social Networks. Demography 46(1): 103-125. doi: 10.1353/dem.0.0045
Goodness of Fit
Hunter, D. R., Goodreau, S. M., and Handcock, M. S. (2008).
Goodness of Fit for Social Network Models. Journal of the American
Statistical Association, 103(481), 248–258. doi:
10.1198/016214507000000446
Temporal ERGMs
Krivitsky, P. N. and Handcock, M. S. (2014). A Separable Model for Dynamic Networks. Journal of the Royal Statistical Society, Series B, 76(1): 29–46. doi: 10.1111/rssb.12014
Krivitsky, P. N., Handcock, M. S., and Morris, M. (2011). Adjusting for Network Size and Composition Effects in Exponential-family Random Graph Models. Statistical Methodology, 8(4): 319–339. doi: 10.1016/j.stamet.2011.01.005
Multiple and Multilayer ERGMs
Goeyvaerts, N., Santermans, E., Potter, G., Torneri, A., Van Kerckhove, K., Willem, L., Aerts, M., Beutels, P., and Hens, N. (2018). Household Members Do Not Contact Each Other at Random: Implications for Infectious Disease Modelling. Proceedings of the Royal Society B: Biological Sciences, 285(1893): 20182201. doi: 10.1098/rspb.2018.2201
Krivitsky, P. N., Coletti, P., and Hens, N. (2023). A Tale of Two Datasets: Representativeness and Generalisability of Inference for Samples of Networks. Journal of the American Statistical Association, 118(544): 2213–2224. doi: 10.1080/01621459.2023.2242627
Krivitsky, P. N., Koehly, L. M., and Marcum, C. S. (2020) Exponential-family Random Graph Models for Multi-layer Networks. Psychometrika, 85(3): 630-659. doi: 10.1007/s11336-020-09720-7
Missing Data
Handcock, M. S. and Gile, K. J. (2010). Modeling Social Networks from Sampled Data. Annals of Applied Statistics, 4(1): 5–25. doi: 10.1214/08-AOAS221
Karwa, V., Krivitsky, P. N., and Slavković, A. B., (2017). Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models. Journal of the Royal Statistical Society C, 66(3), 481–500. doi: 10.1111/rssc.12185
Krivitsky, P. N., Kuvelkar, A. R., and Hunter, D. R. (2022d). Likelihood-based Inference for Exponential-Family Random Graph Models via Linear Programming. Electronic Journal of Statistics, 17(2), 3337–3356. doi: 10.1214/23-ejs2176