Resources: STERGM.Snw

File STERGM.Snw, 38.4 KB (added by lxwang, 5 years ago)
Line 
1\documentclass[11pt]{article}
2
3\SweaveOpts{echo=true}
4\SweaveOpts{keep.source=TRUE}
5
6\usepackage{color}
7\usepackage{verbatim}
8\usepackage{amsmath}
9\usepackage{amsfonts}
10\usepackage{url}
11
12\newcommand{\notes}[1]{\textcolor{red}{#1}}
13\newcommand{\adjmat}{\mathbf{Y}}
14\newcommand{\obsadj}{\mathbf{y}}
15\newcommand{\grphSpace}{\mathcal{Y}}
16\newcommand{\dyadSpace}{\mathbb{Y}}
17\newcommand{\form}{^+}
18\newcommand{\diss}{^-}
19\newcommand{\paramVec}{\theta}
20\newcommand{\covmat}{\mathbf{X}}
21
22<<echo=false>>=
23options(width=60)
24@
25
26<<echo=false>>=
27options(continue=" ")
28@
29
30
31\begin{document}
32
33\title{STERGM - Separable Temporal ERGMs for modeling discrete relational dynamics with \emph{statnet}}
34\author{Pavel N.~Krivitsky, Steven M.~Goodreau,\\ The \texttt{Statnet} Development Team}
35\maketitle
36
37\tableofcontents
38
39\section{Getting the software}
40
41If you have not already done so, please download and install \texttt{ergm} version 3.0-1 and \texttt{networkDynamic} version 0.2-2.
42You will also want to make sure you have a reasonably new version of R, preferably the latest (2.14.2).
43
44<<eval=FALSE>>=
45install.packages("ergm")
46install.packages("networkDynamic")
47library(ergm)
48library(networkDynamic)
49@
50
51\section{A quick review of static ERGMs}
52
53Exponential-family random graph models (ERGMs) represent a general class of models based in exponential-family theory for specifying the probability distribution
54underlying a set of random graphs or networks. Within this framework, one can---among other tasks---obtain maximum-likehood estimates for the parameters of a specified model
55for a given data set; simulate additional networks with the underlying probability distribution implied by that model; test individual models for goodness-of-fit,
56and perform various types of model comparison.
57
58The basic expression for the ERGM class can be written as:
59\begin{equation}
60P(Y=y)=\frac{\exp(\theta'g(y))}{k(y)}
61\end{equation}
62where Y is the random variable for the state of the network (with realization y), $g(y)$ is the vector of model statistics for network y,
63$\theta$ is the vector of coefficients for those statistics, and $k(y)$ represents the quantity in the numerator summed over all possible networks
64(typically constrained to be all networks with the same node set as y).
65
66This can be re-expressed in terms of the conditional log-odds of a single actor pair:
67\begin{equation}
68\operatorname{logit}{(Y_{ij}=1|y^{c}_{ij})=\theta'\delta(y_{ij})}
69\end{equation}
70where $Y_{ij}$ is the random variable for the state of the actor pair $i,j$ (with realization $y_{ij}$), and $y^{c}_{ij}$ signifies the complement of $y_{ij}$, i.e. all dyads in the network other than $y_{ij}$.
71The variable $\delta(y_{ij})$ equals $g(y^{+}_{ij})-g(y^{-}_{ij})$, where $y^{+}_{ij}$ is defined as $y^{c}_{ij}$ along with $y_{ij}$ set to 1, and $y^{-}_{ij}$ is defined as $y^{c}_{ij}$ along with $y_{ij}$ set to 0. 
72That is, $\delta(y_{ij})$ equals the value of $g(y)$ when $y_{ij}=1$ minus the value of $g(y)$ when $y_{ij}=0$, but all other dyads are as in $g(y)$. 
73This emphasizes the log-odds of an individual tie conditional on all others.
74We call $g(y)$ the statistics of the model, and $\delta(y_{ij})$ the ``change statistics'' for actor pair $y_{ij}$.
75
76Fitting an ERGM usually begins with obtaining data:
77
78<<>>=
79library(ergm)
80data("florentine")
81ls()
82@
83
84<<flobusplot, include=FALSE, fig=TRUE>>=
85plot(flobusiness)
86@
87
88\begin{figure}[ht]
89\begin{center}
90\includegraphics[height=3in]{STERGM-flobusplot}
91\end{center}
92\end{figure}
93
94To refresh our memories on ERGM syntax, let us fit a cross-sectional example. Just by looking at the plot of \texttt{flobusiness},
95we might guess that there are more triangles than expected by chance for a network of this size and density, and thus that there is some
96sort of explicit triangle closure effect going on.  One useful way to model this effect in ERGMs that has been explored in the literature
97is with a \texttt{gwesp} statistic.
98
99<<>>=
100fit1 <- ergm(flobusiness~edges+gwesp(0,fixed=T))
101summary(fit1)
102@
103
104With the estimation in place, we can simulate a new network from the given model:
105
106<<sim1plot, include=FALSE, fig=TRUE>>=
107sim1 <- simulate(fit1,nsim=1,
108          control=control.simulate(MCMC.burnin=1000))
109plot(sim1)
110@
111
112\begin{figure}[ht]
113\begin{center}
114\includegraphics[height=3in]{STERGM-sim1plot}
115\end{center}
116\end{figure}
117
118
119\section{An Introduction to STERGMs (non-technical)}
120
121Separable Temporal ERGMs (STERGMs) are an extension of ERGMs for modeling dynamic networks in discrete time, introduced in Krivitsky and Handcock (2010).
122The cross-sectional ERGM entails a single network, and a single model on that network.
123STERGMs, in contrast, posit two models: one ERGM underlying relational formation, and a second one underlying relational dissolution.
124Specifying a STERGM thus entails writing two ERGM formulas instead of one.  It also requires dynamic data, of course;
125such data can come in many forms, and we will cover a few examples today.
126
127This approach is not simply a methodological development, but a theoretical one as well, and one which matches common sense for many social processes.
128Think of romantic relations.  It seems intuitive that the statistical model underlying relational formation
129(i.e. affecting who becomes partners with whom, out of the set of people who aren't already) is likely to be different than the model underlying relational dissolution
130(i.e. affecting who breaks up with whom, out of the set of people currently in relationships).  Any reasonable model of the former would
131need to include a variety of homophily parameters (mixing on age, for example).  The latter may or may not.  (Conditional on being in a relationship,
132does your difference in age affect your probability of breaking up?  Perhaps, but probably not as fundametally or as strongly as for formation).
133
134\section{An Introduction to STERGMs (a bit more technical)}
135
136We first review the ERGM framework for \emph{cross-sectional} or static networks, observed at a single point in time. Let $\dyadSpace\subseteq \{1,\dotsc,n\}^2$ be the set of potential relations (dyads) among $n$ nodes, ordered for directed networks and unordered for undirected.
137We can represent a network $\obsadj$ as a set of ties, with the set of possible sets of ties, $\grphSpace\subseteq 2^\dyadSpace$, being the sample space: $\obsadj \in \grphSpace$. Let $\obsadj_{ij}$ be 1 if $(i,j)\in\obsadj$ --- a relation of interest exists from $i$ to $j$ --- and 0 otherwise.
138
139The network also has an associated
140covariate array $\covmat$ containing attributes of the nodes, the dyads, or both.
141An exponential-family random graph model (ERGM) represents the pmf of $\adjmat$ as a function of a $p$-vector of network statistics $g(\adjmat,\covmat)$, with parameters $\paramVec \in \mathbb{R}^p$, as follows:
142\begin{equation}\label{eq:XsecModel}
143\mbox{Pr}_{\paramVec}\left(\adjmat = \obsadj \mid \covmat\right) =
144\frac{\exp\left\{\paramVec\cdot g(\obsadj, \covmat)\right\} }
145{c(\paramVec, \covmat, \grphSpace)},
146\end{equation}
147where the normalizing constant
148\[
149c(\paramVec, \covmat, \grphSpace)  = \sum\limits_{\obsadj' \in \grphSpace}\exp
150\left\{\paramVec\cdot g(\obsadj', \covmat)\right\}
151\]
152is a summation over the space of possible networks on $n$ nodes, $\grphSpace$.
153%The normalizing constant is a function of parameters ($\paramVec$), the space of possible networks ($\grphSpace$), and covariate values ($\covmat$). 
154Where $\grphSpace$ and $\covmat$ are held constant, as in a typical cross-sectional model, they may be suppressed in the notation.  Here, on the other hand, the dependence on $\grphSpace$ and $\covmat$ is made explicit.
155
156In modeling the transition from a network $\adjmat^{t}$ at time $t$ to a network $\adjmat^{t+1}$ at time $t+1$, the separable temporal ERGM assumes that the formation and dissolution of ties occur independently from each other within each time step, with each half of the process modeled as an ERGM. For two networks (sets of ties) $\obsadj,\obsadj'\in \grphSpace$, let $\obsadj \supseteq \obsadj'$ if any tie present in $\obsadj'$ is also present in $\obsadj$. Define $\grphSpace\form(\obsadj)=\{\obsadj'\in\grphSpace:\obsadj'\supseteq\obsadj\}$, the networks that can be constructed by forming ties in $\obsadj$; and $\grphSpace\diss(\obsadj)=\{\obsadj'\in\grphSpace:\obsadj'\subseteq\obsadj\}$, the networks that can be constructed dissolving ties in $\obsadj$.
157
158Given $\obsadj^{t}$, a \emph{formation network} $\adjmat\form$ is generated from an ERGM controlled by a $p$-vector of formation parameters $\paramVec\form$ and formation statistics $g\form(\obsadj\form, \covmat)$, conditional on only adding ties:
159\begin{equation}\label{formation}
160\Pr\left(\adjmat\form = \obsadj\form\mid \adjmat^{t};\paramVec\form\right) = \frac{\exp\left\{\paramVec\form\cdot g\form(\obsadj\form, \covmat)\right\} }
161%{c(\paramVec\form, \covmat, \grphSpace\form)},
162{c\left(\paramVec\form, \covmat, \grphSpace\form(\adjmat^{t})\right)}, \quad \obsadj\form \in \grphSpace\form(\obsadj^{t}).
163\end{equation}
164 A \emph{dissolution network} $\adjmat\diss$ is simultaneously generated from an ERGM controlled by a (possibly different) $q$-vector of dissolution parameters $\paramVec\diss$ and corresponding statistics $g\diss(\obsadj\diss, \covmat)$, conditional on only dissolving ties from $\obsadj^{t}$:
165\begin{equation}\label{dissolution}
166\Pr\left(\adjmat\diss = \obsadj\diss\mid \adjmat^{t};\paramVec\diss\right) = \frac{\exp\left\{\paramVec\diss\cdot g\diss(\obsadj\diss, \covmat)\right\} }
167%{c(\paramVec\diss, \covmat, \grphSpace\diss)},
168{c\left(\paramVec\diss, \covmat, \grphSpace\diss(\adjmat^{t})\right)}, \quad
169\obsadj\diss \in \grphSpace\diss(\obsadj^{t}).
170\end{equation}
171%is a normalizing constant. 
172The cross-sectional network at time $t+1$ is then constructed by applying the changes in $\adjmat^+$ and $\adjmat\diss$ to $\obsadj^{t}$:
173\[\adjmat^{t+1} = \adjmat^{t}\cup (\adjmat\form - \adjmat^{t}) \, - \, ( \adjmat^{t} - \adjmat\diss).\]
174which simplifies to either:
175\[\adjmat^{t+1} = \adjmat\form - (\adjmat^{t} - \adjmat\diss)\]
176\[\adjmat^{t+1} = \adjmat\diss\cup (\adjmat\form - \adjmat^{t})\]
177Visually, we can sum this up as:
178\begin{figure}[h]
179\begin{center}
180\includegraphics[height=1.5in]{Proc_diss_and_form}
181\label{Proc_diss_and_form}
182\end{center}
183\end{figure}
184
185\section{Notes on model specification and syntax}
186Within \emph{statnet}, an ERGM involves one network and one set of network statistics, so these are specified together using R's formula notation:
187
188\begin{center}
189\texttt{my.network $\sim$ my.vector.of.g.statistics}
190\end{center}
191
192For a call to \texttt{stergm}, there is still one network, but two formulas. These are now passed as three separate arguments: the network (argument \texttt{nw}), the formation formula (argument \texttt{formation}),
193and the dissolution formula (argument \texttt{dissolution}).  The latter two both take the form of a one-sided formula.  E.g.:
194
195\begin{verbatim}
196        stergm(my.network,
197            formation= ~edges+kstar(2),
198            dissolution= ~edges+triangle
199        )
200\end{verbatim}
201
202There are other features of a call to either \texttt{ergm} or \texttt{stergm} that will be important for us here.
203We list the features here; each will be illustrated in one or more examples during the workshop.
204
205\begin{enumerate}
206
207\item{To fix the coefficient for a particular network statistic, one uses offset notation. 
208For instance, to fix a dissolution model with only an edges term with parameter value 4.2, the dissolution formula woud be:
209
210\begin{center}
211\texttt{dissolution= $\sim$offset(edges)}
212\end{center}
213
214and the corresponding argument for passing the parameter value would be:
215
216\begin{center}
217\texttt{offset.coef.diss = 4.2}
218\end{center}
219
220}
221
222\item{In parallel with \texttt{ergm}, any information used to specify the nature of the fitting algorithm is passed by specifying a vector called \texttt{control.stergm}
223to the \texttt{control} argument. For example:
224       
225\begin{center}
226\texttt{control=control.stergm(MCMC.burnin=10000)}
227\end{center}
228
229For a list of options, type \texttt{?control.stergm}
230}
231 
232\item{Another argument that the user must supply is \texttt{estimate}, which controls the estimation method.
233Unlike with cross-sectional ERGMs, there is not necessarily an obvious default here, as different scenarios are best fit with different approaches.
234The most important for the new user to recognize are \texttt{EGMME} (equilibrium generalized method of moments estimation) and \texttt{CMLE} (conditional maximum likelihood estimation).
235A good rule of thumb is that when fitting to two networks, one should use \texttt{estimate="CMLE"} while when fitting to a single cross-section with some duration information,
236use \texttt{estimate="EGMME"}.
237}
238
239\item{For cross-sectional ERGMs, the model is by default fit using the sufficient statistics present in the cross-sectional network. 
240For STERGMs, the presence of multiple models makes the default less clear.  Thus, the user is required to supply this information via the \texttt{targets} argument. 
241This can take a one-sided formula listing the terms to be fit; or, if the formula is identical to either the formation or dissolution model,
242the user can simply pass the string \texttt{"formation"} or \texttt{"dissolution"}, respectively. If one is specifying
243targets="formation", dissolution should be an offset, and vice versa.
244If the values to be targeted for those terms are anything other than the sufficient statistics present in the cross-sectional network,
245then those values can be passed with the argument \texttt{target.stats}.
246
247}
248
249\end{enumerate}
250
251\section{STERGM estimation and simulation, Example 1}
252
253Let us imagine that we have observed two things: a cross-sectional network, and a mean relational duration.
254Let us say the cross-sectional network is flobusiness, and the mean relational duration we have witnessed is 10 time steps.
255Furthermore, we are willing to (for reasons of theory or convenience) assume a purely homogeneous dissolution process
256(that is, every existing relationship has the same probability of dissolving as all others, and at all times). 
257For a cross-sectional ERGM, a purely homogeneous model is one with just a single term in it for an edge count.
258The same is true for either of the two formulas in a STERGM. 
259
260The steps we will go through are:
261
262\begin{enumerate}
263
264\item Specify formation and dissolution models (\texttt{formation} and \texttt{dissolution}).
265
266We will begin by assuming a formation model identical to the model we fit in the cross-sectional case:
267\begin{center}
268\texttt{formation = $\sim$edges+gwesp(0,fixed=T)}
269\end{center}
270
271Analogously to cross-sectional ERGMs, our assumption of completely homogeneous dissolution corresponds to a model with only an edgecount term in it.  In STERGM notation this is:
272
273\begin{center}
274\texttt{dissolution = $\sim$edges}
275\end{center}
276
277which correspond to the probability statement:
278\begin{equation}\label{disslogit}
279\ln{\frac{P(Y_{ij,t+1}=1 \mid Y_{ij,t}=1)}{P(Y_{ij,t+1}=0\mid Y_{ij,t}=1)}} = \theta*\delta(y)
280\end{equation}
281where the one term in the $\delta(y)$ vector is the edge count of the network.
282
283\item Calculate \texttt{theta.diss}.
284
285Our dissolution model is applied to the set of ties that exist at any given time point, as reflected in the conditional present in both the
286numerator and denominator of Equation (\ref{disslogit}). The numerator thus represents the case where a tie persists to the next step,
287and the denomiator represents the case where it dissolves. 
288Furthermore, $\delta(y_{ij})=1$ for all $i,j$ for the case of the edge count statistic. 
289We define the probability of persistance from one time step to the next for actor pair $i,j$ as $p_{ij}$,
290and the probability of dissolution as $q_{ij}=1-p_{ij}$.
291Our dissolution model is Bernoulli; that is, all edges have the same probability of dissolution, and thus of persistence,
292so we further define $p_{ij}=p \forall i,j$ and $q_{ij}=q \forall i,j$. Then:
293
294\[
295\ln{(\frac{p_{ij}}{1-p_{ij}})}=\theta * \delta({y_{ij})}
296\]
297\[
298\ln{(\frac{p}{1-p})}=\theta
299\]
300\[
301\ln{(\frac{1-q}{q})}=\theta
302\]
303\[
304\ln{(\frac{1}{q}-1)}=\theta     
305\]
306
307And because this is a discrete memoryless process, durations are geometric; symbolizing mean relational duration as $d$, we have $d = \frac{1}{q}$, and thus:
308\begin{equation}
309\theta = \ln{(d-1)}
310\end{equation}
311So, for our dissolution model, theta.diss = $\ln{(10-1)}$ = $\ln{9}$ = $2.197$:
312
313<<>>=
314theta.diss <- log(9)
315@
316
317In short, because our dissolution model is dyadic independent, we can calculate it using a (rather simple) closed form solution.
318
319\item Estimate the formation model, conditional on the dissolution model.
320We put it all together for our first call to \texttt{stergm}, adding in one additional control argument that helps immensely with monitoring model convergence (and is just plain cool):
321plotting the progress of the coefficient estimates and the simulated sufficient statistics in real time.
322
323<<results=hide>>=
324
325stergm.fit.1 <- stergm(flobusiness,
326        formation= ~edges+gwesp(0,fixed=T),
327        dissolution = ~offset(edges),
328        targets="formation",
329        offset.coef.diss = theta.diss,
330        estimate = "EGMME",
331        control=control.stergm(SA.plot.progress=TRUE)
332        )
333@
334\texttt{Iteration 1 of at most 20:}\\
335$\Rightarrow$ Lots of output snipped. $\Leftarrow$ \\
336\texttt{== Phase 3: Simulate from the fit and estimate standard errors.==}
337
338First, we should double-check to make sure the fitting went well:
339<<fit1diag, include=FALSE, fig=TRUE, results=hide>>=
340mcmc.diagnostics(stergm.fit.1)
341@
342
343\begin{verbatim}
344==========================
345EGMME diagnostics
346==========================
347\end{verbatim}
348
349$\Rightarrow$ Lots of output snipped. $\Leftarrow$\\
350
351\begin{figure}[h]
352\begin{center}
353\includegraphics[height=3in]{STERGM-fit1diag}
354\end{center}
355\end{figure}
356
357Since those look good, we can next query the object in a variety of ways to see what we have:
358<<>>=
359stergm.fit.1
360names(stergm.fit.1)
361stergm.fit.1$formation
362stergm.fit.1$formation.fit
363summary(stergm.fit.1)
364@
365\end{enumerate}
366
367We have now obtained estimates for the coefficients of a formation model that,
368conditional on the stated dissolution model, yields simulated targets that matched those observed.
369Something very useful we have also gained in the process is the ability to simulate networks with the
370desired cross-sectional structure and mean relational duration.  This ability serves us well for any application
371areas that requires us to simulate phenomena on dynamic networks, whether they entail the diffusion of information or disease, or some other process. 
372
373<<>>=
374stergm.sim.1 <- simulate.stergm(stergm.fit.1, nsim=1,
375    time.slices = 1000)
376@
377
378Understanding this object requires us to learn about an additional piece of \emph{statnet} functionality:
379the \emph{networkDynamic} package.
380
381\section{networkDynamic}
382
383In \emph{statnet}, cross-sectional networks are stored using objects of class \emph{network}. 
384Tools to create, edit, and query network objects are in the package \emph{network}. Dynamic networks
385are now stored as objects with two classes (\emph{network} and \emph{networkDynamic}).
386They can thus be edited or queried using standard functions from the \emph{network} package, or using additional
387functions tailored specifically to the case of dynamic networks in the package \emph{networkDynamic}.
388
389To illustrate, let us begin with the network that we just created:
390
391<<results=hide>>=
392stergm.sim.1
393@
394
395\texttt{networkDynamic with 985 distinct change times:}\\
396$\Rightarrow$ Lots of output snipped. $\Leftarrow$
397\begin{verbatim}
398 Network attributes:
399  vertices = 16
400  directed = FALSE
401  hyper = FALSE
402  loops = FALSE
403  multiple = FALSE
404  bipartite = FALSE
405  total edges= 120
406    missing edges= 0
407    non-missing edges= 120
408
409 Vertex attribute names:
410    priorates totalties vertex.names wealth
411\end{verbatim}
412
413We can deduce from the number of edges that this likely represents the cumulative network---that is, the union of all edges that exist at any point in time over the course of the simulation. 
414What does the network look like at different time points? The function \texttt{network.extract} allows us to pull out the network at an instantanoues time point (with the argument \texttt{at}),
415or over any given spell (with the arguments \texttt{onset} and \texttt{terminus}).
416
417<<>>=
418network.extract(stergm.sim.1,at=429)
419@
420
421For any one of these time points, we can look at the network structure:
422
423<<simex, include=FALSE, fig=TRUE>>=
424plot(network.extract(stergm.sim.1,at=882))
425@
426
427\begin{figure}[ht]
428\begin{center}
429\includegraphics[height=3in]{STERGM-simex}
430\end{center}
431\end{figure}
432
433
434How well do the cross-sectional networks within our simulated dynamic network fit the probability distribution implied by our model? 
435We can check by considering the summary statistics for our observed network, and those for our cross-sectional networks.
436
437<<>>=
438summary(flobusiness~edges+gwesp(0,fixed=T))
439colMeans(attributes(stergm.sim.1)$stats)
440@
441
442And we can also easily look at a time series and histogram for each statistic:
443
444<<statsform1, include=FALSE, fig=TRUE>>=
445plot(attributes(stergm.sim.1)$stats)
446@
447
448\begin{figure}[ht]
449\begin{center}
450\includegraphics[height=3in]{STERGM-statsform1}
451\end{center}
452\end{figure}
453
454We should also check to make sure that our mean duration is what we expect (10 time steps). This requires knowing an additional function: \texttt{as.data.frame},
455which, when applied to an object of class \texttt{networkDynamic}, generates a timed edgelist.
456Although right-censoring is present for some edges in our simulation, with a mean duration of 10 time steps and a simulation 1000 time steps long,
457its effect on our observed mean duration should be trivial.
458
459<<>>=
460stergm.sim.1.dm <- as.data.frame(stergm.sim.1)
461names(stergm.sim.1.dm)
462mean(stergm.sim.1.dm$duration)
463@
464
465The information on when an edge is active and when it is inactive is stored within our \texttt{network} object as the edge attribute \texttt{active}. 
466Vertices, too, are capable of becoming active and inactive within \texttt{networkDynamic}, and this information is stored as a vertex attribute.
467Most of the time, users should access this information indirectly, through functions like \texttt{network.extract} or \texttt{as.data.frame}. 
468Additional functions to query or set activity include \texttt{is.active}, \texttt{activate.vertex}, \texttt{deactivate.vertex}, \texttt{activate.edge},
469and \texttt{deactivate.edge}, all documented in \texttt{help(package="networkDynamic")}.
470
471For those who want to look under the hood, they can see the activity spells directly.  For a single edge, say, edge number 25, use:
472
473<<>>=
474get.edge.value(stergm.sim.1, "active", unlist=FALSE)[[25]]
475@
476
477Note that \texttt{networkDynamic} stores spells in the form [onset,terminus), meaning that the spell is inclusive of the onset and exclusive of the terminus. 
478So a spell of 3,7 means the edge begins at time point 3 and ends just before time point 7.  networkDynamic can handle continuous-time spell information. 
479However, since STERGMs are discrete-time with integer steps, what this means for STERGM is that the edge is inactive up through time step 2;
480 active at time steps 3, 4, 5, and 6; and inactive again at time step 7 and on. Its duration is thus 4 time steps.
481
482\section{Independence within and across time steps}
483
484STERGMs assume that the formation and dissolution processes are indepedent of each other \emph{within the the same time step}. 
485
486This does not necessarily mean that they will be independent across time. In fact, for any dyadic dependent model, they will not. To see this point, think of a romantic relationship example with:
487
488\begin{verbatim}
489     formation = ~edges+degree(2:10)
490     dissolution = ~edges
491\end{verbatim}
492
493\noindent with increasingly negative parameters on the degree terms. 
494What this means is that there is some underlying tendency for relational formation to occur,
495which is considerably reduced with each pre-existing tie that the two actors involved are already in.
496In other words, there is a strong prohibition against being in multiple simultaneous romantic relationships.
497However, dissolution is fully independent---all existing relationships have the same underlying dissolution probability at every time step.
498(The latter assumption is probably unrealistic; in practice, if someone just acquired a second partner, their first is likely to be at increased risk of dissoving their relation.
499We ignore this now for simplicity).
500
501Imagine that Chris and Pat are in a relationship at time \texttt{t}. During the time period between \texttt{t} and \texttt{t+1},
502whether they break up does not depend on when either of them acquires a new partner, and vice versa. Let us assume that they do *not* break up during this time.
503Now, during the time period between \texttt{t+1} and \texttt{t+2}, whether or not they break up is dependent on the state of the network at time \texttt{t+1},
504and that depends on whether either of them they acquired new partners between \texttt{t} and \texttt{t+1}.
505
506The simple implication of this is that in this framework, formation and dissolution can be dependent, but that dependence occurs in subsequent time steps, not simultaneously.
507
508Note that a time step here is abritrary, and left to the user to define.  One reason to select a smaller time interval is that it makes this assumption more justifiable. With a time step of 1 month,
509then if I start a new relationship today, the earliest I can break up with my first partner as a direct result of that new partnership is in one month. If my time step is a day, then it is in 1 day;
510the latter is likely much more reasonable. The tradeoff is that a shorter time interval means longer computation time for both model estimation and simulation, as will be seen below. You will see throughout this talk that there are multiple
511positives and negatives to having a short time step and having a long time step.  We will discuss them as they go, and review them collectively at the end.
512
513At the limit, this can in practice approximate a continuous-time model---the only issue is computational limitations.
514
515\section{Example 2: Long durations}
516
517For the type of model we saw in Example 1 (with a known dissolution model that contains a subset of terms from the formation model),
518it can be shown that a good set of starting values for the estimation of the formation model are as follows:
519(1) fit the terms in the formation model as a static ERGM on the cross-sectional network; and (2) subtract the values
520of the dissolution parameters from the corresponding values in the cross-sectional model. 
521The result is a vector of parameter values that form a reasonable place to start the MCMC chain for the estimation of the formation model. 
522This is in fact exactly what the \texttt{stergm} estimation code does by default for this type of model.
523
524When mean relational duration is very long, this approximation is so good that it may not be necessary to run a STERGM estimation at all.
525Especially if your purpose is mainly for simulation, the approximation may be all you need.  This is a very useful finding, since models with long mean duration are precisely the ones that
526are the slowest and most difficult to fit using EGMME.  That's because, with long durations, very few ties will change between one time step and another, giving the fitting algorithm
527very little information on which to perform the estimation. 
528
529Of course, in order to be able to take advantage of this method, it is necessary for the terms in your dissolution model to be a subset of the terms in your formation model.
530
531To illustrate, let us reconsider Example 1, with a mean relational duration of 100 time steps.
532
533<<>>=
534theta.diss.100 <- log(99)
535@
536
537First, we treat the formation process as if it were a stand-alone cross-sectional model, and estimate it using a standard cross-sectional ERGM.
538We did, in fact, fit this cross-sectional model earlier:
539
540<<>>=
541summary(fit1)
542theta.form <- fit1$coef
543theta.form
544@
545
546Then, we subtract the values of the dissolution $\theta$ from each of the corresponding values in the formation model.
547In this example, the dissolution model contains only an edges term, so this coefficient should be subtracted from the starting value for the edges term in the formation model.
548
549<<>>=
550theta.form[1] <- theta.form[1] - theta.diss.100
551@
552
553How well does this approximation do in capturing our desired dynamic network properties? First, we can simulate from it:
554
555<<>>=
556stergm.sim.2 <- simulate(flobusiness,
557        formation=~edges+gwesp(0,fixed=T),
558        dissolution=~edges,
559        monitor="all",
560        coef.form=theta.form,
561        coef.diss=theta.diss.100,
562        time.slices=10000)
563@
564
565Then check the results in terms of cross-sectional network structure and mean relational duration?
566
567<<simform, include=FALSE, fig=TRUE>>=
568summary(flobusiness~edges+gwesp(0,fixed=T))
569colMeans(attributes(stergm.sim.2)$stats)
570stergm.sim.dm.2 <- as.data.frame(stergm.sim.2)
571mean(stergm.sim.dm.2$duration)
572plot(attributes(stergm.sim.2)$stats)
573@
574
575\begin{figure}[ht]
576\begin{center}
577\includegraphics[height=3in]{STERGM-simform}
578\end{center}
579\end{figure}
580
581
582
583
584\section{Example 3: Two network cross-sections}
585
586Another common data form for modeling dynamic network processes consists of observations of network structure at two or more points in time on the same node set. 
587Many classic network studies were of this type, and data of this form continue to be collected and analyzed.
588
589Let us consider the first two time points in the famous Sampson monastery data:
590
591<<>>=
592data(samplk)
593ls(pattern="samp*")
594@
595
596To pass them into \texttt{stergm}, we need to combine them into a list:
597<<>>=
598samp <- list()
599samp[[1]] <- samplk1
600samp[[2]] <- samplk2
601@
602
603Now we must decide on a model to fit to them.  Plotting one network:
604
605\begin{center}
606<<fig=TRUE>>=
607plot(samplk1)
608@
609\end{center}
610
611\noindent we might get the idea to consider mutuality as a predictor of a directed edge.  Also, since this is a directed network,
612and there appear to be a considerable number of triadic relations, it might be worth investigating the role of cyclic vs. transitive triads in the network.
613Of course, since we have two network snapshots, and we have separate formation and dissolution models,
614we can estimate the degree to which closing a mutual dyad or closing a triad of each type predicts the creation of a tie,
615and also estimate the degree to which maintaining a mutual dyad or maintaining a triad of each type predicts the persistence of an existing tie.
616We might see different phenomena at work in each case; or the same phenomena, but with different coefficients.
617
618Because of the different structure of our model, we need to change our arguments slightly.  Our estimation method should now be conditional maximum likelihood estimation (CMLE).
619Moreover, we no longer need the target argument (and it is in fact not allowed for CMLE, since the algorithm automatically targets the sufficient statistics present in each of the two networks).
620In this case, we have no offsets, since there are no coefficients set in either the formation or dissolution model.
621
622<<results=hide>>=
623stergm.fit.3 <- stergm(samp,
624        formation= ~edges+mutual+ctriad+ttriad,
625        dissolution = ~edges+mutual+ctriad+ttriad,
626        estimate = "CMLE"
627        )
628@
629
630\texttt{Fitting formation:}\\
631\texttt{Iteration 1 of at most 20:}\\
632$\Rightarrow$ Lots of output snipped. $\Leftarrow$\\
633\texttt{Time points not specified for a list. Modeling transition from the first to the second network. This behavior may change in the future.}\\
634
635And the results:
636
637<<>>=
638summary(stergm.fit.3)
639@
640
641So, a relationship is more likely than chance to form if it will close a mutual pair.
642And it is also more likely than chance to persist if it will retain a mutual pair, although the coefficient is smaller.
643A relationship is more likely than chance to form if it will close a transitive triad, and more likely to persist if it
644sustains a transitive triad, although these effects appear to be less clearly significant.
645
646\section{Example 4: Simulation driven by egocentric data}
647
648In many cases, people's primary interest in using dynamic networks is to simulate some diffusion process on one or more networks with similar features.
649Increasingly, our knowledge about those features come in the form of egocentrically sampled data, not from the traditional network census in a bounded population.
650Both \emph{ergm} and \emph{stergm} have methods for handling these situations.
651
652For example, imagine that you want to model HIV transmission among a population of gay men in steady partnerships. 50\% of the men are White and 50\% are Black.
653You collect egocentric partnership data from a random (ha! ha!) sample of these men. Your data say:
654
655\begin{enumerate}
656\item{There are no significant differences in the distribution of momentary degree (the number of ongoing partnerships at one point in time)
657reported by White vs. Black men. The mean is 0.90, and the overall distribution is:}
658\begin{enumerate}
659\item{36\% degree 0}
660\item{46\% degree 1}
661\item{18\% degree 2+}
662\end{enumerate}
663\item{83.3\% of relationships are racially homogeneous}
664\end{enumerate}
665
666We also have data (from these same men, or elsewhere) that tell us that the mean duration for a racially homogenous relationship is 10 months,
667while for a racially mixed one it is 20 months. 
668(Perhaps this is because the social pressure against cross-race ties makes it such that those who are willing to enter them are a select group,
669more committed to their relationships).
670
671Before we model the disease transmission, we need a dynamic network that possesses each of thse features to simulate it on.
672
673Our first step is to create a 500-node undirected network, and assign the first 250 nodes to race 0 and the second to race 1. The choice of 500 nodes is arbitary.
674
675<<>>=
676msm.net <- network.initialize(500, directed=F) 
677msm.net %v% 'race' <- c(rep(0,250),rep(1,250))
678msm.net
679@
680
681ERGM and STERGM have functionality that allow us to simply state what the target statistics are that we want to match; we do not actually need to generate
682a network that has them.  The formation formula and target statistics that we need are:
683
684<<>>=
685msm.form.formula <- ~edges+nodematch('race')+degree(0)+
686    concurrent
687msm.target.stats <- c(225,187,180,90)
688@
689
690Why don't we specify \texttt{degree(1)} as well?
691How did we get those values?
692
693Now let us turn to dissolution. We are back to the case where we can solve these explicitly,
694although this is complicated slightly by the fact that our disslution probabilities differ by the race composition of the members. 
695One dissolution formula for representing this is:
696
697<<>>=
698msm.diss.formula <- ~offset(edges)+offset(nodematch("race"))
699@
700
701These two model statistics means that there will be two model coefficients.  Let us call them $\theta_1$ and $\theta_2$ for the edges and nodematch terms, respectively.
702Let us also refer to the change statistics for actor pair $i,j$ for each of these as $\delta_1(y_{ij})$ and $\delta_2(y_{ij})$, respectively.
703
704Thus the log-odds expression for dissolution that we saw earlier would here be expressed as:
705\begin{equation}\label{disslogit}
706\ln{\frac{P(Y_{ij,t+1}=1 \mid Y_{ij,t}=1)}{P(Y_{ij,t+1}=0\mid Y_{ij,t}=1)}} = \theta_1\delta_1(y_{ij})+\theta_2\delta_2(y_{ij})
707\end{equation}
708Note that $\delta_1(y_{ij})$ would equal 1 for all actor pairs, while $\delta_2(y_{ij})$ would equal 1 for race homophilous pairs and 0 for others. 
709That means that the log-odds of tie persistence will equal $\theta_1$ for mixed-race couples and $\theta_1 + \theta_2$ for race-homophilous couples. 
710This suggests that we should be able to calculate $\theta_1$ directly, and subsequently calculate $\theta_2$.
711
712Following the logic we saw in the Example 1, we can see that:
713\begin{equation}
714\theta_1 = \ln{d_{mixed}-1}
715\end{equation}
716and therefore $\theta_1 = \ln{(20-1)} = \ln{19} = 2.944$.
717
718Furthermore,
719\begin{equation}
720\theta_1 + \theta_2 = \ln{d_{homoph}-1}
721\end{equation}
722and therefore $\theta_2 = \ln{(d_{homoph}-1)}-\theta_1 = \ln{(10-1)} - 2.944 = -0.747$.
723
724So, we have:
725
726<<>>=
727msm.theta.diss <- c(2.944, -0.747)
728@
729
730We add in one additional control parameter---\texttt{SA.init.gain}---giving it a small value (the default is 0.1). 
731As the help page for \texttt{control.stergm} sagely advises, ``If the process initially goes crazy beyond recovery, lower this value.''
732This slows down estimation, but also makes it more stable.
733From trial and error, we know that this model, fit to this relatively large network, does better with this smaller value.
734
735Putting it all together gives us:
736
737<<results=hide>>=
738set.seed(0)
739msm.fit <- stergm(msm.net,
740        formation= msm.form.formula,
741        dissolution= msm.diss.formula,
742        targets="formation",
743        target.stats= msm.target.stats,
744        offset.coef.diss = msm.theta.diss,
745        estimate = "EGMME",
746        control=control.stergm(SA.plot.progress=TRUE,
747            SA.init.gain=0.005)
748)
749@
750
751\texttt{Iteration 1 of at most 20:}\\
752$\Rightarrow$ Lots of output snipped. $\Leftarrow$ \\
753\texttt{========  Phase 3: Simulate from the fit and estimate standard errors. ========}\\
754
755Let's first check to make sure it fit well:
756
757<<msmdiag, include=FALSE, fig=TRUE, results=hide>>=
758mcmc.diagnostics(msm.fit)
759@
760$\Rightarrow$ Lots of output snipped. $\Leftarrow$\\
761\begin{figure}[h]
762\begin{center}
763\includegraphics[height=3in]{STERGM-msmdiag}
764\end{center}
765\end{figure}
766
767and see what the results tell us:
768
769<<>>=
770summary(msm.fit)
771@
772
773Now, we simulate a dynamic network:
774
775<<>>=
776msm.sim <- simulate(msm.fit,time.slices=1000)
777@
778
779and compare the outputs to what we expect, in terms of cross-sectional structure:
780
781<<>>=
782colMeans(attributes(msm.sim)$stats)
783msm.target.stats
784@
785
786Here's another interesting way to look at one aspect of the network structure:
787
788<<msmht, include=FALSE, fig=TRUE>>=
789msm.sim.dm <- as.data.frame(msm.sim)
790plot(msm.sim.dm$head,msm.sim.dm$tail)
791@
792
793\begin{figure}[ht]
794\begin{center}
795\includegraphics[height=3in]{STERGM-msmht}
796\end{center}
797\end{figure}
798 
799And relationship length:
800
801<<>>=
802names(msm.sim.dm)
803msm.sim.dm$race1 <- msm.sim.dm$head>250
804msm.sim.dm$race2 <- msm.sim.dm$tail>250
805msm.sim.dm$homoph <- msm.sim.dm$race1 == msm.sim.dm$race2
806mean(msm.sim.dm$duration[msm.sim.dm$homoph==T &
807  msm.sim.dm$left.censored==F & msm.sim.dm$right.censored==F ])
808mean(msm.sim.dm$duration[msm.sim.dm$homoph==F &
809  msm.sim.dm$left.censored==F & msm.sim.dm$right.censored==F ])
810@
811
812\section{Additional functionality}
813
814Both the \texttt{stergm} functions and the \texttt{networkDynamic} package have additional functionality,
815which you can learn about and explore through the use of R's many help features.
816Remember that both of these have only been released publicly for the first time in recent weeks. 
817If you begin to use them in depth in the near future, you will likely have further questions.  If so, we encourage you to join the statnet users' group
818(\url{http://csde.washington.edu/statnet/statnet_users_group.shtml}), where you can then post your questions (and possibly answer others).  You may also encounter bugs; please use the same
819place to report them. Happy stergming!
820\\
821\\
822\\
823References:\\
824\\
825Pavel N. Krivitsky and Mark S. Handcock. A Separable Model for Dynamic
826Networks. 2010. \url{http://arxiv.org/abs/1011.1937}
827
828\end{document}