This tutorial is a joint product of the Statnet Development Team:

Pavel N. Krivitsky (University of New South Wales)
Martina Morris (University of Washington)
Mark S. Handcock (University of California, Los Angeles)
Carter T. Butts (University of California, Irvine)
David R. Hunter (Penn State University)
Steven M. Goodreau (University of Washington)
Chad Klumb (University of Washington)
Skye Bender-deMoll (Oakland, CA)
Michał Bojanowski (Kozminski University, Poland)

The network modeling software demonstrated in this tutorial is authored by Pavel Krivitsky (ergm.ego), with contributions from Michał Bojanowski.


The Statnet Project

All Statnet packages are open-source, written for the R computing environment, and published on CRAN. The source repositories are hosted on GitHub. Our website is statnet.org.

  • Need help? For general questions and comments, please email the Statnet users group at statnet_help@uw.edu. You’ll need to join the listserv if you’re not already a member. You can do that here: Statnet_help listserv.

  • Found a bug in our software? Please let us know by filing an issue in the appropriate package GitHub repository, with a reproducible example.

  • Want to request new functionality? We welcome suggestions – you can make a request by filing an issue on the appropriate package GitHub repository. The chances that this functionality will be developed are substantially improved if the requests are accompanied by some proposed code (we are happy to review pull requests).

  • For all other issues, please email us at contact@statnet.org.


1 Introduction

This tutorial provides an introduction to statistical modeling of egocentrically sampled network data with Exponential family Random Graph Models (ERGMs). The primary package we will be demonstrating is ergm.ego (Krivitsky 2023), but we will make use of utilities from other Statnet packages at various points. As of version 1.0, ergm.ego depends on the egor (Krenz et al. 2024) package for egocentric network data management.

1.1 Prerequisites

This workshop assumes basic familiarity with R, experience with network concepts, terminology and data, and familiarity with the basic principles of statistical modeling and inference. Previous experience with ERGMs is not required, but is strongly recommended (the introductory ERGM workshop is a good place to start).

The workshops are conducted using RStudio.

1.2 Software Installation

Open an R session, and set your working directory to the location where you would like to save this work.

To install the ergm.ego package from CRAN:

install.packages('ergm.ego')

This will install all of the “dependencies” – the other R packages that ergm.ego needs.

Although we recommend using the CRAN versions of Statnet packages, it is also possible to install the development version of the package from Statnet’s R-universe using:

install.packages(
  "ergm.ego", 
  repos = c("https://statnet.r-universe.dev", "https://cloud.r-project.org")
)

Load the package into R and verify the package version:

library('ergm.ego')
Loading required package: ergm
Loading required package: network

'network' 1.18.2 (2023-12-04), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information

'ergm' 4.7-7368 (2024-06-11), part of the Statnet Project
* 'news(package="ergm")' for changes since last version
* 'citation("ergm")' for citation information
* 'https://statnet.org' for help, support, and other information
'ergm' 4 is a major update that introduces some backwards-incompatible
changes. Please type 'news(package="ergm")' for a list of major
changes.
Loading required package: egor
Loading required package: dplyr

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Loading required package: tibble

'ergm.ego' 1.1-704 (2023-05-30), part of the Statnet Project
* 'news(package="ergm.ego")' for changes since last version
* 'citation("ergm.ego")' for citation information
* 'https://statnet.org' for help, support, and other information

Attaching package: 'ergm.ego'
The following objects are masked from 'package:ergm':

    COLLAPSE_SMALLEST, snctrl
The following object is masked from 'package:base':

    sample
packageVersion('ergm.ego')
[1] '1.1.704'

Installation of a development version is also possible; see the Package Development section.

2 Overview of ergm.ego

The ergm.ego package is designed to provide principled estimation of and statistical inference for Exponential-family Random Graph Models (“ERGMs”) from egocentrically sampled network data.

This dramatically reduces the burden of data collection, which is typically one of the largest obstacles to empirical research on networks. In many contexts the collection of a network census or an adaptive (link-traced) sample is not possible. Even when one of these may be possible in theory, however, egocentrically sampled data are much cheaper and easier to collect.

Long regarded as the poor country cousin in the network data family, egocentric data actually contain a remarkable amount of information. With the right statistical methods, such data can be used to explore, summarize and simulate the complete networks in which they are embedded.

The basic idea here will be familiar to anyone who has worked with survey data: you combine what is observed (the data) with assumptions (the model terms and their sampling distributions), to define a class of models (the coefficients on the terms) that can be estimated.

Once estimated, the fitted model can be used for prediction. In the network context this means that the fitted model can be used to simulate complete networks. Each simulated network is a probabilistic draw (“realization”) from the distribution of networks specified by the model, and these draws will be centered on the observed statistics of the (appropriately scaled) sampled network. The stochastic variation in the simulated networks reflects both sampling uncertainty and the variation in network properties that are not included in the model terms.

It is worth emphasizing this point: the ERGM framework allows you to simulate the distribution of complete networks that are consistent with the egocentrically sampled data you have collected. You can exploit this feature to explore the whole network properties (e.g., connectivity, component size distributions, etc.) consistent with your data but not observable in egocentric samples.
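
To make this concrete, here is a hedged sketch of the simulate-from-fit workflow, using the faux.mesa.high data introduced later in this tutorial; the model terms are illustrative choices, not a recommended specification:

```r
# Sketch: simulate complete networks from a fitted ergm.ego model.
library(ergm.ego)
data(faux.mesa.high)
mesa.ego <- as.egor(faux.mesa.high)

# An illustrative dyad-independent model
fit <- ergm.ego(mesa.ego ~ edges + nodematch("Grade"))

# Each simulated network is a draw ("realization") from the fitted distribution
sims <- simulate(fit, nsim = 5)
sapply(sims, network.edgecount)  # statistics vary stochastically around the target
```

Whole-network properties that are unobservable in the egocentric sample, such as connectivity and component sizes, can then be computed on each simulated network.
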


ERGMs offer two powerful advantages to social network analysts:

  1. Estimation of complex models from egocentrically sampled data, and
  2. Simulation of complete networks from these egodata that are consistent with the observed model statistics.

Both of these tasks can be accomplished using the ergm.ego package. The package comprises:

  • Utilities to manage the data, though it relies mostly on the egor package (Krenz et al. 2024),
  • Egocentric terms that can be used in models,
  • Functions for estimation and inference that rely largely on the existing ergm package (Handcock et al. 2022), but include the specific modifications needed in the egocentric data context.

ergm.ego is designed to work with the other Statnet packages. So, for example, once you have fit a model, you can use the summary and diagnostic functions from ergm to evaluate the model fit, the ergm simulate function to simulate complete network realizations from the model, the network descriptives from sna (Butts 2020) to explore the networks simulated from the model, and you can use other R functions and packages as well after converting the network data structure into a data frame.

Putting this all together, you can start with egocentric data, estimate a model, test the coefficients for statistical significance, assess the model goodness of fit, and simulate complete networks of any size from the model. The statistics in your simulated networks will be consistent with the appropriately scaled statistics from your sample for all of the terms that are represented in the model.

3 Key concepts

The full technical details on ERGM estimation and inference from egocentrically sampled data can be found in Krivitsky and Morris (2017). This section of the tutorial provides a brief introduction to the key concepts.

3.1 ERGMs

This section provides a brief overview of the key principles of ERGMs that are needed to understand how estimation from egocentric data works. For a more thorough introduction to ERGM theory and its implementation in the Statnet packages, see the special issue of the Journal of Statistical Software devoted to Statnet (Handcock et al. 2008). For an introduction to the ergm package, see the Statnet ERGM Workshop.

ERGMs represent a general class of models based in exponential-family theory for specifying the probability distribution for a set of random graphs or networks. Within this framework, one can—among other tasks—obtain maximum-likelihood estimates for the parameters of a specified model for a given data set; test individual models for goodness of fit; perform various types of model comparison; and simulate additional networks from the underlying probability distribution implied by that model.

The general form for an ERGM can be written as: \[ P(Y=y;\theta,x)=\frac{\exp(\theta^{\top}g(y,x))}{\kappa(\theta,x)}\qquad (1) \] where \(Y\) is the random variable for the state of the network (with realization \(y\)), \(g(y,x)\) is a vector of model statistics for network \(y\), \(\theta\) is the vector of coefficients for those statistics, and \(\kappa(\theta,x)\) represents the quantity in the numerator summed over all possible networks (typically constrained to be all networks with the same node set as \(y\)).

The model terms \(g(y,x)\) are functions of network statistics that we hypothesize may be more (or less) common than what would be expected in a simple random graph (where all ties have the same probability). When working with egocentrically sampled network data, the statistics one can include in the model are limited by the requirement that they can be observed in the sample data (a detailed discussion can be found in Appendix C).

A key distinction in ERG model terms is whether they are dyad independent or dyad dependent. Dyad independent terms (such as nodematch for attribute homophily) imply no dependence between dyads—the presence or absence of a tie may depend on nodal attributes, but not on the state of other ties. Dyad dependent terms (such as degree for nodal degree, or triad-related terms such as gwesp) imply dependence between dyads.

The design of an egocentric sample means that most observable statistics are dyad independent, but there are a few, like degree, that are observable and dyad dependent.

3.2 Network Sampling

Network data are distinguished by having two units of analysis: the actors and the links between the actors. A dataset that contains information on all nodes and all links is called a “network census”. The two units give rise to a range of sampling designs that can be classified into two groups: link-tracing designs (e.g., snowball and respondent-driven sampling) and egocentric designs.

Network census

A network census is, as the name suggests, a dataset that contains information on every node and every link in the population of interest. In the SNA literature, this type of data is sometimes referred to as a “sociometric” design. As with all census designs, the data collection process tends to be expensive and time-consuming. As a result, this type of data tends to show up in two different application contexts: either small, well-bounded groups like classrooms, business firms and community organizations, or online settings where the data can be efficiently scraped.

Egocentric Designs

Egocentric network sampling comprises a range of designs developed specifically for the collection of network data in social science survey research. The design is (ideally) based on a probability sample \(S\) of respondents (“egos”) from the population \(N\). Via interview, the egos are asked to nominate a list of persons (“alters”) with whom they have a specific type of relationship (“tie”). The egos are then asked to provide information on the characteristics of the alters and/or the ties, but the alters are not recruited or directly observed. Depending on the study design, alters may or may not be uniquely identifiable, and respondents may or may not be asked to provide information on one or more ties among alters (the “alter” matrices). Alters could, in theory, also be present in the data as an ego or as an alter of a different ego; the likelihood of this depends on the sampling fraction.

Egocentric designs sample egos using standard sampling methods, and the sampling of links is implemented through the survey instrument. As a result, these methods are easily integrated into population-based surveys, and, as we show below, inherit many of the inferential benefits.

The minimal design (without the alter matrices) is more common, and the data are more widely available, largely because it is less invasive and less time-consuming than designs which include identifiable alter matrices.


3.3 Methods for sampled network data

Model-based

Handcock and Gile (2010) propose likelihood inference for partially observed networks, with egocentric data as a special case.

Koskinen et al. (2013) develop Bayesian inference for partially observed networks, again with egocentric data as a special case.

Pros:
  • Can fit any ERGM that can be identified.
  • Can handle link-tracing designs.
Cons:
  • Requires alters to be identifiable across ego-networks
  • Cannot take into account sampling weights (unless all attributes that affect sampling weights are part of the model).
  • Might not scale.
  • Requires knowledge of the population distribution of actor attributes used in the model.

Design-based

Krivitsky and Morris (2017) use design-based estimators for the sufficient statistics of the ERGM of interest, and then transfer their properties to the ERGM coefficient estimates.

Krivitsky, Bojanowski, and Morris (2019) demonstrate estimating triadic effects and scenarios in which an attribute is only observed on the ego.

Krivitsky, Morris, and Bojanowski (2022) discuss sampling design for inference.

Pros:
  • Does not require alters to be identifiable.
  • Borrows directly from design-based inference methods. (Can easily incorporate sampling weights, stratification, etc.)
  • Can fit any ERGM that can be identified (though see below).
  • Can be made invariant to network size for some models.
Cons:
  • Requires “reimplementation” of the ERG model statistics as “EgoStats”.
  • Relies on independent sampling from the population of interest.
  • Cannot be fit to more complex (e.g., RDS) designs.
  • Requires knowledge of the population distribution of actor attributes used in the model.

As currently implemented, modeling in the ergm.ego package does not support alter–alter statistics or directed or bipartite networks.


4 Theoretical Framework

Consider an egocentric view of the entire population: every node is observed (i.e., \(S=N\), a census), but alters are not uniquely identifiable across the egos. This limits the kinds of network statistics that can be observed, which in turn restricts the terms that can be fit (the models that can be identified) in an ERGM. We can use the notion of sufficiency from statistical theory to identify the terms amenable to egocentric inference.

4.1 Estimation

The framework for estimation and inference relies on two basic properties of exponential family models:

  1. For MLEs, the expected value of a sufficient statistic (\(g(y,x)\)) in the model is equal to its observed value.
  2. The MLE is a smooth function of the sufficient statistic, and is defined for “in between” values of the statistics as well (e.g., fractional edges).

MLEs uniquely maximize the probability of the observed statistics under the model, and any network with the same observed statistics will have the same probability.

Design-based estimation of ERGMs is done in three steps:

  1. Estimate the ERGM’s sufficient statistics from an egocentric sample \(S\) via \[g(y,x)\approx \tilde{g}(e_S)=\frac{|N|}{|S|} \textstyle\sum_{i\in S} h_{k}(e_i).\]
    • As it involves summing over all sampled nodes, it is an estimate for a population total. By the Central Limit Theorem, it is consistent and asymptotically normal.
    • Sampling variances of these estimates can be estimated similarly to sampling variance of a mean.
  2. Fit the ERGM by finding \(\hat{\theta}\) that corresponds to (that is, produces) \(\tilde{g}(e_S)\).
    • It exists because of ERGM’s properties.
  3. Use Delta Method to transfer properties of \(\tilde{g}(e_S)\) to \(\hat{\theta}\).
    • We can do this, because the MLE is a smooth function of \(g(y,x)\).

Together, these allow us to use any statistic that can be observed in an egocentric sample as a term in an ERG model and to estimate the model from a complete “pseudo-network” that has the same (or appropriately scaled) sufficient statistics. The networks simulated from the fitted model will be centered on the (scaled) observed statistics.

4.1.1 Practical issues

In practice, egocentric sample statistics generally need to be adjusted for network size and some types of observable discrepancies. This is one of the key differences between working with sampled and unsampled network data.

Network Size

The treatment of network size is perhaps the most obvious way that egocentric estimation differs from a standard ERGM estimation on a completely observed network. With a network census, the network size is known; by contrast, with a network sample, we don’t typically know the size of the network from which it is drawn.

If the statistics we observe in the sample scale in a known way with network size, then we can adjust for this in the estimation, and the resulting parameter estimates (with the exception of the edges term) will be “size invariant”.

Here we will follow Krivitsky, Handcock, and Morris (2011), who showed that one can obtain a “per capita” size invariant parameterization for dyad-independent statistics in any network by using an offset, approximately equal to \(-\log(N)\), where \(N\) is the number of nodes in the network. The intuition is that this transforms the density-based parameterization (ties per dyad) that is the natural scale for ERGMs into a mean degree-based parameterization (ties per node):

\[ \text{Mean Degree} = \frac{2\times\text{ties}}{\text{nodes}} = \frac{2T}{n} \] \[ \text{Density} = \frac{\text{ties}}{\text{dyads}} = \frac{T}{\frac{N(N-1)}{2}} = \frac{\text{Mean Degree}}{(N-1)} \]
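
The identities above can be checked numerically (the counts here are arbitrary illustrative numbers):

```r
# Numeric check: density = mean degree / (N - 1)
N <- 205; ties <- 203
mean_degree <- 2 * ties / N                # ties per node
density <- ties / (N * (N - 1) / 2)        # ties per dyad
all.equal(density, mean_degree / (N - 1))  # TRUE
# On the log scale, log(density) = log(mean_degree) - log(N - 1), which is
# approximately log(mean_degree) - log(N): the intuition behind the offset.
```
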

Once the number of edges is adjusted to preserve the mean degree, all of the dyad-independent terms are properly scaled (Krivitsky, Handcock, and Morris 2011). For degree-based terms, we would want, by analogy, the per-capita invariance to preserve the degree probability distribution.
Experimental results suggest that the mean-degree-preserving offset has this property, but a mathematical proof is elusive. Scaling properties for triadic terms are less well developed (Krivitsky and Kolaczyk 2015).


Observable discrepancies

What we mean by discrepancy is: undirected tie subtotals that are required to balance in theory, but are observed not to balance in the sample. This can happen, for example, when ties are broken down by nodal attributes and the number of ties that group 1 reports to group 2 are not equal to the number that group 2 reports to group 1.

This is another unique feature of egocentrically sampled network data. With a network census, you have the complete edgelist, with the nodal attributes for each member of the dyad, so the reports will always balance. For an egocentrically sampled network, and even for an egocentric census, a discrepancy can arise, either from sampling variability, or from measurement error (if ego mis-reports the attribute of themselves or their alter).

The natural assumption, in the absence of specific knowledge, is that any discrepancy is due to sampling variation. Under this assumption, the average of the discrepant reports is the appropriate estimate of the number of ties for that ego-alter configuration. This is the approach implemented in ergm.ego. If you know the source of the discrepancy, or want to make a different assumption, you may address this before fitting the data in ergm.ego.
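
A toy illustration of the averaging rule, with hypothetical counts:

```r
# Undirected cross-group tie counts should balance, but sample reports disagree
g1_to_g2 <- 40  # ties that group 1 egos report to group 2 alters
g2_to_g1 <- 50  # ties that group 2 egos report to group 1 alters
mean(c(g1_to_g2, g2_to_g1))  # 45: the averaged estimate used under this assumption
```
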


Egocentric target statistics

Once the network size-invariant parameterization and consistency issues are addressed it is straightforward to construct the target statistics needed for ERGM estimation: we scale up the values of the sample statistics to the desired network size.

The way we do this is by specifying an offset term in the model. The offset used will depend on the context.

  • For unweighted samples: To obtain population estimates from ergm.ego from an unweighted sample of size \(|S|\) to a population with a known (or specified) size \(N\), fit the model with an offset of \(\log(N/|S|)=\log(N) - \log(|S|)\).

  • For weighted samples: To obtain population estimates from ergm.ego from a weighted sample to a population with a known (or specified) size \(N\), first choose a network size, \(|N'|\), to be used for estimation (a pseudo-population that will have the correct nodal attribute distribution specified by the weights), and then fit the model with an offset of \(\log(N/|N'|)=\log(N) - \log(|N'|)\). The criteria for choosing a good value of \(|N'|\) are discussed in the Model Fitting section below.

  • If the population network size is unknown: This is the most general case. If we do not know \(N\), or do not wish to specify it, we often fit with an offset of \(-\log(|S|)\) (for an unweighted sample) or \(-\log(|N'|)\) (for a weighted sample). This will return per-capita estimates that can be easily rescaled to any value post-estimation, e.g., for simulation purposes.
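
In practice these offsets are typically supplied through the fitting function rather than constructed by hand. A hedged sketch, assuming the popsize argument of ergm.ego() (consult ?ergm.ego for the authoritative interface) and using the faux.mesa.high data from the Example section:

```r
# Sketch: request population-size scaling when fitting. The popsize argument
# is an assumption about the interface; the model terms are illustrative.
library(ergm.ego)
data(faux.mesa.high)
mesa.ego <- as.egor(faux.mesa.high)
fit.N <- ergm.ego(mesa.ego ~ edges + nodematch("Grade"), popsize = 1000)
summary(fit.N)
```
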


4.2 Statistical Inference

The standard errors for coefficients in an ergm.ego fit are designed to represent the uncertainty in our estimate. For ERGMs, this uncertainty can be thought of as coming from three possible sources:

  1. A superpopulation of networks, from which this one network is a single realization: What other networks could have been produced by the social process of interest?

  2. The sampling process of egos: What other samples could have been drawn?

  3. The stochastic MCMC algorithm used to estimate the coefficient: What other MCMC samples could we have gotten?

Most treatments of ERGM estimation treat the coefficient \(\theta\) as a parameter of a superpopulation process of which \(y\) is a single realization. The variance of the MLE of \(\theta\) is then conceived as coming from (1) and (3) above.

In contrast, in ergm.ego we treat the network as a fixed, unknown, finite population, so it is not a source of uncertainty. Rather, uncertainty comes from sampling from this network, and from the MCMC algorithm, (2) and (3) above.

This makes ergm.ego inference much more like traditional (frequentist) statistical inference: we imagine repeatedly drawing an egocentric sample, and estimating the ERGM on each replicate. The sampling distribution of the estimate reflects how our estimate will vary from sample to sample.

4.3 Survey design effects

The ergm.ego package can be used with weighted survey data and complex sampling designs. In this context, the egor package transforms the ego tibble into a srvyr object. The srvyr package (Freedman Ellis and Schneider 2023) can be used for descriptive statistics, and ergm.ego will incorporate the survey design into its estimation and inference.

This topic is beyond the scope of this introductory workshop but the ergm.ego package has an example you can run for more information:

example(sample.egor)

5 The package ergm.ego

Since ergm.ego is essentially a wrapper around ergm, there are relatively few functions in the ergm.ego package itself. The functions that are there deal with the specific requirements associated with data management, estimation and inference for egocentrically sampled data.

To get a list of documented functions, type:

library(help='ergm.ego')

The main R objects unique to ergm.ego are:

  • egor objects for storing the original data (egor is the analog to network in ergm),

  • ergm.ego objects, which store the model fit results (the analog to ergm objects in ergm).

Once you simulate from the fit, the resulting objects are just network objects.

The functionality can be divided into groups as follows:

Data structure and input

Stripped down to the basics, egocentric network data comprise:

  • data on egos (nodal attributes)

  • data on alters (a combination of nodal and edge attributes, since each alter represents a tie)

  • data on ties between alters (which may also have edge attributes)

The egor object has simple, analogous structure for storing this information: a list object with 3 components

  • ego - data frame of egos and their attributes

  • alter - a data frame of the alters nominated by each ego, with their attributes (by default identified by the column egoID), or a list of data frames (one for each ego).

  • aatie - a data frame with edge list of alter-alter ties or a list of data frames (one for each ego).

In addition, one can specify:

  • ego_design - a list of arguments passed to srvyr::as_survey_design() specifying the sample design for egos. For example: probs for unequal probability independent sample, strata for stratified samples etc.

  • alter_design - currently a list with one element, max=, providing the maximum number of alters an ego could nominate under a Fixed Choice Design (Holland and Leinhardt 1973).
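
Putting the pieces together, here is a minimal toy construction from three data frames (the data are hypothetical, and the egor() constructor arguments shown are assumptions; see ?egor for the authoritative interface):

```r
# Toy egor object built from ego, alter, and alter-alter tie data frames
library(egor)
egos   <- data.frame(.egoID = 1:3, sex = c("F", "M", "F"))
alters <- data.frame(.altID = c(101, 102, 103),
                     .egoID = c(1, 1, 2),
                     sex    = c("M", "F", "M"))
aaties <- data.frame(.egoID = 1, .srcID = 101, .tgtID = 102)
toy <- egor(alters = alters, egos = egos, aaties = aaties,
            ID.vars = list(ego = ".egoID", alter = ".altID",
                           source = ".srcID", target = ".tgtID"))
```
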

The capacity to represent survey design elements makes egor a flexible and powerful foundation for network data analysis.

The simplicity of the data structure makes it easy to construct egor objects from external data read into R, and there are transformation utilities for working with other data formats (like network and igraph objects), which we will demonstrate in the Example section below.

For more information:

?as.egor

Model terms

The possible terms in an ergm.ego model are inherently limited to those that are egocentrically observable: statistics that can be inferred from an egocentric sample. In general, these will include terms that are functions of nodal attributes and attribute mixing, degree distribution terms, and triadic terms (when the alter–alter ties are observed). The ergm.ego terms have the same names and arguments as their ergm counterparts; there are just far fewer of them available (n=14).

Dyad independent terms include density and nodal attribute based measures:

  • Density: edges
  • Vertex attribute effects: nodefactor (for discrete/nominal vars) and nodecov (for continuous)
  • Homophily and mixing: nodematch (for homophily), nodemix (for general mixing patterns) and absdiff

Dyad dependent terms include degree- and triad-based measures:

  • Degree: degree, degrange, gwdegree, and degreepopularity
  • Concurrency: concurrent and concurrentties
  • Triadic terms include esp, gwesp, transitiveties and cyclicalties, but these can only be used if alter–alter ties have been observed.

For the full list of ergm.ego terms and their syntax, type:

help('ergm.ego-terms')

As in ergm, these terms can be used on the right-hand side of formulas in calls to model and simulation functions.
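
For example, just as with ergm formulas, you can compute the observed egocentric statistics for a candidate model before fitting (the data setup repeats the Example section below; the terms are illustrative):

```r
# Observed egocentric sample statistics for a candidate model specification
library(ergm.ego)
data(faux.mesa.high)
mesa.ego <- as.egor(faux.mesa.high)
summary(mesa.ego ~ edges + nodematch("Grade") + degree(0:1))
```
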


6 Example Analysis

We will work with the faux.mesa.high dataset that is included with the ergm package, using the as.egor function to transform it into an egocentric dataset. In essence, this creates an egocentric census of the network: a census of all nodes in the network, not a sample.

In this egocentric census, every node is the center of their own egonet – we know their alters, and the ties between their alters, but we cannot match the alters across the egonets because they are not uniquely identified. We can still compare the fits we get from ergm.ego (from the ego data) and ergm (from the original network) for models with the same terms.

Preliminaries:

Check package versions

sessionInfo()

Set seed for simulations – this is not necessary, but it ensures that we all get the same results (if we execute the same commands in the same order).

set.seed(1)

6.1 Data construction

We’ll show 2 examples of how to create an egor object here.

6.1.1 From a network object

Read in the faux.mesa.high data:

data(faux.mesa.high)
mesa <- faux.mesa.high

Take a quick look at the complete network

plot(mesa, vertex.col="Grade")
legend('bottomleft',fill=7:12,legend=paste('Grade',7:12),cex=0.75)

Now, let’s turn this into an egor object:

mesa.ego <- as.egor(mesa) 

Take a look at this object – there are several ways to do this:

names(mesa.ego) # what are the components of this object?
[1] "ego"   "alter" "aatie"
mesa.ego # shows the dimensions of each component
# EGO data (active): 205 × 4
  .egoID Grade Race  Sex  
*  <int> <dbl> <chr> <chr>
1      1     7 Hisp  F    
2      2     7 Hisp  F    
3      3    11 NatAm M    
4      4     8 Hisp  M    
5      5    10 White F    
# ℹ 200 more rows
# ALTER data: 406 × 5
  .altID .egoID Grade Race  Sex  
*  <int>  <int> <dbl> <chr> <chr>
1    174      1     7 Hisp  F    
2    161      1     7 Hisp  F    
3    151      1     7 Hisp  F    
# ℹ 403 more rows
# AATIE data: 372 × 3
  .egoID .srcID .tgtID
*  <int>  <int>  <int>
1      1    151    127
2      1    127     52
3      1    127     87
# ℹ 369 more rows
#View(mesa.ego) # opens the component in the Rstudio source window
class(mesa.ego) # what type of "object" is this?
[1] "egor" "list"

Each of the components of the egor object is a simple table, or data.frame.

class(mesa.ego$ego) # and what type of objects are the components?
[1] "tbl_df"     "tbl"        "data.frame"
class(mesa.ego$alter)
[1] "tbl_df"     "tbl"        "data.frame"
class(mesa.ego$aatie)
[1] "tbl_df"     "tbl"        "data.frame"

The ego table contains the ego ID (.egoID), and the nodal attributes Race, Grade and Sex. This is equivalent to a standard person-based survey sample flat file format.

mesa.ego$ego # first few rows of the ego table
# A tibble: 205 × 4
   .egoID Grade Race  Sex  
    <int> <dbl> <chr> <chr>
 1      1     7 Hisp  F    
 2      2     7 Hisp  F    
 3      3    11 NatAm M    
 4      4     8 Hisp  M    
 5      5    10 White F    
 6      6    10 Hisp  F    
 7      7     8 NatAm M    
 8      8    11 NatAm M    
 9      9     9 White M    
10     10     9 NatAm F    
# ℹ 195 more rows

The alter table is a type of edgelist: it lists the edges for each ego. It contains the alter ID (.altID), the corresponding ego ID, and a set of alter nodal attributes. Note that this is a slightly different data structure than a standard network edgelist.

  • The standard network edgelist contains one unique record for each edge; both ego and alter ID may appear more than once (depending on their degree), but each link is only represented once.

  • The alter table in this egor object is a different type of edgelist, as it is “egocentric.”

    • When this table is derived from a network census, as we did above, each tie will be represented twice: once with the first node as the ego, and once with the other node as the ego. As a result, the number of times alters will appear in the .altID list is equal to their degree, as is the number of times their ID will appear in the .egoID list.
    • When this table is derived from an egocentric sample of a network, it is likely that each tie will only be represented once, unless some of the alters are sampled as egos. When the sampling fraction is small (e.g., for the General Social Survey friendship network data), it is very unlikely that alters will be sampled as egos.
mesa.ego$alter # first few rows of the alter table
# A tibble: 406 × 5
   .altID .egoID Grade Race  Sex  
    <int>  <int> <dbl> <chr> <chr>
 1    174      1     7 Hisp  F    
 2    161      1     7 Hisp  F    
 3    151      1     7 Hisp  F    
 4    127      1     7 Hisp  F    
 5    110      1     7 Hisp  F    
 6    100      1     7 Hisp  F    
 7     96      1     7 NatAm F    
 8     92      1     7 NatAm F    
 9     87      1     7 White F    
10     70      1     7 NatAm F    
# ℹ 396 more rows
# ties show up twice, but alter info is linked to .altID
mesa.ego$alter %>% filter((.altID==1 & .egoID==25) | (.egoID==1 & .altID==25))
# A tibble: 2 × 5
  .altID .egoID Grade Race  Sex  
   <int>  <int> <dbl> <chr> <chr>
1     25      1     7 White F    
2      1     25     7 Hisp  F    

The aatie table lists the egoID, and the IDs of the two alters that have a tie. The alters are distinguished as .srcID and .tgtID to allow for the possibility of directed tie data. In the case of undirected tie data, as we have here, each alter–alter tie will be represented twice, just swapping the source and target IDs.

mesa.ego$aatie # first few rows of the alter-alter tie table
# A tibble: 372 × 3
   .egoID .srcID .tgtID
    <int>  <int>  <int>
 1      1    151    127
 2      1    127     52
 3      1    127     87
 4      1    127    151
 5      1    110     87
 6      1    110     92
 7      1    110     96
 8      1    100     96
 9      1     96     87
10      1     96    110
# ℹ 362 more rows
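The symmetric duplication can be checked directly (a sketch): for every (source, target) row in the aatie table, there should be a matching row for the same ego with the two alter IDs swapped.

```r
aat <- mesa.ego$aatie
# build a key for each row, and for its source/target-swapped version
keys    <- paste(aat$.egoID, aat$.srcID, aat$.tgtID)
swapped <- paste(aat$.egoID, aat$.tgtID, aat$.srcID)
all(keys %in% swapped)
```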

6.1.2 From external data

Since each of the egor components is a simple rectangular table, it’s easy to read in external data and use it to construct an egor object. You just need to make sure that the structure of the external files is consistent with the structure of the tables we looked at above.

To demonstrate, we will construct an egor object derived from our mesa.ego data that has the features of an egocentrically sampled data set: the alters are not uniquely identified, and the alter–alter ties are ignored.

First, we write out the first two tables in our mesa.ego into external datafiles, deleting the .altID from the alter file.

# egos
write.csv(mesa.ego$ego, file="mesa.ego.table.csv", row.names = F)

# alters
write.csv(mesa.ego$alter[,-1], file="mesa.alter.table.csv", row.names = F)

Now read them back in:

mesa.egos <- read.csv("mesa.ego.table.csv")
head(mesa.egos)
  .egoID Grade  Race Sex
1      1     7  Hisp   F
2      2     7  Hisp   F
3      3    11 NatAm   M
4      4     8  Hisp   M
5      5    10 White   F
6      6    10  Hisp   F
mesa.alts <- read.csv("mesa.alter.table.csv")
head(mesa.alts)
  .egoID Grade Race Sex
1      1     7 Hisp   F
2      1     7 Hisp   F
3      1     7 Hisp   F
4      1     7 Hisp   F
5      1     7 Hisp   F
6      1     7 Hisp   F

To create an egor object from data frames, we use the egor() function:

my.egodata <- egor(egos = mesa.egos, 
                   alters = mesa.alts, 
                   ID.vars = list(ego = ".egoID"))
my.egodata
# EGO data (active): 205 × 4
  .egoID Grade Race  Sex  
* <chr>  <int> <chr> <chr>
1 1          7 Hisp  F    
2 2          7 Hisp  F    
3 3         11 NatAm M    
4 4          8 Hisp  M    
5 5         10 White F    
# ℹ 200 more rows
# ALTER data: 406 × 4
  .egoID Grade Race  Sex  
* <chr>  <int> <chr> <chr>
1 1          7 Hisp  F    
2 1          7 Hisp  F    
3 1          7 Hisp  F    
# ℹ 403 more rows
# AATIE data: 0 × 3
# ℹ 3 variables: .egoID <chr>, .srcID <chr>, .tgtID <chr>
my.egodata$alter
# A tibble: 406 × 4
   .egoID Grade Race  Sex  
   <chr>  <int> <chr> <chr>
 1 1          7 Hisp  F    
 2 1          7 Hisp  F    
 3 1          7 Hisp  F    
 4 1          7 Hisp  F    
 5 1          7 Hisp  F    
 6 1          7 Hisp  F    
 7 1          7 NatAm F    
 8 1          7 NatAm F    
 9 1          7 White F    
10 1          7 NatAm F    
# ℹ 396 more rows

Note that the alter data no longer have a unique alter identifier.

For another example that uses the alter–alter ties, see:

example("egor")

We will explore some of the other functions available for manipulating the egor object in a later section.


6.2 Exploratory analysis

Prior to model specification, we can explore the data using descriptive statistics observable in the original egocentric sample. In general, the observable statistics are the same as those that ergm.ego can estimate.

We can use standard R commands to view nodal attribute frequencies:

# to reduce typing, we'll pull the ego and alter data frames
egos <- mesa.ego$ego
alters <- mesa.ego$alter

table(egos$Sex) # Distribution of `Sex`

  F   M 
 99 106 
table(egos$Race) # Distribution of `Race`

Black  Hisp NatAm Other White 
    6   109    68     4    18 
barplot(table(egos$Grade), 
        main = "Ego grade distribution",
        ylab="frequency")

Compare egos and alters:

layout(matrix(1:2, 1, 2))
barplot(table(egos$Race)/nrow(egos),
        main="Ego Race Distn", ylab="percent",
        ylim = c(0,0.5), las = 3)
barplot(table(alters$Race)/nrow(alters),
        main="Alter Race Distn", ylab="percent",
        ylim = c(0,0.5), las = 3)
layout(1)

To look at the mixing matrix, we’ll use the mixingmatrix() function on the egor object, and
we’ll compare the output to what we would get from using this function on the original network object.

Note how the ties on the diagonal are counted twice in the ergm.ego data, compared with the original network data, but the off-diagonal tie counts are the same. Note also, though, that these off-diagonal counts are symmetric, because this is undirected data. So, in both cases, the off-diagonal ties are actually being counted twice (once above and once below the diagonal), but in the original network version, the ties on the diagonal are only counted once.

# to get the crosstabulated counts of ties:
mixingmatrix(mesa.ego,"Grade")
     7   8   9  10  11  12
7  150   0   0   1   1   1
8    0  66   2   4   2   1
9    0   2  46   7   6   4
10   1   4   7  18   1   5
11   1   2   6   1  34   5
12   1   1   4   5   5  12
Note:  Marginal totals can be misleading for undirected mixing matrices.
# contrast with the original network crosstab:
mixingmatrix(mesa, "Grade")
    7  8  9 10 11 12
7  75  0  0  1  1  1
8   0 33  2  4  2  1
9   0  2 23  7  6  4
10  1  4  7  9  1  5
11  1  2  6  1 17  5
12  1  1  4  5  5  6
Note:  Marginal totals can be misleading for undirected mixing matrices.

You can also use this function to calculate the row probabilities of the mixing matrix:

# to get the row conditional probabilities:

round(mixingmatrix(mesa.ego, "Grade", rowprob=T), 2)
      7    8    9   10   11   12
7  0.98 0.00 0.00 0.01 0.01 0.01
8  0.00 0.88 0.03 0.05 0.03 0.01
9  0.00 0.03 0.71 0.11 0.09 0.06
10 0.03 0.11 0.19 0.50 0.03 0.14
11 0.02 0.04 0.12 0.02 0.69 0.10
12 0.04 0.04 0.14 0.18 0.18 0.43
Note:  Marginal totals can be misleading for undirected mixing matrices.
round(mixingmatrix(mesa.ego, "Race", rowprob=T), 2)
      Black Hisp NatAm Other White
Black  0.00 0.31  0.50  0.00  0.19
Hisp   0.04 0.60  0.23  0.01  0.12
NatAm  0.08 0.26  0.59  0.00  0.06
Other  0.00 1.00  0.00  0.00  0.00
White  0.11 0.49  0.22  0.00  0.18
Note:  Marginal totals can be misleading for undirected mixing matrices.

We can also examine the observed number of ties, mean degree, and degree distributions.

# first, using the original network
network.edgecount(faux.mesa.high)
[1] 203
# compare to `egor`
# note that the ties are double counted, so we need to divide by 2.
nrow(mesa.ego$alter)/2
[1] 203
# mean degree -- here we want to count each "stub", so we don't divide by 2
nrow(mesa.ego$alter)/nrow(mesa.ego$ego)
[1] 1.980488
# overall degree distribution
summary(mesa.ego ~ degree(0:20))
         scaled mean     SE
degree0           57 6.4306
degree1           51 6.2048
degree2           30 5.0730
degree3           28 4.9289
degree4           18 4.0620
degree5           10 3.0917
degree6            2 1.4107
degree7            4 1.9852
degree8            1 1.0000
degree9            2 1.4107
degree10           1 1.0000
degree11           0 0.0000
degree12           0 0.0000
degree13           1 1.0000
degree14           0 0.0000
degree15           0 0.0000
degree16           0 0.0000
degree17           0 0.0000
degree18           0 0.0000
degree19           0 0.0000
degree20           0 0.0000
# and stratified by sex
summary(mesa.ego ~ degree(0:13, by="Sex"))
           scaled mean     SE
deg0.SexF           23 4.5299
deg1.SexF           23 4.5299
deg2.SexF           10 3.0917
deg3.SexF           17 3.9581
deg4.SexF           12 3.3694
deg5.SexF            7 2.6066
deg6.SexF            1 1.0000
deg7.SexF            3 1.7235
deg8.SexF            1 1.0000
deg9.SexF            0 0.0000
deg10.SexF           1 1.0000
deg11.SexF           0 0.0000
deg12.SexF           0 0.0000
deg13.SexF           1 1.0000
deg0.SexM           34 5.3385
deg1.SexM           28 4.9289
deg2.SexM           20 4.2588
deg3.SexM           11 3.2343
deg4.SexM            6 2.4193
deg5.SexM            3 1.7235
deg6.SexM            1 1.0000
deg7.SexM            1 1.0000
deg8.SexM            0 0.0000
deg9.SexM            2 1.4107
deg10.SexM           0 0.0000
deg11.SexM           0 0.0000
deg12.SexM           0 0.0000
deg13.SexM           0 0.0000

For the degree distribution we used the summary function in the same way that we would use it in ergm with a network object. But the summary function also has an egor-specific argument, scaleto, that allows you to scale the summary statistics to a network of arbitrary size. So, for example, we can obtain the degree distribution scaled to a network of size 100,000, or to a network that is 100 times larger than the sample.

summary(mesa.ego ~ degree(0:10), scaleto=100000)
         scaled mean      SE
degree0     27804.88 3136.89
degree1     24878.05 3026.75
degree2     14634.15 2474.63
degree3     13658.54 2404.34
degree4      8780.49 1981.47
degree5      4878.05 1508.16
degree6       975.61  688.17
degree7      1951.22  968.41
degree8       487.80  487.80
degree9       975.61  688.17
degree10      487.80  487.80
summary(mesa.ego ~ degree(0:10), scaleto=nrow(mesa.ego$ego)*100)
         scaled mean     SE
degree0         5700 643.06
degree1         5100 620.48
degree2         3000 507.30
degree3         2800 492.89
degree4         1800 406.20
degree5         1000 309.17
degree6          200 141.07
degree7          400 198.52
degree8          100 100.00
degree9          200 141.07
degree10         100 100.00

Note that the first scaling results in fractional numbers of nodes at each degree, because the proportion at each degree level does not scale to an integer for this population size. Again, this is not a problem for estimation, but one should be careful with descriptive statistics that expect integer values. The second scaling does result in integer counts because it is a multiple of the sample size.
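The fractional counts are just the observed proportions rescaled (a sketch of the arithmetic, using the degree-0 row as an example):

```r
# 57 of the 205 egos are isolates; scale that proportion to each size
57 / 205 * 100000       # 27804.88 -- fractional, matching the table above
57 / 205 * (205 * 100)  # 5700 -- an integer, since 20500 is a multiple of 205
```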

We can plot the degree distribution using another egor-specific function: degreedist. As with the mixingmatrix function, this can return either the counts or the proportions at each degree.

To get the frequency counts:

# degreedist(mesa.ego, plot=TRUE, prob=FALSE) # bug statnet/ergm.ego#82.
degreedist(mesa.ego, by="Sex", plot=TRUE, prob=FALSE)

To get the proportion at each degree level:

degreedist(mesa.ego, by="Sex", plot=TRUE, prob=TRUE)

The degreedist method for egor objects also has an argument that lets you overplot the expected degree distribution for a Bernoulli random graph with the same expected density. This is the plot equivalent of a CUG test (“conditional uniform graph”).

set.seed(1)
degreedist(mesa.ego, brg=TRUE)

degreedist(mesa.ego, by="Sex", prob=TRUE, brg=TRUE)

The brg overplot is based on 50 simulations of a Bernoulli random graph with the same number of nodes and expected density, implemented using an ergm.ego simulation from an edges-only model with \(\theta=\mbox{logit}(\mbox{probability of a tie})\) from the observed data. The overplot shows the mean and 2 standard deviations obtained for each degree value from the 50 simulations. Note that the brg automatically scales to the proportions when prob=T.
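The reference value of \(\theta\) can be sketched from the observed data as follows (an illustration of the idea, not the package’s internal code):

```r
# expected density implied by the observed mean degree
mean.deg <- nrow(mesa.ego$alter) / nrow(mesa.ego$ego)  # about 1.98
p.tie    <- mean.deg / (nrow(mesa.ego$ego) - 1)        # probability of a tie
theta    <- qlogis(p.tie)                              # logit(probability of a tie)
theta
```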

What does the plot suggest about the distribution of degree in this network?

6.3 Model Fitting

From the exploratory work, several characteristics emerged that we might want to capture in a model:

  • Variation in mean degree by nodal attributes (race, sex and grade)

  • Patterns of mixing by race, sex and grade

  • The degree distribution (in particular, the disproportionate isolate fraction)

We can use ergm.ego to fit a sequence of nested models, both to estimate the parameters associated with these statistics and to test their significance. We can diagnose both the estimation process (to verify convergence and good mixing in the MCMC sampler) and the fit of the model to the data. In both cases, we will use functionality that will be familiar to ergm users: MCMC diagnostics and GOF.

6.3.1 Preliminaries

One thing that is different from a standard ergm call is that we need to specify the scaling, both for the pseudo-population (\(N'\)) that will be used to set the target statistics during estimation, and for the population (N) size that the final rescaled coefficients will represent. Recall,

Population size \((|N|)\)
If we wanted to rescale the final estimates to reproduce the expected values in the true population network, we would need to know \(N\), the size of the population. In general, we do not know this, so the most useful scaling is a per capita scaling, which can easily be transformed into any value of \(|N|\) later (for simulation, or other purposes). For per capita scaling, \(|N|=1\). In ergm.ego, this is controlled by the popsize top-level argument.
Pseudo-population \((|N'|)\)
Recall from above that the target statistics can be scaled to an arbitrary population size for estimation. What size should we use? Several principles guide this choice:
  • Bias – In general, estimation bias is reduced the closer \(N'\) is to \(N\) (usually larger).

  • Computing time – The larger the pseudo-population, the longer the estimation takes.

  • Sample weights – In general, it is good practice for the smallest sample weight to produce at least 1 observation in the pseudo-population network, though more is better.

This leads to different guidelines for data with and without weights.

Simulation studies in Krivitsky & Morris (2017) suggest that a good rule of thumb is a minimum pseudo-population size of 1,000 for unweighted data. For weighted data, the pseudo-population size should be at least 1 * sampleSize/smallestWeight (or 3 * sampleSize/smallestWeight to be safe), or 1,000, whichever is larger.
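Applying the rule of thumb, with hypothetical values for the sample size and smallest weight:

```r
sample.size     <- 500   # hypothetical number of egos
smallest.weight <- 0.1   # hypothetical smallest sampling weight
# conservative minimum pseudo-population size
max(1000, 3 * sample.size / smallest.weight)
```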

In ergm.ego, \(|N'|\) is controlled by a combination of four factors:

  • the sample size \(|S|\) (i.e., number of egos),
  • the top-level argument popsize (\(|N|\) or 1) (default: 1),
  • the control.ergm.ego control parameter ppopsize (default: "auto"),
  • the control.ergm.ego control parameter ppopsize.mul (default: 1).

If ppopsize is left at its default ("auto"),

  • If popsize is left at 1, use \(|S|\times\)ppopsize.mul.
  • If popsize is specified, use \(|N|\times\)ppopsize.mul.

You can also force one of these two regimes by setting ppopsize to "samp" or "pop", respectively, or set it to a number to force a particular \(|N'|\) ignoring ppopsize.mul.
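For example, the regimes described above could be selected explicitly (a sketch; argument names as documented in ?control.ergm.ego):

```r
# |N'| = ppopsize.mul x sample size, regardless of popsize
control.ergm.ego(ppopsize = "samp", ppopsize.mul = 5)
# |N'| = ppopsize.mul x popsize
control.ergm.ego(ppopsize = "pop")
# force a particular |N'|, ignoring ppopsize.mul
control.ergm.ego(ppopsize = 2000)
```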

For more information, see

?control.ergm.ego

In both cases, the scaling affects only the estimate of the edges term; the other coefficients are invariant to network size. We demonstrate this below.

6.3.2 Fit a simple model

Let’s start with a simple edges-only model to see what’s the same and what’s different from a call to ergm:

fit.edges <- ergm.ego(mesa.ego ~ edges)
summary(fit.edges)
Call:
ergm.ego(formula = mesa.ego ~ edges)

Monte Carlo Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
offset(netsize.adj) -5.32301    0.00000      0    -Inf   <1e-04 ***
edges                0.69590    0.07717      0   9.018   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 The following terms are fixed by offset and are not estimated:
  offset(netsize.adj) 

This is a model with homogeneous tie probability – a Bernoulli random graph with the mean degree observed in our sampled data. The only difference in the syntax from standard ergm is the function call to ergm.ego. Let’s look under the hood at the components of the fit.edges object:

names(fit.edges)
 [1] "coefficients"     "sample"           "iterations"       "MCMCtheta"       
 [5] "loglikelihood"    "gradient"         "hessian"          "covar"           
 [9] "failure"          "newnetwork"       "coef.init"        "est.cov"         
[13] "coef.hist"        "stats.hist"       "steplen.hist"     "control"         
[17] "etamap"           "MCMCflag"         "nw.stats"         "call"            
[21] "network"          "ergm_version"     "info"             "MPLE_is_MLE"     
[25] "drop"             "offset"           "estimable"        "formula"         
[29] "target.stats"     "target.esteq"     "reference"        "constraints"     
[33] "obs.constraints"  "estimate"         "estimate.desc"    "v"               
[37] "m"                "ergm.formula"     "popnw"            "ergm.offset.coef"
[41] "egor"             "ppopsize"         "popsize"          "netsize.adj"     
[45] "ergm.covar"       "DtDe"             "ergm.call"       
fit.edges$ppopsize
[1] 205
fit.edges$popsize
[1] 1

Many of the elements of the object are the same as you would get from an ergm fit, but the last few elements are unique to ergm.ego. Here you can see the ppopsize – the pseudo-population size used to construct the target statistics, and popsize – the final scaled population size after network size adjustment is applied. The values that were used in the fit were the default values, since we did not specify otherwise. So, ppopsize\(=205\) (the sample size, or number of egos), and popsize\(= 1\), so the scaling returns the per capita estimates from the model parameters.

The summary shows the netsize.adj is \(-5.32301= -\log(205)\).
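This offset can be verified directly:

```r
-log(205)  # minus the log of the pseudo-population size
# [1] -5.32301
```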

The summary function also reports that:

 The following terms are fixed by offset and are not estimated:
  netsize.adj

So what would happen if we fit the model instead with target statistics from a pseudo-population of size 1000? To do this, we explicitly change the value of the ppopsize parameter through the control argument:

summary(ergm.ego(mesa.ego ~ edges, 
                 control = control.ergm.ego(ppopsize=1000)))
Constructing pseudopopulation network.
Note: Constructed network has size 1025, different from requested 1000. Estimation should not be meaningfully affected.
Starting maximum pseudolikelihood estimation (MPLE):
Obtaining the responsible dyads.
Evaluating the predictor and response matrix.
Maximizing the pseudolikelihood.
Finished MPLE.
Starting Monte Carlo maximum likelihood estimation (MCMLE):
Iteration 1 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0019.
Convergence test p-value: 0.0140. Not converged with 99% confidence; increasing sample size.
Iteration 2 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0129.
Convergence test p-value: 0.0001. Converged with 99% confidence.
Finished MCMLE.
This model was fit using MCMC.  To examine model diagnostics and check
for degeneracy, use the mcmc.diagnostics() function.
Call:
ergm.ego(formula = mesa.ego ~ edges, control = control.ergm.ego(ppopsize = 1000))

Monte Carlo Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
offset(netsize.adj) -6.93245    0.00000      0    -Inf   <1e-04 ***
edges                0.69348    0.08023      0   8.644   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 The following terms are fixed by offset and are not estimated:
  offset(netsize.adj) 

Now the netsize.adj value is \(-6.93245 = -\log(1025)\), reflecting the constructed pseudo-population of size 1025 rather than the requested 1000 (see the note in the output above).

Note that the estimated edges coefficient is essentially the same in both models (0.696 vs. 0.693). This is the behavior we expect – the model is returning the same per capita value in both cases; it is just using a different scaling for the target statistics used in the fit. For this simple model, there may not be much difference in the properties of the estimates for these two different pseudo-population sizes.
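As a sanity check, the per capita parameterization can be related back to the observed mean degree (a sketch for the edges-only model, where the total log-odds of a tie is the offset plus the edges coefficient):

```r
# logit(p) = netsize.adj + edges for the Bernoulli model, with N' = 205
theta <- -5.32301 + 0.69590  # values from the first fit above
p.tie <- plogis(theta)       # implied tie probability
p.tie * (205 - 1)            # implied mean degree, close to the observed 1.98
```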

We will examine the impact of modifying the popsize parameter in a later section below.

As the output shows, the model was fit using MCMC. This, too, is different from an edges-only model fit with ergm. For ergm, models with only dyad-independent terms are fit using Newton-Raphson algorithms (the same algorithm used for logistic regression), not MCMC. For ergm.ego, estimation is always based on MCMC, regardless of the terms in the model.

6.3.3 Convergence assessment

Now let’s see what the MCMC diagnostics for this model look like

mcmc.diagnostics(fit.edges, which ="plots")


Note: MCMC diagnostics shown here are from the last round of
  simulation, prior to computation of final parameter estimates.
  Because the final estimates are refinements of those used for this
  simulation run, these diagnostics may understate model performance.
  To directly assess the performance of the final model on in-model
  statistics, please use the GOF command: gof(ergmFitObject,
  GOF=~model).

The diagnostics show good mixing, and the distribution of the sample statistic deviations from the targets (on the right panel) in the last iteration is well centered around zero. To verify that simulations from the fitted model match the target stats, we can use the gof function with the argument “model”.

plot(gof(fit.edges, GOF="model"))

Networks simulated from the model appear to be nicely centered around the values of the observed edges statistic.

6.3.4 GOF assessment

Finally, we should evaluate the model fit. We can also use gof to do this, by comparing observed statistics that are not in the model, like the full degree distribution, with simulations from the fitted model. This is the same procedure that we use for ergm, but now with a more limited set of observed higher-order statistics to use for assessment.

plot(gof(fit.edges, GOF="degree"))

Here, finally, we see some bad behavior, but this is expected from such a simple model. The GOF plot shows there are almost twice as many isolates in the observed data as would be predicted from a simple edges-only model.
Of course, we knew this from having looked at the degree distribution plots with the Bernoulli random graph overlay.

Ok, so that’s a full cycle of description, estimation, and model assessment.

6.3.5 Improve the fit

Let’s try adding a degree(0) term to see how that changes the degree distribution assessment. Note that in this example, we’re using a shortcut for control.ergm.ego – the snctrl function. The snctrl shortcut can be used in all of the Statnet packages (ergm, tergm, etc.) to specify the controls specific to each type of model.

set.seed(1)
fit.deg0 <- ergm.ego(mesa.ego ~ edges + degree(0), 
                     control = snctrl(ppopsize=1000))
summary(fit.deg0)
Call:
ergm.ego(formula = mesa.ego ~ edges + degree(0), control = snctrl(ppopsize = 1000))

Monte Carlo Maximum Likelihood Results:

                    Estimate Std. Error MCMC % z value Pr(>|z|)    
offset(netsize.adj)  -6.9324     0.0000      0    -Inf   <1e-04 ***
edges                 1.1704     0.1042      0  11.234   <1e-04 ***
degree0               1.4815     0.2592      0   5.716   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 The following terms are fixed by offset and are not estimated:
  offset(netsize.adj) 
mcmc.diagnostics(fit.deg0, which = "plots")


Note: MCMC diagnostics shown here are from the last round of
  simulation, prior to computation of final parameter estimates.
  Because the final estimates are refinements of those used for this
  simulation run, these diagnostics may understate model performance.
  To directly assess the performance of the final model on in-model
  statistics, please use the GOF command: gof(ergmFitObject,
  GOF=~model).
plot(gof(fit.deg0, GOF="model"))

plot(gof(fit.deg0, GOF="degree"))

So, we’ve now fit the isolates exactly, and the overall fit is better, but the deviations suggest there are more nodes with just one tie than would be expected, given the mean degree and the number of isolates.

And just to round things off, let’s fit a relatively large model. Here we’ll specify the omitted category for Race as the largest group.

fit.full <- ergm.ego(mesa.ego ~ edges + degree(0:1) 
                     + nodefactor("Sex")
                     + nodefactor("Race", levels = -LARGEST)
                     + nodefactor("Grade")
                     + nodematch("Sex") 
                     + nodematch("Race") 
                     + nodematch("Grade"))
summary(fit.full)
Call:
ergm.ego(formula = mesa.ego ~ edges + degree(0:1) + nodefactor("Sex") + 
    nodefactor("Race", levels = -LARGEST) + nodefactor("Grade") + 
    nodematch("Sex") + nodematch("Race") + nodematch("Grade"))

Monte Carlo Maximum Likelihood Results:

                      Estimate Std. Error MCMC % z value Pr(>|z|)    
offset(netsize.adj)   -5.32301    0.00000      0    -Inf  < 1e-04 ***
edges                 -1.38926    0.19665      0  -7.065  < 1e-04 ***
degree0                2.09717    0.36081      0   5.812  < 1e-04 ***
degree1                1.00401    0.28150      0   3.567 0.000362 ***
nodefactor.Sex.M      -0.17310    0.06319      0  -2.739 0.006155 ** 
nodefactor.Race.Black  1.20790    0.21176      0   5.704  < 1e-04 ***
nodefactor.Race.NatAm  0.30280    0.05821      0   5.202  < 1e-04 ***
nodefactor.Race.Other -0.90243    0.61221      0  -1.474 0.140466    
nodefactor.Race.White  0.57599    0.13107      0   4.394  < 1e-04 ***
nodefactor.Grade.8     0.14240    0.05373      0   2.650 0.008044 ** 
nodefactor.Grade.9     0.14073    0.04792      0   2.937 0.003319 ** 
nodefactor.Grade.10    0.31597    0.07197      0   4.391  < 1e-04 ***
nodefactor.Grade.11    0.40663    0.05753      0   7.068  < 1e-04 ***
nodefactor.Grade.12    0.77803    0.07399      0  10.515  < 1e-04 ***
nodematch.Sex          0.64352    0.12148      0   5.297  < 1e-04 ***
nodematch.Race         0.83975    0.12813      0   6.554  < 1e-04 ***
nodematch.Grade        3.05340    0.15340      0  19.904  < 1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 The following terms are fixed by offset and are not estimated:
  offset(netsize.adj) 
mcmc.diagnostics(fit.full, which = "plots")


Note: To save space, only one in every 2 iterations of the MCMC sample
  used for estimation was stored for diagnostics. Sample size per chain
  was originally around 5266 with thinning interval 2048.

Note: MCMC diagnostics shown here are from the last round of
  simulation, prior to computation of final parameter estimates.
  Because the final estimates are refinements of those used for this
  simulation run, these diagnostics may understate model performance.
  To directly assess the performance of the final model on in-model
  statistics, please use the GOF command: gof(ergmFitObject,
  GOF=~model).
plot(gof(fit.full, GOF="model"))

plot(gof(fit.full, GOF="degree"))

In general the model diagnostics look good. If this were a genuine sample of 205 students from a larger school, we could infer the following:

  • there are many more isolates, and more degree 1 nodes than expected by chance;

  • there are significant differences in mean degree by race, with the largest group (Hispanics, the reference category) nominating fewer friends than most of the other groups;

  • 7th graders nominate fewer friends than all other grades;

  • there are strong and significant homophily effects, for all three attributes.

It is possible to simulate complete networks from this ergm.ego fit object – just as we would from an ergm fit object:

sim.full <- simulate(fit.full)
summary(mesa.ego ~ edges + degree(0:1)
                      + nodefactor("Sex")
                      + nodefactor("Race", levels = -LARGEST)
                      + nodefactor("Grade")
                      + nodematch("Sex") + nodematch("Race") + nodematch("Grade"))
                      scaled mean      SE
edges                         203 15.2022
degree0                        57  6.4306
degree1                        51  6.2048
nodefactor.Sex.M              171 17.1990
nodefactor.Race.Black          26  6.5507
nodefactor.Race.NatAm         156 19.7787
nodefactor.Race.Other           1  0.7054
nodefactor.Race.White          45  9.1943
nodefactor.Grade.8             75 17.3212
nodefactor.Grade.9             65 11.2475
nodefactor.Grade.10            36  8.0931
nodefactor.Grade.11            49 11.4861
nodefactor.Grade.12            28  7.2756
nodematch.Sex                 132 12.1128
nodematch.Race                103 10.0369
nodematch.Grade               163 13.6309
summary(sim.full ~ edges + degree(0:1)
                      + nodefactor("Sex")
                      + nodefactor("Race", levels = -LARGEST)
                      + nodefactor("Grade")
                      + nodematch("Sex") + nodematch("Race") + nodematch("Grade"))
                edges               degree0               degree1 
                  167                    54                    66 
     nodefactor.Sex.M nodefactor.Race.Black nodefactor.Race.NatAm 
                  130                    12                   150 
nodefactor.Race.Other nodefactor.Race.White    nodefactor.Grade.8 
                    3                    32                    63 
   nodefactor.Grade.9   nodefactor.Grade.10   nodefactor.Grade.11 
                   60                    29                    36 
  nodefactor.Grade.12         nodematch.Sex        nodematch.Race 
                   22                   113                    84 
      nodematch.Grade 
                  138 
plot(sim.full, vertex.col="Grade")
legend('bottomleft',fill=7:12,legend=paste('Grade',7:12),cex=0.75)

(Note that we have implicitly used simulate already – it is the basis of the GOF results.)

We can use network size invariance to simulate networks of a different size, although one has to be careful when the observed statistics are too small to be reliable (e.g., the nodefactor.Race.Other statistic here):

sim.full2 <- simulate(fit.full, popsize=network.size(mesa)*2)
summary(mesa~edges + degree(0:1)
                      + nodefactor("Sex")
                      + nodefactor("Race", levels = -LARGEST)
                      + nodefactor("Grade")
                      + nodematch("Sex") + nodematch("Race") + nodematch("Grade"))*2
                edges               degree0               degree1 
                  406                   114                   102 
     nodefactor.Sex.M nodefactor.Race.Black nodefactor.Race.NatAm 
                  342                    52                   312 
nodefactor.Race.Other nodefactor.Race.White    nodefactor.Grade.8 
                    2                    90                   150 
   nodefactor.Grade.9   nodefactor.Grade.10   nodefactor.Grade.11 
                  130                    72                    98 
  nodefactor.Grade.12         nodematch.Sex        nodematch.Race 
                   56                   264                   206 
      nodematch.Grade 
                  326 
summary(sim.full2~edges + degree(0:1)
                      + nodefactor("Sex")
                      + nodefactor("Race", levels = -LARGEST)
                      + nodefactor("Grade")
                      + nodematch("Sex") + nodematch("Race") + nodematch("Grade"))
                edges               degree0               degree1 
                  405                   120                    90 
     nodefactor.Sex.M nodefactor.Race.Black nodefactor.Race.NatAm 
                  312                    67                   307 
nodefactor.Race.Other nodefactor.Race.White    nodefactor.Grade.8 
                    2                    86                   146 
   nodefactor.Grade.9   nodefactor.Grade.10   nodefactor.Grade.11 
                  125                    74                    95 
  nodefactor.Grade.12         nodematch.Sex        nodematch.Race 
                   57                   279                   206 
      nodematch.Grade 
                  334 

We have only demonstrated the functionality briefly here, but this kind of simulation is a powerful way to diagnose structural properties of the fitted model, and to identify and remedy systematic lack of fit.

We will leave this model here and go on to explore how sampling uncertainty produces the standard errors of our coefficients.

6.4 Parameter recovery and sampling

When we estimate parameters based on sampled data, the sampling uncertainty in our estimates comes from the differences in the observations we draw from sample to sample, and the magnitude of uncertainty is a function of our sample size. This is why we typically see something like \(\sqrt{n}\) in the denominator of the standard error of a sample mean or sample proportion. The same principle holds in the context of egocentric network sampling: the standard errors will depend on the number of egos sampled.
This holds even though we rescale first up to the pseudo-population size and then back down to per capita values: neither rescaling affects the standard errors, which depend only on the size of the egocentric sample.
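The \(\sqrt{n}\) scaling is easy to verify by simulation. A minimal base-R sketch (toy data, not part of the tutorial):

```r
# Standard error of a sample mean, estimated by simulation: quadrupling the
# sample size should roughly halve the SE (1/sqrt(100) = 0.10, 1/sqrt(400) = 0.05).
set.seed(42)
se.of.mean <- function(n, reps = 2000) sd(replicate(reps, mean(rnorm(n))))
round(c(n100 = se.of.mean(100), n400 = se.of.mean(400)), 3)
```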

So let’s use the sample function from ergm.ego to demonstrate this effect. For this section we will use the larger built-in network, faux.magnolia.high.

data(faux.magnolia.high)
faux.magnolia.high -> fmh
N <- network.size(fmh)

Let’s start by fitting an ERGM to the complete network, and looking at the coefficients:

fit.ergm <- ergm(fmh ~ degree(0:3) 
                 + nodefactor("Race", levels=TRUE) + nodematch("Race")
                 + nodefactor("Sex") + nodematch("Sex") 
                 + absdiff("Grade"))
round(coef(fit.ergm), 3)
              degree0               degree1               degree2 
                0.954                 0.274                 0.034 
              degree3 nodefactor.Race.Asian nodefactor.Race.Black 
               -0.240                -2.476                -3.045 
 nodefactor.Race.Hisp nodefactor.Race.NatAm nodefactor.Race.Other 
               -2.693                -2.263                -2.634 
nodefactor.Race.White        nodematch.Race      nodefactor.Sex.M 
               -3.385                 1.679                -0.087 
        nodematch.Sex         absdiff.Grade 
                0.860                -2.116 

Egocentric census

Now, suppose we only observe an egocentric view of the data – as an egocentric census. With an egocentric census, it’s as though we give a survey to all of the students. Each student nominates her friends but does not report their names; she reports only each friend’s sex, race, and grade. How does the fit from ergm.ego to this egocentric census compare to the complete-network ergm estimates?

fmh.ego <- as.egor(fmh)
head(fmh.ego)
# EGO data (active): 3 × 5
  .egoID Grade Race  Sex   vertex.names
*  <int> <dbl> <chr> <chr> <chr>       
1      1     9 Black F     1           
2      2    10 Black M     2           
3      3    12 Black F     3           
# ALTER data: 6 × 6
  .altID .egoID Grade Race  Sex   vertex.names
*  <int>  <int> <dbl> <chr> <chr> <chr>       
1    669      1     9 Black F     669         
2    963      2    10 White F     963         
3    912      2    10 White M     912         
# ℹ 3 more rows
# AATIE data: 0 × 3
# ℹ 3 variables: .egoID <int>, .srcID <int>, .tgtID <int>
egofit <- ergm.ego(fmh.ego ~ degree(0:3) 
                   + nodefactor("Race", levels=TRUE) + nodematch("Race")
                   + nodefactor("Sex") + nodematch("Sex") 
                   + absdiff("Grade"), popsize=N,
                  control = snctrl(ppopsize=N))

# A convenience function.
model.se <- function(fit) sqrt(diag(vcov(fit)))

# Parameters recovered:
coef.compare <- data.frame(
  "NW est" = coef(fit.ergm), 
  "Ego Cen est" = coef(egofit)[-1],
  "diff Z" = (coef(fit.ergm)-coef(egofit)[-1])/model.se(egofit)[-1])

round(coef.compare, 3)
                      NW.est Ego.Cen.est diff.Z
degree0                0.954       0.939  0.035
degree1                0.274       0.262  0.034
degree2                0.034       0.032  0.008
degree3               -0.240      -0.243  0.015
nodefactor.Race.Asian -2.476      -2.485  0.065
nodefactor.Race.Black -3.045      -3.048  0.028
nodefactor.Race.Hisp  -2.693      -2.708  0.125
nodefactor.Race.NatAm -2.263      -2.282  0.136
nodefactor.Race.Other -2.634      -2.648  0.049
nodefactor.Race.White -3.385      -3.387  0.025
nodematch.Race         1.679       1.677  0.028
nodefactor.Sex.M      -0.087      -0.087  0.033
nodematch.Sex          0.860       0.860  0.006
absdiff.Grade         -2.116      -2.111 -0.080

Again, we can diagnose the fitted egocentric model for proper convergence. (We include the code but leave this as an exercise for you.)

# MCMC diagnostics. 
mcmc.diagnostics(egofit, which="plots")

And check whether the model converged to the right statistics:

plot(gof(egofit, GOF="model"))

Now let’s check whether the fitted model can be used to reconstruct the degree distribution.

plot(gof(egofit, GOF="degree"))


Egocentric Sample: Same size

What if we only had an equally large sample, instead of an egocentric census? Here, we sample N students with replacement.

set.seed(1)
fmh.egosampN <- sample(fmh.ego, N, replace=TRUE)

egofitN <- ergm.ego(fmh.egosampN ~ degree(0:3) 
                    + nodefactor("Race", levels=TRUE) + nodematch("Race") 
                    + nodefactor("Sex") + nodematch("Sex")
                    + absdiff("Grade"),
                    popsize=N)
Constructing pseudopopulation network.
Unable to match target stats. Using MCMLE estimation.
Starting maximum pseudolikelihood estimation (MPLE):
Obtaining the responsible dyads.
Evaluating the predictor and response matrix.
Maximizing the pseudolikelihood.
Finished MPLE.
Starting Monte Carlo maximum likelihood estimation (MCMLE):
Iteration 1 of at most 60:
1 Optimizing with step length 0.2525.
The log-likelihood improved by 1.9777.
Estimating equations are not within tolerance region.
Iteration 2 of at most 60:
1 Optimizing with step length 0.4057.
The log-likelihood improved by 2.1715.
Estimating equations are not within tolerance region.
Iteration 3 of at most 60:
1 Optimizing with step length 0.6318.
The log-likelihood improved by 2.0157.
Estimating equations are not within tolerance region.
Iteration 4 of at most 60:
1 Optimizing with step length 0.9426.
The log-likelihood improved by 1.7975.
Estimating equations are not within tolerance region.
Iteration 5 of at most 60:
1 Optimizing with step length 0.8976.
The log-likelihood improved by 1.8591.
Estimating equations are not within tolerance region.
Iteration 6 of at most 60:
1 Optimizing with step length 0.6572.
The log-likelihood improved by 2.1477.
Estimating equations are not within tolerance region.
Iteration 7 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.9768.
Estimating equations are not within tolerance region.
Iteration 8 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.7612.
Estimating equations are not within tolerance region.
Iteration 9 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.6097.
Estimating equations are not within tolerance region.
Estimating equations did not move closer to tolerance region more than 1 time(s) in 4 steps; increasing sample size.
Iteration 10 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1171.
Estimating equations are not within tolerance region.
Iteration 11 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1323.
Estimating equations are not within tolerance region.
Iteration 12 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1681.
Estimating equations are not within tolerance region.
Iteration 13 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1115.
Estimating equations are not within tolerance region.
Iteration 14 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.2142.
Estimating equations are not within tolerance region.
Estimating equations did not move closer to tolerance region more than 1 time(s) in 4 steps; increasing sample size.
Iteration 15 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1448.
Estimating equations are not within tolerance region.
Iteration 16 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.2262.
Estimating equations are not within tolerance region.
Iteration 17 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1817.
Estimating equations are not within tolerance region.
Iteration 18 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0152.
Convergence test p-value: 0.8234. Not converged with 99% confidence; increasing sample size.
Iteration 19 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0651.
Convergence test p-value: 0.7643. Not converged with 99% confidence; increasing sample size.
Iteration 20 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0410.
Convergence test p-value: 0.0149. Not converged with 99% confidence; increasing sample size.
Iteration 21 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0150.
Convergence test p-value: 0.0001. Converged with 99% confidence.
Finished MCMLE.
This model was fit using MCMC.  To examine model diagnostics and check
for degeneracy, use the mcmc.diagnostics() function.
# compare the coef
coef.compare <- data.frame(
  "NW est" = coef(fit.ergm), 
  "Ego SampN est" = coef(egofitN)[-1],
  "diff Z" = (coef(fit.ergm)-coef(egofitN)[-1])/model.se(egofitN)[-1])

round(coef.compare, 3)
                      NW.est Ego.SampN.est diff.Z
degree0                0.954         1.388 -0.933
degree1                0.274         0.516 -0.661
degree2                0.034         0.363 -1.184
degree3               -0.240        -0.021 -1.068
nodefactor.Race.Asian -2.476        -2.397 -0.516
nodefactor.Race.Black -3.045        -2.911 -1.206
nodefactor.Race.Hisp  -2.693        -2.532 -1.294
nodefactor.Race.NatAm -2.263        -2.112 -1.136
nodefactor.Race.Other -2.634        -2.623 -0.034
nodefactor.Race.White -3.385        -3.275 -1.037
nodematch.Race         1.679         1.613  0.812
nodefactor.Sex.M      -0.087        -0.142  2.012
nodematch.Sex          0.860         0.883 -0.407
absdiff.Grade         -2.116        -2.023 -1.353
# compare the s.e.'s
se.compare <- data.frame(
  "NW SE" = model.se(fit.ergm), 
  "Ego census SE" =model.se(egofit)[-1], 
  "Ego SampN SE" = model.se(egofitN)[-1])

round(se.compare, 3)
                      NW.SE Ego.census.SE Ego.SampN.SE
degree0               0.462         0.430        0.464
degree1               0.365         0.345        0.367
degree2               0.277         0.261        0.277
degree3               0.198         0.193        0.204
nodefactor.Race.Asian 0.150         0.144        0.152
nodefactor.Race.Black 0.115         0.102        0.112
nodefactor.Race.Hisp  0.144         0.121        0.124
nodefactor.Race.NatAm 0.165         0.141        0.133
nodefactor.Race.Other 0.402         0.292        0.335
nodefactor.Race.White 0.111         0.102        0.105
nodematch.Race        0.103         0.080        0.081
nodefactor.Sex.M      0.032         0.029        0.028
nodematch.Sex         0.070         0.054        0.056
absdiff.Grade         0.072         0.068        0.069

Egocentric Sample: Smaller sample

What if we have a smaller sample? If we have a sample of \(N/4=365\) students, how will our standard errors be affected?

set.seed(0) # Some samples have different sets of alter levels from ego levels.

fmh.egosampN4 <- sample(fmh.ego, round(N/4), replace=TRUE)

egofitN4 <- ergm.ego(fmh.egosampN4 ~ degree(0:3) 
                    + nodefactor("Race", levels=TRUE) + nodematch("Race") 
                    + nodefactor("Sex") + nodematch("Sex")
                    + absdiff("Grade"),
                    popsize=N)
Constructing pseudopopulation network.
Note: Constructed network has size 1460, different from requested 1461. Estimation should not be meaningfully affected.
Starting maximum pseudolikelihood estimation (MPLE):
Obtaining the responsible dyads.
Evaluating the predictor and response matrix.
Maximizing the pseudolikelihood.
Finished MPLE.
Starting Monte Carlo maximum likelihood estimation (MCMLE):
Iteration 1 of at most 60:
1 Optimizing with step length 0.2258.
The log-likelihood improved by 2.0248.
Estimating equations are not within tolerance region.
Iteration 2 of at most 60:
1 Optimizing with step length 0.3843.
The log-likelihood improved by 2.0104.
Estimating equations are not within tolerance region.
Iteration 3 of at most 60:
1 Optimizing with step length 0.8968.
The log-likelihood improved by 2.6821.
Estimating equations are not within tolerance region.
Iteration 4 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.5568.
Estimating equations are not within tolerance region.
Iteration 5 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.2369.
Estimating equations are not within tolerance region.
Iteration 6 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.2663.
Estimating equations are not within tolerance region.
Iteration 7 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1536.
Estimating equations are not within tolerance region.
Iteration 8 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.8350.
Estimating equations are not within tolerance region.
Iteration 9 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.9888.
Estimating equations are not within tolerance region.
Estimating equations did not move closer to tolerance region more than 1 time(s) in 4 steps; increasing sample size.
Iteration 10 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.9633.
Estimating equations are not within tolerance region.
Iteration 11 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0807.
Convergence test p-value: 0.8179. Not converged with 99% confidence; increasing sample size.
Iteration 12 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.1154.
Estimating equations are not within tolerance region.
Estimating equations did not move closer to tolerance region more than 1 time(s) in 4 steps; increasing sample size.
Iteration 13 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0990.
Convergence test p-value: 0.8092. Not converged with 99% confidence; increasing sample size.
Iteration 14 of at most 60:
1 Optimizing with step length 1.0000.
The log-likelihood improved by 0.0206.
Convergence test p-value: < 0.0001. Converged with 99% confidence.
Finished MCMLE.
This model was fit using MCMC.  To examine model diagnostics and check
for degeneracy, use the mcmc.diagnostics() function.
# compare the coef
coef.compare <- data.frame(
  "NW est" = coef(fit.ergm), 
  "Ego SampN4 est" = coef(egofitN4)[-1],
  "diff Z" = (coef(fit.ergm)-coef(egofitN4)[-1])/model.se(egofitN4)[-1])

round(coef.compare, 3)
                      NW.est Ego.SampN4.est diff.Z
degree0                0.954          0.529  0.458
degree1                0.274         -0.239  0.697
degree2                0.034         -0.041  0.141
degree3               -0.240         -0.363  0.314
nodefactor.Race.Asian -2.476         -2.190 -1.215
nodefactor.Race.Black -3.045         -2.991 -0.237
nodefactor.Race.Hisp  -2.693         -2.808  0.489
nodefactor.Race.NatAm -2.263         -2.498  0.980
nodefactor.Race.Other -2.634         -2.412 -0.455
nodefactor.Race.White -3.385         -3.431  0.219
nodematch.Race         1.679          1.579  0.750
nodefactor.Sex.M      -0.087         -0.192  1.755
nodematch.Sex          0.860          0.889 -0.262
absdiff.Grade         -2.116         -2.078 -0.267
# compare the s.e.'s
se.compare <- data.frame(
  "NW SE" = model.se(fit.ergm), 
  "Ego census SE" =model.se(egofit)[-1], 
  "Ego SampN SE" = model.se(egofitN)[-1],
  "Ego Samp4 SE" = model.se(egofitN4)[-1])

round(se.compare, 3)
                      NW.SE Ego.census.SE Ego.SampN.SE Ego.Samp4.SE
degree0               0.462         0.430        0.464        0.929
degree1               0.365         0.345        0.367        0.736
degree2               0.277         0.261        0.277        0.539
degree3               0.198         0.193        0.204        0.394
nodefactor.Race.Asian 0.150         0.144        0.152        0.236
nodefactor.Race.Black 0.115         0.102        0.112        0.230
nodefactor.Race.Hisp  0.144         0.121        0.124        0.235
nodefactor.Race.NatAm 0.165         0.141        0.133        0.240
nodefactor.Race.Other 0.402         0.292        0.335        0.488
nodefactor.Race.White 0.111         0.102        0.105        0.213
nodematch.Race        0.103         0.080        0.081        0.134
nodefactor.Sex.M      0.032         0.029        0.028        0.060
nodematch.Sex         0.070         0.054        0.056        0.109
absdiff.Grade         0.072         0.068        0.069        0.143

As with ordinary statistics, the standard error is inversely proportional to the square root of the sample size.
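We can check this against the se.compare table above: the ratios of the \(N/4\)-sample SEs to the full-\(N\)-sample SEs (values copied from the printed output) should be roughly \(\sqrt{4}=2\):

```r
# SE ratios: N/4 sample vs. full-N sample (values from the se.compare table).
seN  <- c(0.464, 0.367, 0.277, 0.204, 0.152, 0.112, 0.124, 0.133,
          0.335, 0.105, 0.081, 0.028, 0.056, 0.069)
seN4 <- c(0.929, 0.736, 0.539, 0.394, 0.236, 0.230, 0.235, 0.240,
          0.488, 0.213, 0.134, 0.060, 0.109, 0.143)
round(median(seN4 / seN), 2)  # close to 2, as predicted
```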


7 Package Development

The ergm.ego package is under active development on GitHub at statnet/ergm.ego. This repository is the place to go to report bugs or request features (feature requests accompanied by a pull request are especially appreciated). If you are interested in contributing to the development of ergm.ego, please contact us through the GitHub interface.

Additional functionality is planned in the near future:

  • Support for directed relations.

  • Support for automatic fitting of TERGMs.

  • Support for target statistics distinct from ERGM statistics.

  • Support for degree censoring.

Appendices

A Real world example

Motivation: Analyzing racial disparities in HIV in the US

The work on ergm.ego was originally motivated by a specific question in the field of HIV epidemiology—Does network structure help explain the persistent racial disparities in HIV prevalence in the United States?

An African American today is 10 times more likely than a white American to be living with HIV/AIDS. The disparity begins early in life, persists through to old age, and is evident among all risk groups: heterosexuals, men who have sex with men (MSM), and injection drug users. The disproportionate risks faced by heterosexual African-American women are especially steep. In 2010, an African-American woman was over 40 times more likely to be diagnosed with HIV than a heterosexual white man (Figure 1).

Figure 1

Empirical studies repeatedly find that these disparities cannot be explained by individual behavior, or biological differences.

A growing body of work is therefore focused on the role of the underlying transmission network. This network can channel the spread of infection in the same way that a transportation network channels the flow of traffic, with emergent patterns that reflect the connectivity of the system, rather than the behavior of any particular element.

Descriptive analyses and simulation studies have focused attention on two structural features: homophily and concurrency. Homophily is the strong propensity for within-group partner selection. Concurrency is non-monogamy: having partners that overlap in time, which increases network connectivity by allowing stable connected components larger than dyads (pairs of individuals) to emerge.

The hypothesis is that these two network properties together can produce the sustained HIV/STI prevalence differentials we observe: differences in concurrency between groups are the mechanism that generates the prevalence disparity, while homophily is the mechanism that sustains it.

We will never observe the complete dynamic sexual network that transmits HIV. But ergm.ego allows us to test the network hypothesis with egocentrically sampled data, and we demonstrate that here using data from the 1994 National Health and Social Life Survey. The analysis comes from a recent paper (Krivitsky and Morris 2017).

First, ergm.ego allows us to assess whether empirical patterns of homophily and concurrency are in the predicted directions and statistically significant. We do this in the usual way – comparing sequential model fits with terms that represent the hypotheses of interest, using t-tests for their coefficients. We will discuss these terms in more detail in later sections, but here we test the concurrency effects with “monogamy bias” terms.

Table 1

Result: Yes, the homophily and concurrency effects are in the predicted directions and statistically significant.

Next, we can assess the goodness of fit of each model in the way we usually do in ERGMs, by checking whether the models reproduce observed network properties that are not in the model. We do this here by simulating from each model and comparing the fits to the full observed degree distribution:

Figure 2

Result: Only Model 3 (with both hypothesized network effects) is able to reproduce the observed degree distribution.

The ability to simulate complete networks from the model, however, allows us to do much more–we can now examine the connectivity in the overall network that each of these models would generate. For example, we can examine the component size distributions under each model:

Figure 3

Result: Model 3, with its “monogamy bias,” dramatically reduces the right skew of the component size distribution and places most people in components of size 2, or size 3 if they are in anything larger.

Finally, we can define a measure of “network exposure” that represents the signature feature of a network effect: indirect exposure to HIV via a partner’s behavior, rather than direct exposure via one’s own behavior. One metric for network exposure is the probability of being in a component of size 3 or more. Because this is a node-level metric, we can break it down by race and sex for each of the three models:
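As a toy base-R illustration (not the paper’s code) of how such a node-level exposure metric can be computed, here is the share of nodes in a connected component of size 3 or more, using a made-up edge list and a simple union-find:

```r
# Made-up undirected edge list: components {1,2,3}, {4,5}, {6,7,8,9}.
edges <- rbind(c(1, 2), c(2, 3), c(4, 5), c(6, 7), c(7, 8), c(8, 9))
n <- 9
parent <- seq_len(n)
find <- function(i) { while (parent[i] != i) i <- parent[i]; i }
for (k in seq_len(nrow(edges))) {            # union-find over the edge list
  a <- find(edges[k, 1]); b <- find(edges[k, 2])
  if (a != b) parent[a] <- b
}
root  <- vapply(seq_len(n), find, integer(1))
csize <- table(root)                         # component sizes
mean(csize[as.character(root)] >= 3)         # 7 of 9 nodes are "exposed"
```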

Figure 3

Result: Only model 3 produces a pattern of network exposure that is consistent with the observed disparities in HIV incidence.

ergm.ego provides a powerful analytic framework that uses extremely limited network data and testable models to investigate the unobservable patterns of complete network connectivity that are consistent with the sampled data.

B TERGMs with egocentrically sampled data

The principles of egocentric inference can be extended to temporal ERGMs (TERGMs). While we will not cover that in this workshop, an example can be found in another paper that sought to evaluate the network hypothesis for racial disparities in HIV in the US (Morris et al. 2009). In that paper, egocentric data from the National Longitudinal Survey of Adolescent Health (AddHealth) were analyzed, and an example of the resulting dynamic complete network simulation (on 10,000 nodes) can be found in this “network movie”.

The movie below is another simpler example – an epidemic spreading on a small dynamic contact network that is simulated with a STERGM estimated from egocentrically sampled network data. The movie was produced by the R packages EpiModel and ndtv, which are based on the Statnet tools.

C Formal definitions of egocentric statistics

C.1 Notation

We’ll need some notation for this (sorry, and a warning that it will get hairier).

C.1.1 Population network

Parameter Meaning
\(N\) the population being studied: a very large, but finite, set of actors whose relations are of interest
\(x _ i\) attribute (e.g., age, sex, race) vector of actor \(i \in N\)
\(x_N\) (or just \(x\), when there is no ambiguity) the attributes of actors in \(N\)
\(\mathbb{Y}(N)\) the set of dyads (potential ties) in an undirected network of actors in \(N\)
\(y\subseteq \mathbb{Y}(N)\) the population network: a fixed but unknown set of relationships of interest. In particular,
\(y_{ij}\) an indicator function of whether a tie between \(i\) and \(j\) is present in \(y\)
\(y _ i=\{j\in N: y _ {ij}=1\}\) the set of \(i\)’s network neighbors.

C.1.2 Egocentric sample

Parameter Meaning
\(e_{N}\) the egocentric census, the information retained by the minimal egocentric sampling design when all nodes are sampled
\(S\subseteq N\) the set of egos in a sample
\(e_{S}\) the data contained in an egocentric sample
\(e_i\) the “egocentric” view of network \(y\) from the point of view of actor \(i\) (“ego”), with the following parts:
\(e^e_i \equiv x_i\) \(i\)’s own attributes
\(e^a_i \equiv (x_{j})_{j\in y_i}\) an unordered list of attribute vectors of \(i\)’s immediate neighbors (“alters”), but not their identities (indices in \(N\))
\(e^e_{i,k}\equiv x_{i,k}\) the \(k\)th attribute/covariate observed on ego \(i\)
\(e^a_{i,k}\equiv( x_{j,k})_{j\in y_i}\) the \(k\)th attribute/covariate observed on \(i\)’s alters

C.2 Egocentric Statistics

We call a network statistic \(g_{k}(\cdot,\cdot)\) egocentric if it can be expressed as \[ g_{k}(y,x)\equiv \textstyle\sum_{i\in N} h_{k}(e_i) \] for some function \(h_{k}(\cdot)\) of egocentric information associated with a single actor.
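For example, the edge count is egocentric with \(h_k(e_i)=\lvert e^a_i\rvert/2\): each ego reports only how many alters it has, and each tie is counted once from each end. A toy base-R check (made-up adjacency matrix):

```r
# |y| recovered from ego reports alone: h(e_i) = |e^a_i| / 2.
adj <- matrix(0, 4, 4)
adj[cbind(c(1, 1, 2, 3), c(2, 3, 3, 4))] <- 1
adj <- adj + t(adj)            # undirected: symmetrize
sum(adj) / 2                   # network-level edge count: 4
sum(rowSums(adj) / 2)          # egocentric reconstruction: 4
```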

The space of egocentric statistics includes dyadic-independent statistics that can be expressed in the general form of \[ g_{k}(y,x)=\sum_{ij\in y} f_k(x_i,x_j) \] for some symmetric function \(f_k(\cdot,\cdot)\) of two actors’ attributes; and some dyadic-dependent statistics that can be expressed as \[ g_{k}(y,x)=\sum_{i\in N} f_k ({x_{i},(x_j)_{j\in y_i}}) \] for some function \(f_k(\cdot,\dotsb)\) of the attributes of an actor and their network neighbors.

The statistics that are identifiable in an egocentric sample depend on the specific egocentric study design.

Basic (minimal) egocentric design (alter attributes only)
  • Nodal Covariate/Factor effects
  • Homophily
  • Degree distribution
With ego reports of alter degree (the number of alter’s ties)
  • Degree assortativity
With alter-alter ties
  • Triadic closure (transitive/cyclical ties, triangles)
  • 4-cycles (possibly)
Not Egocentric for other reasons, but estimable
  • Mean degree (\(g_{k}(y,x)=2|y|/|N|\)): \(e _ i\) doesn’t know how big the network is

The table below (from Krivitsky & Morris 2017) shows some examples of egocentric statistics, and gives their representations in terms of \(h_{k}(\cdot)\).

Examples of egocentric statistics for undirected networks, reproduced from Krivitsky and Morris (2017). \(x _ {i,k}\) may be a dummy variable indicating \(i\)’s membership in a particular exogenously defined group. \(h_{k}(e_i)\) that sum over ties are halved because each tie is observed egocentrically twice: once at each end.
Statistic \(g_{k}( y,x)\) \(h _ {k}(e_i)\)
General sum over ties \(\sum _ {(i,j)\in y} f _ k(x _ i,x _ j)\) \(\frac{1}{2}\sum _ {j'\in e^\text{a} _ i} f _ k\big(e^\text{e}_i,e^\text{a}_{i,j'}\big)\)
Number of ties in the network \(\lvert y \rvert\equiv \sum _ {(i,j) \in y} 1\) \(\frac{1}{2}\lvert e^\text{a}_{i}\rvert\)
weighted by actor covariate \(x _ {i,k}\) \(\sum _ {(i,j) \in y} (x _ {i,k}+x _ {j,k})\) \(\frac{1}{2} \big(e^\text{e}_{i,k} \lvert e^\text{a}_{i}\rvert + \sum _ {j'\in e^\text{a} _ i} e^\text{a}_{i,j',k} \big)\)
weighted by difference in \(x _ {i,k}\) \(\sum _ {(i,j) \in y} \lvert x _ {i,k}-x _ {j,k}\rvert\) \(\frac{1}{2}\sum _ {j'\in e^\text{a} _ i} \lvert e^\text{e}_{i,k}-e^\text{a}_{i,j',k}\rvert\)
within groups identified by \(x _ {i,k}\) \(\sum _ {(i,j) \in y} 1_{x _ {i,k}=x _ {j,k}}\) \(\frac{1}{2}\sum _ {j'\in e^\text{a} _ i} 1_{ e^\text{e}_{i,k}= e^\text{a}_{i,j',k}}\)
General sum over actors \(\sum _ {i\in N} f _ k\big\{x _ {i},(x _ j) _ {j\in y_{i}}\big\}\) \(f _ k\big(e^\text{e}_i,e^\text{a}_{i}\big)\)
Number of actors with \(d\) neighbors \(\sum _ {i\in N} 1_{\lvert y_{i}\rvert=d}\) \(1_{\lvert e^\text{a}_{i}\rvert=d}\)
weighted by actor covariate \(x _ {i,k}\) \(\sum _ {i\in N} x _ {i,k} 1_{\lvert y_{i}\rvert=d}\) \(e^\text{e}_{i,k}1_{\lvert e^\text{a}_{i}\rvert=d}\)
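The halving in the \(h_{k}(e_i)\) column can be checked directly: summing the halved ego-side counts over all egos recovers the network-level statistic exactly. A toy base-R sketch for the within-group (nodematch-type) row, with made-up attributes and ties:

```r
# Within-group tie count, computed two ways on a random toy network.
set.seed(1)
n <- 20
x <- sample(c("A", "B"), n, replace = TRUE)      # a nodal attribute x_{i,k}
adj <- matrix(rbinom(n * n, 1, 0.2), n, n)
adj[lower.tri(adj, diag = TRUE)] <- 0
adj <- adj + t(adj)                               # symmetric, no self-ties
g.match <- sum(adj[outer(x, x, "==")]) / 2        # network-level sum over ties
h.match <- sum(sapply(seq_len(n), function(i)     # egocentric: half per ego
  sum(adj[i, ] * (x == x[i])) / 2))
c(network = g.match, egocentric = h.match)        # identical
```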

D Session Info

Session info
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.1 (2024-06-14)
 os       Ubuntu 22.04.4 LTS
 system   x86_64, linux-gnu
 ui       X11
 language en
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2024-06-23
 pandoc   3.1.2 @ /usr/bin/ (via rmarkdown)

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package        * version       date (UTC) lib source
 bookdown         0.39          2024-04-15 [1] CRAN (R 4.4.0)
 bslib            0.7.0         2024-03-29 [1] CRAN (R 4.4.0)
 cachem           1.1.0         2024-05-16 [1] CRAN (R 4.4.0)
 cli              3.6.3         2024-06-21 [1] CRAN (R 4.4.1)
 coda             0.19-4.1      2024-01-31 [1] CRAN (R 4.4.0)
 DBI              1.2.3         2024-06-02 [1] CRAN (R 4.4.0)
 deldir           2.0-4         2024-02-28 [1] CRAN (R 4.4.1)
 DEoptimR         1.1-3         2023-10-07 [1] CRAN (R 4.4.0)
 digest           0.6.35        2024-03-11 [1] CRAN (R 4.4.0)
 dplyr          * 1.1.4         2023-11-17 [1] CRAN (R 4.4.0)
 egor           * 1.24.2        2024-06-20 [1] Github (tilltnet/egor@44d87a0)
 ergm           * 4.7-7368      2024-06-20 [1] Github (statnet/ergm@93ecb25)
 ergm.ego       * 1.1-704       2024-06-20 [1] local
 evaluate         0.24.0        2024-06-10 [1] CRAN (R 4.4.0)
 fansi            1.0.6         2023-12-08 [1] CRAN (R 4.4.0)
 fastmap          1.2.0         2024-05-15 [1] CRAN (R 4.4.0)
 generics         0.1.3         2022-07-05 [1] CRAN (R 4.4.0)
 glue             1.7.0         2024-01-09 [1] CRAN (R 4.4.0)
 highr            0.11          2024-05-26 [1] CRAN (R 4.4.0)
 htmltools        0.5.8.1       2024-04-04 [1] CRAN (R 4.4.0)
 igraph           2.0.3         2024-03-13 [1] CRAN (R 4.4.0)
 interp           1.1-6         2024-01-26 [1] CRAN (R 4.4.1)
 jpeg             0.1-10        2022-11-29 [1] CRAN (R 4.4.1)
 jquerylib        0.1.4         2021-04-26 [1] CRAN (R 4.4.0)
 jsonlite         1.8.8         2023-12-04 [1] CRAN (R 4.4.0)
 knitr          * 1.47          2024-05-29 [1] CRAN (R 4.4.0)
 lattice          0.22-6        2024-03-20 [4] CRAN (R 4.4.1)
 latticeExtra     0.6-30        2022-07-04 [1] CRAN (R 4.4.1)
 lifecycle        1.0.4         2023-11-07 [1] CRAN (R 4.4.0)
 lpSolveAPI       5.5.2.0-17.11 2023-11-28 [1] CRAN (R 4.4.0)
 magrittr         2.0.3         2022-03-30 [1] CRAN (R 4.4.0)
 Matrix           1.7-0         2024-04-26 [4] CRAN (R 4.4.0)
 memoise          2.0.1         2021-11-26 [1] CRAN (R 4.4.0)
 mitools          2.4           2019-04-26 [1] CRAN (R 4.4.0)
 network        * 1.18.2        2024-06-20 [1] Github (statnet/network@c1b2084)
 pillar           1.9.0         2023-03-22 [1] CRAN (R 4.4.0)
 pkgconfig        2.0.3         2019-09-22 [1] CRAN (R 4.4.0)
 png              0.1-8         2022-11-29 [1] CRAN (R 4.4.0)
 purrr            1.0.2         2023-08-10 [1] CRAN (R 4.4.0)
 R6               2.5.1         2021-08-19 [1] CRAN (R 4.4.0)
 rbibutils        2.2.16        2023-10-25 [1] CRAN (R 4.4.0)
 RColorBrewer     1.1-3         2022-04-03 [1] CRAN (R 4.4.0)
 Rcpp             1.0.12        2024-01-09 [1] CRAN (R 4.4.0)
 Rdpack           2.6           2023-11-08 [1] CRAN (R 4.4.0)
 Rglpk            0.6-5.1       2024-01-13 [1] CRAN (R 4.4.0)
 rlang            1.1.4         2024-06-04 [1] CRAN (R 4.4.0)
 rle              0.9.2-234     2024-06-20 [1] Github (statnet/rle@d08b185)
 rmarkdown        2.27          2024-05-17 [1] CRAN (R 4.4.0)
 robustbase       0.99-2        2024-01-27 [1] CRAN (R 4.4.0)
 sass             0.4.9         2024-03-15 [1] CRAN (R 4.4.0)
 sessioninfo      1.2.2         2021-12-06 [1] CRAN (R 4.4.0)
 slam             0.1-50        2022-01-08 [1] CRAN (R 4.4.0)
 srvyr            1.2.0         2023-02-21 [1] CRAN (R 4.4.0)
 statnet.common   4.10.0-442    2024-06-20 [1] Github (statnet/statnet.common@4e8cb54)
 survey           4.4-2         2024-03-20 [1] CRAN (R 4.4.0)
 survival         3.7-0         2024-06-05 [4] CRAN (R 4.4.0)
 tibble         * 3.2.1         2023-03-20 [1] CRAN (R 4.4.0)
 tidygraph        1.3.1         2024-01-30 [1] CRAN (R 4.4.0)
 tidyr            1.3.1         2024-01-24 [1] CRAN (R 4.4.0)
 tidyselect       1.2.1         2024-03-11 [1] CRAN (R 4.4.0)
 trust            0.1-8         2020-01-10 [1] CRAN (R 4.4.0)
 utf8             1.2.4         2023-10-22 [1] CRAN (R 4.4.0)
 vctrs            0.6.5         2023-12-01 [1] CRAN (R 4.4.0)
 withr            3.0.0         2024-01-16 [1] CRAN (R 4.4.0)
 xfun             0.45          2024-06-16 [1] CRAN (R 4.4.1)
 yaml             2.3.8         2023-12-11 [1] CRAN (R 4.4.0)

 [1] /home/mbojan/R/library/4.4
 [2] /usr/local/lib/R/site-library
 [3] /usr/lib/R/site-library
 [4] /usr/lib/R/library

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

References

Butts, Carter T. 2020. Sna: Tools for Social Network Analysis. https://CRAN.R-project.org/package=sna.
Freedman Ellis, Greg, and Ben Schneider. 2023. Srvyr: ’Dplyr’-Like Syntax for Summary Statistics of Survey Data. https://CRAN.R-project.org/package=srvyr.
Handcock, Mark S., and Krista J. Gile. 2010. “Modeling Social Networks from Sampled Data.” Annals of Applied Statistics 4 (1): 5–25. https://doi.org/10.1214/08-aoas221.
Handcock, Mark S., David R. Hunter, Carter T. Butts, Steven M. Goodreau, Pavel N. Krivitsky (maintainer), Martina Morris, Chad Klumb, Michał Bojanowski, and other contributors. 2022. Ergm: Fit, Simulate and Diagnose Exponential-Family Models for Networks. https://CRAN.R-project.org/package=ergm.
Handcock, Mark S., David R. Hunter, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2008. “Statnet: Software Tools for the Representation, Visualization, Analysis and Simulation of Network Data.” Journal of Statistical Software 24 (1-9). https://www.jstatsoft.org/v24/.
Holland, Paul W., and Samuel Leinhardt. 1973. “The Structural Implications of Measurement Error in Sociometry.” Journal of Mathematical Sociology 3 (1): 85–111. https://doi.org/10.1080/0022250X.1973.9989825.
Koskinen, Johan H., Gary L. Robins, Peng Wang, and Philippa E. Pattison. 2013. “Bayesian Analysis for Partially Observed Network Data, Missing Ties, Attributes and Actors.” Social Networks 35 (4): 514–27. https://doi.org/10.1016/j.socnet.2013.07.003.
Krenz, Till, Pavel N. Krivitsky, Raffaele Vacca, Michał Bojanowski, and Andreas Herz. 2024. Egor: Import and Analyse Ego-Centered Network Data. https://CRAN.R-project.org/package=egor.
Krivitsky, Pavel N. 2023. Ergm.ego: Fit, Simulate and Diagnose Exponential-Family Random Graph Models to Egocentrically Sampled Network Data. The Statnet Project (https://statnet.org). https://CRAN.R-project.org/package=ergm.ego.
Krivitsky, Pavel N., Michał Bojanowski, and Martina Morris. 2019. “Inference for Exponential-Family Random Graph Models from Egocentrically-Sampled Data with Alter–Alter Relations.” https://documents.uow.edu.au/content/groups/public/@web/@inf/@math/documents/doc/uow259552.pdf.
Krivitsky, Pavel N., Mark S. Handcock, and Martina Morris. 2011. “Adjusting for Network Size and Composition Effects in Exponential-Family Random Graph Models.” Statistical Methodology 8 (4): 319–39. https://doi.org/10.1016/j.stamet.2011.01.005.
Krivitsky, Pavel N., and Eric D. Kolaczyk. 2015. “On the Question of Effective Sample Size in Network Modeling: An Asymptotic Inquiry.” Statistical Science 30 (2): 184–98. https://doi.org/10.1214/14-sts502.
Krivitsky, Pavel N., and Martina Morris. 2017. “Inference for Social Network Models from Egocentrically Sampled Data, with Application to Understanding Persistent Racial Disparities in HIV Prevalence in the US.” Annals of Applied Statistics 11 (1): 427–55. https://doi.org/10.1214/16-AOAS1010.
Krivitsky, Pavel N., Martina Morris, and Michał Bojanowski. 2022. “Impact of Survey Design on Estimation of Exponential-Family Random Graph Models from Egocentrically-Sampled Data.” Social Networks 69: 22–34. https://doi.org/10.1016/j.socnet.2020.10.001.

  1. This does not mean that the mean degree itself cannot be estimated from egocentric data, only that our inferential results might not apply.