Statnet performance and improvement
ERGM
These tests assess the core functionalities of the ergm package, so we can compare the output between older and new versions by inspection. It replicates the results from a simple dyadic dependent model, using both default control parameters and “reasonable” parameters. Each section of the test is a self contained R file that can be run on its own. The overall test should be run on a machine with multiple CPU cores, to take advantage of parallel functionality.
Related Ticket
Related Links
ERGM timing spreadsheet for edges model  note code to reproduce is on the third sheet.
Parallel functionality
Parallel functionality examples
Open tickets:
#924 ERGM parallel on a single machine is slower than nonparallel
#889 ERGM's Parallel Future: aka Rmpi dependency causing trouble on OSX and Windows builds
#884 migrate ergm from package 'snow' to 'parallel'
Version comparison
Comparing how long it takes to fit a dyadic dependent model, between CRAN (3.1.2) and trunk (3.21309613098.12014.06.2018.18.46): Trunk version takes fewer iterations to converge, but takes 20 times longer per iteration.
Code:
library("ergm", lib.loc="~/R/winlibrary/3.1") data(faux.magnolia.high) nw < faux.magnolia.high replicate(n=20, { t0 < proc.time() fauxmodel.01 < ergm(nw ~ edges + isolates + gwesp(0.2, fixed=T), control=control.ergm(MCMLE.maxit=100)) c(proc.time()  t0[3], nrow(fauxmodel.01$stats.hist), fauxmodel.01$coef) }) > t.cran save(t.cran, file='tcran.Rdata') x=t.cran[6,] y=t.cran[3,] plot(y~x, main='ergm fit time, CRAN', xlab='iterations', ylab='seconds') abline(lm(y~x)) legend('topleft', paste('slope', round(lm(y~x)$coef[2], 2)))
Diagnostic
Kirk:
I also found the similar results as above. I am pretty sure it is because some recent change of convergence/burnin criterion. I am looking deep into it.
The diagnostic above shows the trunk ergm take 13x proposals (26,397,968) in MCMC compared with the cran version (2,060,403). I will look into the reason.
OK, the difference is in the trunk version, in ergm.MCMLE.R, "Insufficient effective sample size for MCMLE optimization. Rerunning with the longer interval.", and requires repeated MCMC chains (to reach sufficient effective sample size). For each repeat, the control$MCMC.interval is increased to 20x than the previous (initial default value is 100).
Therefore in practise, effective sample size will be sufficient after the second or third MCMC chains, which makes lots more proposals than the CRAN version, due to longer MCMC interval used.
In addition:
control$MCMC.burnin.retries=0 in CRAN version, however MCMC.burnin.retries: 100 in trunk version.
The CRAN version, the warning "In ergm.getMCMCsample(nw, model, MHproposal, mcmc.eta0, ... : Burnin failed to converge after retries" is suppressed.
We need to discuss how to solve these both in theory and in practise.
The related commit is 12310
TERGM
Estimation times for two different models, varying network size and duration. Sam's model (with Inf offset) was more difficult to fit than Steve's model (with degree constraint). stergm timing test.xlsx
Related Ticket
EpiModel
NetworkDynamic?
Attachments (8)
 ergm_trunk_time.png (5.9 KB)  added by lxwang 5 years ago.
 ergm_cran_time.png (6.0 KB)  added by lxwang 5 years ago.

04.png
(573.0 KB) 
added by kirkli 5 years ago.
comparison of ergm cran and trunk version
 stergm timing test.xlsx (11.1 KB)  added by lxwang 4 years ago.
 ergm_performance_3.5.2.html (952.7 KB)  added by lxwang 4 years ago.
 ergm_performance_3.5.html (952.7 KB)  added by lxwang 4 years ago.
 ergm_performance_3.5.1.html (1004.5 KB)  added by lxwang 4 years ago.
 ergm_performance_3.4.html (1005.0 KB)  added by lxwang 4 years ago.