Testing Effective Sample Size algorithm

Flowchart for the ESS algorithm: ergm ESS flowchart.pdf (attachment on wiki:ESS).

The ESS algorithm tested is the one from Pavel's modification in [13407/statnet_commons]. 100 runs are done for each version below, using random seeds 1:100.

The non-ESS option is specified by setting the ergm control MCMLE.effectivesize = NULL. Both the non-ESS option and the CRAN version use a fixed-size MCMC sample; the non-ESS version runs much faster (fewer iterations) because of the Hummel algorithm.

The CRAN version is 3.1.3. Both CRAN and non-ESS runs are done with the control parameters MCMC.interval = 2^13, MCMC.burnin = 50000. (2^13 is 8192, which is higher than the final intervals of the ESS run.)

The model used is the one from the ergm vignette:

formula = faux.magnolia.high~edges+gwesp(0.25,fixed=T)+nodematch('Grade')+nodematch('Race')+nodematch('Sex')


Timing tests show that the issue from ticket #1020 is mostly resolved. Of the 100 runs, only 2 took longer than 10 minutes, with a max of 15 minutes. Parallel tests are pending.

Variation in estimates

There is a very large amount of variation in the coefficient estimates from the ESS runs, compared to CRAN and to non-ESS. The variation comes from the differences in the MCMC.interval of the last MCMC sample (the one used to estimate the coefficients in the final iteration).
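To see why the interval of the final sample matters at a fixed sample size, here is a toy calculation in Python (not ergm code). It assumes the sampled statistic behaves like an AR(1) process with lag-1 autocorrelation rho, which is an illustrative assumption and not something measured in these tests. The function name and the numbers are hypothetical.

```python
def thinned_ess(n_samples, rho, interval):
    """Approximate ESS of n_samples draws kept every `interval` steps.

    Assumes an AR(1) chain: after thinning, the lag-1 autocorrelation
    becomes rho ** interval, and ESS = n * (1 - r) / (1 + r).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1)")
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    r = rho ** interval
    return n_samples * (1.0 - r) / (1.0 + r)


# Same sample size, different thinning intervals (hypothetical values).
for interval in (256, 1024, 8192):
    print(interval, round(thinned_ess(1024, 0.999, interval), 1))
```

Under this assumption, two runs that stop with different final intervals produce estimates with different precision even though the sample size is the same.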

In the ESS3 version, I fixed the final sample to have an interval of 2^13 (after last.adequate==T). The result is comparable to the non-ESS version. This version was one of the alternatives suggested by Hummel:

Ideally, we would like to be able to set γ_t = 1, and in fact when this is possible while simultaneously satisfying the convex hull criterion described here for two consecutive iterations, the “stepping” portion of the algorithm is judged to have converged. At this point, we may take η_t+1 to be the final MLE or, alternatively, simulate a much larger sample using this η_t+1 in one final iteration.

However, there's no good way to know what that final sample size and interval should be. For research purposes, the final sample should probably be as large as computationally feasible.

MCMC SE calculation for ESS algorithm

The MCMC SE calculation under-estimates the variation in the estimates from ESS. The MCMC SE should match the amount of variation in coef estimates from repeated runs of the same ergm model. The red boxes below show the 95% intervals specified by the MCMC SE. For CRAN, the intervals match the variations in the coefs, but for ESS, the intervals are too small.
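The comparison described above can be written as a small check. The sketch below is Python rather than ergm code. It assumes you have already collected one coefficient estimate and one reported MCMC SE per run, and the function name `se_calibration` is hypothetical. It reports how much larger the run-to-run spread is than the reported SE, and how often the 95% interval around each estimate covers the mean of all runs.

```python
import statistics


def se_calibration(estimates, mcmc_ses, z=1.96):
    """Compare reported MCMC SEs with the spread of estimates across runs.

    estimates: one coefficient estimate per run (same model, different seeds).
    mcmc_ses:  the MCMC standard error reported by each run.
    Returns (ratio, coverage):
      ratio    = empirical SD of the estimates / mean reported SE
                 (close to 1 if the SE is calibrated, above 1 if too small)
      coverage = share of runs whose interval estimate +/- z * SE
                 contains the mean of all estimates
    """
    if len(estimates) != len(mcmc_ses) or len(estimates) < 2:
        raise ValueError("need at least two runs with matching lengths")
    center = statistics.fmean(estimates)
    ratio = statistics.stdev(estimates) / statistics.fmean(mcmc_ses)
    hits = sum(abs(e - center) <= z * s for e, s in zip(estimates, mcmc_ses))
    return ratio, hits / len(estimates)


# Made-up numbers: the spread is far larger than the reported SE.
ratio, coverage = se_calibration([0.0, 1.0, -1.0, 2.0, -2.0], [0.1] * 5)
print(ratio > 1.0, coverage)
```

A ratio well above 1 together with coverage well below 0.95 is the pattern described for the ESS runs.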

I think there is extra variation associated with testing whether or not the chain has sufficient effective size. ESS is estimated from that realization of the chain, which is then used to decide whether to stop or grow the chain. I'm not sure how to calculate this extra variation; have to look at some MCMC texts.
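The point that the ESS is itself estimated from one realization of the chain can be illustrated with a toy example. This is a Python sketch, not the ergm implementation: it assumes an AR(1) chain, and it uses a simple autocorrelation-sum estimator truncated at the first non-positive lag, which is a choice made for this sketch. All names and parameter values are hypothetical.

```python
import random
import statistics


def ar1_chain(n, rho, rng):
    """Simulate a stationary AR(1) chain with unit marginal variance."""
    x = rng.gauss(0.0, 1.0)
    scale = (1.0 - rho * rho) ** 0.5
    chain = []
    for _ in range(n):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def estimate_ess(chain):
    """Estimate ESS = n / (1 + 2 * sum of sample autocorrelations).

    The sum stops at the first non-positive sample autocorrelation.
    """
    n = len(chain)
    mean = statistics.fmean(chain)
    dev = [v - mean for v in chain]
    var = sum(d * d for d in dev) / n
    acf_sum = 0.0
    for lag in range(1, n):
        r = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n * var)
        if r <= 0.0:
            break
        acf_sum += r
    return n / (1.0 + 2.0 * acf_sum)


rng = random.Random(1)
n, rho = 2000, 0.8
# Twenty independent chains of the same length and the same true ESS.
ess_estimates = [estimate_ess(ar1_chain(n, rho, rng)) for _ in range(20)]
true_ess = n * (1.0 - rho) / (1.0 + rho)  # asymptotic value for AR(1)
print(round(true_ess, 1), round(min(ess_estimates)), round(max(ess_estimates)))
```

The estimates differ from chain to chain even though the true ESS is identical, so a stop-or-grow rule driven by the estimate adds randomness that a fixed-size sample does not have.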

This also affects the stopping criterion for MCMLE, which compares the MCMC SE to the model SE.

Commit [13415/statnet_commons] used ESS for MCMC, but with the Hotelling T^2 test for convergence testing. The performance is shown below. The ESS version here is from [13409/statnet_commons].

A more comprehensive test of ESS and the stepping algorithm is attached in the spreadsheet here:

ergm ESS tests.xlsx

MCMC diagnostics

The sample used for MCMC diagnostics (also part of the ergm output) is simulated from the penultimate iteration's coefficients. Since [13407/statnet_commons], the last 2 iterations should have Hummel step length = 1, but that did not seem to affect the sample.

The CRAN version has samples that are perfectly centered on target, but that is because the stopping criterion for CRAN estimation is based on a T^2 test on the sample.

We still need to resolve #1022, probably by re-writing the MCMC diagnostic function and documentation.

Simulations from fit

The simulations are done using a very large burnin and interval (MCMC.interval=20000, MCMC.burnin=1000000), so we can check whether the estimates are on target. 500 simulations are taken from each of the 100 fits, and the mean stats are plotted.

The ESS sims show more variation because of the variation in the estimates. Overall, the sims do not show bias from target.

Whatever we set as defaults for simulate will probably be inadequate for certain models. We need to update the vignette and documentation to tell users how to get good samples.


These were run on Mosix using a parallel apply function.




Last modified on 10/30/14 17:16:18
