This tutorial is a joint product of the Statnet Development Team:

Pavel N. Krivitsky (University of New South Wales)
Martina Morris (University of Washington)
Mark S. Handcock (University of California, Los Angeles)
Carter T. Butts (University of California, Irvine)
David R. Hunter (Penn State University)
Steven M. Goodreau (University of Washington)
Chad Klumb (University of Washington)
Skye Bender-deMoll (Oakland, CA)
Michał Bojanowski (Kozminski University, Poland)

The Statnet Project

All Statnet packages are open-source, written for the R computing environment, and published on CRAN. The source repositories are hosted on GitHub. Our website is statnet.org.

  • Need help? For general questions and comments, please email the Statnet users group at statnet_help@uw.edu. You’ll need to join the listserv if you’re not already a member. You can do that here: Statnet_help listserv.

  • Found a bug in our software? Please let us know by filing an issue in the appropriate package GitHub repository, with a reproducible example.

  • Want to request new functionality? We welcome suggestions – you can make a request by filing an issue on the appropriate package GitHub repository. The chances that this functionality will be developed are substantially improved if the requests are accompanied by some proposed code (we are happy to review pull requests).

  • For all other issues, please email us at contact@statnet.org.

1 Introduction to this workshop/tutorial

This workshop and tutorial provide an overview of R packages for network analysis. The online tutorial is also designed for self-study, with example code and self-contained data. The packages covered are:

  • Statnet suite (Krivitsky et al. 2003-2020) including:
    • network (Butts 2008, 2021) – storage and manipulation of network data
    • sna (Butts 2020) – descriptive statistics and graphics for exploratory network analysis
  • igraph (Csardi and Nepusz 2006)
  • tidygraph (Pedersen 2020) and ggraph (Pedersen 2021)
  • graph (Gentleman et al. 2020) and Rgraphviz (Hansen et al. 2021)

and other more specialized packages that provide tools for e.g. particular SNA techniques or visualization, but rely on one of the above for network data storage and manipulation.

The R code in this tutorial is also available as an R script.

1.1 Prerequisites

This workshop assumes basic familiarity with R, experience with network concepts, terminology and data, and familiarity with the general framework for statistical modeling and inference. While previous experience with ERGMs is not required, some of the topics covered here may be difficult to understand without a strong background in linear and generalized linear models in statistics.

1.2 Software installation

Minimally, you will need to install the latest version of R (available here) and the packages listed below. The workshops are conducted using the free version of RStudio (available here).

The packages required for the workshop can be installed with the following expression:

install.packages(c("network", "sna", "igraph", "tidygraph", "ggraph", 
                   "intergraph", "remotes"))

Package remotes (Hester et al. 2021) is needed to install the remaining two packages.

remotes::install_bioc("graph")
remotes::install_bioc("Rgraphviz")

More information about installing other packages from the Statnet suite can be found on the Statnet website. In particular, you can install the whole Statnet suite (although this is not required for this tutorial) with:

install.packages('statnet')
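
After installation, it may be worth confirming that everything is in place. The following is a minimal sketch rather than part of the original workshop material: the object names (workshop_pkgs, installed, missing_pkgs) are our own, and the package list simply combines the packages introduced in Section 1 with those named in the installation commands above.

```r
# Packages introduced in Section 1 plus the helper packages installed above
workshop_pkgs <- c("network", "sna", "igraph", "tidygraph", "ggraph",
                   "intergraph", "remotes", "graph", "Rgraphviz")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
installed <- vapply(workshop_pkgs, requireNamespace, logical(1),
                    quietly = TRUE)

# Names of packages that still need to be installed (if any)
missing_pkgs <- workshop_pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

Note that requireNamespace() loads a package's namespace without attaching it, so this check does not by itself change which functions are visible on the search path.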

1.3 Necessary data files

  • Classroom data is a network within a school class of 26 9-year-olds, taken from a larger study by Dolata (2014). The name generator question was “With whom do you like to play?”. The data is available in the following files:
    • classroom-adjacency.csv with the adjacency matrix
    • classroom-edges.csv with an edgelist and an edge attribute: liking – numeric, the extent to which ego likes alter on a scale of 1 to 5. This attribute has been randomly generated for illustrative purposes.
    • classroom-nodes.csv with node attributes: female – logical, gender (TRUE for girls); isei08_m, isei08_f – numeric, social status score of, respectively, mother and father
  • Several other datasets contained in the file introToSNAinR.Rdata.

Download all the files as a ZIP file intro-sna-data.zip.
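
Once the archive is extracted, the files can be read with base R. The sketch below is one possible way to take a first look at them, not the tutorial's own code: the object names are illustrative, and the exact layout of each CSV file (for example, whether the adjacency file carries a header row or a row-name column) is an assumption that should be checked with str() or head() before relying on it.

```r
# File names as listed above; object names are illustrative only
edges_file <- "classroom-edges.csv"
nodes_file <- "classroom-nodes.csv"
rdata_file <- "introToSNAinR.Rdata"

if (file.exists(edges_file) && file.exists(nodes_file)) {
  classroom_edges <- read.csv(edges_file)
  classroom_nodes <- read.csv(nodes_file)
  str(classroom_edges)  # inspect the columns, including the liking attribute
  str(classroom_nodes)  # inspect female, isei08_m and isei08_f
}

if (file.exists(rdata_file)) {
  # load() restores the saved objects and invisibly returns their names
  loaded_names <- load(rdata_file)
  print(loaded_names)
}
```

If nothing is printed, the files are most likely not in the current working directory.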

The code from this tutorial is available as a script too.

1.4 Working Directory

Before we go further, make sure R’s Working Directory (WD) is set to the folder where you extracted the data files from the ZIP archive for the workshop. If you have not set the working directory yet, do so now in one of the following ways:

  1. (Recommended) Create an RStudio Project dedicated to the workshop and unpack the data files there.

  2. Use the RStudio “Files” tab to navigate to the directory with the workshop files, then click “More” and “Set As Working Directory”.

  3. You can use setwd() to change the working directory as well, like so:

    setwd("path/to/folder/with/workshop/files")

Verify that the WD is set correctly by either:

  1. Looking at the top of the Console window in RStudio, or

  2. Calling getwd() and list.files():

    getwd() # Check what directory you're in
    [1] "/home/mbojan/Teaching/workshop-intro-sna-tools"
    list.files() # Check what's in the working directory
     [1] "bibliography.bib"               "captab.html"                   
     [3] "captab.Rmd"                     "classroom-adjacency.csv"       
     [5] "classroom-edges.csv"            "classroom-nodes.csv"           
     [7] "common"                         "edgeList.csv"                  
     [9] "index.html"                     "intro_tutorial.html"           
    [11] "intro_tutorial.R"               "intro_tutorial.Rmd"            
    [13] "intro-sna-data.zip"             "introToSNAinR.Rdata"           
    [15] "Makefile"                       "practicals-solved.html"        
    [17] "practicals.html"                "practicals.Rmd"                
    [19] "README.md"                      "relationalData.csv"            
    [21] "rstudio-wd.png"                 "vertexAttributes.csv"          
    [23] "workshop-intro-sna-tools.Rproj"
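
Instead of scanning the listing by eye, the check can be automated. This is a small sketch under the assumption that the four data files named in Section 1.3 are the ones the workshop needs; required_files and not_found are hypothetical names introduced here.

```r
# Data files named in Section 1.3
required_files <- c("classroom-adjacency.csv", "classroom-edges.csv",
                    "classroom-nodes.csv", "introToSNAinR.Rdata")

# setdiff() keeps the required names that list.files() does not report
not_found <- setdiff(required_files, list.files())

if (length(not_found) == 0) {
  message("All data files found in ", getwd())
} else {
  message("Not found in ", getwd(), ": ",
          paste(not_found, collapse = ", "))
}
```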

1.5 Mitigating function name conflicts

Some of the packages we are going to demonstrate provide functions with the same names as functions in other packages. For example, get.vertex.attribute() is defined in both network and igraph. If we load both packages with library(), the order matters: the package loaded last masks the other, so its version of the function is the one used when we write get.vertex.attribute().
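
One common way to sidestep masking is to qualify a call with the package name using the `::` operator, which selects that package's function regardless of the order in which packages were loaded. The snippet below is a minimal sketch of this idea, assuming the network package is installed; the objects net and vnames are hypothetical names used only for illustration.

```r
vnames <- NULL  # stays NULL if the network package is not installed

if (requireNamespace("network", quietly = TRUE)) {
  # An empty network with 3 vertices, created without attaching the package
  net <- network::network.initialize(3)

  # The network:: prefix picks network's version of the function,
  # even if igraph has been loaded afterwards and masks it
  vnames <- network::get.vertex.attribute(net, "vertex.names")
  print(vnames)
}
```

The same pattern applies to any of the clashing names: prefix the call with the package whose implementation is intended.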

In particular, note the following function name clashes:

  • Between igraph and network:

     [1] "%c%"                    "%s%"                    "add.edges"             
     [4] "add.vertices"           "delete.edges"           "delete.vertices"       
     [7] "get.edge.attribute"     "get.edges"              "get.vertex.attribute"  
    [10] "is.bipartite"           "is.directed"            "list.edge.attributes"  
    [13] "list.vertex.attributes" "set.edge.attribute"     "set.vertex.attribute"  
  • Between igraph and sna: