This tutorial is a joint product of the Statnet Development Team:
Pavel N. Krivitsky (University of New South Wales)
Martina Morris (University of Washington)
Mark S. Handcock (University of California, Los Angeles)
Carter T. Butts (University of California, Irvine)
David R. Hunter (Penn State University)
Steven M. Goodreau (University of Washington)
Chad Klumb (University of Washington)
Skye Bender-deMoll (Oakland, CA)
Michał Bojanowski (Kozminski University, Poland)
All Statnet packages are open-source, written for the R computing environment, and published on CRAN. The source repositories are hosted on GitHub. Our website is statnet.org.
Need help? For general questions and comments, please email the Statnet users group at statnet_help@uw.edu. You’ll need to join the listserv if you’re not already a member. You can do that here: Statnet_help listserve.
Found a bug in our software? Please let us know by filing an issue in the appropriate package GitHub repository, with a reproducible example.
Want to request new functionality? We welcome suggestions – you can make a request by filing an issue on the appropriate package GitHub repository. The chances that this functionality will be developed are substantially improved if the requests are accompanied by some proposed code (we are happy to review pull requests).
For all other issues, please email us at contact@statnet.org.
This workshop and tutorial provide an overview of R packages for network analysis. This online tutorial is also designed for self-study, with example code and self-contained data.
There are also other, more specialized packages that provide tools for, e.g., particular SNA techniques or visualization, but rely on one of the packages covered here for network data storage and manipulation.
R code in this tutorial is also available as an R script.
This workshop assumes basic familiarity with R, experience with network concepts, terminology and data, and familiarity with the general framework for statistical modeling and inference. While previous experience with ERGMs is not required, some of the topics covered here may be difficult to understand without a strong background in linear and generalized linear models in statistics.
Minimally, you will need to install the latest version of R (available here) and the packages listed below. The workshops are conducted using the free version of RStudio (available here).
The packages required for the workshop can be installed with the following expression:
Package remotes (Hester et al. 2021) is needed to install the remaining two packages.
More information about installing other packages from the Statnet suite can be found on the Statnet website. In particular, you can install (but do not have to for this tutorial) the whole Statnet suite with:
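Before the workshop it can help to check which packages are already available. The following is only a minimal sketch: the vector needed is our own assumption, built from the packages named in this tutorial, and the code does not install anything.

```r
# Packages named in this tutorial (an assumed list; adjust it to your needs).
needed <- c("network", "sna", "igraph", "tidygraph", "intergraph", "graph")

# requireNamespace() returns TRUE if a package can be loaded, without attaching it.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```

The suite itself is distributed on CRAN as the package statnet, so install.packages("statnet") is the usual one-line installation. Package graph comes from Bioconductor rather than CRAN.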
- classroom-adjacency.csv with adjacency matrix
- classroom-edges.csv with an edgelist with edge attribute:
  - liking – numeric, on the scale 1-5 the extent to which ego likes the alter. This attribute has been randomly generated for illustrative purposes.
- classroom-nodes.csv with node attributes:
  - female – logical, gender (TRUE for girls);
  - isei08_m, isei08_f – numeric, social status score of, respectively, mother and father
- introToSNAinR.Rdata.

Download all the files as a ZIP file intro-sna-data.zip.
The code from this tutorial is available as a script too.
Before we go further, make sure R’s Working Directory (WD) is set to the folder where you extracted the data files from the ZIP archive for the workshop. If you’ve not set the working directory, you must do so now by one of:
(Recommended) Create an RStudio Project dedicated to the workshop and unpack the data files there.
Use RStudio “Files” tab to navigate to the directory with the workshop files, then click “More” and “Set As Working Directory”:
You can use setwd() to change the working directory as well, like so:
Verify that the WD is set correctly by either:
Looking at the top of the Console window in RStudio, or
Using getwd() and listing the files in the directory:
[1] "/home/mbojan/Teaching/workshop-intro-sna-tools"
[1] "bibliography.bib" "captab.html"
[3] "captab.Rmd" "classroom-adjacency.csv"
[5] "classroom-edges.csv" "classroom-nodes.csv"
[7] "common" "edgeList.csv"
[9] "index.html" "intro_tutorial.html"
[11] "intro_tutorial.R" "intro_tutorial.Rmd"
[13] "intro-sna-data.zip" "introToSNAinR.Rdata"
[15] "Makefile" "practicals-solved.html"
[17] "practicals.html" "practicals.Rmd"
[19] "README.md" "relationalData.csv"
[21] "rstudio-wd.png" "vertexAttributes.csv"
[23] "workshop-intro-sna-tools.Rproj"
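A quick way to confirm that the data files are where R expects them is sketched below. It assumes the file names given in the list of data files above; the object names data_files and found are ours.

```r
# Data files used in this tutorial (names taken from the list of data files).
data_files <- c("classroom-adjacency.csv", "classroom-edges.csv",
                "classroom-nodes.csv", "introToSNAinR.Rdata")

# file.exists() is vectorised: one TRUE/FALSE per file, relative to getwd().
found <- file.exists(data_files)
names(found) <- data_files
print(found)

if (!all(found)) {
  message("Missing in ", getwd(), ": ",
          paste(data_files[!found], collapse = ", "))
}
```

If any entry is FALSE, the working directory is probably not the folder with the extracted ZIP archive.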
Some packages we are going to demonstrate provide functions with identical names as in other packages. Examples include a function get.vertex.attribute() which is defined in packages network and igraph. Hence, if we load both packages with library() it matters which package is loaded last, as its version of the function will be used when we write get.vertex.attribute.
In particular, note the following function name clashes:
Between igraph and network:
[1] "%c%" "%s%" "add.edges"
[4] "add.vertices" "delete.edges" "delete.vertices"
[7] "get.edge.attribute" "get.edges" "get.vertex.attribute"
[10] "is.bipartite" "is.directed" "list.edge.attributes"
[13] "list.vertex.attributes" "set.edge.attribute" "set.vertex.attribute"
Between igraph and sna:
[1] "betweenness" "bonpow" "closeness" "components" "degree"
[6] "dyad.census" "evcent" "hierarchy" "is.connected" "neighborhood"
[11] "triad.census"
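One way to produce lists like the ones above, and to call a particular package's version of a clashing function, is sketched here. It assumes that igraph and network are installed. The exact set of clashes depends on the installed versions, so no output is shown.

```r
# Exported names of both packages, obtained without attaching either of them.
igraph_names  <- getNamespaceExports("igraph")
network_names <- getNamespaceExports("network")

# Names exported by both packages are the potential conflicts.
clashes <- sort(intersect(igraph_names, network_names))
head(clashes)

# The :: operator picks one package's version explicitly,
# regardless of which packages are attached and in what order.
net <- network::network.initialize(3)   # empty network with 3 vertices
net <- network::set.vertex.attribute(net, "age", c(10, 11, 12))
network::get.vertex.attribute(net, "age")
```

Writing network::get.vertex.attribute() is longer than the bare function name, but it never depends on the order of library() calls.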
There are the following strategies to make sure possible conflicts are as painless as possible:
… ::
… :: for disambiguation.
In this tutorial we had to deal with these conflicts as well. We have opted for strategy (3) because:
… :: namespace directives and hence cleaner to read.
The disadvantage is that … library() and detach() at the beginning and end of the subsections to make sure only one intended package is attached at a given time.
Network data is usually stored as
'network' 1.18.2 (2023-12-04), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information
Loading required package: statnet.common
Attaching package: 'statnet.common'
The following objects are masked from 'package:base':
attr, order
sna: Tools for Social Network Analysis
Version 2.7-2 created on 2023-12-05.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
For citation information, type citation("sna").
Type help(package="sna") to get started.
Read an adjacency matrix (R stores it as a data frame by default). By default read.csv() also prefixes column names that start with a digit with X, although numbers are fine as row names.
relations <- read.csv("classroom-adjacency.csv", header=TRUE, row.names=1, stringsAsFactors=FALSE)
relations[1:10,1:10] # look at a subgraph using bracket notation
X1003 X1006 X1009 X1012 X1015 X1018 X1021 X1024 X1027 X1030
1003 0 0 0 0 0 1 0 0 0 0
1006 0 0 1 0 0 0 0 0 0 0
1009 0 1 0 0 0 0 0 0 0 0
1012 0 0 0 0 1 0 1 0 0 0
1015 0 0 0 1 0 0 1 0 0 0
1018 0 0 0 0 0 0 0 0 0 0
1021 0 1 1 1 0 0 0 0 0 0
1024 0 0 0 0 0 0 0 0 0 0
1027 0 1 1 0 0 0 0 0 0 0
1030 0 0 1 0 0 0 0 0 0 0
We might want to store it as a matrix. Most routines will accept either data format. However, depending on how a function was written, it might require one or the other. The isSymmetric function from base R is one example that requires a matrix rather than a data frame.
[1] FALSE
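The two points above, column names and the conversion to a matrix, can be tried on a small example. This is a sketch on made-up data (three nodes supplied as inline CSV text), not on the classroom file. The argument check.names = FALSE of read.csv() is one way to keep numeric column names as they are.

```r
# A made-up 3-node adjacency matrix in CSV form (not the classroom data).
csv_text <- "id,1003,1006,1009
1003,0,1,0
1006,1,0,1
1009,0,0,0"

# Default: column names starting with a digit get an "X" prefix.
toy_default <- read.csv(text = csv_text, header = TRUE, row.names = 1)
colnames(toy_default)

# check.names = FALSE keeps the column names exactly as they are in the file.
toy_df <- read.csv(text = csv_text, header = TRUE, row.names = 1,
                   check.names = FALSE)
colnames(toy_df)

# Convert the data frame to a matrix and test for symmetry.
toy_mat <- as.matrix(toy_df)
isSymmetric(toy_mat)   # FALSE: the tie 1006 -> 1009 is not reciprocated
```

With check.names = FALSE the row and column names already agree, so no renaming is needed afterwards.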
To make the row and column names identical, we can overwrite the rownames:
Read in some vertex attribute data. It is fine to leave it as a data frame. In fact, converting it to a matrix would create problems, because all cells of a matrix must be of the same type (e.g. all strings or all numbers), while a data frame can hold columns of different types.
name female isei08_m isei08_f
1 1003 FALSE NA 25.71
2 1006 TRUE 14.64 33.76
3 1009 TRUE 28.48 37.22
4 1012 TRUE 26.64 25.23
5 1015 TRUE 21.24 NA
6 1018 FALSE 23.47 24.45
We could also convert it to a network object. This would be useful for (1) storing all data in the same file, (2) a more compact format for large, sparse matrices, or (3) using the data in later analyses where the routines require network objects (e.g. ERGMs).
Network attributes:
vertices = 26
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges = 88
missing edges = 0
non-missing edges = 88
density = 0.1353846
Vertex attributes:
vertex.names:
character valued attribute
26 valid vertex names
No edge attributes
Network edgelist matrix:
[,1] [,2]
[1,] 20 1
[2,] 24 1
[3,] 3 2
[4,] 7 2
[5,] 9 2
[6,] 12 2
[7,] 14 2
[8,] 21 2
[9,] 26 2
[10,] 2 3
[11,] 7 3
[12,] 9 3
[13,] 10 3
[14,] 19 3
[15,] 25 3
[16,] 26 3
[17,] 5 4
[18,] 7 4
[19,] 4 5
[20,] 1 6
[21,] 13 6
[22,] 14 6
[23,] 15 6
[24,] 16 6
[25,] 17 6
[26,] 18 6
[27,] 20 6
[28,] 24 6
[29,] 4 7
[30,] 5 7
[31,] 22 9
[32,] 25 9
[33,] 12 11
[34,] 14 13
[35,] 18 13
[36,] 2 14
[37,] 6 14
[38,] 12 14
[39,] 13 14
[40,] 15 14
[41,] 16 14
[42,] 17 14
[43,] 20 14
[44,] 21 14
[45,] 16 15
[46,] 17 15
[47,] 6 16
[48,] 9 16
[49,] 17 16
[50,] 18 16
[51,] 1 17
[52,] 6 17
[53,] 9 17
[54,] 11 17
[55,] 14 17
[56,] 16 17
[57,] 18 17
[58,] 20 17
[59,] 23 17
[60,] 24 17
[61,] 16 18
[62,] 17 18
[63,] 2 19
[64,] 3 19
[65,] 7 19
[66,] 9 19
[67,] 25 19
[68,] 26 19
[69,] 6 20
[70,] 22 21
[71,] 23 21
[72,] 21 22
[73,] 23 22
[74,] 21 23
[75,] 22 23
[76,] 1 24
[77,] 6 24
[78,] 18 24
[79,] 4 25
[80,] 22 25
[81,] 2 26
[82,] 3 26
[83,] 5 26
[84,] 7 26
[85,] 9 26
[86,] 10 26
[87,] 19 26
[88,] 25 26
Here the row and column names have been carried through because they were attached to the matrix. We can look at them by using the network variable methods and the shorthand %v%:
[1] "na" "vertex.names"
[1] "1003" "1006" "1009" "1012" "1015" "1018" "1021" "1024" "1027" "1030"
[11] "1033" "1036" "1039" "1042" "1045" "1048" "1051" "1054" "1057" "1060"
[21] "1063" "1066" "1069" "1072" "1075" "1078"
If we wanted to set the names back to the original numbers, we could use these methods as well:
[1] 1003 1006 1009 1012 1015 1018 1021 1024 1027 1030 1033 1036 1039 1042 1045
[16] 1048 1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
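The steps above (creating a network object from an adjacency matrix, then reading and changing vertex names with %v%) can be sketched on a made-up three-node matrix. The object names toy_mat and toy_net are ours, and the data are not the classroom data.

```r
library(network)

# Made-up directed adjacency matrix with matching row and column names.
toy_mat <- matrix(c(0, 1, 0,
                    1, 0, 1,
                    0, 0, 0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

toy_net <- network(toy_mat, matrix.type = "adjacency", directed = TRUE)

list.vertex.attributes(toy_net)      # names of the vertex attributes
toy_net %v% "vertex.names"           # read an attribute with %v%

# Assign new values with the same shorthand.
toy_net %v% "vertex.names" <- c(101, 102, 103)
toy_net %v% "vertex.names"
```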
Reading in an edgelist and converting it to a network object is also straightforward. Edgelists are useful because they are a smaller, more concise data structure for larger, sparser networks that we typically deal with in social network analysis.
In the newest release of statnet it will automatically read the weight data and store it as “Weight.” If you’re using an older version of statnet, you might need to add two more arguments to the network command: ignore.eval=FALSE and names.eval="Weight".
from to liking
1 1003 1018 3
2 1003 1051 3
3 1003 1072 4
4 1006 1009 5
5 1006 1042 1
6 1006 1057 4
Network attributes:
vertices = 25
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 88
missing edges= 0
non-missing edges= 88
Vertex attribute names:
vertex.names
Edge attribute names:
liking
Converting back to an adjacency matrix is simple:
1003 1006 1009 1012 1015 1018 1021 1027 1030 1033 1036 1039 1042 1045 1048
1003 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
1006 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0
1009 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1012 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0
1015 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
1018 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
1021 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0
1027 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1
1030 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1033 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1036 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0
1039 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
1042 0 1 0 0 0 1 0 0 0 0 0 1 0 0 0
1045 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
1048 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0
1051 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1
1054 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1
1057 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1060 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0
1063 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0
1066 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
1069 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1072 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0
1075 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0
1078 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
1003 1 0 0 0 0 0 0 1 0 0
1006 0 0 1 0 0 0 0 0 0 1
1009 0 0 1 0 0 0 0 0 0 1
1012 0 0 0 0 0 0 0 0 1 0
1015 0 0 0 0 0 0 0 0 0 1
1018 1 0 0 1 0 0 0 1 0 0
1021 0 0 1 0 0 0 0 0 0 1
1027 1 0 1 0 0 0 0 0 0 1
1030 0 0 0 0 0 0 0 0 0 1
1033 1 0 0 0 0 0 0 0 0 0
1036 0 0 0 0 0 0 0 0 0 0
1039 0 0 0 0 0 0 0 0 0 0
1042 1 0 0 0 0 0 0 0 0 0
1045 0 0 0 0 0 0 0 0 0 0
1048 1 1 0 0 0 0 0 0 0 0
1051 0 1 0 0 0 0 0 0 0 0
1054 1 0 0 0 0 0 0 1 0 0
1057 0 0 0 0 0 0 0 0 0 1
1060 1 0 0 0 0 0 0 0 0 0
1063 0 0 0 0 0 1 1 0 0 0
1066 0 0 0 0 1 0 1 0 1 0
1069 1 0 0 0 1 1 0 0 0 0
1072 1 0 0 0 0 0 0 0 0 0
1075 0 0 1 0 0 0 0 0 0 1
1078 0 0 1 0 0 0 0 0 0 0
In network, edges and edge weights are considered separate. This is confusing, but done for a number of reasons: (1) you might want multiple types of weights associated with a given edge, or (2) you might want a weight associated where there isn’t an edge at all.
To see a particular weight, use the edge attribute shorthand %e%, and to get the full network with weights, use the command as.sociomatrix.sna. Note that the network command just called the weights by the column name from the csv file.
[1] "liking" "na"
[1] 3 3 4 5 1 4 1 2 3 5 5 4 3 2 1 2 1 4 3 5 2 2 4 3 4 4 4 3 2 5 3 4 4 3 1 4 2 2
[39] 4 3 2 3 2 1 2 1 1 4 5 2 2 4 3 4 1 1 3 3 2 1 4 4 4 2 5 4 5 3 3 5 4 3 1 5 4 3
[77] 2 3 3 3 2 5 3 1 5 1 1 4
1003 1006 1009 1012 1015 1018 1021 1027 1030 1033 1036 1039 1042 1045 1048
1003 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0
1006 0 0 5 0 0 0 0 0 0 0 0 0 1 0 0
1009 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0
1012 0 0 0 0 5 0 4 0 0 0 0 0 0 0 0
1015 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0
1018 0 0 0 0 0 0 0 0 0 0 0 0 1 0 4
1021 0 2 4 3 0 0 0 0 0 0 0 0 0 0 0
1027 0 4 3 0 0 0 0 0 0 0 0 0 0 0 2
1030 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0
1033 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1036 0 4 0 0 0 0 0 0 0 2 0 0 2 0 0
1039 0 0 0 0 0 4 0 0 0 0 0 0 3 0 0
1042 0 2 0 0 0 3 0 0 0 0 0 2 0 0 0
1045 0 0 0 0 0 2 0 0 0 0 0 0 1 0 0
1048 0 0 0 0 0 1 0 0 0 0 0 0 4 5 0
1051 0 0 0 0 0 4 0 0 0 0 0 0 3 4 1
1054 0 0 0 0 0 3 0 0 0 0 0 3 0 0 2
1057 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0
1060 2 0 0 0 0 5 0 0 0 0 0 0 4 0 0
1063 0 3 0 0 0 0 0 0 0 0 0 0 3 0 0
1066 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0
1069 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1072 3 0 0 0 0 3 0 0 0 0 0 0 0 0 0
1075 0 0 5 0 0 0 0 3 0 0 0 0 0 0 0
1078 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
1003 3 0 0 0 0 0 0 4 0 0
1006 0 0 4 0 0 0 0 0 0 1
1009 0 0 3 0 0 0 0 0 0 5
1012 0 0 0 0 0 0 0 0 3 0
1015 0 0 0 0 0 0 0 0 0 2
1018 3 0 0 5 0 0 0 2 0 0
1021 0 0 4 0 0 0 0 0 0 4
1027 5 0 3 0 0 0 0 0 0 4
1030 0 0 0 0 0 0 0 0 0 3
1033 1 0 0 0 0 0 0 0 0 0
1036 0 0 0 0 0 0 0 0 0 0
1039 0 0 0 0 0 0 0 0 0 0
1042 1 0 0 0 0 0 0 0 0 0
1045 0 0 0 0 0 0 0 0 0 0
1048 2 2 0 0 0 0 0 0 0 0
1051 0 1 0 0 0 0 0 0 0 0
1054 1 0 0 0 0 0 0 4 0 0
1057 0 0 0 0 0 0 0 0 0 4
1060 5 0 0 0 0 0 0 0 0 0
1063 0 0 0 0 0 5 4 0 0 0
1066 0 0 0 0 1 0 5 0 4 0
1069 3 0 0 0 2 3 0 0 0 0
1072 2 0 0 0 0 0 0 0 0 0
1075 0 0 1 0 0 0 0 0 0 5
1078 0 0 4 0 0 0 0 0 0 0
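The edgelist workflow above can be sketched on made-up data as follows. The sketch assumes a recent version of network in which as.network() accepts a data frame and stores extra columns as edge attributes. It uses as.sociomatrix() from network rather than as.sociomatrix.sna, and the object names are ours.

```r
library(network)

# Made-up edgelist: first two columns are sender and receiver, third a weight.
toy_edges <- data.frame(from   = c("a", "a", "b"),
                        to     = c("b", "c", "c"),
                        liking = c(3, 5, 1),
                        stringsAsFactors = FALSE)

# Extra columns of the data frame are kept as edge attributes
# (assumes a version of 'network' with data frame support).
toy_net <- as.network(toy_edges, directed = TRUE)

list.edge.attributes(toy_net)                 # includes "liking"
toy_net %e% "liking"                          # edge values

as.sociomatrix(toy_net)                       # 0/1 ties only
as.sociomatrix(toy_net, attrname = "liking")  # ties replaced by their weights
```

Keeping ties and weights separate means the same object can answer both "is there a tie?" and "how strong is it?" without storing two matrices.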
Attaching package: 'igraph'
The following objects are masked from 'package:stats':
decompose, spectrum
The following object is masked from 'package:base':
union
Small igraph objects can be created using make_graph().
You can create a network from data using one of the functions from the table below. The table points to functions for converting other objects to igraph objects and igraph objects back to other objects:
Object | Object -> Igraph | Igraph -> Object |
---|---|---|
Adjacency matrix | graph_from_adjacency_matrix | as_adjacency_matrix |
Edge list | graph_from_edgelist | as_edgelist |
Data frames | graph_from_data_frame | as_data_frame |
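The functions in the table can be combined into round trips. The sketch below uses a made-up two-edge graph. The prefix in igraph::as_data_frame() is a precaution of ours, because other packages define functions with the same name.

```r
library(igraph)

# A small directed graph: 1 -> 2 and 2 -> 3.
g <- make_graph(c(1, 2, 2, 3), directed = TRUE)

# igraph -> other objects
adj <- as_adjacency_matrix(g, sparse = FALSE)    # ordinary base R matrix
el  <- as_edgelist(g)                            # two-column matrix
edf <- igraph::as_data_frame(g, what = "edges")  # data frame of edges

# other objects -> igraph: each one recreates a graph with the same two edges
g_adj <- graph_from_adjacency_matrix(adj, mode = "directed")
g_el  <- graph_from_edgelist(el, directed = TRUE)
g_df  <- graph_from_data_frame(edf, directed = TRUE)
```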
make_graph()
Function make_graph() can quickly create small networks. Relational information can be supplied in two ways:
As a vector with an even number of node IDs. Pairs of adjacent IDs are interpreted as edges:
IGRAPH eedf6ab U--- 4 3 --
+ edges from eedf6ab:
[1] 1--2 2--3 3--4
Using symbolic formula in which
-- is an undirected tie
--+ is a directed tie (+ is arrow’s head)
: refers to node sets (e.g. A -- B:C creates ties A -- B and A -- C)
IGRAPH 38c6532 DN-- 5 5 --
+ attr: name (v/c)
+ edges from 38c6532 (vertex names):
[1] A->B A->D A->E B->A C->B
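Both ways of supplying relational information can be sketched as below. The calls are our reconstruction and not necessarily the ones that produced the print-outs above. For the formula notation the sketch uses graph_from_literal(), igraph's function for symbolic formulas, and the edges are chosen to match the listing above.

```r
library(igraph)

# 1. Vector of node IDs: consecutive pairs (1,2), (2,3), (3,4) become edges.
g1 <- make_graph(c(1, 2, 2, 3, 3, 4), directed = FALSE)

# 2. Symbolic formula: -+ is a directed tie and B:D:E is a set of nodes,
#    so A -+ B:D:E creates the ties A->B, A->D and A->E.
g2 <- graph_from_literal(A -+ B:D:E, B -+ A, C -+ B)

g1
g2
```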
The print-out of g2 exemplifies how igraph summarizes igraph objects:
- the tag IGRAPH, followed by the graph ID (see ?igraph::graph_id)
- U or D if the network is Undirected or Directed
- N if the nodes have names
- W if the network is weighted
- B if the network is bipartite
- + attr: list of present attributes, each of the form nameoftheattribute (x/y) where
  - x informs about the type of an attribute: v (vertex), e (edge) or g (graph attribute)
  - y informs about the mode of an attribute: n (numeric), c (character), l (logical), or x (extended, e.g. lists)
- + edges: a list of (some of) the edges of the network

Igraph objects can be created from adjacency matrices with graph_from_adjacency_matrix():
IGRAPH 24fddef DN-- 26 88 --
+ attr: name (v/c)
+ edges from 24fddef (vertex names):
[1] 1003->1018 1003->1051 1003->1072 1006->1009 1006->1042 1006->1057
[7] 1006->1078 1009->1006 1009->1057 1009->1078 1012->1015 1012->1021
[13] 1012->1075 1015->1012 1015->1021 1015->1078 1018->1042 1018->1048
[19] 1018->1051 1018->1060 1018->1072 1021->1006 1021->1009 1021->1012
[25] 1021->1057 1021->1078 1027->1006 1027->1009 1027->1048 1027->1051
[31] 1027->1057 1027->1078 1030->1009 1030->1078 1033->1051 1036->1006
[37] 1036->1033 1036->1042 1039->1018 1039->1042 1042->1006 1042->1018
[43] 1042->1039 1042->1051 1045->1018 1045->1042 1048->1018 1048->1042
+ ... omitted several edges
Important arguments:
- mode – how to interpret the matrix
  - "directed", "undirected": directed/undirected network
  - "max", "min", "plus": determine the number of \(i\)-\(j\) relations that will be created, e.g., max( m[i,j], m[j,i] ).
  - "lower", "upper": whether to read only lower/upper triangle of the matrix
- weighted – if TRUE non-zero values of the matrix are stored in edge attribute weight
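The effect of mode and weighted can be seen on a small valued matrix. This is a sketch on made-up data, and the object names m, g_dir and g_max are ours.

```r
library(igraph)

# Made-up valued matrix: a -> b (2), b -> a (1), c -> b (3).
m <- matrix(c(0, 2, 0,
              1, 0, 0,
              0, 3, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Directed and weighted: one edge per non-zero cell, value kept as 'weight'.
g_dir <- graph_from_adjacency_matrix(m, mode = "directed", weighted = TRUE)
E(g_dir)$weight

# Undirected, using the larger of m[i, j] and m[j, i] for each pair of nodes.
g_max <- graph_from_adjacency_matrix(m, mode = "max", weighted = TRUE)
E(g_max)$weight
```

In g_max the pair a and b is joined by a single undirected edge with weight 2, the larger of the two directed values.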
Function graph_from_edgelist() expects a two-column matrix:
from to
[1,] 1003 1018
[2,] 1003 1051
[3,] 1003 1072
[4,] 1006 1009
[5,] 1006 1042
[6,] 1006 1057
Now create the object:
IGRAPH d3baf43 D--- 1078 88 --
+ edges from d3baf43:
[1] 1003->1018 1003->1051 1003->1072 1006->1009 1006->1042 1006->1057
[7] 1006->1078 1009->1006 1009->1057 1009->1078 1012->1015 1012->1021
[13] 1012->1075 1015->1012 1015->1021 1015->1078 1018->1042 1018->1048
[19] 1018->1051 1018->1060 1018->1072 1021->1006 1021->1009 1021->1012
[25] 1021->1057 1021->1078 1027->1006 1027->1009 1027->1048 1027->1051
[31] 1027->1057 1027->1078 1030->1009 1030->1078 1033->1051 1036->1006
[37] 1036->1033 1036->1042 1039->1018 1039->1042 1042->1006 1042->1018
[43] 1042->1039 1042->1051 1045->1018 1045->1042 1048->1018 1048->1042
[49] 1048->1045 1048->1051 1048->1054 1051->1018 1051->1042 1051->1045
+ ... omitted several edges
Note the number of nodes (1078)! If the edgelist matrix contains integers, the function assumes that node IDs start from 1, and thus the result will contain a lot of isolates. In this case we have to convert the matrix to character mode before passing it to graph_from_edgelist():
edgelist_matrix_ch <- as.character(edgelist_matrix)
dim(edgelist_matrix_ch) <- dim(edgelist_matrix)
graph_from_edgelist(edgelist_matrix_ch, directed=TRUE)
IGRAPH 457b9b4 DN-- 25 88 --
+ attr: name (v/c)
+ edges from 457b9b4 (vertex names):
[1] 1003->1018 1003->1051 1003->1072 1006->1009 1006->1042 1006->1057
[7] 1006->1078 1009->1006 1009->1057 1009->1078 1012->1015 1012->1021
[13] 1012->1075 1015->1012 1015->1021 1015->1078 1018->1042 1018->1048
[19] 1018->1051 1018->1060 1018->1072 1021->1006 1021->1009 1021->1012
[25] 1021->1057 1021->1078 1027->1006 1027->1009 1027->1048 1027->1051
[31] 1027->1057 1027->1078 1030->1009 1030->1078 1033->1051 1036->1006
[37] 1036->1033 1036->1042 1039->1018 1039->1042 1042->1006 1042->1018
[43] 1042->1039 1042->1051 1045->1018 1045->1042 1048->1018 1048->1042
+ ... omitted several edges
This also shows the disadvantage of solely relying on edgelist representation as we are missing one boy who is an isolate.
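The difference between numeric and character edgelists is easiest to see with a single made-up edge between nodes labelled 5 and 7 (a sketch, and the object names are ours):

```r
library(igraph)

# Numeric edgelist: the values are read as vertex indices, so vertices
# 1 to 7 are all created even though only 5 and 7 appear in the data.
el_num <- matrix(c(5, 7), ncol = 2)
g_num <- graph_from_edgelist(el_num, directed = TRUE)
vcount(g_num)

# Character edgelist: the values are read as vertex names.
el_chr <- matrix(c("5", "7"), ncol = 2)
g_chr <- graph_from_edgelist(el_chr, directed = TRUE)
vcount(g_chr)
V(g_chr)$name
```

The numeric version has seven vertices, five of them isolates. The character version has exactly the two vertices named in the data.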
Igraph objects can be created from data frames with data on edges and, optionally, on vertices with graph_from_data_frame():
classroom_kids <- read.csv("classroom-nodes.csv", header=TRUE, colClasses=c(name = "character"))
head(classroom_kids)
name female isei08_m isei08_f
1 1003 FALSE NA 25.71
2 1006 TRUE 14.64 33.76
3 1009 TRUE 28.48 37.22
4 1012 TRUE 26.64 25.23
5 1015 TRUE 21.24 NA
6 1018 FALSE 23.47 24.45
classroom_play <- read.csv("classroom-edges.csv", header=TRUE, colClasses = c(from="character", to="character"))
head(classroom_play)
from to liking
1 1003 1018 3
2 1003 1051 3
3 1003 1072 4
4 1006 1009 5
5 1006 1042 1
6 1006 1057 4
classroom <- graph_from_data_frame(classroom_play, vertices=classroom_kids,
directed=TRUE)
classroom
IGRAPH d44beb4 DN-- 26 88 --
+ attr: name (v/c), female (v/l), isei08_m (v/n), isei08_f (v/n),
| liking (e/n)
+ edges from d44beb4 (vertex names):
[1] 1003->1018 1003->1051 1003->1072 1006->1009 1006->1042 1006->1057
[7] 1006->1078 1009->1006 1009->1057 1009->1078 1012->1015 1012->1021
[13] 1012->1075 1015->1012 1015->1021 1015->1078 1018->1042 1018->1048
[19] 1018->1051 1018->1060 1018->1072 1021->1006 1021->1009 1021->1012
[25] 1021->1057 1021->1078 1027->1006 1027->1009 1027->1048 1027->1051
[31] 1027->1057 1027->1078 1030->1009 1030->1078 1033->1051 1036->1006
[37] 1036->1033 1036->1042 1039->1018 1039->1042 1042->1006 1042->1018
+ ... omitted several edges
- The first two columns of the edge data frame (classroom_play) are vertex IDs, additional columns are interpreted as edge attributes.
- The first column of the node data frame (classroom_kids) is vertex ID, additional columns are interpreted as vertex attributes.
- All vertex IDs used in the edge data frame (classroom_play) must be present in the node data frame (classroom_kids).

Package tidygraph uses igraph internally to store network data but provides a “tidy” interface for data manipulation. Network data are interfaced as two interconnected data frames: (1) nodes and (2) edges. This is very similar to the data structure accepted by igraph::graph_from_data_frame() demonstrated above.
Attaching package: 'tidygraph'
The following object is masked from 'package:stats':
filter
Objects can be created with:
tbl_graph() from two data frames, similarly to igraph::graph_from_data_frame()
tg_classroom <- tbl_graph(nodes = classroom_kids, edges = classroom_play,
directed = TRUE)
tg_classroom
# A tbl_graph: 26 nodes and 88 edges
#
# A directed simple graph with 2 components
#
# Node Data: 26 × 4 (active)
name female isei08_m isei08_f
<chr> <lgl> <dbl> <dbl>
1 1003 FALSE NA 25.7
2 1006 TRUE 14.6 33.8
3 1009 TRUE 28.5 37.2
4 1012 TRUE 26.6 25.2
5 1015 TRUE 21.2 NA
6 1018 FALSE 23.5 24.4
7 1021 TRUE 21.2 25.2
8 1024 FALSE NA 25.7
9 1027 TRUE NA 26.0
10 1030 TRUE 26.6 25.7
# ℹ 16 more rows
#
# Edge Data: 88 × 3
from to liking
<int> <int> <int>
1 1 6 3
2 1 17 3
3 1 24 4
# ℹ 85 more rows
as_tbl_graph() which accepts a variety of objects: adjacency matrices, igraph, network, ggraph and some more (cf. the documentation)
# A tbl_graph: 26 nodes and 88 edges
#
# A directed simple graph with 2 components
#
# Node Data: 26 × 4 (active)
name female isei08_m isei08_f
<chr> <lgl> <dbl> <dbl>
1 1003 FALSE NA 25.7
2 1006 TRUE 14.6 33.8
3 1009 TRUE 28.5 37.2
4 1012 TRUE 26.6 25.2
5 1015 TRUE 21.2 NA
6 1018 FALSE 23.5 24.4
7 1021 TRUE 21.2 25.2
8 1024 FALSE NA 25.7
9 1027 TRUE NA 26.0
10 1030 TRUE 26.6 25.7
# ℹ 16 more rows
#
# Edge Data: 88 × 3
from to liking
<int> <int> <int>
1 1 6 3
2 1 17 3
3 1 24 4
# ℹ 85 more rows
# A tbl_graph: 25 nodes and 88 edges
#
# A directed simple graph with 1 component
#
# Node Data: 25 × 1 (active)
na
<lgl>
1 FALSE
2 FALSE
3 FALSE
4 FALSE
5 FALSE
6 FALSE
7 FALSE
8 FALSE
9 FALSE
10 FALSE
# ℹ 15 more rows
#
# Edge Data: 88 × 4
from to liking na
<int> <int> <int> <lgl>
1 1 6 3 FALSE
2 1 16 3 FALSE
3 1 23 4 FALSE
# ℹ 85 more rows
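A minimal sketch of as_tbl_graph() on a made-up igraph object follows. The object names g and tg are ours, and the data are not the classroom data.

```r
library(tidygraph)

# A small igraph object whose vertices carry a 'name' attribute.
g <- igraph::graph_from_literal(A -+ B, B -+ C)

# as_tbl_graph() wraps it; the result is still a valid igraph object.
tg <- as_tbl_graph(g)
class(tg)

# Nodes and edges can be pulled out as tibbles.
as_tibble(activate(tg, nodes))
as_tibble(activate(tg, edges))
```

Because a tbl_graph is also an igraph object, igraph functions such as igraph::vcount() keep working on it.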
In tidygraph you can use dplyr (Wickham et al. 2021) verbs such as mutate(), select() etc. once you activate() either the nodes or edges data frame. Here are some examples.
Calculate the social status of a kid’s family as the minimum of the social statuses of the mother and the father:
# A tbl_graph: 26 nodes and 88 edges
#
# A directed simple graph with 2 components
#
# Node Data: 26 × 5 (active)
name female isei08_m isei08_f status
<chr> <lgl> <dbl> <dbl> <dbl>
1 1003 FALSE NA 25.7 25.7
2 1006 TRUE 14.6 33.8 14.6
3 1009 TRUE 28.5 37.2 28.5
4 1012 TRUE 26.6 25.2 25.2
5 1015 TRUE 21.2 NA 21.2
6 1018 FALSE 23.5 24.4 23.5
7 1021 TRUE 21.2 25.2 21.2
8 1024 FALSE NA 25.7 25.7
9 1027 TRUE NA 26.0 26.0
10 1030 TRUE 26.6 25.7 25.7
# ℹ 16 more rows
#
# Edge Data: 88 × 3
from to liking
<int> <int> <int>
1 1 6 3
2 1 17 3
3 1 24 4
# ℹ 85 more rows
Similarly to dplyr you can use the pipe operator %>% to chain multiple data transformations. Here we add a node attribute first, then the edge attribute like5:
tg_classroom %>%
activate(nodes) %>%
mutate(
status = pmin(isei08_m, isei08_f, na.rm=TRUE)
) %>%
activate(edges) %>%
mutate(
like5 = liking == 5 # TRUE if liking is 5
)
# A tbl_graph: 26 nodes and 88 edges
#
# A directed simple graph with 2 components
#
# Edge Data: 88 × 4 (active)
from to liking like5
<int> <int> <int> <lgl>
1 1 6 3 FALSE
2 1 17 3 FALSE
3 1 24 4 FALSE
4 2 3 5 TRUE
5 2 14 1 FALSE
6 2 19 4 FALSE
7 2 26 1 FALSE
8 3 2 2 FALSE
9 3 19 3 FALSE
10 3 26 5 TRUE
# ℹ 78 more rows
#
# Node Data: 26 × 5
name female isei08_m isei08_f status
<chr> <lgl> <dbl> <dbl> <dbl>
1 1003 FALSE NA 25.7 25.7
2 1006 TRUE 14.6 33.8 14.6
3 1009 TRUE 28.5 37.2 28.5
# ℹ 23 more rows
You can refer to node attributes with .N() when computing on the edges data frame, and refer to edge attributes with .E() when computing on the nodes. For example, to add an edge attribute which is TRUE if the gender of ego and alter match and FALSE otherwise, we can use .N() in the following manner. Function .N() returns the node data frame.
tg_classroom %>%
activate(edges) %>%
mutate(
# Add edge attribute which is TRUE if gender of ego and alter match
sex_match = .N()$female[from] == .N()$female[to]
)
# A tbl_graph: 26 nodes and 88 edges
#
# A directed simple graph with 2 components
#
# Edge Data: 88 × 4 (active)
from to liking sex_match
<int> <int> <int> <lgl>
1 1 6 3 TRUE
2 1 17 3 TRUE
3 1 24 4 TRUE
4 2 3 5 TRUE
5 2 14 1 FALSE
6 2 19 4 TRUE
7 2 26 1 TRUE
8 3 2 2 TRUE
9 3 19 3 TRUE
10 3 26 5 TRUE
# ℹ 78 more rows
#
# Node Data: 26 × 4
name female isei08_m isei08_f
<chr> <lgl> <dbl> <dbl>
1 1003 FALSE NA 25.7
2 1006 TRUE 14.6 33.8
3 1009 TRUE 28.5 37.2
# ℹ 23 more rows
Use filter() to select subgraphs:
Select the subgraph of girls and relations between them:
# A tbl_graph: 13 nodes and 41 edges
#
# A directed simple graph with 1 component
#
# Node Data: 13 × 4 (active)
name female isei08_m isei08_f
<chr> <lgl> <dbl> <dbl>
1 1006 TRUE 14.6 33.8
2 1009 TRUE 28.5 37.2
3 1012 TRUE 26.6 25.2
4 1015 TRUE 21.2 NA
5 1021 TRUE 21.2 25.2
6 1027 TRUE NA 26.0
7 1030 TRUE 26.6 25.7
8 1057 TRUE 70.4 51.5
9 1063 TRUE 25.0 26.8
10 1066 TRUE 24.9 26.0
11 1069 TRUE 24.9 26.0
12 1075 TRUE 14.2 11.7
13 1078 TRUE 28.5 28.5
#
# Edge Data: 41 × 3
from to liking
<int> <int> <int>
1 1 2 5
2 1 8 4
3 1 13 1
# ℹ 38 more rows
Select a subgraph of relations for which liking is at least 3:
# A tbl_graph: 26 nodes and 56 edges
#
# A directed simple graph with 3 components
#
# Edge Data: 56 × 3 (active)
from to liking
<int> <int> <int>
1 1 6 3
2 1 17 3
3 1 24 4
4 2 3 5
5 2 19 4
6 3 19 3
7 3 26 5
8 4 5 5
9 4 7 4
10 4 25 3
# ℹ 46 more rows
#
# Node Data: 26 × 4
name female isei08_m isei08_f
<chr> <lgl> <dbl> <dbl>
1 1003 FALSE NA 25.7
2 1006 TRUE 14.6 33.8
3 1009 TRUE 28.5 37.2
# ℹ 23 more rows
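The two kinds of filtering shown above can be sketched on made-up data as follows. The objects toy_nodes, toy_edges and toy_tg are ours. The column names female and liking mirror the classroom data, but the values are invented.

```r
library(tidygraph)

# Made-up data: three kids and three directed "liking" ties.
toy_nodes <- data.frame(name = c("a", "b", "c"),
                        female = c(TRUE, TRUE, FALSE),
                        stringsAsFactors = FALSE)
toy_edges <- data.frame(from = c("a", "b", "c"),
                        to = c("b", "c", "a"),
                        liking = c(5, 2, 4),
                        stringsAsFactors = FALSE)
toy_tg <- tbl_graph(nodes = toy_nodes, edges = toy_edges, directed = TRUE)

# Filtering nodes also drops every edge that touches a removed node.
girls <- toy_tg %>%
  activate(nodes) %>%
  filter(female)

# Filtering edges keeps all nodes and only removes ties.
strong <- toy_tg %>%
  activate(edges) %>%
  filter(liking >= 3)
```

This matches the pattern in the print-outs above: the girls-only subgraph has fewer nodes and edges, while the edge filter leaves all 26 nodes in place.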
Loading required package: BiocGenerics
Attaching package: 'BiocGenerics'
The following object is masked from 'package:statnet.common':
order
The following objects are masked from 'package:stats':
IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':
anyDuplicated, aperm, append, as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames, sapply, setdiff, table,
tapply, union, unique, unsplit, which.max, which.min
Package graph is implemented using the S4 class system (see e.g. the part “Object oriented programming” of Wickham (2019), especially chapter 15 on the S4 system). The two main classes of objects are:
- graphAM – networks internally stored as adjacency matrices
- graphNEL – networks internally stored as adjacency lists (imprecisely called “edge lists” in the documentation). An adjacency list is a list (class of R object) with an element for every node being a vector of adjacent nodes.

Objects can be created with functions with the above names. From adjacency matrices:
A graphAM graph with directed edges
Number of Nodes = 26
Number of Edges = 88
To demonstrate graphNEL we have to create an adjacency list first. We can create it from the adjacency matrix relations like so:
adjlist <- apply(relations, 1, function(r) rownames(relations)[which(r == 1)])
head(adjlist) # initial elements of the adj. list
$`1003`
[1] "1018" "1051" "1072"
$`1006`
[1] "1009" "1042" "1057" "1078"
$`1009`
[1] "1006" "1057" "1078"
$`1012`
[1] "1015" "1021" "1075"
$`1015`
[1] "1012" "1021" "1078"
$`1018`
[1] "1042" "1048" "1051" "1060" "1072"
… and now the object:
gr2 <- graphNEL(
nodes = classroom_kids$name, # names of the nodes
edgeL = adjlist, # adjacency list of node names
edgemode = "directed"
)
gr2
A graphNEL graph with directed edges
Number of Nodes = 26
Number of Edges = 88
Both types of objects, graphAM and graphNEL, can store edge and node attributes. There are separate functions edgeData() and nodeData() for setting and accessing edge/node attributes. For example, to add the female attribute we need to:
# Set the default value, say FALSE
nodeDataDefaults(gr2, attr="female") <- FALSE
# Assign the values
nodeData(gr2, n = classroom_kids$name, attr="female") <- classroom_kids$female
Working with edge attributes looks similar, but uses function edgeData() like so:
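A minimal sketch of the edge-attribute workflow, on a made-up three-node graph rather than gr2, is shown below. It assumes the Bioconductor package graph is installed. The attribute name liking is borrowed from the classroom data, and its value here is invented.

```r
library(graph)

# Made-up directed graph given as an adjacency list of node names.
toy_gr <- graphNEL(nodes = c("a", "b", "c"),
                   edgeL = list(a = "b", b = "c", c = "a"),
                   edgemode = "directed")

# As with node attributes, a default value has to be declared first.
edgeDataDefaults(toy_gr, attr = "liking") <- NA_real_

# Assign a value to the edge a -> b, then read it back.
edgeData(toy_gr, from = "a", to = "b", attr = "liking") <- 5
edgeData(toy_gr, from = "a", to = "b", attr = "liking")
```

Edges that were not assigned a value keep the declared default, here NA.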
Use intergraph (Bojanowski 2015) to convert data objects between igraph and network representations.
# igraph -> network
classroom_network <- intergraph::asNetwork(classroom)
# network -> igraph
classroom_igraph <- intergraph::asIgraph(classroom_network)
classroom_network
Network attributes:
vertices = 26
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 88
missing edges= 0
non-missing edges= 88
Vertex attribute names:
female isei08_f isei08_m vertex.names
Edge attribute names:
liking
IGRAPH f63c10e D--- 26 88 --
+ attr: female (v/l), isei08_f (v/n), isei08_m (v/n), na (v/l),
| vertex.names (v/c), liking (e/n), na (e/l)
+ edges from f63c10e:
[1] 1-> 6 1->17 1->24 2-> 3 2->14 2->19 2->26 3-> 2 3->19 3->26
[11] 4-> 5 4-> 7 4->25 5-> 4 5-> 7 5->26 6->14 6->16 6->17 6->20
[21] 6->24 7-> 2 7-> 3 7-> 4 7->19 7->26 9-> 2 9-> 3 9->16 9->17
[31] 9->19 9->26 10-> 3 10->26 11->17 12-> 2 12->11 12->14 13-> 6 13->14
[41] 14-> 2 14-> 6 14->13 14->17 15-> 6 15->14 16-> 6 16->14 16->15 16->17
[51] 16->18 17-> 6 17->14 17->15 17->16 17->18 18-> 6 18->13 18->16 18->17
[61] 18->24 19-> 3 19->26 20-> 1 20-> 6 20->14 20->17 21-> 2 21->14 21->22
+ ... omitted several edges
All the attributes are copied properly.
Use igraph::as_graphnel() and igraph::graph_from_graphnel() for igraph <-> graph conversions.
Package / Class | network | igraph | graphNEL (1) | graphAM (1) |
---|---|---|---|---|
Bipartite | v | v | x | x |
Hypergraphs (2) | v | x | x | x |
Vertex attributes | v | v | v | v |
Edge attributes | v | v | v | v |
Graph attributes | v | v | v | v |
Attributes can be lists | v | v | x | x |
Multigraphs (3) | x | v | x | x |

(1) Package 'graph'.
(2) Networks with edges connecting sets of vertices.
(3) Networks with multiple edges in the same dyad.
'network' 1.18.2 (2023-12-04), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information
sna: Tools for Social Network Analysis
Version 2.7-2 created on 2023-12-05.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
For citation information, type citation("sna").
Type help(package="sna") to get started.
We can plot matrices using the gplot routine from the sna package:
Or we can plot network objects using the plot command, which automatically picks up the network-level attribute recording that the network is undirected. gplot came first; the network package, with its more specialized data structures, was written years later, but we preserve gplot for its ability to work directly with matrices.
More layout options are included in the sna package for gplot. Here’s one that selects a node and tries to arrange the other nodes around it as a bullseye.
Let’s color the nodes in gender-stereotypic colors and increase the node size.
nodeColors <- ifelse(nodeInfo$female, "hotpink", "dodgerblue")
plot(nrelations, displaylabels=TRUE, vertex.col=nodeColors, vertex.cex=3)
The same works with edgelists, where it is also simple to display the edge weights.
We can now look at slightly more complicated data in the supplied dataset. Plot the contiguity among nations in 1993 (from the Correlates of War (CoW) project).
Here’s an example of directed data: militarized interstate disputes (MIDs) for 1993, with added labels.
All those isolates can get in the way. We can suppress them using the displayisolates argument.
When a layout is generated, the results can be saved for later reuse. Here we use a spring-embedded algorithm on the global contiguity plot to place the nodes and then plot the edges from the militarized interstate dispute network. It’s very approximate, but generally shorter edges are attacks between contiguous or neighbouring countries and longer edges are between countries that are farther apart.
coords <- gplot(contig_1993,gmode="graph",label=colnames(contig_1993[,]),label.cex=0.5,label.col="blue") # Capture the magic of the moment
x y
[1,] 23.54235 11.50500
[2,] 25.65262 20.06014
[3,] 32.49855 11.43293
[4,] 34.05418 6.65080
[5,] 38.10269 11.25545
[6,] 35.63348 16.57894
Saved (or a priori) layouts can be used via the coord argument
gplot(mids_1993,gmode="graph",label=colnames(contig_1993[,]),label.cex=0.5,label.col="blue",coord=coords)
When the default settings are insufficient, interactive mode allows for tweaking. This is a bit clunky, but can be very useful for getting a specific image exactly right. We haven’t run it here, but you can play around with it later.
Attaching package: 'igraph'
The following objects are masked from 'package:BiocGenerics':
normalize, path, union
The following objects are masked from 'package:stats':
decompose, spectrum
The following object is masked from 'package:base':
union
Network visualization is performed using the plot function.
With default settings it looks like this:
Example plots using the classroom data:
plot(
classroom,
layout=layout_with_fr,
vertex.color="white",
vertex.size=15,
edge.arrow.size=0.5,
vertex.label.color="black",
vertex.label.family="sans",
vertex.label=ifelse(V(classroom)$female, "F", "M")
)
Notable layouts in sna:
Notable layouts in igraph:
Package graphlayouts:
The network package has many routines to describe the data. Dyads give the number of possible edges: \(n(n-1)\) for a directed graph, and \(\frac{n(n-1)}{2}\) for an undirected graph. The edge count gives the number of actual edges, and the size gives the number of nodes.
[1] 650
[1] 88
[1] 26
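As a minimal sketch of where these three numbers come from, the counts can be reproduced in base R from a 0/1 adjacency matrix. The toy matrix `adj` and the variable names are made up for illustration; this is not the network package's own code.

```r
# Toy directed network stored as a 0/1 adjacency matrix (hypothetical data).
adj <- matrix(0, nrow = 4, ncol = 4)
adj[1, 2] <- 1
adj[2, 1] <- 1
adj[3, 4] <- 1

n <- nrow(adj)                        # size: number of nodes
dyads_directed <- n * (n - 1)         # ordered pairs of distinct nodes
dyads_undirected <- n * (n - 1) / 2   # unordered pairs of distinct nodes
edges <- sum(adj)                     # number of directed edges present

c(size = n, dyads = dyads_directed, edges = edges)

# The same arithmetic for a directed network with 26 nodes,
# matching the dyad count printed above.
classroom_dyads <- 26 * 25
```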
Going back to the Correlates of War data, we can look at our centrality measures. Freeman degree, also called total degree, is the sum of indegree and outdegree. A single degree function computes all three, with Freeman degree as the default; the cmode argument determines which kind of degree is calculated.
[1] 5 1 0 0 6 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0
[26] 0 0 0 0 0 0 0 0 1 0 3 0 2 2 0 4 0 0 0 1 0 0 1 1 0
[51] 0 0 0 2 0 0 2 1 3 14 2 1 1 2 0 1 1 9 0 0 0 0 0 3 1
[76] 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2 1 0 0 0 0 1 1 0 1
[101] 0 0 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[126] 0 0 0 0 0 0 0 1 3 5 8 2 1 1 0 2 1 0 2 0 0 0 0 6 1
[151] 1 1 1 1 4 0 1 4 1 1 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0
[176] 0 0 0 1 0 0 1 0 0 0 0
ideg <- degree(mids_1993, cmode="indegree") # Indegree for MIDs
odeg <- degree(mids_1993, cmode="outdegree") # Outdegree for MIDs
all(degree(mids_1993) == ideg+odeg) # In + out = total?
[1] TRUE
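The identity checked above is easy to see on a small example. The following sketch computes the three kinds of degree directly from an adjacency matrix in base R; the matrix is a made-up toy, and rows are assumed to be senders and columns receivers.

```r
# Toy directed network (hypothetical): 1->2, 1->3, 2->3, 3->1.
adj <- matrix(c(0, 1, 1,
                0, 0, 1,
                1, 0, 0), nrow = 3, byrow = TRUE)

outdeg <- rowSums(adj)     # ties sent by each node
indeg  <- colSums(adj)     # ties received by each node
total  <- indeg + outdeg   # Freeman (total) degree

rbind(indeg, outdeg, total)
```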
Once centrality scores are computed, we can handle them using standard R methods. Here, the dashed line indicates where on the plot outgoing attacks would equal incoming attacks (y=x); countries above this line are net aggressors, and countries below are net defenders.
plot(ideg,
odeg,
type="n",
xlab="Incoming MIDs",
ylab="Outgoing MIDs") # Plot ideg by odeg
abline(0, 1, lty=3)
text(jitter(ideg),
jitter(odeg),
network.vertex.names(contig_1993),
cex=0.75,
col=2)
Plot simple histograms of the degree distributions. These can be quite useful to get a sense of how skewed the network is.
Centrality scores can also be used with other sna routines, e.g., gplot(). Here we’ve used the color functionality in rgb to shade each node by how much of an aggressor (red) or defender (blue) it is compared to the other countries.
gplot(mids_1993,
vertex.cex=(ideg+odeg)^0.5,
vertex.sides=50,
label.cex=0.4,
vertex.col=rgb(odeg/max(odeg),0,ideg/max(ideg)),
displaylabels=TRUE,
displayisolates=FALSE)
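To see what the rgb() call does with the degree scores, here is a small base R sketch with made-up indegree and outdegree values. The red channel is scaled outdegree, the blue channel is scaled indegree, and green is fixed at zero.

```r
# Hypothetical degree scores for three nodes.
ideg <- c(0, 2, 5)   # incoming ties
odeg <- c(4, 2, 0)   # outgoing ties

# Scale each score to [0, 1] by its maximum, then mix red and blue.
node_cols <- rgb(odeg / max(odeg), 0, ideg / max(ideg))
node_cols

# A pure sender is pure red and a pure receiver is pure blue.
node_cols[1] == "#FF0000"
node_cols[3] == "#0000FF"
```

Nodes with both incoming and outgoing ties end up in shades of purple.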
Betweenness and closeness are also popular centrality measures.
[1] 4997.6666667 0.0000000 1780.9333333 2063.4119048 445.9444444
[6] 2237.2984127 550.4444444 29.2618687 1.4553030 0.2916667
[11] 2.2191919 2.2191919 2.3441919 0.2916667 387.7916667
[16] 640.0412698 1.0079365 262.8190476 197.6650794 0.4444444
[21] 102.2246032 36.8031746 0.0000000 1359.0634921 1206.5426768
[26] 44.5261544 2.2166667 0.0000000 242.5166667 760.7691198
[31] 5.3095238 0.0000000 0.5833333 5.8928571 0.0000000
[36] 65.6519270 0.0000000 8.7941824 1.0000000 0.0000000
[41] 1558.2179920 0.0000000 0.0000000 123.6114706 244.8535495
[46] 0.0000000 0.0000000 1216.0722311 135.5883869 287.1149376
[51] 149.1209944 3.8980482 19.8155403 1046.8181481 0.0000000
[56] 11.4371615 11.9421356 1.3251984 109.5508382 157.8191065
[61] 0.0000000 12.4300337 526.2041909 9.9764069 132.2971043
[66] 0.0000000 137.1758265 7944.2773019 3.3530897 10.4082027
[71] 9.9320123 639.1920120 3.9538070 4.1910347 24.4580713
[76] 215.9500257 1.0678571 4.4209469 86.0305490 683.1869157
[81] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
[86] 0.0000000 1071.2878616 464.8800926 61.9749147 231.9178436
[91] 1337.5873379 132.3104055 315.4354711 133.8140313 4.4364799
[96] 0.0000000 54.8897054 2.0193708 376.7564917 496.6051739
[101] 165.3015620 141.0963159 561.2867358 73.2393103 1264.8267262
[106] 266.5677184 240.3355133 1481.3784317 0.0000000 0.3333333
[111] 916.4855143 7.9373749 44.9283484 9.5413404 254.5378746
[116] 495.7198422 620.0837129 22.4303647 0.0000000 192.0392450
[121] 103.3992854 0.0000000 21.3470314 0.0000000 0.0000000
[126] 49.1977850 0.0000000 228.1977850 29.7663917 1586.8067567
[131] 113.6367319 3396.9524336 2561.6616178 1385.6792878 4295.1559857
[136] 152.0162757 931.6201568 20.0901876 0.0000000 2.0239130
[141] 62.2773441 782.4652751 108.3209576 0.0000000 3.2329359
[146] 3.2329359 0.0000000 1171.7832050 121.9368695 242.2907025
[151] 6.4333836 6.5762408 2.2023810 360.4879556 2805.8710749
[156] 0.0000000 0.0000000 0.0000000 0.0000000 344.9912103
[161] 1217.3329177 0.0000000 1098.0060354 0.0000000 238.4518666
[166] 0.0000000 0.0000000 0.0000000 18.6739827 0.7500000
[171] 101.0817479 293.4544540 125.5886906 0.0000000 0.0000000
[176] 527.4145766 972.9699512 0.0000000 356.5000000 0.0000000
[181] 0.0000000 179.0000000 0.0000000 0.0000000 0.0000000
[186] 0.0000000
Closeness can be a bit tricky for disconnected graphs, but there are alternative definitions that fix some problems - see the help file for a discussion.
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[186] 0
[1] 0.2933419 0.2186251 0.2515980 0.2615079 0.2263751 0.2389878 0.2335823
[8] 0.1879666 0.1906693 0.1843630 0.1906693 0.1906693 0.1933720 0.1843630
[15] 0.2079823 0.2450215 0.2098887 0.2169157 0.2227715 0.1890634 0.2102346
[22] 0.2030274 0.1798585 0.2300544 0.2318562 0.1956243 0.1650359 0.1753540
[29] 0.1879666 0.2059846 0.1677386 0.1614323 0.1560269 0.1668377 0.1587296
[36] 0.2696933 0.2283376 0.2669906 0.2660897 0.2579816 0.3151094 0.2524024
[43] 0.2184277 0.2819562 0.2747447 0.2287881 0.2089489 0.3321321 0.3088889
[50] 0.2905148 0.2698734 0.2568104 0.2662698 0.3195903 0.2252660 0.2704912
[57] 0.2635736 0.2438438 0.2689790 0.2696096 0.2074174 0.2551051 0.3184234
[64] 0.2847297 0.2816860 0.2368876 0.2830373 0.3847555 0.2901502 0.2955556
[71] 0.2928529 0.3213320 0.2775247 0.2775676 0.3015122 0.3166281 0.2795066
[78] 0.2955556 0.2901502 0.3186186 0.0000000 0.1664139 0.2078757 0.1722698
[85] 0.2078757 0.1664139 0.2574732 0.2130108 0.2329043 0.2401308 0.2904397
[92] 0.2136736 0.2125604 0.2360317 0.1766585 0.1688464 0.2116208 0.2062154
[99] 0.2553754 0.2564565 0.2218396 0.2577499 0.2858451 0.2302800 0.2702724
[106] 0.2540562 0.2578829 0.2575623 0.2168888 0.2195915 0.2580051 0.2421943
[113] 0.2638288 0.2544595 0.2290188 0.2222053 0.2396301 0.1955836 0.2041872
[120] 0.1892055 0.1984279 0.1491523 0.1924305 0.1738151 0.1747160 0.2030611
[127] 0.1616358 0.2057638 0.2347254 0.3060768 0.2867074 0.3400837 0.3156306
[134] 0.3292664 0.3650965 0.2974775 0.3229665 0.2809459 0.2728378 0.2579344
[141] 0.2868018 0.3105405 0.2715315 0.2577864 0.2664736 0.2664736 0.2601673
[148] 0.2866924 0.2848005 0.2976963 0.2533719 0.2542728 0.2411712 0.3125997
[155] 0.3420592 0.2798906 0.2476190 0.2852960 0.2852960 0.2956564 0.2888739
[162] 0.2476190 0.2957400 0.2138846 0.2636551 0.2102810 0.2102810 0.2476190
[169] 0.2353260 0.2115637 0.2548263 0.2618533 0.2398305 0.1887194 0.2061583
[176] 0.2618533 0.2465873 0.1882690 0.1945753 0.0000000 0.1339852 0.1593136
[183] 0.0000000 0.0000000 0.1882690 0.0000000
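The all-zero vector above is what classic closeness does on a disconnected graph. The sketch below illustrates why on a toy graph in base R, alongside one common alternative based on the sum of inverse distances. The function `geodesic_dist` and the toy data are hypothetical, and we only assume (rather than claim) that this alternative matches one of the options described in the sna help file.

```r
# Undirected toy graph (hypothetical): path 1 - 2 - 3 plus the isolate 4.
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
adj[2, 3] <- adj[3, 2] <- 1

# All-pairs shortest path lengths (Floyd-Warshall); Inf marks unreachable pairs.
geodesic_dist <- function(adj) {
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(nrow(d))) {
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  d
}

d <- geodesic_dist(adj)
n <- nrow(d)

# Classic closeness: (n - 1) / sum of distances to all other nodes.
# A single unreachable node makes the sum infinite, so the score is 0.
closeness_classic <- (n - 1) / rowSums(d)

# Alternative: average inverse distance, where 1 / Inf counts as 0.
inv_d <- 1 / d
diag(inv_d) <- 0
closeness_suminv <- rowSums(inv_d) / (n - 1)

rbind(closeness_classic, closeness_suminv)
```

With the inverse-distance version, only true isolates score zero, which is the pattern visible in the second vector above.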
From centrality to centralization. Here we nest commands: the cmode input is sent to the degree function.
[1] 0.05773557
[1] 0.3376634
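For intuition about what a centralization index measures, here is a minimal base R sketch of Freeman's degree centralization for an undirected simple graph: the summed gap between the most central node and every other node, divided by the largest value that sum can take, \((n-1)(n-2)\), which a star graph attains. The function name `degree_centralization` is hypothetical, and this is not a reimplementation of the sna routine, whose normalization depends on its arguments.

```r
# Freeman degree centralization for an undirected simple graph
# given as a symmetric 0/1 adjacency matrix.
degree_centralization <- function(adj) {
  n <- nrow(adj)
  deg <- rowSums(adj)
  sum(max(deg) - deg) / ((n - 1) * (n - 2))
}

# Star on 5 nodes: node 1 is tied to everyone else.
star <- matrix(0, 5, 5)
star[1, 2:5] <- 1
star[2:5, 1] <- 1

# Ring on 5 nodes: every node has exactly two neighbours.
ring <- matrix(0, 5, 5)
for (i in 1:5) {
  j <- i %% 5 + 1
  ring[i, j] <- 1
  ring[j, i] <- 1
}

star_cent <- degree_centralization(star)   # maximally centralized
ring_cent <- degree_centralization(ring)   # perfectly even
c(star = star_cent, ring = ring_cent)
```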
Elementary graph-level indices are pretty useful. Density is the number of edges divided by the number of possible edges, or \(\frac{E}{n(n-1)}\) for a directed network, and \(\frac{2E}{n(n-1)}\) for undirected graphs.
[1] 0.002034292
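The density printed above can be checked by hand. This sketch assumes the counts reported elsewhere in this section: 186 countries, and 3 mutual plus 64 asymmetric dyads in the MIDs network, which gives 2 * 3 + 64 = 70 directed edges. The helper `density_directed` is a hypothetical name.

```r
# Density of a directed network: edges over ordered pairs of distinct nodes.
density_directed <- function(edges, n) edges / (n * (n - 1))

n_countries <- 186
mids_edges <- 2 * 3 + 64   # each mutual dyad contributes two directed edges

mids_density <- density_directed(mids_edges, n_countries)
mids_density
```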
The MAN distribution is quite useful; it lists the number of Mutual, Asymmetric, and Null dyads in a given graph:
Mut Asym Null
[1,] 3 64 17138
Mut Asym Null
[1,] 534 0 16671
Reciprocity is calculated from the numbers in the dyad census. The default routine defines reciprocity as \(\frac{M+N}{M+A+N}\). This is often not what we first think of as reciprocity: null dyads count as reciprocated, which makes the MIDs network seem quite reciprocal.
Edgewise reciprocity, defined as \(\frac{2M}{2M+A}\), is interpreted as the probability that a tie sent is also received. Under this definition the MIDs network has very low reciprocity.
Mut
0.9962802
Mut
0.08571429
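Both values can be recomputed from the dyad census. The sketch below builds a census from a toy adjacency matrix in base R and then plugs in the MIDs counts printed above (3 mutual, 64 asymmetric, 17138 null). The function names are hypothetical; the edgewise version counts each mutual dyad as two reciprocated edges, \(\frac{2M}{2M+A}\), which is the form that reproduces the printed 0.0857.

```r
# Count mutual, asymmetric and null dyads in a directed 0/1 adjacency matrix.
dyad_census <- function(adj) {
  pair_sum <- (adj + t(adj))[upper.tri(adj)]   # 2 = mutual, 1 = asym, 0 = null
  c(Mut = sum(pair_sum == 2),
    Asym = sum(pair_sum == 1),
    Null = sum(pair_sum == 0))
}

recip_dyadic   <- function(m, a, n) (m + n) / (m + a + n)
recip_edgewise <- function(m, a) 2 * m / (2 * m + a)

# Toy network (hypothetical): 1 <-> 2, 1 -> 3, node 4 isolated.
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
adj[1, 3] <- 1
census <- dyad_census(adj)
census

# MIDs network, using the census counts shown above.
mids_dyadic   <- recip_dyadic(3, 64, 17138)
mids_edgewise <- recip_edgewise(3, 64)
c(dyadic = mids_dyadic, edgewise = mids_edgewise)
```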
Transitivity is the proportion of paths \(i \to j \to k\) where the \(i \to k\) edge is also present.
[1] 0.02409639
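This definition translates directly into matrix arithmetic. As a sketch, assuming a 0/1 adjacency matrix without self-loops, the number of two-step paths from i to k is the (i, k) entry of the squared matrix, and a path is closed when the direct edge is also present. The function name `transitivity_ratio` is hypothetical.

```r
# Share of two-paths i -> j -> k (i, j, k distinct) closed by an edge i -> k.
transitivity_ratio <- function(adj) {
  diag(adj) <- 0
  two_paths <- adj %*% adj    # two_paths[i, k] = number of j with i -> j -> k
  diag(two_paths) <- 0        # drop round trips i -> j -> i
  sum(two_paths * adj) / sum(two_paths)
}

# Toy network (hypothetical): 1->2, 2->3, 1->3, 3->4.
adj <- matrix(0, 4, 4)
adj[1, 2] <- 1
adj[2, 3] <- 1
adj[1, 3] <- 1
adj[3, 4] <- 1

# Two-paths: 1->2->3 (closed by 1->3), 2->3->4 and 1->3->4 (both open).
toy_trans <- transitivity_ratio(adj)
toy_trans
```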
IGRAPH 6b016e0 UN-- 5 4 --
+ attr: name (v/c)
[1] 4
[1] 5
Warning: `is.directed()` was deprecated in igraph 2.0.0.
ℹ Please use `is_directed()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
[1] FALSE
Graph density and reciprocity
Density = proportion of possible edges that are present
[1] 0.1353846
Reciprocity = proportion of mutual connections
[1] 0.6666667
[1] 0.5
Vertex degrees
Calculating in-/out-/total degrees
1003 1006 1009 1012 1015 1018 1021 1024 1027 1030 1033 1036 1039 1042 1045 1048
5 11 10 5 4 14 7 0 8 2 2 3 4 13 4 9
1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
15 7 8 5 6 6 5 6 6 11
1003 1006 1009 1012 1015 1018 1021 1024 1027 1030 1033 1036 1039 1042 1045 1048
2 7 7 2 1 9 2 0 2 0 1 0 2 9 2 4
1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
10 2 6 1 2 2 2 3 2 8
1003 1006 1009 1012 1015 1018 1021 1024 1027 1030 1033 1036 1039 1042 1045 1048
3 4 3 3 3 5 5 0 6 2 1 3 2 4 2 5
1051 1054 1057 1060 1063 1066 1069 1072 1075 1078
5 5 2 4 4 4 3 3 4 3
Degree distribution
Fraction of nodes with given degree
[1] 0.03846154 0.00000000 0.07692308 0.03846154 0.11538462 0.15384615
[7] 0.15384615 0.07692308 0.07692308 0.03846154 0.03846154 0.07692308
[13] 0.00000000 0.03846154 0.03846154 0.03846154
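The vector above starts at degree 0 and runs up to the maximum degree, so its first entry is the share of isolates. A minimal base R sketch of the same tabulation, using a made-up degree vector:

```r
# Hypothetical total degrees for six nodes.
deg <- c(0, 2, 2, 3, 1, 2)

# Count nodes at each degree 0..max(deg), then convert to fractions.
deg_dist <- tabulate(deg + 1, nbins = max(deg) + 1) / length(deg)
names(deg_dist) <- 0:max(deg)
deg_dist
```

Degrees that no node has still get an entry (with value 0), which is why zeros appear in the output above.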
Centrality
1003 1006 1009 1012 1015 1018
0.5000000 102.0119048 12.1761905 7.6666667 0.3333333 74.4820346
1021 1024 1027 1030 1033 1036
10.7833333 0.0000000 27.3978355 0.0000000 1.8333333 0.0000000
1039 1042 1045 1048 1051 1054
1.2500000 128.2556277 0.0000000 15.6358225 76.8609307 9.5370130
1057 1060 1063 1066 1069 1072
0.0000000 9.2386364 6.1580087 6.2500000 3.9751082 12.7613636
1075 1078
16.3833333 17.5095238
1003 1006 1009 1012 1015 1018 1021
0.03225806 0.03448276 0.02564103 0.01851852 0.01694915 0.04166667 0.02083333
1024 1027 1030 1033 1036 1039 1042
NaN 0.03703704 0.01923077 0.02564103 0.02941176 0.03333333 0.04347826
1045 1048 1051 1054 1057 1060 1063
0.03448276 0.04000000 0.04000000 0.03571429 0.02000000 0.04000000 0.02564103
1066 1069 1072 1075 1078
0.02325581 0.02439024 0.03225806 0.02564103 0.02564103
See also package netrankr (Schoch 2017) and this blogpost for more centrality indices.
Package graph is rather thin with respect to analysis. For the most part it relies on a separate package, RBGL (Carey, Long, and Gentleman 2021), available from the Bioconductor repository.
Challenge yourself with the set of practicals we have prepared.
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.4.1 (2024-06-14)
os Ubuntu 22.04.4 LTS
system x86_64, linux-gnu
ui X11
language en
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/London
date 2024-06-23
pandoc 3.1.2 @ /usr/bin/ (via rmarkdown)
─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
BiocGenerics * 0.48.1 2023-11-01 [1] Bioconductor
bslib 0.7.0 2024-03-29 [1] CRAN (R 4.4.0)
cachem 1.1.0 2024-05-16 [1] CRAN (R 4.4.0)
cli 3.6.3 2024-06-21 [1] CRAN (R 4.4.1)
coda 0.19-4.1 2024-01-31 [1] CRAN (R 4.4.0)
colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.4.0)
crayon 1.5.3 2024-06-20 [1] CRAN (R 4.4.1)
digest 0.6.35 2024-03-11 [1] CRAN (R 4.4.0)
dplyr 1.1.4 2023-11-17 [1] CRAN (R 4.4.0)
evaluate 0.24.0 2024-06-10 [1] CRAN (R 4.4.0)
fansi 1.0.6 2023-12-08 [1] CRAN (R 4.4.0)
farver 2.1.2 2024-05-13 [1] CRAN (R 4.4.0)
fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.4.0)
ggforce 0.4.2 2024-02-19 [1] CRAN (R 4.4.0)
ggplot2 3.5.1 2024-04-23 [1] CRAN (R 4.4.0)
ggraph 2.2.1 2024-03-07 [1] CRAN (R 4.4.0)
ggrepel 0.9.5 2024-01-10 [1] CRAN (R 4.4.0)
glue 1.7.0 2024-01-09 [1] CRAN (R 4.4.0)
graph 1.82.0 2024-06-23 [1] bioc_xgit (@e3ea15c)
graphlayouts 1.1.1 2024-03-09 [1] CRAN (R 4.4.0)
gridExtra 2.3 2017-09-09 [1] CRAN (R 4.4.0)
gtable 0.3.5 2024-04-22 [1] CRAN (R 4.4.0)
highr 0.11 2024-05-26 [1] CRAN (R 4.4.0)
htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
igraph 2.0.3 2024-03-13 [1] CRAN (R 4.4.0)
intergraph 2.0-4 2024-02-01 [1] CRAN (R 4.4.0)
isnar 1.0-0 2024-04-28 [1] Github (mbojan/isnar@5617770)
jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.4.0)
jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0)
knitr * 1.47 2024-05-29 [1] CRAN (R 4.4.0)
lattice 0.22-6 2024-03-20 [4] CRAN (R 4.4.1)
lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.4.0)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.4.0)
MASS 7.3-61 2024-06-13 [4] CRAN (R 4.4.0)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.4.0)
munsell 0.5.1 2024-04-01 [1] CRAN (R 4.4.0)
network 1.18.2 2024-06-20 [1] Github (statnet/network@c1b2084)
pillar 1.9.0 2023-03-22 [1] CRAN (R 4.4.0)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.4.0)
png 0.1-8 2022-11-29 [1] CRAN (R 4.4.0)
polyclip 1.10-6 2023-09-27 [1] CRAN (R 4.4.0)
purrr 1.0.2 2023-08-10 [1] CRAN (R 4.4.0)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.4.0)
Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.4.0)
rlang 1.1.4 2024-06-04 [1] CRAN (R 4.4.0)
rmarkdown 2.27 2024-05-17 [1] CRAN (R 4.4.0)
sass 0.4.9 2024-03-15 [1] CRAN (R 4.4.0)
scales 1.3.0 2023-11-28 [1] CRAN (R 4.4.0)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0)
sna 2.7-2 2023-12-06 [1] CRAN (R 4.4.0)
statnet.common * 4.10.0-442 2024-06-20 [1] Github (statnet/statnet.common@4e8cb54)
tibble 3.2.1 2023-03-20 [1] CRAN (R 4.4.0)
tidygraph 1.3.1 2024-01-30 [1] CRAN (R 4.4.0)
tidyr 1.3.1 2024-01-24 [1] CRAN (R 4.4.0)
tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.4.0)
tweenr 2.0.3 2024-02-26 [1] CRAN (R 4.4.0)
utf8 1.2.4 2023-10-22 [1] CRAN (R 4.4.0)
vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.4.0)
viridis 0.6.5 2024-01-29 [1] CRAN (R 4.4.0)
viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.4.0)
withr 3.0.0 2024-01-16 [1] CRAN (R 4.4.0)
xfun 0.45 2024-06-16 [1] CRAN (R 4.4.1)
yaml 2.3.8 2023-12-11 [1] CRAN (R 4.4.0)
[1] /home/mbojan/R/library/4.4
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────