wiki:NetworkDynamicConverterFunctions

Version 9 (modified by skyebend, 7 years ago)


Draft Specification for networkDynamic converter functions: Take 3

Skye is collecting info from several previous discussions on the developer list in hopes of finally generating a spec for the converter methods.

A list of source / target schemas for dynamics

toggles: start network + edge toggles

[onset, terminus, tail, head, onset.censored, terminus.censored, duration, edge.id]. What about vertex dynamics?

changes: edge toggles with times and direction

like toggles, but also includes a column indicating whether the edge was formed or dissolved

spell matrix: arrays of spells

possibly two arrays, one for edges and one for vertices

list of networks

each network is one wave in discrete time. Possible subtypes: list of matrices, or sna-style graph stack

file formats with dynamics

.net, .son, some xml formats
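To make the relationship between the toggles and spell matrix schemas above concrete, here is a minimal base R sketch that turns a [tail, head, time] toggle table into edge spells. The helper name toggles_to_spells, the assumption that every edge is inactive before its first toggle, and the use of an `end` value to close a spell that is still open are all choices made for this illustration, not part of the spec.

```r
# Hypothetical helper (not part of networkDynamic): convert edge toggles to edge spells.
# Assumes every edge is inactive before its first toggle.
toggles_to_spells <- function(toggles, end = Inf) {
  toggles <- as.data.frame(toggles)
  names(toggles)[1:3] <- c("tail", "head", "time")
  # group the toggle times by dyad
  key <- paste(toggles$tail, toggles$head, sep = "-")
  pieces <- lapply(split(toggles, key), function(d) {
    times <- sort(d$time)
    # odd-numbered toggles open a spell, even-numbered toggles close it
    onset <- times[seq(1, length(times), by = 2)]
    terminus <- times[seq_len(length(times) %/% 2) * 2]
    # an unmatched final onset stays open until `end`
    if (length(terminus) < length(onset)) terminus <- c(terminus, end)
    data.frame(onset = onset, terminus = terminus,
               tail = d$tail[1], head = d$head[1])
  })
  spells <- do.call(rbind, pieces)
  spells <- spells[order(spells$onset, spells$terminus, spells$tail, spells$head), ]
  rownames(spells) <- NULL
  spells
}

toggles <- data.frame(tail = c(1, 1, 2, 1), head = c(2, 2, 3, 2), time = c(0, 3, 1, 5))
toggles_to_spells(toggles, end = 10)
```

If ties are present in the initial network, the first toggle of such an edge would close a spell rather than open one, which is exactly the open question about toggle timing raised in this spec.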

Current or existing partial implementations

list of nets to nD : as.networkDynamic.network.list()

edge spells to nD : as.networkDynamic.network(spells=x)

vertex spells to nD : ?

edge (and vertex) spells to nD : as.networkDynamic.data.frame(x,nodeInfo=)

edge toggles to nD : as.networkDynamic.network(toggles=x)

nD to edge spells : as.data.frame.networkDynamic() uses duration.matrix internally

nD to edge spells : duration.matrix()

nD to vertex spells : get.vertex.activity() - sort of; returns a list, not a matrix

Problem

The "as.*" syntax doesn't seem appropriate for these converter functions, since they often need multiple data objects as input, and for output we usually need to specify which facet of network information should be returned (vertices vs. edges). Also, the S3 method dispatching (as.nD.data.frame) is tricky, since the data frame may be either toggles or spells. There is not much advantage to having a single "convert" function if you have to set lots of arguments, and it is hard to have sensible defaults for each type.

Proposed methods and behaviors

Importing to nD

A single function networkDynamic() that has arguments for specifying inputs of the various schemas. Arguments are not specified by object type, so toggles and spells are both represented with data frames (or matrices).

function

networkDynamic(base.net=NULL,edge.toggles=NULL,vertex.toggles=NULL,
   edge.spells=NULL,vertex.spells=NULL,edge.changes=NULL,vertex.changes=NULL,
   network.list=NULL,onsets=NULL,termini=NULL,vertex.pid=NULL,start,end,...)

Fills in edge and/or vertex spells in the starting network using toggles. Toggle arguments can be anything convertible into a matrix.

Arguments

base.net
a network object providing an initial network state for the toggles and changes versions. For spell versions it is copied to provide the basic network attributes (n,directed, etc)
network.list
a list of network objects assumed to describe sequential panels of network observations. Network sizes may vary. see onsets, termini, vertex.pid.
edge.toggles
the first columns are assumed to be [tail,head,time]. What times should be assumed for ties in the initial network?
edge.spells
an object coercible to a matrix with the first four columns assumed to be [tail,head,onset,terminus]. Each row defines an activity spell for an edge. tail and head are the vertex.ids for the edge; there may be multiple rows for each edge. Any remaining columns added as TEA attributes?
vertex.spells
an object coercible to a matrix with the first three columns assumed to be [vertex.id,onset,terminus]. Each row defines an activity spell for the appropriate vertex; there may be multiple rows per vertex. Any remaining columns added as TEA attributes?
onsets
an optional array of onset times to be assigned to the network panels of network.list. Defaults to seq(from=0,length=length(network.list))
termini
an optional array of terminus times to be assigned to the network panels of network.list. Defaults to seq(from=1,length=length(network.list))
vertex.pid
an optional name of a vertex attribute to be used as a unique vertex identifier when constructing nD from a network list with different sized networks. (Also allow for spell lists?)
start
The beginning of the observation period describing the network (used to define censoring)
end
The end of the observation period describing the network (used to define censoring)
construct.mode
one of c("strict","warn","checkless","expand") (need better names). Specifies how carefully the object should be checked for consistency during construction. "strict": fail if edge and vertex activity are not aligned. "warn": warn if there are problems, but still create the object if possible. "checkless": create the object as quickly as possible without performing checks (for when you know you are feeding it good data). "expand": define new activity spells for vertices when edges attach if needed, add new vertices if edge ids are out of range, etc.
...
additional arguments needed in the future, possible network attributes?

behavior sketch

Determine if building from network list, as that specifies both edge and vertex dynamics at the same time

If base.net is specified, construct a new network with matching parameters. If vertices, edges, or attributes are present, add them to the new network as non-TEA attributes?

Determine if all the networks in the list are the same size. If not, check for vertex.pid to specify which vertex attribute should be used as a unique identifier. If onsets is not set, it defaults to seq(from=0,length=length(network.list)). If termini is not set, it defaults to seq(from=1,length=length(network.list)).

Step through the list of networks, adding the set of edges and vertices to the new network with onset and terminus times specified by onsets and termini vectors. If inconsistencies are encountered, perform as indicated by construct.mode. Add any network, vertex, or edge attributes found in the list networks with appropriate spells.

If not building from network list

Determine vertex mode

Determine edge mode

If nothing specified, return error, otherwise return constructed network dynamic.
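The input checks in this behavior sketch could look roughly like the following base R code. It only decides which inputs drive the construction and does not build a network. The helper name detect_modes and the rule that at most one vertex input and one edge input may be supplied are assumptions made for the example; the spec does not yet say how conflicting inputs should be handled.

```r
# Hypothetical skeleton of the input checks only; it does not construct anything.
detect_modes <- function(network.list = NULL,
                         vertex.toggles = NULL, vertex.spells = NULL, vertex.changes = NULL,
                         edge.toggles = NULL, edge.spells = NULL, edge.changes = NULL) {
  if (!is.null(network.list)) {
    # a network list specifies vertex and edge dynamics at the same time
    return(list(source = "network.list", vertex = NULL, edge = NULL))
  }
  # return the name of the single supplied input, or NULL if none was given
  pick <- function(inputs, kind) {
    supplied <- names(inputs)[!vapply(inputs, is.null, logical(1))]
    if (length(supplied) > 1)
      stop("only one ", kind, " input may be given: ", paste(supplied, collapse = ", "))
    if (length(supplied) == 0) NULL else supplied
  }
  vertex <- pick(list(toggles = vertex.toggles, spells = vertex.spells,
                      changes = vertex.changes), "vertex")
  edge <- pick(list(toggles = edge.toggles, spells = edge.spells,
                    changes = edge.changes), "edge")
  if (is.null(vertex) && is.null(edge))
    stop("no dynamic input specified")
  list(source = "tables", vertex = vertex, edge = edge)
}

detect_modes(edge.spells = matrix(c(1, 2, 0, 5), nrow = 1))
```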

network.list
Creates an nD object using a list of networks (or objects like matrices convertible to networks). The timing of each panel's vertex and edge spells is taken from the onsets and termini arguments (defaults to [0,1], [1,2], etc.). If vertex.pid is set to the name of a vertex attribute, use that attribute as the vertex id when constructing the network (for when inputs are not the same size). Censoring of starting / ending networks?
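As a small illustration of the default panel timing, the base R sketch below computes the onset and terminus assigned to each panel of a network list. The helper name panel_times and its sanity checks are hypothetical; only the two seq() defaults come from the spec.

```r
# Hypothetical helper: assign onset/terminus times to the panels of a network list.
panel_times <- function(network.list, onsets = NULL, termini = NULL) {
  n <- length(network.list)
  # defaults proposed in the spec: panels cover [0,1], [1,2], ...
  if (is.null(onsets)) onsets <- seq(from = 0, length.out = n)
  if (is.null(termini)) termini <- seq(from = 1, length.out = n)
  # assumed checks: one time pair per panel, and no panel ends before it starts
  stopifnot(length(onsets) == n, length(termini) == n, all(onsets <= termini))
  data.frame(panel = seq_len(n), onset = onsets, terminus = termini)
}

# any list works here, since only its length is used
panel_times(list("wave1", "wave2", "wave3"))
```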

Edge cases to consider

edge activity not consistent with vertex activity

network list: base net size different than list net sizes

base net has multiplex edges?

Notes

If we want to support multiplex edges, we need to expand toggles etc. to allow specifying a persistent eid. Not sure how we would support multiple vertex head- and tail-sets on edges.

Exporting from nD

(Maybe only keep an "as.data.frame" function aliased to the most common use case?)

get.vertex.activity(nD, v=1:network.size(nD),as.spellList=FALSE)
Currently this returns a list of spell matrices, one for each v specified. Add an argument to return the same info as a single spell matrix [onset,terminus,vertex.id]. Sort order by onset,terminus,vertex.id? Include attributes?
get.edge.activity(networkDynamic,as.spellList=FALSE)
currently this returns a list of spell matrices. Add "start" and "end" arguments for specifying censoring. Add argument to return same info as a single spell matrix [onset,terminus,tail,head,left.censored, right.censored, duration, edge.id]. Sort order by onset,terminus,tail,head,eid?.
get.slice.networks(networkDynamic, start=min(get.change.times(networkDynamic)),end=max(get.change.times(networkDynamic)),time.step=1,duration=1,rule="any")
return a list with a series of networks produced by network.extract. Allow specifying a series of onsets, termini instead of / in addition to start,step,duration,end ?
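The proposed as.spellList behavior for vertices amounts to stacking the per-vertex spell matrices into one matrix and sorting it. The base R sketch below shows one way that could work. The helper name as_spell_matrix, and the assumption that the input is a plain list holding one two-column [onset, terminus] matrix (or NULL) per vertex, are illustrative only and do not describe the actual return value of get.vertex.activity().

```r
# Hypothetical helper: stack per-vertex spell matrices into a single spell matrix.
# `spell.list` is assumed to hold one two-column [onset, terminus] matrix per vertex, or NULL.
as_spell_matrix <- function(spell.list, v = seq_along(spell.list)) {
  rows <- lapply(seq_along(spell.list), function(i) {
    s <- spell.list[[i]]
    if (is.null(s) || length(s) == 0) return(NULL)  # vertex with no spells
    s <- matrix(s, ncol = 2)
    cbind(onset = s[, 1], terminus = s[, 2], vertex.id = v[i])
  })
  out <- do.call(rbind, rows)
  if (is.null(out)) {
    # no spells at all: return an empty matrix with the expected columns
    return(matrix(numeric(0), ncol = 3,
                  dimnames = list(NULL, c("onset", "terminus", "vertex.id"))))
  }
  # sort by onset, then terminus, then vertex.id
  out[order(out[, "onset"], out[, "terminus"], out[, "vertex.id"]), , drop = FALSE]
}

spell.list <- list(matrix(c(0, 5, 2, 8), ncol = 2),  # vertex 1: spells (0,2) and (5,8)
                   NULL,                              # vertex 2: no spells
                   matrix(c(1, 4), ncol = 2))         # vertex 3: spell (1,4)
as_spell_matrix(spell.list)
```

The same stacking would apply to edges, with tail, head and edge.id columns in place of vertex.id.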

don't include export toggle methods unless we have a use case.

Questions:

Should we use "fake" S3 method names to be consistent? (i.e. networkDynamic.spell.list): NO

Is it better to pass in a network object to be populated (YES), or to specify the full list of network options (directed,bipartite,etc)? For matrix input formats, is it better to assume column order, or to allow specifying column indices/names?

Preferred format for returned data objects (array, matrix, data.frame)?

Censoring behavior (Inf should not automatically be censored, right?). Change to start.censored, end.censored.

Implement for multiple head/tail set, or just return error?

Support multiplex edges.

Include missing edges / vertices in output? include.missing=FALSE. If true, include an extra column for missingness?

Argument to include additional columns for attributes? Or user should lookup using eids? (simple attributes would be fine, but lists of objects might not work in data.frame output.) include.attrs=c("attrName1","attrName2")

Support for non-numeric, non-integer input ids: pid=attrName. Sort into alphabetic order, assign to ids 1:n.

Add "include.censoring=T" attribute
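For the question above about non-numeric input ids, the "sort into alphabetic order, assign to ids 1:n" idea can be sketched in a few lines of base R. The helper name pid_to_vertex_id and its return format are hypothetical.

```r
# Hypothetical helper: map arbitrary persistent ids onto vertex ids 1:n.
pid_to_vertex_id <- function(pid) {
  pid <- as.character(pid)
  sorted <- sort(unique(pid))  # alphabetical; exact ordering follows the current locale
  list(ids = match(pid, sorted),  # vertex id for each input element
       pid = sorted)              # lookup table: position i holds the pid of vertex i
}

tails <- c("carol", "alice", "bob", "alice")
res <- pid_to_vertex_id(tails)
res$ids
res$pid[res$ids]  # recovers the original ids
```

Keeping the sorted pid vector alongside the ids lets export functions translate vertex ids back to the original identifiers.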