wiki:NetworkDynamicConverterFunctions

Version 30 (modified by skyebend, 7 years ago)

comment on multiplex edges

Draft Specification for networkDynamic converter functions: Take 3

Skye is collecting info from several previous discussions on the developer list in hopes of finally generating a spec for the converter methods.

Problem

"as.*" syntax doesn't seem appropriate for these converter functions, since they often need multiple data objects as input, and for output we usually need to specify which facet of network information should be returned (vertices vs. edges). Also, S3 method dispatch (as.nD.data.frame) is tricky since the data frame may hold either toggles or spells. There is not much advantage to having a single "convert" function if you have to set lots of arguments; it is hard to have sensible defaults for each type.

Proposed methods and behaviors

Importing to nD

Single function networkDynamic() that has arguments for specifying inputs of the various schemas. Arguments are not specified by object type, because toggles and spells are both represented with data frames (or matrices).

function

networkDynamic(base.net=NULL,edge.toggles=NULL,vertex.toggles=NULL,
   edge.spells=NULL,vertex.spells=NULL,edge.changes=NULL,vertex.changes=NULL,
   network.list=NULL,onsets=NULL,termini=NULL,vertex.pid=NULL,start=NULL,end=NULL,net.obs.period=NULL,...)

Arguments

base.net
a network object providing an initial network state for the toggles and changes versions. For spell versions it is copied to provide the basic network attributes (n,directed, etc)
network.list
a list of network objects assumed to describe sequential panels of network observations. Network sizes may vary. see onsets, termini, vertex.pid.
edge.toggles
assume first columns are [tail,head,time]. What times assumed for ties in initial network?
edge.spells
an object coercible to a matrix with first four columns assumed to be [tail,head,onset,terminus]. Each row defines an activity spell for an edge. tail and head are the vertex.ids for the edge; there may be multiple rows for each edge. Any remaining columns added as TEA attributes?
vertex.toggles
assume first columns [vertex.id,time]
vertex.spells
an object coercible to a matrix with first three columns assumed to be [vertex.id,onset,terminus]. Each row defines an activity spell for the appropriate vertex; there may be multiple rows per vertex. Any remaining columns added as TEA attributes?
onsets
an optional array of onset times to be assigned to the network panels of network.list. Defaults to seq(from=0,length=length(network.list))
termini
an optional array of terminus times to be assigned to the network panels of network.list. Defaults to seq(from=1,length=length(network.list))
vertex.pid
an optional name of a vertex attribute to be used as a unique vertex identifier when constructing nD from a network list with different sized networks. (Also allow for spell lists?)
start
The beginning of the observation period describing the network (used to define censoring)
end
The end of the observation period describing the network (used to define censoring)
net.obs.period
alternate and more complete way to fully specify the observation period. see https://statnet.csde.washington.edu/trac/ticket/155
construct.mode
c("strict","warn","checkless","expand") (need better names). Specifies how carefully the object should be checked for consistency during construction. "strict": fail if edge and vertex activity are not aligned. "warn": warn if there are problems, but still create the object if possible. "checkless": create the object as quickly as possible without performing checks (if you know you are feeding it good data). "expand": define new activity spells for vertices when edges attach if needed, add new vertices if edge ids are out of range, etc.
...
additional arguments needed in the future; possibly network attributes?

behavior sketch and implementation notes

Validate inputs

start <= end

start and end OR net.obs.period specified, but not both

base.net is NULL or a network

if not NULL coerce toggles, spells, changes to matrix, test for minimum number of columns

network.list is NULL or all elements of list are networks.

if network sizes of network.list items vary, vertex.pid must be non-NULL

if vertex.pid != NULL, it must be present in network vertex attributes

onsets and termini must be NULL or numeric, the same length and the same length as network.list

construct.mode is valid

only one of vertex.toggles, vertex.spells and vertex.changes can be non-NULL

only one of edge.toggles, edge.spells and edge.changes can be non-NULL

if network.list is non-NULL vertex.* and edge.* must be NULL and the reverse
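
The checks above could be collected in a single helper. The following is only a sketch in base R under stated assumptions: the name validate_nd_args is hypothetical, networks are recognised with inherits(x, "network") so that the network package is not needed to run it, the minimum column counts for the *.changes arguments are guesses, and the vertex.pid and construct.mode checks are left out.

```r
# Hypothetical helper: returns TRUE if the argument combination passes the
# checks listed in "Validate inputs", otherwise stops with a message.
validate_nd_args <- function(base.net = NULL, edge.toggles = NULL, vertex.toggles = NULL,
                             edge.spells = NULL, vertex.spells = NULL,
                             edge.changes = NULL, vertex.changes = NULL,
                             network.list = NULL, onsets = NULL, termini = NULL,
                             start = NULL, end = NULL, net.obs.period = NULL) {
  # number of non-NULL arguments among those passed
  n.set <- function(...) sum(!vapply(list(...), is.null, logical(1)))

  if (!is.null(start) && !is.null(end) && start > end)
    stop("start must be <= end")
  if (n.set(start, end) > 0 && !is.null(net.obs.period))
    stop("give start/end or net.obs.period, not both")
  if (!is.null(base.net) && !inherits(base.net, "network"))
    stop("base.net must be NULL or a network")
  if (n.set(vertex.toggles, vertex.spells, vertex.changes) > 1)
    stop("only one vertex.* argument may be non-NULL")
  if (n.set(edge.toggles, edge.spells, edge.changes) > 1)
    stop("only one edge.* argument may be non-NULL")

  # Minimum column counts; the values for the *.changes inputs are assumptions.
  min.cols <- list(edge.toggles = 3, edge.spells = 4, edge.changes = 4,
                   vertex.toggles = 2, vertex.spells = 3, vertex.changes = 3)
  given <- list(edge.toggles = edge.toggles, edge.spells = edge.spells,
                edge.changes = edge.changes, vertex.toggles = vertex.toggles,
                vertex.spells = vertex.spells, vertex.changes = vertex.changes)
  for (nm in names(given)) {
    if (!is.null(given[[nm]]) && ncol(as.matrix(given[[nm]])) < min.cols[[nm]])
      stop(nm, " has too few columns")
  }

  if (!is.null(network.list)) {
    if (length(given) > 0 && n.set(edge.toggles, edge.spells, edge.changes,
                                   vertex.toggles, vertex.spells, vertex.changes) > 0)
      stop("network.list cannot be combined with vertex.* or edge.* arguments")
    if (!all(vapply(network.list, inherits, logical(1), what = "network")))
      stop("all elements of network.list must be networks")
    for (x in list(onsets, termini)) {
      if (!is.null(x) && (!is.numeric(x) || length(x) != length(network.list)))
        stop("onsets and termini must be numeric and as long as network.list")
    }
  }
  TRUE
}
```

Stopping on the first failed check matches the "strict" reading; a "warn" variant would presumably collect the messages instead.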

Bookkeeping

if start and end values present, store in observation.period attribute

base.net

base.net activity information is ignored for now. base.net is only assumed to be a network object, not a networkDynamic (behavior subject to change). base.net is not required; if it is not given, create a new base.net with the maximum number of vertices required, and use the edge info (without timing) to create edges. This assumes that there are no isolated vertices with ids greater than the largest vertex id appearing in the data. If the base.net parameter is supplied without any other edge or vertex info, return base.net cast to a networkDynamic object. An easy way to create a basic nD object is

networkDynamic(network.initialize(n))

Both vertex info and edge info are optional. If vertex info is missing, and edge info is present, try to infer the vertices required for the edges. If both are missing, and there is no base.net, return an empty list (since 0 node networks are not supported).

Determines if building from network list, as that specifies both edge and vertex dynamics at the same time

If net.obs.period is not specified, default to list(observations=list(c(0,length(network.list))),mode="discrete", time.increment=1,time.unit="step"), replacing 0 and length(network.list) with start and end if they are not NULL.

If base.net is specified, constructs a new network with matching parameters. If vertices, edges, or attributes are present in base.net, adds them to the new network as non-TEA attributes?

Determines if all the networks in the list are the same size. If not, checks for the vertex.pid parameter to specify which vertex attribute should be used as a unique identifier. If onsets is not set, it defaults to seq(from=0,length=length(network.list)). If termini is not set, it defaults to seq(from=1,length=length(network.list)).
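
To make the defaults concrete, here is a small base R sketch. The function name default_panel_timing is hypothetical; it only computes the default onsets, termini and net.obs.period described above for a list of k panels and does not touch any network objects.

```r
# Hypothetical helper: default timing for k network panels.
default_panel_timing <- function(k, onsets = NULL, termini = NULL,
                                 start = NULL, end = NULL) {
  if (is.null(onsets))  onsets  <- seq(from = 0, length.out = k)
  if (is.null(termini)) termini <- seq(from = 1, length.out = k)
  # observation period runs from 0 to k unless start / end are supplied
  obs <- c(if (is.null(start)) 0 else start,
           if (is.null(end)) k else end)
  list(onsets = onsets,
       termini = termini,
       net.obs.period = list(observations = list(obs), mode = "discrete",
                             time.increment = 1, time.unit = "step"))
}

timing <- default_panel_timing(3)
# panels get the spells [0,1], [1,2], [2,3]; the observation period is c(0, 3)
```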

Steps through the list of networks, adding the set of edges and vertices to the new network with onset and terminus times specified by onsets and termini vectors. If inconsistencies are encountered, perform as indicated by construct.mode. Add any network, vertex, or edge attributes found in the list networks to nD as TEA with appropriate spells.

If not building from network list:

If base.net exists, determine if the set of vertex.ids is consistent with base.net. Create a new network with the properties given by base.net if present.

Determine vertex data schema.

If toggles

  • activate all vertices present in base net, assuming onset time of -Inf
  • loop through toggles, activating and deactivating vertices at appropriate times. Vertices whose last toggle leaves them active are given Inf as terminus.

If changes

  • If the first change is "activate", it should assume that it was previously inactive (from -Inf). This is default behavior in the function activate.vertices().
  • can convert directly to activation and deactivation lists and apply to vertices in a single pass. Vertices whose last change is to active are given Inf as terminus.

If spells

  • apply activation spells to vertices

Determine edge data schema.

If base.net exists, determine if the set of vertex.ids present in the edges data is consistent with it and take action appropriate to construct.mode. Edges present in the base.net should be set to active with -Inf as onset.

If toggles

  • Add all edges present in base.net
  • Set initial edge state according to base net, assuming -Inf as onset. If edge is not present in base.net but implied by the edge.data, then assume it is inactive from (-Inf, Inf) initially.
  • Add all additional edges implied by toggles
  • loop through toggles, activating and deactivating edges at appropriate times. All edges whose last state is active are given Inf as terminus. (Look at Pavel's code to see how he did it.)
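
The toggle steps above can be sketched in base R as a conversion from toggles to spells. This is an illustration only: toggles_to_spells and its initial argument (a two-column [tail,head] matrix of the edges active in base.net) are hypothetical names, and a dyad is treated as identifying a single edge, so multiplex edges are not handled.

```r
# Hypothetical helper: convert edge toggles [tail, head, time] into spells
# [tail, head, onset, terminus]. Edges listed in 'initial' start out active
# with onset -Inf; edges still active after the last toggle get terminus Inf.
toggles_to_spells <- function(toggles, initial = NULL) {
  toggles <- unname(as.matrix(toggles)[, 1:3, drop = FALSE])
  if (is.null(initial)) {
    initial <- matrix(numeric(0), ncol = 2)
  } else {
    initial <- unname(as.matrix(initial)[, 1:2, drop = FALSE])
  }
  dyads <- unique(rbind(initial, toggles[, 1:2, drop = FALSE]))
  rows <- lapply(seq_len(nrow(dyads)), function(i) {
    d <- dyads[i, ]
    times <- sort(toggles[toggles[, 1] == d[1] & toggles[, 2] == d[2], 3])
    was.active <- any(initial[, 1] == d[1] & initial[, 2] == d[2])
    if (was.active) times <- c(-Inf, times)             # active before the first toggle
    if (length(times) %% 2 == 1) times <- c(times, Inf) # still active at the end
    m <- matrix(times, ncol = 2, byrow = TRUE)          # consecutive (on, off) pairs
    cbind(tail = d[1], head = d[2], onset = m[, 1], terminus = m[, 2])
  })
  do.call(rbind, rows)
}

initial <- rbind(c(1, 2))
toggles <- rbind(c(1, 2, 5), c(2, 3, 3), c(2, 3, 7), c(1, 2, 9))
spells <- toggles_to_spells(toggles, initial)
# edge 1->2: active (-Inf, 5) and (9, Inf); edge 2->3: active (3, 7)
```

The same pairing logic would apply to vertex toggles, with a single id column in place of tail and head.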

If changes

  • Add all edges present in base.net
  • Set initial edge state according to base net, assuming -Inf as onset.
  • Add all additional edges implied by changes
  • loop through changes, activating and deactivating edges at appropriate times. All edges whose last state is active are given Inf as terminus.
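
A changes input can be turned into spells in one pass, since each row already says whether it activates (1) or deactivates (0). The sketch below is base R with hypothetical names (changes_to_spells, and id standing for a vertex.id or some edge identifier). It assumes that the changes for each id strictly alternate, and that an id whose first change is a deactivation was active from -Inf.

```r
# Hypothetical helper: convert changes (id, time, direction) into spells
# [id, onset, terminus]; direction is 1 for activate and 0 for deactivate.
changes_to_spells <- function(id, time, direction) {
  ord <- order(id, time)
  id <- id[ord]; time <- time[ord]; direction <- direction[ord]
  rows <- lapply(split(seq_along(id), id), function(ix) {
    on  <- time[ix][direction[ix] == 1]
    off <- time[ix][direction[ix] == 0]
    if (direction[ix][1] == 0) on <- c(-Inf, on)            # active before first change
    if (direction[ix][length(ix)] == 1) off <- c(off, Inf)  # still active at the end
    cbind(id = id[ix][1], onset = on, terminus = off)
  })
  do.call(rbind, unname(rows))
}

ch <- changes_to_spells(id = c(1, 1, 1, 2),
                        time = c(2, 5, 8, 4),
                        direction = c(1, 0, 1, 0))
# id 1: spells (2, 5) and (8, Inf); id 2: spell (-Inf, 4)
```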

If spells

  • apply activation spells to edges

Set start and end censoring on the nD observation attribute net.obs.period. If net.obs.period is not specified and start and end are not NULL, default to list(observations=list(c(start,end)),mode="discrete", time.increment=1,time.unit="step"). If start and end are not specified, default to the smallest and largest observed time values. For toggles and changes, set mode="discrete"; for spells, set mode="continuous".

return constructed networkDynamic.

network.list
create an nD object using a list of networks (or objects, like matrices, convertible to networks). Timing of the vertex and edge spells for each panel is taken from the onsets and termini arguments (defaults to [0,1], [1,2], etc.). If vertex.pid is set to the name of a vertex attribute, use that attribute as the vertex id when constructing the network (for when inputs are not the same size). Censoring of starting / ending networks?

Edge cases to consider

toggles, changes arguments present, but have zero rows: base.net state is applied to the entire network, respecting start and end if present. If there is no base.net, no spells are added; activity status will be determined by the active.default argument during query

spells arguments present, but have zero rows: no spells added, activity status will be determined by active.default argument during query

network list: base.net size different than list net sizes

base net has multiplex edges? Not sure how to interpret for toggles, changes and spells. Activate all matching edges?

network.list has networks with multiplex edges or edges with multiple vertex head- and tail-sets. (I think this will work if we write code correctly)

if all arguments empty, should return empty network, but still no way to specify that.

edge activity not consistent with vertex activity (edge active when incident vertex is not) : behavior determined by construct.mode

watch out for NULL edge rows caused by deleted edges.

Notes / Questions

Should additional columns present on toggles, spells, changes be added as TEA attributes?

There is existing code for many of these steps, but it is fragmented across many hidden methods, needs to be organized and recycled where possible.

If we want to support multiplex edges, need to expand toggles etc to allow specifying a persistent eid

Not sure how we would support multiple vertex head- and tail-sets on edges, vertex pid?

What if base.net is a nD object, should we try to append to it? If so, need a way to determine the last state of network for toggles and edges.

What if vertices/edges need to be inactive at the start of observation in the toggles case? Include a first toggle at time 0?

If vertex activity is specified, should vertices implied by network size that are never activated be removed?

vertices and edges in the initial network are currently all deleted. Should we keep them?

Proposed vertex.pid behavior

How to handle alpha ids and convert them in a stable way to vertex.ids. See pid spec: PersistentIdProposal
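
One possible reading of a "stable" conversion, sketched in base R: sort the distinct ids and number them 1:n. The function name pid_to_vertex_id is hypothetical, and using method = "radix" (which sorts character vectors in C-locale order, so the result does not depend on the session locale) is an assumption about what stable should mean here.

```r
# Hypothetical helper: map arbitrary (e.g. alphabetic) persistent ids to
# integer vertex.ids 1:n in a locale-independent sorted order.
pid_to_vertex_id <- function(pids) {
  pids <- as.character(pids)
  lev <- sort(unique(pids), method = "radix")
  list(vertex.id = match(pids, lev),  # vertex.id for each input element
       pid = lev)                     # pid[i] is the persistent id of vertex i
}

map <- pid_to_vertex_id(c("carol", "alice", "bob", "alice"))
# map$pid is "alice", "bob", "carol"; map$vertex.id is 3, 1, 2, 1
```

Note that ids assigned this way shift if a new pid that sorts earlier is added later, which is one of the issues a persistent id proposal would have to address.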

Proposed "expand" construct.mode

If vertex.ids appear in edge records which are outside the range implied by the known network size, add them to the network, including implied vertices; i.e., if the network size is 5 and an id of 8 is found, add vertices 6, 7 and 8. Activate vertices whenever incident edges are active.
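
A rough base R sketch of the two parts of this behavior follows. Both function names (expand_size, implied_vertex_spells) are hypothetical, the input is assumed to be an edge spell matrix [tail,head,onset,terminus], and merging overlapping or touching spells into one is an assumption made because overlapping spells are not allowed.

```r
# Hypothetical helper: grow the network size to cover every vertex id
# that appears in the first two columns of the edge data.
expand_size <- function(n, edge.data) {
  max.id <- max(n, edge.data[, 1:2])
  list(n = max.id,
       added = if (max.id > n) seq(n + 1, max.id) else integer(0))
}

# Hypothetical helper: vertex spells implied by edge activity, i.e. each
# vertex is active whenever at least one incident edge is active.
implied_vertex_spells <- function(edge.spells) {
  edge.spells <- unname(as.matrix(edge.spells))
  # one candidate spell per edge endpoint: [vertex.id, onset, terminus]
  cand <- rbind(edge.spells[, c(1, 3, 4), drop = FALSE],
                edge.spells[, c(2, 3, 4), drop = FALSE])
  rows <- lapply(split(seq_len(nrow(cand)), cand[, 1]), function(ix) {
    s <- cand[ix, , drop = FALSE]
    s <- s[order(s[, 2]), , drop = FALSE]
    on <- s[1, 2]; off <- s[1, 3]
    for (i in seq_len(nrow(s))[-1]) {
      k <- length(on)
      if (s[i, 2] <= off[k]) {
        off[k] <- max(off[k], s[i, 3])                   # overlaps: extend current spell
      } else {
        on <- c(on, s[i, 2]); off <- c(off, s[i, 3])     # gap: start a new spell
      }
    }
    cbind(vertex.id = s[1, 1], onset = on, terminus = off)
  })
  do.call(rbind, unname(rows))
}

edges <- rbind(c(1, 8, 0, 2), c(1, 2, 1, 4), c(2, 8, 6, 7))
size <- expand_size(5, edges)          # n becomes 8; vertices 6, 7, 8 are added
vs <- implied_vertex_spells(edges)     # vertex 1 is active over the merged spell (0, 4)
```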

Exporting from nD

(Maybe only keep an "as.data.frame" function aliased to the most common use case?) All of these methods should handle censoring: check the soon-to-be-added observation.period attribute #155 (if present) and set all onset-censored spells (-Inf) to the value of start, and terminus-censored spells (Inf) to the value of end. (Discussion on #149.)
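
The censoring rule can be illustrated on a plain spell matrix. This base R sketch uses a hypothetical function name (apply_censoring) and assumes the input has columns named onset and terminus; whether Inf should always be read as censored is still an open question in this spec, so treat the flags as one possible interpretation.

```r
# Hypothetical helper: replace -Inf / Inf with the observation bounds and
# record which spells were censored.
apply_censoring <- function(spells, start, end) {
  spells <- as.data.frame(spells)
  spells$onset.censored    <- spells$onset == -Inf
  spells$terminus.censored <- spells$terminus == Inf
  spells$onset[spells$onset.censored]       <- start
  spells$terminus[spells$terminus.censored] <- end
  spells$duration <- spells$terminus - spells$onset
  spells
}

raw <- cbind(onset = c(-Inf, 2), terminus = c(3, Inf), tail = c(1, 2), head = c(2, 3))
cens <- apply_censoring(raw, start = 0, end = 10)
# first spell becomes (0, 3) and is onset-censored;
# second becomes (2, 10) and is terminus-censored
```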

get.vertex.activity(nD, v=1:network.size(nD),as.spellList=FALSE)
Currently this returns a list of spell matrices, one for each v specified. Add an argument to return the same info as a single spell matrix [onset,terminus,vertex.id]. Sort order by onset, terminus, vertex.id? Include attributes?
get.edge.activity(networkDynamic,as.spellList=FALSE)
currently this returns a list of spell matrices. Add "start" and "end" arguments for specifying censoring. Add argument to return same info as a single spell matrix [onset,terminus,tail,head,left.censored, right.censored, duration, edge.id]. Sort order by onset,terminus,tail,head,eid?.
get.slice.networks(networkDynamic, start=min(get.change.times(networkDynamic)),end=max(get.change.times(networkDynamic)),time.step=1,duration=1, rule="any")
return a list with a series of networks produced by network.extract. Allow specifying a series of onsets, termini instead of / in addition to start,step,duration,end ?
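
How start, end, time.step and duration turn into extraction windows is not pinned down above. One possible interpretation, sketched in base R with a hypothetical function name (slice_windows): one window [t, t+duration] for each t = start, start+time.step, ... that is strictly less than end. Each row could then be handed to network.extract as its onset and terminus.

```r
# Hypothetical helper: onset / terminus pairs for the slices. The rule that
# window onsets must be strictly less than 'end' is an assumption.
slice_windows <- function(start, end, time.step = 1, duration = 1) {
  onsets <- seq(from = start, to = end, by = time.step)
  onsets <- onsets[onsets < end]
  cbind(onset = onsets, terminus = onsets + duration)
}

w <- slice_windows(0, 3)
# three windows: (0, 1), (1, 2), (2, 3)
```

Accepting explicit onsets and termini vectors instead, as suggested above, would bypass this computation entirely.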

don't include export toggle methods unless we have a use case.

Questions:

Should we use "fake" S3 method names to be consistent? (i.e. networkDynamic.spell.list): NO

Is it better to pass in a network object to be populated (YES), or to specify the full list of network options (directed, bipartite, etc.)? For matrix input formats, is it better to assume column order, or to allow specifying column indices/names?

Preferred format for returned data objects? (array, matrix, data.frame)?

Censoring behavior (Inf should not automatically be censored, right?): change to start.censored, end.censored.

Implement for multiple head/tail set, or just return error?

Support multiplex edges? When 1-mode networks are induced from two-mode networks it is often the case that a dyad is linked by multiple edges at the same time. In the dynamic version, this means we need to use multiplex edges (because overlapping spells are not allowed), so we need a way to load in and distinguish multiplex edges from multiple spells.

Include missing edges / vertices in output? include.missing=FALSE If true, include extra column for missingness?

Argument to include additional columns for attributes? Or user should lookup using eids? (simple attributes would be fine, but lists of objects might not work in data.frame output.) include.attrs=c("attrName1","attrName2")

Support for non-numeric, non-integer input ids: pid=attrName. Sort into alphabetic order, assign to ids 1:n.

Add "include.censoring=T" attribute

get.slice.networks: how do we handle censoring here?

get.slice.networks: the output list of networks should have time labels (perhaps as a network attribute?)

since "changes" is so similar to "toggles" should we just make those the same argument? The converter can detect whether there is an extra column or not (what should the column be called?)

should the argument be called as.spellmatrix instead of as.spellList, since it's a matrix, not a list?

If we are guessing at network size based on ids present in vertices and edges, should we give warning?

Should we allow setting of network properties by passing standard network params (bipartite, etc.) via ... ?

A list of source / target schemas for dynamics

toggles: start network + edge toggles

[onset, terminus, tail, head, onset.censored, terminus.censored, duration, edge.id]. What about vertex dynamics?

changes: edge toggles with times and direction

like toggles, but also includes a column indicating if the edge was formed or dissolved. Activate = 1, deactivate = 0.

spell matrix: arrays of spells

possibly two arrays, one for edges and one for vertices

list of networks

each network is one wave in discrete time. Possible subtypes: list of matrices, or sna-style graph stack

file formats with dynamics

.net, .son, some xml formats