Created: 27 Apr 2015 | Modified: 30 Jun 2016
Prior to the SAA conference in SFO this April, I performed 28 different network model experiments (seriationct-1 through seriationct-28). Each had a different starting temporal network model, generated by one of two network generator programs. The early experiments (through 12 or 15 or so) used an initial cut at network modeling, which was too constrained in various ways. I was not able to recover what I thought was the structure of the network.
Subsequently, I simplified the network generator into two executables. It now produces M clusters of N communities. Each community is fully connected to the other communities in the same cluster, and a small fraction of communities are also connected between clusters. Each community in the model lasts for a single network slice before being replaced by a new slice of communities, each of which has a randomly sampled parent in the previous time interval.
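The slice structure just described can be sketched in a few lines of Python. This is a minimal illustration, not the actual seriationct generator: the function names (`build_slice`, `assign_parents`) and the `inter_fraction` parameter are hypothetical, and two details the text leaves open are filled in by assumption: each "bridging" community gets exactly one edge to a random community in another cluster, and parents are sampled uniformly from the whole previous slice.

```python
import itertools
import random


def build_slice(num_clusters, communities_per_cluster, inter_fraction, rng):
    """Build one time slice: M clusters of N communities.

    Returns (nodes, intra_edges, inter_edges), where nodes is a sorted list
    of (cluster, index) tuples and each edge is a frozenset of two nodes.
    """
    nodes = [(c, i)
             for c in range(num_clusters)
             for i in range(communities_per_cluster)]

    # Every community is connected to every other community in its cluster.
    intra_edges = set()
    for c in range(num_clusters):
        members = [n for n in nodes if n[0] == c]
        for a, b in itertools.combinations(members, 2):
            intra_edges.add(frozenset((a, b)))

    # ASSUMPTION: a small fraction of communities each get one edge to a
    # randomly chosen community in a different cluster.
    inter_edges = set()
    if num_clusters > 1:
        k = max(1, round(inter_fraction * len(nodes)))
        for a in rng.sample(nodes, k):
            candidates = [n for n in nodes if n[0] != a[0]]
            inter_edges.add(frozenset((a, rng.choice(candidates))))

    return nodes, intra_edges, inter_edges


def assign_parents(prev_nodes, next_nodes, rng):
    """Give each community in the new slice a random parent in the old one.

    ASSUMPTION: parents are drawn uniformly from the entire previous slice.
    """
    return {child: rng.choice(prev_nodes) for child in next_nodes}


rng = random.Random(42)
slice0 = build_slice(3, 4, 0.25, rng)
slice1 = build_slice(3, 4, 0.25, rng)
parents = assign_parents(slice0[0], slice1[0], rng)
```

Restricting parents to the same cluster, or letting a bridging community carry several between-cluster edges, would be equally easy variants; which one matches the real generator is not stated here.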
In the first program, there is a single “lineage” with the cluster structure just described. The linkage between clusters represents the only mesoscopic structure. I did not get clear recovery of this structure, at least with the number of loci/classes, innovation rate, and migration rates tested.
In the second program, I produced two models: lineage splitting and lineage coalescence. In the former, M clusters of weakly interconnected communities evolve as a single lineage until a splitting time. At that point, the single lineage of M clusters splits into L lineages of \(M/L\) clusters each. These continue evolving until the stop time, with weak interconnections within each lineage but NO edges between lineages. Lineage coalescence is the mirror image, starting with L lineages and coalescing them into a single lineage.
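The split and coalescence rules amount to a time-dependent mapping from clusters to lineages, plus a check that between-cluster edges never cross lineages. Here is a rough sketch under stated assumptions: the names `lineage_of` and `cross_cluster_edge_allowed` are hypothetical, clusters are assumed to be numbered 0 to M-1 and assigned to lineages in contiguous blocks, and M is assumed to be divisible by L.

```python
def lineage_of(cluster, t, event_time, num_clusters, num_lineages, mode="split"):
    """Return the lineage index of a cluster at time t.

    mode="split":    one lineage before event_time, L lineages from then on.
    mode="coalesce": L lineages before event_time, one lineage from then on.
    ASSUMPTION: clusters are assigned to lineages in contiguous blocks.
    """
    if mode not in ("split", "coalesce"):
        raise ValueError("mode must be 'split' or 'coalesce'")
    if num_clusters % num_lineages != 0:
        raise ValueError("num_clusters must be divisible by num_lineages")
    per_lineage = num_clusters // num_lineages  # M / L clusters per lineage
    if mode == "split":
        separated = t >= event_time
    else:
        separated = t < event_time
    return cluster // per_lineage if separated else 0


def cross_cluster_edge_allowed(c1, c2, t, event_time, num_clusters,
                               num_lineages, mode="split"):
    """Between-cluster edges are only permitted inside a single lineage."""
    args = (t, event_time, num_clusters, num_lineages, mode)
    return lineage_of(c1, *args) == lineage_of(c2, *args)


# Example: M = 6 clusters splitting into L = 2 lineages at time 10.
before = cross_cluster_edge_allowed(0, 5, t=5, event_time=10,
                                    num_clusters=6, num_lineages=2)
after = cross_cluster_edge_allowed(0, 5, t=12, event_time=10,
                                   num_clusters=6, num_lineages=2)
```

In this example `before` is `True` and `after` is `False`: clusters 0 and 5 may be linked while there is a single lineage, but not once they sit in different lineages.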
With the lineage split/merge model, the idea is to use seriation to recover the major lineage structure of the model.
The annotation convention used is:
In Fig. 1, the lineage structure is evident. Two lineages on the right begin early, and come together with some circles to form a single lineage, which continues to the left with thicker shapes. There is some confusion around the branch point, with some later assemblages mixed into the terminal end of the earlier branches. But in general this is darned good.
The difficulty is that only some samples from a given network model/CT simulation yield clear structure. Some don’t at all, even though they’re samples from the same underlying data set. There are several possibilities for why this would happen:
Of course, both can be occurring. Now it’s time to figure out what’s going on.
The next steps are to focus on tweaking the factors just listed and seeing what encourages robust lineage structure recovery. We need:
This will allow a large batch with replicates, to see where we get good, reliable recovery, and where we don’t.