|Created: 22 Mar 2016||Modified: 23 Jun 2017|
Initial experiments using the sorted Laplacian spectrum as the features for a classifier model are promising. Even with small samples, it seems to show the following:
This comes from building a multi-class GB tree model from
sc-4-nn, and predicting the data generating model from a 10% holdout set.
The classifier results hold pretty steady in a qualitative sense regardless of the random train/test split.
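A minimal sketch of this kind of pipeline, to make the setup concrete. Everything specific here is an assumption for illustration and not the actual experiment: the two synthetic graph "models" (random graphs at two edge densities), the helper names `laplacian_spectrum` and `random_graph`, the choice of the combinatorial Laplacian, the choice of the 10 largest eigenvalues, and scikit-learn's `GradientBoostingClassifier` as the GB tree model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split


def laplacian_spectrum(adj, k=10):
    """Return the k largest Laplacian eigenvalues, sorted in descending order."""
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj   # combinatorial Laplacian L = D - A (assumed)
    eig = np.linalg.eigvalsh(lap)          # real eigenvalues, ascending order
    return eig[::-1][:k]


def random_graph(n, p, rng):
    """Hypothetical stand-in generator: symmetric random adjacency matrix."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


rng = np.random.default_rng(0)

# Two made-up "models" that differ only in edge density; labels are 0 and 1.
graphs = [(random_graph(20, p, rng), label)
          for label, p in enumerate((0.15, 0.45))
          for _ in range(100)]
X = np.array([laplacian_spectrum(a) for a, _ in graphs])
y = np.array([label for _, label in graphs])

# Hold out 10% of the samples, stratified so both classes appear in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print("holdout accuracy:", accuracy)
```

Changing `random_state` in `train_test_split` gives a different random split, which is the knob to turn when checking whether the results hold steady.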
What doesn’t hold steady are the predicted class and the class probabilities for the PFG continuity graph. I get different answers depending on the train/test split, which is probably a function of:
Given that there isn’t much overlap in the overall classification itself, my guess is that if we could look at this in the 10-dimensional space of the eigenvalues used, we would see that:
Given this, a different train/test split could shift a decision boundary very slightly, without having a major impact on the overall confusion matrix among models, and thus change the predicted assignment for the PFG sample.
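One rough way to check this is to refit on several random splits and record both the holdout accuracy and the class probabilities assigned to a single fixed sample. The sketch below uses synthetic data only: two Gaussian clusters in 10 dimensions stand in for the model spectra, and a hypothetical `unknown` point placed midway between them plays the role of the PFG sample. `GradientBoostingClassifier` is again an assumed stand-in for the GB tree model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_per_class, n_features = 60, 10

# Synthetic stand-in for the spectra: two well-separated clusters in 10 dimensions.
centers = np.array([np.zeros(n_features), np.full(n_features, 4.0)])
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat([0, 1], n_per_class)

# Hypothetical sample sitting between the clusters, near any decision boundary.
unknown = centers.mean(axis=0, keepdims=True)

probs, scores = [], []
for seed in range(10):
    # Only the split changes between iterations; the model settings stay fixed.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.10, stratify=y, random_state=seed)
    clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    scores.append(clf.score(X_te, y_te))
    probs.append(clf.predict_proba(unknown)[0])

probs = np.array(probs)  # one row per split, one column per class
print("holdout accuracy per split:", np.round(scores, 2))
print("P(class 0) for the unknown sample:", np.round(probs[:, 0], 2))
print("predicted class per split:", probs.argmax(axis=1))
```

If the guess is right, the accuracy column should barely move while the probabilities for the in-between sample may swing from split to split.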
We might be able to visualize something like the above by using a dimensionality reduction technique, mapping the models against, say, the first 3 principal components, and then putting PFG on the map. Worth a try.
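A sketch of how that projection could be done with plain NumPy, using PCA computed from the SVD of the centered feature matrix. The data is synthetic (four made-up clusters of 10 eigenvalues each), and the helper names `pca_fit` and `pca_project` and the `unknown` sample are hypothetical. The principal axes are fit on the model samples only, and the unknown sample is then projected into that fixed space so that it does not influence the map it is placed on.

```python
import numpy as np


def pca_fit(X, n_components=3):
    """Return the feature mean and the top principal axes (as rows) of X."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_project(X, mean, axes):
    """Project the rows of X onto the given principal axes."""
    return (np.atleast_2d(X) - mean) @ axes.T


rng = np.random.default_rng(2)

# Synthetic stand-in: 4 models x 50 spectra each, 10 eigenvalues per spectrum.
centers = rng.normal(scale=3.0, size=(4, 10))
X = np.vstack([c + rng.normal(size=(50, 10)) for c in centers])
labels = np.repeat(np.arange(4), 50)

# Hypothetical PFG-like sample lying between the first two models.
unknown = centers[:2].mean(axis=0)

mean, axes = pca_fit(X, n_components=3)
coords = pca_project(X, mean, axes)                # shape (200, 3)
unknown_coords = pca_project(unknown, mean, axes)  # shape (1, 3)

for m in range(4):
    centroid = coords[labels == m].mean(axis=0)
    print(f"model {m} centroid in PC space:", np.round(centroid, 2))
print("unknown sample in PC space:", np.round(unknown_coords[0], 2))
```

The `coords` array and `unknown_coords` can be passed straight to a 3D scatter plot, with one color per model label and a distinct marker for the unknown sample.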
But removing this issue and getting stable predictions for PFG is going to be a function of:
While I develop more network models, I will probably start doing the second and third for the existing four models, but with PNN models collapsed down to a single model. I don’t have the formal infrastructure yet for doing multiple realizations of a single network model, so that’s the first step.