Clustering is an unsupervised machine learning technique that extracts patterns from unlabeled datasets by grouping their elements according to a similarity measure. Most clustering techniques require a priori knowledge of the number of clusters, which is difficult to obtain yet necessary for effective and accurate pattern recognition and latent (not directly observable) feature analysis. Recently, graph-based Symmetric Non-negative Matrix Factorization (SymmNMF) has been demonstrated to perform better than k-means and spectral clustering. Here, we present a consensus clustering method based on a robust resampling technique which, in conjunction with SymmNMF and the Proportion of Ambiguous Clustering (PAC) criterion, performs robust graph-based clustering and accurately identifies the number of clusters in several non-convex benchmark datasets.
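To make the PAC criterion concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that each resampled clustering run is given as a dictionary from sample index to cluster label, that the consensus value of a pair is the fraction of runs containing both samples in which they share a cluster, and that the ambiguity window is (0.1, 0.9], a commonly used choice. The function names `consensus_values` and `pac`, and the window bounds, are illustrative assumptions. In such a scheme, the candidate number of clusters with the lowest PAC would be selected.

```python
from itertools import combinations


def consensus_values(labelings, n_samples):
    """Return {(i, j): fraction of runs in which i and j share a cluster}.

    labelings: list of dicts mapping sample index -> cluster label, one dict
    per resampled clustering run (hypothetical input format). A run only
    counts for a pair if both samples were drawn in that run.
    """
    together = {}
    sampled = {}
    for labels in labelings:
        for i, j in combinations(sorted(labels), 2):
            sampled[(i, j)] = sampled.get((i, j), 0) + 1
            if labels[i] == labels[j]:
                together[(i, j)] = together.get((i, j), 0) + 1
    return {
        pair: together.get(pair, 0) / count
        for pair, count in sampled.items()
        if pair[0] < n_samples and pair[1] < n_samples
    }


def pac(consensus, lower=0.1, upper=0.9):
    """Proportion of sample pairs whose consensus value lies in (lower, upper].

    This equals CDF(upper) - CDF(lower) of the consensus values; the bounds
    0.1 and 0.9 are an assumed default, not taken from the source.
    """
    values = list(consensus.values())
    if not values:
        return 0.0
    ambiguous = sum(1 for v in values if lower < v <= upper)
    return ambiguous / len(values)


# Three toy runs over four samples (labels are arbitrary per run).
runs = [
    {0: 0, 1: 0, 2: 1, 3: 1},
    {0: 1, 1: 1, 2: 0, 3: 0},
    {0: 0, 1: 0, 2: 0, 3: 1},
]
toy_consensus = consensus_values(runs, n_samples=4)
toy_pac = pac(toy_consensus)
```

In the toy example, samples 0 and 1 always co-cluster and the pairs (0, 3) and (1, 3) never do, so only the three remaining pairs fall inside the ambiguity window.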
We present a weak matching algorithm for interval graphs to detect recurrent patterns in multimodal temporal data, with feature time series extracted by nonnegative tensor factorization (NTF). NTF enables latent feature extraction as well as a uniform representation of multimodal observables. This work builds on our previous work introducing an interval graph representation framework for multi-sensor data. Salient data regions and their relationships are represented by temporal interval graphs, where observables are captured as time intervals (nodes), and temporally proximate nodes are related by edges. Comparing events is then posed as a subgraph matching problem. However, subgraph matching is notoriously difficult (NP-complete), with polynomial-time algorithms known only for very restricted families of graphs. Even in these cases, perturbations to graph structure from missing or extra nodes and edges can lead to brittle matching results. Indeed, real-world sensing involves noisy environments where extraneous or missing observables interfere with event interval graph structures. To cope with these challenges, we propose a proxy representation of interval graphs via their shortest and longest paths, and we compare graphs by matching their path sets. We describe an attributed path matching scheme that is robust to inclusions and exclusions of nodes, obtained by adapting the dynamic-programming longest common subsequence algorithm. We demonstrate the efficacy of interval graph analysis of tensor features on real-world multimodal sensor data, where we investigate the detectability, similarity, and distinguishability of three sets of known events based on ground truth. We illustrate our results with match matrices and ROC curves.
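The path matching idea can be illustrated with a minimal sketch of a longest common subsequence computed under an attribute-aware match rule. This is an illustration under stated assumptions, not the scheme used in this work: each path node is assumed to be a `(label, duration)` pair, two nodes are assumed to match when their labels agree and their durations differ by at most a tolerance, and the similarity score normalizes by the longer path. The names `nodes_match`, `lcs_length`, and `path_similarity`, the node attributes, and the tolerance value are all hypothetical.

```python
def nodes_match(a, b, tol=0.5):
    """Hypothetical match rule: same label, durations within tol."""
    return a[0] == b[0] and abs(a[1] - b[1]) <= tol


def lcs_length(p, q, match=nodes_match):
    """Length of the longest common subsequence of paths p and q.

    Standard dynamic program, with equality replaced by the match predicate.
    Unmatched nodes in either path are skipped, which is what gives
    tolerance to extra or missing nodes.
    """
    m, n = len(p), len(q)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if match(p[i - 1], q[j - 1]):
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


def path_similarity(p, q, match=nodes_match):
    """Matched fraction relative to the longer path (assumed normalization)."""
    if not p or not q:
        return 0.0
    return lcs_length(p, q, match) / max(len(p), len(q))


# Toy paths: the second contains one extraneous node ("noise").
path_a = [("audio", 1.0), ("video", 2.0), ("seismic", 0.5)]
path_b = [("audio", 1.2), ("noise", 0.3), ("video", 2.1), ("seismic", 0.6)]
score = path_similarity(path_a, path_b)
```

Here all three nodes of `path_a` are matched in order despite the inserted node in `path_b`, so the extra observable lowers the score only through the normalization rather than breaking the match.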