Principal Manifolds
for Data Visualisation and Analysis
Principal Manifolds for Data Visualisation
and Dimension Reduction
By chapter (PDF files of chapters):
Contents
Frontmatter
(Preface-Contents-List of Authors)
1
Developments and Applications of Nonlinear
Principal Component Analysis – a Review
Uwe Kruger, Junping
Zhang, Lei Xie
1.1 Introduction
1.2 PCA Preliminaries
1.3 Nonlinearity Test for PCA Models
1.3.1 Assumptions
1.3.2 Disjunct Regions
1.3.3 Confidence Limits for Correlation
Matrix
1.3.4 Accuracy Bounds
1.3.5 Summary of the Nonlinearity Test
1.3.6 Example Studies
1.4 Nonlinear PCA Extensions
1.4.1 Principal Curves and Manifolds
1.4.2 Neural Network Approaches
1.4.3 Kernel PCA
1.5 Analysis of Existing Work
1.5.1 Computational Issues
1.5.2 Generalization of Linear PCA?
1.5.3 Roadmap for Future Developments
(Basics and Beyond)
1.6 Concluding Summary
References
2
Nonlinear Principal Component Analysis:
Neural Network Models and Applications
Matthias Scholz, Martin Fraunholz, Joachim
Selbig
2.1 Introduction
2.2 Standard Nonlinear PCA
2.3 Hierarchical Nonlinear PCA
2.3.1 The Hierarchical Error Function
2.4 Circular PCA
2.5 Inverse Model of Nonlinear PCA
2.5.1 The Inverse Network Model
2.5.2 NLPCA Models Applied to Circular
Data
2.5.3 Inverse NLPCA for Missing Data
2.5.4 Missing Data Estimation
2.6 Applications
2.6.1 Application of Hierarchical
NLPCA
2.6.2 Metabolite Data Analysis
2.6.3 Gene Expression Analysis
2.7 Summary
References
3
Learning Nonlinear Principal Manifolds
Hujun Yin
3.1 Introduction
3.2 Biological Background
3.2.1 Lateral Inhibition and Hebbian
Learning
3.2.2 From Von Marsburg and Willshaw’s Model
to Kohonen’s SOM
3.2.3 The SOM Algorithm
3.3 Theories
3.3.1 Convergence and Cost Functions
3.3.2 Topological Ordering Measures
3.4 SOMs, Multidimensional Scaling
and Principal Manifolds
3.4.1 Multidimensional Scaling
3.4.2 Principal Manifolds
3.4.3 Visualisation Induced SOM
(ViSOM)
3.5 Examples
3.5.1 Data
Visualisation
3.5.2 Document
Organisation and Content Management
References
4
Elastic Maps and Nets for Approximating
Principal Manifolds and Their Application
to Microarray Data Visualization
Alexander N Gorban, Andrei Y Zinovyev
4.1 Introduction and Overview
4.1.1 Fréchet
Mean and Principal Objects
K-Means, PCA, what else?
4.1.2 Principal Manifolds
4.1.3 Elastic Functional and Elastic
Nets
4.2 Optimization of Elastic Nets for Data
Approximation
4.2.1 Basic Optimization Algorithm
4.2.2 Missing Data Values
4.2.3 Adaptive Strategies
4.3 Elastic Maps
4.3.1 Piecewise Linear Manifolds and Data
Projectors
4.3.2 Iterative Data Approximation
4.4 Principal Manifold as Elastic
Membrane
4.5 Method Implementation
4.6 Examples
4.6.1 Test Examples
4.6.2 Modeling Molecular Surfaces
4.6.3 Visualization of Microarray Data
4.7 Discussion
References
5
Topology-Preserving Mappings for Data Visualisation
Marian Peña, Wesam Barbakh, Colin Fyfe
5.1 Introduction
5.2 Clustering Techniques
5.2.1 K-Means
5.2.2 K-Harmonic Means
5.2.3 Neural Gas
5.2.4 Weighted K-Means
5.2.5 The Inverse Weighted K-Means
5.3 Topology Preserving Mappings
5.3.1 Generative Topographic Map
5.3.2 Topographic Product of Experts
ToPoE
5.3.3 The Harmonic Topographic Map
5.3.4 Topographic Neural Gas
5.3.5 Inverse-Weighted K-Means
Topology-Preserving Map
5.4 Experiments
5.4.1 Projections in Latent Space
5.4.2 Responsibilities
5.4.3 U-matrix, Hit Histograms and Distance
Matrix
5.4.4 The Quality of The Map
5.5 Conclusions
References
6
The Iterative Extraction Approach to Clustering
Boris Mirkin
6.1 Introduction
6.2 Clustering Entity-to-feature Data
6.2.1 Principal Component Analysis
6.2.2 Additive Clustering Model and
ITEX
6.2.3 Overlapping and Fuzzy Clustering
Case
6.2.4 K-Means and iK-Means
Clustering
6.3 ITEX Structuring and Clustering for
Similarity Data
6.3.1 Similarity Clustering: a Review
6.3.2 The Additive Structuring Model and
ITEX
6.3.3 Additive Clustering Model
6.3.4 Approximate Partitioning
6.3.5 One Cluster Clustering
6.3.6 Some Applications
References
7
Representing Complex Data Using Localized Principal
Components with Application to Astronomical Data
Jochen Einbeck, Ludger Evers, Coryn
Bailer-Jones
7.1 Introduction
7.2 Localized Principal Component
Analysis
7.2.1 Cluster-wise PCA
7.2.2 Principal Curves
7.2.3 Further Approaches
7.3 Combining Principal Curves and
Regression
7.3.1 Principal Component Regression and
its Shortcomings
7.3.2 The Generalization to Principal
Curves
7.3.3 Using Directions Other than the Local
Principal Components
7.3.4 A Simple Example
7.4 Application to the Gaia Survey
7.4.1 The Astrophysical Data
7.4.2 Principal Manifold Based
Approach
7.5 Conclusion
References
8
Auto-Associative Models, Nonlinear Principal Component
Analysis, Manifolds and Projection Pursuit
Stéphane Girard, Serge Iovleff
8.1 Introduction
8.2 Auto-Associative Models
8.2.1 Approximation by Manifolds
8.2.2 A Projection Pursuit Algorithm
8.2.3 Theoretical Results
8.3 Examples
8.3.1 Linear Auto-Associative Models and
PCA
8.3.2 Additive Auto-Associative Models and
Neural Networks
8.4 Implementation Aspects
8.4.1 Estimation of the Regression
Functions
8.4.2 Computation of Principal
Directions
8.5 Illustration on Real and Simulated
Data
References
9
Beyond The Concept of Manifolds: Principal Trees,
Metro Maps, and Elastic Cubic Complexes
Alexander N Gorban, Neil R Sumner, Andrei Y
Zinovyev
9.1 Introduction and Overview
9.1.1 Elastic Principal Graphs
9.2 Optimization of Elastic Graphs
for Data Approximation
9.2.1 Elastic Functional Optimization
9.2.2 Optimal Application of Graph
Grammars
9.2.3 Factorization and Transformation of
Factors
9.3 Principal Trees (Branching Principal
Curves)
9.3.1 Simple Graph Grammar (“Add a Node”,
“Bisect an Edge”)
9.3.2 Visualization of Data Using “Metro
Map” Two-Dimensional Tree Layout
9.3.3 Example of Principal Cubic Complex: Product
of Principal Trees
9.4 Analysis of the Universal 7-Cluster
Structure
of Bacterial Genomes
9.4.1 Brief Introduction
9.4.2 Visualization of the 7-Cluster
Structure
9.5 Visualization of Microarray Data
9.5.1 Dataset Used
9.5.2 Principal Tree of Human Tissues
9.6 Discussion
References
10
Diffusion Maps - a Probabilistic Interpretation
for Spectral Embedding and Clustering Algorithms
Boaz Nadler, Stephane Lafon, Ronald
Coifman, Ioannis G Kevrekidis
10.1 Introduction
10.2 Diffusion Distances and
Diffusion Maps
10.2.1 Asymptotics of the Diffusion
Map
10.3 Spectral Embedding of Low Dimensional
Manifolds
10.4 Spectral Clustering of a Mixture of
Gaussians
10.5 Summary and Discussion
References
11
On Bounds for Diffusion, Discrepancy and Fill Distance Metrics
Steven B Damelin
11.1 Introduction
11.2 Energy, Discrepancy, Distance
and Integration on Measurable Sets in
Euclidean Space
11.3 Set Learning via Normalized Laplacian
Dimension Reduction and Diffusion
Distance
11.4 Main Result: Bounds for Discrepancy,
Diffusion and Fill Distance Metrics
References
12
Geometric Optimization Methods for the Analysis of Gene Expression Data
Michel Journée, Andrew E Teschendorff, Pierre-Antoine Absil,
Simon Tavaré, Rodolphe Sepulchre
12.1 Introduction
12.2 ICA as a Geometric Optimization Problem
12.3 Contrast Functions
12.3.1 Mutual Information [8, 10]
12.3.2 F-Correlation [14]
12.3.3 Non-Gaussianity [17]
12.3.4 Joint Diagonalization of Cumulant
Matrices [19]
12.4 Matrix Manifolds for
12.5 Optimization Algorithms
12.5.1 Line-Search Algorithms
12.5.2 FastICA
12.5.3 Jacobi Rotations
12.6 Analysis of Gene Expression Data by ICA
12.6.1 Some Issues About the Application of ICA
12.6.2 Evaluation of the Biological
Relevance
of the Expression Modes
12.6.3 Results Obtained on the Breast
Cancer
Microarray Data Set
12.7 Conclusion
References
13
Dimensionality Reduction and Microarray Data
David A Elizondo, Benjamin N Passow, Ralph
Birkenhead,
Andreas Huemer
13.1 Introduction
13.2 Background
13.2.1 Microarray Data
13.2.2 Methods for Dimension Reduction
13.2.3 Linear Separability
13.3 Comparison Procedure
13.3.1 Data Sets
13.3.2 Dimensionality Reduction
13.3.3 Perceptron Models
13.4 Results
13.5 Conclusions
References
14
PCA and K-Means Decipher Genome
Alexander N Gorban, Andrei Y Zinovyev
14.1 Introduction
14.2 Required Materials
14.3 Genomic Sequence
14.3.1 Background
14.3.2 Sequences for the Analysis
14.4 Converting Text to a Numerical
Table
14.5 Data Visualization
14.5.1 Visualization
14.5.2 Understanding Plots
14.6 Clustering and Visualizing
Results
14.7 Task List and Further Information
14.8 Conclusion
References