Data Fusion

Data fusion is the systematic, joint analysis of data from different sources (measurement approaches), with the aim of exploiting their synergy. A major strength of data fusion, and at the same time a potential pitfall, is that there is no magic bullet and no universal solution. Careful analysis of the task at hand and suitable adaptation of the analysis strategy therefore become as important as the analysis methodology itself.
Introductions
- Book by Age Smilde, Tormod Næs and Kristian Hovde Liland
General Theory
- Survey on PLS methods by J.A. Wegelin
- Analysis of multiblock PCA and PLS by J.A. Westerhuis, T. Kourti and J.F. MacGregor
- Deflation in multiblock PLS by J.A. Westerhuis and A. Smilde
- PLS regression methods "classic" article by A. Höskuldsson
- A multiblock PLS algorithm by L.E. Wangen and B.R. Kowalski
- Advances in PLS by R. Rosipal and N. Krämer
Methods
- Canonical Correlation Analysis
- Consensus PCA and variants by A.K. Smilde, J.A. Westerhuis and S. de Jong
- Multi-block PLS by S. Wold et al.
- O2-PLS by J. Trygg and S. Wold
- Statis by I. Stanimirova, B. Walczak, D.L. Massart et al.
- L-PLS according to H. Martens et al.
- L-PLS according to V. Esposito Vinzi et al.
- L-PLS according to K. Muteki and J.F. MacGregor
- L-PLS according to L. Eriksson et al.
- PLS path modeling by M. Tenenhaus, V. Esposito Vinzi et al.
- PLS-path modelling comparison by M. Tenenhaus and V. Esposito Vinzi
Applications
- PAT by T. Kourti, P. Nomikos and J.F. MacGregor
- PAT by J.A. Westerhuis and P.M.J. Coenegracht
- Unifying multi-block analysis for PAT by S.J. Qin, S. Valle and M.J. Piovoso
- Statis for batch monitoring by S. Gourvénec et al.
- MSPC by J. Kohonen et al.
- O2-PLS in bioinformatics by M. Bylesjö et al.
- Consensus-PCA and Multiblock-PLS for Metabolomics by A.K. Smilde et al.
- Co-inertia analysis for microarrays by A.C. Culhane et al.
- Generalized SVD for microarray analysis by O. Alter, P.O. Brown and D. Botstein
Pattern Recognition
- On Combining Classifiers The "classical" article by Kittler, Duin et al.
- To train or not to train? Article by R.P.W. Duin, also refer to his numerous other articles
- Combining Pattern Classifiers textbook by L.I. Kuncheva, also refer to her other publications
- Combining One-Class Classifiers by D.M.J. Tax and R.P.W. Duin
Kernel Approaches
- Kernel Fusion by G.R.G. Lanckriet et al.
- Multiple Kernel Learning by S. Sonnenburg, G. Rätsch, C. Schäfer and B. Schölkopf
- Kernel-CCA by T. De Bie, N. Cristianini and R. Rosipal
Sensor and Information Fusion
- International Society of Information Fusion
- Information Fusion an international journal
Software
- multiblock R package to accompany the above book by Smilde, Næs and Liland
- Matlab Statistics Toolbox contains e.g. canonical correlation analysis, Procrustes analysis, PCA and PLS
- Multi-block Toolbox for Matlab by Frans van den Berg
- CuBatch Comprehensive Matlab toolbox for batch modelling and multi-way analysis, see the article by S. Gourvénec et al.
- Simca-P+ incorporates the patented O2-PLS algorithm
- PRTools Matlab toolbox for pattern recognition including classifier combination
- Shogun Machine learning toolbox including multiple kernel learning
- CCA Canonical Correlation Analysis
- ade4 Co-inertia analysis by S. Dray, A.B. Dufour and D. Chessel
- MADE4 Co-inertia analysis for microarray analysis by A.C. Culhane et al.
Data Sets
- Spectroscopic data on wine from T. Skov, D. Ballabio and R. Bro
- Spotted, Affymetrix® U95 A-E chipset, U133 A+B chipset, proteomic and drug activity data for the NCI60 cancer cell lines. Refer to CellMiner™ or the MADE4 supplement for some pre-compiled data.
- Data on Protein classification accompanying the publication on genomic data fusion by G.R.G. Lanckriet et al.
Conferences
Contact

Dr. Juergen von Frese
Data Analysis Solutions S.L.
 Telephone: +34 871 811 605
 E-Mail: jvf@da-sol.com
