This tutorial provides an introduction to Multi-Omics Factor Analysis (MOFA), a novel unsupervised framework for the integration of multi-omics data sets (Argelaguet et al., Molecular Systems Biology. 2018). Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis to multi-omics data. Given multiple 'omics data types measured on overlapping sets of samples, MOFA infers a low-dimensional representation of the data in terms of (hidden) factors. These learned factors capture the main sources of variation across data modalities, thus facilitating the identification of molecular phenotypes and disease subgroups.
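As a rough sketch of what "low-dimensional representation in terms of factors" means, the model can be written as a joint matrix factorization. The notation below follows the cited MOFA paper and is not otherwise defined in this abstract: $\mathbf{Y}^{m}$ is the data matrix for the $m$-th omics type ($N$ samples by $D_m$ features), $\mathbf{Z}$ is the factor matrix shared by all data types ($N$ samples by $K$ factors), $\mathbf{W}^{m}$ holds the feature weights for data type $m$, and $\boldsymbol{\varepsilon}^{m}$ is residual noise.

```latex
\mathbf{Y}^{m} = \mathbf{Z}\,\mathbf{W}^{m\top} + \boldsymbol{\varepsilon}^{m},
\qquad m = 1, \dots, M,
\qquad \mathbf{Z} \in \mathbb{R}^{N \times K},\;
\mathbf{W}^{m} \in \mathbb{R}^{D_m \times K}
```

With a single data type ($M = 1$) this has the same form as the factorization underlying principal component analysis, which is the sense in which MOFA generalizes it.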
In the first part of the tutorial I will give a 30-minute presentation explaining the model, its applications and its limitations. The second part will consist of a hands-on session in which we will use two real data sets to show how MOFA can be used for integrative analysis. The first is a large study of blood cancer patients (Dietrich, J Clin Invest. 2018), and the second is a single-cell multi-omics data set (Angermueller, Nature Methods. 2016). Attendees are also encouraged to bring their own multi-omics data sets.
A working knowledge of R is expected.
The tutorial requires the installation of the following software:
• R >= 3.4 and RStudio
• MOFA R package (+ dependencies)
• MOFAdata R package (+ dependencies)
• mofapy Python package (+ dependencies)
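Before the session, it may help to confirm that the setup is in place. The following is a minimal sketch, not an official installation check: the function name `check_prerequisites` is hypothetical, and it only tests that an `Rscript` executable is on the PATH and that the `mofapy` module can be found by the Python interpreter running the script.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict mapping each checked prerequisite to True or False.

    Hypothetical helper for illustration only; it does not verify
    package versions or the MOFA / MOFAdata R packages.
    """
    return {
        # R itself: look for the Rscript executable on the PATH.
        "Rscript on PATH": shutil.which("Rscript") is not None,
        # mofapy: check that the module can be located without importing it.
        "mofapy importable": importlib.util.find_spec("mofapy") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

The MOFA and MOFAdata R packages are not covered by this script; they are best checked from within R itself, for example by running `library(MOFA)` and `library(MOFAdata)` in an R session.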