PMID- 36456571
OWN - NLM
STAT- MEDLINE
VI  - 13
IP  - 1
TI  - A unified computational framework for single-cell data integration with optimal
      transport.
PG  - 7419
CI  - © 2022. The Author(s).
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
PL  - England
TA  - Nat Commun
JT  - Nature communications
JID - 101528555
IS  - 2041-1723 (Electronic)
LID - 10.1038/s41467-022-35094-8 [doi]
FAU - Cao, Kai
AU  - Cao K
AUID- ORCID: 0000-0002-9524-4942
AD  - LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of
      Sciences, Beijing, China.
AD  - School of Mathematical Sciences, University of Chinese Academy of Sciences,
      Beijing, China.
FAU - Gong, Qiyu
AU  - Gong Q
AD  - Shanghai Institute of Immunology, Faculty of Basic Medicine, Shanghai Jiao Tong
      University School of Medicine, Shanghai, China.
FAU - Hong, Yiguang
AU  - Hong Y
AUID- ORCID: 0000-0001-9505-8739
AD  - Department of Control Science and Engineering, Tongji University, Shanghai,
      China. yghong@iss.ac.cn.
FAU - Wan, Lin
AU  - Wan L
AUID- ORCID: 0000-0002-3511-0512
AD  - LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of
      Sciences, Beijing, China. lwan@amss.ac.cn.
AD  - School of Mathematical Sciences, University of Chinese Academy of Sciences,
      Beijing, China. lwan@amss.ac.cn.
IS  - 2041-1723 (Linking)
RN  - 0 (Chromatin)
SB  - IM
MH  - *Chromatin/genetics
MH  - Ion Transport
MH  - *Transcriptome/genetics
PMC - PMC9715710
DCOM- 20221205
LR  - 20221229
DP  - 20221201
DEP - 20221201
AB  - Single-cell data integration can provide a comprehensive molecular view of cells.
      However, how to integrate heterogeneous single-cell multi-omics as well as
      spatially resolved transcriptomic data remains a major challenge. Here we
      introduce uniPort, a unified single-cell data integration framework that
      combines a coupled variational autoencoder (coupled-VAE) and minibatch
      unbalanced optimal transport (Minibatch-UOT). It leverages both highly variable
      common and dataset-specific genes for integration to handle the heterogeneity
      across datasets, and it is scalable to large-scale datasets. uniPort jointly
      embeds heterogeneous single-cell multi-omics datasets into a shared latent
      space. It can further construct a reference atlas for gene imputation across
      datasets. Meanwhile, uniPort provides a flexible label transfer framework to
      deconvolute heterogeneous spatial transcriptomic data using an optimal transport
      plan, instead of embedding latent space. We demonstrate the capability of
      uniPort by applying it to integrate a variety of datasets, including single-cell
      transcriptomics, chromatin accessibility, and spatially resolved transcriptomic
      data.