DoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors.

IF: 11.091

Cited by: 1,168

Abstract

Single-cell RNA sequencing (scRNA-seq) data are commonly affected by technical artifacts known as "doublets," which limit cell throughput and lead to spurious biological conclusions. Here, we present a computational doublet detection tool, DoubletFinder, that identifies doublets using only gene expression data. DoubletFinder predicts doublets according to each real cell's proximity in gene expression space to artificial doublets created by averaging the transcriptional profile of randomly chosen cell pairs. We first use scRNA-seq datasets where the identity of doublets is known to show that DoubletFinder identifies doublets formed from transcriptionally distinct cells. When these doublets are removed, the identification of differentially expressed genes is enhanced. Second, we provide a method for estimating DoubletFinder input parameters, allowing its application across scRNA-seq datasets with diverse distributions of cell types. Lastly, we present "best practices" for DoubletFinder applications and illustrate that DoubletFinder is insensitive to an experimentally validated kidney cell type with "hybrid" expression features.
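
The core idea in the abstract (average random cell pairs into artificial doublets, then score each real cell by how many of its nearest neighbors are artificial) can be illustrated with a small toy sketch. This is not the DoubletFinder implementation: the function names, the two-dimensional toy "expression space," the Euclidean distance, and the values of `k` and `n_doublets` are all assumptions chosen for illustration, and the published tool's preprocessing and parameter estimation are not reproduced here.

```python
import math
import random


def make_artificial_doublets(cells, n_doublets, rng):
    """Average the profiles of randomly chosen pairs of distinct real cells."""
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(cells, 2)
        doublets.append([(x + y) / 2 for x, y in zip(a, b)])
    return doublets


def artificial_neighbor_fraction(cells, doublets, k):
    """For each real cell, return the fraction of its k nearest neighbors
    (among all other real cells and all artificial doublets) that are artificial."""
    merged = [(p, False) for p in cells] + [(p, True) for p in doublets]
    scores = []
    for i, cell in enumerate(cells):
        # Distance to every other point; the cell itself (index i) is skipped.
        dists = [
            (math.dist(cell, p), is_artificial)
            for j, (p, is_artificial) in enumerate(merged)
            if j != i
        ]
        dists.sort(key=lambda t: t[0])
        scores.append(sum(is_art for _, is_art in dists[:k]) / k)
    return scores


# Toy data (hypothetical): two well-separated cell types in a 2-D space,
# plus one real doublet lying between them.
rng = random.Random(0)
type_a = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(20)]
type_b = [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(20)]
real_doublet = [5.0, 5.0]
cells = type_a + type_b + [real_doublet]  # the doublet is the last cell

artificial = make_artificial_doublets(cells, n_doublets=40, rng=rng)
scores = artificial_neighbor_fraction(cells, artificial, k=5)

singlet_mean = sum(scores[:-1]) / len(scores[:-1])
print(f"mean singlet score: {singlet_mean:.2f}")
print(f"real doublet score: {scores[-1]:.2f}")
```

In this toy setup the real doublet sits where averaged A/B pairs land, so its neighborhood is dominated by artificial doublets, while singlets are mostly surrounded by real cells of their own type. The same reasoning suggests why doublets formed from transcriptionally similar cells are harder to detect: their averaged profiles fall inside an existing cluster rather than between clusters.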

Keywords

Spatial reconstruction

Gene Expression

Seurat

doublet detection

machine learning

quality-control

single-cell RNA sequencing

MeSH terms

Humans

RNA-Seq

Single-Cell Analysis

Software

Authors

McGinnis, Christopher S

Murrow, Lyndsay M

Gartner, Zev J

