Cell-omics Data Coordinate Platform (CDCP)
CDCP aims to provide a platform for sharing single-cell datasets and for querying the expression of genes in specific cell types or clusters. CDCP currently contains expression profiles of 6,467 samples and 8,000,000 cells from public datasets across multiple species, including humans, monkeys, and other animals.
Datasets
The datasets were curated from NCBI GEO, the CDCP archiving system, and other sources.
1. Data pre-processing and analysis
In brief, we used Scanpy (version 1.8.1) with default parameters to analyze the curated datasets. We first normalized and log-transformed the downloaded gene expression data, then performed principal component analysis (PCA) on the top 2000 highly variable genes to reduce the dimensionality of the data. Next, we computed the neighborhood graph from the PCA results, embedded the data with Uniform Manifold Approximation and Projection (UMAP), and clustered it with the Leiden algorithm. For each dataset, we identified cluster-specific marker genes with the Wilcoxon rank-sum test in Scanpy.
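The steps above map roughly onto the following Scanpy calls. This is a minimal sketch under stated assumptions, not CDCP's actual pipeline code: the function name run_pipeline_sketch, the constant N_TOP_GENES, and the use of "leiden" as the cluster key are illustrative. Only the tool, the 2000 highly variable genes, and the sequence of steps come from the description. Scanpy is imported inside the function so the sketch can be loaded without it installed.

```python
N_TOP_GENES = 2000  # from the text: top 2000 highly variable genes


def run_pipeline_sketch(adata):
    """Apply the described steps in place to an AnnData object of raw counts.

    Hypothetical helper; assumes scanpy plus its UMAP and Leiden
    dependencies are installed.
    """
    import scanpy as sc  # the text reports version 1.8.1

    sc.pp.normalize_total(adata)   # normalize counts per cell (default parameters)
    sc.pp.log1p(adata)             # log(1 + x) transform
    sc.pp.highly_variable_genes(adata, n_top_genes=N_TOP_GENES)
    sc.tl.pca(adata)               # uses the highly variable genes by default
    sc.pp.neighbors(adata)         # neighborhood graph built on the PCA result
    sc.tl.umap(adata)              # 2D embedding for visualization
    sc.tl.leiden(adata)            # cluster labels stored in adata.obs["leiden"]
    # Rank cluster-specific marker genes with the Wilcoxon rank-sum test.
    sc.tl.rank_genes_groups(adata, groupby="leiden", method="wilcoxon")
    return adata
```

Calling run_pipeline_sketch(adata) on a loaded dataset would leave the embedding, cluster labels, and ranked marker genes on the same AnnData object.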
2. Data visualization
2.1 Gene Expression
Here you can view the results of the dataset analysis on a UMAP plot and visualize the expression pattern of a gene on the UMAP.
2.2 Dot Plots
Users can explore gene expression distributions across cell categories using dot plots.
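As an illustration of what a dot plot usually encodes, the sketch below computes two numbers for one gene in each cell category: the fraction of cells expressing the gene (commonly drawn as dot size) and the mean expression (commonly drawn as dot color). It assumes CDCP follows this common convention; the function name, threshold, and toy data are hypothetical.

```python
from collections import defaultdict


def dot_plot_stats(expression, categories, threshold=0.0):
    """Return {category: (fraction_expressing, mean_expression)} for one gene."""
    grouped = defaultdict(list)
    for value, category in zip(expression, categories):
        grouped[category].append(value)

    stats = {}
    for category, values in grouped.items():
        expressing = sum(1 for v in values if v > threshold)
        stats[category] = (expressing / len(values), sum(values) / len(values))
    return stats


# Toy data: one gene measured in five cells from two hypothetical cell types.
expr = [0.0, 2.0, 4.0, 0.0, 1.0]
cell_types = ["T cell", "T cell", "T cell", "B cell", "B cell"]
stats = dot_plot_stats(expr, cell_types)
```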
2.3 Violin
Users can explore gene expression distributions across cell categories using violin plots.
2.4 Statistics
1. A violin plot of some of the computed quality measures in the dataset.
2. The number of genes expressed in the count matrix.
3. The total counts per cell.
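The last two measures can be computed directly from a count matrix. The sketch below assumes a cells-by-genes matrix given as a list of rows and treats a gene as expressed in a cell when its count is above zero; the function name and the toy matrix are hypothetical.

```python
def qc_metrics(counts):
    """Return (genes expressed per cell, total counts per cell).

    counts: list of rows, one row per cell, one column per gene.
    """
    n_genes = [sum(1 for c in row if c > 0) for row in counts]
    totals = [sum(row) for row in counts]
    return n_genes, totals


# Toy count matrix: three cells, four genes.
counts = [
    [0, 3, 1, 0],
    [5, 0, 0, 0],
    [2, 2, 2, 2],
]
n_genes, totals = qc_metrics(counts)
```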
2.5 Marker gene
In this section, users can view the differentially expressed genes and their expression differences in each cluster through a heatmap and a table.
Sample
Detailed sample information for each curated dataset or project, such as species, tissue details, associated diseases, and so on.
Gene
Users can globally view and search for the expression patterns of genes in each dataset within a given cluster.
Bioproject
An overall description of the archived single-cell projects.
Marker gene
A list of commonly published marker genes for different species and cell types.
Collections
CDCP offers a customized database service for single-cell transcriptomics. Researchers can collaborate with CDCP to create single-cell databases or deploy their own specialized databases on CDCP.
Tools
On the Tools page, a description of Codeplot (the analysis tool) and the data analysis workflow are presented (Figure S1C). To facilitate the analysis and sharing of single-cell transcriptome sequencing datasets, the single-cell workspace was implemented to manage relevant datasets and to perform comprehensive bioinformatics analysis and visualization based on the expression matrix datasets archived in CDCP. Clicking "single-cell workspace" directs users to the Codeplot page, which provides a computing platform with a trusted execution environment for bioinformatics analysis.
Single cell data submission
Single cell data submission supports five major entities: project, sample, experiment, run, and analysis data. Submission is divided into two main processes: submitting the original sequencing data and submitting the analysis results. Project and sample information must be submitted before the data files.
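The ordering rule above can be pictured as a small prerequisite check. This is only an illustrative sketch: the PREREQUISITES map and the can_submit function are hypothetical and do not describe the submission portal's actual implementation. The text does not say whether a sample requires an existing project, so none is assumed here.

```python
# Hypothetical map: entity -> entities that must already be submitted.
PREREQUISITES = {
    "project": set(),
    "sample": set(),
    "experiment/run": {"project", "sample"},
    "analysis data": {"project", "sample"},
}


def can_submit(entity, already_submitted):
    """Return True if every prerequisite of `entity` has been submitted."""
    return PREREQUISITES[entity] <= set(already_submitted)


ready = can_submit("experiment/run", ["project", "sample"])
blocked = can_submit("experiment/run", ["project"])
```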
1. Submit project
1. Enter the submission process
Click "Project" on the submission portal page to enter the submission process.
2. Submit project information
Select Data management form -> Fill in the basic information -> Fill in the details -> Overview -> Submit
2. Submit sample
1. Enter the submission process
Click "Sample" on the submission portal page to enter the submission process.
2. Submit sample information
(1) Single submission: Select "Submit a single sample" -> Select sample type -> Fill in sample attributes -> Fields pass check -> Overview -> Submit
(2) Batch submission: Select "Submit batch samples" -> Select sample type -> Download template -> Upload completed template -> Template passes check -> Submit
3. Submit experiment / run
1. Enter the submission process
Click "Experiment / Run" on the submission portal page to enter the submission process.
2. Submit experiment / run information
(1) Single submission: Select submission type (Submit a single experiment/run) -> Fill in basic information -> Fill in metadata -> Metadata pass check -> Submit data files -> Data files pass check -> Overview -> Submit
(2) Batch submission: Select submission type (Submit batch experiments/runs) -> Upload data files -> Download metadata template -> Upload completed metadata template -> Metadata pass check -> Data files pass check -> Submit
4. Submit single cell analysis data
1. Enter the submission process
Click "Single cell" on the submission portal page to enter the submission process.
2. Submit single cell analysis data
Upload data files -> Download metadata template -> Upload completed metadata template -> Metadata pass check -> Data files pass check -> Submit