STOmicsDB Visualization Creation Guide
1. Introduction to Visualization Creation
STOmicsDB provides spatial omics data and metadata, including research areas, sample tissues, species, spatial resolutions, and publication types. The data includes gene expression matrices, spatial positions, tissue images, and analysis results like clustering and cell annotation.
The Dataset Creation system applies quality control to submitted data and runs analyses such as cell type annotation, spatial region identification, and cell-cell interaction analysis. Researchers can then explore and visualize the results to gain insight into spatial transcriptomics data.
2. Database Access
- Home page: https://db.cngb.org/stomics/
- Visualization creation: https://db.cngb.org/stomics/submission/data_visualization
- Browse datasets: https://db.cngb.org/stomics/datasets/
3. Steps for Visualization Creation
3.1 Accessing the Visualization Creation Interface
- Visit https://db.cngb.org/stomics/.
- Click Submit | Create under the Submission module to access the creation interface.
- Click Create to start the process.
- Log in using WeChat, CARSI, ORCID, GitHub, BGI, etc.
3.2 Starting a New Visualization
- Click New Creation to begin.
- A unique creation number (e.g., `stc0000xxx`) will be generated, and you will enter the creation details interface.
3.3 Completing Creation Details
Step 1: Upload Data
- Download and fill in the visualization creation template (see FAQ for details).
- Upload the template. Fix errors based on messages or email the support team for unresolved issues.
- Choose a data upload method (e.g., FTP or Aspera). After uploading, allow about 10 minutes for the system to detect the files.
- Click Check Files and Next to verify data integrity.
Step 2: Provide Additional Information
- Fill in details like submitter name, project name, etc.
3.4 Managing Visualizations
- Creation List Interface: https://db.cngb.org/stportal/creations/
- Unfinished Creations: Click Modify to continue or Delete to remove.
- Completed Creations:
- Click Apply Modification to request edit permissions. After approval, click Modify to edit.
- Click Apply Deletion to request deletion. After approval, the creation will be deleted.
4. How to Cite Data
4.1 Example Dataset References
- "The stereo-seq data used to generate Fig. 1 comes from the STOmicsDB database [1], query number STDS0000058 [2]."
- "The spatial mouse kidney data have been deposited into STOmicsDB [1] (https://db.cngb.org/stomics/datasets/STDS0000058 [2])."
4.2 Citation Formats
- Citing STOmicsDB:
- Xu, Zhicheng et al. "STOmicsDB: a comprehensive database for spatial transcriptomics data sharing, analysis and visualization." Nucleic Acids Research, vol. 52, D1 (2024): D1053-D1061. doi:10.1093/nar/gkad933
- Citing a Visualization Dataset (e.g., STDS0000058):
- Longqi Liu. MOSTA: Mouse Organogenesis Spatiotemporal Transcriptomic Atlas [DS/OL]. STOmicsDB, 2021 [2021-10-22]. https://db.cngb.org/stomics/datasets/STDS0000058/. doi: 10.26036/STDS0000058
- Citing Original Data Articles:
- Chen, Ao et al. "Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays." Cell, vol. 185, 10 (2022): 1777-1792.e21. doi:10.1016/j.cell.2022.04.003
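The dataset citation above pairs a landing-page URL with a DOI that both embed the dataset ID. As a minimal sketch, assuming that other datasets follow the same pattern as STDS0000058 (an inference from this single example, not something the guide states), the two strings could be derived as follows. The function name `dataset_links` is hypothetical.

```python
def dataset_links(dataset_id: str) -> dict:
    """Build the landing-page URL and DOI for a STOmicsDB dataset ID.

    Assumption: IDs look like "STDS" followed by digits, and the URL and
    DOI patterns match the STDS0000058 example in this guide.
    """
    dataset_id = dataset_id.strip()
    if not (dataset_id.startswith("STDS") and dataset_id[4:].isdigit()):
        raise ValueError(f"unexpected dataset ID: {dataset_id!r}")
    return {
        "url": f"https://db.cngb.org/stomics/datasets/{dataset_id}/",
        "doi": f"10.26036/{dataset_id}",
    }


links = dataset_links("STDS0000058")
print(links["url"])
print(links["doi"])
```

Check the generated DOI against the dataset page before using it in a manuscript.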
5. Visualization Creation - Dataset
5.1 Searching for Datasets
- Use the general or advanced search modules to find datasets.
5.2 Metadata Display
- Research Information: Includes species, tissues, diseases, developmental stages, publications, and technologies.
- Metadata Organization: Includes file types, sizes, and structures. Standard and advanced analyses extract key information.
5.3 Data Standardization
STOmicsDB standardizes data from public databases and user submissions. The process includes:
- Data Import: Use Scanpy (v1.8.1) to read data (e.g., expression matrices, spatial locations, tissue images) into AnnData objects.
- Standardization: Remove duplicate genes, normalize counts, perform PCA, and cluster using the Leiden algorithm. Identify marker genes and spatially variable genes using Scanpy and SpatialDE.
- Output: Save results in h5ad format for user access.
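To make the first two standardization steps concrete, here is a minimal pure-Python sketch of duplicate-gene removal and count normalization on a small spots-by-genes matrix. It is illustrative only: the guide does not state the exact parameters, so keeping the first occurrence of a duplicated gene, the target sum of 10,000, and the log(1 + x) transform are all assumptions. In Scanpy, the normalization part corresponds roughly to `sc.pp.normalize_total` followed by `sc.pp.log1p`.

```python
import math


def drop_duplicate_genes(genes, matrix):
    """Keep the first column for each gene name and drop later duplicates."""
    seen = set()
    keep = []
    for idx, gene in enumerate(genes):
        if gene not in seen:
            seen.add(gene)
            keep.append(idx)
    kept_genes = [genes[i] for i in keep]
    kept_matrix = [[row[i] for i in keep] for row in matrix]
    return kept_genes, kept_matrix


def normalize_log1p(matrix, target_sum=10_000.0):
    """Scale each spot (row) to target_sum total counts, then apply log(1 + x)."""
    out = []
    for row in matrix:
        total = sum(row)
        # Rows with no counts are left as zeros instead of dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        out.append([math.log1p(value * scale) for value in row])
    return out


# Toy input: two spots, three gene columns, one duplicated gene name.
genes = ["Actb", "Gapdh", "Actb"]
counts = [[5, 3, 2], [0, 4, 1]]

genes, counts = drop_duplicate_genes(genes, counts)
normalized = normalize_log1p(counts)
print(genes)
print(normalized)
```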
Advanced Analysis Visualization
- Cell Annotation: Use SCINA (v1.2.0) to annotate clusters with cell types based on marker genes.
- Cell Interaction: Use stLearn (v0.4.12) to analyze and visualize significant cell-cell interactions.
- Differential Analysis: Identify upregulated genes between cell types and perform enrichment analysis using clusterProfiler.
5.4 Expression Profile Visualization
This module displays gene expression maps, cell annotations, and marker gene distributions for intuitive spatial visualization.

5.5 Data Files and Downloads
STOmicsDB provides controlled and public sharing options:
- Raw Data: Unprocessed experimental data or filtered matrices.
- Processed Data: Results after standardization, including normalized matrices and cluster analyses.
- Custom Data: User-submitted intermediate results or annotations.
5.6 Sample Modules
- Sample List: View all samples and their main information.
- Sample Details: Access detailed sample information and download data.
6. Frequently Asked Questions
6.1 Template Filling Requirements
To create a visualization, fill in dataset metadata, sample information, sample file information, and analysis file information. Optionally, add a homepage image for personalized dataset display.
Dataset Metadata
FIELDS | DESCRIPTION | EXAMPLE |
---|---|---|
TITLE | Dataset title | MOSTA: Mouse Organogenesis Spatiotemporal Transcriptomic Atlas |
SPECIES | Species name (Tax ID); separate multiple values with `\|` | |
TISSUES | Biological tissue names; separate multiple values with `\|` | |
ORGAN PARTS | Biological suborgan names; separate multiple values with `\|` | |
DEVELOPMENT STAGE | Sample developmental period; separate multiple values with `\|` | |
SEX | Sample gender, Male or Female; separate multiple values with `\|` | |
TECHNOLOGY | Sequencing technology; separate multiple values with `\|` | |
SAMPLE NUMBER | Sample size | 16 |
SECTION NUMBER | Total number of slices | 61 |
DISEASE | Diseases studied; separate multiple values with `\|` | |
SUMMARY | Dataset overview (max 4000 characters) | Overview of the dataset and its significance. |
OVERALL DESIGN | Experimental design (max 2000 characters) | Description of the experimental setup. |
SUBMISSION DATE | Submission date in yyyy-mm-dd format | 2020-01-24 |
UPDATE DATE | Update date in yyyy-mm-dd format | 2020-11-18 |
CONTRIBUTORS | Contributors; separate multiple values with `\|` | |
CONTACT | Contact information; separate multiple values with `\|` | |
CITATION | Citations of related articles; separate multiple values with `\|` | |
ACCESSIONS | Data source or storage location | CNSA project: CNP0001543 |
RELATIONS | Data associations with other databases | Stomics submission: STT0000013 |
PLATFORM | Sequencing platform; separate multiple values with `\|` | |
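Several constraints in the table above (the pipe separator for multi-value fields, the character limits, and the date format) can be checked locally before uploading the template. The following is a minimal sketch under stated assumptions: the record is a plain dictionary keyed by the field names above, the helper names `split_values` and `check_metadata` are hypothetical, and the checks do not reproduce the validation that STOmicsDB itself performs.

```python
from datetime import datetime

# Limits and date fields taken from the Dataset Metadata table.
MAX_LENGTHS = {"SUMMARY": 4000, "OVERALL DESIGN": 2000}
DATE_FIELDS = ("SUBMISSION DATE", "UPDATE DATE")


def split_values(cell):
    """Split a multi-value cell on the pipe separator and trim whitespace."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def check_metadata(record):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    for field, limit in MAX_LENGTHS.items():
        if len(record.get(field, "")) > limit:
            problems.append(f"{field} exceeds {limit} characters")
    for field in DATE_FIELDS:
        value = record.get(field, "")
        if not value:
            continue  # empty fields are not checked in this sketch
        try:
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            problems.append(f"{field} is not in yyyy-mm-dd format: {value!r}")
    return problems


# Illustrative record; the values are examples, not a real submission.
record = {
    "SPECIES": "10090 | 9606",
    "SUMMARY": "Overview of the dataset and its significance.",
    "SUBMISSION DATE": "2020-01-24",
    "UPDATE DATE": "2020/11/18",
}
print(split_values(record["SPECIES"]))
print(check_metadata(record))
```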
Sample Information
FIELDS | DESCRIPTION | EXAMPLE |
---|---|---|
SAMPLE_NAME | Sample name | E9.5_EXAMPLE |
SECTION_NAME | Slice name. When the same sample includes multiple slices, fill in one row per slice. For other omics technologies where no slice is included, the slice name must be the same as the sample name. | |
TECHNOLOGY | Sequencing technologies, such as Stereo-Seq, 10X Visium, scRNA | Stereo-Seq |
SPECIES | Tax ID of the species to which the sample belongs | 10090 |
TISSUE | Sample tissue type | Embryo |
ORGAN PARTS | Name of the biological suborgan | Embryonic brain |
DEVELOPMENT STAGE | Sample development period | E9.5 |
SAMPLE ID | The sample number submitted to CNSA or spatiotemporal group database; leave it blank if none exists | CNS0001619 |
SOURCE | For data collected by another database or institution, fill in the corresponding name; for data submitted to the China National GeneBank, fill in CNGB | CNGB |
SEX | Sample gender, such as Male or Female | Male |
PLATFORM | Sequencing platform, such as DNBSEQ-T1 | DNBSEQ-T1 |
DISEASE | Disease of the sample; fill in normal if there is none | squamous cell carcinoma |
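The SECTION_NAME rule and the Tax ID requirement in the table above lend themselves to a quick pre-upload check. This is a minimal sketch, not the database's own validation: the function name `check_sample_rows` and the `has_slices` flag are hypothetical, and the second slice name in the example is invented for illustration.

```python
def check_sample_rows(rows, has_slices):
    """Check Tax ID format and the SECTION_NAME rule for a list of sample rows.

    has_slices is a hypothetical flag: True for technologies that produce
    tissue slices, False when no slice is included.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        if not row.get("SPECIES", "").isdigit():
            problems.append(f"row {number}: SPECIES should be a numeric Tax ID")
        if not has_slices and row.get("SECTION_NAME") != row.get("SAMPLE_NAME"):
            problems.append(f"row {number}: SECTION_NAME must equal SAMPLE_NAME")
    return problems


# One sample with two slices, so the sample name appears on two rows.
rows = [
    {"SAMPLE_NAME": "E9.5_EXAMPLE", "SECTION_NAME": "E9.5_E2S1_EXAMPLE", "SPECIES": "10090"},
    {"SAMPLE_NAME": "E9.5_EXAMPLE", "SECTION_NAME": "E9.5_E2S2_EXAMPLE", "SPECIES": "Mus musculus"},
]
print(check_sample_rows(rows, has_slices=True))
```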
Expression File Information
FIELDS | DESCRIPTION | EXAMPLE |
---|---|---|
SECTION_NAME | The sample name to which the user analysis file belongs | E9.5_E2S1_EXAMPLE |
DATA_TYPE | Sequencing technology | Stereo-Seq |
FILE_NAME | File name | cluster_makers.svg |
TITLE | Analysis Type | Cluster markers |
DESCRIPTION | Analysis Description | The marker genes of each cluster were calculated by scanpy.tl.rank_genes_groups with the "wilcoxon" method. If the original annotation of the dataset is available, we use it; otherwise, we obtain the annotation through scanpy.tl.leiden. |
MD5 | File md5 value | 7abcfa8f3abd503e286badf040ba4fa3 |
Analysis File Information
FIELDS | DESCRIPTION | EXAMPLE |
---|---|---|
SECTION_NAME | The sample name to which the user analysis file belongs | E9.5_E2S1_example |
DATA_TYPE | Sequencing technology | Stereo-Seq |
FILE_NAME | File name | cluster_makers.svg |
TITLE | Analysis Type | Cluster markers |
DESCRIPTION | Analysis Description | The marker genes of each cluster were calculated by scanpy.tl.rank_genes_groups with the "wilcoxon" method. If the original annotation of the dataset is available, we use it; otherwise, we obtain the annotation through scanpy.tl.leiden. |
MD5 | File md5 value | 7abcfa8f3abd503e286badf040ba4fa3 |
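The MD5 field expects the hexadecimal MD5 digest of each uploaded file. One way to compute it is with Python's standard `hashlib` module, as in the minimal sketch below. The function name `file_md5` and the chunk size are illustrative choices; any standard MD5 tool should produce the same value.

```python
import hashlib
import sys


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Print one "digest  filename" line per path given on the command line.
    for name in sys.argv[1:]:
        print(f"{file_md5(name)}  {name}")
```

Copy the resulting 32-character string into the MD5 column for the corresponding FILE_NAME row.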
Optional Information for the Dataset
- Display thumbnails: You can add images to the Excel template; there is no resolution limit.
6.2 Verification Information Explanation
To be continued. If you encounter any related problems, please contact us.
Contact
For assistance, contact P_STOmicsDB@genomics.cn.
We provide detailed guidance and support to ensure a smooth process for data submission and visualization creation.
Support Hours: Monday to Friday, 9:00 AM - 6:00 PM (GMT+8).