Spatial TranscriptOmics DataBase (STOmicsDB) is a comprehensive portal that integrates spatiotemporal omics literature, tools, and data. STOmicsDB consists of the following sections.
Data pre-processing and analysis
In brief, we used Scanpy (version 1.8.1) with default parameters to analyze the curated datasets. We first normalized and log-transformed the downloaded gene expression data, then performed principal component analysis (PCA) on the top 2,000 highly variable genes to reduce the dimensionality of the data. Next, we computed the neighborhood graph from the PCA results, ran Uniform Manifold Approximation and Projection (UMAP), and clustered the spots with the Leiden algorithm. For each dataset, we identified cluster-specific marker genes with the Wilcoxon rank-sum test in Scanpy. If a dataset contains spatial coordinates, we then identified spatially variable genes with SpatialDE (version 1.1.3). All data and the corresponding analysis results can be downloaded from the Data tab and the Analysis results tab on the top panel.
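The Scanpy steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact script used by STOmicsDB: the function name `run_default_pipeline` and the constants are hypothetical, every call is left at its Scanpy default except the two settings named in the text (2,000 highly variable genes, Wilcoxon test), and the SpatialDE step is not shown.

```python
# Settings named in the text; everything else stays at Scanpy's defaults.
N_TOP_GENES = 2000
MARKER_TEST = "wilcoxon"


def run_default_pipeline(adata):
    """Apply the described steps, in order, to an AnnData object (in place)."""
    # Imported here so the module can be loaded without Scanpy installed.
    # The Leiden step additionally needs the leidenalg package.
    import scanpy as sc

    sc.pp.normalize_total(adata)  # library-size normalization per spot
    sc.pp.log1p(adata)            # log(1 + x) transform
    sc.pp.highly_variable_genes(adata, n_top_genes=N_TOP_GENES)
    sc.tl.pca(adata)              # uses the flagged highly variable genes by default
    sc.pp.neighbors(adata)        # neighborhood graph built on the PCA result
    sc.tl.umap(adata)
    sc.tl.leiden(adata)           # cluster labels stored in adata.obs["leiden"]
    sc.tl.rank_genes_groups(adata, groupby="leiden", method=MARKER_TEST)
    return adata
```

A downloaded dataset in `.h5ad` format could be loaded with `scanpy.read_h5ad` and passed to this function; marker genes would then be available under `adata.uns["rank_genes_groups"]`.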
Data visualization
STOmicsDB uses Cirrocumulus (https://cirrocumulus.readthedocs.io/en/latest/) for dataset visualization. Cirrocumulus is an interactive visualization tool for large-scale single-cell and spatial transcriptomic data. The data visualization page consists of five parts: a section selector, a top toolbar, a sidebar, a main canvas, and a gallery.
For more details, please visit: https://cirrocumulus.readthedocs.io/en/latest/documentation.html
Dataset analysis results
In the “Visualization” tab, different sections of the same dataset can be chosen through the “Sections” selector at the top. The page shows a general statistics figure, a heatmap and table of cluster markers, a volcano plot of upregulated differential markers, histograms of GO and KEGG pathway enrichment, cell-cell interactions, and spatially specific modules and genes.
The resource center provides the following functions:
You can find dataset samples by searching for keywords and applying the filters provided in the left sidebar, such as sample release date, species, tissue, and spatial transcriptomic technology. You can also sort the results by update time and relevance.
You can find publications by searching for keywords and applying the filters provided in the left sidebar, such as research area, experimental target, species, tissue, and computational direction. You can also sort the results by update time and relevance.
Projects on this page are submitted by users to the STOmicsDB data archiving system.
STOmicsDB has a customized database service. We welcome researchers to construct spatial transcriptomics databases with us, or deploy their specialized databases on STOmicsDB.
To date, we have constructed four such databases with other researchers: ARTISTA (axolotl brain regeneration, https://db.cngb.org/stomics/artista/), MOSTA (mouse organogenesis, https://db.cngb.org/stomics/mosta/), ZESTA (zebrafish embryogenesis, https://db.cngb.org/stomics/zesta/), and MLRSTA (mouse liver regeneration, https://db.cngb.org/stomics/mlrsta/).
Compared with other techniques, spatial transcriptomic techniques have distinctive features, most notably spatial information. Different spatial transcriptomic techniques also have their own characteristics. To facilitate the reuse and re-analysis of spatial transcriptomic data, we aim to develop a data submission standard for each technique.
In the current version, STOmicsDB supports data submission for two techniques: Stereo-seq and 10x Visium. The submission template can be downloaded here (https://ftp.cngb.org/pub/stomics).
The data model is as follows:
To facilitate the use of spatial transcriptomic data, we set up an online tool based on SingleR that provides an interaction analysis between spatial transcriptomic data and single-cell RNA sequencing data. The tool lets users annotate the cell types of a specific spatial transcriptomic dataset on STOmicsDB by uploading their single-cell RNA sequencing gene expression matrix and the corresponding cell types. In addition to the default outputs of SingleR, the tool generates a spatial feature plot showing the spatial localization of each annotated cell type.
Select a species and a gene to show the spatial map of that gene across different dataset sections. Click a section name to open the corresponding dataset visualization for more details.
You can compare spatial gene expression of two datasets from STOmicsDB.
Datasets generated with Stereo-seq technology are also available in a customized visualization system, Stereomap. This ultra-high-resolution visualization system is specifically designed for Stereo-seq technology and can display more than one million cells and multiple bin sizes (a “bin” is a feature of Stereo-seq technology that refers to the level at which spots are combined).
If you need any help, please contact CNGBdb@cngb.org.
The Stereo-seq data used to generate Fig. 1 comes from the STOmicsDB database[1], and the query number is STDS0000058[2].
The spatial mouse kidney data have been deposited into STOmicsDB[1](https://db.cngb.org/stomics/datasets/STDS0000058[2])
Xu, Zhicheng et al. “STOmicsDB: a comprehensive database for spatial transcriptomics data sharing, analysis and visualization.” Nucleic acids research vol. 52,D1 (2024): D1053-D1061. doi:10.1093/nar/gkad933
Longqi Liu. MOSTA: Mouse Organogenesis Spatiotemporal Transcriptomic Atlas[DS/OL]. STOmicsDB, 2021[2021-10-22]. https://db.cngb.org/stomics/datasets/STDS0000058/. doi: 10.26036/STDS0000058
#Format: {contributors}. {title}[DS/OL]. STOmicsDB, {year of submission date}[{submission date}]. {dataset link}. doi: {doi ID}
Chen, Ao et al. “Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays.” Cell vol. 185,10 (2022): 1777-1792.e21. doi:10.1016/j.cell.2022.04.003
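The dataset citation format given on the `#Format:` line above can be filled in programmatically. The following is a minimal sketch using only the Python standard library; the function name `format_dataset_citation` and its parameter names are hypothetical, and the field values are taken from the MOSTA dataset entry in this reference list.

```python
from datetime import date


def format_dataset_citation(contributors, title, submitted, link, doi):
    """Fill the STOmicsDB dataset citation template with the given fields."""
    return (
        f"{contributors}. {title}[DS/OL]. STOmicsDB, "
        f"{submitted.year}[{submitted.isoformat()}]. {link}. doi: {doi}"
    )


# Field values copied from the MOSTA dataset entry above.
citation = format_dataset_citation(
    contributors="Longqi Liu",
    title="MOSTA: Mouse Organogenesis Spatiotemporal Transcriptomic Atlas",
    submitted=date(2021, 10, 22),
    link="https://db.cngb.org/stomics/datasets/STDS0000058/",
    doi="10.26036/STDS0000058",
)
print(citation)
```

Passing the submission date as a `date` object keeps the year and the bracketed full date consistent with each other.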
China National GeneBank DataBase (CNGBdb) is a system that provides biological big data sharing and application services for scientific research communities. Before registering and using the CNGBdb platform, please carefully read the "Terms and Conditions" and "Privacy and Security Policy" of this platform.
[NOTE] Your continued access to CNGBdb or use of any services provided by CNGBdb indicates that you have carefully read and fully understood the "Terms and Conditions" and "Privacy and Security Policy" and that you agree to be bound by them. If you do not accept or cannot fully understand the "Terms and Conditions" and/or "Privacy and Security Policy", please stop registration and stop using any services provided by CNGBdb.
Disclaimer:
Users should follow the laws, regulations, policies, and biosafety and bioethics principles of China and other relevant countries or regions, and should submit authentic and accurate information about their organizations and contacts. If a user commits any violation (including but not limited to violations of laws, regulations, and policies related to human genetic resource management, bioethics, or biosafety management), the user and its affiliates shall bear all legal liability and any losses incurred by third parties. CNGBdb does not assume any legal responsibility for such violations. CNGBdb also has the right to take corresponding measures according to the situation, including but not limited to immediate suspension or termination of the service and deletion of the corresponding information.
Data sharing in STOmicsDB follows CC BY 4.0.