LettuceDB serves as a portal to multi-omics data for cultivated lettuce and its wild relatives and aims to support lettuce research and breeding. The data on this website were generated as a collaborative project between BGI-Research, China National GeneBank (CNGB), CGN, Huazhong Agricultural University and other research institutes. Data and information released on this website are provided on an "as is" basis.

Non-Human Primate Cell Atlas (NHPCA) is a single-cell transcriptomics data resource that provides visualization and preliminary analysis of transcriptomic and forthcoming epigenetic single-cell data sampled from non-human primate (NHP) organs or tissues.

The Database of Deep-Sea Life (Deepseadb) contains both sequencing resources and metadata of ecological communities, isolates and animals collected from the deep sea (>1000 m depth). This database aims to be a dictionary for the exploration and utilization of genetic resources in the deep sea, providing unified metadata, standardized analyzed data and batch analysis tools.

QTP is the gut microbial metagenome database of the Qinghai-Tibet Plateau animal microbiome project, which has produced more than 30 terabytes of metagenomic data. It provides users with data display, data search and other functions for these data, and presents the characteristics of Qinghai-Tibet Plateau animals and their microorganisms across multiple dimensions such as project and species. It also collects the scientific research achievements and related literature of the National Qinghai-Tibet Plateau Scientific Research Project, enabling researchers to systematically analyze and understand the genomes of animals on the Qinghai-Tibet Plateau.

The Biological Resource Center of Plants, Animals and Microorganisms (BRC-PAM) aims to build a featured biobank that integrates high-quality genomic data with the collected biological materials. Biological resources stored in the center, including animal cell cultures, plant seeds, microbial strains and associated genomic information, are available for sharing and distribution to facilitate scientific research.

CODEPLOT aims to provide a reliable and highly efficient computing platform on which users can conduct bioinformatics analysis automatically, even without a programming background.

Version 1.0 release features include:

  1. My workspace. My workspace is composed of a description, datasets, workflows and a job processing monitor. Platform users can create personalized datasets or workflows in the workspace and share them with anyone they trust.
  2. Datasets. A growing number of valuable datasets produced by international genomic projects are provided in CODEPLOT; they are archived in CNGBdb and curated by relevant experts. The accessible public datasets in the current version include:

    1) Assembly and gene annotation of the 1000 plant transcriptomes

    The 1000 Plants Project (1KP) is an international multidisciplinary alliance project that has conducted large-scale transcriptome sequencing of more than 1,000 plants.

    2) COVID-19 Database

    The COVID-19 Novel Coronavirus Sequence Database collects data from CNGB, GenBank, GISAID and other sources, providing the fundamental information needed to study the evolutionary relationships of viral sequences collected from different regions worldwide and to infer potential transmission routes.

    3) Single-cell Database

    The single-cell database integrates complex single-cell sequencing datasets and provides relevant analysis tools, including result visualization services, making it easier for researchers to access and explore published single-cell datasets.

  3. Tools. Codeplot builds tools for different research directions based on existing dataset resources:

    1) BLAST, sequence database screening based on homologous sequence alignment

    2) single_cell_scanpy, single-cell sequencing data analysis

    3) HMMER, gene family member mining

    4) edgeR, differential gene expression analysis

  4. Blockchain. Codeplot employs blockchain to produce fingerprints for all confidential files and calculation processes, ensuring that all relevant calculation processes and histories can be traced back and that the records cannot be tampered with. Users can browse and retrieve the fingerprint information on the whole blockchain for their analyses.
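The fingerprinting idea described above can be illustrated with a minimal hash-chain sketch. This is not Codeplot's implementation: the record layout, the function names (`fingerprint`, `append_record`, `verify_chain`) and the choice of SHA-256 are all assumptions made for illustration. Each record stores the fingerprint of a file plus the hash of the previous record, so changing any earlier record invalidates everything after it.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for "no previous record"


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def _record_hash(body: dict) -> str:
    # Serialize with sorted keys so the same body always hashes the same way.
    return fingerprint(json.dumps(body, sort_keys=True).encode("utf-8"))


def append_record(chain: list, file_bytes: bytes, step: str) -> dict:
    """Append a record for one file/step, linked to the previous record."""
    prev_hash = chain[-1]["record_hash"] if chain else GENESIS_HASH
    body = {
        "step": step,                                # hypothetical step label
        "file_fingerprint": fingerprint(file_bytes),
        "prev_hash": prev_hash,
    }
    record = dict(body, record_hash=_record_hash(body))
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check that the links are unbroken."""
    prev_hash = GENESIS_HASH
    for record in chain:
        body = {k: record[k] for k in ("step", "file_fingerprint", "prev_hash")}
        if record["prev_hash"] != prev_hash:
            return False
        if _record_hash(body) != record["record_hash"]:
            return False
        prev_hash = record["record_hash"]
    return True


chain = []
append_record(chain, b"ACGTACGT", "upload")
append_record(chain, b"alignment result", "blast")
chain_is_valid = verify_chain(chain)
```

Editing the `step` or `file_fingerprint` of any stored record makes `verify_chain` return `False`, which is the tamper-evidence property the text refers to.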

CDCP (Cell-omics Data Coordinate Platform) is a shared, integrated collection of complex single-cell data, providing users with integrated services such as single-cell data search, analysis tools and visualization.

Single Cell Analysis of Macaca fascicularis project: 210,000+ Macaca fascicularis cells;

Single Cell Analysis of Human project: 100,000+ cellular omics data with visualization.

To comply with the new national laws and regulations on life science data archiving, the "Privacy and Security Policy" has been revised, and the revised version will go online on September 23, 2020.

The updated documents are as follows:

"Privacy and Security Policy" https://db.cngb.org/policy/

To comply with the new national laws and regulations on life science data archiving, part of the "Terms and Conditions" has been revised, and the revised version will go online on September 2, 2020.

To comply with the new national laws and regulations on life science data archiving, management, and open sharing, CNGBdb has amended its data submission and access application processes and established related documents for its data services. The new processes officially took effect on July 11, 2020. To ensure a smooth transition, if you have already submitted your request, you can complete the process with the previous versions of the documents.

The updated documents are as follows:

Please read the above documents carefully to fully understand our new processes before continuing to visit CNGBdb and use its data services.

If you have any questions, please contact bdcro@cngb.org.


The new version of CNGBdb comes with a much improved user interface, which has been redesigned and fully upgraded to be easier to use.

Data retrieval service of China National GeneBank

Advanced search

CNGBdb's advanced search allows users to run queries over all fields in the database according to their data retrieval needs. For example, users can search for the keyword ‘rice’ in the Title field of the literature database, or search for ‘Gigascience’ in the Journal field. Boolean operators can also be used to improve search accuracy and efficiency.
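As a rough illustration of how fielded queries and Boolean operators combine, the sketch below filters a small in-memory list of records. It does not use the CNGBdb API or query syntax; the helper names (`field_contains`, `all_of`, `any_of`, `negate`, `search`) and the sample records are made up for this example.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Predicate = Callable[[Record], bool]


def field_contains(field: str, keyword: str) -> Predicate:
    """Match records whose given field contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return lambda record: needle in record.get(field, "").lower()


def all_of(*predicates: Predicate) -> Predicate:
    """Boolean AND over several predicates."""
    return lambda record: all(p(record) for p in predicates)


def any_of(*predicates: Predicate) -> Predicate:
    """Boolean OR over several predicates."""
    return lambda record: any(p(record) for p in predicates)


def negate(predicate: Predicate) -> Predicate:
    """Boolean NOT of a predicate."""
    return lambda record: not predicate(record)


def search(records: List[Record], predicate: Predicate) -> List[Record]:
    """Return the records that satisfy the predicate, in their original order."""
    return [record for record in records if predicate(record)]


# Placeholder records, invented purely for illustration.
SAMPLE_RECORDS = [
    {"title": "A draft genome of rice", "journal": "Gigascience"},
    {"title": "Rice root microbiome survey", "journal": "Example Journal"},
    {"title": "Bird genome assembly", "journal": "Gigascience"},
]

# title contains 'rice' AND journal contains 'Gigascience'
query = all_of(field_contains("title", "rice"),
               field_contains("journal", "Gigascience"))
hits = search(SAMPLE_RECORDS, query)
```

Combining a Title condition with a Journal condition narrows three candidate records to one, which is the accuracy gain the Boolean operators are meant to provide.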

Text auto-completion

CNGBdb offers an ‘auto complete’ search service. When users type a partial keyword into the CNGBdb search bar, auto-complete suggests the completed term for them.
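One common way to implement prefix completion is a binary search over a sorted vocabulary. The sketch below assumes that approach; the text does not say how CNGBdb implements the feature, and the vocabulary and the `autocomplete` function are hypothetical.

```python
import bisect
from typing import List


def autocomplete(prefix: str, sorted_terms: List[str], limit: int = 5) -> List[str]:
    """Return up to `limit` terms that start with `prefix`.

    `sorted_terms` must be lowercase and sorted in ascending order.
    """
    prefix = prefix.lower()
    # Every term starting with the prefix sorts at or after the prefix itself.
    start = bisect.bisect_left(sorted_terms, prefix)
    suggestions = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # past the block of matching terms
        suggestions.append(term)
        if len(suggestions) >= limit:
            break
    return suggestions


# Hypothetical vocabulary, already lowercase and sorted.
TERMS = sorted([
    "lung adenocarcinoma",
    "lung cancer",
    "lupus",
    "rice",
    "rice blast",
])

lung_suggestions = autocomplete("lun", TERMS)
rice_suggestions = autocomplete("Ric", TERMS)
```

Because the vocabulary is sorted, all completions for a prefix sit next to each other, so the lookup stops as soon as a term no longer matches.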

Text auto-correction

CNGBdb offers an ‘auto correction’ search service. If users misspell a search query, CNGBdb returns results for a suggested spelling, helping them find what they are looking for quickly and easily. For example, if users type ‘lang cancer’ into the search bar, CNGBdb predicts that ‘lung cancer’ is the term they actually meant and provides search results for ‘lung cancer’.
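The ‘lang cancer’ example can be reproduced with a simple similarity match against known terms. This sketch uses Python's standard `difflib.get_close_matches`; the algorithm CNGBdb actually uses is not described here, and the vocabulary, the `suggest_correction` function and the 0.8 cutoff are assumptions.

```python
import difflib
from typing import List

# Hypothetical list of known search terms.
KNOWN_TERMS = ["lung cancer", "liver cancer", "rice genome"]


def suggest_correction(query: str, known_terms: List[str], cutoff: float = 0.8) -> str:
    """Return the closest known term, or the query itself if nothing is close enough."""
    normalized = query.strip().lower()
    if normalized in known_terms:
        return normalized  # already a known term, nothing to correct
    matches = difflib.get_close_matches(normalized, known_terms, n=1, cutoff=cutoff)
    return matches[0] if matches else normalized


corrected = suggest_correction("lang cancer", KNOWN_TERMS)
unchanged = suggest_correction("zebrafish", KNOWN_TERMS)
```

With this vocabulary, ‘lang cancer’ differs from ‘lung cancer’ by a single character and is corrected, while an unrelated query falls below the cutoff and is passed through unchanged.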

Results filtering

The search and filter functions available in all CNGBdb databases help users narrow down searches that return too many results and find what they need accurately and rapidly. For example, on the search result page of the literature database, users can restrict results to full-text articles by selecting ‘Free full text’, and can also find the newest papers by filtering on publication year.

High-quality dataset recommendation

In search results, CNGBdb gives priority to recommending high-quality datasets that match the user's search terms. For example, if users search for ‘the Ruili Botanical Garden’, the search results will first recommend the Ruili Botanical Garden dataset (project ID CNPhis0000538), which contains 42 TB of sequencing data and 738 plant samples.

Data update

CNGBdb version 1.0 offers more data resources than the beta version. As of February 2019, the data resources of all databases are as follows:

  • Literature (29,198,501)
  • Gene (33,171,984)
  • Variation (763,230,128)
  • Protein (134,065,913)
  • Sequence (2,136,651,182)
  • Project (3,162)
  • Sample (323,116)
  • Experiment (430,586)
  • Assembly (2,346)

Data resources are updated periodically to keep them current; for example, the literature database is updated daily.

China National GeneBank Nucleotide Sequence Archive (CNSA)

The interaction logic of data submission has been optimized, and prompts have been added.
The efficiency of online batch submission has been optimized.

The average submission time is reduced by 70% after optimization. Test case: for a submission with a file size of 99 KB containing 768 records, the submission time was 27 s before optimization and 5 s after.

For FTP uploads, MD5 verification is now eight times faster.
New submission types (bulk/single) have been added to the My Submission page, and the list of IDs (e.g. sample IDs) can also be downloaded from this page.
A reviewer link, generated automatically according to users’ requirements for article publication, can be provided to journal editors for review, helping the article get approved and published more quickly.
Internal users are offered an automatic data submission service via Cluster Upload.
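For readers unfamiliar with MD5 verification of uploads, the following is a minimal sketch of the general technique: the uploaded file's MD5 digest is recomputed and compared with the checksum supplied by the submitter. It is not CNSA's code; `md5_of_file`, `verify_upload` and the 1 MiB chunk size are illustrative assumptions.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading chunk by chunk keeps memory use flat for large sequence files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_upload(path: str, expected_md5: str) -> bool:
    """Return True if the file's MD5 matches the checksum supplied with the upload."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A typical call would be `verify_upload("reads_1.fq.gz", checksum_from_submitter)`, where both the file name and the variable are placeholders.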

Data Calculation and Analysis Service of China National GeneBank (BLAST)

BLAST has been upgraded to NCBI's latest V2.8.1.
The reference databases have been upgraded to NCBI's latest V5 version, and genome assembly data published by BGI have been added in CNGBdb version 1.0.

Based on big data and cloud computing technologies, China National GeneBank DataBase (CNGBdb) provides integrated data services such as data archiving, computational analysis, knowledge search, management authorization and visualization.

Data retrieval service of China National GeneBank

CNGBdb comprises multiple databases, including Literature, Gene, Variation, Protein, Sequence, Project, Sample, Experiment and Assembly, and supports cross-references among them so that the data are interconnected. CNGBdb offers several important advantages: 3 billion data items; full-text search; response times within seconds; and keyword retrieval in both Chinese and English.

Data Calculation and Analysis Service of China National GeneBank (BLAST)

BLAST has been upgraded to NCBI's latest V2.8.1.
BLAST has been integrated with NCBI's nt and nr databases.
BLAST has been integrated with multiple CNGB databases, such as ONEKP (BLAST for 1,000 Plants), B10K (The Bird 10,000 Genomes) and PIRD (Pan Immune Repertoire Database, containing 564,057,891 items of immune data).
  1. Data of projects, samples, experiments and assemblies can be submitted in batches and reviewed online.
  2. CNSA provides archiving services of variation data for users.
  3. CNSA assigns DOIs (digital object identifiers) to projects to help users cite, trace, retrieve and reuse data conveniently.
  1. Users can submit data of projects, samples, experiments and assemblies online.
  2. Reviewers can review data of projects, samples, experiments and assemblies online.
  3. English-Chinese bilingual interface, localized service.
  4. Open data retrieval service to users.