China National GeneBank DataBase (CNGBdb) is a unified platform for sharing biological big data and providing application services to the research community. Built on big data and cloud computing technologies, it provides data services such as archiving, analysis, knowledge search, management authorization, and visualization. At present, CNGBdb integrates large amounts of internal and external molecular data and other information from CNGB, NCBI, EBI, DDBJ, etc., all indexed for search and covering 10 data structures. Moreover, CNGBdb links living sources, biological samples, and bioinformatic data so that data can be traced comprehensively.
Drawing on internal data from the “Three Banks and Two Platforms” of CNGB and external data from NCBI, EBI, DDBJ, etc., and following the data standards of international organizations such as INSDC, DataCite, GA4GH, GGBN, and ACMG, CNGBdb builds data structures covering literature, genes, variations, proteins, etc., and provides data sharing and application services such as data archiving, query and retrieval, and analysis.
In January 2011, the CNGB was established with the approval of the National Development and Reform Commission (NDRC) and with the support of BGI-Shenzhen. The CNGB integrates five arms, divided into three "banks" and two "platforms." The banks are the Biorepository, the Bio-informatics Data Center, and the Living Biobank; the platforms are the Digitalization Platform and the Synthesis and Editing Platform. With its capacity to store, read, and write massive biological resources, the CNGB has built an open, public-welfare service platform that supports and leads the mining of genetic resources. Drawing on living samples, biological resource samples, and bioinformatics data, CNGBdb provides a range of biological big data sharing and application services.
One of the main advantages of the CNGB is that its Biorepository, Bio-informatics Data Center, and Living Biobank together cover the entire life cycle. CNGBdb interconnects the information held in the three banks and provides external data sharing services so that biological data can be traced throughout that life cycle.
To cite CNGBdb, please use the following reference:
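Wei Xiaofeng, Chen Fengzhen, You Lijin, Yang Fan, Wang Lina, Guo Xueqin, Gao Fei, Hua Cong, Tan Cong, Fang Lin, Shan Riqiang, Zeng Wenjun, Wang Bo, Wang Ren, Xu Xun. CNGBdb: China National GeneBank life big data platform [J]. Hereditas (Beijing), doi: 10.16288/j.yczz.20-080.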
The external data used in CNGBdb's archiving, search, analysis, scientific database, and other services come mainly from open databases such as NCBI, EBI, DDBJ, HPO, CHPO, ICGC, TCGA, cBioPortal, UniProt, MSK-IMPACT, 5ExAC, 1000Genomes, NIFTY database, WoRMS, NHLBI ESP, NIEHS EGP, HGDP, Phytozome, dbNSFP, EVS, and GWAS. We sincerely thank these databases for providing a rich source of biological data and for their contributions to biological data archiving and sharing. CNGBdb places no additional usage restrictions on these public data resources. However, the data may be subject to the terms and conditions of the external databases themselves. Please review the specific terms and conditions of each database before use to ensure compliance with all applicable regulations.
All public data and data services provided by CNGBdb are freely available to all users worldwide.