China National GeneBank DataBase (CNGBdb) is a unified platform for data sharing and application services (Science as a Service) for biological big data. Built on big data and cloud computing technologies, it provides data services such as transmission and archiving, computation and analysis, knowledge search, management and authorization, and associated visualization. CNGBdb currently integrates multi-omics data covering the genome, transcriptome, epigenome and proteome, and links these data to samples and even living organisms, so that data can be traced through the whole chain from living organism to biological sample to biological information data. The data submission and archive services of CNGBdb currently include the CNGB Nucleotide Sequence Archive (CNSA), the CNGB Variation Archive (CVAR), the Pan immune repertoire database (PIRD) and GigaDB, which are dedicated to the submission, storage and sharing of data for biological sequencing research projects, samples, experiments, assembly data, mutations, etc. They are designed to provide researchers around the world with comprehensive data and information resources and to let them use the data to the fullest extent permitted. CNSA accepts sequencing data (including raw data and other supporting data) from research worldwide, and its service can be used as a supplement to the literature publishing process [1-2].
When you submit data to CNGBdb through these archive services, you agree by default to the CNGBdb agreement.
CNGBdb is a public-welfare, non-profit data platform. It includes both human and non-human data, and data rights remain with the submitters. For human data, data covered by the Human Genetic Resources Management Regulations is not currently accepted. CNGBdb respects and protects the interests of data submitters, who must determine the data control scope (fully controlled, partially controlled, fully open, etc.) when submitting data to the designated databases of CNGBdb. CNGBdb provides an accession ID for each data submission (if you choose to synchronize data to ENA/NCBI, you will also get an ENA/NCBI accession ID) or a citable digital object identifier (DOI). To respect the research results of all researchers, users must obtain approval before accessing controlled data in the databases for secondary research. Data designated as unrestricted can be accessed publicly by anyone.
1) Sample donor rights and privacy protection
2) Data submitter's equity
3) Terms and conditions for controlled data
CNGBdb maintains separate access mechanisms for unrestricted and controlled data. Before using controlled data, users must submit an application to the CNGB Data Access; the data reviewer then returns an approval decision based on the usage limits set by the data submitter. CNGBdb accepts access requests for research purposes beginning one month before the data release date. The access period for all controlled-access data defaults to one year, and data submitters can close access early; at the end of each approval period, data users can request an additional year of access or close access.
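The timing rules above can be illustrated with a short date calculation. This is only a sketch: the function names are hypothetical, and it assumes that "one month" and "one year" mean calendar months and years, with the day clamped to the end of shorter months. The policy itself does not specify these details.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def earliest_request_date(release: date) -> date:
    """Assumed rule: requests open one calendar month before the release date."""
    return shift_months(release, -1)


def access_expiry(approved: date, renewals: int = 0) -> date:
    """Assumed rule: one year of access, plus one more year per granted renewal."""
    return shift_months(approved, 12 * (1 + renewals))


print(earliest_request_date(date(2024, 3, 31)))  # 2024-02-29
print(access_expiry(date(2024, 2, 29)))          # 2025-02-28
```

A submitter closing access early is not modeled here; in practice that would override the computed expiry date.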
Users with controlled data usage rights can download controlled access data and must abide by the CNGBdb controlled data usage terms and conditions. These include:
CNGBdb requires researchers approved to use controlled data to follow a data security policy that sets out the expected data security measures (such as physical security measures and user training), so that the data remains secure and is not disclosed to anyone who is not allowed to access it.
If the user violates the terms and conditions of controlled data usage, CNGBdb will take appropriate measures.
4) Conditions for users to use unrestricted access to data
CNGBdb develops security management methods and information security systems in accordance with the requirements of the national “Management Measures of Information Security Level Protection” and the technical standard “Basic Requirements of Information System Security Level Protection”. These address data security, data organization specifications, data backup and archiving, physical security, personnel security, security operations, security development, and the establishment of information security linkage mechanisms.
(1) Physical security
Ensure the physical security of the data center with respect to power supply, fire prevention equipment, and the monitoring and regulation of temperature and humidity. Implement detailed, professional management of authorized activities, covering separation of the equipment room area, personnel management, the access, storage, use and destruction of important equipment, and CCTV (video surveillance), and keep records of relevant activities for a reasonable period.
(2) Security organization
Set up a team responsible for information security to design, develop and operate the information security construction and auditing of the platform, defend against all types of security attacks and intrusions into the information services, systems and networks of CNGBdb, and develop defense techniques and management measures and monitor their implementation.
(3) Personnel safety
Conduct background checks on the management, operation and maintenance personnel of CNGBdb before they join the company to ensure compliance with the company's code of conduct; after they join, have them sign confidentiality agreements and provide regular training.
(4) Access control
Identity and authorization controls are used to identify and restrict access to data, and a sufficiently detailed log must be kept for auditors to audit and monitor as needed.
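As an illustration of combining authorization checks with an auditable log, here is a minimal sketch. It does not describe the actual CNGBdb implementation: the `Dataset` and `AccessController` classes, their fields, and the accession strings in the usage note are all hypothetical, and the model covers only two cases (unrestricted data, and controlled data with a list of approved users).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Dataset:
    accession: str                 # hypothetical identifier for illustration
    controlled: bool               # False means unrestricted access
    approved_users: set = field(default_factory=set)


@dataclass
class AccessController:
    audit_log: list = field(default_factory=list)

    def check(self, user: str, dataset: Dataset) -> bool:
        """Decide whether `user` may access `dataset`, and log the decision."""
        allowed = (not dataset.controlled) or (user in dataset.approved_users)
        # Every decision is recorded, whether granted or denied,
        # so that auditors can review access attempts later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "accession": dataset.accession,
            "allowed": allowed,
        })
        return allowed
```

For example, `AccessController().check("alice", Dataset("EXAMPLE0001", controlled=True, approved_users={"alice"}))` would return `True` and append one entry to the log.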
(5) Data Security
Develop technical and management measures for data transmission, storage, usage and destruction to ensure data security.
(6) Data organization specification
Define clear data and information organization specifications and, based on the classification of data and information, achieve unified management of data resources and shared services. Define the information security requirements for the database management system and the data storage management platform, and perform security verification on these systems and platforms to ensure that security controls are implemented.
(7) Data backup and archive
Establish management strategies for data lifecycle, develop data backup and archive management processes, establish data backup and recovery procedures, record backup processes, and keep all files and records properly.
(8) Security operations
Use technical measures to protect and monitor the platform and issue warnings, deal with all types of attacks and abnormal situations, prevent security incidents in advance, handle incidents when they occur, and prevent their recurrence.
(9) Security development
Information security requirements are introduced in the new project establishment and demand assessment phases, and these requirements are verified before going online to ensure the implementation of security control.
(10) Information security linkage mechanism
CNGBdb and related functional units establish long-term, fast and accurate linkages for work coordination, information notification and accident handling through designated contacts, including regular reporting on the progress of construction work and timely communication of existing problems and important information. Information security events (accidents), including attack and intrusion events against the information services, systems and networks of CNGBdb, information leakage incidents, etc., are managed hierarchically, and effective notification mechanisms are established for different levels of accidents. For major attacks, timely notification and rapid response are required. If necessary, CNGBdb will cooperate with relevant departments to investigate or properly handle the incident to protect China's biological information security and national interests.
Given the exponential growth of data, as well as its sharing and development, the hardware and software architecture of CNGBdb is built to be scalable enough to support the system.
The security management of the data life cycle is as follows:
1) Ensure that the source of submitted data is secure and credible;
2) Adopt a quality control process to eliminate low-quality and falsified data.
2. Storage security: Store metadata and data files in a multi-center, multi-copy, multi-node, distributed form to reduce data storage pressure and leakage risks.
3. Transmission security: CNGBdb uses digital certificates and asymmetric encryption to ensure transmission security.
4. Usage and sharing security:
1) CNGBdb adopts a unified user registration management system to manage user access and usage rights hierarchically, achieving secure data usage and sharing;
2) User-sensitive data is processed using anonymization methods to protect users' privacy.
5. Secure destruction: Check disks periodically; data to be destroyed shall be destroyed in four levels: all 0/all 1/random code/national security standard.
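The first three destruction levels named above (all 0, all 1, random code) correspond to familiar overwrite patterns. The sketch below shows what such overwrite passes could look like for a single file. It is an assumption-laden illustration, not CNGBdb's procedure: the function names and mode labels are hypothetical, the national security standard level is deliberately not implemented because the source does not specify it, and the source does not say whether the levels are applied in sequence or chosen individually. Note also that in-place overwriting does not guarantee erasure on SSDs or on journaling and copy-on-write file systems.

```python
import os
import secrets


def fill_bytes(mode: str, size: int) -> bytes:
    """Return `size` bytes of the overwrite pattern for the given (hypothetical) mode."""
    if mode == "zeros":
        return b"\x00" * size
    if mode == "ones":
        return b"\xff" * size
    if mode == "random":
        return secrets.token_bytes(size)
    raise ValueError(f"unsupported mode: {mode!r}")


def overwrite_file(path: str, modes=("zeros", "ones", "random"), chunk: int = 1 << 20) -> None:
    """Overwrite the file in place, one full pass per mode, without changing its size."""
    size = os.path.getsize(path)
    with open(path, "r+b") as fh:
        for mode in modes:
            fh.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                fh.write(fill_bytes(mode, n))
                remaining -= n
            # Push each pass to the storage device before starting the next one.
            fh.flush()
            os.fsync(fh.fileno())
```

Writing in fixed-size chunks keeps memory use bounded for large sequencing files; deleting the file afterwards is left to the caller.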
If you have any questions about using the data in the database, please contact us at firstname.lastname@example.org. We welcome collaborative interactions to provide a comprehensive, up-to-date data set for researchers worldwide.
CNGBdb places no restrictions on the use or sharing of fully open data contained in the databases, subject to the data ethics, permissions and rights asserted by global research users.
The user (the enterprise or the original country to which the data belongs) may declare patents, copyrights or other intellectual property rights in all or part of the submitted data. Other users are required to fully comply with the limitations of such intellectual property rights and to use the data without violating such intellectual property rights.
Users are required to follow the “Interim Measures for the Management of Human Genetic Resources” and the ethical norms of their countries, submit genuine organization information and contacts, and take responsibility for the legality and compliance of the uploaded data. If a user conceals or misrepresents information and this leads to legal disputes, administrative penalties or criminal liability, the user and the user's organization shall bear the responsibility, and CNGBdb shall bear no legal liability.
Among the archiving services of CNGBdb, CNSA lets users choose to synchronize data to EBI, NCBI or DDBJ. CNSA assists users in synchronizing with their chosen database, but is not responsible for the security of synchronized data or for other related interests. The other archive databases do not currently provide data synchronization to EBI, NCBI or DDBJ.
CNGBdb only accepts data that contains no identifiable personal information. If data uploaded by a user carries a risk of leaking personal information, the user and his or her organization shall bear the relevant legal responsibilities, and CNGBdb shall bear no legal liability.
The CNGBdb data administrator (email@example.com) is responsible for handling data review and answering users' inquiries. CNGBdb reserves the right to refuse a data upload if the data fails review. In all such cases, the data administrator's email reply shall prevail.
All materials in CNGBdb are provided "what you see is what you get", without any express or implied warranties, including but not limited to warranties of fitness for a particular purpose and/or non-infringement, and CNGBdb does not assume any legal responsibility or obligation for the purposes for which they are used.
 Open science and the role of publishers in reproducible research
 EBI Serving as a complement to the literature publication process and supporting early data sharing
 GIGADB GigaScience and the BGI provide this data in good faith, but make no warranty, express or implied, nor assume any legal liability or responsibility for any purpose for which they are used.