Preservation Plan


  1. Purpose
    The China National GeneBank Sequence Archive (CNSA) endeavors to ensure the long-term preservation and accessibility of biological sequence data, thus facilitating and promoting scientific research in biology. This preservation plan outlines CNSA's protocols for data archiving, curation, preservation, and sharing, aiming to safeguard data from loss or corruption, maintain accessibility for researchers, and ensure proper curation to preserve data quality and utility.
  2. Ethics and Legal Compliance
    CNSA complies with the laws and regulations of the People's Republic of China, international guidelines, and industry best practices for the responsible and ethical preservation of biological data. When submitting data to CNSA, depositors must comply with local regulations, ethics, and laws related to Human Genetic Resources. Data anonymization or de-identification is undertaken to uphold privacy standards.
  3. Data Standards
    CNSA adopts data and metadata submission standards from established consortia and organizations such as INSDC, DataCite, GA4GH, and GGBN. Depositors are required to provide comprehensive metadata so that the data remain understandable over the long term. CNSA supports commonly used data formats (FASTQ, FASTA, VCF, etc.) and offers guidance to depositors on data and metadata submission. CNSA is committed to updating its standards in line with changes made by these international organizations.
  4. Data Archive and Curation
    CNSA supports the archival of diverse data types, such as raw sequencing data, assemblies, variation information, metabolism data, single-cell data, and other sequence data. Submitted data undergo both automatic and manual review to ensure their quality. MD5 checksums are applied to verify data integrity during each submission and transfer.
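    The MD5 check described above can be illustrated with a minimal sketch. It assumes the depositor supplies an MD5 value alongside each file; the function names `md5_of_file` and `verify_transfer` are hypothetical and are not part of any CNSA tool.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path: Path, expected_md5: str) -> bool:
    """Compare the received file's MD5 with the value declared by the depositor.

    Hypothetical helper: the declared value is normalised (whitespace, case)
    because checksum files often carry trailing newlines or uppercase hex.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

    Reading in chunks matters here because sequencing files are often far larger than available memory. A mismatch would typically mean the file should be transferred again rather than archived.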
  5. Accessibility
    To promote data sharing and reuse while protecting data privacy and security, CNSA provides different levels of access privileges. Public data are openly accessible, while access to controlled data is granted to users upon request. The data depositor chooses the access level.
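    The two access levels can be pictured as a simple decision rule. The sketch below is illustrative only: the `Dataset` record, its field names, and the accession strings are hypothetical, and it assumes an approved request is recorded as a user identifier attached to the dataset.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessLevel(Enum):
    PUBLIC = "public"
    CONTROLLED = "controlled"


@dataclass
class Dataset:
    accession: str                      # hypothetical identifier
    access_level: AccessLevel           # chosen by the depositor
    approved_users: set = field(default_factory=set)  # users whose requests were granted


def can_access(dataset: Dataset, user_id: str) -> bool:
    """Public data are open to everyone; controlled data require an approved request."""
    if dataset.access_level is AccessLevel.PUBLIC:
        return True
    return user_id in dataset.approved_users
```

    For example, a dataset marked `AccessLevel.CONTROLLED` with `approved_users={"alice"}` would be readable by `"alice"` but not by any other user.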
  6. Security
    Archived data is backed up at a geographically separate data center, and the two copies are periodically validated using MD5 checksums. CNSA regularly checks the condition of storage media and replaces any defective units. In case of media failure, data is recovered from the redundant copy. Security measures, potentially including encryption of sensitive data, are taken to safeguard data. These activities are conducted by staff with appropriate security skills, which are kept current through continuous training. CNSA has a detailed plan outlining the recovery procedures for data loss events such as technical failures or cyber-attacks.
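    The periodic validation and recovery steps can be sketched as follows. This assumes the two copies are a primary and a backup, and that the MD5 value recorded at submission is available for comparison; the function names and the returned action labels are hypothetical, not CNSA's actual procedure.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copies(recorded_md5: str, primary: Path, backup: Path) -> str:
    """Check both copies against the recorded checksum and suggest an action.

    Comparing each copy with the recorded value (rather than only with each
    other) shows which copy is damaged, and so which one to restore from.
    """
    primary_ok = md5_of_file(primary) == recorded_md5
    backup_ok = md5_of_file(backup) == recorded_md5
    if primary_ok and backup_ok:
        return "ok"
    if backup_ok:
        return "restore_primary_from_backup"
    if primary_ok:
        return "restore_backup_from_primary"
    return "both_damaged"
```

    A result of `"both_damaged"` is the case that the recovery plan for data loss events would need to cover.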
  7. Long-term Preservation and Migration Strategy
    CNSA commits to a strategy ensuring long-term data preservation, which includes the potential migration of data to new systems as technology evolves. This ensures that the data remains accessible and usable in the future.
  8. Review and Update
    CNSA's preservation plan is a living document that is regularly reviewed and updated by a designated team to ensure its continued effectiveness and relevance.