Data schema
The CGA applies the GGBN sample standards to sample collection and sample data sharing, and the GA4GH standard to datasets from individual humans. Project, sample, experiment/run, assembly, and annotation records in the CGA all follow the INSDC standard, so that data can be shared with international data centers such as EBI, NCBI, and DDBJ. Built on “the three banks” (Biorepository, Bio-informatics Data Center, and Living Biobank), the CGA can link metadata, samples, and even living organisms, so that any piece of information remains traceable throughout the whole process.
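The traceability described above can be pictured with a small sketch. It is an illustration only, not the CGA's actual schema: the `Record` class, the `trace` function, the single-parent chain, and all accession strings are hypothetical, and the real INSDC model allows richer links (for example, an experiment referring to both a project and a sample).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    """One metadata record; all field names here are assumptions."""
    accession: str                 # made-up identifier, not a real accession
    kind: str                      # project, sample, experiment, run, assembly, annotation
    parent: Optional[str] = None   # accession of the record this one derives from


def trace(accession: str, index: Dict[str, Record]) -> List[str]:
    """Follow parent links from a record back to its root record."""
    path: List[str] = []
    seen = set()
    current: Optional[str] = accession
    while current is not None:
        if current in seen:
            # Guard against malformed data that links back onto itself.
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        record = index[current]
        path.append(f"{record.kind}:{record.accession}")
        current = record.parent
    return path


# Hypothetical records forming one chain from annotation back to project.
records = [
    Record("PRJ0001", "project"),
    Record("SAM0001", "sample", parent="PRJ0001"),
    Record("EXP0001", "experiment", parent="SAM0001"),
    Record("RUN0001", "run", parent="EXP0001"),
    Record("ASM0001", "assembly", parent="RUN0001"),
    Record("ANN0001", "annotation", parent="ASM0001"),
]
index = {r.accession: r for r in records}
lineage = trace("ANN0001", index)
```

Here `lineage` lists every record between the annotation and its originating project, which is the kind of end-to-end lookup the linked metadata is meant to support.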