The genetic repertoire of the deep sea: from sequence to structure and function

The deep sea is the largest and perhaps the most hostile environment on Earth, yet it remains underexplored, especially with regard to its genetic repertoire. Previous work has nonetheless revealed significant habitat-specific deep-sea biodiversity. Here, we present an integrated deep-sea genetic dataset comprising 502 million nonredundant genes from 2,138 samples and 2.4 million predicted structures, revealing unprecedented microbial genetic diversity. Global sequence analysis, combined with biophysical and biochemical measurements, allowed us to link specific protein structures with genetic variants required for life in the deep sea and to advance biotechnology. Furthermore, estimates of substitution rates indicated that genes involved in replication, recombination and repair appear to be critical for microbial life in the deep sea. Among them was a structurally unique helicase that enabled ultra-rapid nanopore sequencing (390±11 bp/s). Thus, our work not only deciphers the ecological drivers and evolutionary forces underpinning deep-sea genetic diversity but also bridges genetic knowledge with biotechnology.

Download
| File name | Description | Size | MD5 |
| --- | --- | --- | --- |
| DSGC_Table_S1.xlsx | Metadata of metagenome datasets used in this study | 286 KB | f6f2bf73035db66e130c4f835ba1ccaf |
| DSGC.faa.gz | Sequences of the 502 million unigenes in the Deep-Sea Gene Catalog (DSGC) | 53 GB | a10bcc99c7dfb4a44550fc10d0d41126 |
| DSGC_eggNOG_annotation.gz | EggNOG annotations of DSGC | 27 GB | dfff4ab7a8f23586d77dad832db5bb7e |
| DSGC_KEGG_annotation.gz | KEGG annotations of DSGC | 0.68 GB | 9933f6d0cf0b5716775f552b0be2e08a |
| DSFC_HC.tar.gz | High-Confidence structures in the Deep-Sea Fold Catalog (DSFC) | 54 GB | 460b9d5d644978c2784127ea975bc658 |
| DSFC_GC.tar.gz | Good-Confidence structures in DSFC | 34 GB | 2a710e3eaa8858ab1d7ac89cd61853e5 |
| DSFC_LC.tar.gz | Low-Confidence structures in DSFC | 12 GB | b1776d0d8f406c59a40e9a782d651525 |
| DSFC_struc2unigene.list.gz | Correspondence between representative structures and the unigenes in DSGC | 1.7 GB | 90fbf8809a8d2162706abb627e616b1d |
| DS_MAG_HQ.tar.gz | High-quality MAGs used in this study | 2.3 GB | 8aa6ad17672c07df012ce4b4b8abd438 |
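
Because several of these archives are tens of gigabytes, it may be worth checking each download against the MD5 value listed above before unpacking it. The following is a minimal Python sketch of one way to do that, not a tool provided with the dataset. The helper names `md5_of_file` and `verify_download` and the `EXPECTED_MD5` mapping are hypothetical; only the file names and digests are taken from the listing, and just two entries are shown.

```python
import hashlib
from pathlib import Path

# Digests copied from the download listing (two entries shown; add the rest as needed).
EXPECTED_MD5 = {
    "DSGC_Table_S1.xlsx": "f6f2bf73035db66e130c4f835ba1ccaf",
    "DSGC_KEGG_annotation.gz": "9933f6d0cf0b5716775f552b0be2e08a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the digest listed for its name.

    Raises KeyError when the file name has no entry in `expected`.
    """
    path = Path(path)
    if path.name not in expected:
        raise KeyError(f"no expected MD5 recorded for {path.name}")
    return md5_of_file(path) == expected[path.name].lower()
```

Assuming a file has been saved under its listed name, `verify_download("DSGC_KEGG_annotation.gz")` returns `True` only when the local copy matches the listed digest; a `False` result most likely indicates an incomplete or corrupted download.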
How to cite

xxx, The genetic repertoire of the deep sea: from sequence to structure and function. Under Review (2025) DOI: 10.1016/j.cell.2025.xx.xxx.