
The deep sea, the largest and perhaps most hostile environment on Earth, remains underexplored, especially with regard to its genetic repertoire. Yet previous work has revealed significant habitat-specific deep-sea biodiversity. Here, we present an integrated deep-sea genetic dataset comprising 502 million nonredundant genes from 2,138 samples and 2.4 million predicted structures, revealing unprecedented microbial genetic diversity. Global sequence analysis combined with biophysical and biochemical measurements allowed us to link specific protein structures with genetic variants required for life in the deep sea and to advance biotechnology. Furthermore, estimates of substitution rates revealed that genes involved in replication, recombination and repair appear to be critical for microbial life in the deep sea. Among them was a structurally unique helicase that enabled ultra-rapid nanopore sequencing (390±11 bp/s). Thus, our work not only deciphers the ecological drivers and evolutionary forces underpinning deep-sea genetic diversity, but also bridges genetic knowledge with biotechnology.
File name | Description | Size | MD5 | Download |
---|---|---|---|---|
DSGC_Table_S1.xlsx | Metadata of metagenome datasets used in this study | 286 KB | f6f2bf73035db66e130c4f835ba1ccaf | |
DSGC.faa.gz | Sequences of the 502 million unigenes in Deep-Sea Gene Catalog (DSGC) | 53 GB | a10bcc99c7dfb4a44550fc10d0d41126 | |
DSGC_eggNOG_annotation.gz | EggNOG annotations of DSGC | 27 GB | dfff4ab7a8f23586d77dad832db5bb7e | |
DSGC_KEGG_annotation.gz | KEGG annotations of DSGC | 0.68 GB | 9933f6d0cf0b5716775f552b0be2e08a | |
DSFC_HC.tar.gz | High-Confidence structures in Deep-Sea Fold Catalog (DSFC) | 54 GB | 460b9d5d644978c2784127ea975bc658 | |
DSFC_GC.tar.gz | Good-Confidence structures in DSFC | 34 GB | 2a710e3eaa8858ab1d7ac89cd61853e5 | |
DSFC_LC.tar.gz | Low-Confidence structures in DSFC | 12 GB | b1776d0d8f406c59a40e9a782d651525 | |
DSFC_struc2unigene.list.gz | The correspondence between representative structures and the unigenes in DSGC | 1.7 GB | 90fbf8809a8d2162706abb627e616b1d | |
DS_MAG_HQ.tar.gz | High-quality MAGs used in this study | 2.3 GB | 8aa6ad17672c07df012ce4b4b8abd438 | |
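
Because several of these archives are tens of gigabytes, it may be worth checking each download against the MD5 value listed in the table before unpacking it. The following is a minimal sketch, assuming the files have been saved locally under the names shown in the table. The helper names `md5_of_file` and `verify_download` are hypothetical and not part of any tooling distributed with the dataset.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file table above.
EXPECTED_MD5 = {
    "DSGC_Table_S1.xlsx": "f6f2bf73035db66e130c4f835ba1ccaf",
    "DSGC.faa.gz": "a10bcc99c7dfb4a44550fc10d0d41126",
    "DSGC_eggNOG_annotation.gz": "dfff4ab7a8f23586d77dad832db5bb7e",
    "DSGC_KEGG_annotation.gz": "9933f6d0cf0b5716775f552b0be2e08a",
    "DSFC_HC.tar.gz": "460b9d5d644978c2784127ea975bc658",
    "DSFC_GC.tar.gz": "2a710e3eaa8858ab1d7ac89cd61853e5",
    "DSFC_LC.tar.gz": "b1776d0d8f406c59a40e9a782d651525",
    "DSFC_struc2unigene.list.gz": "90fbf8809a8d2162706abb627e616b1d",
    "DS_MAG_HQ.tar.gz": "8aa6ad17672c07df012ce4b4b8abd438",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the value listed for its name.

    Raises KeyError when the file name is not in the expected mapping.
    """
    path = Path(path)
    if path.name not in expected:
        raise KeyError(f"no checksum listed for {path.name}")
    return md5_of_file(path) == expected[path.name].lower()
```

For example, `verify_download("DSFC_HC.tar.gz")` would return `True` only if the local copy matches the listed checksum. A `False` result most likely indicates a truncated or corrupted transfer, in which case the file should be downloaded again.
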
xxx, The genetic repertoire of the deep sea: from sequence to structure and function. Under Review (2025) DOI: 10.1016/j.cell.2025.xx.xxx.