The Cycas genome and the early evolution of seed plants

The cycad genome project is an integration of genomic data of cycads and other related seed plants, including the raw sequencing data, assembly and annotation.

Data volume: 444
Last updated: 2022-04-19

1. Background

Introduction to cycads.
Cycads are long-lived, woody, dioecious gymnosperms that develop cones, reproduce by seeds, and are characterized by their frond-like leaves. Today, they form one of the largest lineages of gymnosperms, comprising ca. 360 living species (http://www.cycadlist.org) that are widely distributed across tropical and subtropical regions. As cycads are among the most ancient lineages of living seed plants, the cycad genome project provides a valuable resource for better understanding the origin and early evolution of seed plants.

Cycad genome database
The cycad genome database is an integration of genomic data of cycads and other related seed plants, including the raw sequencing data, assemblies and annotations. The assemblies cover cycad genomes, female- and male-specific regions of cycad genomes, and transcriptomes of cycads and other gymnosperm species. The annotations include repeat, gene, and functional annotation of the cycad genome, as well as open reading frame predictions for the transcriptomes.

2. Data description

2.1 Genome

A Cycas panzhihuaensis genome was assembled and polished with modified versions of NextDenovo and NextPolish. Combined with Hi-C chromosome conformation data, the C. panzhihuaensis assembly comprises 10.5 Gb in 5,123 contigs (N50 = 12 Mb), with 95.3% of the assembled contigs anchored to the 11 largest pseudomolecules, corresponding to the 11 chromosomes (n = 11) of the C. panzhihuaensis karyotype.
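
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is usually computed from a list of contig lengths. The function name and the toy lengths are illustrative assumptions; they are not taken from the Cycas assembly.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical), total = 300
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```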

2.2 Male-specific regions of the Y chromosome

The SSK_finder pipeline was used to extract male-specific 21-bp k-mers, which were then used to select male-specific Nanopore reads in which these k-mers covered at least 0.2% of the read. The selected long reads were assembled using NextDenovo, and the assemblies were then polished with male-specific reads using NextPolish. The contigs were assembled into scaffolds using Juicer and 3D-DNA with male-specific Hi-C reads extracted by the same pipeline as for short reads. In total, 45.5 Mb of male-specific regions were assembled into 43 scaffolds.
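
The read-selection step can be illustrated with a small sketch. This is not the SSK_finder implementation. It assumes that "coverage" means the fraction of a read's bases lying inside at least one male-specific k-mer, and for brevity it ignores reverse complements. The function names, the toy reads, and the toy k-mer set are hypothetical.

```python
def kmer_coverage(read, specific_kmers, k=21):
    """Fraction of bases in `read` covered by at least one k-mer from `specific_kmers`."""
    if len(read) < k:
        return 0.0
    covered = [False] * len(read)
    for i in range(len(read) - k + 1):
        if read[i:i + k] in specific_kmers:
            # Mark every base spanned by this matching k-mer.
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(read)


def select_reads(reads, specific_kmers, k=21, min_fraction=0.002):
    """Keep reads whose covered fraction is at least `min_fraction` (0.2% by default)."""
    return {
        name: seq
        for name, seq in reads.items()
        if kmer_coverage(seq, specific_kmers, k) >= min_fraction
    }


# Toy example with k = 4 so the behaviour is easy to follow by eye.
toy_reads = {"read1": "AAAACCCCGGGG", "read2": "TTTTTTTTTTTT"}
toy_kmers = {"CCCC"}
print(sorted(select_reads(toy_reads, toy_kmers, k=4)))  # only read1 contains CCCC
```

A real pipeline would also need to treat each k-mer and its reverse complement as equivalent and to stream reads from FASTQ files rather than hold them in a dictionary.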

2.3 Transcriptome

Transcripts were assembled from the cleaned reads of each sample using the Trinity transcript assembler with a k-mer length of 25 bp, and the longest transcripts were selected and translated with TransDecoder.
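
As an illustration of the "longest transcript" selection, the sketch below keeps one isoform per gene. It assumes that selection is done per Trinity gene and that transcript identifiers follow Trinity's usual pattern, in which the isoform is marked by a trailing `_i<number>` suffix. The function name and the toy sequences are hypothetical, and this is not necessarily the procedure used by the project.

```python
import re


def longest_isoforms(transcripts):
    """Return a dict holding only the longest isoform of each Trinity gene.

    `transcripts` maps transcript IDs (e.g. TRINITY_DN1_c0_g1_i2) to sequences.
    The gene ID is obtained by stripping the trailing `_i<number>` suffix.
    """
    best = {}
    for tid, seq in transcripts.items():
        gene = re.sub(r"_i\d+$", "", tid)
        # Keep this isoform if it is the first or the longest seen for its gene.
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (tid, seq)
    return dict(best.values())


# Toy transcripts (hypothetical sequences)
toy = {
    "TRINITY_DN1_c0_g1_i1": "ATG",
    "TRINITY_DN1_c0_g1_i2": "ATGAAA",
    "TRINITY_DN2_c0_g1_i1": "ATGC",
}
print(sorted(longest_isoforms(toy)))
```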

2.4 Project

Project: https://db.cngb.org/search/project/CNP0001756/ Raw data are available under project accession CNP0001756.

3. Gallery

Cycas panzhihuaensis
Alsophila spinulosa
Dioon spinulosum
Taxus wallichiana var. chinensis
Bowenia spectabilis
Ceratozamia robusta
Encephalartos manikensis
Lepidozamia peroffskyana
Macrozamia communis
Microcycas calocoma
Stangeria eriopus
Zamia neurophyllidia
Ginkgo biloba
Gnetum montanum
Liriodendron chinense
Pinus koraiensis

4. Publication

Liu, Y., Wang, S., Li, L. et al. The Cycas genome and the early evolution of seed plants. Nat. Plants (2022). https://doi.org/10.1038/s41477-022-01129-7

5. How to analyze in CODEPLOT

The genome and transcriptome data, genome assemblies, and annotations can be analyzed in CODEPLOT. Clone the Cycad dataset to your workspace, then add OrthoFinder or another tool from Tools to your workspace.

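6. News

  1. A cycad truly blooms | It outlived the dinosaurs and survives to this day, and its complete genome map has now been released for the first time
  2. Nature Plants cover | The Cycas genome: a detailed look at the genome of the most ancient living seed plant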