Phytozome Database

The Phytozome Database collects JGI-sequenced plant genomes, as well as selected genomes and datasets that have been sequenced elsewhere.

Number of records: 209
Last updated: 2021-02-24

1. Background

Phytozome is the Plant Comparative Genomics portal of the Department of Energy's Joint Genome Institute. As of release v13, Phytozome provides access to the sequences and functional annotations of more than 200 complete plant genomes, including all the land plants and selected algae sequenced at the Joint Genome Institute. These data are available to the broader plant science research community.


2. Data description

2.1 Data source

This database collects the latest release, Phytozome v13, from the official Phytozome Downloads site. It includes protein sequences, annotations, and transcripts for 209 plant species.

Reference

Goodstein, D. M., et al. (2012). Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research. DOI: 10.1093/nar/gkr944

2.2 Metadata table

Field          Description
organism       Organism name
species        Species name
common_name    Common name
protein        Protein sequence file
gff3           Annotation file (GFF3)
transcript     Transcript file
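
The fields above can be read into typed records with a few lines of Python. This is only a minimal sketch: it assumes the metadata table is exported as tab-delimited text with a header row that uses the field names listed above. The storage format, the `SpeciesRecord` class, and the `load_metadata` helper are illustrative assumptions, not part of the database.

```python
import csv
import io
from dataclasses import dataclass

# Field names taken from the metadata table; the TSV layout is an assumption.
FIELDS = ("organism", "species", "common_name", "protein", "gff3", "transcript")


@dataclass
class SpeciesRecord:
    """One row of the metadata table (hypothetical container)."""
    organism: str
    species: str
    common_name: str
    protein: str      # protein sequence file
    gff3: str         # annotation file
    transcript: str   # transcript file


def load_metadata(handle):
    """Read metadata rows from a tab-delimited text stream with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [SpeciesRecord(**{name: row[name] for name in FIELDS}) for row in reader]


if __name__ == "__main__":
    # Placeholder values for illustration only; they are not real entries.
    demo = io.StringIO(
        "\t".join(FIELDS) + "\n"
        "example_organism\texample species\texample plant\t"
        "example.protein.fa\texample.gff3\texample.transcript.fa\n"
    )
    for record in load_metadata(demo):
        print(record.species, record.protein, record.gff3, record.transcript)
```

Keeping the three file fields on each record makes it straightforward to pass a species' protein file to a downstream search step.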

3. Workflows

Gene homology discovery using Hidden Markov Models

HMMER is widely used to search sequence databases for homologous protein or nucleotide sequences, using profile hidden Markov models (profile HMMs) built from multiple sequence alignments as queries. Its main uses include searching a single protein sequence, a multiple protein sequence alignment, or a profile HMM against a target sequence database.

Here, HMMER was used to identify all members of a given gene family in the gene-coding product datasets generated by the 1000 Plant transcriptomes initiative. We plan to provide more comprehensive datasets later for characterizing the diversity of all functional gene families.
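
A profile HMM search of this kind could be scripted roughly as follows. This is a sketch under stated assumptions rather than the pipeline actually used: it assumes the `hmmsearch` program from HMMER 3 is installed, that a profile HMM for the gene family already exists (for example, one built with `hmmbuild`), and that the target is a protein FASTA file such as the one in the `protein` field. The function names, file paths, and the E-value cutoff of 1e-5 are illustrative choices.

```python
import shutil
import subprocess


def build_hmmsearch_command(hmm_path, fasta_path, tblout_path, evalue=1e-5):
    """Build an hmmsearch command line; the E-value cutoff is an assumed default."""
    return [
        "hmmsearch",
        "--tblout", tblout_path,   # per-target tabular summary
        "-E", str(evalue),         # report targets with E-value <= cutoff
        hmm_path,
        fasta_path,
    ]


def parse_tblout(lines):
    """Parse hmmsearch --tblout lines into a list of hit dictionaries."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        cols = line.split(maxsplit=18)  # the last column is a free-text description
        hits.append({
            "target": cols[0],         # target sequence name
            "query": cols[2],          # query profile name
            "evalue": float(cols[4]),  # full-sequence E-value
            "score": float(cols[5]),   # full-sequence bit score
        })
    return hits


def find_family_members(hmm_path, fasta_path, tblout_path, evalue=1e-5):
    """Run hmmsearch and return the parsed hits (requires HMMER on PATH)."""
    if shutil.which("hmmsearch") is None:
        raise RuntimeError("hmmsearch not found on PATH")
    command = build_hmmsearch_command(hmm_path, fasta_path, tblout_path, evalue)
    subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
    with open(tblout_path) as handle:
        return parse_tblout(handle)
```

A call such as `find_family_members("family.hmm", "example.protein.fa", "hits.tbl")` (all three paths hypothetical) would return one dictionary per candidate family member. The cutoff is a starting point and usually needs tuning for each gene family.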

Reference

Zhang, Z., Wood, W. I. (2003). A profile hidden Markov model for signal peptides generated by HMMER. Bioinformatics. DOI: 10.1093/bioinformatics/19.2.307