Drosophila gene expression pattern annotation through multi-instance multi-label learning.

PMID:21519115

IF: 3.702

Cited by: 42

Abstract

In studies of Drosophila embryogenesis, a large number of two-dimensional digital images of gene expression patterns have been produced to build an atlas of spatio-temporal gene expression dynamics across developmental time. The gene expression captured in these images has been manually annotated with anatomical and developmental ontology terms from a controlled vocabulary (CV), and these annotations are useful in research aimed at understanding gene functions, interactions, and networks. With the rapid accumulation of images, manual annotation has become increasingly cumbersome, and computational methods to automate this task are urgently needed. However, automated annotation of embryo images is challenging: the annotation terms correspond spatially to local expression patterns within images, yet they are assigned collectively to groups of images, so it is unknown which term corresponds to which region of which image in the group. In this paper, we address this problem using a new machine learning framework, Multi-Instance Multi-Label (MIML) learning. We first show that the annotation task is, by its underlying nature, a typical MIML learning problem. We then propose two support vector machine algorithms under the MIML framework for the task. Experimental results on the FlyExpress database (a digital library of standardized Drosophila gene expression pattern images) show that exploiting the MIML framework leads to significant performance improvement over state-of-the-art approaches.
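To make the MIML setting concrete, the following minimal Python sketch shows how an image group can be treated as a bag of instances that shares one set of annotation terms. It is an illustration only and not the support vector machine algorithms proposed in the paper: the `Bag` class, the per-term linear weights, the threshold, and the term names `term_a`, `term_b`, and `term_c` are all hypothetical, and the max-over-instances rule is the standard multi-instance assumption swapped in for simplicity.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Bag:
    """One image group: several instances (e.g. patch feature vectors)
    sharing a single set of annotation terms. Hypothetical structure."""
    instances: List[List[float]]
    labels: Set[str]


def instance_score(weights: List[float], instance: List[float]) -> float:
    """Linear score of one instance for one term."""
    return sum(w * x for w, x in zip(weights, instance))


def bag_scores(bag: Bag, term_weights: Dict[str, List[float]]) -> Dict[str, float]:
    """Score each term by its best-matching instance in the bag.

    Taking the maximum reflects the assumption that a term applies to the
    group if at least one region expresses the corresponding pattern.
    """
    return {
        term: max(instance_score(w, inst) for inst in bag.instances)
        for term, w in term_weights.items()
    }


def predict_terms(bag: Bag, term_weights: Dict[str, List[float]],
                  threshold: float = 0.5) -> Set[str]:
    """Return every term whose bag-level score reaches the threshold."""
    return {t for t, s in bag_scores(bag, term_weights).items() if s >= threshold}


# Toy data: three instances with two features each (values are made up).
bag = Bag(
    instances=[[0.9, 0.1], [0.0, 0.2], [0.1, 0.8]],
    labels={"term_a", "term_b"},
)

# Hypothetical per-term weight vectors; a real system would learn these.
term_weights = {
    "term_a": [1.0, 0.0],
    "term_b": [0.0, 1.0],
    "term_c": [-1.0, -1.0],
}

predicted = predict_terms(bag, term_weights)
print(sorted(predicted))
```

In this toy case different instances account for different terms, which mirrors the ambiguity described in the abstract: the labels are known only for the group as a whole, not for any individual region.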

Keywords

Temporal Spatial Gene Expression

Temporal Spatial Anatomic

MeSH terms

Animals

Computational Biology

Databases, Factual

Drosophila

Embryo, Nonmammalian

Gene Expression Regulation, Developmental

Molecular Sequence Annotation

Support Vector Machine

Authors

Li, Ying-Xin

Ji, Shuiwang

Kumar, Sudhir

Ye, Jieping

Zhou, Zhi-Hua
