ISCID News Editor
Member # 1417
posted 10. October 2005 10:53
Protein Molecular Function Prediction by Bayesian Phylogenomics
Source: PLoS Computational Biology
Volume 1 | Issue 5 | OCTOBER 2005
Barbara E. Engelhardt, Michael I. Jordan, Kathryn E. Muratore, Steven E. Brenner
The post-genomic era has revealed the nucleic and amino acid sequences for large numbers of genes and proteins, but the rate of sequence acquisition far surpasses the rate of accurate protein function determination. Sequences that lack molecular function annotation are of limited use to researchers, so automated methods for molecular function annotation attempt to make up for this deficiency. But the large number of errors in protein function annotation propagated by automated methods reduces their reliability and utility.
Most of the well-known methods or resources for molecular function annotation ... rely on sequence similarity, such as a BLAST E-value, as an indicator of homology. A functional annotation is heuristically transferred to the query sequence based on reported functions of similar sequences.
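As an illustration of the similarity-based transfer described above, here is a minimal sketch in Python. It is not taken from the article or from any particular tool: the function name `transfer_by_best_hit`, the E-value cutoff of 1e-5, and the hit identifiers are all hypothetical, and real pipelines apply more elaborate heuristics.

```python
def transfer_by_best_hit(hits, annotations, max_evalue=1e-5):
    """Copy the annotation of the most significant annotated hit.

    hits        -- list of (subject_id, evalue) pairs, e.g. parsed from a
                   similarity search (hypothetical input format)
    annotations -- dict mapping subject_id to a reported function
    max_evalue  -- assumed significance cutoff; hits above it are ignored
    """
    best = None
    for subject_id, evalue in hits:
        # Skip weak hits and hits that carry no annotation to transfer.
        if evalue > max_evalue or subject_id not in annotations:
            continue
        if best is None or evalue < best[1]:
            best = (subject_id, evalue)
    return annotations[best[0]] if best is not None else None


# Toy data, invented for illustration only.
hits = [("P1", 1e-40), ("P2", 1e-60), ("P3", 0.5)]
annotations = {"P1": "malate dehydrogenase", "P2": "lactate dehydrogenase"}
predicted = transfer_by_best_hit(hits, annotations)
print(predicted)
```

Note that the sketch looks only at the single best hit, so any error in that hit's annotation is copied directly to the query. This is the kind of error propagation the article is concerned with.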
SIFTER (Statistical Inference of Function Through Evolutionary Relationships) takes a different approach to function annotation. Phylogenetic information, if leveraged correctly, addresses many of the weaknesses of sequence-similarity-based annotation transfer, such as ignoring variable mutation rates. Orthostrapper and RIO provide examples of methods that exploit phylogenetic information, but these methods simplify the problem by extracting pairwise comparisons from the phylogeny, and by using heuristics to convert these comparisons into annotations. SIFTER is a more thoroughgoing approach to automating phylogenomics that makes use of a statistical model of molecular function evolution to propagate all observed molecular function annotations throughout the phylogeny. Thus, SIFTER is able to leverage high-quality, specific annotations and to combine them according to the overall pattern of phylogenetic relationships among homologous proteins.
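To make the idea of propagating annotations through a phylogeny concrete, here is a toy sketch in Python. It is not SIFTER's actual model: it assumes a simple symmetric Markov model in which a function can switch along a branch with a probability that grows with branch length, and it computes the posterior for one unannotated protein by message passing over the tree. The tree, branch lengths, rate, and node names are invented for illustration; LDH and MDH are borrowed from the article's abbreviation list only as labels.

```python
import math

FUNCTIONS = ["LDH", "MDH"]  # candidate functions (toy choice)


def transition(i, j, branch_length, rate=1.0):
    """P(state j at one end of a branch | state i at the other end).

    Assumed symmetric k-state model: every switch happens at the same rate.
    """
    k = len(FUNCTIONS)
    decay = math.exp(-k * rate * branch_length)
    return 1.0 / k + (1.0 - 1.0 / k) * decay if i == j else (1.0 - decay) / k


def subtree_likelihood(node, came_from, tree, observed):
    """Likelihood of the annotations seen beyond `node`, for each state of `node`."""
    k = len(FUNCTIONS)
    if node in observed:
        like = [1.0 if f == observed[node] else 0.0 for f in FUNCTIONS]
    else:
        like = [1.0] * k
    for neighbor, length in tree[node]:
        if neighbor == came_from:
            continue
        msg = subtree_likelihood(neighbor, node, tree, observed)
        for i in range(k):
            like[i] *= sum(transition(i, j, length) * msg[j] for j in range(k))
    return like


def posterior(query, tree, observed):
    """Posterior over functions for `query`, with a uniform prior.

    Rooting the computation at the query is valid here because the assumed
    model is symmetric, so the direction of each branch does not matter.
    """
    like = subtree_likelihood(query, None, tree, observed)
    total = sum(like)
    return {f: value / total for f, value in zip(FUNCTIONS, like)}


# Invented tree: the query q is a close sibling of p1; p2 and p3 sit in a distant clade.
edges = [("root", "A", 0.5), ("root", "B", 0.5),
         ("A", "q", 0.1), ("A", "p1", 0.1),
         ("B", "p2", 0.1), ("B", "p3", 0.1)]
tree = {}
for u, v, length in edges:
    tree.setdefault(u, []).append((v, length))
    tree.setdefault(v, []).append((u, length))

observed = {"p1": "MDH", "p2": "LDH", "p3": "LDH"}
result = posterior("q", tree, observed)
print(result)
```

In this toy tree the posterior for q favors MDH, the function of its close sibling p1, even though two of the three annotated proteins are LDH. A simple vote over annotated homologs would point the other way, which is why using the whole pattern of phylogenetic relationships can matter.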
Other approaches, referred to as context methods, predict protein function using evolutionary information and protein expression and interaction data [21–26]. These methods provide predictions for functional interactions and relationships. They complement detailed predictions from SIFTER and the sequence-based approaches mentioned above, which predict features that evolve in parallel with molecular phylogenetic relationships, such as molecular function.
Editor: Jonathan Eisen, The Institute for Genomic Research, United States of America
Received: May 4, 2005; Accepted: August 29, 2005; Published: October 7, 2005
Copyright: © 2005 Engelhardt et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abbreviations: AMP, adenosine-5′-monophosphate; DAG, directed acyclic graph; EC, Enzyme Commission; GO, Gene Ontology; GOA, Gene Ontology annotation; LDH, lactate dehydrogenase; MDH, malate dehydrogenase; ROC, receiver operating characteristic
* To whom correspondence should be addressed. E-mail: email@example.com
Citation: Engelhardt BE, Jordan MI, Muratore KE, Brenner SE (2005) Protein Molecular Function Prediction by Bayesian Phylogenomics. PLoS Comput Biol 1(5): e45