|
Author
|
Topic: ancient genome duplication in yeast
|
Matthew J. Brauer
Member
Member # 819
|
posted 17. March 2004 14:36
Nature advanced online publication 7 March 2004
quote:
Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae
MANOLIS KELLIS1,2, BRUCE W. BIRREN1 & ERIC S. LANDER1,3
1 The Broad Institute, Massachusetts Institute of Technology and Harvard University, Cambridge, Massachusetts 02138, USA 2 MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA 3 Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02139, USA
Correspondence and requests for materials should be addressed to M.K. (manoli@mit.edu) and E.S.L. (lander@broad.mit.edu). The GenBank accession number for K. waltii is AADM01000000.
Abstract Whole-genome duplication followed by massive gene loss and specialization has long been postulated as a powerful mechanism of evolutionary innovation. Recently, it has become possible to test this notion by searching complete genome sequence for signs of ancient duplication. Here, we show that the yeast Saccharomyces cerevisiae arose from ancient whole-genome duplication, by sequencing and analysing Kluyveromyces waltii, a related yeast species that diverged before the duplication. The two genomes are related by a 1:2 mapping, with each region of K. waltii corresponding to two regions of S. cerevisiae, as expected for whole-genome duplication. This resolves the long-standing controversy on the ancestry of the yeast genome, and makes it possible to study the fate of duplicated genes directly. Strikingly, 95% of cases of accelerated evolution involve only one member of a gene pair, providing strong support for a specific model of evolution, and allowing us to distinguish ancestral and derived functions.
I'll come back to this when I've finished reading the paper. For now I hope this abstract piques some interest. [ 17. March 2004, 20:44: Message edited by: Moderator ]
|
|
Pim van Meurs
Member
Member # 541
|
posted 17. March 2004 23:42
MB: I'll come back to this when I've finished reading the paper. For now I hope this abstract piques some interest.
It surely does. Let us know. I will see if I can get my hands on a copy.
"Evolution with a purpose" captures some of the observations by other scientists, namely evolvability.
Having read the paper, I understand your interest. It also seems highly relevant to ID, particularly the conclusion:
quote:
WGD followed by massive gene loss and gene specialization offers an important path for large-scale evolutionary innovation. Compared to multiple independent duplications and divergence of individual genes or segments, WGD may be more efficient and may offer great opportunities for coordinated evolution. Although organisms clearly can undergo WGD resulting in complete polyploids (as evidenced by existing tetraploid species of plants and animals 34–37), it has been unclear whether WGD can then be followed by massive genome reshaping to yield diploids with expanded gene content. Our analysis of K. waltii definitively proves that S. cerevisiae is the descendant of an ancient WGD, as originally proposed by Wolfe 3 on the basis of subtle genomic patterns. Comparison with K. waltii allows rigorous study of the many evolutionary innovations that arose in this dramatic event. We note that a similar analysis of the Ashbya gossypii genome has been carried out (P. Philippsen, personal communication). The results here suggest that it may also be fruitful to search for similar genomic signatures of ancient WGD in other organisms 38–40. It will be interesting to see just how far such distant echoes of genomic upheaval may be traced.
[ 18. March 2004, 11:03: Message edited by: Pim van Meurs ]
|
|
Pim van Meurs
Member
Member # 541
|
posted 20. March 2004 15:23
Gene duplication, robustness and evolvability are of considerable interest to me. The Santa Fe Institute has a page describing one of its projects, by Fontana:
quote: How does robustness to mutation arise through evolution? The simplest naturally occurring object capable of evolution is a single polymer molecule, known as RNA. An important fact about RNA folding is that many sequences do not change their shape when certain positions are mutated. Fontana and colleagues discovered (computationally) that sequences with the same shape are organized into mutationally connected networks called ``neutral networks''. An evolving population can drift on such a neutral network until a more advantageous shape comes within mutational reach. This enables evolutionary change to occur that otherwise would have been impossible. In this way neutrality is seen to be both a buffer against change and an enabler of change.
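To make the neutral-network idea in the quote concrete, here is a minimal sketch. It does not use real RNA folding: `phenotype` is a hypothetical many-to-one map on bit strings, chosen only so that many genotypes share one "shape". The walk accepts only mutations that leave the phenotype unchanged and records which phenotypes come within one mutation along the way.

```python
import random

def phenotype(genotype):
    """Toy many-to-one map (an assumption, not RNA folding):
    each half of the bit string is 'on' if it holds at least two 1s."""
    half = len(genotype) // 2
    return (sum(genotype[:half]) >= 2, sum(genotype[half:]) >= 2)

def neighbours(genotype):
    """Yield every genotype exactly one point mutation away."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]

def neutral_walk(start, steps, seed=0):
    """Drift on the neutral network of `start`: accept only mutations
    that keep the phenotype, and collect phenotypes seen one step away."""
    rng = random.Random(seed)
    current = start
    target = phenotype(start)
    reachable = set()
    for _ in range(steps):
        mutants = list(neighbours(current))
        reachable.update(phenotype(m) for m in mutants)
        neutral = [m for m in mutants if phenotype(m) == target]
        if neutral:
            current = rng.choice(neutral)
    return current, reachable

start = (1, 1, 1, 1, 1, 1, 1, 1)
end, reachable = neutral_walk(start, steps=200)
print(phenotype(end) == phenotype(start), sorted(reachable))
```

At the all-ones starting point every single mutation is neutral, so no new phenotype is within reach. Only after some drift can the walk sit next to a different phenotype, which is the "buffer and enabler" point of the quote in miniature.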
What fascinates me is that it is not just divergence but also convergence that seems to exhibit scale-free features:
quote:
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix–loop–helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Convergent evolution of gene networks by single gene duplications in higher eukaryotes EMBO reports 5, 3, 274–279 (2004) Gregory D Amoutzias, David L Robertson, Stephen G Oliver & Erich Bornberg-Bauer
Data on the evolution of protein and RNA networks increasingly reveal how relatively simple processes can lead to robustness AND evolvability through the existence of (scale-free networks with) neutral pathways. Toussaint's thesis work, which provides a theoretical foundation for exploring these concepts, seems quite timely.
See Evolving protein interaction networks through gene duplication, Romualdo Pastor-Satorras, Eric Smith and Ricard V. Solé in Journal of Theoretical Biology 222 (2003) 199–210
quote:
Abstract The topology of the proteome map revealed by recent large-scale hybridization methods has shown that the distribution of protein–protein interactions is highly heterogeneous, with many proteins having few edges while a few of them are heavily connected. This particular topology is shared by other cellular networks, such as metabolic pathways, and it has been suggested to be responsible for the high mutational homeostasis displayed by the genome of some organisms. In this paper we explore a recent model of proteome evolution that has been shown to reproduce many of the features displayed by its real counterparts. The model is based on gene duplication plus re-wiring of the newly created genes. The statistical features displayed by the proteome of well known organisms are reproduced and suggest that the overall topology of the protein maps naturally emerges from the two leading mechanisms considered by the model.
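The "gene duplication plus re-wiring" mechanism in the abstract can be sketched in a few lines. This is only an illustration under my own assumptions: the function names, the two-node starting graph, and the values of `delta` (chance that an inherited link is lost) and `alpha` (re-wiring rate) are not taken from the paper.

```python
import random

def grow_network(n_final, delta=0.5, alpha=0.1, seed=0):
    """Grow an undirected graph by node duplication plus re-wiring.

    delta and alpha are illustrative values, not the paper's.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # start from two linked proteins
    while len(adj) < n_final:
        n = len(adj)
        parent = rng.randrange(n)
        new = n
        # Duplication: the copy inherits the parent's links,
        # then loses each one with probability delta.
        kept = {nb for nb in adj[parent] if rng.random() >= delta}
        # Re-wiring: the copy gains a link to any existing node
        # with probability alpha / n.
        added = {v for v in adj if v not in kept and rng.random() < alpha / n}
        adj[new] = kept | added
        for nb in adj[new]:
            adj[nb].add(new)
    return adj

def degree_histogram(adj):
    """Map each degree to the number of nodes having that degree."""
    hist = {}
    for nbrs in adj.values():
        hist[len(nbrs)] = hist.get(len(nbrs), 0) + 1
    return dict(sorted(hist.items()))

network = grow_network(500)
print(degree_histogram(network))
```

Printing the histogram for a few parameter choices is a quick way to see whether most nodes end up with few links while a handful become heavily connected, as the abstract describes for real proteomes.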
It's hard to keep up with these fascinating findings.
Oh yes, Toussaint's thesis and other relevant papers.
M. Toussaint (2003): The evolution of genetic representations and modular neural adaptation. Submitted (in April) version of my PhD thesis; a revised, official version published as a booklet will follow late this year.
How do these findings tie in with ID? Given the theoretical foundations for the evolution of genetic representations, can we trace how CSI, complexity and information arise in the genome?
Toussaint argues
quote: Eventually, this allows an information theoretic interpretation of evolutionary dynamics: The information that is given by the selection or non-selection of solutions is implicitly accumulated by evolutionary dynamics and exploited for further search. This information is stored in the way phenotypes are represented. In that way evolution implicitly learns about the problem by adapting its genetic representations accordingly.
p. 15 Thesis
Toussaint's approach captures how the genome accumulates information
quote:
In terms of information theory, sigma-evolution minimizes the Kullback-Leibler divergence between the exploration distribution and the exponential (Boltzmann) fitness distribution, and minimizes the entropy of exploration. It describes the accumulation of information given by the fitness distribution into genetic representations.
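For readers who want to see the quantities named in the quote, here is a small sketch of a Boltzmann fitness distribution, a Kullback-Leibler divergence and an entropy on a handful of discrete solutions. The fitness values and the temperature are made up for illustration, and the quote does not say which argument order the thesis uses for the divergence, so the order below is an assumption.

```python
import math

def boltzmann(fitness, temperature=1.0):
    """Distribution proportional to exp(fitness / temperature)."""
    weights = [math.exp(f / temperature) for f in fitness]
    z = sum(weights)
    return [w / z for w in weights]

def kl_divergence(p, q):
    """D(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

fitness = [0.0, 1.0, 2.0, 3.0]          # hypothetical fitness values
target = boltzmann(fitness)              # Boltzmann fitness distribution
exploration = [0.25, 0.25, 0.25, 0.25]   # an unstructured exploration distribution

print(kl_divergence(exploration, target), entropy(exploration))
```

An exploration distribution that shifts its weight toward the fitter solutions lowers the divergence from `target`, which is one way to read "accumulation of information given by the fitness distribution".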
Mutations and crossovers are explorative in nature, but because of this they reduce mutual information; EDAs (estimation-of-distribution algorithms), however, preserve or amplify structural information. According to Toussaint, the evolutionary process is a succession of increases and decreases in entropy in the population.
Toussaint concludes
quote:
As a benefit we can formalize the relation and conceptual similarity between natural evolution, evolutionary algorithms, and the generic heuristic search scheme that we introduced as a basic information theoretic paradigm of what it means to “learn the structure of a problem and exploit it for future exploration.” All three of them are based on a process of accumulating information—in a more or less explicit way. Concepts of self-adaptation or derandomized adaptation of probabilistic search strategies in evolutionary computation become comparable to the evolution of exploration strategies in natural evolution.
Toussaint also discusses the much referenced 'No Free Lunch theorems' and concludes
quote:
The theorem was formalized and generalized in many ways. Schuhmacher, Vose, & Whitley (2001) generalized it to be valid for any subsets of problems that are closed under permutation. Igel & Toussaint (2003c, 2003b) derived sufficient and necessary conditions for No Free Lunch when the probability of permutations is not equally but arbitrarily distributed. In that paper we also argued that generally the necessary conditions for No Free Lunch are hardly ever fulfilled.
The latter one is quite relevant. Toussaint states: "Only when making this assumption, evolutionary search has a chance to learn about and exploit this structure of the distribution of good solutions and efficiently explore the space of solutions."
The question now becomes: how does evolution make these assumptions?
Toussaint again
quote:
However, in natural evolution mutation operators are not designed by some intelligence. A central question arises: What does it mean to “learn” about the problem structure and exploit it? How in principle can evolution realize this? The answer we will give is that the implicit process of the evolution of genetic representations allows for the self-adaptation of the “search strategy” (i.e., the phenotypic variability induced by mutation and recombination). To some degree, this process has been overlooked in the context of evolutionary algorithms because complex, non-trivial (to be rigorously defined later) genetic representations (genotype phenotype mappings) have been neglected by theoreticians. This chapter tries to fill this gap and propose a theoretical framework for evolution in the case of complex genotype-phenotype mappings focusing at the evolution of phenotypic variability. The next section lays the first cornerstone by clarifying what it means to learn about a problem structure.
This seems quite relevant to Dembski's objections
quote:
What's more, the evolutionary algorithms employing these fitness functions were "no prior knowledge" algorithms. "No prior knowledge" simply means that the algorithm has no additional information for finding a solution other than what it gets from the fitness function. In general, arbitrary, unconstrained, maximal classes of fitness functions each seem to have a No Free Lunch theorem for which evolutionary algorithms cannot, on average, outperform blind search.
Evolution's Logic of Credulity: An Unfettered Response to Allen Orr
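The "on average" in No Free Lunch statements can be checked by brute force on a tiny search space. The sketch below is my own toy setup, not anything from Dembski or Toussaint: three search points, fitness values 0 or 1, and two fixed (non-adaptive) search orders. Averaged over all eight possible fitness functions the two orders tie; restricted to a subset that is not closed under permutation (here, the non-decreasing functions) they do not.

```python
from itertools import product

def best_after_k(f, order, k):
    """Best fitness seen after evaluating the first k points of a fixed order."""
    return max(f[x] for x in order[:k])

def average_performance(order, functions, k):
    """Mean of best_after_k over a collection of fitness functions."""
    return sum(best_after_k(f, order, k) for f in functions) / len(functions)

# Every function from 3 search points to fitness values {0, 1}.
all_functions = list(product((0, 1), repeat=3))
# A structured subset: fitness never decreases from point 0 to point 2.
monotone = [f for f in all_functions if f[0] <= f[1] <= f[2]]

left_to_right = (0, 1, 2)
right_to_left = (2, 1, 0)

for name, funcs in (("all", all_functions), ("monotone", monotone)):
    print(name,
          average_performance(left_to_right, funcs, k=1),
          average_performance(right_to_left, funcs, k=1))
```

On the full set both orders average 0.5 after one evaluation. On the monotone subset the right-to-left order averages 0.75 against 0.25, so a search strategy that matches the structure of the problem class does beat one that ignores it.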
Dembski then continues
quote:
To be sure, fitness in biology varies with time. As organisms evolve and the environment changes, what the environment deems fit changes as well. But what exactly constrains the transition from one fitness landscape or function to the next? If there is no constraint, then we are in the position of Wolpert and Macready's Theorem 2, with evolutionary algorithms proceeding independently of their progress to solution and thus unable to outperform blind search (which means that even with 3.5 billion years of evolution, it's going to be vastly improbable that the evolutionary algorithm approaches a solution). Conveniently, Orr doesn't tell us what constrains the transitions. Presumably nature, unprogrammed and unguided, spontaneously gives rise to the right and needed transitions between successive fitness landscapes, thereby ensuring a form of complexity-increasing evolution. But that is precisely what needs to be explained. Yet for Orr there is no problem, only boundless optimism.
Toussaint may provide the answers to Dembski's questions here. Boundless optimism may have been justified after all.
This also brings me to the logical question about Dembski's displacement argument: if evolutionary algorithms can increase CSI in the genome through displacement of said CSI from its environment, how is this different from CSI displaced by intelligent design when using artificial selection? Using the Wesley Algorithm room, I would argue that it is impossible to differentiate between the two cases. [ 20. March 2004, 15:51: Message edited by: Pim van Meurs ]
|
|
Pim van Meurs
Member
Member # 541
|
posted 22. March 2004 12:44
It was pointed out to me that I may have spoken too hastily. Toussaint is dealing with information and complexity in the sense most relevant to evolution, namely Shannon information. Such definitions of complexity and information need to be distinguished from Dembski's probability arguments. Once a natural pathway is identified, CSI disappears; that is, CSI is only present when no regularity/chance pathways have been identified. So this is not an issue of creating or displacing CSI, merely of destroying apparent CSI, which turns out to be a placeholder for incomplete probability estimates for regularity/chance pathways. Slowly but steadily I am absorbing the rich depths of the contributions of Wein, Perakh and Elsberry. All because I was confused by the use of the terms complexity and information to describe probabilities. :( [ 22. March 2004, 12:45: Message edited by: Pim van Meurs ]
|
|
Matthew J. Brauer
Member
Member # 819
|
posted 30. March 2004 23:57
Anyone besides Pim care to comment on this?