Topic: Non congruent phylogenies
Cornelius G. Hunter
Member
Member # 81
|
posted 23. December 2002 12:54
Folks:
An oft-cited line of evidence for evolution is the congruence of phylogenies constructed from different characters. By congruence, evolutionists mean significant similarity in those topologies that best fit the character data.
The logic behind this evidential claim is, I suppose,
1: if evolution is true then different characters will produce congruent phylogenies
2: We observe congruent phylogenies
3: Hence we have evidence for evolution
I think we can safely lay aside any questions of what constitutes statistical significance and whether or not Step 2 above is true. Clearly, there are character/species sets which lead to significantly similar phylogenies.
However, there are also plenty of character/species sets that do not produce congruent phylogenies [some examples are pasted at end of this message]. I wanted to explore the question, which I have never seen addressed in the evolution literature, of precisely what the theoretical statement is in Step 1 above.
That is, if we take the statement literally, as I've interpreted the evolution argument, then any non congruent phylogeny (of which there are many) disproves evolution and we can safely discard it as a falsified theory and move on. Obviously, things are not so simple. Evolutionists still believe in their theory despite these falsified predictions. So this brings up the question of just what precisely is being stated in Step 1.
To be frank, I suspect most (all?) evolutionists have not thought this out, that there is no accepted precise theoretical formulation of Step 1, and that they are "winging it," for example when new non congruent phylogenies are discovered. I have seen several explanatory mechanisms used in such cases; they include:
a] the molecular data may have sufficiently diverged that the comparisons between them are not valid;
b] there may be too much noise in the molecular data;
c] lateral gene transfer has altered the molecular data;
d] the molecular data may be biased by the particular set of species under study;
e] estimation errors can arise from data sets with many species;
f] molecular data ought not to be equally weighted [e.g., segments influencing protein function should be assumed to be more important than the other segments, or perhaps segments influencing the protein structure should be more heavily weighted, or perhaps hydrophobic segments should be deweighted, etc.].
My point is not to say explanatory mechanisms are out of bounds or that complicating factors should not be expected, but merely to raise the question: At what point does the use of these explanatory mechanisms become ad hoc and do we consider the Step 1 in the syllogism falsified? This evidence is prominent in the evolution literature, yet the details of just what is being claimed do not seem to be available. An evolutionist once told me this evidence is "the real kicker" but did not elaborate when I asked for the details.
I suspect any rigorous theoretical justification for Step 1 is not given simply because, in fact, evolutionary theory does not provide that level of resolution. It is more of an overarching framework from which to construct sub-hypotheses, but the framework's specificity is vague, allowing for considerable flexibility in those sub-hypotheses. Hence, the evolutionist is free to make the statement in Step 1, but then is also free to modify it as needed to fit the actual data.
Any thoughts?
===============================
Examples of non congruent phylogenies:
1] 188 different genes from five different light-harvesting bacteria, each from a different phylum, showed dramatic inconsistencies. In fact, every conceivable phylogeny found support amongst the 188 genes. One could argue for completely different evolutionary histories depending on which genes one selected; the different design features did not converge to the same phylogeny.
[Jason Raymond, et al., "Whole-Genome Analysis of Photosynthetic Prokaryotes," Science 298 (2002): 1616-1619.]
2] Molecular studies of bats have challenged the traditional phylogeny. Some species of bat have a complicated echolocation system that tracks objects as small as a mosquito by sensing the echoes of the bat’s own squeaks. The bat emits a high-pitched squeak, well beyond the range of human hearing, up to 2,000 times per second. It determines both range and direction to the tiny mosquito by sensing the echo while filtering out echoes from the squeaks of nearby bats. Beyond general speculation evolutionists cannot describe how such a system would have evolved, but it is usually assumed that such a unique and complex system evolved only once. The different species of bat that have echolocation are thought to all derive from the ancestor that first developed the system. But now the new molecular phylogenies call for a different arrangement. If they are correct then echolocation must have evolved more than once, independently, in different bat species.
[Emma C. Teeling, et al., "Microbat Paraphyly and the Convergent Evolution of a Key Innovation in Old World Rhinolophoid Microbats," PNAS, 99 (2002) 1431-1436.]
3] Mitochondrial DNA. A recent study found that the mitochondrial DNA provided a statistically high-confidence phylogeny that "was clearly the wrong answer." For example, frogs and chickens were clustered with fish.
[Michael Balter, “Morphologists Learn to Live With Molecular Upstarts,” Science, 276 (1997) 1032-1034.]
|
|
Frances
Member
Member # 169
|
posted 23. December 2002 21:38
A few comments.
While I do not believe that this posting contributes any positive insights into intelligent design, it is an interesting topic nevertheless. As Mr Hunter already suggests, lateral gene transfer will make some phylogenies quite tricky. Thus we are in the interesting position that while the general tendency of phylogenies to overlap is strong evidence for evolution, instances in which such congruence is lacking are not necessarily falsifications of the theory of evolution. In fact, the known existence of lateral gene transfer makes it a powerful explanation of the data. But there may be other reasons for non congruent phylogenies, including problems with the data and many other factors. Whether or not these explanations become ad hoc is determined by the actual instances in which these conclusions are reached. There are countless congruent phylogenies, but there is surely strong evidence of lateral gene transfer as well. I think Mr Hunter has found a good example in which our knowledge adapts to reality. In this case we seem to have found evidence that leads one to propose lateral gene transfer. Of course, that the 'evolutionist is free to modify the statement to fit the actual data' is nothing particular to evolution but applies to any scientific theory.
So I am not sure where Mr Hunter is going with this thread. Perhaps Mr Hunter could be encouraged to formulate his ideas in a more positive manner with respect to the topic of intelligent design? What a hypothesis of intelligent design would predict for such phylogenies would be an interesting topic. What about ID à la Mike Gene, which seems to rely on front loading? Those would be interesting topics to explore further, although the topic itself and the examples given are topics of heated discussion within biology. For those interested, I believe that the conclusion that 'If they are correct then echolocation must have evolved more than once, independently, in different bat species' is but one possible explanation of the data.
quote:
If microbats are paraphyletic, then laryngeal echolocation either evolved more than once in different microbats or was lost in megabats after evolving in the ancestor of all living bats.
Source
Another interesting paper on echolocation evolution
[ 23. December 2002, 21:59: Message edited by: Frances ]
|
|
yersinia
Member
Member # 324
|
posted 24. December 2002 00:40
I think that several considerations have to be added to Hunter's post before serious discussion can be had.
1) "Congruence" and "noncongruence" are not either/or entities; they are a matter of degree. Given n species being analyzed, there are something like (2n-3)!/(2^(n-2) (n-2)!) hypothetically possible ways of arranging them into a tree (Theobald 2002), and the (dis)similarity between two trees can be rigorously quantified.
This equation will differ slightly depending on whether the trees are rooted vs. unrooted, binary splits only, etc. Regardless, the number of possible trees gets very big very fast: 4 species = 15 possible trees, 8 species = 135,135 possible trees.
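As a rough check on the numbers above, here is a minimal Python sketch that counts rooted, strictly bifurcating trees for n labeled species with the formula (2n-3)!/(2^(n-2) (n-2)!). The function name `rooted_tree_count` is made up for illustration, and the count assumes rooted trees with binary splits only; unrooted trees or trees with multifurcations would give different totals.

```python
from math import factorial


def rooted_tree_count(n: int) -> int:
    """Number of rooted, strictly bifurcating trees on n labeled species."""
    if n < 1:
        raise ValueError("need at least one species")
    if n < 3:
        return 1  # one or two species can only be arranged one way
    # (2n-3)! / (2^(n-2) * (n-2)!); the division is always exact
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


# Print the count for a few tree sizes to show how fast it grows.
for n in (4, 8, 26):
    print(n, rooted_tree_count(n))
```

Under these assumptions the 4- and 8-species counts come out to 15 and 135,135, matching the figures quoted above.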
You can randomly generate tree diagrams at this cool page (Phylogeny and Reconstructing Phylogenetic Trees) and get the idea very quickly what the odds are of getting the same tree twice by random chance.
So the question is not whether two phylogenies from different data sources/research labs are congruent or incongruent, full stop; the question is how congruent or incongruent they are. Most of the examples touted as showing "incongruence" are actually quite minor phylogenetic disagreements. E.g., the interrelationships of different groups of bats are a pretty trivial issue in the context of vertebrates or animalia. If the microbats grouped most closely with anthropoid apes, and the megabats with giraffes, then we'd have a significant disagreement. This kind of thing does not happen in multicellular organisms with protected germ line cells; rather, different datasets keep returning highly congruent phylogenies.
So, just like any scientific measurement, there will be noise in input data. The analogy here is to radiometric dating: if two measurement dates of a moon rock return ages of 4.6 and 4.5 billion years, this is a very minor disagreement relative to the result (100 million years sounds like a lot but is only a 2% disagreement). If someone were to go around saying "geological measurements disagree by 100 million years and this is evidence against an old earth" they would be wrong. Similar minor disagreements, such as Teeling et al.'s 2002 bat study, should not be cited as evidence for Hunter's proposition "there are also plenty of character/species sets that do not produce congruent phylogenies". A real disagreement would occur if all of these different bat species did not group together and instead were randomly associated with the outgroup taxa, but as we can see this did not occur:

The odds of all these bat species grouping together by chance are astronomical.
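To put a rough number on "by chance", here is a hedged Python sketch. It assumes a null model in which every rooted, strictly bifurcating tree on the sampled species is equally likely, and asks how often a named group of species would end up together on one exclusive branch. That null model, the function names, and the use of the 20 bats and 9 outgroups from the Teeling et al. abstract are assumptions made for illustration; this is not a calculation from the paper itself.

```python
from fractions import Fraction
from math import factorial


def rooted_tree_count(n: int) -> int:
    """Number of rooted, strictly bifurcating trees on n labeled species."""
    if n < 1:
        raise ValueError("need at least one species")
    if n < 3:
        return 1
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))


def clade_probability(n_total: int, k_group: int) -> Fraction:
    """Chance that k_group named species form one exclusive branch (clade)
    in a tree drawn uniformly from all rooted bifurcating trees."""
    if not 1 <= k_group <= n_total:
        raise ValueError("group size must be between 1 and n_total")
    # Arrange the group internally, then treat it as a single tip among the rest.
    inside = rooted_tree_count(k_group)
    outside = rooted_tree_count(n_total - k_group + 1)
    return Fraction(inside * outside, rooted_tree_count(n_total))


# 20 bats among 29 species in total (20 bats plus 9 outgroups).
print(float(clade_probability(29, 20)))
```

This covers a single tree only. Getting the same grouping independently from several datasets would multiply such probabilities together, which is where the really small numbers come from.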
2) Scale of the study and range of dataset
As the age-of-the-moon example points out, what is important in considering disagreement in results is not the absolute measurement, but the size of the disagreement relative to the scale of the study. 100 million years sounds like a lot but is peanuts in terms of the age of the earth. Such a disagreement would be major, however, in a radiometric dating of dinosaur bones, and a data source with a smaller error would have to be used.
Radiometric datasets have ranges and scales over which they are useful, due essentially to their rate of decay. You use uranium-lead to date the age of the moon, because it has a half-life of hundreds of millions to billions of years, but it would be ridiculous to use it for dating an archeological artifact because the answer you would get (assuming the artifact was, say, something that had been forged by remelting the ore) would be "0 +/- millions of years". Similarly, the half-life of C-14 is only ~5,700 years, so it is excellent for archeology but for anything older than 50,000 years it is useless (a result of "50,000 years old" for a carbon date essentially means "this sample is between 50,000 and infinite years old"). In the first case, the noise is much larger than the signal, and in the second case the signal is much smaller than the noise (these are slightly different, think about it for a sec.).
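The range limit can be seen with a few lines of arithmetic. The sketch below assumes simple exponential decay and a C-14 half-life of 5,730 years (a commonly cited value, not a number taken from this thread); the function and constant names are made up for illustration.

```python
def fraction_remaining(age_years: float, half_life_years: float) -> float:
    """Fraction of the original isotope left after age_years of simple decay."""
    return 0.5 ** (age_years / half_life_years)


C14_HALF_LIFE = 5730.0  # years; commonly cited value, assumed here

# Show how quickly the measurable signal shrinks with sample age.
for age in (5_730, 20_000, 50_000):
    print(age, fraction_remaining(age, C14_HALF_LIFE))
```

By 50,000 years less than 1 percent of the original C-14 is left, so small measurement errors or traces of contamination swamp the signal.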
With molecular sequences the same factors must be taken into account. I don't currently have access to Hunter's cited Balter (1997), "Morphologists learn to live with molecular upstarts", but I would note that there is apparently a contrasting commentary (Mindell 1997) on that very article from the next month of Science, entitled '"Misleading" molecules?'. Probably the basic point is that the particular mtDNA sequences being used evolve too quickly (certain mtDNA sequences are, after all, used for tracing migration patterns within the human species), such that sequence similarity is low and therefore "noise" in the form of mutational biases is larger than the signal. Certainly comparing chickens, amphibians, and fish is a long way from what one normally sees mtDNA used for, e.g. species within a genus.
(Note in passing: not all mtDNA within a mitochondrion is the same. It's possible that the above study used a very slowly evolving mtDNA sequence and similarity between e.g. birds and fish was high, e.g. >75%. But I doubt it. Let's get the Balter and Mindell articles and see what they say, shall we?)
In summary, anytime one sees a cited "incongruence", one must consider whether the dataset is appropriate for the scale of the analysis. If sequence similarity is approaching randomness then mutational biases are increasingly important to consider.
3) Actual violation of lineal descent. This is commonly the case for single-celled prokaryotes without protected germline DNA. If you like, the tree hypothesis has been falsified for them, because it is known and has been observed in the lab that they can trade DNA laterally. But this leaves the evidence for the common descent of e.g. all animals unquestioned. Much more can be said here because LGT is itself a nonrandom process and certainly some things are harder to LGT than others, but this is another topic. If we saw the kinds of disagreements in animals that we have in prokaryotes, then, as we have no mechanism for significant LGT in animals (viral transfer is about it, I think), this would be a significant problem for the common descent theory. But we don't. "Disagreements" that I have seen cited for multicellular critters basically fall into the above categories.
In summary, in answer to Hunter's question,
quote:
My point is not to say explanatory mechanisms are out of bounds or that complicating factors should not be expected, but merely to raise the question: At what point does the use of these explanatory mechanisms become ad hoc and do we consider the Step 1 in the syllogism falsified?
...basically, these explanatory mechanisms are allowed when they themselves are well-supported by available data. We can measure mtDNA rates of change and mutational biases. We can observe and explain why LGT occurs in prokaryotes but not in mammals. We can measure the degree of disagreement between trees and determine if the error is equivalent to 100 million years/4.6 billion years or not.
There is a massive literature on all of this, which is why I'm surprised that Hunter thinks that biologists haven't thought about it. The best introduction to it all is Theobald's FAQ at the TalkOrigins Archive, referenced below. It references a lot of articles with titles like "Testing Common Descent" about the probabilities of hitting on congruent trees by chance.
Refs:
Theobald, Doug. 2002. 29 Evidences for Macroevolution
Teeling, Emma C. et al. 2002 Microbat paraphyly and the convergent evolution of a key innovation in Old World rhinolophoid microbats Proc. Natl. Acad. Sci. USA, Vol. 99, Issue 3, 1431-1436.
(bold added below)
quote:
Molecular phylogenies challenge the view that bats belong to the superordinal group Archonta, which also includes primates, tree shrews, and flying lemurs. Some molecular studies also challenge microbat monophyly and instead support an alliance between megabats and representative rhinolophoid microbats from the families Rhinolophidae (horseshoe bats, Old World leaf-nosed bats) and Megadermatidae (false vampire bats). Another molecular study ostensibly contradicts these results and supports traditional microbat monophyly, inclusive of representative rhinolophoids from the family Nycteridae (slit-faced bats). Resolution of the microbat paraphyly/monophyly issue is essential for reconstructing the temporal sequence and deployment of morphological character state changes associated with flight and echolocation in bats. If microbats are paraphyletic, then laryngeal echolocation either evolved more than once in different microbats or was lost in megabats after evolving in the ancestor of all living bats. To examine these issues, we used a 7.1-kb nuclear data set for nine outgroups and twenty bats, including representatives of all rhinolophoid families. Phylogenetic analyses and statistical tests rejected both Archonta and microbat monophyly. Instead, bats are in the superorder Laurasiatheria and microbats are paraphyletic. Further, the superfamily Rhinolophoidea is polyphyletic. The rhinolophoid families Rhinolophidae and Megadermatidae belong to the suborder Yinpterochiroptera along with rhinopomatids and megabats. The rhinolophoid family Nycteridae belongs to the suborder Yangochiroptera along with vespertilionoids, noctilionoids, and emballonuroids. These results resolve the apparent conflict between previous molecular studies that sampled different rhinolophoid families. An important implication of rhinolophoid polyphyly is independent evolution of key anatomical innovations associated with the nasal-emission of echolocation pulses.
[ 24. December 2002, 00:42: Message edited by: yersinia ]
|
|
John Bracht
Member
Member # 5
|
posted 24. December 2002 01:18
Hi everyone,
As I was wandering through a display of daguerreotypes at the Nelson-Atkins Museum of Art in Kansas City, MO, last year, I had something of an epiphany when I read the statement "form follows function" used to describe the technological evolution of the camera (the statement was actually painted on a wall and spotlighted for emphasis). This design principle seems highly relevant to a study of protein sequences. The "form" of a protein (i.e., its 3-dimensional shape) must certainly be tightly fitted to its function, since the form of the protein determines its function. Furthermore, this form of a protein is determined by its 1-dimensional amino acid sequence (or the 1-dimensional sequence of nucleotides in DNA, if you prefer). So the amino acid (or DNA) sequence of a given protein should be, on this design principle, highly fitted to its function.
What is the point of all this? Simply this: I suggest that perhaps phylogenies of proteins should be viewed as indicating not descent, but functions. They're not descent trees, they're function trees. In other words, bats cluster together on a tree because their proteins have more similar functions relative to each other than to other species' proteins. The bat clade is actually a "bat function clade". Put differently, a design standpoint would suggest that when species cluster on a tree, their proteins share very similar (if not identical) form, and hence similar (or identical) function. The upshot is that the different "clades" of a phylogeny may reflect the similar environmental conditions to which a given biochemical system is adapted, rather than descent.
Lateral transfers? Human designers make "lateral idea transfers" all the time (just consider adding computers to cars).
Of course, (and just to acknowledge the sure objection from the critics), these ideas are not applicable to non-functional sequences, since there presumably is no function for the form to "fit" (this is really off-topic here, but I have found that it is difficult to say with certainty that a biological sequence is non-functional, though there are a few examples that seem pretty solid).
If I were to be so bold as to propose a positive interpretation of molecular phylogenies (at least the ones from functional sequences), I'd propose that they are "function trees" rather than "descent trees."
Anyone want to take a stab at suggesting how a "function tree" might be distinguished empirically from a "descent tree"? My mind is too tired to take that one on tonight.
John
[ 24. December 2002, 01:34: Message edited by: John Bracht ]
|
|
Cornelius G. Hunter
Member
Member # 81
|
posted 24. December 2002 01:53
Frances has raised the question of what similarities and differences we should expect of designed objects. Or, in other words, what might a phylogeny look like if we were to apply this evolution perspective to such a set of designed objects?
I think there is much to consider here and I will not attempt to provide a complete response, but I do have a few thoughts. First, I think it is obvious that we would have to have some knowledge of the object functionality map in design space, and I think it is obvious that enormous portions of the design space will yield no function as the design is meaningless (eg, Tab A doesn't fit into Slot B).
Most importantly for our present purposes, we'd need to know how different design parameters, or decisions, are correlated. Again, I think it is obvious that there would be significant correlations. For example, in aircraft design, low-speed aerodynamics is usually correlated with piston engines while super-sonic aerodynamics is usually correlated with turbofan engines. Therefore, we would, to a certain extent, expect to find some character/species sets forming statistically significant phylogenies.
On the other hand, it should be clear that there can be component swapping in designed objects. Again using an example from transportation, aircraft can have wheels or pontoons for landing gear. You can think of many more examples and these characters would violate the phylogenetic topology. From an evolution perspective, they would be modeled as convergent evolution.
In addition to component swapping there is the question of function-to-design degeneracy and vice versa. In other words, can different functions be accomplished with similar designs, and can a common function be accomplished with different designs? I think there is degeneracy in both directions, but it is more restricted in the former. That is, perhaps multiple functions are supported by a single *type of* design but not likely by the precise same design. For example, an earth-mover and an economy car might both have four wheels, piston engine powered by gasoline, transmission, driver's seat and wheel, etc., but there are many differences and the designs are obviously not highly similar.
Also, there is the possibility of double degeneracy. That is, a function that is supported by multiple designs, with one or more of those designs supporting other functions as well.
This existence of such degeneracies (especially the 1 function to many designs degeneracy) suggests the existence of arbitrary design decisions. I think it is fair to say that evolution has taken a strong position on this. That is, that the existence of arbitrary design is ubiquitous in biology. I would like to make two comments on this:
1) Arbitrary with respect to (wrt) what?
Evolutionists tend to focus exclusively on function as the sole design criterion. But ID need not be so restrictive. For example, the color of a car is probably neutral and therefore arbitrary wrt performance. But it certainly is not an arbitrary design decision for auto designers.
2) Is arbitrary design wrt function ubiquitous in biology?
It is not clear to me that arbitrary design wrt function is nearly as ubiquitous in biology as evolutionists imply that it is. A popular example is protein sequences, which are degenerate to structure and function. But when we say this we refer to the classic definition of function -- what we may call proximal function. There could be other, more subtle reasons for the protein sequence. Just off the top of my head, proteins could serve as amino acid storage units in addition to their classic function. We are so far from a complete understanding of the cell that I think it is premature to say that protein sequences are significantly arbitrary. I note that in the past century or so, many designs that were thought to be functionless were later found to have function.
So, Frances, there are some thoughts on how design similarities and differences might bear on ID. To summarize,
a) I don't think there is much doubt that ID predicts correlated design decisions, and therefore congruent phylogenies, to put it into evolutionary terms.
b) ID also predicts the feasibility of component swapping, or in evolutionary terms, convergent evolution.
c) Design-function degeneracy is a complicating factor.
Obviously, there is much more work that can be done in these areas, but I would not be surprised if it turns out that ID is far more specific in its predictions than evolution. It seems to me that evolution can allow for a substantial range of results, and this is why I brought this up in my first post.
So, in addition to this ID discussion, I would like also to direct attention back to my first post and again ask the question: what theoretical bounds exist, if any, regarding evolution's predictions about the congruence of phylogenies? How many violations, and of what type and what degree, constitute failure of evolution?
--Cornelius
|
|
Mike Gene
Member
Member # 149
|
posted 24. December 2002 02:00
John: Anyone want to take a stab at suggesting how a "function tree" might be distinguished empirically from a "descent tree"? My mind is too tired to take that one on tonight.
Think of cytochrome c. The sequences that confer the same basic function are quite numerous. And cytochrome c from one species can function in another species (as seen by the lateral transfer of fungal cytochrome c into a higher plant). Thus, function alone doesn't seem to predict any coherent tree for this protein. Yet a pattern exists that is explained by descent.
[ 24. December 2002, 02:02: Message edited by: Mike Gene ]
|
|
Cornelius G. Hunter
Member
Member # 81
|
posted 24. December 2002 02:49
I'm short on time, but wanted to quickly reply to some of Yersinia's points.
1) "Congruence" and "noncongruence" are not either/or entities
True, but they sometimes are. You write: "the interrelationships of different groups of bats are a pretty trivial issue in the context of vertebrates or animalia. If the microbats grouped most closely with anthropoid apes, and the megabats with giraffes, then we'd have a significant disagreement." This tells me that evolution is quite flexible and can sustain a tremendous quantity of phylogenetic anomaly. As I suggested, this is what I suspected. While you say this bat example is pretty trivial, it has the consequence of causing echolocation to have evolved twice. So here we have an example to which we can not only apply the usual statistical measures of significance and confidence, but in which we can also see a significant evolutionary consequence.
Of course, with all these examples, we can derive the necessary anomalies (eg, #LGTs required). This would be another way of judging the theory.
You next make an analogy to radiometric dating and conclude, in reference to the bat study, "A real disagreement would occur if all of these different bat species did not group together and instead were randomly associated with the outgroup taxa, but as we can see this did not occur: The odds of all these bat species grouping together by chance are astronomical."
Your use of a random distribution as your null hypothesis tells me, again, that evolution is tremendously flexible. It allows you to make the variations appear vanishingly small, on par with the typical noise found in radiometric dating. This null hypothesis is so generous it would make a great many theories appear solid. I could give many humorous examples, but you get the point.
Also, consider the hypothetical world where on a few rare occasions rocks were observed to fall at half their normal acceleration. With the random null hypothesis a physicist could easily dispose of such measurements as being "in the noise" compared to all the other consistent measurements. Compared to the null hypothesis, his 1/r^2 model would look solid.
2) Scale of the study and range of dataset
I unfortunately cannot locate my copy of the Balter article at the moment on the mtDNA anomalies.
3) Actual violation of lineal descent
Yes, of course LGT is more plausible in bacteria, but the extent required to explain these results from these disparate organisms is far beyond what was imagined. We now need "massive LGT" to explain the data.
Finally, in your concluding comments you write: "There is a massive literature on all of this, which is why I'm surprised that Hunter thinks that biologists haven't thought about it."
I am familiar with a good bit of the literature, but by no means all of it. What I normally see addressed are questions such as: what is the statistical significance of a given phylogeny? how do different phylogeny results compare? how do the different numerical methods compare? etc. The papers I'm familiar with tend to be operating from within the paradigm (ie, "normal science" to use Kuhn's language) rather than asking the question I asked. It is one thing to question a particular result; it is another to question the overarching theory. In fact, I've only seen one paper that asked this sort of question (Penny, 1982 I believe), and if I'm not mistaken we should now abandon evolution according to that paper.
Finally, you mentioned papers such as: "articles with titles like 'Testing Common Descent' about the probabilities of hitting on congruent trees by chance." Again, as I mentioned above, I don't see how this approach gets to the issue I'm raising. I see it essentially as normal science.
--Cornelius
|
|
yersinia
Member
Member # 324
|
posted 24. December 2002 04:34
quote:
Your use of a random distribution as your null hypothesis tells me, again, that evolution is tremendously flexible. It allows you to make the variations appear vanishingly small, on par with the typical noise found in radiometric dating.
Dare I note that "this occurred by chance" is a standard null hypothesis practically anywhere anyone does statistics?
The variations in bat phylogeny are vanishingly small in comparison to what is possible. While in radioactive dating there are (being generous) only a few hundred billion years to choose from, the variability possible in comparing 4 (IIRC) gene sequences shared between 29 species is immense. Theobald's FAQ says that for only 26 species there are 5.8 x 10^31 possible trees, yet the number of trees in which all the bats fall on the same branch has got to be a tiny subset (and note that other common results are found, e.g. sisterhood of cows and whales). For 29 sequences the same count of rooted trees comes to roughly 9 x 10^36 possible trees.
Hunter alludes to the random chance hypothesis being "too generous" (even though this is standard in medicine, court DNA tests, etc.), but he suggests no alternative hypothesis.
One suggestion that is very commonly made (as it has been by John Bracht here) is that perhaps the sequences map to function and therefore the similarity in sequences can be explained by design for common functions. Mike Gene pointed out the basic problem with this; to elaborate a bit:
1. The genes that were used (IIRC) serve the same basic biochemical functions in all the species. These kinds of genes are the typical genes used for phylogenies (probably because you find them in one species and they are then easy to find in all the others). They may well be interchangeable, and actual experiments, like using vertebrate cytochrome c sequences to replace the native sequence in bacteria (and even yeast, IIRC), have proven this logic in some cases.
2. Many DNA sequences yield the same aa sequence (degeneracy, as Hunter mentions)
3. Many aa sequences yield the same protein structure & function (more degeneracy, as Hunter mentions)
4. Different protein structures can, at least sometimes, serve the same function. A good brief page with examples is here:
Convergence
To sum up, there is no reason to think that "sequence maps to function" in any kind of 1-to-1 way. Rather it appears there is a kind of pyramid, with lots of DNA sequences at the bottom, which map to many aa sequences, which map to at least several 3-D structures, which map to a function (excluding all the other functions the same 3-D structure may be used for, another complication).
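The bottom layer of that pyramid is easy to put numbers on. The sketch below assumes the standard genetic code (NCBI translation table 1) and counts how many distinct DNA sequences encode one given amino acid sequence. The peptide used is a made-up example, not a real protein, and the function name is my own.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons encode each amino acid (for example, 6 for leucine).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def encoding_count(peptide: str) -> int:
    """Number of distinct DNA sequences that translate to this peptide."""
    for aa in peptide:
        if aa not in CODONS_PER_AA or aa == "*":
            raise ValueError(f"not a standard amino acid letter: {aa!r}")
    return prod(CODONS_PER_AA[aa] for aa in peptide)


# Hypothetical five-residue peptide, used only for illustration.
example = "MKVLA"
print(example, "can be encoded by", encoding_count(example), "DNA sequences")
```

The count grows multiplicatively with length, so for a protein of a hundred or more residues the number of synonymous DNA sequences is enormous, and that is before counting the amino acid substitutions that leave structure and function intact.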
I should also clarify that common descent makes a stronger prediction than "non randomly similar trees", rather it says that independent datasets will produce (highly statistically significant) congruent trees. The odds of getting a "nonrandom" congruence by chance between trees are about 5% (the usual cutoff for statistical significance in many fields is p-value = 0.05). The p-value for getting all the bats on the same branch over multiple sequences can be calculated, and (not having done the calculation myself) I would predict the p-value of such a result happening by chance is p=1x10^-10 or probably much lower.
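A rough version of that calculation can be sketched as follows. The null model here is my assumption: every rooted bifurcating tree is equally likely, and the genes are treated as independent. The number of bat species is a placeholder (I do not have the study's count to hand), and 29 species and 4 genes are the from-memory figures mentioned above. Under that null, a fixed group of k species forms a single branch in R(k) x R(n - k + 1) of the R(n) possible trees, where R(m) = (2m - 3)!!.

```python
from fractions import Fraction


def rooted_tree_count(n_taxa: int) -> int:
    """Count rooted, bifurcating, labeled trees on n_taxa tips: (2n - 3)!!."""
    count = 1
    for k in range(2, n_taxa + 1):
        count *= 2 * k - 3
    return count


def clade_probability(n_taxa: int, clade_size: int) -> Fraction:
    """Chance that a fixed group of clade_size taxa forms one clade
    in a tree drawn uniformly from all rooted bifurcating trees."""
    if not 1 <= clade_size <= n_taxa:
        raise ValueError("clade_size must be between 1 and n_taxa")
    # Build any tree on the group, collapse it to a single tip, then build
    # any tree on the remaining taxa plus that collapsed tip.
    favourable = (rooted_tree_count(clade_size)
                  * rooted_tree_count(n_taxa - clade_size + 1))
    return Fraction(favourable, rooted_tree_count(n_taxa))


N_TAXA = 29   # species in the data set, as recalled in this thread
N_BATS = 6    # placeholder value, NOT taken from the study
N_GENES = 4   # gene sequences, as recalled in this thread

p_one_gene = clade_probability(N_TAXA, N_BATS)
p_all_genes = p_one_gene ** N_GENES  # assumes the genes are independent

print(f"one gene:  {float(p_one_gene):.3g}")
print(f"all genes: {float(p_all_genes):.3g}")
```

This only addresses one clade under one null model; a proper test of congruence between whole trees would use a tree-comparison statistic, but it gives a feel for how quickly the chance probability shrinks.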
(friggin' edits; note that the "less than" bracket is treated as evil HTML by UBB software) [ 24. December 2002, 04:39: Message edited by: yersinia ]
|
|
Cornelius G. Hunter
Member
Member # 81
|
posted 24. December 2002 13:07
Yersinia has responded with a condescending statement as a means of making his point: "Dare I note that 'this occurred by chance' is a standard null hypothesis practically anywhere anyone does statistics?"
For the record let me say I do not see the choice of null hypothesis as nearly so straightforward and uncontroversial as Yersinia does. Nor do I think "one size fits all" type rules for null hypothesis selection are appropriate. In fact, it seems to me that in a great many more cases than people are aware of, defining the null hypothesis is tantamount to defining the truth. For he who defines the null hypothesis, or more generally the conditions of testing, defines the alternatives. This constitutes tremendous power and I have seen cases where, essentially by default, this power is given to the developer of the idea being tested.
In this present case I think random design is not appropriate for testing the theory of evolution. Again, this is a case where the null hypothesis has been selected by those promoting the idea, and with it they wield tremendous rhetorical power, for it allows them to frame the debate from the beginning.
As Yersinia writes: "The variations in bat phylogeny are vanishingly small in comparison to what is possible." Yes, this is precisely my point. Yersinia also objects that I have not presented an "alternative hypothesis" and that "there is no reason to think that sequence maps to function in any kind of 1-to-1 way."
He might want to refer to my previous post in response to Francis as I addressed both these points. First, I presented ID as an alternative hypothesis which clearly does not predict random design (I gave aircraft aerodynamic and propulsion component design examples). I also addressed the question of sequence to function mapping, and the nature of the degeneracy of that map.
On the one hand, I questioned the strong view which evolutionists take (ie, arbitrary design decisions are ubiquitous in biology because there is massive design-to-function degeneracy). There is an unfortunate history in evolutionary thinking of rushing to judgement in the face of tremendous uncertainty, and I suspect this may be another example. I think it is clear that this is driven by the evolutionary paradigm. For example, when Robert Wiedersheim found nearly 100 vestigial organs in the human body about a century ago, it was because he was an evolutionist and therefore was expecting to make such findings. Today we know of functions for essentially all of his claims. I discussed the very example Yersinia brings up, protein sequences. I would be surprised if future research does not, as with Wiedersheim's examples, find functional reasons for sequence variations in homologous proteins.
On the other hand, I noted that evolutionists focus essentially exclusively on function as the sole design criterion. Again, this obviously relates to their paradigm. Unfortunately, they seem unsympathetic to alternative explanations which invoke, at least potentially, other design criteria. I used the color of an auto as an example.
So to summarize, evolutionists justify their use of the random-design null hypothesis on the assumption that arbitrary design decisions are ubiquitous in biology. It is quite possible that this assumption is faulty, for it (i) assumes knowledge about the nature of the design-function mapping beyond our present state, and (ii) assumes complete knowledge of the design cost function (ie, organism function is the only term) and as such presumes to define the alternative theory (ID), thus framing the debate.
I take the random-design null hypothesis not to be exclusive of evolution but rather to be a fundamental tenet of the theory. As such its use is legitimate. But it provides independent evidence for evolution only to the extent that it can be justified without presupposing evolution to be true.
For my purposes in this thread, the use of the random-design null hypothesis means that evolution is highly flexible and can sustain a wide range of phylogenetic results.
|
|
Argon
Member
Member # 276
|
posted 24. December 2002 13:13
I am combining responses from several writers in this post. Snippets mostly and I do apologize in advance if the quotes are taken out of context. I've tried to keep to the theme of this thread's topic.
John Bracht writes: quote: [...] What is the point of all this? Simply this: I suggest that perhaps phylogenies of proteins should be viewed as indicating not descent, but functions. They're not descent trees, they're function trees. In other words, bats cluster together on a tree because their proteins have more similar functions relative to each other than other species' proteins. The bat clade is actually a "bat function clade". In other words, it seems like a design standpoint would suggest that clusters of species on a tree would suggest that these proteins share very similar (if not identical) form, and hence similar (or identical) function. The upshot is that the different "clades" of a phylogeny may reflect the similar environmental conditions to which a given biochemical system is adapted, rather than descent. [...]
John, what you've suggested has been considered and generally discarded years ago. There are numerous examples of protein families with members that have differing functions. Similarly, there are examples of proteins with distinct structures that have similar (convergent) functions. Mike Gene mentions that variants of cytC can operate reasonably well in multiple species. The same is true for many other proteins that have been "crossed into" recipient organisms for which the native versions had been previously knocked out. While "function" can provide a first-pass guesstimate about relatedness, it is still the presumed time since divergence that is the best, overall correlating factor for molecular similarities.
Comparisons of protein folds have also been considered fairly reliable markers for 'deep' studies of molecular relatedness (There are some examples of convergence at this level as well, but the method probably works for the majority of comparisons).
John also writes: quote: [...] Lateral transfers? Human designers are known for their "lateral idea transfers" all the time (just consider adding computers to cars). [...]
The issue was not whether "designers can do that too" but whether horizontal transfer is a known, observed, *natural* mechanism. In fact, it is an observed phenomenon & it is particularly nasty in bacteria. In E. coli, it is estimated the species could have sampled roughly 1-2 megabases of horizontally-transferred DNA over the course of about 100 million years (JG Lawrence & H Ochman, PNAS paper). Horizontal transfer has been known to occur, though to a far lesser extent, in multicellular eukaryotes as well. So we should not be too surprised to see examples of phylogenetic reconstructions using relatively few molecular characters that sometimes produce inconsistent results.
Cornelius G. Hunter writes: quote: [...] I think there is much to consider here and I will not attempt to provide a complete response, but I do have a few thoughts. First, I think it is obvious that we would have to have some knowledge of the object functionality map in design space, and I think it is obvious that enormous portions of the design space will yield no function as the design is meaningless (e.g., Tab A doesn't fit into Slot B). [...]
I can agree with this. However, it is generally accepted that the sequence space of functional proteins far exceeds the space that may be sampled by common descent via modification of functional intermediates. That is to say that a designer with sufficiently advanced technology would not be constrained from implementing very different designs in morphologically similar (or identical) organisms. The DNA sequences encoding proteins are under even fewer constraints -- If we were to hypothesize that a designer created each 'class' of organisms in distinct events (i.e. not descent with modification), then there is essentially no reason to presume that the separately created organisms should map to fairly consistent trees derived from comparative morphologies, molecular comparisons (DNA, polypeptide sequences and protein structural classes), and tracking over time. Thus things tend to hang together in morphology space, multiple sequence spaces and temporal 'space'(?). Functional commonality is an insufficient explanation for the phylogenetic trees observed.
I'd like to add one general comment: Common descent with modification is a mechanism available to a designer. The evidence for common descent is in fact accepted by many of the ID biologists who have had the most impact on the current design movement, including: Mike Gene, Mike Behe, and Michael Denton. Perhaps this is why Bill Dembski has floated ideas such as the intelligent choosing of mutational probabilities to guide evolutionary pathways. Common descent does not necessarily imply naturalistic evolution, although the general acceptance of common descent does provide a useful, common, starting point for discussions between ID and non-ID theorists, IMO. That would mean that everyone would at least be reading from the same page, so to speak.
Ok, one more comment: The issues discussed by CG Hunter in his first post are not unfamiliar to most of the labs studying molecular evolution (or taxonomy in general). Issues of how to rank or weight various characters and how the characters may have been affected over the course of evolution have been vigorously discussed in the literature. The debates still continue and I encourage interested parties to investigate what is available in publications rather than guessing about the motives and methods employed by the researchers in the field.
|
|
Cornelius G. Hunter
Member
Member # 81
|
posted 24. December 2002 14:11
Can anyone tell me how to do quotes in this editor?
Argon is also bringing up the issue of design-function degeneracy:
"If we were to hypothesize that a designer created each 'class' of organisms in distinct events (i.e. not descent with modification), then there is essentially no reason to presume that the separately created organisms should map to fairly consistent trees derived from comparative morphologies, molecular comparisons (DNA, polypeptide sequences and protein structural classes), and tracking over time. Thus things tend to hang together in morphology space, multiple sequence spaces and temporal 'space'(?). Functional commonality is an insufficient explanation for the phylogenetic trees observed."
As I discussed in my earlier posts, this is the strong evolution view on degeneracy. I will not repeat my earlier comments on this. I would like to add that, though any sort of comprehensive mapping out of the function-to-design space seems like a staggering task which is beyond our current level of technology, nonetheless, the strong view can be tested using a genetic engineering experiment that could conceivably be done in the near term. Here is the experiment: For all the major protein families, transplant the protein sequence, for each protein, from a randomly selected species into a given test individual. For example, the genotype of a frog could be altered by swapping in the hemoglobin from a horse, lysozyme from a lamb, cytochrome c from a cow, and so forth. ID, in my opinion, would predict that the frog's performance would degrade as the number of transplants increased. Evolution would predict no appreciable change in performance.
Argon makes a general comment regarding common descent, and even though it strays from the original focus of this thread I think it is helpful and appropriate. He writes:
"Common descent with modification is a mechanism available to a designer. The evidence for common descent is in fact accepted by many of the ID biologists who have had the most impact on the current design movement ..."
No doubt, common descent could be a mechanism used by a designer. At issue, from my perspective, are the tremendous evidential problems with common descent. In my opinion all of the evidences that Darwinists have set forth over the years are weak from a scientific perspective, and when all the evidence which we have is considered together it argues against common descent.
--Cornelius
|
|
John Bracht
Member
Member # 5
|
posted 24. December 2002 14:38
Hi Mike, everyone,
Thanks for the comments. I guess I didn't make my earlier post very clear about what I meant by "function". I am not referring to a basic function, but to something more subtle: the "tweaking" of a protein to most optimally perform its function in its given environment. Obviously, proteins can be transferred between organisms and often perform fairly well. But it will usually not perform optimally, as the native protein would. As Argon noted, "variants of cytC can operate reasonably well in multiple species"--but presumably not optimally.
Perhaps an example would help. Let's return to the cytochrome C example in bats. I am sure bats have very specialized metabolic requirements given their active nocturnal lifestyle. It makes sense to me that their cytochrome C might be optimized to transfer electrons more quickly (though maybe less efficiently overall, if there's some sort of trade-off) in order to boost metabolic rate. Perhaps the engineering trade-offs are shifted in one direction versus the other given the needs of the organism and its lifestyle. Other organisms might have a greater need for efficiency of electron transfer and their cytochrome C molecules might be fine-tuned for that end, at the expense of speed (again, I'm just speculating here). I'm sure there are other engineering constraints and trade-offs that could be tweaked to fit an organism to its environment, though it might be something that requires some careful investigation to really tease out. As Mike Gene often points out, the evidence may not be black and white, but there may be subtle hints that suggest differently optimized proteins across phylogenetic lines. And it seems to me that natural selection would be an ideal mechanism to achieve this fine-tuning of organisms' DNA (hence, protein) sequences to optimally fit the environment.
With regards to the distinction between "descent" trees vs. "function" trees, I think it is not a black-and-white distinction. The two may in fact be the same--a tree of descent that will also reflect the specialized functions that various proteins are optimized to perform. However, I just want to make the point that what a phylogenetic tree shows, fundamentally, are functional relationships, not necessarily descent relationships (though they may show both). I'm bringing a slightly different perspective that emphasizes the fact that the phylogenetic patterns of life aren't a prediction exclusively of (non-teleological) Darwinian theory, but also fit well within a design framework (whether it incorporates common descent or not).
Hope that helps.
John
|
|
Argon
Member
Member # 276
|
posted 24. December 2002 16:21
Hello. Cornelius G. Hunter writes: quote: [...] As I discussed in my earlier posts, this is the strong evolution view on degeneracy. I will not repeat my earlier comments on this. I would like to add that, though any sort of comprehensive mapping out of the function-to-design space seems like a staggering task which is beyond our current level of technology, nonetheless, the strong view can be tested using a genetic engineering experiment that could conceivably be done in the near term. Here is the experiment: For all the major protein families, transplant the protein sequence, for each protein, from a randomly selected species into a given test individual. For example, the genotype of a frog could be altered by swapping in the hemoglobin from a horse, lysozyme from a lamb, cytochrome c from a cow, and so forth. ID, in my opinion, would predict that the frog's performance would degrade as the number of transplants increased. Evolution would predict no appreciable change in performance. [...]
Respectfully, I don't agree with the predictions. Without reference to a particular mode of design implementation and planning I do not understand how one prediction would be favored over another. As for the notion of evolution 'predicting no appreciable change in performance', I also cannot see the justification for that conclusion.
For starters, both the 'fine-tuning' ID and naturalistic evolutionary schemes would 'predict' some degree of variations in performances between swapped parts. That is because both assume that interactions between components are optimized to various amounts. WRT evolutionary optimization, see John Bracht's reply above.
With regard to ID, I could make the completely opposite prediction that swapping large components would have little effect on function. Remember John Bracht mentioned that human designers are noted for lateral transfers. If a biotic designer employed similar practices, then it might be practical to make similar components interchangeable. After all, once a functional module is solved in one organism, it could be simple to redeploy the same module in later creations. In fact human design employs a large degree of component standardization and direct interchangeability.
Now about the test you've proposed: Unfortunately, as simple as such tests may appear, they do not actually address the question of how many active proteins with the same function exist in "sequence space". That is because it only searches the tiny subset of extant, available proteins. Protein engineering does extend the range a tiny bit but even in many of these cases, active distinct proteins have been found.
However, this has to some degree already been performed over the course of about 25-30 years of molecular genetic engineering. A classic strategy for cloning, expressing & characterizing genes from other organisms is to transfer the sequences to a vector and introduce it into organisms like E. coli and yeast -- ideally, strains for which the native version of the gene was knocked out. Many of these cloned, foreign genes complemented their bacterial or eukaryotic counterparts just fine. Was the complementation perfect in every case? Of course not, but many did quite well.
There was even a case published by Mike Behe about the modification of a highly conserved series of residues on a histone protein. This region of the protein was tightly conserved - essentially identical in many distantly related species. Because of the conservation, it was assumed that this particular sequence had an essential role in histone function. Mike altered this 'invariant' region and returned the clone into yeast. It worked fine and had no obvious growth defects. Other organisms carrying these variations were later found.
Even more experiments of this sort have been performed at the DNA sequence level. Note that phylogenetic patterns aren't just limited to amino acid sequences but can also be established with DNA sequences where the function mapping is far less tight. The trees also tend to overlap each other. There have been many experiments in which sequences have been changed to introduce new restriction enzyme sites or to alter the length of proteins. Sometimes this was to add markers like β-galactosidase, maltose binding protein, or His-tags to the recombinant protein. In many, if not most of these cases, fully functional enzymes were expressed that worked in the cells.
The basic conclusion is that although the exact numbers of functional sequences are unknown, such a count is not actually necessary. What is important are the findings 1) That the number of alternative sequences appears to exceed the actual number of sequences that are found in any particular species, and 2) That the alternate functional sequences are sufficiently different to significantly change the determined phylogenetic tree.
And later, Cornelius G. Hunter writes: quote: [...] No doubt, common descent could be a mechanism used by a designer. At issue, from my perspective, are the tremendous evidential problems with common descent. In my opinion all of the evidences that Darwinists have set forth over the years are weak from a scientific perspective, and when all the evidence which we have is considered together it argues against common descent.[...]
I encourage careful use of wording. "Darwinists" are not the only ones who support the notion of common descent. Many ID correspondents (especially, many of the biologists) on this board also support common descent; their issues are with "Darwinism" as a sufficient mechanism, not common descent with modification as a mode of design implementation. There are also people who support various degrees of 'direct introduction' combined with the subsequent modification of introduced forms. Obviously, there is a difference of opinion among people about evidence that supports a continuum of ideas about the extent of common descent.
Personally, I think all the evidence, when considered together, points to the common descent of eukaryotes and some form of relationship with archaebacteria and eubacteria that is somewhat muddled with what could be large amounts of horizontal transfer. There appear to be many clear phylogenies in bacteria although the actual root -- if there is one -- is quite obscure. But YMMV. [ 24. December 2002, 16:40: Message edited by: Argon ]
|
|
Argon
Member
Member # 276
|
posted 24. December 2002 16:39
Oops... Sorry, I'm exceeding the reasonable limit of posts per day...
John Bracht writes: quote: [...] I'm bringing a slightly different perspective that emphasizes the fact that the phylogenetic patterns of life aren't a prediction exclusively of (non-teleological) Darwinian theory, but also fit well within a design framework (whether it incorporates common descent or not).[...]
There is no doubt that a design framework can be fit to patterns of nested hierarchies. But what do our observations inform us about the likely mode of design on this particular planet?
A design framework that doesn't incorporate common descent has a difficult time explaining the relationships. Remember, morphologies, protein structure, protein sequences, DNA sequences and *time since separation/'creation'* all tend to produce overlapping trees. The 'time since separation' aspect is particularly difficult to square without descent + modification as a chief mode of design. The overlapping trees are also difficult to harmonize without common descent.
RE: The optimality of cloned cytC in other organisms...
That's a hard case to make. It is possible that current versions of cytochrome C that bats carry are not actually optimal for all bats -- at least not in all conditions. In cultures of cells, one can sometimes tweak the conditions to favor one variation over another. The classical work in bacterial genetics from the 1940's to the 1960's exploited these methods quite ingeniously. I agree that natural selection can optimize sequences but in many cases, I think it would be difficult to make a case that a sufficiently optimized sequence would consistently map to the same portion of a phylogenetic tree. Selection could also optimize a phylogenetically distinct sequence. And then there is the question of how optimal is enough? Often, optimization is related to inflexibility.
Where the heck did the thread about chemostats go? A perfect means of testing optimality, if only under a highly defined, limited range of conditions. Hmm... They are easy to make. [ 24. December 2002, 16:40: Message edited by: Argon ]
|
|
Cornelius G. Hunter
Member
Member # 81
|
posted 24. December 2002 17:43
Argon wrote: "Oops... Sorry, I'm exceeding the reasonable limit of posts per day..."
Hmm, I didn't know there was a limit. If so, I must be exceeding mine too. By the way, can you tell me how to do quotes? Thanks.
Argon wrote: "Respectfully, I don't agree with the predictions. ... naturalistic evolutionary schemes would 'predict' some degree of variations in performances between swapped parts. That is because [it assumes] that interactions between components are optimized to various amounts."
Argon, I'm having a little bit of difficulty tracking your reasoning. I thought you, earlier, were making the argument that the fitness landscape across the sequences we observe in a protein family was neutral, and that this is explained by evolution but not by ID. Now you seem to be saying the landscape is not neutral, and that selection would play a role in determining the extant sequences we observe. Can you clarify a bit (I don't mind if you write many posts per day so long as they are of good quality)?
Argon wrote: "With regard to ID, I could make the completely opposite prediction that swapping large components would have little effect on function. Remember John Bracht mentioned that human designers are noted for lateral transfers. If a biotic designer employed similar practices, then it might be practical to make similar components interchangeable."
Of course component swapping is feasible; I would use the marsupial-placental convergence as an example. But I would distinguish that from a case where gradual changes are observed. If you have two cars which are slightly different, and the wheel diameter is, likewise, slightly different, this is not an example of component swapping. If you have two similar cars, one with a diesel engine and the other with a gasoline engine, then this is such an example. My feeling is you are not using the component swapping mechanism appropriately when you argue hemoglobin proteins can be swapped. This is not to say they *cannot* be swapped, but it seems to me it is a case of very similar design, so similar that interchange is possible.
Argon wrote: "Unfortunately, as simple as such tests may appear, they do not actually address the question of how many active proteins with the same function exist in 'sequence space'."
I agree, that was not my intent. I presented the experiment as something that would be feasible in the near term, not a panacea to our quandary.
Argon wrote: "There was even a case published by Mike Behe about the modification of a highly conserved series of residues on a histone protein. This region of the protein was tightly conserved - essentially identical in many distantly related species. Because of the conservation, it was assumed that this particular sequence had an essential role in histone function. Mike altered this 'invariant' region and returned the clone into yeast. It worked fine and had no obvious growth defects. Other organisms carrying these variations were later found."
Unfortunately this does not address what my experiment is intending to address. In my experiment, you would transplant designs that are *not* found in nature (ie, a frog with a horse's hemoglobin), and you would repeat, adding a new transplant at each step and evaluating performance.
Argon wrote: "What is important are the findings 1) That the number of alternative sequences appears to exceed the actual number of sequences that are found in any particular species, and 2) That the alternate functional sequences are sufficiently different to significantly change the determined phylogenetic tree."
Much hard work and useful results, no doubt; however, I'm not sure I understand what importance, in particular, you are attaching to them.
Argon wrote: "I encourage careful use of wording. 'Darwinists' are not the only ones who support the notion of common descent."
I appreciate your concern; however, I chose my words carefully. My point was that Darwinists have been at this for many years and have crafted the best arguments possible. I appreciate the fact that there are other interpretations of common descent.
--Cornelius
|
|
|