|
Author
|
Topic: Selection Acting Directly on Genes
|
warren_bergerson
Member
Member # 262
|
posted 23. September 2002 16:22
Evan,
Sorry for the delay in getting back to you. You are asking some very specific and very relevant questions. I am developing a detailed answer which I should have ready tomorrow.
Per,
As near as I can tell, you have your facts on the wrong side of the argument. Does introductory genetics cover calculating selection rates per potential allele?
Francis,
You are, IMO, raising some interesting issues relevant to ‘the theory of knowledge’ but not really relevant to the subject here. If I get some time I might start a thread on the theory of knowledge.
|
|
peroxisome
unregistered
|
posted 23. September 2002 20:39
Hi Warren,
quote: EVIDENCE FOR GSH
The body of evidence supporting GSH appears very clear and unambiguous, and is readily available for verification. The step by step evidence is: {1-7}
I just made the point that step 5 of your chain of "evidence" for GSH is pure nonsense. Specifically that you cannot claim that "mutation rates" and "selection rates" allow you to suggest a clear pattern of mortality or infertility; even if you bothered to say what your mutation and selection rates are. Given that point 5 is blatantly wrong, it follows that points 6 and 7 are unfounded. Therefore, your conclusion- that there are other mechanisms operating- is wrong.
quote: As near as I can tell, you have your facts on the wrong side of the argument. Does introductory genetics cover calculating selection rates per potential allele?
On the contrary, I have specifically addressed a point which you identify as clear and unambiguous evidence for GSH, and shown it to be gibberish.
I think Evan has already made the point that you have been less than clear in what your rates are, and I would agree with him that some of your statements are either imprecise to the point of having no value, or frankly wrong. I look forward to your calculations of the "selection rate per potential allele" for your average hypervariable VNTR.
Finally, I cannot help but notice that there seems to be a recurring trend that some participants seem unwilling to make even the most cursory examination of relevant biological data when making biological hypotheses.
yours per
|
|
Frances
Member
Member # 169
|
posted 24. September 2002 04:39
Warren, I am confused how you can claim that my comments are relevant for "the theory of knowledge". All I am asking for is supporting evidence for your claims. Nothing theoretical about that, just 'simple' supporting evidence. Let's keep it simple.
|
|
warren_bergerson
Member
Member # 262
|
posted 24. September 2002 12:37
Evan,
You asked for specific details on steps 1 to 3 of the analysis used to justify formulating GSH. I assume from your question you are trying to understand the logic underlying steps 1 to 3 and trying to determine if the logic used is valid. A valid and appropriate question to raise with respect to unfamiliar forms of analysis.
From my perspective, and from the perspectives of individuals with a working knowledge of the specific techniques used, steps 1 to 3 describe an obvious and essentially trivial analysis which happens to produce useful results. I interject these comments on perspective because, from my perspective, the issue you raise is not simply ‘explain steps 1 to 3’ but really three questions: 1) Why doesn’t Evan understand the technique used? 2) How can the technique be explained so Evan understands it? And 3) Is Evan really interested in understanding steps 1 to 3?
Given both the specificity of your comments and your attempt to analyze the procedures used, it is reasonable to conclude that you are actually interested in understanding the analysis. All too often the mantra of ‘provide proof, provide proof’ is simply a device to disrupt and avoid discussion of technical issues.
That leaves the question of ‘why don’t they understand?’ or, more specifically, ‘Why, apparently, don’t geneticists understand the basic/elementary technique for quantifying selection rates used in steps 1 to 3?’ Since geneticists are interested in explaining the processes responsible for producing genetic change, and since they recognize that change is due to the interaction of ‘diversity creating processes’ (mutation) and ‘diversity eliminating processes’ (selection), why wouldn’t they be interested in techniques that allowed them to quantify selection rates? The techniques underlying steps 1 to 3 involve ‘some deceptively complex mathematical techniques from the relatively obscure field of actuarial science’. If you have no specific training in what in my day was called ‘measurement of mortality’, then it is not surprising that you personally are not familiar with the technique.
I apologize for bringing in what may seem like irrelevant detail, but you should appreciate that in order to ‘explain something’, you need to form some idea of what the explainees already know and don’t know. Part of the problem with explaining steps 1 to 3 is that, apparently, most people with a background in evolutionary biology are not familiar with actuarial multiple decrement methods. Lack of familiarity with actuarial techniques is, however, only one of the issues that needs to be addressed in explaining steps 1 to 3.
The other part of the problem is that individuals with a background in evolutionary biology have been taught that it is impossible or impractical to create models which simulate and predict the overall genetic change occurring in a genome. Since it is known that genetic change is the result of the interaction of variance and selection, and since we have a reasonable level of knowledge of mutations and mutation rates, the impracticality of simulating genomic change must logically be due to the impossibility/impracticality of measuring selection rates. It therefore seems very likely that you, and many individuals with a background in evolutionary biology, are going to have a hard time understanding and accepting a ‘simple technique for measuring selection rates’, because you have been taught that measuring these rates is impossible or impractical.
[People incapable of looking introspectively at how and what they learn will find this discussion irrelevant (and probably irritating). If, however, you are trying to explain something ‘everybody knows is impossible’, you first need to understand why everybody knows/believes it is impossible. ]
Given this background material, we can now turn to addressing the issue you raise:
EXPLAINING STEPS 1 TO 3
The explanation of steps 1 to 3 is offered here in three parts: 1) structuring the problem, 2) performing the analysis and 3) interpreting the results. Understanding the analysis in steps 1 to 3 involves much more than just ‘seeing the calculations’.
STRUCTURING THE ANALYSIS
There is a tendency by some individuals to believe that all analysis, or all genetic analysis, must be structured using a standard format. In particular, there appears to be a belief that all analysis must be structured in a format compatible with ‘publishing a peer reviewed paper’. Those living in the real world will recognize that ‘publication format’ is usually not the only or even the best format for structuring analysis. At the very least, publication format is not the format used here.
Structuring analysis, IMO, involves ‘working the problem backwards’. In basic terms structuring analysis involves:
1. Identify the issue or hypothesis being addressed.
2. Identify the mathematical result needed to reliably confirm or refute the hypothesis.
3. Identify the calculations needed to produce the needed mathematical results.
4. Identify the data needed to perform the calculations.
The purpose of structuring is 1) to determine if a form of analysis is possible and/or practical, and 2) to determine the most efficient, practical, reliable method of producing the analysis. As a general rule, complex forms of analysis are neither possible nor practical without first structuring the problem. [Structuring, it will be noted, is the basic technique used to ‘rig’ analysis and produce ‘fraudulent results’. With appropriate hands-on experience, it is generally not difficult to differentiate between fraudulent structuring and beneficial structuring.]
The structuring of the steps 1 to 3 analysis is fairly straightforward. The intermediate hypothesis for the end of step 3 can be expressed as:
Hypothesis: The force of selection for most of the ‘alleles produced by likely mutations’ in the genome is very high.
In structuring this analysis, we can start by suggesting that the hypothesis is confirmed if 75% of ‘alleles produced by likely mutations’ have forces of selection greater than 95%. It is useful to note here that, in order to be useful in subsequent steps, much less rigorous criteria (a lower % of likely alleles and a lower force of selection) could have been used. It is also useful to note that the analysis being performed could satisfy much more rigorous criteria (a higher % of likely alleles and higher forces of selection). This analysis involves very substantial margins for error.
The calculation technique to be used to generate data is an actuarial measurement of mortality technique which will be discussed later.
The calculation to be performed requires 2 pieces of information as follows:
1. Estimates of the raw mutation rates for the set of ‘likely mutations’.
2. Estimates of the current distributions of alleles by gene.
The calculation to be performed is based on comparing 1)the distribution of alleles that would be produced by a ‘mutation only’ or ‘selection neutral’ process and 2) the actual distribution of alleles in a population. When you know the technique being used, you know that even fairly small differences in these two distributions will confirm the hypothesis being tested.
Even with a very limited knowledge of genetics, it is apparent that differences between ‘selection neutral allele distributions’ and ‘observed allele distributions’ are very large. [Technical note: If you define ‘raw mutation rates’ in terms of ‘mutations that are observed in the live reproducing individuals’, rather than mutations generated by cell reproduction, you will greatly reduce the differences between the two distributions. However, if you eliminate all ‘selected out’ or nearly always ‘selected out’ alleles, you create a genetic system where ‘random mutation’ means ‘generate one of a very limited number of possible mutations’. Although beyond the scope of this thread, systems with very limited sources of potential variation have only a very limited ability to evolve. (If you have available only a very small number of options for solving a very complex survival problem, the odds are very high you will fail to find a solution.) Again, this is an interesting question, but a subject for another thread.]
Thus, based on structuring the problem, we can conclude that the analysis can be performed based on general, public knowledge of genomes. The analysis does not require references to any specific supporting studies. [Note: ‘Reliance on general, publicly available information’ is a theory of knowledge issue. Some individuals insist that anybody opposing established theories must provide references for every piece of information used. This is clearly an invalid rule. If I use ‘reliance on general, publicly available information’ in an argument, the reliance is not a valid basis for questioning the validity of the argument. The validity of the argument only comes into question if 1) the relied upon information is false, and 2) the reliance results in a faulty conclusion. In order to dispute an argument based on reliance it is necessary to demonstrate both 1 and 2. Many silly and unproductive arguments arise from misinterpreting the theory of knowledge rule relating to reliance on generally available information.]
The key point of this ‘structuring analysis’ is the conclusion that the analysis can be performed using ‘generally available information’. There is no need to make reference to specific studies of mutation rates and studies of actual distributions of alleles. If you question this conclusion I will be glad to start a separate ‘theory of knowledge’ thread to discuss the issue. Note, if you claim that I have relied on incorrect information and that reliance has distorted the results, you can provide evidence supporting your claim.
PERFORMING THE ANALYSIS
At last we can start addressing the specifics of the ‘how to calculate’ question.
1. DATA REQUIRED
As stated above, the analysis here requires a ‘rough estimate’ of a. the ‘selection neutral’ distribution of alleles and b. the actual distribution of alleles.
a. estimating the selection neutral distribution
i. assume there are ‘roughly’ 3000 likely ‘mutations’ per gene. This is based on an estimate of 1000 base pairs per gene and 3 ‘likely’ point mutations per pair. Note: Recognizing other likely mutations would make it ‘easier’ to demonstrate the hypothesis presented.
ii. assume raw mutation rates for all 3000 likely mutations are in the range of 1 per million per generation to 1 per ten million per generation.
iii. result: the selection neutral distribution of alleles is a relatively flat distribution over all 3000 likely mutations (roughly the same expected frequency for each likely allele). [It is interesting to note that this ‘selection neutral’ result is dramatically different from the result generally shown for genetic drift demonstrations. Another interesting topic for another thread.]
b. estimate of actual distribution of alleles
i. assume that for the average gene, there are only 100 different alleles actually observed in mature reproducing individuals.
ii. result: for 2900 of the 3000 likely alleles, the probability of occurrence is 0%.
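The rough inputs above can be tabulated in a few lines. This is only a sketch of the arithmetic as stated: the variable names are made up for illustration, and the figures are the assumptions listed above, not measured data.

```python
# Rough inputs taken from the assumptions stated above (not measured data).
BASE_PAIRS_PER_GENE = 1000       # assumed average gene length
POINT_MUTATIONS_PER_PAIR = 3     # each base can change to 3 other bases
OBSERVED_ALLELES = 100           # assumed alleles seen in reproducing adults

likely_alleles = BASE_PAIRS_PER_GENE * POINT_MUTATIONS_PER_PAIR
absent_alleles = likely_alleles - OBSERVED_ALLELES

# Selection-neutral expectation: roughly the same frequency for every allele.
neutral_frequency = 1 / likely_alleles
# Share of likely alleles that are never observed.
absent_share = absent_alleles / likely_alleles

print(likely_alleles, absent_alleles)
print(f"{neutral_frequency:.6f} {absent_share:.3f}")
```

With these inputs roughly 97% of the likely alleles are absent, which is the share compared against the 75% threshold suggested earlier.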
2. CALCULATION OF SELECTION RATES PER POTENTIAL ALLELE
It is taken here as a given (definition) that populations of alleles change over time as the result of the interaction of forces of selection (roughly decrement) and the forces of mutation (roughly increment). If the forces of increment and decrement remain relatively constant over time, then the populations of alleles achieve a stable or stationary distribution.
Changes in population from generation to generation using multiple decrement terminology can be denoted by (population at end of period, P_{t+1}) = (population at beginning of period, P_t) + (increments, I) - (decrements, D). Or P_{t+1} = P_t + I - D.
When the forces of increment and decrement come into equilibrium, the population becomes stable or stationary. At this point P_{t+1} = P_t and by algebraic elimination D = I.
In order to calculate rates of decrement or selection for alleles we start with the ‘assumption’ that the estimated distribution of alleles represents a stationary population. [You can go to the appropriate mathematical and actuarial literature if you want to learn more about the appropriate uses of the stationary population assumption.]
Given the stationary population assumption, we know that D=I. Since we know both the rates of increment (mutation) and the current population, we can calculate/estimate the rates of decrement. For the 2900 possible alleles with a distribution of 0%, we know there are increments due to mutation. In order to maintain the 0% population, the rate of decrement must be 100%.
3. CONCLUSION
The rates of decrement (selection) are 100% for 2900 out of the 3000 likely alleles. The calculation confirms the proposed hypothesis with a very wide margin of error.
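The stationary population step can be written out as a small function. This is a minimal sketch under an assumed bookkeeping that is not spelled out above: each generation the allele first gains its mutation inflow, and then a fraction d of the total is removed. The function name and the example frequencies are hypothetical.

```python
def decrement_rate(frequency, mutation_inflow):
    """Per-generation decrement rate d that keeps an allele stationary.

    Assumed bookkeeping (hypothetical, for illustration): each generation the
    allele gains `mutation_inflow`, then a fraction d of the total is removed:
        frequency = (frequency + mutation_inflow) * (1 - d)
    Solving for d gives mutation_inflow / (frequency + mutation_inflow).
    """
    total = frequency + mutation_inflow
    if total == 0:
        raise ValueError("no stock and no inflow: rate is undefined")
    return mutation_inflow / total


MUTATION_INFLOW = 1e-6   # assumed raw mutation rate per generation

absent = decrement_rate(0.0, MUTATION_INFLOW)     # allele never observed
present = decrement_rate(0.01, MUTATION_INFLOW)   # allele held at 1% frequency

print(absent)
print(present)
```

Under this bookkeeping an allele held at zero frequency needs a decrement rate of 100% whatever the (non-zero) mutation rate, while an allele held at a visible frequency needs only a very small one.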
I think it should be ‘obvious’ that if you know the stationary population ‘trick’ you can do these calculations ‘in your head’. My suggestion that some of these results are ‘obvious’ is not bogus.
The calculations performed above can be made more complicated by modifying some of the assumptions. The appropriate literature can provide you with an almost endless amount of complex math to evaluate complex assumptions. For the sake of the discussion here, there is no need for me to try to teach you the minutiae of actuarial mathematics.
INTERPRETING THE RESULTS
The analysis performed confirmed the conclusion shown at the end of steps 1 to 3. As noted above, the conclusion was confirmed with a very wide margin of error. It is important to note that 1) the demonstration does not ‘prove’ the hypothesis, and 2) the demonstration does not NEED to prove the hypothesis. The argument I presented asserted there is a reasonable basis for suggesting that rates of selection are very high (and thus eventually GSH is a reasonable hypothesis). The demonstration presented fully supports the reasonable basis criterion.
As may or may not be obvious, there are two features of the above demonstration which have far more scientific significance than the GSH. First, the demonstration above shows there is a relatively simple and reliable method of measuring rates of selection. Since we already have reasonable techniques for measuring/estimating rates of mutation, this means it is possible/practical to model and simulate genetic changes for an entire genome. An obvious and significant departure from existing EB claims.
Second, and probably less obvious, the above demonstration shows it is possible to model/simulate/analyze genetic change processes independent of phenotype changes. It has long been suggested by opponents of neo-Darwinian theory that the lack of recognizable genotype-phenotype maps suggested a flaw in the theory. With the techniques described above, we can now directly test genetic theory without having to address the genotype-phenotype map issue.
SUMMARY
If you can follow the calculations explained above, it is not unreasonable to claim that the calculations are ‘simple’ and the results are ‘obvious’. As with a lot of mathematics, it may take a great deal of effort to reach the point where the analysis is either simple or obvious.
I apologize for the length of the explanation of such a ‘simple’ procedure. I was attempting to address ‘why is it difficult to understand’ as well as ‘how is it done’. The question of ‘why isn’t this obvious and simple technique widely recognized in EB?’ is in itself a very interesting question, IMO.
Probably the most important ‘result’ of this demonstration, is what it suggests about our ability to analyze genetic change. Academia is full of claims like ‘evolution explains….’ Or ‘genetics explains….’ The techniques described above suggest we may actually be able to quantify and analyze some of these claims.
This is a long post. I am sure there are many parts of it which will lead to questions like ‘I don’t understand what you meant by…..’ I will be happy to address all such questions.
|
|
Frances
Member
Member # 169
|
posted 24. September 2002 13:19
An interesting but imho totally fallacious calculation. Let me try to explain why I say this. You claim that every mutation in a gene has to be 100% removed if this allele is not present in the population, but that is not necessarily correct. IF the mutation is lethal then it will indeed be removed from the population, but if its function is neutral then it can remain, for a short time, in the population, to be eliminated by random drift or other factors. By assuming that there are 2900 alleles with zero occurrence you have forced them to be all lethal. Not a very useful presumption. Furthermore, this does not say much about the alleles which can be found in the population. What about their mutation and selection rates... All in all you have ‘proven’ that lethal mutations will be selected against. Genetics came to that conclusion quite some time ago.
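A toy simulation can illustrate the neutral case. This is a minimal sketch, assuming a Wright-Fisher-style model of pure drift; the population of 200 gene copies, the replicate count and the function name are made up for illustration, and no selection or further mutation is applied.

```python
import random


def fate_of_neutral_mutation(copies, rng):
    """Follow one new neutral allele until it is lost or fixed.

    Each generation, every one of the `copies` gene copies is drawn at random
    from the previous generation (pure drift: no selection, no new mutation).
    Returns "lost" or "fixed".
    """
    count = 1  # the mutation starts as a single copy
    while 0 < count < copies:
        p = count / copies
        count = sum(1 for _ in range(copies) if rng.random() < p)
    return "fixed" if count == copies else "lost"


rng = random.Random(2002)          # fixed seed so the run is repeatable
COPIES = 200                       # hypothetical population of gene copies
REPLICATES = 200

fates = [fate_of_neutral_mutation(COPIES, rng) for _ in range(REPLICATES)]
lost = fates.count("lost")
print(f"{lost} of {REPLICATES} neutral mutations were lost by drift alone")
```

In a model like this most new neutral alleles disappear within a few generations, so an allele being absent from the population does not by itself imply a 100% selection rate.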
|
|
RBH
Member
Member # 380
|
posted 24. September 2002 16:43
warren wrote quote: The evidence supporting GSH appears clear, verifiable and available for review and testing. Note the evidence is not designed to prove GSH, but simply to provide a reasonable basis for proposing GSH. If Francis or anyone else believes the evidence inadequate, then it seems appropriate that they identify what part of the evidence they consider to be inadequate.
A prerequisite to identifying inadequate evidence is that the evidence be offered. warren has asserted that the evidence "appears clear, verifiable, and available for review and testing." But he doesn't tell us what it is or where it is available. It's therefore impossible to tell whether it is adequate, inadequate, or indeed, whether it exists at all.
warren's description of the GSH model is of a simple Markov process, a basic state transition model. Markov processes are well known and have been used in population genetics and ecology for decades. A discrete Markov process, which is what warren describes, has the general form:
State 1 => Op1...Opi => State 2 => ... => State N,
Where States 1, 2, ... N are successive states of the population, described in a manner appropriate to the research question being asked, and Op1, Op2, ... Opi are transition operators, functions that map State k into State k+1, defined to be appropriate to the biological transition processes being modeled.
In evolutionary biology applications, the States might correspond to the distribution of alleles in successive generations and the operators to the principal evolutionary operators - mutation, recombination, drift, selection, and so on.
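The general form above can be sketched in code. This is a simplified, deterministic illustration rather than anyone's actual model: the state is a list of allele frequencies, only mutation and selection operators are included (drift and recombination are omitted, so there is no random element), and the operator definitions, fitness values and mutation rate are assumptions chosen for the example.

```python
def mutate(freqs, rate):
    """Mutation operator: each allele sends `rate` of its frequency,
    split evenly, to the other alleles."""
    n = len(freqs)
    out = []
    for i, f in enumerate(freqs):
        inflow = sum(g for j, g in enumerate(freqs) if j != i) * rate / (n - 1)
        out.append(f * (1 - rate) + inflow)
    return out


def select(freqs, fitness):
    """Selection operator: weight each allele by its relative fitness,
    then renormalize so the frequencies sum to 1."""
    weighted = [f * w for f, w in zip(freqs, fitness)]
    total = sum(weighted)
    return [w / total for w in weighted]


def step(state, operators):
    """One transition: apply Op1 ... Opi in order to get the next state."""
    for op in operators:
        state = op(state)
    return state


# Hypothetical three-allele example; the third allele is strongly selected against.
fitness = [1.0, 1.0, 0.05]
operators = [lambda s: mutate(s, 1e-6), lambda s: select(s, fitness)]

state = [1 / 3, 1 / 3, 1 / 3]
for _ in range(50):
    state = step(state, operators)

print([round(f, 6) for f in state])
```

Adding a further operator, such as a drift step that resamples the frequencies at random, only means appending another function to the `operators` list.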
Markov models of biological phenomena, and mathematical modeling technologies in general, are sufficiently common in biology that courses are routinely offered in them in respectable universities. For example, at the University of Tennessee one might take a short course with this description quote: This Course will provide an overview of mathematical and computational approaches useful in analyzing complex biological systems including: continuous dynamical systems, discrete dynamical systems, matrix approaches including structured population models and Markov chains, and stochastic process models.
Penn State recommends that its biology students take this course: quote: STAT 416 Stochastic modeling: Review of distribution models, probability generating functions, transforms, convolutions, Markov chains, equilibrium distributions, Poisson process and birth-death processes.
In Washington University's Computational Molecular Biology Ph.D. program, first-year students are required to take this course quote: Core CMB course (MBT 540/541, Genetics 540/541). A 2-quarter course on protein and DNA sequence analysis and molecular evolution. This will include a brief review of basic molecular biology (structure & evolution of genes, genomes and proteins), probabilistic models of sequences and of sequence evolution , computational gene identification, pairwise sequence comparison and alignment (algorithms and statistical issues), multiple sequence alignment and evolutionary tree construction, and protein sequence/structure relationships. These are the central computational methods required to determine the "periodic table of biology", i.e. the list of proteins and their evolutionary relationships, which can be regarded as the first stage in the conversion of molecular biology into a quantitative science. Moreover, the statistical and algorithmic methods used (which include maximum likelihood estimation, hidden Markov models, dynamic programming) have wide applicability in other areas of computational/mathematical biology. The core course is taught Winter/Spring, so as to leave the Autumn quarter free for meeting the above prerequisites and for becoming well-integrated into the home department.
I added the emphasis in each course description.
warren asks quote: The question of 'why don't they understand?' or more specifically 'Why, apparently, don't geneticists understand the basic/elementary technique for quantifying selection rates used in steps 1 to 3?' . Since geneticists are interested in explaining the processes responsible for producing genetic change, and since they recognize that change is due to the interaction of 'diversity creating processes'(mutation) and 'diversity eliminating processes'(selection) why wouldn't they be interested in techniques that allowed them to quantify selection rates?" (emphasis added)
The answer is that they are interested in quantifying the various evolutionary operators, not merely mutation and selection, and they in fact actually use a variety of sophisticated mathematical modeling technologies, including on occasion the rather elementary kind of step-wise Markov model warren describes as GSH. Genetic researchers and biological modelers are not as ignorant of math as warren makes them out to be, nor are they so stupid that they are unaware of the utility of the various well-developed techniques for modeling dynamical systems.
RBH [ 24. September 2002, 16:51: Message edited by: RBH ]
|
|
Evan
Member
Member # 164
|
posted 24. September 2002 21:09
As far as I can tell, Warren’s method and argument boil down to this: many of the possible alleles of a gene must not confer any advantage, because they don’t spread throughout the population. Assuming that genes are mutating randomly, so that all genes periodically mutate in some individual or another, we can conclude that many mutations are either deleterious or neutral because they are not passed on to subsequent generations. One can conclude this based merely on the fact that variation in a population contains only a small percentage of the hypothetically possible genetic combinations for that organism.
So what? As Francis says, “All in all you have ‘proven’ that lethal mutations will be selected against. Genetics has come to that conclusion quite some time ago.”
I have read Warren’s post several times. The excessive verbiage and disclaimers make it hard to follow, and the innuendoes about my, and others’, ability and willingness to understand make it hard to want to.
In general, I believe Warren’s general style represents a common problem with approaches which are excessively logical without being grounded in empirical reality - they find it all too easy to seem to establish conclusions which are in fact built into the assumptions with which they start.
I am both a mathematician and a computer programmer. Among other things, I make my living explaining the meaning of mathematical formulas and procedures to people. The math Warren presents here is not complicated. In a stable population, point mutations of genes will be balanced by selective pressures, either immediately at the genetic level or at any place further on in the development of the organism. This balance will be approximate, of course, because genetic variation within the population will exist. The assumption of a stable population leads to the conclusion that D = I, in his terms.
This is of course approximately what we find in stable populations. However, all sorts of other factors are also present. There are occasionally mutations which do lead to a selective advantage, the environment changes so that a stable population is no longer stable, factors other than point mutations bring new opportunities to the genome, etc. So nothing in Warren’s analysis leads to any broad conclusions other than, as Francis says, those mutations which are truly 100% selected against do not spread through the population. This is indeed obvious and trivial, and needs no actuarial mathematics nor fancy definitions and acronyms in order to be established.
|
|
warren_bergerson
Member
Member # 262
|
posted 25. September 2002 06:07
Evan,
I originally presented a very simple presentation of steps 1 to 3. You suggested that the validity of the arguments in steps 1 to 3 could not be substantiated, so I provided an admittedly lengthy explanation of the logic underlying steps 1 to 3. You found the long answer irritating, but at least you learned that the mathematics underlying steps 1 to 3 is, as I originally claimed, relatively simple. You originally questioned, and in fact you were extremely skeptical of, my claim that actuaries could do these calculations in their heads. You have now learned how to do at least the simple forms of the calculations in your head. Irritating, but at least you are learning.
Now to your current question, which I think can be paraphrased ‘So what?’. Steps 1 to 3 were designed to support the ‘reasonableness’ of the assertion that in genetic change processes, the selection rates are very high. This conclusion, as I demonstrated, follows directly from the ‘observation’ that the actual distribution of alleles in a population is different from the ‘mutation only’ or increment only distribution. I think we can agree that "it can be easily demonstrated that selection rates in genetic systems are very high" is a valid statement. This, IMO, should be recognized as a major step forward. We went in one irritating step from ‘that’s impossible’ to ‘that’s obvious’.
To return to your question, "What is the relevance or significance of the observation that selection rates are very high?" First, and least, the high selection rate observation is the basis for demonstrating that the measured levels of forces of selection do not appear compatible with the claim that ‘all selection is due to natural selection’. This observation is the basis for proposing GSH. [It is probably important to repeat that GSH is proposed as an alternative to ‘all selection is due to natural selection’. I do not claim to know, and no one has offered to clarify, the current TOE position on natural selection. Is the current position that natural selection is the only form of selection, that natural selection is one form of selection, or some other not explicitly formulated position? No one so far has been willing to take a position on whether GSH contradicts or is compatible with the current TOE.]
Second and far more important, the high selection rates demonstration means there exists a valid, established technique for quantifying rates of selection in genetic systems.
EB is filled with claims that "RM&NS can explain…" or "TOE can explain …" . Given the ability to quantify rates of selection, these claims can now be evaluated in the form "Can RM&NS explain ….. using realistic mutation and selection rates?" or "Can TOE explain …., using decrement(selection) and increment(mutation) rates derived from genetics?"
I don’t know how familiar you are with evolutionary algorithms, but you should be aware that there is a very big difference between "can explain using any possible set of mutation and selection rates" and "can explain using realistic mutation and selection rates". As should be obvious, but probably isn’t, the ability of basic evolutionary algorithms to ‘explain’ genetic changes or evolutionary changes is dramatically reduced when the ‘using realistic assumptions’ requirement is imposed.
In answer to your ‘so what’ question, the ability to measure selection rates is important, because it makes Darwinian and neo-Darwinian theories testable using the ‘realistic assumptions’ criteria. Since you ‘don’t believe it until you see it’, I won’t bother suggesting what the results of this more rigorous testing might be.
Next, let me address two ‘issues of style’ that you raised. First,
Quote: In general, I believe Warren’s general style represents a common problem with approaches which are excessively logical without being grounded in empirical reality - they find it all too easy to seem to establish conclusions which are in fact built into the assumptions with which they start.
Some of my presentation needs to be ‘excessively logical’ because I am presenting the ideas to an audience that does not understand and/or accept the basic ‘obvious’ analytical techniques being used. Many of my arguments appear excessively logical and flawed because they produce results which disagree with "what everybody in EB knows to be true". It is very easy to believe that 1) if almost all EB professionals are in agreement on a claim then 2) the EB ‘hand-waving’ logic and arguments must provide sound logical support for the claim. It is much more difficult to put two contradictory logical arguments alongside each other and attempt to determine if one or both of the arguments is flawed. You have at least been brave enough to try to make such arguments.
Second, you find my style, at least at time irritating. I think if you look in the mirror, you might find that you are sometimes guilty of ‘I don’t understand this so you obviously don’t know what your talking about’. Dealing with individuals who ideas different than our own is inherently irritating. Sometimes we deal with the irritation and learn something. Sometimes we throw up our hands and say its not worth dealing with the irritation. I appreciate your efforts to work through the irritation. If it makes you feel any better, the irritation is two sided.
In closing, let me address your concluding paragraph.
Quote: This is of course approximately what we find in stable populations. However, all sorts of other factors are also present. There are occasionally mutations which do lead to a selective advantage, the environment changes so that a stable population is no longer stable, factors other than point mutations bring new opportunities to the genome, etc. So nothing in Warren’s analysis leads to any broad conclusions other than, as Francis says, those mutations which are truly 100% selected against do not spread through the population. This is indeed obvious and trivial, and needs no actuarial mathematics nor fancy definitions and acronyms in order to be established.
First, the ‘obvious and trivial’ conclusion of steps 1 to 3 is that ‘forces of selection in genetic systems are very high’. It was my claim that this was an obvious and trivial conclusion that led to the lengthy description of steps 1 to 3. Whether there are a lot of 100% selection factors, or even more 99% selection factors, is not material. The issue is which selection rates are high and measurable.
Second, if change is to occur, then, obviously, something must occur which ‘makes the population unstable’. What causes/explains change is another interesting topic but one that is not being discussed here. The issue addressed here is not whether there are ‘all sorts of other factors’ but whether any of these ‘all sorts of other factors’ impact the very narrow and specific hypothesis (very high rates of selection) being evaluated.
|
|
warren_bergerson
Member
Member # 262
|
posted 25. September 2002 08:15
Response to RBH,
Thanks for pointing out that multiple decrement techniques are known and recognized in EB. Some of the comments provided here and elsewhere disputed my claim that "It can be easily demonstrated that selection in genetic systems involves very high rates of selection". In questioning my claim, observers seemed to imply that ‘no simple technique for performing such calculations was known in EB’. I am glad you can confirm that the simple actuarial techniques I am using exist and have been validated in EB.
Since, as you have confirmed, actuarial multiple decrement techniques exist in EB, two obvious questions arise:
1. How widely are the techniques known and used in EB?
2. Why aren’t the techniques more widely used in testing TOE?
From the reaction to my suggestion that ‘high rates of selection can be easily demonstrated’, we might conclude that actuarial techniques, or Markov techniques as you call them, are not widely known or recognized or used in EB. This is not at all surprising. While, as I demonstrated, there are some simple actuarial techniques, multiple decrement analysis can very quickly become very complex. It is therefore not surprising that in a field that is not primarily mathematical, you find relatively few individuals with a working knowledge of actuarial or Markov techniques.
It is far more difficult to understand why Markov or actuarial techniques are not routinely used to test the claims made with respect to RM&NS and Darwinian TOE. Markov or actuarial techniques make it possible to measure forces of increment and decrement, these techniques make it possible to ‘reverse engineer’ the sources of selection and mutation, and they make it possible to compare the speed of evolutionary change to the speed of change produced by simulation models. It is, IMO, somewhat baffling. If useful and powerful analytical techniques exist for testing TOE, why aren’t they routinely used? It would certainly be interesting to hear comments on this question from someone who actually works with Markov analysis.
Quote: warren's description of the GSH model is of a simple Markov process, a basic state transition model. Markov processes are well known and have been used in population genetics and ecology for decades. A discrete Markov process, which is what warren describes, has the general form:
GSH is a hypothesis stating that ‘genetic selection processes are predominating ‘non-natural selection processes’. It is somewhat difficult to understand how such an hypothesis could be mistaken for a Markov process. I used a multiple decrement technique to measure selection rates. While it is helpful to know that the actuarial technique used is equivalent to the technique used in Markov analysis, this doesn’t make GSH into a Markov process.
Quote: A prerequisite to identifying inadequate evidence is that the evidence be offered. warren has asserted that the evidence "appears clear, verifiable, and available for review and testing." But he doesn't tell us what it is or where it is available. It's therefore impossible to tell whether it is adequate, inadequate, or indeed, whether it exists at all.
A rather astonishing and unsupported statement. I stated explicitly, in irritating detail, what evidence I used and the rationale for using it, and I showed how that evidence was used to calculate the values used to test the applicable hypothesis. You need to go back, reread the material posted, and then state your objection to the evidence, if you have one, in a manner consistent with the facts.
Response to Frances,
It is somewhat difficult to respond because your comments don’t appear to reflect an understanding of either the mathematics or the argument the mathematics are used to support. Let me at least try to address some of the technical fallacies in your comments.
To begin, the analysis performed demonstrates that ‘selection rates for most likely alleles are at or near 100%’. The fact that ‘selection rates are at or near 100% for an allele’ is NOT the same as ‘the allele is lethal’. The point of GSH is that selection rates are near 100% AND this does not require that the allele be lethal or that the allele prevent successful reproduction. The point of GSH is that there are selection mechanisms other than natural selection operating in genetic change.
Quote: You claim that every mutation in a gene has to be 100% removed if this allele is not present in the population but that is not necessarily correct.
What ‘I said’ is, on a technical basis, different from what you say I said. Technically, what I am saying is that ‘if a likely potential allele is not present in the current population, and if the current population has been relatively stable for an extended period of time, then, using standard actuarial concepts and techniques, it is reasonable to calculate the force of selection as 100%’. There are rather complex statistical techniques for calculating the probability that the ‘true force of selection’ under defined conditions is other than 100%. The 100% value is, however, a best estimate calculation.
The claim here is not that ‘we are absolutely positive that the selection rate for all 2900 likely alleles is exactly 100%’. The claim is only that such measurements/estimates are reasonable.
Your argument is essentially ‘Your demonstration is flawed because you haven’t proven it is exactly true under all possible conditions’. You, IMO, are the one guilty of seriously flawed reasoning.
|
|
Frances
Member
Member # 169
|
posted 25. September 2002 09:22
Dear Warren.
If selection rates are 100%, that is, a mutation is immediately removed from the population, then the selection is due to the mortality of the mutation. It's that simple. Whether this selection is natural or not, no mutation survives to the next population. It's that simple. But the conclusion was already built into your assumptions, namely that a population is completely stable and that any change is immediately removed. However, real life does not work that way: a new allele arises and over time disappears, wanders around or becomes fixed. Only by ignoring these common principles can you enforce your conclusion. It seems that your 'structuring' has caused you to reach a result that was already enforced by your assumptions. Not very surprising... Perhaps you could do the same calculations for realistic examples of mutations and show what the selection rates really are.
Despite your claim, my argument is simply that your conclusions were caused by your assumptions and do not necessarily reflect reality.
Your claim that your calculations show that RM&NS has been disproven seems to be based more on a faulty presumption than on actual calculations.
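The point that a new allele "arises and over time disappears, wanders around or becomes fixed" can be illustrated with a small simulation. The sketch below is not taken from anyone's post in this thread; it assumes a standard neutral Wright-Fisher style resampling model (no selection at all), and the function name, population size and seeds are hypothetical choices made only for illustration.

```python
import random

def drift_fate(pop_copies=100, max_generations=10_000, seed=1):
    """Follow one new neutral allele until it is lost or fixed.

    Assumption: each generation, every gene copy is drawn independently
    from the previous generation with probability equal to the allele's
    current frequency (neutral resampling, no selection, no new mutation).
    """
    rng = random.Random(seed)
    count = 1  # the allele starts as a single new copy
    for generation in range(1, max_generations + 1):
        freq = count / pop_copies
        # Resample the whole gene pool for the next generation.
        count = sum(rng.random() < freq for _ in range(pop_copies))
        if count == 0:
            return "lost", generation
        if count == pop_copies:
            return "fixed", generation
    return "segregating", max_generations

if __name__ == "__main__":
    for seed in range(5):
        print(seed, drift_fate(seed=seed))
```

Even with zero selection, most runs end with the allele lost after wandering for some number of generations, so the absence of an allele from a population does not by itself pin down the selection rate acting on it.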
As far as Markov chains and evolution, if Markov is equivalent to GSH then they are the same and indeed Markov is a more common description for actuarial calculations. Perhaps this may not be obvious to those less familiar with such mathematics but a quick search can help or here
Markov processes are commonly used in genetics as well; thus your claim that actuarial mathematics is somehow not applied in genetics seems to be incorrect, mainly due to the use of a different term for the same mathematics.
Perhaps the following may be useful for Warren when he claims that
quote:
It is somewhat difficult to understand how such an hypothesis could be mistaken for a Markov process. I used a multiple decrement technique to measure selection rates.
the following may be useful
And as far as applications of RM&NS in real life are concerned, this seems to do quite well. The works of Endler come to mind, or this review paper
quote:
Over the last 15 years, the availability of standardized estimates of the strength of selection on quantitative traits in wild populations has increased tremendously.
[ 25. September 2002, 09:39: Message edited by: Frances ]
|
|
RBH
Member
Member # 380
|
posted 25. September 2002 10:42
I did a search similar to the one Frances describes and came to the same conclusion, so I'll not comment further on that.
warren wrote quote: GSH is a hypothesis stating that 'genetic selection processes are predominating 'non-natural selection processes'. It is somewhat difficult to understand how such an hypothesis could be mistaken for a Markov process. I used a multiple decrement technique to measure selection rates.
Warren is right: in supposing that his model was intended to test his hypothesis, I conflated the model (Markov process) and hypothesis (GSH: "genetic selection processes are predominating 'non-natural selection processes"). I apologize for the confusion. Warren is right: the Markov model he describes does not embody the hypothesis he proposes.
As stated by warren, GSH is an eliminative argument. It merely asserts that natural processes cannot account for observed phenomena, and that in turn is alleged to be a failure of current genetic theory, namely that current genetic theory, as it is incorporated into the larger theory of the evolution of biological life, fails to account for observed rates of change in population gene frequencies from generation to generation and we must therefore invoke "non-natural processes." The first question then is what is the basis for that assertion? What are the observed rates of change that cannot be accounted for? Is there actually a problem here? warren has not established that there is a problem.
My reservations about the model itself rest on a simple issue: does warren's model demonstrate that the principles of genetics cannot account for observed rates of change in gene frequencies? What I have been mostly struggling to understand is how the model that warren intends to answer that question maps the processes and variables that genetics identifies as relevant to generational changes in gene/allele frequencies. Genetics has given us pretty clear descriptions of the variables it sees as relevant to the production of genetic variability. In contrast, GSH does not identify any variables. It just asserts a negative: "non-natural selection processes." As I understand it, the model warren describes is intended to demonstrate that rates estimated from his model are too high to be accounted for by standard genetics.
The issue of veridical mapping from genetics into model is therefore critical, because if the model does not accurately map the variables genetics identifies as relevant then conclusions about genetic theory drawn from the model are invalid. Since I see no veridical mapping of genetic theory into warren's model, I conclude that the conclusions he draws from the behavior of the model are irrelevant to genetics. Not only does his model not represent his own GSH hypothesis, it doesn't represent current genetics. One wonders what it does represent.
warren wrote to Evan quote: I don't know how familiar you are with evolutionary algorithms, but you should be aware that there is a very big difference between "can explain using any possible set of mutation and selection rates" and "can explain using realistic mutation and selection rates". As should be obvious, but probably isn't, the ability of basic evolutionary algorithms to 'explain' genetic changes or evolutionary changes is dramatically reduced when the 'using realistic assumptions' requirement is imposed.
I'm fairly familiar with evolutionary algorithms, having designed and built them in an applied context for a dozen years or so, and it is not obvious to me that what warren asserts is the case. First, in the context of theory-testing and research in evolutionary biology, computer-based evolutionary algorithms do not "explain" anything. They are test beds in which one can assess hypothesized processes and variables, but they are not themselves "explanations;" they are a technology for testing explanations.
Second, when I run an evolutionary algorithm using 'realistic' mutation rates and 'realistic' representations of recombination processes and 'realistic' selective pressures (in fact, using actual environmental variables piped into the program from the outside world) operating on phenotypic manifestations of the gene strings, I see evolution. Changes in gene frequencies and distributions that are consistent with adaptation to the selective environment are clearly seen in 'genetic' analyses of the population through generations. In those analyses one can track the emergence and coalescence of multiple 'species' in the population, adapted to different environmental 'niches,' and one can track their success or failure as the selective environment changes through time.
I frankly do not know the source of warren's assertion about evolutionary algorithms - it is not "obvious" to me - and I would dearly like to know its basis. Are there research papers out there that warren can refer me to?
RBH [ 25. September 2002, 10:52: Message edited by: RBH ]
|
|
warren_bergerson
Member
Member # 262
|
posted 26. September 2002 10:57
RBH,
Quote: As stated by warren, GSH is an eliminative argument. It merely asserts that natural processes cannot account for observed phenomena, and that in turn is alleged to be a failure of current genetic theory, namely that current genetic theory, as it is incorporated into the larger theory of the evolution of biological life, fails to account for observed rates of change in population gene frequencies from generation to generation and we must therefore invoke "non-natural processes." The first question then is what is the basis for that assertion? What are the observed rates of change that cannot be accounted for? Is there actually a problem here? warren has not established that there is a problem.
This is a good, if somewhat indirect, question. To begin, GSH hypothesizes the existence of ‘selection processes’ other than ‘Darwinian natural selection’ or ‘other than selection operating on phenotypic effects’. These are not ‘non-natural processes’. An appropriate name for the process identified by GSH would be ‘gene direct selection’.
As you point out, it is not at all clear whether ‘gene direct selection’ does or does not contradict current evolutionary theory. Certainly, ‘selection other than Darwinian natural selection’ is a problem for Darwin’s original formulation of evolutionary theory. However, current evolutionary theory is so ill-defined it is impossible to determine what if anything it says about any feature of evolutionary processes.
The most obvious and best known examples of ‘gene direct selection’ are ‘DNA error correction processes’ which select out or eliminate likely mutations before any phenotype effect is ever exhibited. Since EB accepts error correction processes as compatible with current evolutionary theory, it is not unreasonable to conclude that GSH is compatible with ‘current evolutionary theory’. If you go back and read my original postings, you will note I specifically avoided any claim that GSH contradicts current Darwinian or neo-Darwinian theory.
Fortunately, you provided the answer to your own question. GSH is not necessarily a problem for TOE, but it is clearly a problem for evolutionary algorithms. More specifically, GSH is a serious problem for the claim that evolutionary algorithms can model, simulate, or explain either ‘evolutionary processes’ or ‘human design processes’.
Evolutionary algorithms are based on the ‘ASSUMPTION’ that all selection involves ‘selection based on phenotype effects’. GSH and the argument/demonstration supporting GSH clearly show that most of the selection associated with genetic change is gene direct selection. Evolutionary algorithms cannot model, simulate, or explain evolutionary change because they fail to recognize the impact of selection processes which are not based on phenotype effect.
Quote: My reservations about the model itself rest on a simple issue: does warren's model demonstrate that the principles of genetics cannot account for observed rates of change in gene frequencies?
A very good question, but then you offer a somewhat questionable argument for rejecting my analysis. It is therefore useful to separate the above question from your ‘veridical mapping’ argument.
To begin, we start with the multiple decrement or Markov definition of genetic changes in frequency as the ‘interaction of forces of increment and decrement’. In my notation, Pt+1 = Pt + D - I. Again note that this is a definition.
I now propose that 1) Pt+1 and Pt denote real world measurements of distributions, 2) I represents real world measures of changes due to mutation, and 3) D is the net of all change processes other than mutation, and I will call this net change natural selection. Using these definitions, I have a ‘model’ of genetic change processes. Note that using this definition one does not have a testable scientific theory, because D has been defined as the balance item, so the model will fit any possible set of values of Pt+1, Pt, and I.
What my analysis does is measure/estimate Pt+1, Pt, and I under ‘stable conditions’, and use these measures/estimates to calculate D. I then divide the measured D into 1) natural selection (NS), or selection based on phenotype effect, and 2) direct selection (DS). For this analysis, DS is defined as all selection other than NS. By measuring/estimating NS, I demonstrate that DS is much larger than NS.
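The bookkeeping described above can be sketched in a few lines. This is only an illustration of the stated definition Pt+1 = Pt + D - I with D as the balance item, using the post's own sign convention; the function name and the numbers are hypothetical and are not measurements from this thread.

```python
def balance_item_d(p_next, p_now, increment):
    """Solve Pt+1 = Pt + D - I for D, keeping the sign convention of the post."""
    return p_next - p_now + increment

# Hypothetical illustrative values, not data.
# Under 'stable conditions' Pt+1 equals Pt, so D is forced to equal I.
d_stable = balance_item_d(p_next=0.0, p_now=0.0, increment=0.001)
print(d_stable)  # 0.001

# Because D is defined as whatever balances the equation, the residual
# Pt+1 - (Pt + D - I) is zero (up to rounding) for any inputs at all.
for p_next, p_now, inc in [(0.5, 0.2, 0.1), (0.0, 0.3, 0.05)]:
    d = balance_item_d(p_next, p_now, inc)
    print(abs(p_next - (p_now + d - inc)) < 1e-12)  # True
```

The second loop shows why the definition alone is not testable: any combination of Pt+1, Pt and I is fitted exactly, so the substantive claims rest on how D is later split into NS and DS.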
The answer to your question "does warren's model demonstrate that the principles of genetics cannot account for observed rates of change in gene frequencies?" depends on how the ‘principles of genetics’ defines selection. If the principles of genetics defines ‘natural selection’ as ‘all forces of increment and decrement other than mutation’, then the answer to your question is no. If the ‘principles of genetics’ defines ‘natural selection’ as selection based on phenotype effects, then the answer is yes, I have demonstrated the inadequacy of the principles of genetics. If, as I suspect, the principles of genetics are undefined, then the question you raise is unanswerable.
Your ‘veridical mapping’ argument appears to me to say, ‘geneticists have decided which variables they will recognize as relevant to genetic change, and if your analysis does not comply with our list of variables then it must be rejected as invalid’. If that is your argument then it is clearly unsound. If, as I must assume is the case, you are offering some other argument, you need to explain it more clearly.
Quote: I frankly do not know the source of warren's assertion about evolutionary algorithms - it is not "obvious" to me - and I would dearly like to know its basis. Are there research papers out there that warren can refer me to?
First, let me say I found your comments on evolutionary algorithms interesting, useful and well expressed. The issue here is, IMO, partially one of perspective. Second, ‘ability of evolutionary algorithms to explain’ and ‘evolutionary algorithms used as a tool for testing or fitting real world data’ are, IMO, essentially the same.
Most tests/explanations using evolutionary algorithms are of the general form ‘If assumptions A1, A2,… are valid, then an evolutionary algorithm model Y can simulate/explain result X’. When you work with most types of simulation/projection models, it is fairly obvious that changing assumptions changes results, and the more limited your ability to adjust assumptions(i.e. the more realistic your assumptions must be) the more difficult it is to generate a particular result X.
Many people appear to be unaware of these principles and argue, as you do, that "when I run an evolutionary algorithm using 'realistic' mutation rates and 'realistic' representations of recombination processes and 'realistic' selective pressures (in fact, using actual environmental variables piped into the program from the outside world) operating on phenotypic manifestations of the gene strings, I see evolution."
The ‘problem’ is that much of this analysis contains ‘implicit or hidden assumptions’ which are unrealistic. As simple examples (which may or may not be based on evolutionary algorithms):
1. Genetic drift simulations tend to ‘assume’ that expected increments from thousands of likely mutations can be ignored.
2. Simulations of evolutionary change ‘assume’ an unlimited amount of time and lives available to produce result X.
3. Simulations ‘assume’ that the rate of change produced by ‘selection based on phenotype effect’ in a simulation model could be duplicated by a ‘selection based on phenotype effect’ in nature.
4. Simulations assume the adaptive landscape is stable.
For reasons that are somewhat difficult to explain, EB analysis seems to ignore the bias that is being introduced by some of these hidden or implicit assumptions. The GSH suggests that ‘ignoring gene direct selection’ is one of the ‘flawed assumptions’ implicit in evolutionary algorithm analysis. [ 26. September 2002, 10:58: Message edited by: warren_bergerson ]
|
|
Frances
Member
Member # 169
|
posted 26. September 2002 11:30
Warren still has not explained to us why his very simplistic model of dP/dt=D-I is a realistic model. It would be a poor but not totally trivial model if Warren had not assumed as well that dP/dt=0 thus D=I which means that any mutation is immediately removed from the population. This is equivalent to stating that the mortality rate of the mutation is 100%.
Now let's assume that dP/dt=D-I was not zero but that selection was zero, thus dP/dt=D. This would represent a stepwise process if D were a delta function in time, or a linear process if D is constant in time. Neither process is very representative of reality, though. A more interesting example would be if D and I both occur, where D is either 0 or 1 and I is either 0 or 1. We now have the following process:
dP/dt = C where C is -1, 0 or 1
This model would be similar to a random walk process, again related to Markov processes.
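A discrete-time version of the process just described can be sketched as follows. Two things here are assumptions not stated in the post: time is treated as discrete steps rather than a continuous derivative, and D and I are each drawn independently as 0 or 1 with equal probability. The function name is hypothetical.

```python
import random

def random_walk(steps, seed=0):
    """Simulate P where each step adds C = D - I, with D and I each 0 or 1."""
    rng = random.Random(seed)
    p = 0
    path = [p]
    for _ in range(steps):
        d = rng.randint(0, 1)  # D is either 0 or 1
        i = rng.randint(0, 1)  # I is either 0 or 1
        p += d - i             # C = D - I is therefore -1, 0 or 1
        path.append(p)
    return path

if __name__ == "__main__":
    print(random_walk(20))
```

Because the next value depends only on the current value and a fresh random draw, the path is a simple Markov chain, which is the connection drawn in the post.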
But by insisting that the model is time independent Warren has enforced 100% mortality. Not very useful for modeling the reality and certainly far from disproving RM&NS.
In reality, however, the mathematics of mutation and selection are slightly more complicated, leading to mutations being allowed to spread through the population, including mutations which will eventually go extinct. None of the real world modeling expects a steady state model as proposed by Warren to be a reliable model of reality.
I am curious why Warren believes that such a trivial result based on a trivial and unrealistic model should be indicative of a failure of RM&NS. [ 26. September 2002, 12:16: Message edited by: Frances ]
|
|
charlie d.
Member
Member # 159
|
posted 26. September 2002 12:41
quote: 1. Genetic drift simulations tend to ‘assume’ that expected increments from thousands of likely mutations can be ignored.
Not clear what you mean. All drift models (indeed, all evolutionary models) of course account for mutation frequencies.
quote: 2. Simulations of evolutionary change ‘assume’ an unlimited amount of time and lives available to produce result X.
Entirely untrue. All (all) evolutionary models absolutely do consider population sizes, and allow for accurate calculations of time frames (in terms of number of generations). Changing the first affects the second in well-characterized ways; that's the whole point of evolutionary models.
quote: 3. Simulations ‘assume’ that the rate of change produced by ‘selection based on phenotype effect’ in a simulation model could be duplicated by a ‘selection based on phenotype effect’ in nature.
Uh? What else are models for? All models "assume" they replicate a natural phenomenon (even yours, whatever that is). Whether or not they actually do depends on how good the models are. Evolutionary models are highly compatible with observed evolutionary rates in terms of nucleotide and allele substitutions.
quote: 4. Simulations assume the adaptive landscape is stable.
Also untrue. Most do, because that's a reasonable assumption for the studies in question. However, variable landscapes have been incorporated successfully in evolutionary computations, when needed.
|
|
RBH
Member
Member # 380
|
posted 26. September 2002 13:16
Hi warren,
You wrote quote: Evolutionary algorithms are based on the 'ASSUMPTION' that all selection involves 'selection based on phenotype effects'. GSH and the argument/demonstration supporting GSH clearly show that most of the selection associated with genetic change is gene direct selection. Evolutionary algorithms cannot model, simulate, or explain evolutionary change because they fail to recognize the impact of selection processes which are not based on phenotype effect.
Well, actually that's not the case. In the GAs my firm builds we have to go to a good deal of work and complication (think developmental processes) to provide phenotype-level selection. We do that because it is phenotypes that have to go out into the real world to control real processes. It is much easier in GAs to implement gene-level selection.
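The difference between gene-level and phenotype-level selection in a GA can be shown with a toy example. This is a minimal sketch and not the GA described in the post: the `develop` mapping, both fitness functions, the truncation selection scheme and all parameter values are hypothetical stand-ins, and recombination is left out to keep it short.

```python
import random

def develop(genome):
    """Hypothetical 'developmental' step: decode the bit string into an integer trait."""
    return int("".join(str(bit) for bit in genome), 2)

def gene_level_fitness(genome):
    """Selection acts directly on the genes: count the 1 bits."""
    return sum(genome)

def phenotype_level_fitness(genome, target=10):
    """Selection acts on the developed trait: closer to the target is better."""
    return -abs(develop(genome) - target)

def evolve(fitness, genome_len=8, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Toy GA with truncation selection and point mutation (no recombination)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        for _ in range(pop_size - len(parents)):
            parent = rng.choice(parents)
            # Flip each bit independently with probability mutation_rate.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in parent]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    print(evolve(gene_level_fitness))
    best = evolve(phenotype_level_fitness)
    print(best, develop(best))
```

The only change between the two runs is which fitness function is passed in; the gene-level version needs no `develop` step at all, which is consistent with the remark that gene-level selection is the easier one to implement.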
You also wrote quote: First, let me say I found your comments on evolutionary algorithms interesting, useful and well expressed. The issue here is, IMO, partially one of perspective. Second, 'ability of evolutionary algorithms to explain' and 'evolutionary algorithms used as a tool for testing or fitting real world data' are, IMO, essentially the same.
I really have to disagree here. A research methodology and an explanation are not synonyms. They're clearly related: An explanatory theory makes (implicit or explicit) statements about how one might test the theory and about the methodologies that might be appropriate for testing it, but they are not identical.
As for your 4 "hidden assumptions," charlie d's comments cover them very well. I'll remark only that in my description of the EAs my firm builds and deploys I mentioned that we pipe real selective environments from the outside world into the programs, so whatever properties those real environments have, including their stability or lack thereof, are available to our GAs.
RBH
|
|
|