Topic: Structure of Scientific Analysis
|
Rex Kerr
Member
Member # 632
|
posted 09. March 2003 23:02
I find this discussion rather amusing because I am usually the one who is pressing for greater precision in biological sciences. However, it is important to be realistic.
Let's look at an area where these standards have been applied--in modeling divergence of genes via point mutation.
Define variables: our variables will be the corresponding genes of ancestor and offspring species. (Or, alternatively, the differences between them.)
Model of causal relationship: for each nucleotide in the gene of an ancestor, each offspring has a high probability of retaining the same nucleotide, and a low (but specified) probability of switching to a new nucleotide at random.
Define structure: sequential nucleotides are unrelated. Genes of offspring are related via the causal relationship above and their shared ancestor.
Define environmental conditions: anything that causes approximately constant mutation rates per generation.
Explicit formulation: see "Molecular Evolution", Wen-Hsiung Li. Or any number of journals.
Resolvability: sequence the genes of closely related organisms and compare the statistics of nucleotide changes with those predicted by the model.
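As a rough illustration only (a sketch, not the formulation in Li's book), the causal model described above can be simulated in a few lines of Python. The per-site mutation probability of 0.01, the sequence length of 10,000 and the function names are assumptions chosen for the example, not values from the literature.

```python
import random

BASES = "acgt"

def mutate(sequence, rate, rng):
    """Copy `sequence`; each nucleotide independently switches to a different,
    randomly chosen base with probability `rate`, and is retained otherwise."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def fraction_different(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = "".join(rng.choice(BASES) for _ in range(10_000))
offspring_1 = mutate(ancestor, 0.01, rng)  # two offspring of a shared ancestor
offspring_2 = mutate(ancestor, 0.01, rng)

print(fraction_different(ancestor, offspring_1))     # ancestor vs. offspring
print(fraction_different(offspring_1, offspring_2))  # offspring vs. offspring
```

The "resolvability" step would then amount to comparing fractions like these, measured on real sequences, with what the model predicts for the assumed rate.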
So, anyway, gene change by point-mutation seems to be an area of study that meets your criteria. (This doesn't say anything about the conclusion of such studies--just that the process meets your standards.)
Now let's look at what your criteria reject as "invalid, non-rigorous models and theories". To be honest, I am not sure what is rejected, as the criteria are all quite subjective.
Here's an example of the type of science common in genetics; maybe you can assess the rigor of this according to your standards. Suppose we find a fly that can't smell vinegar. We find that this is a heritable recessive condition; via mapping and sequencing, we eventually find a defect in a tyrosine receptor kinase. We create a transgenic fly that has a good copy of that gene, and the fly can smell vinegar again; we conclude that this TRK is involved in sensing vinegar. We use antibodies to locate the gene to the olfactory organs of the fly and conclude that this TRK plays a role in the olfactory signal transduction pathway. TRKs often signal via the GTP-binding protein Ras and similar proteins (e.g. Rak). We make mutants in Ras and Rak in the olfactory organ and note that the Ras mutant but not the Rak mutant cannot smell vinegar. We thus conclude that Ras, but not Rak, plays a specific role in the signal transduction of vinegar olfaction.
Rigorous or not? If so, why? If not, what does this violate? (I'll provide more details if needed.) You have previously complained that genetics is not rigorous.
Edited for typos and/or clarity. [ 10. March 2003, 05:10: Message edited by: Rex Kerr ]
|
|
warren_bergerson
Member
Member # 262
|
posted 10. March 2003 09:02
Rex,
Quote: I find this discussion rather amusing because I am usually the one who is pressing for greater precision in biological sciences. However, it is important to be realistic
Two general comments. First, the primary issue here, IMO, is modeling, theory construction, and/or interpreting results. I certainly do not question that biologists and geneticists use a high degree of precision in making observations. Second, the major ‘problems’ in theory construction, again IMO, involve ‘scope’ and logical consistency. There are lots of ‘narrow scope models’ that explain, describe or speculate on how something ‘might happen under a very narrow set of unrealistic conditions’. The simplistic explanations, and simplistic Darwinian and neo-Darwinian models are a prime example, do not seem to work when we look at them under more realistic conditions, or when we attempt to evaluate the logical consistency of the various simplistic models.
Quote: Let's look at an area where these standards have been applied--in modeling divergence of genes via point mutation.
This provides an excellent example of ‘genetic models and theories’ falling apart when analyzed under realistic conditions. As you point out, genetic change is an example of a situation where we can precisely define genes, alleles, and point mutation processes.
Given estimates of point mutation rates, we can readily estimate the rate of natural divergence or drift. (In a selection neutral environment, mutation would result in divergence from the original set of genes.) Given this base measurement/estimate of point mutation rates, we can then estimate the forces of selection operating on point mutations by looking at the set of alleles that are actually present in the population for each gene. Actuaries call this stationary population analysis. By tracking individual chains of cells, it should even be possible to determine when selection operates to eliminate specific types of point mutations.
Given precise definitions of genes, mutation rates, and selection rates, it would be possible to construct a mutation/selection model of genetic change. Given such a model it would not be difficult to work out the genotype phenotype map. Given a mapping, it would or should be possible to develop predictive genetic models of evolutionary change.
Again, recognizing that genes and alleles can be precisely defined and modeled, and given the assumption that genetic change is produced by ‘random mutation’ and ‘natural selection’, then it should be quite easy given modern technology to develop a predictive genetic model of evolutionary change. To repeat in terms of formal notation:
1. We can precisely define the set of genes G as g1, g2, …, gn
2. For each gene gk we can define the set of potential alleles or point mutations GKA = gk1, gk2, …, gkm
3. For each potential allele gkj we can use actuarial multiple decrement techniques to measure/estimate ‘for current conditions’ 1) the mutation rate MRgkj and 2) the selection rate SRgkj.
4. For each SRgkj, we can further estimate SRTgkjt, which is the rate or force of selection at time t during the lifetime of the organism. This is useful because it tells us what manifestation of traits or phenotype at what point in time is responsible for eliminating or selecting out a potential allele.
All the information needed to develop a precisely defined predictive mutation/selection model/theory is readily available. Why hasn’t someone developed such a model or theory? The answer is that the model won’t work. Once you define the functioning of the model under stable conditions, you have defined a model incapable of producing change.
As I have discussed before, I believe it is possible to develop ‘genetic change models’ but such models need to be at a far finer level of detail. The key to genetic change, IMO, is not the ‘change in alleles between generations’ but ‘change in gene status during the lifetime of a cell’. To understand genetic change and ultimately evolution, you must be able to explain the processes by which allele gkj changes between inactive and active status (obviously regulatory states may be more complex than a simple binary on-off).
Again, your example points out a central problem in genetic, evolutionary, and biological modeling. The ‘problem’ that exists is not the inability to develop precisely defined models and theories. The problem is that 1)such models are not developed or are rejected when they don’t produce the ‘correct’ result, and 2)individuals continue to make claims for models that it should be known can’t work.
|
|
Rex Kerr
Member
Member # 632
|
posted 10. March 2003 17:04
quote: All the information needed to develop a precisely defined predictive mutation/selection model/theory is readily available. Why hasn't someone developed such a model or theory?
Maybe because the information isn't readily available at all.
To reuse an example I mentioned earlier, the gene pod-1 is essential for viability in the nematode worm C. elegans. Here is the gene (from Genbank, accession number NM_067396):
code:
1 acaaatcaca cacagccatg gcgtggagat ttgcagcgtc gaaattcaag aacacgacgc 61 caaaggttcc gaagaaggag gagacaatct tcgatgttcc cgtcggcaat ctctcctgca 121 cgaatgacgg aatccacgcc agcgccgact tcctcgcttt ccacattgag ggagaaggtg 181 gcaaactcgg agttctgccc atcactgcga agggacgacg cacccgcaac gatatcggaa 241 ttatcgcggc tcacggagag caagtagcgg atttcggatt cttgacgttc gccgatgagc 301 tgctcgccac gtgcagccga gatgaacccg taaaaatctg gaagctctcc cgggatcact 361 ctccaaaact ggccacagaa atcgacgttg gaggtggcaa cgtgattgcg gaatgtcttc 421 gagctcattc cacggccgat aacattttgg cagtcggctc ccacggttcg acgtacatca 481 cggacatctc cacgggaaag acggctgtcg agctctccgg agtgacggat aaagttcaat 541 cgatggactg gagtgaggat ggtaaacttc tggcggtcag tggcgacaag ggacgtcaga 601 ttgttgtgta cgacccgcgt gctagcatgg agccaataca aacgctcgag ggacatggtg 661 gaatgggcag agaggcccgt gtgctctttg ctggaaaccg actcatcagc actggtttca 721 ctacgaaacg aatccaagaa gtgcgcgcgt acgatactgg aaaatgggga gcacccgtgc 781 atacacagga gttcgtctcc accaccggtg tactcatccc gcattacgac gccgacactc 841 gtctcgtctt cttgtctggc aagggaacca ataagttatt tatgctggag atgcaggatc 901 gtcaacccta tctttcgcat gtcttcgagc ttacactgcc agagcagaca ctcggtgcga 961 cgattggcgc caagcggcga gtacatgtta tggatggaga ggttgatacc tactaccagc 1021 ttacgaaaag ttcgattgtg ccaactccat gcatcgtgcc acgaagatcc tatcgtgatt 1081 tccacagcga tctgttccca gagacacgtg gtgccgagcc aggatgcacc gccggcgagt 1141 ggttgaatgg gacaaatgca gttccgcaga aagttagcat ggctccgtcg caaagctcct 1201 catcgccacc gcctccagag ccagttccaa ctccgaaggt tgctcaaaca ccagctccag 1261 ttccagtacc aacaccagca gccgcacctc gtcccatgtc caacaataat tcatcgtcga 1321 acaacgtgcc gagcgtccag gaacaacatt cggttccaaa gaaagaagag gttcgagaac 1381 tcgattacag gccttacgaa aaggagaatg gagttcacac cccaaatgcc gagacaaata 1441 gcactcaggg aaactcgtca ccaatctcca ccatctctcc ggagccagtc acgattgtga 1501 agcccgcaag cacgcctgca accgactcag tgtcaactcc aagcgtcgtt ggaccggcat 1561 ttggtaaaaa ggttccggag cagccaccag tgaacttccg taagccgatc ggagcctcga 1621 atcgtgtgcc actctcgcaa agagttcgtc cgaagtcgtg tgttgtcggt cagatcacgt 1681 cgaagttccg tcacgtggat 
ggtcagcaag gaacgaaatc tggcgccgtg ttctcgaatc 1741 ttcgcaatgt gaacacgcgt ctgccgccag agtccaacgg tgtctgctgc tcgaacaaat 1801 ttgcggcggt tcctctcgcc ggtctaggag tcattgggat ctatgatgtg aatgagcctg 1861 gcaagttgcc cgatggagtt atggacggaa tcttcaacaa gacgcttgtc accgatttgc 1921 actggaatcc gttcgacgat gaacagctcg ccgtaggaac cgactgtgga cagatcaatc 1981 tgtggcgtct aaccacgaac gatggtccac ggaatgagat ggaacccgag aagattatca 2041 agattggagg tgagaagatc acttcgttgc gttggcatcc acttgcgtcg gatctcttgg 2101 ccgtggcgct ttcgaatagt acaatcgagc tgtgggatgt ggcaaatgcg aagctttaca 2161 gccggttcgt caaccatacc ggagggatct tgggaatcgc atggtcggct gatggtcggc 2221 ggatcgcttc agtcggaaag gacgcgacgc tctttgtgca tgagccggcg agccgcgagc 2281 aacgggtcta cgaacggaaa acagttgtcg agtcgactcg tgccgcccgt gtgctcttcg 2341 cctgtgacga tcggattgtg attgtggtcg ggatgacgaa gagctcgcag cgacaggttc 2401 agatgtatga tgcgcagaca gtggatcttc gacacattta cactcaagtg atcgattcgg 2461 cgacacagcc cctggtgcct cactatgatt acgattcgaa tgtgcttttc cttagcggaa 2521 aaggtgatcg atttgtgaac atgttcgagg tgatctatga ttcgccgtat ctgcttccgt 2581 tggcaccatt tatgtcgcct gttggaagtc aaggaatcgc gttccatcag aaactgaaat 2641 gtaacgtgat ggctgtcgaa tttcaagttt gctggcgcct ctcggacaaa aatctggaga 2701 agattacgtt ccgtgttcca cgtatcaaga aggacgtctt ccaagatgac ttgttcccgg 2761 attcacttgt cacatgggag cccgtcacga ctggaacaaa atggatgctc ggagagcaag 2821 ctgcacccgt gttcagatca ctcaagccgg atggcgtgtt ctcgtcgatt cctcgcgcga 2881 tcactgcatc tgttcgtcac tcggaaatgc catcatcgtc gtccacgaca aattctgccg 2941 cacagacacc atctacttca gtcacacatt cgactacgga gaagcatcat catcaccagc 3001 accaccagca tcaggagcca acatcagtgc cgacaccatc gtcgcgaaac atgcaaagct 3061 gtggagtcga aagcactcaa cagccggacc gtaaacaggt ggccgctgcg tggtcgacaa 3121 aaatcgacgt ggacacgcgc ctcgaacaag atcaaatgga gggtgtcgat gaggcggaat 3181 gggacaaata ggcgctactg gaccatttca tattattttc agtcaagtag tgtacaatga 3241 acacaatttt ctcacggttc tgtaaaaatg ttttttctat tgaaatgttt gatttttcgc 3301 ccccatcacc aatccatcac cacctctccc tctctcgctt tttatttgtc tcatgcttta 3361 ttcatcattt tttatgatta ttattatgag 
tattattact attgtatagt ctccaatttc 3421 gtgatttttg gttttctaga aaattgcgcc cgctcgcccg cccccacgac ttaccacctc 3481 cccctgaatt tttttgtgct cccatcgcct agtcgaattt attcttttgt atttttgtgt 3541 gtccactttc tctctcggtc gatgtgtttt aacatccata ttttctgccc cgcctcgtcc 3601 cccctctcaa tcgcccgctc cccgccccgc ctttacactg tgtttcgatg aaataaacag 3661 tagagaattg taaaactatg t
Let's focus on one aspect of your model. We can simplify and suppose selection is constant with time, and we have only this one gene of interest, and that the mutation rate is a constant at every base pair (probably about 10^-8), and that only point mutations can happen. We will assume that organisms are haploid, so we don't have to keep track of dominant and recessive alleles. This is all a vast oversimplification, but it's a starting point.
How do we estimate SRg1j, j=0..(3681*3) for this gene? (Many of these changes will be silent mutations and thus can be predicted to be "no effect", but there are still possible changes in every amino acid in the protein that is encoded.)
If we manage to estimate these values, there will be N viable mutants (probably at least 3681, given that we have to keep track of silent mutations), each of which may have offspring with up to two mutations from the original. How do we estimate SRg1j', j'=0..((3681*3-1)*N)~=0..3*(3681)^2?
And then in the third generation there will be about 3*(3681)^3 possibilities.
[ 10. March 2003, 17:06: Message edited by: Rex Kerr ]
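To make the growth in the number of variants concrete, here is a small sketch that counts the distinct sequences lying exactly k point mutations away from a reference gene of length 3681. The function name is made up for the example. This exact count differs from the rough estimates in the post by constant factors, but it grows at the same order with each additional mutation.

```python
from math import comb

GENE_LENGTH = 3681  # length of the pod-1 sequence quoted in the post

def variants_at_distance(length, k):
    """Number of distinct sequences that differ from a reference at exactly
    k sites: choose the k sites, then one of 3 alternative bases at each."""
    return comb(length, k) * 3 ** k

for k in (1, 2, 3):
    print(k, variants_at_distance(GENE_LENGTH, k))
```

Each of those variants would need its own selection-rate estimate, which is the point of the question being asked.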
|
|
warren_bergerson
Member
Member # 262
|
posted 10. March 2003 20:27
Rex,
Quote: Maybe because the information isn't readily available at all.
It is always difficult to discuss what type of model is or is not possible, particularly when 1)the conventional view is that certain processes are too complex to model and 2)you don’t see articles discussing the specific technical problems which prevent the development of such models. With this qualifier in mind, let us look at the example you mention.
As you point out, there are 3*3681 possible point mutations for this gene. If we use your assumption as to a raw point mutation rate, and if we assume all possible alleles are ‘selection neutral’, then we can calculate or estimate a probability distribution that would result from true genetic drift or a true selection neutral environment. After a relatively few generations, this gene would have a very large number of alleles each with a very low probability. True drift would produce rapid divergence. Agree?
But we know from looking at the actual distribution of alleles in the population that drift or divergence is not occurring. We know from looking at the current distribution of alleles, and stationary population concepts, that the gene being studied is not selection neutral. Given the limited number of alleles actually observed in the population, and stationary population concepts, we know that the selection rate applicable to the vast majority of potential alleles is 100%. You undoubtedly know better than I do the shape of the probability distribution for alleles of this gene. Given this distribution, it is not difficult to estimate the rate of decrement or selection applicable. Agree?
We know from stationary population techniques that the selection or decrement rate for most potential alleles is 100%. These alleles do not appear in the gene pool; therefore the mutations, when they occur, do not survive to reproduce. The next question to be addressed is ‘when are these potential alleles eliminated’. There are two broad types of decrement process recognized by neo-Darwinism: 1) error correction, which is equivalent to ‘the mutation never occurred’, and 2) natural selection, i.e. death or failure to successfully reproduce. I would guess that the vast majority of decrements are of the error correction type. In effect, the actual number of possible ‘random mutations’ of a gene is very small.
If a potential allele is eliminated by natural selection then we should be able to see/measure when the ‘flawed gene’ actually kills or makes reproduction impossible. If the decrement is fatal, then we should be able to observe and study a fairly clear pattern of mortality for each ‘fatal’ type of allele (or group of fatal alleles). Cancers, some of which apparently are the result of mutations, exhibit such patterns. Again I am only guessing based on the lack of information, but I assume the ‘decrement’ producing an identifiable pattern of mortality is relatively rare. Therefore, most decrements are of the error correction type and the number of potential ‘random’ mutations is in effect very small. Agree?
Since the potential alleles with 100% decrement rates don’t survive, they do not play a significant role in evolution. The remaining model contains a known number of genes with each gene having only a small number of possible ‘random variations’. Using a process of elimination we should be able to back track on such a relatively simple variation-selection model and determine what genes are associated with what types of changes.
Given modern analytical and modeling techniques, it should not be terribly difficult to reverse engineer a genetic change system involving 30,000 genes if 1)the only source of variation is ‘random mutation’ and 2)the only source of decrement is natural selection. Using rather basic actuarial techniques, we can readily quantify with a reasonable degree of accuracy the forces of incremental variation and forces of decrement/selection. Based on such information, we know that if the variation and selection assumptions are valid, then the change process is relatively simple. These genetic change processes can be tested/validated against known selective breeding change processes.
This is only ‘back of the envelope’ analysis, but it strongly suggests that there are techniques available to quantify the key variables assumed to be involved in RM&NS type genetic change. If, as suggested by some individuals, the assumptions are valid, but the technical problems in generating predictive models are too complex to solve, then they should be able to demonstrate specifically which part of the calculation is beyond our current ability to model. Certainly the number of ‘possible mutations’ is not a limiting factor, since the vast majority of these possible mutations are immediately eliminated and can play no role in an RM&NS type process.
|
|
warren_bergerson
Member
Member # 262
|
posted 11. March 2003 14:23
Rex,
Two issues worth considering:
TOO COMPLEX TO MODEL
Since the subject here is ‘standards of rigorous scientific analysis’ it may be useful to take a look at the ‘too complex to model’ or ‘we do not yet have enough knowledge to model’ claims. The claim is often raised by supporters of Darwinian theory. The claim could be expanded to ‘We know our model or theory is correct, we know our model would produce testable predictions in situation X, but situation X is too complex to apply our model or theory’. The issue, IMO, is when this is a legitimate, sound argument and when it is a dubious argument to cover up for a flawed theory or model.
The legitimacy of the ‘too complex to model’ claim would seem to depend entirely on the nature of the unavailable information. The claim of ‘too complex’ is legitimate if what is missing is ‘accurate values for identified variables’ or the ability to calculate the interactions of the variables. It is, for example, legitimate to claim that the physical interaction of a large number of objects is too complex to model because we cannot accurately measure the initial speeds, locations, and directions at a point in time and we do not have the computing capacity to calculate all the future movements.
The ‘too complex to model’ argument is not legitimate if what is missing is ‘one or more of the processes or mechanisms’ responsible for change. If it is not possible to create a predictive model because the ‘processes involved are not understood’ then the theory is incomplete and/or inadequate. If the theory is incomplete then the ‘too complex to model’ argument becomes the nonsensical ‘The model or theory is correct, but our model or theory is too inadequate or incomplete to produce a predictive model or theory’.
QUANTIFICATION
It is probably worth noting that ‘precisely defining variables’ includes precisely defining rules for quantifying variables. You will never have a true scientific model or theory of either evolution or intelligent design unless you have precise rules for quantifying the complexity of ‘evolutionary results’ or ‘biological designs’. Nor will you ever have a true scientific theory until you have verifiable methods of quantifying the ‘volume of evolutionary processing’ needed to produce specific evolutionary changes or ‘the volume of intelligence’ needed to generate specific designs or design changes.
Genetics, evolutionary biology, ID, and AI all share the failure or inability to quantify variables. Only after the quantification issue is solved, I suggest, will any of these disciplines develop into a true science capable of formal structured analysis.
|
|
brauer
Member
Member # 398
|
posted 11. March 2003 15:20
Warren wrote:
quote:
Given precise definitions of genes, mutation rates, and selection rates, it would be possible to construct a mutation/selection model of genetic change. Given such a model it would not be difficult to work out the genotype phenotype map.
This is a fairly silly claim, and points out how a lack of connection to biological reality leads one into a morass of verbiage.
To work out a genotype<-->phenotype map would require not only "precise definitions of genes, mutation rates, and selection rates". Rather, one would need the distribution of selective effects across all possible alleles. Given that there's often strong epistasis among alleles, this is literally a combinatorial impossibility.
There seems to be a confusion between abstract models and predictive or concrete models. What is needed to make a predictive model is very different from what is needed for an abstract model. In an abstract model I can assume such things as a fixed distribution of mutational effects, a specific distribution of interactive effects, and so on. If I wanted to make predictions, I'd need to fix all of the parameters of my model to biologically realistic values.
Parameters such as "the" mutation rate, the distribution of selective effects, and the effective population size have been the subject of intense scrutiny and interest for nearly a century. When the claim is made that unspecified "actuarial methods" can estimate these parameters, I think the response should be "show me how." I'd like to see, for example, how these methods could estimate something as simple as the effective population size of a real population. (Let alone the far more difficult parameter of mutation rate.)
BTW, abstract mutation-selection models of genetic change have been worked out. Contrary to various non-biologist nay-sayers, they have had great utility for understanding the spread of, say, disease resistance alleles in a population.
Does this mean they can predict the precise point mutation that would lead to resistance? Nope. But it's phenomenally foolish to ask the models to do this, and it betrays a gap in the understanding of what a model does, and what it's supposed to do.
[typo edits] [ 11. March 2003, 15:26: Message edited by: brauer ]
|
|
warren_bergerson
Member
Member # 262
|
posted 11. March 2003 16:06
Matt,
Quote: To work out a genotype<-->phenotype map would require not only "precise definitions of genes, mutation rates, and selection rates". Rather, one would need the distribution of selective effects across all possible alleles. Given that there's often strong epistasis among alleles, this is literally a combinatorial impossibility.
You should have followed the issue being discussed. If you assume that the neo-Darwinian genetic model is valid, then by the techniques discussed you find that the set of ‘possible random variations’ for each gene is rather small. By observing for each gene individuals with one of the limited number of possible alleles, you could construct a map of the phenotype traits associated with each allele. A substantial undertaking but not one beyond the scope of current technology.
Granted, the map produced in this fashion would be essentially useless, but it would be the map suggested by the neo-Darwinian theory. Again, if you followed the discussion, Rex claimed the information needed to develop a predictive model based on neo-Darwinian theory was not available. It is not difficult to demonstrate that in fact the information needed could be easily available. The model developed would be invalid because the theory is invalid.
If, as you argue, it would be a mathematical impossibility to determine the genotype to phenotype map, then you are asserting that your form of evolutionary theory is undefinable and untestable.
Quote: Parameters such as "the" mutation rate, the distribution of selective effects, and the effective population size have been the subject of intense scrutiny and interest for nearly a century. When the claim is made that unspecified "actuarial methods" can estimate these parameters, I think the response should be "show me how."
If you 1) assume that changes in a population involve one increment process (mutation) and one decrement process (natural selection), 2) know the increment rates (mutation rates) for each potential allele, and 3) know the distribution of the stationary population, then you can calculate or estimate rates of decrement. Specifically, if the increment rate is positive and the allele is absent in the population, then the decrement rate would be calculated as 100%.
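One possible reading of this calculation, offered only as a sketch, uses the textbook haploid mutation-selection balance, in which the stationary allele frequency is approximately q = mu / s, so that s can be backed out as mu / q. The function name and the example numbers are assumptions for illustration; the post does not spell out a specific actuarial method, and the result is only as good as assumptions 1) to 3).

```python
def implied_decrement_rate(mutation_rate, observed_frequency):
    """Selection (decrement) rate implied by a stationary allele frequency,
    using the haploid balance q = mu / s, capped at 1.0 (i.e. 100%)."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    if observed_frequency <= 0:
        # Positive increment rate but the allele is never observed:
        # under the stated assumptions the decrement rate is taken as 100%.
        return 1.0
    return min(1.0, mutation_rate / observed_frequency)

print(implied_decrement_rate(1e-8, 0.0))   # allele absent from the population
print(implied_decrement_rate(1e-8, 1e-6))  # allele present at low frequency
```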
The results produced by these calculations are flawed, not because the math is flawed, but because the two-process (mutation and selection) assumption or theory is flawed. Genetic change processes are far more complex and involve a wide range of processes and mechanisms beyond mutation and selection.
Quote: BTW, abstract mutation-selection models of genetic change have been worked out. Contrary to various non-biologist nay-sayers, they have had great utility for understanding the spread of, say, disease resistance alleles in a population.
There are models which purport to fit observed patterns of spread of disease. My observation has been that these models are forced to fit observed data by adjusting assumptions. You are welcome to demonstrate that such models predict results based on assumptions that can be validated.
|
|
brauer
Member
Member # 398
|
posted 11. March 2003 17:50
Warren,
Sorry, but I was not able to decipher the meaning of your last post.
However, one point you made was this: quote:
By observing for each gene individuals with one of the limited number of possible alleles, you could construct a map of the phenotype traits associated with each allele. A substantial undertaking but not one beyond the scope of current technology.
What organism are you thinking of, that you can observe "for each gene individuals with one of the limited number of possible alleles"?
My organism-of-choice is on the simple end of things. It is the brewer's yeast, Saccharomyces cerevisiae, and as simple as it is, it has in excess of 6200 genes. The lab I'm in is at the forefront of technology for genomic analysis of this organism, so we have a fairly good idea for what's possible. (I invite you to look at the current database of information on Saccharomyces, at the Saccharomyces Genome Database.)
There are deletion panels, in which each of the genes has been knocked out. But screening the phenotypes of these single alleles for each of the identified genes is incredibly laborious. Furthermore, one can generally only identify gross phenotypic differences. Finally, the phenotype of interest here is "fitness". This is a complex multigenic trait whose value is highly context dependent. I could estimate fitness for a single deletion strain, in a single growth condition, and this would tell me nothing about the importance of the gene in other conditions, or in competition with other strains (fitness, remember, is really only relevant as a comparative measure). Add in frequency-dependence, and you ought to be able to see the problem.
Other problems: do you evaluate the effect of the gene in a homozygote or in a heterozygote? What genetic background do you use? Do you want to start to look at pairwise effects between genes? If so you have 38 million strains to look at. How about conditional lethals, or temperature sensitive alleles?
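For a sense of scale, the pairwise count can be worked out directly. This is just arithmetic on the round figure of 6200 genes: the 38 million quoted is close to 6200 squared (ordered pairs), while counting each pair of genes once gives roughly half that. Either way, the number of strains is in the tens of millions.

```python
from math import comb

N_GENES = 6200  # approximate yeast gene count given in the post

unordered_pairs = comb(N_GENES, 2)       # each pair of genes counted once
ordered_pairs = N_GENES * (N_GENES - 1)  # if the order of the two genes mattered

print(unordered_pairs, ordered_pairs)
```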
Keep in mind that I'm discussing a microbial organism -- few genes (relatively), 90 minute generation time, single celled. How might this be accomplished in, say, humans, or even flies?
[edited to remove inappropriate rant] [ 11. March 2003, 17:53: Message edited by: brauer ]
|
|
Rex Kerr
Member
Member # 632
|
posted 11. March 2003 18:37
Lots of misconceptions here.....
quote: If we use your assumption as to a raw point mutation rate, and if we assume all possible alleles are ‘selection neutral’ then we can calculate or estimate a probability distribution that would result from true genetic drift or a true selection neutral environment. After a relatively few generations, this gene would have a very large number of alleles each with a very low probability. True drift would produce rapid divergence. Agree?
Depends what you mean by "rapid". Any given gene is likely to pick up a mutation at a frequency on the order of 10^-6 per generation. So yes, after say 100,000,000 generations in a neutral environment, you'd have many alleles each with a very low probability. However, the environment is not neutral, and you have population effects (founder effects, bottlenecks, etc.) that tend to arise much more frequently than once in tens of millions of generations.
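A quick sketch of what that rate implies for a single lineage, assuming each generation is independent and using the 10^-6 per-gene figure above (the function name is made up for the example):

```python
def prob_mutated(rate_per_generation, generations):
    """Probability that a lineage has picked up at least one mutation in the
    gene after the given number of generations (independent generations)."""
    return 1.0 - (1.0 - rate_per_generation) ** generations

for t in (1_000, 1_000_000, 100_000_000):
    print(t, prob_mutated(1e-6, t))
```

On these assumptions a lineage is very unlikely to carry a new mutation in the gene after a thousand generations, and it takes on the order of a million generations before a change becomes more likely than not.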
quote: But we know from looking at the actual distribution of alleles in the population that drift or divergence is not occurring. We know from looking at the current distribution of alleles, and stationary population concepts, that the gene being studied is not selection neutral. Given the limited number of alleles actually observed in the population, and stationary population concepts, we know that the selection rate applicable to the vast majority of potential alleles is 100%.
Absolutely not. Population bottlenecks and founder effects are the primary reason why there are a relatively limited number of alleles actually observed. (Note that there are a vast number of alleles if you could sequence everyone, but that most of these are vanishingly uncommon, and a majority of genes have a single common allele shared by almost every human. Not that huge of a majority; a recent survey found that 28% of genes were polymorphic in humans. [Taken from Molecular Evolution by Li.])
quote: We know from stationary population techniques the selection or decrement rate for most potential alleles is 100%. . . In effect, the actual number of possible ‘random mutations’ of a gene is very small.
Absolutely not. People regularly do random mutagenesis of genes of interest looking for altered function in order to investigate functionally important parts of the gene. Typically, the majority of mutations have no substantial effect.
(Furthermore, as I mentioned above, stationary population techniques are wrong, because the populations aren't stationary on those time scales.)
If you want some examples of random mutations that exist in human populations, just type in "allelic variation human SNP" in PubMed.
quote: If a potential allele is eliminated by natural selection then we should be able to see/measure when the ‘flawed gene’ actually kills or makes reproduction impossible.
There are many very rare diseases. You don't hear about them much because they occur at rates of about 1/10^6, which makes even recognizing a disease as distinct problematic. Some of the most dramatic cases are known, but there is no reason to suspect that we are anywhere near classifying all the rare diseases. Note that lethal alleles only need to be cleared at a rate of 1/10^6 or so, since that's how frequently they arise.
(Incidentally, I'm not sure what this "error correction type" is.)
quote: The remaining model contains a known number of genes with each gene having only a small number of possible 'random variations'. Using a process of elimination we should be able to back track on such a relatively simple variation-selection model and determine what genes are associated with what types of changes.
Absolutely not. First, why would we know the number of genes? From sequencing, I suppose--as I mentioned, that shows that about 28% of genes have multiple alleles present in the human population at fairly high frequency.
Second, given that each organism is different from every other organism at hundreds or thousands of alleles, and there is no guarantee that the effects will be independent, how is this process of elimination supposed to work? It certainly can't be done at the organismal level, since we can't isolate each gene in a reasonable amount of time. (It typically takes multiple labs years of work to figure out some of the effects of a single allelic variation in humans.)
Third, this is all hopelessly inadequate, because most alleles haven't had a chance to occur yet, but in order to model an evolutionary process, you have to allow that they could.
I hope this has shown that the "we do not yet have enough knowledge to model" claim is completely valid in this case.
Of course, we are probably missing mechanisms as well, and we could find those missing pieces much faster if we could do accurate modeling at the level of detail you seek. That is one of the biggest benefits of doing modeling--it very rapidly points out where you have made bad assumptions.
Biology transitioned from primarily a descriptive science to primarily an experimental science in the middle of the 20th century. It's at a stage now where it can increasingly become a quantitatively modeled science, but at the level of detail you're talking about, evolution is one of the very hardest problems to model, and we're not there yet, nor will we be for many, many years.
This does not mean that we know nothing about biology, however. Descriptive science and qualitative experimental science can produce results that allow you to learn a great deal about a process. You seem to be blurring the difference between not yet having effectively complete knowledge (in that all major factors have been understood well enough to construct a model) and not knowing anything at all.
quote: My observation has been that these models are forced to fit observed data by adjusting assumptions.
Do you have any examples in mind? [ 11. March 2003, 21:18: Message edited by: Rex Kerr ]
|
|
warren_bergerson
Member
Member # 262
|
posted 12. March 2003 10:15
Rex,
Quote WB: But we know from looking at the actual distribution of alleles in the population that drift or divergence is not occurring. We know from looking at the current distribution of alleles, and stationary population concepts, that the gene being studied is not selection neutral. Given the limited number of alleles actually observed in the population, and stationary population concepts, we know that the selection rate applicable to the vast majority of potential alleles is 100%.
Quote Rex: Absolutely not. Population bottlenecks and founder effects are the primary reason why there are a relatively limited number of alleles actually observed. (Note that there are a vast number of alleles if you could sequence everyone, but that most of these are vanishingly uncommon, and a majority of genes have a single common allele shared by almost every human. Not that huge of a majority; a recent survey found that 28% of genes were polymorphic in humans. [Taken from Molecular Evolution by Li.])
The mathematical relationship among 1) forces of increment (mutation), 2) forces of decrement (selection), and 3) population distributions seems to be a major source of confusion in evolutionary biology.
There is general agreement that raw rates of mutation, such as point mutation, are relatively small and relatively predictable. The most common types of mutation from allele X1 to allele X2 may have frequencies in the range of 1 in 10^8 per cell division. It is also generally agreed that if you look at the distribution of alleles of a specific gene in a population, you will find very high concentrations of certain alleles. As you state, ‘a majority of genes have a single common allele shared by almost every human’.
Many biologists, it appears, fail to recognize that selection rates or decrement rates can be calculated from these two pieces of information. Specifically, if the mutation or increment rate generating allele Xn is positive, and if the distribution of Xn in the population is 0%, then the selection or decrement rate is 100%. If the decrement rate is anything other than 100%, the expectation is that the frequency of the allele in the population will gradually increase over time.
It can be stated as a general principle that if 1) genetic change is the result of one process (or set of processes) of increment (mutation) and one process of decrement (natural selection), and 2) the observed population contains n alleles X1,…,Xn for gene X, then 3) the selection or decrement rates applicable to all possible alleles other than X1,…,Xn are 100%. Put another way, if an allele is not observed in the population (more specifically, if the frequency of an allele or set of alleles in the population is below some threshold frequency), then the applicable selection or decrement rate is approximately 100% (the allele is fatal).
It will be noted that if we abandon the ‘mutation-natural selection’ assumption and recognize that biological systems have very complex genetic change mechanisms, then we can explain how biological systems both 1)maintain genetic stability and uniformity over long periods of time, and 2)rapidly generate complex genetic changes when appropriate. Complex, intelligent, genetic change mechanisms, while indicated by the observed patterns of genetic change, are not compatible with any existing models or theories.
This issue is indirectly related to the ‘scientific standards’ topic of this thread. If we assume genetic change is the result of mutation-natural selection processes, and if we know mutation rates and current distribution of alleles, then we have mathematical techniques for calculating selection rates. If this information is available, why isn’t it used or recognized? What scientific standard or lack of standard exists that apparently allows scientists to ignore known mathematical facts?
|
|
RBH
Member
Member # 380
|
posted 12. March 2003 13:03
warren asked quote: What scientific standard or lack of standard exists that apparently allows scientists to ignore known mathematical facts?
Answer: When the phenomena being modeled and the variables identified by theory as relevant are inappropriately mapped (or not mapped at all!) into the terms and operators of the math. If the math being employed doesn't faithfully represent the phenomena being modeled, then the results of massaging the math are irrelevant to reality and to theory testing. This reification of abstract math is common in IDist analyses and is a prime source of the irrelevant (tornado in a junkyard) and misleading (Scrabble and coin-flipping analogies) "findings" of ID analyses.
Math in this context is a collection of heuristic tools, not a superior reality. It is a collection of techniques for modeling phenomena and theories, not the essentialist embodiment of reality. If one uses the wrong tool one gets strange results that tell us nothing about the reality being modeled. In order to use math models in theory assessment and in testing theory-driven hypotheses about phenomena, one must know the math and the theory and the phenomena. One out of three doesn't cut it.
RBH [ 12. March 2003, 13:05: Message edited by: RBH ]
|
|
warren_bergerson
Member
Member # 262
|
posted 12. March 2003 15:39
While it may not meet with RBH’s personal approval, multiple decrement analysis is a well established and highly validated form of mathematical analysis. Decrement rates can be calculated given rates of increment and population statistics. Given known mutation rates, the known distribution of alleles, and the mutation-natural selection assumption, we can calculate that the decrement rate for almost all potential mutations is 100%. If you accept the mutation-natural selection assumption, then it is not logically or mathematically possible for both ‘known distributions of alleles’ and ‘evolutionary change’ to occur. That is plain, simple mathematical reality.
For groups of organisms to avoid genetic divergence and maintain an identity as a species there must exist very strong forces of selection/decrement (not limited to natural selection), which prevent divergence and maintain a very low level of genetic variance (as Rex points out, this is demonstrated by known distributions of alleles).
If genetic change is to occur, then first the strong selective pressures need to be turned off. Second, it seems likely that very complex changes in genes (and probably in other factors) must occur in a highly improbable and highly adaptive form very quickly. The existence of complex adaptive and/or genetic change mechanisms seems like a reasonable possibility. While mutation-selection appears to be a mathematical impossibility, there are, IMO, other possible but far more complex genetic change mechanisms.
If RBH is willing and able to address the issues being raised here, then his participation will be appreciated. If he is simply going to express his unsubstantiated subjective opinions then I would prefer that he adhere to his often expressed preference not to participate in threads I start.
|
|
Rex Kerr
Member
Member # 632
|
posted 12. March 2003 17:41
quote: Many biologists, it appears, fail to recognize that selection rates or decrement rates can be calculated from these two pieces of information. Specifically, if the mutation or increment rate generating allele Xn is positive, and if the distribution of Xn in the population is 0%, then the selection or decrement rate is 100%.
You completely ignored the population bottleneck/founder effect. This is mathematically important. Why are you ignoring it? It's a known mathematical fact.
Also, note that I said a single common allele. If we start off from a single allele, out of a population of, say, six billion, we would expect around, say, six thousand of those individuals to have mutant alleles (somewhat more if you count the generations it takes to produce 6e9 people). Given the length of an average gene, each individual allele probably occurs only about once. Are you claiming that we have sequenced enough people to find these extraordinarily rare alleles? You seem not only to assume that we've done this, but that we've done so in a way that can rule out that these rare alleles could be viable!
Worse still, most mutations are recessive, meaning that you don't see any noticeable phenotype unless you have two copies. That means that you should be able to see a basal level of recessive alleles due to mutation even in cases where carrying two alleles is spectacularly lethal. (Lethal so it is removed, spectacular so you will notice it among all the noise.) This is a fairly common calculation to do--I even saw it in my undergraduate genetics class. I'd be happy to reproduce it for you. In any case, your model would suggest that not only is almost every allele lethal, but it is a dominant lethal. Given the difficulty of generating dominant lethal alleles even if you try in the lab, one would be advised to look to other explanations, such as founder effects. (Note that some founder effect is expected given archaeological/paleontological records of spread of various hominid species.)
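For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The per-gene mutation rate of 1e-6 per generation and the gene length of 2,000 nucleotides are illustrative assumptions, not measured values, and the function names are made up for this example.

```python
def expected_new_mutants(population, mutation_rate_per_gene):
    """Expected number of individuals carrying a newly arisen mutant allele of one gene."""
    return population * mutation_rate_per_gene


def distinct_point_mutations(gene_length_nt):
    """Number of distinct single-nucleotide substitutions possible in a gene.

    Each position can change to any of the 3 other nucleotides.
    """
    return 3 * gene_length_nt


population = 6_000_000_000   # "say, six billion"
mu_per_gene = 1e-6           # ASSUMPTION: new mutants per gene per generation
gene_length = 2_000          # ASSUMPTION: illustrative gene length in nucleotides

carriers = expected_new_mutants(population, mu_per_gene)
variants = distinct_point_mutations(gene_length)
copies_per_variant = carriers / variants

print(f"expected mutant carriers: {carriers:.0f}")
print(f"possible point-mutation alleles: {variants}")
print(f"expected copies of each rare allele: {copies_per_variant:.2f}")
```

With these assumed numbers, roughly six thousand new mutant copies are spread over about six thousand possible single-nucleotide variants, which is where the "about once" figure comes from. A different assumed gene length or mutation rate shifts the ratio proportionally.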
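The textbook calculation referred to above can be sketched as follows. This is a simplified mutation-selection balance model, assuming random mating, an infinite population, a fully recessive allele that is lethal when homozygous, and a mutation rate of 1e-6 per generation (an illustrative assumption); the function names are hypothetical.

```python
import math


def next_freq_recessive_lethal(q, mu):
    """One generation for a recessive lethal allele at frequency q.

    Selection: homozygotes (frequency q^2) die, so q -> q(1-q)/(1-q^2) = q/(1+q).
    Mutation: a fraction mu of the remaining normal alleles mutate to the lethal one.
    """
    q_after_selection = q / (1.0 + q)
    return q_after_selection + mu * (1.0 - q_after_selection)


def equilibrium_recessive_lethal(mu, generations=20_000):
    """Iterate from q = 0 until the allele frequency has settled."""
    q = 0.0
    for _ in range(generations):
        q = next_freq_recessive_lethal(q, mu)
    return q


mu = 1e-6  # ASSUMPTION: mutation rate to the lethal allele per generation
q_eq = equilibrium_recessive_lethal(mu)
affected = q_eq ** 2  # frequency of lethal homozygotes at conception

print(f"equilibrium allele frequency: {q_eq:.3e}")
print(f"analytic value sqrt(mu):      {math.sqrt(mu):.3e}")
print(f"affected homozygotes:         {affected:.3e}")
```

Under these assumptions the equilibrium allele frequency equals the square root of the mutation rate, so a mutation rate of 1e-6 gives an allele frequency near 1e-3 and affected homozygotes at about 1 in 10^6. A dominant lethal, by contrast, would be removed in every carrier, leaving only the newly arisen copies each generation.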
The bottom line is that the observed numbers seem to be in the ballpark of what we expect, and to tell in more detail we need more detailed measurements than we currently have (e.g. sequenced genes from many more people).
|
|
warren_bergerson
Member
Member # 262
|
posted 13. March 2003 16:33
Rex,
Let us back up a bit. I suggested that we had all the information needed to create a well defined predictive ‘neo-Darwinian’ genetic model. As I outlined, to construct such a model you need 1)to identify the set of possible mutations and possible alleles, 2)for each possible allele identify the expected rate of increment (mutation) and 3)the expected rate of decrement (selection applicable to each allele). You also need to identify the factors that cause the population to grow and/or remain stable, but that is a simple mechanical process.
As discussed, reasonable estimates of raw mutation rates are available. Actuarial multiple decrement/stationary population techniques can be used to calculate selection/decrement rates from the current distribution of alleles. These calculations suggest decrement rates of 0% for the limited number of alleles in the current population and decrement rates of 100% for all other possible alleles. This RM&NS predictive model correctly predicts that the existing distribution of alleles will remain largely unchanged.
If you create a simulation model based on the above structure, and you reduce the 100% rate of decrement to something like 99% you get a result very different from that observed in nature. You are welcome to perform the experiment. Based on available evidence, it appears that the strong decrement processes which eliminate most possible alleles are ‘error correction’ processes not natural selection, because there does not appear to be evidence for the level of mortality needed to eliminate or decrement possible alleles. It would thus appear that decrement rates will not be impacted by changes in the adaptive landscape.
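One way to set up such an experiment is sketched below in Python. It is only a minimal deterministic version under stated assumptions: a single class of unobserved alleles at frequency f, a mutation (increment) rate of 1e-8 per generation into that class, and a decrement rate d that removes a fraction d of the carriers each generation. The function names and the model structure are illustrative, not a standard actuarial or population-genetics routine.

```python
def next_frequency(f, mu, d):
    """Advance the frequency of the unobserved-allele class by one generation."""
    f_mut = f + mu * (1.0 - f)   # increment: mutation into the class
    kept = f_mut * (1.0 - d)     # decrement: a fraction d of carriers is removed
    rest = 1.0 - f_mut           # carriers of the observed alleles are untouched
    return kept / (kept + rest)  # renormalise so frequencies sum to 1


def long_run_frequency(mu, d, generations=1000):
    """Iterate from f = 0 and return the frequency after the given generations."""
    f = 0.0
    for _ in range(generations):
        f = next_frequency(f, mu, d)
    return f


mu = 1e-8  # ASSUMPTION: increment rate per generation
results = {d: long_run_frequency(mu, d) for d in (1.0, 0.99, 0.5, 0.0)}

for d, f in results.items():
    print(f"decrement rate {d:.2f}: frequency after 1000 generations = {f:.3e}")
```

Running it with different values of d and mu lets the outcomes at 100%, 99%, and lower decrement rates be compared directly. A finite-population or diploid version could behave differently from this simple sketch.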
The model described above 1)actually predicts ‘evolutionary behavior’ of systems during periods of stability and 2)the model fits the RM&NS structure.
The obvious ‘problem’ with the above model is that it not only fits observed data but also predicts that evolutionary change will never occur. The problem with the model, and one of the problems with neo-Darwinian theory in general, is that it can explain stability using one set of assumptions, and it can explain change using a completely different set of assumptions, but it provides no process or mechanisms for explaining how a system changes from ‘stable’ to ‘evolving’.
Processes like the founder effect and genetic drift only provide possible (and highly dubious) explanations of changes in frequency among the limited number of existing alleles. They do not explain how an organism can ‘turn off the decrement mechanisms’ that normally eliminate most mutations.
The issue here is scientific standards. The process outlined here explains how a precisely defined predictive genetic model can be developed. The model/theory provides accurate predictions of the behavior of genetic systems under ‘stable conditions’ which can apparently exist for millions of years. The model also shows that RM&NS type models are incomplete, because they provide no basis or mechanism for changing from stable conditions to ‘evolving conditions’.
|
|
Rex Kerr
Member
Member # 632
|
posted 14. March 2003 08:28
Populations typically are not stationary with respect to settling times for neutral mutations. Your model assumes equilibrium conditions, and this assumption is inconsistent with the data for most species we care about.
The assumption of error correction is also at odds with a vast number of biological experiments, including just about every mutant screen ever performed--both with an applied mutagen and looking for spontaneous mutants.
What is dubious about a founder effect? It is an unavoidable consequence of the mathematics of small populations.
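The small-population mathematics can be illustrated with a minimal sampling sketch in Python. All the numbers are illustrative assumptions (a haploid population of 100,000 with 1,000 rare alleles each carried by one individual, a founder group of 50, and regrowth by random sampling with replacement); the function name is hypothetical.

```python
import random


def founder_bottleneck(population, founder_size, regrown_size, rng):
    """Draw a small founder group, then regrow the population from it."""
    founders = rng.sample(population, founder_size)   # founders drawn without replacement
    regrown = rng.choices(founders, k=regrown_size)   # offspring drawn with replacement
    return founders, regrown


rng = random.Random(2003)  # fixed seed so the run is repeatable

n_total, n_rare = 100_000, 1_000  # ASSUMPTION: illustrative sizes
population = ["common"] * (n_total - n_rare) + [f"rare_{i}" for i in range(n_rare)]

founders, regrown = founder_bottleneck(population, 50, n_total, rng)

alleles_before = len(set(population))
alleles_in_founders = len(set(founders))
alleles_after = len(set(regrown))

print(f"distinct alleles before bottleneck: {alleles_before}")
print(f"distinct alleles among founders:    {alleles_in_founders}")
print(f"distinct alleles after regrowth:    {alleles_after}")
```

Whatever the random draw, a founder group of 50 can carry at most 50 distinct alleles, so nearly all of the rare alleles are lost without any selection acting on them.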
|
|
|