ISCID Forums
Author Topic: Multiple Decrement Models
warren_bergerson
Member
Member # 262

posted 25. March 2003 12:10
Matt,

Quote: Finally, Warren has addressed neither of Rex's questions in his last post. To reiterate: how does one constrain the values of the model's parameters? And what is the specific hypothesis that the "multiple decrement model" will test?

I addressed both issues, but maybe the techniques being used are not familiar to you. The ‘limited number of alleles’ observation can potentially be explained using either the ‘no decrement’ assumption used by Rex or the ‘high rate of decrement applied to rare alleles’ assumptions I defined. Although both sets of parameters can fit a single measure of allele distribution, they ‘predict’ statistically different results if, for example, we were to measure allele distributions for different sub-populations.

The multiple decrement model can be used to calculate the expected result from different sets of assumptions. By comparing actual results to predicted results, we determine which values or ranges of values are compatible with observed data.

By comparing actual to expected ‘divergence in allele distributions’, we can ‘constrain’ decrement rates for ‘rare alleles’ to ‘on average greater than 90%’ or ‘on average less than 10%’. As I stated in my comments to Rex, running simulations using the two sets of assumptions will demonstrate differences in results produced by the two sets of assumptions.
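
The post does not say how ‘divergence in allele distributions’ would be measured. As a minimal sketch, assuming two sub-populations of equal size stored as arrays of allele indices, one simple candidate is half the summed absolute difference in allele frequencies. The function name `allele_divergence` and its parameters are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): half the summed absolute
   difference between the allele frequencies of two equally sized
   populations. Returns 0.0 for identical distributions, 1.0 when no
   allele is shared, and -1.0 if memory cannot be allocated. */
double allele_divergence(const int *pop_a, const int *pop_b,
                         int pop_size, int n_alleles)
{
    assert(pop_size > 0 && n_alleles > 0);

    /* diff[k] = (copies of allele k in A) - (copies of allele k in B) */
    long *diff = calloc((size_t)n_alleles, sizeof *diff);
    if (diff == NULL)
        return -1.0;

    for (int i = 0; i < pop_size; i++) {
        assert(pop_a[i] >= 0 && pop_a[i] < n_alleles);
        assert(pop_b[i] >= 0 && pop_b[i] < n_alleles);
        diff[pop_a[i]]++;
        diff[pop_b[i]]--;
    }

    long total = 0;
    for (int k = 0; k < n_alleles; k++)
        total += labs(diff[k]);

    free(diff);
    return (double)total / (2.0 * pop_size);
}
```

Under either set of assumptions, the simulated value of such a measure could then be compared with the value observed between real sub-populations. Other divergence measures would serve equally well; this one is chosen only because it is easy to compute.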

Rex Kerr
Member
Member # 632

posted 25. March 2003 17:04
Here are the results where all but the first 6 alleles are lethal, using the same parameters as in my post dated 19. March 2003 17:22. The following function has been rewritten as shown to pick out alleles that are not lethal for further propagation (the original code is on this thread).
code:
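/* Build the next generation: draw each parent at random from old_pop,
   redrawing whenever the chosen allele is lethal, then mutate it. */
void newgen(int *old_pop, int *new_pop)
{
    int i, k;
    for (i = 0; i < POP_SIZE; i++)
    {
        /* Alleles above LETHAL_LIMIT never reproduce. */
        do { k = old_pop[rng_n(POP_SIZE)]; } while (k > LETHAL_LIMIT);
        new_pop[i] = mutate(k);
    }
}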

Up from a single allele (this gets dull quickly):
code:
Generation      0 has    1 alleles, max freq 100.0% (allele 0000), next  0.000
Generation 1000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 2000 has 2 alleles, max freq 100.0% (allele 0000), next 0.005
Generation 3000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 4000 has 3 alleles, max freq 100.0% (allele 0000), next 0.005
Generation 5000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 6000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 7000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 8000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 9000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
Generation 10000 has 1 alleles, max freq 100.0% (allele 0000), next 0.000
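
The reporting code is not shown in this post. The sketch below is one way a per-generation line like those above might be produced, assuming that "next" is the frequency (in percent) of the second most common allele and that the population is an array of allele indices. The names `gen_summary`, `summarize`, and `print_summary` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical summary record for one generation (not from the thread). */
struct gen_summary {
    int n_alleles;    /* number of distinct alleles present */
    int top_allele;   /* index of the most common allele */
    double top_pct;   /* its frequency, in percent */
    double next_pct;  /* frequency of the runner-up, in percent (0 if none) */
};

/* Count alleles and record the two largest counts.
   Returns 0 on success, -1 if memory cannot be allocated. */
int summarize(const int *pop, int pop_size, int max_alleles,
              struct gen_summary *out)
{
    assert(pop_size > 0 && max_alleles > 0);

    int *count = calloc((size_t)max_alleles, sizeof *count);
    if (count == NULL)
        return -1;

    for (int i = 0; i < pop_size; i++) {
        assert(pop[i] >= 0 && pop[i] < max_alleles);
        count[pop[i]]++;
    }

    int distinct = 0, best = 0, second = 0, best_allele = -1;
    for (int k = 0; k < max_alleles; k++) {
        if (count[k] == 0)
            continue;
        distinct++;
        if (count[k] > best) {
            second = best;          /* old leader becomes runner-up */
            best = count[k];
            best_allele = k;
        } else if (count[k] > second) {
            second = count[k];
        }
    }
    free(count);

    out->n_alleles = distinct;
    out->top_allele = best_allele;
    out->top_pct = 100.0 * best / pop_size;
    out->next_pct = 100.0 * second / pop_size;
    return 0;
}

/* Print one line in roughly the layout used in this thread. */
void print_summary(int generation, const struct gen_summary *s)
{
    printf("Generation %6d has %4d alleles, max freq %5.1f%% (allele %04d), next %6.3f\n",
           generation, s->n_alleles, s->top_pct, s->top_allele, s->next_pct);
}
```

Printing the top frequency to one decimal place would also account for lines that show more than one allele alongside "max freq 100.0%": a leading allele at 99.995% rounds up to 100.0.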

Down from every possible allele (a bit more interesting; [note to self: used seed 14295]):
code:
Generation      0 has 2000 alleles, max freq   0.1% (allele 0996), next  0.115
Generation 1000 has 9 alleles, max freq 47.3% (allele 0005), next 15.440
Generation 2000 has 7 alleles, max freq 51.9% (allele 0005), next 21.170
Generation 3000 has 5 alleles, max freq 53.8% (allele 0005), next 20.440
Generation 4000 has 5 alleles, max freq 55.9% (allele 0005), next 26.460
Generation 5000 has 4 alleles, max freq 48.8% (allele 0005), next 29.500
Generation 6000 has 4 alleles, max freq 40.6% (allele 0005), next 37.315
Generation 7000 has 4 alleles, max freq 41.0% (allele 0005), next 39.735
Generation 8000 has 4 alleles, max freq 44.1% (allele 0005), next 35.095
Generation 9000 has 5 alleles, max freq 62.8% (allele 0005), next 25.595
Generation 10000 has 5 alleles, max freq 79.5% (allele 0005), next 15.095
Generation 11000 has 3 alleles, max freq 90.5% (allele 0005), next 5.910
Generation 12000 has 3 alleles, max freq 94.3% (allele 0005), next 4.750
Generation 13000 has 3 alleles, max freq 91.5% (allele 0005), next 8.505
Generation 14000 has 2 alleles, max freq 99.9% (allele 0005), next 0.090
Generation 15000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 16000 has 2 alleles, max freq 99.8% (allele 0005), next 0.210
Generation 17000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 18000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 19000 has 2 alleles, max freq 99.8% (allele 0005), next 0.210
Generation 20000 has 2 alleles, max freq 96.9% (allele 0005), next 3.150
Generation 21000 has 3 alleles, max freq 95.7% (allele 0005), next 4.270
Generation 22000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 23000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 24000 has 2 alleles, max freq 99.5% (allele 0005), next 0.520
Generation 25000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 26000 has 2 alleles, max freq 97.7% (allele 0005), next 2.260
Generation 27000 has 3 alleles, max freq 99.5% (allele 0005), next 0.490
Generation 28000 has 2 alleles, max freq 100.0% (allele 0005), next 0.005
Generation 29000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 30000 has 3 alleles, max freq 99.8% (allele 0005), next 0.235
Generation 31000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000
Generation 32000 has 2 alleles, max freq 96.9% (allele 0005), next 3.080
Generation 33000 has 2 alleles, max freq 99.9% (allele 0005), next 0.110
Generation 34000 has 2 alleles, max freq 100.0% (allele 0005), next 0.045
Generation 35000 has 2 alleles, max freq 96.1% (allele 0005), next 3.915
Generation 36000 has 2 alleles, max freq 92.4% (allele 0005), next 7.560
Generation 37000 has 2 alleles, max freq 98.3% (allele 0005), next 1.720
Generation 38000 has 3 alleles, max freq 100.0% (allele 0005), next 0.005
Generation 39000 has 2 alleles, max freq 98.9% (allele 0005), next 1.130
Generation 40000 has 1 alleles, max freq 100.0% (allele 0005), next 0.000

We might think that we ought to raise the mutation rate enormously in order to compensate; if we raise the rate by a factor of, say, 130, then:

Up from one--
code:
Generation      0 has    1 alleles, max freq 100.0% (allele 0000), next  0.000
Generation 1000 has 28 alleles, max freq 99.8% (allele 0000), next 0.080
Generation 2000 has 22 alleles, max freq 99.9% (allele 0000), next 0.050
Generation 3000 has 27 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 4000 has 26 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 5000 has 29 alleles, max freq 96.6% (allele 0000), next 2.095
Generation 6000 has 35 alleles, max freq 96.5% (allele 0000), next 2.725
Generation 7000 has 28 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 8000 has 25 alleles, max freq 99.8% (allele 0000), next 0.105
Generation 9000 has 29 alleles, max freq 99.3% (allele 0000), next 0.340
Generation 10000 has 22 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 11000 has 18 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 12000 has 21 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 13000 has 27 alleles, max freq 98.5% (allele 0000), next 1.390
Generation 14000 has 21 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 15000 has 18 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 16000 has 25 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 17000 has 19 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 18000 has 27 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 19000 has 30 alleles, max freq 98.8% (allele 0000), next 1.105
Generation 20000 has 36 alleles, max freq 99.7% (allele 0000), next 0.155
Generation 21000 has 19 alleles, max freq 99.8% (allele 0000), next 0.095
Generation 22000 has 18 alleles, max freq 99.9% (allele 0000), next 0.025
Generation 23000 has 31 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 24000 has 31 alleles, max freq 98.9% (allele 0000), next 0.980
Generation 25000 has 29 alleles, max freq 99.4% (allele 0000), next 0.435

Down from all--
code:
Generation      0 has 2000 alleles, max freq   0.1% (allele 1110), next  0.105
Generation 1000 has 33 alleles, max freq 27.0% (allele 0004), next 16.675
Generation 2000 has 41 alleles, max freq 24.5% (allele 0004), next 21.710
Generation 3000 has 34 alleles, max freq 28.4% (allele 0000), next 25.115
Generation 4000 has 38 alleles, max freq 29.7% (allele 0003), next 25.865
Generation 5000 has 34 alleles, max freq 44.7% (allele 0000), next 26.525
Generation 6000 has 31 alleles, max freq 64.2% (allele 0000), next 13.750
Generation 7000 has 40 alleles, max freq 83.4% (allele 0000), next 9.750
Generation 8000 has 38 alleles, max freq 84.8% (allele 0000), next 6.205
Generation 9000 has 31 alleles, max freq 75.6% (allele 0000), next 15.570
Generation 10000 has 31 alleles, max freq 46.2% (allele 0003), next 30.675
Generation 11000 has 24 alleles, max freq 46.3% (allele 0003), next 27.155
Generation 12000 has 31 alleles, max freq 43.3% (allele 0003), next 41.040
Generation 13000 has 35 alleles, max freq 47.6% (allele 0006), next 39.570
Generation 14000 has 26 alleles, max freq 38.3% (allele 0003), next 35.415
Generation 15000 has 32 alleles, max freq 40.4% (allele 0003), next 38.325
Generation 16000 has 26 alleles, max freq 39.4% (allele 0003), next 37.665
Generation 17000 has 26 alleles, max freq 62.5% (allele 0003), next 21.070
Generation 18000 has 27 alleles, max freq 59.8% (allele 0003), next 20.320
Generation 19000 has 20 alleles, max freq 63.3% (allele 0003), next 18.360
Generation 20000 has 28 alleles, max freq 73.0% (allele 0003), next 14.400
Generation 21000 has 24 alleles, max freq 54.2% (allele 0003), next 30.475
Generation 22000 has 27 alleles, max freq 53.5% (allele 0003), next 28.075
Generation 23000 has 27 alleles, max freq 44.1% (allele 0003), next 39.695
Generation 24000 has 39 alleles, max freq 40.9% (allele 0006), next 30.825
Generation 25000 has 39 alleles, max freq 49.7% (allele 0000), next 27.280
Generation 26000 has 29 alleles, max freq 74.4% (allele 0000), next 14.025
Generation 27000 has 25 alleles, max freq 87.0% (allele 0000), next 9.390
Generation 28000 has 27 alleles, max freq 86.6% (allele 0000), next 11.185
Generation 29000 has 30 alleles, max freq 82.0% (allele 0000), next 13.665
Generation 30000 has 30 alleles, max freq 89.5% (allele 0000), next 9.055
Generation 31000 has 25 alleles, max freq 94.1% (allele 0000), next 5.765
Generation 32000 has 28 alleles, max freq 98.0% (allele 0000), next 1.600
Generation 33000 has 29 alleles, max freq 93.8% (allele 0000), next 5.980
Generation 34000 has 39 alleles, max freq 95.0% (allele 0000), next 4.635
Generation 35000 has 24 alleles, max freq 90.8% (allele 0000), next 9.115
Generation 36000 has 28 alleles, max freq 94.1% (allele 0000), next 4.900
Generation 37000 has 25 alleles, max freq 97.8% (allele 0000), next 2.035
Generation 38000 has 21 alleles, max freq 99.9% (allele 0000), next 0.020
Generation 39000 has 31 alleles, max freq 99.9% (allele 0000), next 0.005
Generation 40000 has 19 alleles, max freq 99.9% (allele 0000), next 0.010

Now, I have to caution that these results have to be taken with a grain of salt as they're bashing pretty heavily on the random number generator I'm using (which is not a very robust implementation).
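
The generator behind `rng_n` is not shown, so the following is only a sketch of one possible sturdier replacement, not the code used for these runs. It assumes a xorshift64* generator with rejection sampling to avoid modulo bias; the names `rng_seed`, `rng_u64`, and `rng_n_unbiased` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Generator state; xorshift requires that it never be zero. */
static uint64_t rng_state = 88172645463325252ULL;

/* Seed the generator (a zero seed is replaced by 1). */
void rng_seed(uint64_t seed)
{
    rng_state = seed ? seed : 1;
}

/* xorshift64*: returns 64 pseudo-random bits per call. */
uint64_t rng_u64(void)
{
    rng_state ^= rng_state >> 12;
    rng_state ^= rng_state << 25;
    rng_state ^= rng_state >> 27;
    return rng_state * 2685821657736338717ULL;
}

/* Uniform integer in [0, n). Values at or above the largest multiple
   of n are rejected and redrawn, so every result is equally likely. */
int rng_n_unbiased(int n)
{
    assert(n > 0);
    uint64_t bound = (uint64_t)n;
    uint64_t limit = UINT64_MAX - (UINT64_MAX % bound); /* multiple of bound */
    uint64_t x;
    do {
        x = rng_u64();
    } while (x >= limit);
    return (int)(x % bound);
}
```

Rerunning a simulation with a different generator and checking that the qualitative outcome is unchanged is one inexpensive way to see whether the results depend on the generator.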

But, basically, for short genes, the limiting case appears to be either a single allele, or piles of very rare alleles, neither of which matches experiments that well on average across the genome.

This looks like a failure to me--having 99% of the alleles instantly lethal (out of 2000) produces such strong selection that it can't explain all the genes that do have multiple alleles.

Is there another hypothesis to test? I'm sure we can find a specific percentage of lethal alleles and mutation rate that will have as a limiting case any possible distribution of alleles we want.

However, this kind of data-fitting is completely unwarranted, since as my previous models show, there is a wide variability in the frequency of primary and secondary alleles with zero lethality. We don't have the resources here to investigate whether various alleles actually are lethal.

[ 25. March 2003, 17:06: Message edited by: Rex Kerr ]

warren_bergerson
Member
Member # 262

posted 26. March 2003 08:52
Rex,

Boring isn’t always bad. The results show that a high decrement model can produce high concentrations of a limited number of alleles. Three general comments on testing procedures.

ESTIMATING AND MEASURING RESULT DISTRIBUTIONS
First, testing would normally be based on looking at result distributions rather than looking at the results of a single run. Typically you would generate 10, 100, or 1000 runs to establish the mean, variance, and general shape of the distribution of results produced by a set of assumptions. For the test being performed here, you would have an expected distribution for both the high decrement and zero decrement scenarios.
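
As a rough sketch of the bookkeeping this involves, the mean and variance of a per-run statistic (for example, the number of alleles present at the final generation) can be accumulated one run at a time with Welford's method. The names `run_stats`, `run_stats_init`, `run_stats_add`, and `run_stats_variance` are hypothetical, and the choice of statistic is an assumption.

```c
#include <assert.h>

/* Hypothetical running accumulator (not from the thread). */
struct run_stats {
    long n;        /* number of runs recorded so far */
    double mean;   /* running mean of the statistic */
    double m2;     /* running sum of squared deviations from the mean */
};

void run_stats_init(struct run_stats *s)
{
    s->n = 0;
    s->mean = 0.0;
    s->m2 = 0.0;
}

/* Record the result of one more run (Welford's update). */
void run_stats_add(struct run_stats *s, double x)
{
    s->n++;
    double delta = x - s->mean;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean);
}

/* Sample variance; needs at least two runs. */
double run_stats_variance(const struct run_stats *s)
{
    assert(s->n > 1);
    return s->m2 / (double)(s->n - 1);
}
```

Each simulation run would call `run_stats_add` once with its final statistic. Keeping one accumulator per scenario yields a mean and variance for the high decrement and zero decrement cases that can then be compared.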

In addition to calculating distributions of expected results, it would typically be useful to observe multiple ‘actual’ results. For the human genome, we can look at the distribution of alleles for multiple genes. Given the assumption that ‘effective population sizes are quite small’, we can also obtain results for different sub-populations. Of particular interest would be those that have been separated for a period of time.

I get the impression that in a wide variety of species, if you sample allele distributions per gene, you will find a J curve, with one-allele genes being most common, then two-allele, then three-allele. There are undoubtedly exceptions, but the J curve is at least common. I would predict, without any specific knowledge of the subject, that you will find the J curve distribution in species with both large populations and small populations. [Note: The fact that there are two copies of each gene could complicate this prediction.]

Again based on the results shown, it appears almost certain that the high decrement distribution of results is consistently closer to the actual observed results than the zero decrement results. Given a simple choice between a high decrement model and a zero decrement model, the high decrement model will consistently provide a better fit to observed results.

SENSITIVITY TESTING
While it seems clear that the high decrement model will provide a better fit than the zero decrement model, I don’t know how close the fit will be or how easy it would be to improve the fit by modifying assumptions. Running a few ‘what if’ scenarios or sensitivity tests can give you an idea of what could happen. You have provided a number of interesting examples of ‘what if’ analysis.

One interesting, if somewhat complex, form of sensitivity testing would be to look at the impact of interbreeding. If you have two separate populations, then over time you would expect to find some level of divergence in the set of rare or uncommon alleles present in each population. I would expect that even a fairly low rate of interbreeding between the two populations would have an impact similar to (but obviously slower than) doubling the effective population size.

SUCCESSIVE APPROXIMATION
One of the basic ideas behind ‘whole system’ modeling is that the results obtained from one set of studies are carried forward into the next series of studies. Once we have established what types of assumptions are compatible with or can explain the observed distributions of alleles, then those assumptions are used in the next set of tests.

Once we can agree on a set of assumptions that fits the distribution of alleles in the human population, we would then consider whether those assumptions are compatible with, or can be reconciled with, the results observed in some other species.

NEXT HYPOTHESIS
Since we do not yet appear to agree on the decrement rate hypothesis, we still need to see if that issue can be resolved. I would suggest we start by comparing the range of results produced by two sets of assumptions.

Frances
Member
Member # 169

posted 26. March 2003 12:55
Frances. Stop the posturing.

[ 26. March 2003, 13:29: Message edited by: Moderator ]

Rex Kerr
Member
Member # 632

posted 26. March 2003 17:14
quote:
Again based on the results shown, it appears almost certain that the high decrement distribution of results is consistently closer to the actual observed results than the zero decrement results. Given a simple choice between a high decrement model and a zero decrement model, the high decrement model will consistently provide a better fit to observed results.
Utterly absurd. I have cited articles, figures, statistics, and so on, that you have agreed with before that clearly show that there are multiple alleles per gene for a substantial number of genes; that heterozygosity is on the order of 20%, and so on. The model based on your specifications shows a process that stabilizes at an allele frequency of 99.9%, or a heterozygosity of 0.2%!
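
The 0.2% figure is presumably the expected heterozygosity under random mating, H = 1 - sum(p_i^2): with one allele at 99.9% and the remaining 0.1% treated as a single allele (an assumption), H = 1 - 0.999^2 - 0.001^2, or about 0.002. A minimal sketch of that arithmetic follows; the function name `expected_heterozygosity` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Expected heterozygosity H = 1 - sum(p_i^2), assuming random mating.
   freq[] holds allele frequencies as fractions that sum to 1. */
double expected_heterozygosity(const double *freq, size_t n)
{
    double sum_sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(freq[i] >= 0.0 && freq[i] <= 1.0);
        sum_sq += freq[i] * freq[i];
    }
    return 1.0 - sum_sq;
}
```

Passing the frequencies {0.999, 0.001} gives roughly 0.002, that is, 0.2%. Splitting the rare 0.1% among many alleles changes the result only in the sixth decimal place.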

Aside from wishful thinking, what are you basing your statement on, anyway?

quote:
First, testing would normally be based on looking at result distributions rather than looking at the results of a single run. Typically you would generate 10, 100, or 1000 runs to establish the mean, variance, and general shape of the distribution of results produced by a set of assumptions.
I don't have the time and computing resources to do a thousand runs of every conceivable hypothesis. You are welcome to do so if you wish.

I agree that there are interesting questions with respect to inbreeding, distributions of alleles in humans, sexual reproduction, linkage disequilibrium, and so on. I suggest you look through the list of articles that Mesk has provided, though, or do the computations yourself. I've provided code, simulations based on data that are well-constrained by the experimental data, and shown results that are consistent with observation to within a factor of two. For the type of "back of the envelope" calculation typical for message boards, this already seems excessive (and in reasonable agreement with experiment).

If you want to make extravagant claims such as "the high decrement model will consistently provide a better fit to observed results", I think it is high time that you showed results that gave some indication of this.



All times are East Coast

All content © ISCID and content contributor 2001-2003
