|
Author
|
Topic: Simulating Self Assembly
|
RBH
Member
Member # 380
|
posted 13. February 2003 11:49
warren wrote quote: For reasons that seem to have been abandoned and forgotten, neo-Darwinian genetics was based on the concept of random variation. However, if you abandon the concept of random variation for directed or non-random variation you are faced with the problem of explaining where did the direction come from? Conventional biology, in abandoning insistence on random variation, has conveniently avoided explaining where the 'non-random' searches came from.
Um. I have no idea where warren gets the notion that "conventional biology" has abandoned random variation. There are several misconceptions embedded in warren's posting:
1. No one to my knowledge ever thought Darwinian evolution is a purely random search process. In fact, as I have argued, biological evolution is not appropriately modeled as a search process.
2. No one has forgotten that mutations are random with respect to the selective environment.
3. No evolutionary biologist or geneticist that I know of argues that mutations are somehow systematically non-random. There was some research a while back (I've temporarily forgotten the name of the researchers) that seemed to suggest non-random mutations occurred in bacterial colonies under stress, but as I understand it, that has not been reliably replicated.
4. No biologist that I know believes that there is only a single fitness landscape on which biological evolution must occur. Biological populations evolve on several fitness landscapes simultaneously.
5. There is no particular need to match the "right" landscape with the "right" search routine. All that standard evolutionary processes require is that the surfaces of fitness landscapes be locally correlated. That's all. And, as Wein's essay referenced above suggests, there are ample grounds to believe that natural fitness landscapes have topographies that tend to be locally correlated. We find the same thing in our GAs: The fitness landscapes have locally correlated topographies.
The very elementary point that Gedanken and I were making is that the process of biological evolution incorporates both random variation and non-random selection (differential reproduction as a function of fitness) and is therefore not a random process. It is a distinctly non-random process. That's all.
Once again, I suggest that warren focus on explicating his model and cease to try to contrast it (whatever it is) with misrepresentations of current approaches. All the latter tactic does is put off people who know about such things, which makes it harder for him to get a sympathetic hearing.
I guess I'll have to wait for the promised examples to understand what it is warren's TA model is supposed to account for and how it will model those phenomena.
RBH
Edited to add "locally" to every occurrence of "correlated." Strictly speaking, evolution by random variation and fitness-dependent selection reduces to a stochastic process only if the topographies of the several fitness landscapes on which a population is evolving are locally uncorrelated in all their dimensions. [ 13. February 2003, 12:14: Message edited by: RBH ]
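A minimal Python sketch of the local-correlation point, offered as an illustration only and not as code from anyone in the thread. The two landscapes, the 12-bit genotype, and the names `smooth_fitness`, `make_rugged_fitness` and `evolve` are all assumptions made for the example. The same loop of random one-bit variation plus keep-if-not-worse selection is run on a locally correlated landscape and on a locally uncorrelated one.

```python
import random

N_BITS = 12  # genotype length; an arbitrary choice for this sketch


def smooth_fitness(g: int) -> float:
    """Locally correlated: one-bit neighbours differ in fitness by exactly 1/N_BITS."""
    return bin(g).count("1") / N_BITS


def make_rugged_fitness(seed: int):
    """Locally uncorrelated: every genotype gets an independent random fitness."""
    rng = random.Random(seed)
    table = [rng.random() for _ in range(2 ** N_BITS)]
    return lambda g: table[g]


def evolve(fitness, steps: int, seed: int):
    """Random one-bit variation plus non-random selection (keep if not worse)."""
    rng = random.Random(seed)
    g = rng.randrange(2 ** N_BITS)
    start = fitness(g)
    for _ in range(steps):
        mutant = g ^ (1 << rng.randrange(N_BITS))  # random variation
        if fitness(mutant) >= fitness(g):          # selection
            g = mutant
    return start, fitness(g)


smooth_start, smooth_end = evolve(smooth_fitness, steps=2000, seed=1)
rugged_start, rugged_end = evolve(make_rugged_fitness(seed=2), steps=2000, seed=1)
print("locally correlated landscape:  ", smooth_start, "->", smooth_end)
print("locally uncorrelated landscape:", rugged_start, "->", rugged_end)
```

On the correlated landscape every accepted step moves toward the single peak. On the uncorrelated one the same process can only wander until it sits on a local peak, because a neighbour's fitness says nothing about what lies beyond it.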
|
|
warren_bergerson
Member
Member # 262
|
posted 14. February 2003 10:47
As has been discussed before, the concept of ‘variation-selection’ processes that Darwin used in developing his theory of evolution, has its origins in the Greek concept of teleological causation. Given today's technology, variation-selection processes are most effectively, usefully, and precisely defined in terms of ‘variation-selection logic machines’.
The basic or elementary mathematical variation-selection logic machine can be characterized as a ‘random-one-alternative-per-cycle’ search engine which searches a defined solution set in order to find the option with the highest level of fitness. This basic variation-selection machine is a well known mathematical concept and there should be no need to list each of the components involved.
Starting with this basic variation-selection logic machine we can precisely define a wide range of variations involving 1) multiple variation-selection cycles, 2) processes to compare multiple options per cycle, and 3) non-random variation processes. It is possible to precisely define some very complex versions of the basic variation-selection machine. All the definable variations of the basic variation-selection machine perform, or attempt to perform, the same logical search operation. Because they all perform the same operation, it is possible to quantify and compare the performance of different machines.
The technique defined here for quantifying the performance of a variation-selection logic machine is to compare the performance of all machines to the standard or basic random-one-option-per-cycle machine. Because the basic machine has the ability to perform the same searches as any of the more complex machines, we have a universal standard for defining performance of variation-selection search processes.
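One possible reading of the basic machine and of the comparison against it, sketched in Python. The thread gives no formal definition, so everything here is an assumption: the solution set is a list of fitness values, a cycle draws options uniformly at random, and performance is the mean number of cycles until the fittest option has been seen. The names `run_machine` and `mean_cycles` are hypothetical.

```python
import random


def run_machine(fitness, options_per_cycle, rng):
    """Count cycles until the machine has seen the fittest option in the solution set."""
    best_possible = max(fitness)
    best_seen = float("-inf")
    cycles = 0
    while best_seen < best_possible:
        cycles += 1
        for _ in range(options_per_cycle):
            candidate = rng.randrange(len(fitness))         # random variation
            best_seen = max(best_seen, fitness[candidate])  # selection keeps the best so far
    return cycles


def mean_cycles(fitness, options_per_cycle, trials=2000, seed=0):
    """Average cycle count over many seeded runs."""
    rng = random.Random(seed)
    total = sum(run_machine(fitness, options_per_cycle, rng) for _ in range(trials))
    return total / trials


solution_set = list(range(100))  # 100 options; fitness is simply the option's value
baseline = mean_cycles(solution_set, options_per_cycle=1)       # one random option per cycle
five_per_cycle = mean_cycles(solution_set, options_per_cycle=5)
relative_performance = baseline / five_per_cycle
print("baseline mean cycles:", baseline)
print("five-per-cycle mean cycles:", five_per_cycle)
print("performance relative to baseline:", relative_performance)
```

Dividing by the baseline gives a single number per machine, which seems to be the kind of universal yardstick the post describes. Whether cycles-to-best is the intended measure is not stated in the thread.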
Mathematical techniques exist for precisely defining different types of variation-selection machines and techniques exist for comparing performance of different types of machines under different conditions. Using these techniques, concepts such as ‘random variation’ and ‘whole system or natural selection’ can be given precise mathematical definitions.
As has been discussed this week, we have the ability to 1)define and model any evolutionary change and 2)identify variation-selection systems capable of modeling and simulating any such change. Given these basic capabilities, it is then possible to evaluate whether some ‘sub-type’ of variation-selection machine has the performance capabilities needed to model and simulate observed occurrences of evolutionary change.
It would, for example, be possible to develop mathematically precise definitions of variation-selection machines using ‘random variation’ and/or ‘natural selection’. It would then be possible to test hypotheses such as "identified occurrences of evolutionary change can be modeled and simulated by precisely defined variation-selection machines of type X".
Tools and techniques have been defined for formally expressing Darwinian and neo-Darwinian concepts as testable scientific models. Proponents of Darwinian concepts are welcome to use these defined techniques to demonstrate the validity of their approaches to evolutionary change. Given the obvious levels of complexity involved, it is, IMO, extremely doubtful that anything even vaguely resembling Darwinian concepts can explain evolutionary change. However, the techniques are readily available for any proponent of Darwinian concepts to develop a Darwinian theory and demonstrate its validity.
Given the availability of techniques to formulate mathematically precise expressions of Darwinian concepts, the validity of vague, imprecise, ‘pseudo-mathematical’ demonstrations becomes increasingly dubious. Demonstrations and arguments supporting the Darwinian approach which are based on mathematically ambiguous definitions of ‘random versus non-random variation’, or ‘natural selection versus multiple within lifetime selection processes’ or demonstrations based on unrealistic assumptions, are far less convincing when rigorous, precisely defined techniques are readily available.
The subject of this thread was and is to discuss a new experimental paradigm for analyzing developmental processes specifically and biological information processing in general. In presenting the techniques involved, it is inevitable that comparisons will be made to the quasi-mathematical techniques and arguments sometimes used by some individuals to support the Darwinian approach. Many of the techniques and arguments used in biology do not appear to stand up to the more rigorous techniques being defined here.
|
|
RBH
Member
Member # 380
|
posted 15. February 2003 00:00
I'm about done with this thread. warren wrote quote: Many of the techniques and arguments used in biology do not appear to stand up to the more rigorous techniques being defined here.
Since the "more rigorous techniques" have not yet been described in enough detail to actually implement them in code or equations, it's hard to evaluate that remark.
We were promised examples. I'm signing off until I see them.
RBH
|
|
warren_bergerson
Member
Member # 262
|
posted 15. February 2003 09:26
It is time to get back to the main subject of this thread which is modeling and analyzing self assembly. As previously discussed, a developmental process can be modeled by a set C of automated assembly instructions or by a set P of partial assembly instructions. It will be useful at this time to take a high level overview of the types of assembly instructions actually involved in assembling a multicellular organism.
It appears, based on currently available information that a major portion of the developmental/assembly processes can be defined in terms of two general types of assembly instructions. These two types of instructions are
1. Divide/don’t divide cell instructions and 2. Activate gene/don’t activate gene
Both types of instruction involve complex reactions or responses. The divide cell or activate gene response could undoubtedly be broken down into complex sets of responses or reactions. For the discussion here, however, these will be treated as simple yes-no reactions or responses. The discussion will only be concerned with the complexity of the trigger or stimulus of the S-R relationships involving cell division or gene activation. To simplify matters further, the discussion will focus on ‘cell divide’, which would appear to be primarily responsible for the physical shape of the organism.
If we treat ‘cell divide’ as a single S-R relationship, then this relationship has far in excess of trillions of occurrences during the lifetime of a complex organism. It appears that the stimuli or triggers that initiate cell division are very precise and, given the complexity of the physical shapes produced, specific cell division ‘decisions’ seem to be controlled by different and precise triggers. In looking at the complex triggers associated with these S-R relationships we want to ask the following questions:
1. Are the complex triggers or trigger mechanisms a) stored in the initial egg cell, or b) modified by information processing during the lifetime of the organism?
2. How do the complex or dynamic trigger mechanisms evolve?
3. Are there materialistic explanations for the proposed trigger and evolutionary processes? and finally
4. How can you construct predictive mathematical theories describing these S-R relationships?
1. FIXED VERSUS DYNAMIC TRIGGERS OR STIMULI
There are two broad options for describing or defining the cell division trigger: fixed or dynamic. If the trigger is described or modeled as fixed, then the same complex set of stimuli are always responsible for triggering cell division. If the cell division trigger is defined or modeled as dynamic, then there is assumed to be some process, some type of information processing, capable of modifying the trigger.
We know from observing nature that objects with the appearance of complexity, such as snowflakes, can be assembled from instructions that are or can be described and modeled as simple and fixed. We also know that the assembly instruction associated with cell division is not a simple fixed trigger.
Since it is recognized that the physical shapes of organisms can change or evolve, then it is recognized that trigger mechanisms associated with cell division can change or evolve. The issue here is whether the cell division trigger is defined or viewed as ‘fixed during the lifetime of the organism’ or ‘dynamic and modifiable’ during the lifetime of the organism. Are there information processing mechanisms in the organism which can modify the cell division trigger during the organism's lifetime?
I believe there is evidence that the cell division triggers can and do change during an organism's lifetime. At the very least, we know that the cell division triggers can become faulty, causing uncontrolled and harmful cell division.
It may be difficult for some to accept, but the question of ‘fixed versus dynamic’ cell division triggers depends not on nature but on the ‘practical concerns of the analyst’. The scientist or analyst has the option of 1)attempting to define a fixed cell division trigger that can model all cell division decisions, or 2)modeling the changes in cell division triggers using within lifetime ‘variation-selection’ processes. There are probably other logical options for modeling and analyzing the cell division S-R relationship, but those are the two options considered here.
It is in theory possible to use either the fixed or the dynamic ‘perspective’ in modeling and analyzing the cell division trigger. The criterion for determining which approach is ‘better’ is which approach produces the most meaningful and useful results.
If we use the fixed trigger approach, then as knowledge accumulates, we can begin to quantify the complexity of the instruction, and the complexity of the evolutionary process required to modify the instruction. If you use the dynamic trigger approach, then an increased understanding of the complexity suggests the existence of increasingly complex within-lifetime processing capabilities.
2. CHANGE PROCESSES OPERATING ON TRIGGER
In terms of conventional neo-Darwinian theory, the cell division trigger would evolve as the result of Natural Selection operating on the variations in the physical mechanisms which store and define the complex fixed trigger. It will only be noted that the existence of a physical mechanism to store a complex fixed cell division trigger is at this time purely speculative.
Using the dynamic approach, long term evolutionary change is assumed to be the result of a)changes in the variation-selection processes which produce within lifetime changes in the trigger, and b)changes in some of the key parameters impacting variation-selection results. As was discussed earlier, adaptive/evolutionary changes in any such system can be modeled with a TA system involving multiple within lifetime variation-selection processes.
The only substantial difference between the two approaches is the volume of information processing involved in evolutionary change. The dynamic approach suggests the availability of very large information processing capacities. The fixed approach suggests evolutionary change involves very little information processing.
[I will stop here and leave discussion of the other two topics for later.] [ 15. February 2003, 10:05: Message edited by: warren_bergerson ]
|
|
warren_bergerson
Member
Member # 262
|
posted 15. February 2003 10:03
RBH,
Quote: Since the "more rigorous techniques" have not yet been described in enough detail to actually implement them in code or equations, it's hard to evaluate that remark.
But a substantial number of more rigorous concepts and techniques have been introduced here. As a small sample I have:
1. Provided a mathematically precise definition of ‘random variation’ which can be used to address the ambiguity in the use of that term in neo-Darwinian modeling.
2. Provided techniques for simultaneously modeling within-lifetime and between-lifetime variation-selection processes.
3. Provided mathematically precise methods of measuring the complexity of biological systems and the complexity of the change processes.
4. Provided techniques for solving the genotype phenotype map problem.
5. Provided a basis for clearly defining natural selection and distinguishing it from other forms of selection.
6. Provided techniques to model and simulate evolutionary/adaptive change for highly dynamic fitness landscapes.
One of the key problems in the analysis of biological systems has been the failure to properly and precisely define variation-selection systems. Specifically, the definitions of variation selection systems have failed to properly recognize within lifetime variation-selection processes. Once these issues are addressed, as they have been here, it becomes possible to quantify the complexity of biological design and biological design processes.
A second key problem is the widespread use of ambiguous definitions of key terms such as random variation and natural selection. The concepts introduced here provide a basis for addressing the ambiguous definitions.
The discussion to date has introduced more rigorous and precise concepts and techniques. I cannot tell if you understand the concepts being introduced unless you ask specific questions.
|
|
Rex Kerr
Member
Member # 632
|
posted 16. February 2003 04:46
Well, I'll venture back in here for a post or so.
quote: If we use the fixed trigger approach, then as knowledge accumulates, we can begin to quantify the complexity of the instruction, and the complexity of the evolutionary process required to modify the instruction. If you use the dynamic trigger approach, then an increased understanding of the complexity suggests the existence of increasingly complex within-lifetime processing capabilities.
What if we use a fixed-implementation-of-dynamic-trigger? Which is essentially what organisms actually use, as far as we know, and which evolution is free to act upon?
Why can't evolution select for massive within-lifetime processing capabilities? Sure, all that processing gets lost when the organism dies, but that doesn't matter; it survived to reproductive age, and its progeny can go through the processing again.
Maybe this is what you are alluding to when you say that the fixed vs. dynamic view depends on the perspective of the analyst.
quote: The only substantial difference between the two approaches is the volume of information processing involved in evolutionary change. The dynamic approach suggests the availability of very large information processing capacities. The fixed approach suggests evolutionary change involves very little information processing.
Did you say this backwards, or am I misunderstanding you? It seems to me that if you have a dynamic process that can use environmental stimuli to guide cell division, then you need to store less information in the genome, which means that the evolutionary changes would have to transmit less information per generation.
Or is that exactly what you meant? It's hard to tell. Are you focusing on the heritable information, or the total external information processed by the organism (essentially all of which is not heritable)?
quote: a substantial number of more rigorous concepts and techniques have been introduced here.
But without examples, and without a way to test whether they're actually useful. The existence of some level of rigor is nice, but the claims being made far exceed our ability to test them using what has been presented so far.
In some cases, the methods seem so impractical as to be completely useless, for instance, quote: There are 'standard' scientific analytical techniques which can reasonably be expected to make it possible to represent or model any developmental transformation or operational transformation 'A transforms to B' as a complete set C or a partial set P of automated assembly instructions.
sounds like an incredible amount of work if you're going to try to actually capture all of biological complexity at all levels. Maybe you mean to focus on only one level at a time--as in your example of picking out "cell division (Y/N)" and "gene activation (Y/N)" as the output of a TA.
That's why a more complete example would be so helpful. I think I sort of understand the intuition driving your attempt (an intuition that I think is flawed, incidentally), but when it comes to trying to apply any of it, I have a hard time even imagining how I might do it. Thus when you say, quote: Tools and techniques have been defined for formally expressing Darwinian and neo-Darwinian concepts as testable scientific models.
I don't find any tools that I can actually use. I don't know how to apply them. In cases where I might (very high-knowledge cases), I already know other ways to address the same issue. (Phylogenetic trees, gene substitution, etc.)
|
|
warren_bergerson
Member
Member # 262
|
posted 17. February 2003 08:20
Before getting to points raised by Rex, let me finish what I started Saturday. To briefly review, I suggested that a key part of self assembly analysis involves analyzing assembly instructions with a general form such as ‘stimulus sx causes or triggers cell division’ and ‘stimulus not sx causes or triggers not cell division’.
As discussed, both evolutionary biology and design science assume that over time sx changes and that the process responsible for changing sx can be described in terms of variation-selection processes. Design science suggests that the biological evolutionary/adaptive change processes can be represented, modeled, and simulated by a real time, multiple simultaneous selection process TA system. Darwinian concepts suggest the changes in sx are explainable by a simpler, more restrictive ‘whole system natural selection model’.
In the discussion I introduced the following four questions relating to the analysis of cell division triggers.
1. Are the complex triggers or trigger mechanisms a) stored in the initial egg cell, or b) modified by information processing during the lifetime of the organism?
2. How do the complex or dynamic trigger mechanisms evolve?
3. Are there materialistic explanations for the proposed trigger and evolutionary processes? and finally
4. How can you construct predictive mathematical theories describing these S-R relationships?
My earlier post addressed questions 1 and 2. Today I will address questions 3 and 4.
MATERIALISTIC EXPLANATIONS
Key to any model or theory of evolutionary/adaptive change is the question of compatibility with the known materialistic laws of chemistry and physics. Models and theories of evolutionary change are more scientifically rigorous if they can be reduced to or explained in terms of known laws of chemistry and physics. Similarly, it might be argued, the failure to reduce models and theories to known laws implies the possible existence of non-materialistic forces.
The approach being described here assumes the existence in individual cells of processes ‘logically equivalent’ to very powerful variation-selection logic machines performing very complex teleological searches. As will be recalled, the automated assembly approach reduces complex assembly processes to sets of S-R relationships. It may not be immediately obvious, but complex variation-selection processes are also reducible to or can be expressed as sets of automated S-R relationships. Since each S-R relationship in the set is a causal relationship known to conform to the known laws of physics and chemistry, it follows that multiple variation selection processes conform to the known laws of physics and chemistry.
It is probably not obvious, but 1)the assembly, operations, and adaptive/evolutionary changes in biological systems can be modeled by sets of S-R relationships, 2)the assembly, operations, and adaptive/evolutionary changes in biological systems can be modeled by sets of variation-selection processes and 3)sets of variation-selection processes are logically/mathematically equivalent to sets of S-R relationships. An S-R relationship can be viewed or defined as a permanent and universal causal relationship or as a dynamic and teleological causal relationship.
Analyzing the operations of biological systems in terms of multiple complex variation-selection processes is, it can be demonstrated, logically equivalent to analyzing biological systems in terms of chains or sets of permanent and universal causal relationships. Again it is probably not obvious, but the operation of biological systems is modeled in terms of teleological variation-selection processes because the approach is more practical and more productive than attempting to define sets of S-R relationships.
PREDICTIVE SCIENTIFIC THEORIES
One of the great disappointments in the life sciences has been the lack of success in formulating valid mathematical-predictive scientific theories. "Ideally", science should produce theories which can be expressed in the form F(S)=R where given F and S we could reliably predict R. Hypothetically, if we had such a theory of evolutionary change processes, and we knew the conditions or S associated with apes, we could ‘predict the evolution of man’. Given the law F governing human behavior and the appropriate input S, we could predict behavior R.
Clearly such predictive theories do not exist. The traditional interpretation is that with respect to biological systems, both F and S are too complex to make it possible or practical to formulate predictive theories. The assembly instruction approach suggests instead that biological systems involve (or can be viewed as involving) not complex causal relationships, but very large sets of very simple causal relationships. The analytical problem is not the complexity of the causal relationships, but finding a practical and productive method of expressing very large sets of simple relationships.
If we look at individual causal relationships or assembly instructions such as ‘sx causes cell division’, we can either view sx as a ‘complex permanent input or cause’ or as a ‘simple but changeable or dynamic cause or trigger’. Viewed as simple but ‘temporary and local’, it becomes relatively easy to find algorithms F(sx)=r which could predict cell division at some point in time. The ‘problem’ is that if F is viewed as dynamic, then it becomes impossible to use the scientific paradigm to test/validate the theory.
The design science solution to this problem is to formulate theories in the form F(sx, g)=r. In such a formulation, F is simple/practical, the stimulus sx is relatively simple, the goal or stimulus g is relatively simple and F(sx, g) can produce a reliable prediction r.
The use of teleological theories is based on practical considerations. It is recognized that ‘r’ could, in theory, be predicted as the result of a very long chain of causal relationships. Such predictions are simply not practical or useful. It is also recognized that there are many equations of the form F which can be used to predict a specific set of occurrences of an r such as cell division. The functions of the form F(sx, g)=r are not unique, but they are useful.
|
|
warren_bergerson
Member
Member # 262
|
posted 17. February 2003 13:11
Rex,
Quote: Why can't evolution select for massive within-lifetime processing capabilities? Sure, all that processing gets lost when the organism dies, but that doesn't matter; it survived to reproductive age, and its progeny can go through the processing again.
If you accept that biological systems can be modeled by TA’s with massive within lifetime processing, then you can precisely define, formulate, and test a variety of mathematical theories of evolution including various forms of Darwinian and neo-Darwinian theory.
If Darwin is interpreted to mean ‘natural between generation selection and other within life time selection’ and ‘random and non-random variation’, then Darwin and what I am proposing are the same. If you use a narrower interpretation of Darwin such as ‘evolution is the result of only natural selection’, then the use of TA’s shows that a theory based on the restrictive form is inadequate to explain evolutionary change.
If you accept that evolutionary/adaptive change in biological systems can be modeled or simulated by variation-selection processes, then the amount or volume of variation-selection processing must be compatible with the complexity of the biological systems. Natural selection only systems do not have anywhere near the processing capacity to explain the complexity of biological systems.
Heritability is a separate issue. When modeled in terms of variation-selection processes, there is a fundamental difference between ‘inherits a set of adaptive solutions’ and ‘inherits a set of processes capable of calculating or finding adaptive solutions’. Darwin is based on the ‘inherits solutions’ concepts. If you consider the ‘inherits processing capabilities’ concept, you generate very different models and theories of evolutionary/adaptive change.
Quote: Did you say this backwards, or am I misunderstanding you? It seems to me that if you have a dynamic process that can use environmental stimuli to guide cell division, then you need to store less information in the genome, which means that the evolutionary changes would have to transmit less information per generation.
A good question. The issue is the overall complexity of the system. In order to explain or model evolutionary change in terms of a slow/weak Darwinian change process you need to assume that the evolving system has a limited amount of complexity and evolutionary changes involve small changes in complexity. In a dynamic system involving vast amounts of within lifetime processing, the percentage of inherited information relative to information needed for adaptive solutions is small, but the overall volume of information inherited is still larger.
It is worth repeating the comments about the long term trend in analyzing biological systems. When DNA was first discovered, scientists believed that the DNA and genes had more than enough ‘storage capacity’ to explain the level of complexity associated with life forms. Based on what is known today about the complexity of biological systems, I suspect many analysts would question the ability of DNA to store the required level of information. If you analyze biological systems in terms of automated assembly instructions, I believe it will become apparent that life forms must involve information generating capabilities in addition to information storage and transmission capabilities.
Quote: But without examples, and without a way to test whether they're actually useful. The existence of some level of rigor is nice, but the claims being made far exceed our ability to test them using what has been presented so far
Most of the discussion to this point has centered on the mathematical/logical differences between TA’s and GA’s. The description of those differences has, IMO, been reasonably clear and unambiguous. IMO, the basic information is available to evaluate the claims listed. The problem, IMO, is a general lack of understanding of the mathematics being used.
The mathematics associated with variation-selection logic machines, variation-selection modeling and variation-selection simulation is reasonably complex. Understanding and being able to evaluate and discuss the differences between GA’s and TA’s requires a reasonably in-depth knowledge of the concepts and mathematics involved. Lack of that in-depth knowledge is what makes it difficult to evaluate the conclusions being presented.
Take as a simple example the concept of random variation. In mathematics it can be defined to mean precisely variation where each available or possible variant has an equal probability of occurring. If some other definition of ‘random variation’ is being used then, in terms of mathematics, it should be subject to an equally precise definition. The terms random variation and random mutation, as used in evolutionary biology, clearly are not defined with mathematical precision. That conclusion is readily determined from the information provided.
Quote: I don't find any tools that I can actually use. I don't know how to apply them. In cases where I might (very high-knowledge cases), I already know other ways to address the same issue. (Phylogenetic trees, gene substitution, etc..)
It would not be difficult to develop a simplified set P1 of partial assembly instructions for something like the shape of a wing. It would be similarly easy to develop a set P2 for a modified wing shape. It would then be easy to construct variation-selection models operating on the assembly instructions in P1 to simulate the change from P1 to P2. The models could be used to measure the expected time needed to produce the P1 to P2 change based on different types of change processes. If the results aren’t conclusive, increase the number of assembly instructions and/or the complexity of different instructions. This is not, IMO, a very difficult exercise. If that is too difficult, start with a set of n y/n instructions where each instruction has m possible inputs. Let P1 be one set of n assembly instructions and let P2 be some other set. Compare the speeds of evolving from P1 to P2 using ‘random variation and natural selection’ to a system using within life time variation-selection processes. The techniques involved, the concepts and the conclusions produced are not, IMO, particularly complex.
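A hedged Python sketch of one arm of the simplified exercise, under assumptions the thread does not supply. Each of the n y/n instructions is represented only by which of its m possible inputs acts as the trigger, fitness is taken to be the number of instructions that already match P2, and selection never accepts a worse set. The function name `generations_to_reach` and the values n = 20, m = 8 are hypothetical. The within-lifetime arm is left out because the thread gives no definition from which it could be coded.

```python
import random


def generations_to_reach(p1, p2, m, rng, max_generations=100_000):
    """Random variation plus selection: generations until instruction set p1 becomes p2."""
    current = list(p1)
    matches = sum(a == b for a, b in zip(current, p2))
    if matches == len(p2):
        return 0
    for generation in range(1, max_generations + 1):
        i = rng.randrange(len(current))   # pick one instruction at random
        new_trigger = rng.randrange(m)    # random variation of its trigger
        old_match = current[i] == p2[i]
        new_match = new_trigger == p2[i]
        if new_match >= old_match:        # selection: never accept a worse set
            current[i] = new_trigger
            matches += new_match - old_match
        if matches == len(p2):
            return generation
    return None  # not reached within max_generations


n, m = 20, 8
setup_rng = random.Random(0)
p1 = [setup_rng.randrange(m) for _ in range(n)]
p2 = [setup_rng.randrange(m) for _ in range(n)]
mismatches = sum(a != b for a, b in zip(p1, p2))

generations = generations_to_reach(p1, p2, m, random.Random(1))
blind_search_size = m ** n  # number of complete sets a selection-free search would draw from
print("instructions that differ:", mismatches)
print("generations with variation + selection:", generations)
print("size of the space for blind search:", blind_search_size)
```

Any claim about relative speed would need the second arm specified to the same level of detail, with its own fitness measure and acceptance rule.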
|
|
RBH
Member
Member # 380
|
posted 17. February 2003 14:27
warren wrote quote: The mathematics associated with variation-selection logic machines, variation-selection modeling and variation-selection simulation is reasonably complex. Understanding and being able to evaluate and discuss the differences between GA's and TA's requires a reasonably in-depth knowledge of the concepts and mathematics involved. Lack of that in-depth knowledge is what makes it difficult to evaluate the conclusions being presented.
No, the problem is not that we do not know or are unable to understand the mathematics involved. So far no really complex mathematical models have been described (though the verbalizations of TA have been complicated). I'm a tiny bit tired of being told I don't understand the model because (even with a doctoral minor in statistics) I don't understand the math. The problem is not the complexity of the math - there has been little of that so far - but how warren proposes that the terms and operators map into real objects and processes in the world so the model, whatever it is, can actually be tested against real data. So far TA's connection is pretty tenuous with the phenomena that standard Darwinian evolutionary theory explains quite well, thank you. What specific phenomena does TA represent?
warren also wrote quote: It would not be difficult to develop a simplified set P1 of partial assembly instructions for something like the shape of a wing. It would be similarly easy to develop a set P2 for a modified wing shape. It would then be easy to construct variation-selection models operating on the assembly instructions in P1 to simulate the change from P1 to P2. The models could be used to measure the expected time needed to produce the P1 to P2 change based on different types of change processes. If the results aren't conclusive, increase the number of assembly instructions and/or the complexity of different instructions. This is not, IMO, a very difficult exercise.
Then please do it, showing the equations and calculations so those of us in the slow group can see what you propose. This has been repeatedly advertised as an "experimental paradigm" and we have been told that it is "experimentally verifiable." So, let us see that verification.
Finally, a week or so ago warren offered to give examples to illustrate the four types of "variation-selection" processes. Are those still in prospect or have they fallen by the wayside?
RBH
IP: Logged
|
|
Rex Kerr
Member
Member # 632
|
posted 17. February 2003 20:38
quote: The problem, IMO, is a general lack of understanding of the mathematics being used.
This claim would be more credible if you had presented some mathematics. Show us something specific enough to be amenable to mathematical analysis, and see if we can handle it, hm?
quote: The definitions used in evolutionary biology for terms random variation and random mutation, clearly are not defined with mathematical precision. That conclusion is readily determined from the information provided.
From Molecular Evolution by Wen-Hsiung Li: quote: The assumption that all nucleotide substitutions occur randomly, as in the JC model, is unrealistic in most cases. For example, transitions are generally more frequent than transversions (Ch. 1,7). To take this fact into account, Kimura (1980) proposed a two-parameter model (Fig 3.4), where the rate of transitional substitution [A<->G or C<->T] is a per unit time, whereas the rate of [every other] substitution is b per unit time.
As in the one-parameter model, let p_A(t) be the probability that the nucleotide at the site under consideration is A at time t; p_T(t), p_C(t) and p_G(t) are similarly defined. The probability that the nucleotide at the site is A in the next generation is given by
p_A(t+1) = (1 - a - 2b)p_A(t) + bp_T(t) + bp_C(t) + ap_G(t)
[Derivation omitted, plus equivalent formulas for p_T(t+1) and so on.]
We shall denote the probability that [a nucleotide remains unchanged after t steps] by X(t). It can be shown using Equations (3.13a-c) that
X(t) = 1/4 + e^(-4bt)/4 + e^(-2(a+b)t)/2
The chapter goes on to present a matrix model where the specific rate of substitution from any nucleotide to any other can be at its own rate.
So you can see that people most definitely do use mathematically well-defined models for random mutation.
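To make the quoted two-parameter model concrete, here is a small Python sketch that iterates the per-generation recurrence for p_A(t+1) (and the matching ones for G, C and T) and compares the result with the closed form X(t) quoted above. The function names and the example rates are my own choices for illustration, not Li's. The closed form is the continuous-time solution, so the two should agree only approximately, and only when a and b are small.

```python
import math


def k2p_step(p, a, b):
    """One generation of the Kimura two-parameter recurrence.

    p maps each nucleotide to its probability at the site.
    a is the transition rate (A<->G, C<->T); b is the rate of
    each of the other substitutions.
    """
    stay = 1.0 - a - 2.0 * b
    return {
        "A": stay * p["A"] + a * p["G"] + b * p["C"] + b * p["T"],
        "G": stay * p["G"] + a * p["A"] + b * p["C"] + b * p["T"],
        "C": stay * p["C"] + a * p["T"] + b * p["A"] + b * p["G"],
        "T": stay * p["T"] + a * p["C"] + b * p["A"] + b * p["G"],
    }


def prob_same_iterated(a, b, t):
    """Probability the site is A at time t, given it was A at time 0."""
    p = {"A": 1.0, "G": 0.0, "C": 0.0, "T": 0.0}
    for _ in range(t):
        p = k2p_step(p, a, b)
    return p["A"]


def prob_same_closed_form(a, b, t):
    """X(t) = 1/4 + e^(-4bt)/4 + e^(-2(a+b)t)/2, as quoted above."""
    return 0.25 + 0.25 * math.exp(-4 * b * t) + 0.5 * math.exp(-2 * (a + b) * t)


if __name__ == "__main__":
    a, b, t = 0.001, 0.0005, 100  # illustrative rates, not measured values
    print(prob_same_iterated(a, b, t), prob_same_closed_form(a, b, t))
```

Setting a equal to b collapses this to the one-parameter JC case, where every substitution is equally likely.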
There is no complete model of random mutation as it occurs in organisms, because we don't have a good idea of the rates of different types of mutation. We may not even know all of the different types of mutations. Whole genome sequencing of related species (or different members of the same species) is necessary to understand the rates and processes quantitatively enough for it to be worth bothering to construct a unified mathematical model for 'random mutation'.
There are bits that are worth modeling, however. For example, see Bhan et al. "A duplication growth model of gene expression networks.", Bioinformatics 18(11):1486-93, Nov. 2002. An excerpt:
quote: Several whole-genome expression time series data sets from yeast microarray experiments were analyzed using a Markov-modeling method (Dewey and Galas, FUNC: Integr. Genomics, 1, 269-278, 2001) to infer an approximation to the underlying genetic network. We found that the global statistical properties of all the resulting networks are similar. The overall structure of these biological networks is distinctly different from that of other recently studied networks such as the Internet or social networks. These biological networks show hierarchical, hub-like structures that have some properties similar to a class of graphs known as small world graphs. . . .
We propose network growth models based on gene duplication events. Simulations of these models yield networks with the same combination of global graphical properties that we inferred from the expression data.
I'm not going to attempt to duplicate their model here, but the network topology found in yeast expression data is unlike most network topologies, yet is well-modeled by gene duplication followed by functional divergence. Interesting, isn't it?
Edited for typos. [ 17. February 2003, 20:41: Message edited by: Rex Kerr ]
IP: Logged
|
|
RBH
Member
Member # 380
|
posted 17. February 2003 22:11
In addition to the reference concerning gene expression networks Rex gave, see this paper. It describes an analysis of the network structure of protein sequences, the products of gene expression. quote: Scale-Free Behavior in Protein Domain Networks Stefan Wuchty European Media Laboratory, Heidelberg, Germany Mol. Biol. Evol., 18(9), 1694-1702, (2001)
Abstract Several technical, social, and biological networks were recently found to demonstrate scale-free and small-world behavior instead of random graph characteristics. In this work, the topology of protein domain networks generated with data from the ProDom, Pfam, and Prosite domain databases was studied. It was found that these networks exhibited small-world and scale-free topologies with a high degree of local clustering accompanied by a few long-distance connections. Moreover, these observations apply not only to the complete databases, but also to the domain distributions in proteomes of different organisms. The extent of connectivity among domains reflects the evolutionary complexity of the organisms considered.
The last sentence of the abstract is particularly interesting. Wuchty's data (analyses of the graph structure of protein sequences in several major protein databases) seem to suggest that one can distinguish among major taxonomical groups according to the 'small worldliness' of the graph structures of their protein sequences. This from p 1699: quote: Interestingly, the majority of highly connected InterPro domains appear in signaling pathways, as the list of the 10 best linked domains of different species in table 3 reveals. Obviously, the evolutionary trend toward compartmentalization of the cell and multicellularity demands a higher degree of organization. Therefore, more emphasis is put on the maintenance of inter-and intracellular signaling channels, cell-cell contacts, and integrity. Hence, proteomes have to provide protein sets which cover such cellular demands. The growing number of highly linked domains of signaling and extracellular proteins seen in comparisons of archaea, prokaryotes, and eukaryotes confirms this assumption.
That seems to speak directly to the kind of information processing stuff warren appears to be concerned with.
While I have some reservations (or at least questions) about the data encoding and analysis techniques Wuchty used, nevertheless his paper provides a nice example of a formal analysis of proteins using graph theory with clear evolutionary implications.
Taken together with Rex's remarks, these kinds of papers illustrate the formal models of biological phenomena that warren tells us don't exist. I strongly suggest that he consult that literature to inform his own model building.
RBH
P.S. added in edit for Gedanken: Reading this paper stirred a faint memory of reading something about a search algorithm on small-world networks that gained considerable efficiency by utilizing second-neighbor information. I'll hunt around for it.
RBH
By golly. Google and ye shall find it. [ 17. February 2003, 22:23: Message edited by: RBH ]
IP: Logged
|
|
warren_bergerson
Member
Member # 262
|
posted 18. February 2003 11:03
RBH,
Quote: I'm a tiny bit tired of being told I don't understand the model because (even with a doctoral minor in statistics) I don't understand the math.
A minor in one branch of academic math does not confer expertise in all branches and subspecialties of mathematics. Even for those with credentials and experience with a specific type of math there are huge individual differences in ability to actually apply the mathematics. There are huge differences between abstract academic math and the applied mathematics used in constructing and analyzing complex simulation models. To suggest that a credential in one specialty makes you an expert in all areas is not consistent with what is known of the complexity of mathematics and mathematical modeling. The discussion here needs to address issues not credentials.
One of the central issues in the discussion here is the differences between GA systems and TA systems. These, I suggest, are primarily issues of mathematics and mathematical modeling and issues which can be addressed based on the information presented.
In discussing differences between GA’s and TA’s it is useful to note that:
1. GA’s are computerized search routines based loosely on Darwinian and neo-Darwinian concepts. It would be more accurate to say that GA’s are based loosely on Aristotle’s concept of teleological causation resulting from the interaction of variation-selection processes.
2. GA’s as a mathematical entity appear to be a) poorly or imprecisely defined (random variation and ‘natural selection’ are not clearly defined), b) unnecessarily restrictive (don’t appear to allow multiple within lifetime variation-selection operations), and c) misinterpreted (it is, for example, sometimes claimed that GA’s generate designs when much of the design generation was ‘hidden’ in the design of the GA).
3. GA’s cannot actually be used to model, simulate or analyze evolutionary or adaptive change because GA’s require as input ‘fitness landscapes’ and ‘genotype to phenotype maps’ which, if they exist, are unknown.
The purpose of this thread is to discuss a proposed approach to analyzing biological change processes using TA’s, automated assembly/operating instruction, and predictive ‘teleological theories’. In discussing the TA’s it is useful to start with GA’s because there is a general familiarity with GA’s. However, TA’s are not simply extensions of GA’s but rather modifications designed to address the imprecise and incomplete definitions associated with GA’s, to remove the unnecessary restrictions associated with GA’s, and to make it both possible and practical to model, simulate and analyze actual evolutionary change. The information needed to discuss and evaluate the TA modification of GA’s has been presented and is available for discussion if you understand or are willing to learn about the mathematics involved.
Quote: Then please do it, showing the equations and calculations so those of us in the slow group can see what you propose. This has been repeatedly advertised as an "experimental paradigm" and we have been told that it is "experimentally verifiable." So, let us see that verification.
Your question suggests you are confusing ‘mathematical model or system’, ‘mathematical simulation’, ‘experimental analysis’ and ‘experimentally verifiable’. But if we can address the confusion, then it may be useful to work through some simple examples.
To begin, I think we have agreed that an assembly process can be modeled by a partial set of automated assembly instructions P1. If a change occurs in the assembly process, then the result of the change can be expressed as P2.
The first claim made here is that for any change from P1 to P2 there is an abstract mathematical TA system which can ‘model’ the change from P1 to P2. The validity of this claim should be, but apparently is not, obvious. Consider as a simple example, P1 at time t=1 with instructions (s1, r1) and (s3, r2), and P2 at time t=2 with instructions (s2, r1) and (s4, r2). Assume that at time t=1 the fitness landscape provides ‘survival values’ of 1 for (s1, r1) and (s3, r2) and values of 0 for all other assembly instructions. At time t=2 the fitness landscape provides values of 1 for (s2, r1) and (s4, r2) and zero for all other assembly instructions.
Given the above assumptions, it should be obvious that a variation-selection system of the type described earlier can mathematically model the change from P1 to P2. If you include in your TA the ability to add new members to the set of recognized stimuli S, add new members to R, and add new assembly instructions, then a TA system can model any change from any P1 to any P2. Similarly, a TA system can model any change from any C1 to any C2. This, IMO, is simple mathematics.
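One possible reading of the example above, sketched in Python: variation generates every (stimulus, response) pair from S and R, and selection keeps the pairs to which the landscape assigns a survival value of 1. The function name `select`, the enumeration of all pairs, and the representation of each landscape as a set of surviving pairs are assumptions made for illustration only; the sketch also assumes the landscape at each time is already known.

```python
from itertools import product


def select(stimuli, responses, survival_value):
    """Generate every (s, r) pair and keep those with survival value 1."""
    return {
        (s, r)
        for s, r in product(stimuli, responses)
        if survival_value(s, r) == 1
    }


# Stimuli and responses from the example above.
S = ["s1", "s2", "s3", "s4"]
R = ["r1", "r2"]

# Survival values of 1 for the listed pairs, 0 for everything else.
landscape_t1 = {("s1", "r1"), ("s3", "r2")}
landscape_t2 = {("s2", "r1"), ("s4", "r2")}


def value_at_t1(s, r):
    return 1 if (s, r) in landscape_t1 else 0


def value_at_t2(s, r):
    return 1 if (s, r) in landscape_t2 else 0


P1 = select(S, R, value_at_t1)
P2 = select(S, R, value_at_t2)

if __name__ == "__main__":
    print(sorted(P1))
    print(sorted(P2))
```

In this form each selection pass evaluates len(S) * len(R) candidate instructions, so adding members to S or R enlarges the set that has to be evaluated.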
The second claim is that evolutionary changes which can be modeled by a TA system cannot be modeled by models which incorporate Darwinian and neo-Darwinian concepts. A TA system incorporates Darwinian concepts if you limit selection to ‘between generation- whole system selection’. If you exclude within lifetime variation-selection processes, then you simply do not have the computing capacity needed to model known evolutionary changes.
This conclusion is, IMO, mathematically obvious. The claims that "Darwinian evolutionary processes can explain ……" , are based on seriously flawed mathematical analysis. Specifically, they are based on a failure to quantify the complexity of evolutionary change and they are based on the use of imprecise definitions of selection and variation.
As I have stated on a number of occasions, the subject of this thread is not Darwinian evolution but an experimental paradigm for analyzing self assembly. The mathematical conclusion that Darwin is ‘completely inadequate’ to explain the actual complexity of evolutionary changes, is simply an interesting side feature.
To this point, I have shown that TA systems can model any evolutionary/adaptive change. The next question to be addressed is ‘Do TA systems represent or model physical reality?’ Can biological systems actually generate the logical equivalent of the processing implied by the TA? I will leave that demonstration for later. This should be enough to discuss for one day.
IP: Logged
|
|
warren_bergerson
Member
Member # 262
|
posted 18. February 2003 13:48
Rex,
The theory says that variation or mutations are random. To use a hypothesis that mutations are non-random as evidence supporting a random mutation theory seems a somewhat flawed argument. If you accept or start with a definition that mutations exhibit a specific type of non-random pattern, you do not have a theory of evolutionary change, but one component of the theory. You then need to show how the specified non-random mutations are compatible with the other components of an evolutionary process. In the absence of the other components, all your example does is support the general assertion that there currently is no scientific theory of evolution.
As I have stated before, the design versus evolution discussion makes a whole lot more sense if you start by recognizing that no one currently knows how to formulate a theory of evolutionary change processes. Biological systems and evolutionary change processes are far too complex to be explained by the simplistic Darwinian and neo-Darwinian concepts. This ‘conclusion’ seems to be obvious and strongly supported by the evidence.
If you reject as inadequate, explanations based on between generation processing, then explanations based on within lifetime information processing begin to make a lot more sense. TA’s as defined here provide mathematical expressions or models for extremely powerful within lifetime information capacities. As has been discussed here, it is easily shown using the automated assembly approach that 1)biological systems are extremely complex and 2)the processes needed to explain changes in these complex systems must have extremely powerful information processing capacity of the type described by TA's.
If you can accept the two points above, then all that remains is 1)to demonstrate that biological systems have the capacity to produce the logical equivalent of this information processing based on known physical chemical laws, and 2) to demonstrate that it is possible and practical to develop predictive scientific theories which recognize the level of information processing present.
IP: Logged
|
|
RBH
Member
Member # 380
|
posted 18. February 2003 16:45
I can't see participating in this thread any longer unless I see two things:
1. References to the literature (engineering, design, physics, biology, mathematics, or whatever) that describes the mathematics warren claims are necessary to understand his claims about his modeling approach. Nothing he has described so far is mathematically exotic.
2. The concrete examples he offered to supply.
Absent those, participation is fruitless because the TA model (as he has described it so far) is too under-specified to even imagine implementing it in code and mapping it to phenomena of interest. One can't even tell what biological "information processing" he wants to model, to say nothing of how TA purports to model it. At least three of the participants in this thread have advanced degrees in scientific disciplines with non-trivial mathematical backgrounds, and as far as I can tell none of us can figure out how to build a TA from warren's description of it. I don't think that's our problem.
RBH
IP: Logged
|
|
Rex Kerr
Member
Member # 632
|
posted 18. February 2003 20:25
Regarding the stuff about (s1,r1) and so on--yes, yes, I got all that a long time ago. I'm not interested in the fact that one can have (si,rj) mapped to survival numbers; that is obvious. I am interested in how you determine that mapping in any case of interest. And I am interested in how you decide which si get included, and which rj, and how you keep them to a manageable number, and how specifically you might switch from (s1,r1) to (s4,r2) if not by exhaustive search of each and taking the max pair in each case--and how you can do it in a computationally tractable way if you do use exhaustive search.
Unlike conventional fitness landscapes, discrete (si,rj) "assembly instructions" can be extremely brittle (i.e. not locally correlated) and thus not amenable to any efficient maximization/optimization algorithm.
quote: If you exclude within lifetime variation-selection processes, then you simply do not have the computing capacity needed to model known evolutionary changes.
This conclusion is, IMO, mathematically obvious. The claims that "Darwinian evolutionary processes can explain ……" , are based on seriously flawed mathematical analysis.
I am afraid I have to give up here; for some reason I seem to be unable to explain why the supposedly obvious conclusion is anything but, once you realize that evolution can act on adaptive processing systems (e.g. organisms) and yet not need to incorporate the results of those adaptive processing steps. A long time ago I showed that there have been sufficient generations to account for the amount of information in the genome of an organism. I have given numerous examples in support of the genome being the primary repository of heritable information from generation to generation.
And yet you keep making the claim of mathematical obviousness.
Oh well.
quote: As has been discussed here, it is easily shown using the automated assembly approach that 1)biological systems are extremely complex and 2)the processes needed to explain changes in these complex systems must have extremely powerful information processing capacity of the type described by TA's.
If it is easy, please do it instead of telling us that it is easy. If it were difficult, then it would make sense to not actually do it. But if it is easy, why not just do it and show the results instead of using "dead reckoning" to hypothesize what the answer would look like?
IP: Logged
|
|
|