ISCID Forums



 
Author Topic: What Sort of Property is Specified Complexity?
Kirk Durston
Member
Member # 174

posted 13. September 2002 15:20
I'm going to have to sign off now. I'm bug-eyed from staring at this screen and I feel a little fried from generating all these essays today. I won't be able to continue this discussion, as I already indicated. I realize I probably have not convinced even one person, but I do hope I've introduced a few things to think about. My apologies to anyone who might post anything further on this subject; I just can't afford to get any further behind in my work.

I've enjoyed this discussion and feel that being involved in it was well worth my time.

Grape Ape
Member
Member # 399

posted 13. September 2002 17:47
quote:

Originally posted by Kirk Durston:
Grape, I specifically included the phrase 'given the existence of the first paralogue' to account for what you have pointed out. You are right in that the generation of the second functional paralogue is not independent of the existence of the first. The effect of fixation merely increases the number of opportunities for the second paralogue to evolve.

I apologize if I've misunderstood you. Here's what I take issue with:

quote:
My answer would be, only if the total information required to generate both paralogues was less than 70 bits. The reason for this is that the probability of achieving a second paralogue with a second novel function, from the first paralogue with the first novel function is the probability of achieving the first paralogue, multiplied by the probability of achieving the second paralogue given the existence of the first paralogue. So the addition of required functional information is proportional to the product of each of their probabilities, where each probability refers to the chance of making the step from the parent gene.
You're saying that we should multiply the probability of each event (the probability of the first and the probability of the second) to get the overall probability, but I can see no justification for this. While the evolution of one functional paralogue does increase the probability of another one evolving, for our purposes we can treat them as independent events. This actually works in your favor, but it makes your argument unworkable unless I'm missing something. For example, the chances of a person winning the lottery this week are independent of the chances of a person winning next week. To make it analogous to biology, we would take the first week's winner and reproduce it all over the place (to fixation), so that the second week's winner has a probability of being a prior lottery winner of 1. This makes the chances of winning two lotteries over a given time frame merely the sum of winning each -- we would not multiply the chances. The fact that a subsequent winner is more likely doesn't change this -- it just lowers the overall probability. What you could show in this case is that the chances of winning a lottery are so low that winners effectively never happen, or that when you sum up the individual chances, it becomes unlikely that many winners will happen over time. But since I think you've agreed that a duplication / divergence event is not prohibitively improbable, I don't see any reason why summing up several such events would not overcome your 70-bit limit. For example, let's say the addition of 5 bits has a chance of 10^-15 of occurring within time frame X, which I assume is safely an underestimate. The addition of 70 bits in 5-bit increments, assuming instantaneous fixation of each, should then have a probability of something like 14*10^-15 over the same time frame, rather than the 10^-210 (that is, (10^-15)^14) which would be required to add all 70 at once -- that's a huge difference.
Obviously this is a grossly oversimplified example, but I hope it's sufficient to demonstrate the principle. If your argument is that every such addition of functional information is always greater than 70 bits, then that's a different story. But I think you've already indicated that it isn't by accepting that duplication / divergence would not require that many bits.
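
The sum-versus-product contrast in this toy example can be checked with a short sketch. It only reproduces the arithmetic of the example (a per-step chance of 10^-15, fourteen 5-bit steps, instantaneous fixation); the names `p_step` and `n_steps` are illustrative, and nothing here is meant as a population-genetics model.

```python
import math

# Toy numbers from the example; both are illustrative assumptions.
p_step = 1e-15       # assumed chance of one 5-bit addition within time frame X
n_steps = 70 // 5    # 14 increments of 5 bits each

# Sequential steps with fixation between them: the rough estimate
# in the example adds the per-step chances.
summed = n_steps * p_step

# All steps required at once: the per-step chances are multiplied.
# Work in log10 so the exponent stays readable.
log10_product = n_steps * math.log10(p_step)

print(f"steps: {n_steps}")
print(f"sum of chances: {summed:.1e}")
print(f"product of chances: 10^{log10_product:.0f}")
```

Under these assumptions the two figures differ by roughly 196 orders of magnitude, which is the gap the example is pointing at.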

quote:

I see the processes of generating functional paralogues as limited by the boundaries of folding sequence space for that protein fold and the number of functions that can be served by that same area of folding sequence space. In other words, there's only so much that nature can ratchet up the information and, as you pointed out, the need for the function has to be already there in order for selection to work. I get suspicious when I see two paralogues, each with different functions. Reason? Because it is also a mark of ID to be efficient and if one fold, with two or three slight modifications, can do two or three different jobs, then go for it. If I see three different functions, then I get REALLY suspicious.

I don't know why. There are plenty of individual proteins that are able to carry out more than one function, though they only carry out one efficiently. It's the hallmark of enzymes, for instance, to be very good at doing one thing but able to do lots of other things weakly. This applies both in regards to substrate specificity and catalytic ability. If you start with one such enzyme and you duplicate it, one paralogue can be selected for specializing in a "weak" activity while the other one keeps the old (and this would clearly be an example of an increase in functional information, AFAICT). In such a case, you need not do a "random walk" at all -- selection can kick in instantly. This is the more refined version of the subfunctionalization model for gene duplication, which is based on the observed fact that gene duplicates are preserved more often than previously thought possible. While various folds themselves may be separated into "islands" in sequence space, the evidence strongly suggests that various functions themselves are not -- they tend to overlap.
quote:

If I ever observed a paralogue achieve a novel, beneficial function, I would think it more likely that an old function that was lost, has just been fortuitously regained. It would answer the question as to why the unfulfilled function was there in the first place.

There are lots of examples of paralogues gaining novel function from biotechnology. And no, these cannot be dismissed as examples of ID, because the researchers only use mutation, selection, and recombination -- all found in nature -- to produce them. I would gladly give you some refs, but I don't see the point since you're not going to be able to stick around [Frown] But you can do a PubMed search and see what I mean. This sort of thing also happens in nature all the time, including many organisms that have adapted to degrade man-made chemicals (and in these cases, you can't say that these activities were once lost and now found, though I'm not sure it would matter if they were).

quote:

By the way, the generation of tandem repeats (i.e., genetic digital noise) indicates that the digital information of genetics is subject to the same problems as all other digital information storage and transfer devices, including being subject to the ITT.

I didn't imply that tandem repeats were anything but noise -- I just pointed them out as an example of the mechanism that produces gene duplicates, and why they occur at a greater rate once you already have some in place. However, they are not random noise, they are patterned noise, which is one reason why attempts at figuring out the likelihood of random sequences to generate functional proteins are at least partially flawed. The sequences are unlikely to be random in the first place.
Art
Member
Member # 179

posted 14. September 2002 11:26
Hi Kirk,

No need for you to respond to this, but I thought I'd add something to help (maybe [Smile] ) you find some fruitful directions for analysis.

You said:

quote:
3. Even if it turns out in the end that all proteins have an Nf/N somewhere around 1 in 10^11, all that would indicate is that they are within the reach of natural processes, according to my hypothesis, which puts a lower limit for Nf/N of 8 x 10^-22. So ID would not be required to generate functional proteins. However, for the barebones life form, we need a minimum of 150 specific genes. Not just any genes, but 150 highly specified genes. Given that the specificity of the average gene is assumed to be about 10^-11, the genomic specificity for 150 specified genes would be about 10^-1650.
I think it's important to keep one simple thing in mind (simple, yet often overlooked). Say that we are speaking of a collection of 150 functional specifications, each of which can be satisfied by families of polypeptides that each have an Nf/N of about, say, 10^-13. It turns out that, in a single collection of 10^16 polypeptides of random sequence, each and every one of these different functional specifications will be represented at least once. This means that, in this hypothetical collection, the probability of "obtaining" this collection of 150 enzymes will be 1 - not the impossibly small number suggested in your statement.

This is yet another way of stating my quibble with the use of informational "bits" to make probabilistic inferences.
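
A minimal sketch of this point, using the hypothetical figures above (150 specifications, an Nf/N of 10^-13 each, a pool of 10^16 random polypeptides) and assuming every polypeptide is an independent random draw and the specifications are independent of one another. The variable names are illustrative.

```python
import math

n_specs = 150    # functional specifications
f = 1e-13        # assumed Nf/N for each specification
pool = 1e16      # random polypeptides in the collection

# Expected number of pool members satisfying any one specification.
expected_hits = f * pool

# Chance that a given specification is missed by the whole pool:
# (1 - f)^pool, computed in log space for numerical stability.
log_p_miss = pool * math.log1p(-f)
p_hit = -math.expm1(log_p_miss)    # 1 - exp(log_p_miss)

# Chance that all specifications are represented at least once.
p_all = p_hit ** n_specs

# For contrast: log10 of the product of the 150 per-sequence frequencies.
log10_naive = n_specs * math.log10(f)

print(f"expected hits per specification: {expected_hits:.0f}")
print(f"P(all {n_specs} represented): {p_all}")
print(f"product of frequencies: 10^{log10_naive:.0f}")
```

With about a thousand expected hits per specification, the chance of missing any one of them is on the order of e^-1000, so the probability of finding the full set in the pool is indistinguishable from 1 even though the product of the individual frequencies is astronomically small.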

Frances
Member
Member # 169

posted 14. September 2002 17:04
Art,

Very correct. In fact, to provide a useful example which shows the difference between probability and information, let's look at a sequence of 150 coin tosses. Before the tosses there are 2^150 possible sequences, so the entropy is 150 bits. Let's consider the sequence of all heads. The probability of such a sequence is 1 in 2^150 by CHANCE alone, but what if there is a regular process which guarantees the outcome to be a head? Now the probability is 1 for this outcome and the information would be 150 bits.

It's easy to confound probabilistic calculations with information, but they can be quite different. In our discussions Kirk seems to use informational measures (bits representing differences in entropy) in a similar fashion to probabilistic measures (bits to represent randomness). Only when the information generating algorithm is considered to be random chance are the two interchangeable.
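
The coin-toss example can be put in numbers. This is a sketch under the assumption that information is measured as the drop in entropy from 150 bits (all sequences possible) to 0 bits (one sequence observed), while the probability of the outcome depends on the process that generated it. The variable names are illustrative.

```python
import math

n = 150

# Entropy before the tosses: log2 of the number of possible sequences.
entropy_before = math.log2(2 ** n)
entropy_after = 0.0    # one specific sequence has been observed

# Information as the reduction in entropy; the same under either process.
information_bits = entropy_before - entropy_after

# Probability of the all-heads sequence under two generating processes.
p_chance = 0.5 ** n    # fair, independent tosses
p_rule = 1.0           # a regular process that always yields heads

# Surprisal, log2(1/p), matches the information figure only under chance.
surprisal_chance = math.log2(1 / p_chance)
surprisal_rule = math.log2(1 / p_rule)

print(f"information: {information_bits} bits")
print(f"surprisal under chance: {surprisal_chance} bits")
print(f"surprisal under the rule: {surprisal_rule} bits")
```

The information figure and the surprisal agree only in the chance case, which is the sense in which bit counts and probabilities are interchangeable only when the generating process is random.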

Once I return in a few weeks I will address Kirk's comments on Schneider's paper. There seems to be some confusion on Kirk's part on what Schneider did and did not do. I realize that probability and information can be quite tricky concepts. I find myself still struggling with them regularly.

[ 14 September 2002, 17:07: Message edited by: Frances ]

warren_bergerson
Member
Member # 262

posted 15. September 2002 11:00
Kirk,

First, it has been both enjoyable and interesting discussing this topic with someone both knowledgeable and courteous. Specified complexity is a fascinating subject, and I doubt if any serious student of the subject would suggest any one individual or group has all the answers. Courteous, professional exchanges of ideas are essential if we ever hope to advance our knowledge of this or any other serious scientific subject.

In light of your comments, I have reformulated my views on the subject of ITT. The issue here, IMO, is not the validity of some mathematical theorem, but how we define and measure the ‘volume of information’ (VOI) or ‘the volume of information at a point in time’.

Three essential points need to be recognized in developing such a definition. First, the definition must recognize that biological systems are associated with both increases and decreases in information. It is useless and essentially silly to develop a measure of VOI which suggests that non-living matter has the same VOI as living matter, or that a bacterium must have the same VOI as a human being. Second, the definition of VOI needs to be independent of the processes which produce changes in VOI. I get the impression that people are attempting to define VOI so that ‘by definition’ increases in VOI are not logically possible from mechanical processes. Such definitions might represent interesting abstract mathematical concepts, but they are useless and misleading in scientific analysis. Third, it needs to be recognized that VOI and specified complexity are ‘scientific measurements’ being developed to aid analysis of biological systems. They are not absolute features of nature which can only be defined in one way.

IMO, as I stated earlier, the primary issue here ‘should be’ developing practical, pragmatic definitions and techniques that make it both practical and useful to measure VOI and specified complexity.

With respect to the ‘feasibility’ of demonstrating ‘within lifetime increases in VOI’, I believe the issue is ‘definitions and measurement techniques’, not the technical difficulty of the task. Even rather simple techniques, for example, can demonstrate that the information content of a nervous system is at least billions of times greater than the information content of the genome. I am confident the complexity or information content of a tree is similarly ‘billions’ of times greater than the complexity of the tree's genome. [Given the complexity of trees and genomes, I realize billions of times more complex is a relatively minor difference.] Again, the issue here, I suspect, is one of definition rather than one of measurement difficulty.

Quote: the better we can see its relationship and effect on the morphology of the finished organism and the harder it is to come up with yet-wilder scenarios as to how it got its information. One thing we've learned is that protein coding genes alone, will not build the organism. But we've just begun to decipher the horrifically complex interacting regulatory system, with emphasis on 'just begun'. This system seems to be encoded into the genome along with the proteins. The discoveries we are making in this reverse-engineering project point to the possibility that ALL the information necessary to build a fully functional, mature organism is programmed into the genome and organelles, with possibly a small amount of regulatory function provided by other biological systems.

The two questions above, ‘what information does it use’ and ‘where does the information come from’ are the very issues I have been studying. (But with respect to human decision making rather than the genome.) As with the study of the genome, the standard assumption in the analysis of decision making is that ‘the basic logic used in decision making is hard coded into humans and essentially transferred from generation to generation’. This ‘hard coding assumption’ is based on the observed fact that essentially all humans appear to use the same decision logic, (just like essentially all humans develop the same type of hearts or brains).

It will be noted that the ‘hard coded/inherited’ assumption is not the only logical possibility. The second logical possibility is that the ‘information needed to develop decision making, hearts, and brains’, rather than being hard coded, is recalculated by each new generation. The ‘within lifetime’ information generation assumption or perspective has proved very useful in analyzing, modeling and simulating human decision making. Unlike the hard coded assumption, the within lifetime approach suggests relatively simple, easy to measure, but highly dynamic decision making logic. It has not only proved practical to identify and model the simple decision making logic associated with human behavior, but it has also proved possible to identify the very powerful and efficient processes which produce very rapid changes in decision making logic.

An interesting side feature of systems using ‘within lifetime recalculation processes’ is that 1) they can generate novel or creative solutions, and 2) they are capable of very complex and very rapid ‘evolutionary’ change. My hypothesis is that the same types of ‘within lifetime information generation’ which appear to explain human decision making and human problem solving can also explain both the operation of the genome and evolutionary change in the genome.

The task of reverse engineering the genomic processes is, by my analysis, not a terribly complex project. The reverse engineering, IMO, is a problem that is largely solvable with existing knowledge and a new approach to how the information is interpreted and analyzed.

Quote: a) Only intelligent agents can create functional information
b) If biological design processes produce functional information, then the information must be front loaded into the system unless there is going to be ongoing input by the intelligent agent.

As I stated earlier, this is not a fact. It is a mathematical concept based on a flawed definition of functional information, or on a misapplication of a mathematical concept to biological systems.

Quote: 10. I don't see the need for powerful within-lifetime processes for generating the information required to build a fully mature organism, I think it is all there in the genome.

Again there are two basic issues to be addressed, ‘how does the information generate an adaptive organism’ and ‘where does the information come from’. If you look realistically at the information needed to operate a complex organism, it is likely that all the DNA in an organism, coded with perfect efficiency, couldn’t perform the task. Second, as has been shown over and over again, there are no viable between generation processes which can explain the generation of the information which is known to be transferred between generations. The ‘need’ for within lifetime processes seems to me to be fairly obvious.

It appears that our disagreement comes down primarily to the issues of 1) how VOI is defined, 2) what logical/mathematical processes create increases in VOI, and 3) how the first two items are usefully applied to the analysis and modeling of biological systems. These are, it seems to me, issues that are all resolvable.



All times are East Coast

All content © ISCID and content contributor 2001-2003

The ISCID Forums are aimed at generating insight into the nature of complex systems (e.g. biological complexity, organizational complexity, etc.) and the ontological status of purpose, especially from the vantage point of various information- and design-theoretic models.
