ISCID Forums



   
Author Topic: Jack Szostak shows how to apply specified complexity to biology
John Bracht
Member
Member # 5

posted 02. August 2003 21:11
In this article:

Szostak J. Molecular Messages. Nature 2003;423:689.

There, Jack outlines a novel measure of biological information that he calls "functional information" and defines as follows:

quote:

By analogy with classical information, functional information is simply -log2 of the probability that a random sequence will encode a molecule with greater than any given degree of function.

quote:

Imagine a pile of DNA, RNA or protein molecules of all possible sequences, sorted by activity with the most active at the top. A horizontal plane through the pile indicates a given level of activity; as this rises, fewer sequences remain above it. The functional information required to specify that activity is -log2 of the fraction of sequences above the plane. Expressing this fraction in terms of information provides a straightforward, quantitative measure of the difficulty of a task. More information is required to specify molecules that carry out difficult tasks, such as high-affinity binding or the rapid catalysis of chemical reactions with high energy barriers, than is needed to specify weak binders or slow catalysts.

and there's even a great image to show the concept:

[image not preserved]
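To make the quoted definition concrete, here is a minimal sketch in Python. It assumes we already have an activity score for every sequence in some sample; the scores below are made-up numbers for illustration, not data from the article, and the function name is hypothetical. It counts sequences at or above the plane, which is one reading of "above".

```python
import math

def functional_information(activities, threshold):
    """Return -log2 of the fraction of sequences with activity >= threshold."""
    above = sum(1 for a in activities if a >= threshold)
    if above == 0:
        return math.inf  # no sequence in this sample reaches the plane
    return -math.log2(above / len(activities))

# Hypothetical activity scores for 16 sequences (illustrative only).
activities = [0.0] * 8 + [0.1] * 4 + [0.5] * 2 + [0.9, 1.0]

# Raising the plane leaves fewer sequences above it, so the bits go up.
for threshold in (0.1, 0.5, 0.9, 1.0):
    print(threshold, functional_information(activities, threshold))
```

With these toy numbers, each step up the pile halves the fraction of sequences above the plane and so adds one bit.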

While Szostak doesn't use the term "specified complexity", it's clear that he uses the same concepts. Bill Dembski has made it clear through his writings that biological specifications are always cashed out in terms of functionality, though he hasn't given a lot of detail on precisely how this plays out in biology--it's up to the biologists to take the design inference and apply it to their field. I think Dr. Szostak has given us an excellent primer on how functionality constitutes a detachable pattern, i.e., a specification. The functional pattern is determined by the laws of physics that regulate protein folding (hydrogen bonding, hydrophilic/hydrophobic interactions, van der Waals forces, etc.) and by other chemical/catalytic properties of proteins, for example the ability to form or break chemical bonds under certain specialized conditions. Biologically, it is possible to define the shape of the specificational target by mutation experiments, in which the degree of functionality of a set of mutant sequences can be evaluated.

Furthermore, there are other parallels between Szostak and Dembski. Szostak rightly emphasizes the importance of the level of functionality in characterizing the amount of information inherent in biological systems. This is similar to how Dembski defined an algorithmic complexity measure for specifications in No Free Lunch. For example, he describes a corrupt election official (named Caputo) who assigns Democrats the top ballot slot 40 out of 41 times. In this specificational target, there are multiple patterns (events); you can have the lone Republican selection anywhere among the 40 Democratic selections. Furthermore, the pattern in which there are no Republican selections among the 41 choices is even simpler, algorithmically, and should also be included. Formally, Bill also includes the possibility that Republicans were chosen 40 out of 41 times, since that pattern too would trigger a design inference (i.e., a conviction of election fraud). Once all these specificational resources are considered, the probability of the target area can be calculated, and Caputo was reprimanded (though not formally convicted, as I recall).
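The size of the Caputo target as described above can be worked out directly. This is a sketch of that arithmetic only, under the assumption (mine, for the chance hypothesis) that each of the 41 drawings is an independent fair coin flip between the two parties; the variable names are illustrative.

```python
from math import comb

DRAWINGS = 41

# Outcomes where one party gets the top slot at least 40 of 41 times:
# 41 ways to place the lone exception, plus 1 way with no exception at all.
one_party = comb(DRAWINGS, 40) + comb(DRAWINGS, 41)

# Include the mirror-image outcomes that favor the other party.
target_size = 2 * one_party

# Assumed chance hypothesis: every sequence of 41 drawings is equally likely.
total_outcomes = 2 ** DRAWINGS
probability = target_size / total_outcomes

print(target_size, probability)
```

Under those assumptions the target holds 84 of the 2^41 possible outcomes, a probability of roughly 4 in 100 billion.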

The analogy with the Szostak example is in what we might call a "functional stringency" criterion instead of an algorithmic complexity criterion. A more stringent functional requirement may only be satisfied by one or two sequences (corresponding to having the plane intersect only the tip of the cone in the image above). If we relax those functional constraints (i.e., accept less-functional sequences), we effectively move the plane downward through the cone and end up with a much larger specificational target (again, the specificational target includes all sequences in the plane or above it). As Bill has shown, it is only the high-stringency, low-probability targets that warrant a design inference, and that makes sense. As the stringency is relaxed (i.e., less-efficient molecules are allowed), the specificational target eventually gets large enough to reasonably be "hit" by chance alone.
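One way to put numbers on "reasonably hit by chance" is sketched below. It assumes independent random draws, each landing in the target with probability p = 2^-I, where I is the functional information in bits (this follows from the definition quoted earlier). The number of trials is a purely hypothetical figure, and the function name is mine.

```python
import math

def prob_at_least_one_hit(p, trials):
    """Chance of at least one hit in `trials` independent draws of probability p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    # 1 - (1 - p)**trials, written with log1p/expm1 so tiny p stays accurate.
    return -math.expm1(trials * math.log1p(-p))

TRIALS = 10 ** 6  # hypothetical number of random sequences sampled

# Relaxing stringency (fewer bits) enlarges the target and raises the odds.
for bits in (40, 20, 10):
    p = 2.0 ** -bits
    print(bits, prob_at_least_one_hit(p, TRIALS))
```

With a million draws, a 10-bit target is hit almost surely, while a 40-bit target is hit with a probability of about one in a million.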

There are some further subtleties to be considered. It's clear that in some cases, multiple independent structures may have the same functionality. For example, there are at least 3 differently-structured serine proteases which all seem to function at a very high efficiency. However, they are, as far as we can tell, not evolutionarily related--they have very different sequences and structures. Remarkably, though, they each contain a chemically identical active site which has an oxyanion hole and catalytic triad. In this case, the specificational target includes 3 subregions which perform the same function. This gets at Dembski's concept of specificational resources, or the number of independent targets that must be taken into account in design inferences. In the case of serine proteases, it seems that there are 3 specificational resources at the level of 9 orders of magnitude rate enhancement of their chemical reaction. Each of these sub-regions is a target that could potentially be "hit" by a random mutation and hence must be taken into account to accurately determine the probabilities associated with the origin of many proteins. Furthermore, as the functional stringency of the specification increases, the number of specificational resources will go down. This is because there will be fewer options for protein conformations to carry out biological functions at a high level of functionality than at lower levels. If there are 50 sequences that can perform a function at a low-stringency value (in other words, at a low level of functionality), then maybe there will only be a handful that can perform the function at the peak of the cone. This model accounts for how evolution might get "trapped" on sub-optimal peaks, since it might find one of these sub-targets that doesn't extend all the way to the peak of functionality. Since adjacent sub-regions might represent remarkably different sequences and structures, evolution may not be able to move to a new sub-peak; the organism is stuck at the best it can do with its sub-optimal structure. Thus, specificational resources divide the overall cone (specification) into sub-cones that densely populate the lower section, but thin out and terminate as we move upwards to higher levels of functionality and become quite sparse by the time we reach the top.
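The "trapping" idea can be illustrated with a deliberately crude toy: a one-dimensional landscape in which each position is a sequence and each value its level of function. Real sequence space is nothing like one-dimensional, and the heights below are invented, so treat this only as a sketch of the logic. The helper names are hypothetical.

```python
def local_peaks(landscape):
    """Indices that are strictly higher than every existing neighbor."""
    peaks = []
    for i, h in enumerate(landscape):
        left = landscape[i - 1] if i > 0 else float("-inf")
        right = landscape[i + 1] if i < len(landscape) - 1 else float("-inf")
        if h > left and h > right:
            peaks.append(i)
    return peaks

def climb(landscape, start):
    """Greedy walk: step to the better neighbor until no neighbor is better."""
    pos = start
    while True:
        neighbors = [i for i in (pos - 1, pos + 1) if 0 <= i < len(landscape)]
        best = max(neighbors, key=lambda i: landscape[i])
        if landscape[best] <= landscape[pos]:
            return pos  # stuck: every single step leads downhill or sideways
        pos = best

# Invented heights: a low sub-peak (2) and a high one (5) separated by a valley.
landscape = [0, 1, 2, 1, 0, 1, 3, 5, 3]

# Sub-peaks still reaching the plane as stringency rises.
for plane in (1, 3):
    print(plane, [i for i in local_peaks(landscape) if landscape[i] >= plane])

# Where a greedy climber ends up from two different starting points.
print(climb(landscape, 0), climb(landscape, 5))
```

Starting on the left, the climber stops on the lower sub-peak and cannot cross the valley; raising the plane from 1 to 3 also removes that sub-peak from the target, leaving only the higher one.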

One interesting question that should be addressed by all this is: what is the functional stringency required for life? Obviously, biomolecules that function at low efficiency won't be able to sustain a living organism. However, we also know that many reduction-in-function mutations are still viable. So we can move down the peak a bit and still be alive, but not too far. Where is that cutoff? How much functional reduction can biological organisms withstand? This strikes me as an important research question to be addressed in the future.

A related question is whether we can find a new metric that measures how far a given sequence is from the top of the specificational peak. This would be important for addressing the question above, because we could begin constructing various mutant proteins, determining where on the peak each sequence falls, and checking whether it is life-supporting. Thus, we could begin to draw the plane delineated by life itself upon biological specifications. Knowing the functional stringency imposed by the requirements for life to exist should allow us to begin objectively calculating probabilities of various Darwinian scenarios.

Finally, it occurs to me that many Darwinian thinkers assume that the plane has moved up the cone over time. In other words, that biomolecules were originally very low-efficiency, and somehow organisms managed to reproduce with these rather poorly functional proteins; over time, they evolved the tightly specified and irreducibly complex systems wherein removal of one component causes the system to fail. In my exchange with Ursula Goodenough on Metanexus, I dealt with her argument that the flagellum used to exist as a very simple, low-efficiency cluster of proteins that wasn't particularly well adapted to its task; over time, however, the system became much more specific and efficient. This trend correlates with the plane intersecting the cone moving gradually upward from low-specificity, low-informational content to a much higher functional and informational state.

My question is: can we empirically test this? Perhaps participants on this board can suggest ways. It seems that a uniform "degradation" must be applied to all proteins to re-create the low-specificity, low-functionality early conditions that Goodenough invokes. While I'm very skeptical that any system like this ever has existed or ever could exist and allow survival, I'm willing to try to test it scientifically. It seems that this thinking underlies Allen Orr's incremental indispensability model in which low-specificity systems enfolded non-essential components that were co-modified over time to function tightly (and irreducibly) together. Can anyone think of a way to reverse this historical process (if indeed it occurred) in the lab?

Here's one suggestion, and I'm looking for others: let's take an irreducibly complex system and make a disabling mutation in one of its components. Assuming that the system is non-essential (like the flagellum), we can then grow the organism in the presence of a really high mutation rate--hoping to get one or more mutations in other proteins to compensate for the mutation we induced in the system. Of course, this sort of co-evolution experiment is common in labs, but it always creates another tightly-specified system with precise binding surfaces, etc. What I want to know is whether an IC system can be made to "degrade" and function at lower functional levels as we move down the specificational "cone" (thereby reversing the Darwinian trend up the cone). This should help to answer the question of whether the Darwinian pathway is completely imaginary or might actually have some evidentiary support.

In conclusion, I think Szostak has shown how Dembski's concept of a specification ought to be applied to biological systems, and has shown that it is possible to empirically define the shape of that target and to calculate the informational content of biomolecules. In addition, he shows how Dembski's ideas of algorithmic complexity and specificational resources might be fruitfully applied to these biological specifications. There are many interesting experiments implied by this work, and I look forward to any suggestions from Brainstorms participants.

John Bracht

[ 03. August 2003, 03:46: Message edited by: John Bracht ]

Argon
Member
Member # 276

posted 03. August 2003 10:17
John Bracht writes:
quote:
What I want to know is whether an IC system can be made to "degrade" and function at lower functional levels as we move down the specificational "cone" (thereby reversing the Darwinian trend up the cone).
I suggested the example of streptomycin resistance in the Literature Review forum (Nature Refutes ID?: The Evolutionary Origin of Complex Features). Several mutations which produce resistance to streptomycin also reduce the overall function of the translational apparatus in the cell. Consequently, strains carrying these resistance mutations are at a disadvantage (compared to the "wild-type") when the antibiotic is no longer in the environment. That could be considered a step downward in protein translation. However, secondary mutations can arise in an interacting subunit that compensates for the growth defect produced by the single strep-r mutations. This restores the efficiency of protein translation and permits strains to compete effectively with "wild-type" cells under conditions +/- streptomycin.

There are a couple of things to note when talking about biological information with respect to 'functionality'. The first is that function is context dependent. What is "sufficient" in one case may be a lethal deficiency in another. Perhaps this has better application in studies of abiogenesis or the emergence of the first enzymes, where the question is "what is the minimal number of chemical functions necessary for life?" In contrast, it probably won't be useful for determining whether systems like blood clotting and the development of the immune system can arise naturally.

The second issue is how to measure the information for things like pseudogenes. There are cases where functional genes likely arose from parts of pseudogenes. Yet if we apply a functional metric of information to pseudogenes, then they would have zero information because they often have no immediate function in the cell. Still, most would suspect that pseudogenes have more "functional information" than random sequences.

Interesting topic, biological information. I think the concept will continue to entertain everyone for decades to come.

[ 03. August 2003, 10:18: Message edited by: Argon ]

Pim van Meurs
Member
Member # 541

posted 03. August 2003 16:19
Interesting article, thanks John.

Here is the link to the article

Szostak, JW. Functional information: molecular messages. Nature. 2003 June 12; 423: 689. (PDF)

[message edited by Moderator: the link is good. The attempt to take the discussion off-topic is not.]

[ 03. August 2003, 21:46: Message edited by: Moderator ]

Matthew J. Brauer
Member
Member # 819

posted 04. August 2003 16:33
John:

Regarding your query:
quote:

What I want to know is whether an IC system can be made to "degrade" and function at lower functional levels as we move down the specificational "cone" (thereby reversing the Darwinian trend up the cone).

This paper:
In vitro evolution of beta-glucuronidase into a beta-galactosidase proceeds through non-specific intermediates. addresses the issue quite nicely.

IIRC, Matsumura selected a glucuronidase for galactosidase activity. He found that the specificity and glucuronidase activity of the enzyme decreased in the initial phase of the selection. At some point, the specificity began to increase, as did the galactosidase activity.

He visualized this in a similar fashion to Szostak, except that there are two or more overlapping cones. The selection forces the sequence down the first (glucuronidase) cone, to the point where the cones intersect. At this point the selection forces the sequence up the galactosidase cone.

On the "IC"-ness of beta-glucuronidase I have no opinion.

John Bracht
Member
Member # 5

posted 05. August 2003 06:48
Matt,

Thanks for the reference! This paper looks really interesting. It ties in nicely with the concepts outlined by Szostak.

What I'm wondering now is whether or not we can get a multi-complex system to degrade or lose specificity in the way that the single enzyme apparently did in the Matsumura paper. I think that has more relevance to the origins of IC systems. Without yet having read the paper in detail, I assume that the enzyme lost functional information but gained the ability to interact with two different substrates and catalyze two different reactions. Now, imagine something like the flagellum. Can a protein complex collectively lose functional efficiency, thereby allowing individual proteins to interact with different partners (perhaps easing the co-optation transition)? How loose can the protein-protein interactions be before function is lost completely (or perhaps more accurately, before function descends below the levels necessary to maintain survival, descending below the "plane" imposed by life's functional requirements)?

BTW, I appreciate the fact that you're not making claims about the ICness of a given system. In this discussion, I'm not making any arguments about the evolvability or non-evolvability of IC systems. Rather, I would like to see some more experimental ideas for how to cause a degradation of IC systems that might model how they originated. Any more takers?

John

Argon
Member
Member # 276

posted 05. August 2003 09:38
John, have you considered transcription factors? For example, in bacteria, the specificity and activation of RNA polymerase at a particular operon can be modulated by alternate sigma factors (as well as other transcription factors). These factors strongly alter the functional specificity of RNA polymerase.


All times are East Coast  
