ISCID Forums


Author Topic: Denton on Protein Folds
Mike Gene
Member
Member # 149

posted 09. March 2003 15:02
Michael Denton and others recently published a paper which can be viewed as a synthesis of Denton's typological views (from his first book) and his anthropic views (from his second book). [1] The abstract of the paper reads as follows:

quote:
Before the Darwinian revolution many biologists considered organic forms to be determined by natural law like atoms or crystals and therefore necessary, intrinsic and immutable features of the world order, which will occur throughout the cosmos wherever there is life. The search for the natural determinants of organic form, the celebrated ‘‘Laws of Form’’, was seen as one of the major tasks of biology. After Darwin, this Platonic conception of form was abandoned and natural selection, not natural law, was increasingly seen to be the main, if not the exclusive, determinant of organic form. However, in the case of one class of very important organic forms, the basic protein folds, advances in protein chemistry since the early 1970s have revealed that they represent a finite set of natural forms, determined by a number of generative constructional rules, like those which govern the formation of atoms or crystals, in which functional adaptations are clearly secondary modifications of primary ‘‘givens of physics.’’ The folds are evidently determined by natural law, not natural selection, and are ‘‘lawful forms’’ in the Platonic and pre-Darwinian sense of the word, which are bound to occur everywhere in the universe where the same 20 amino acids are used for their construction. We argue that this is a major discovery which has many important implications regarding the origin of proteins, the origin of life and the fundamental nature of organic form. We speculate that it is unlikely that the folds will prove to be the only case in nature where a set of complex organic forms is determined by natural law, and suggest that natural law may have played a far greater role in the origin and evolution of life than is currently assumed.
Denton et al. begin with an interesting historical survey of pre-Darwinian thinking, where form took priority over function:

quote:
The widespread belief that organic forms are lawful ‘‘givens of nature’’ explains why it was that throughout the pre-Darwinian period from the naturphilosophie of the late 18th century, right up to the period just before the publication of the Origin, although it was universally accepted that organisms exhibited functional adaptations, for Goethe, Carus, Geoffroy and Owen, it was always form which was of primary concern. Form came first and function was viewed as a secondary and derived adaptive feature (Russell, 1916; Richards, 1992).
Denton et al. explain the decline of this thinking as follows:

quote:
The Platonic biology of the pre-Darwinian era with its emphasis on evolution by natural law and its conception of a rational order underlying the diversity of life, represented a grand scientific vision, whose heroic goal was nothing less than the unification of biology and physics. It collapsed primarily because it failed to identify the elusive laws of form which might have provided a rational account of organic form and explained how the evolution of the basic invariant forms or types, from cell forms to the body plans of the major phyla, and deep homologies such as the pentadactyl limb, might have come about as a result of natural law. That they had no convincing explanation was explicitly conceded by Owen (1849) in the final paragraph of ‘‘On the Nature of Limbs’’: ‘‘To what natural laws or secondary causes the succession and progression of such organic phenomena may have been committed we as yet are ignorant.’’

The authors also note:

quote:
Of course no serious biologist doubts that some biological forms may be given by natural law and arise spontaneously out of the intrinsic self-organizing properties of their constituents and may not need any genetic program for their specification. The spherical form of the cell and the flat form of the cell membrane are two well known examples. Other more complex examples cited by Waddington (1962) are the various cytoplasmic structures made up of multiple layers of membranes such as the grana and intergrana regions of chloroplasts, the hexagonal arrangement of the rhabdomeres in the eyes of insects and the many forms described by Thompson (1942) in Growth and Form, including radiolarian skeletons, the shapes of mollusk shells, the curved shape of animal horns. But on the whole, natural law is considered to play a very trivial role in the generation of biological form and particularly in the generation of complex seemingly asymmetric biological forms such as protein folds, cell forms, body plans, etc.
Denton et al. then argue that protein folds represent a genuine example where the pre-Darwinian thinking has been validated. Let me quote their argument at length:

quote:
The protein folds are the basic building blocks of proteins and therefore of the cell and indeed of all life on earth. Each is a polymer between 80 and 200 amino acids long consisting of from about 1000 to 3000 atoms folded up into a complex intricate three-dimensional shape. Most folds exhibit a hierarchical structure composed of basic secondary structural elements such as α helices and β sheet conformations which are often arranged into more complex motifs which are in turn combined together to make up the native conformation of the fold.

It is important at this stage to note that the great majority of functional proteins in the cell consist of two or more basic folds linked together into multidomain or multifold complexes. In this paper we are considering only the fundamental nature and evolutionary origin of the folds and not of the higher order adaptive structures into which they are combined. These higher order complexes resemble ‘‘Lego-like’’, contingent assemblages put together by natural selection for various biological functions during the course of evolution by gene duplication and fusion (Brandon & Tooze, 1999).

Despite these early successes the lack of any apparent regularity in protein structures, and the great dissimilarity among those that had been determined, provided no basis for a rational classification (Ptitsyn & Finkelstein, 1980; Richardson, 1981). The picture was still in those early days compatible with the Lego model, that the folds in living organisms on earth might be individual members of a near infinite set of contingent material assemblages put together by natural selection over millions of years of evolution. It was only during the 1970s, as the number of 3D structures began to grow significantly, that it first became apparent that there might not be an unlimited number of protein folds, that the folds might not belong to a potentially infinite set of artifactual Lego-like constructs. On the contrary, it became increasingly obvious as more structures were determined that the protein folds could be classified into a finite number of distinct structural families containing a number of related but variant forms, i.e. that the classification system of fold structures was typological (Ptitsyn & Finkelstein, 1980; Richardson, 1981; Orengo et al., 1997). This was an important finding as the very fact that protein folds can be grouped in such a way was itself significant, for it provided the first line of evidence that the folds might be natural forms determined by physical law.

It also became apparent that the 3D structures of individual folds were essentially invariant, some such as the Globin fold and the Rossman fold for example, having remained essentially unchanged for thousands of millions of years. Both their invariance and the typological classification schemes into which they could be grouped argued for their being a finite set of ‘‘real timeless structures’’ determined by physics rather than being mutable ‘‘Lego-like’’ aggregates of amino acids determined by selection.

Consideration of the various physical constraints which restrict the folded spatial arrangements of linear polymers of amino acids, the laws of fold form, suggests that the total number of permissible folds is bound to be restricted to a very small number. One recent estimate based on possible arrangements of typical structural elements gave a maximum of 4000 folds (Lingard & Bohr, 1996). Based on similar considerations, the authors of another recent paper suggested that the maximum is likely to be no more than a few thousand (Chothia et al., 1997). A different type of estimate based on the rate of discovery of new folds, rather than permissible spatial arrangements, suggests that the total number of folds utilized by organisms on earth might not be more than 1000 (Chothia, 1993). In many recent reports the total number of different folds is often cited to be somewhat less than 1000 (Holm & Sander, 1996; Orengo et al., 1997; Zhang & DeLisi, 1999; Holm & Sander, 1999).

Whatever the actual figure, the fact that the total number of folds represents a tiny stable fraction of all possible polypeptide conformations, determined by the laws of physics, reinforces further the notion that the folds like atoms, represent a finite set of allowable physical structures which would recur throughout the cosmos wherever there is carbon-based life utilizing the same 20 amino acids.

This would seem to be a very important point. If there are only a few thousand possible protein folds, it strongly suggests that protein folds are not high information structures. This has very significant implications for our reconstruction of evolutionary history. Denton et al. don't fully draw this out (in my opinion), but it is hinted at here:

quote:
Further evidences consistent with the Platonic conception that the protein folds represent a set of lawful immutable natural forms, ‘‘primary givens of physics,’’ are those many cases where protein functions are clearly secondary adaptations of a primary, immutable form (Gerlt & Babbitt, 2001). This is spectacularly true in the case of some of the more common folds also known as superfolds (Orengo et al., 1994; Gerlt & Babbitt, 2001). In the case of one superfold the so-called triosephosphate isomerase (TIM) barrel, an eight-stranded alpha/beta bundle (see Fig. 1), essentially the same fold, has been secondarily modified for many completely unrelated enzymic functions occurring in such diverse enzymes as triosephosphate isomerase, enolase and glycolate oxidase (Orengo et al., 1994). Another example, where a basic fold has been secondarily modified for various biochemical functions, in this case closely related functions, is the various elegant functional adaptations to oxygen uptake and carriage exhibited by the globin fold in myoglobin and the various vertebrate hemoglobins. The fact that in many cases where the same fold is adapted to different functions, no trace of homology can be detected in the amino acid sequences, suggesting multiple separate discoveries of the same basic structure during the course of evolution (Orengo et al., 1994; Brandon & Tooze, 1999), further reinforces the conclusion that the folds are a finite set of ahistoric physical forms.
Let me expand on this slightly. If there are only a few thousand protein folds, then our degree of confidence about homology is greatly weakened if the main pillar of this inference is structural similarity. That is, if we were dealing with a nearly infinite number of potential protein folds, then the fact that two proteins share a fold would be strongly suggestive of common descent. But if the number of structures is quite limited, then an origin through convergence, or common design, is equally plausible. This is quite significant in our post-genomic age, given that biologists seem to be relying more and more on structural similarity to infer 'homology.'

For example, let's assume there are only about 1000 different protein folds. Let's now imagine that an intelligent designer sought to seed this planet with microbial life forms containing proteins whose average number of domains was three. This would mean that our designer could only design about 300-350 proteins before he/she would have to reuse a fold.
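The arithmetic behind that figure can be checked with a minimal sketch. It assumes every domain uses a different fold and that each protein has exactly three domains, which simplifies the "average of three" above. The function names are my own, and the second function adds a further assumption not made above: that folds are drawn uniformly at random.

```python
def max_proteins_without_fold_reuse(total_folds: int, domains_per_protein: int) -> int:
    """Largest number of proteins buildable if every domain uses a distinct fold."""
    if total_folds < 0 or domains_per_protein <= 0:
        raise ValueError("total_folds must be >= 0 and domains_per_protein > 0")
    # Each protein consumes `domains_per_protein` unused folds.
    return total_folds // domains_per_protein


def chance_two_domains_share_fold(total_folds: int) -> float:
    """Probability that two domains drawn independently and uniformly share a fold."""
    if total_folds <= 0:
        raise ValueError("total_folds must be > 0")
    return 1.0 / total_folds


# 1000 is the figure assumed above; 4000 is the upper estimate quoted from the paper.
for n in (1000, 4000):
    print(n, max_proteins_without_fold_reuse(n, 3), chance_two_domains_share_fold(n))
```

With 1000 folds and three domains per protein the limit is 333 proteins, which sits inside the 300-350 range given above.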

1. Michael J. Denton, Craig J. Marshall and Michael Legge. The Protein Folds as Platonic Forms: New Support for the Pre-Darwinian Conception of Evolution by Natural Law. J. Theor. Biol. (2002) 219, 325–342

Frances
Member
Member # 169

posted 09. March 2003 19:58
I hope that the moderator will allow me to post even though I have been banned from being the first to respond to a posting. I will not address the comments made by Mike Gene, although much of what he has raised has been part of a topic I will be discussing soon on ISCID. Instead I would like to provide some links relevant to Denton's paper:

Laws of form revisited, Denton and Marshall NATURE VOL 410

Berkeley Scientists Create First 3-D Map of Protein Universe

Protein fold and family occurrence in genomes: power-law behaviour and evolutionary model by Jiang Qian, Nicholas M Luscombe & Mark Gerstein

A structural census of the current population of protein sequences

An Alternative View of Protein Fold Space Ilya N. Shindyalov and Philip E. Bourne

Estimating the Total Number of Protein Folds Sridhar Govindarajan, Ruben Recabarren, and Richard A. Goldstein

Protein structures sustain evolutionary drift Folding & Design, 1997, 2, 519-524. Burkhard Rost

Protein Engineering, 1999, 12, 85-94 Twilight zone of protein sequence alignments Burkhard Rost

ON THE NUMBER OF STRUCTURAL FAMILIES IN THE PROTEIN UNIVERSE. ALEXANDROV N.

Genome Informatics 12: 135–140 (2001)Conservation of Protein Interaction Network in Evolution Jong Park Dan Bolser

Effect of Alphabet Size and Foldability Requirements on Protein Structure Designability
Nicolas E.G. Buchler and Richard A. Goldstein


Myriads of protein families, and still counting
Victor Kunin, Ildefonso Cases, Anton J Enright, Victor de Lorenzo, and Christos A Ouzounis


A map of the protein space An automatic hierarchical classification of all protein sequences

A Unified Sequence-Structure Classification of Protein Sequences: Combining Sequence and Structure in a Map of the Protein Space


[ 09. March 2003, 20:48: Message edited by: Frances ]

Rex Kerr
Member
Member # 632

posted 09. March 2003 21:35
That is very interesting, but I have a couple of questions.

First, I agree that if there are only a few thousand basic folds, arguments by structural homology are weakened. However, aren't arguments about the probability of random function simultaneously strengthened?

Second, although I will read the paper to try to answer the question for myself, do you know if they distinguished the hypotheses that (1) the stable folds we find are a historical accident of all possible stable folds and have been maintained through descent (much like the particular choice of amino acids we have), (2) polypeptide sequences fold into one of these thousand or so forms simply via the laws of physics (chemistry), or (3) we can classify any sensible (e.g. non-intersecting) 3D structure of repeating subunits with a thousand exemplars?

The conclusions are very different in each case. In the first case, the homology argument for common descent remains, while the second case suggests that 3D homology is not good evidence, and the third case suggests that we have observed something about our powers of classification, not about the real world.

Mike Gene
Member
Member # 149

posted 10. March 2003 00:43
Rex,

As for your first point, yes, arguments about the probability of randomly generated function are strengthened. But then, it gets a little tricky to call this "random."

As for your second point, allow me to quote Denton et al. more extensively. These are various excerpts that should run in order:

quote:
During folding the amino acid sequence of a protein appears to be searching conformation space for increasingly stable intermediates which lead it step wise toward the deepest energy minimum for that sequence, which corresponds to its final native conformation (Ptitsyn & Finkelstein, 1980; Finkelstein & Ptitsyn, 1987). The process is driven thermodynamically via a succession of free energy decreases (Dinner et al., 2000). The process of folding is often pictured as being analogous to a ball finding its way down the sides of a complex rather irregularly shaped bowl to the bottom of the bowl, its final preordained and natural resting place, where the bottom of the bowl represents the natural free energy minimum of the fold. Extending this analogy we can think of there being a preexisting energy landscape containing 1000 or so uniquely shaped bowls or free energy minima. This picture lends itself to Platonic interpretation. Even the terms used in the literature reflect the Platonic concept of matter ‘‘finding’’ or ‘‘filling’’ a pre-existing mold. Thus the folding process is often described as a mechanism by which ‘‘sequence selects structure.’’ As a recent author commented: ‘‘Thus the notion that sequence determines structure might be more precisely formulated with the concept that sequence chooses between the limited number of secondary structures available to the polypeptide backbone’’ (Honig, 1999). In other words, it is not the sequence which specifies the mold but the mold which specifies which sequences can be accommodated. For the mold is prior to the sequence, although of course during folding each particular sequence is prior in time to the form which it finally makes manifest.
The ubiquitous text book claim that ‘‘the amino acid sequence determines the 3D form of the protein’’ is a mechanistic interpretation of the folding process which might be more accurately stated Platonically as ‘‘the prior laws of form determine which amino acid sequences can fold into a stable 3D form.’’ If the sequence contains any information, it is not information to create or generate a unique artifact-like assemblage analogous to a Lego construct or a watch, but more of a guide through a preexisting Platonic landscape to an already prefigured end.

The free energy difference between the native conformation of the fold and its denatured state is small. A consequence of this is that folds are only marginally stable. This allows for a crucial degree of structural flexibility, which is necessary for many catalytic and other functional activities, which often necessitate the fold adopting slightly different conformations. But it also means that because of the continual buffeting and bombardment arising from molecular collisions in the turbulent interior of the cell, the configuration of each fold is continually subject to conformational disturbances which may involve anything from the movement of a few atoms to the unfolding of sections of the amino acid chain (Brandon & Tooze, 1999). However a fold is able to maintain and regain its native conformation in the face of these microchallenges because its native conformation, being a natural free energy minimum, acts as a natural attractor ‘‘continually drawing’’ all the parts of the fold back into its proper native conformation (the natural free energy minimum of the fold). And just as a ball in a bowl always ends up at the bottom of the bowl, a fold is also able to get back ‘‘home’’ or to recover its proper conformation along an infinity of different paths. In short, the folds are robust natural existents, whose proper forms are under the governance and supervision of natural law. In the case of artifactual contingent assemblages of matter such as a watch or ‘‘Lego construct,’’ there is no natural agency or natural guarantor of ‘‘proper form’’, no eternally present Platonic mold or free energy minimum acting as an attractor continually drawing or guiding the assembly of its components to a preordained end. Consequently artifacts are not robust and are incapable of recovering their form after rearrangements of their components. Natural forms are robust, contingent artificial forms are fragile.
In the case of natural forms, the agency of natural law acts ‘‘freely’’ as the guarantor of form. In the case of artifacts there is no such guarantor. These considerations highlight an interesting difference between natural and artificial forms. In the case of artifactual forms such as Lego assemblages or watches or other types of machines which are put together from the bottom up mechanically, we have an infinity of forms, but each is led up to or assembled by only a few or even only one unique constructional path. In the case of natural forms on the other hand, and the folds are classic examples, there is a finite number of forms but an infinite number of paths via which their actualization may be achieved.

The 3D conformations of the folds exhibit another sort of robustness, they are remarkably resistant to evolutionary changes in their amino acid sequences (Brandon & Tooze, 1999). As referred to above, although there are only a limited number of folds permitted by physics, many very different apparently unrelated amino acid sequences can fold into the same form (Gerlt & Babbitt, 2001). Because protein functions depend on the maintenance of a stable scaffold, the tolerance of the folds to sequential changes may well confer another crucial element of robustness making them ‘‘immune to most mutational insults’’ and hence evolutionarily reliable constructional units of the cell. The self-organization of the same fold from very different amino acid sequences again underscores the natural autonomy and Platonic primacy of the fold over its material constituents and highlights the fact that the folds are natural existents and not artifactual aggregates of matter like machines, which do not possess any natural autonomy over their components, and are far less tolerant of variations in their basic constituents.

We speculate that the fact that the robustness of the folds [which enables them to maintain their forms and dependent functions in the face of both mutational challenges and conformational disturbances due to the turbulence of the cell’s interior] is ‘‘natural’’ may have deep evolutionary implications. The robustness of biological systems is generally conceived of as being analogous to that of advanced machines utilizing such devices as feedback control, parallel circuitry, error fail-safe devices, redundancy and so forth (Keller, 2000; Kitano, 2002; Csete & Doyle, 2002). But such robustness which we suggest might be termed ‘‘artifactual robustness’’ is inherently complex and can only be arrived at after millions of years of evolution and is necessarily a secondary and derived feature of any biological system or structure. The robustness of the folds is a natural intrinsic feature of the folds themselves and not a secondarily evolved feature. Robustness of this sort is ‘‘for free’’ and does not require the intervention of natural selection. Such robustness has the enormous evolutionary advantage in that it provides evolution with ‘‘ready-made’’ stable structures upon which to build more complex structures and functions. We speculate that an element of natural robustness may be a necessary feature of all biological forms utilized by evolution from the molecular to the organismic level.

If the folds were contingent assemblages of matter like Lego constructs, watches or other sorts of artifacts, where the parts are the primary things and pre-exist the whole, then the various parts of each fold, the constituent α helices and β sheet conformations and higher order submotifs which make up the whole should be stable structures (like Lego bricks) which should exist prior to and independent of the wholes in which they occur. And if this were the case then the structure of the whole fold should be easily predictable (as in the case of an artifactual assemblage like Lego) from the character and properties of its parts in isolation. But this is evidently not the case. On the contrary, many of the unique secondary structural motifs which make up a mature fold are in most cases, either highly metastable outside the fold or non-existent. We can state this formally by saying that most submotifs which make up a fold are existentially dependent on being part of the native conformation of the whole fold, outside of which they have no independent existence. Evidence that the specific conformation adopted by particular segments of the amino acid chain is determined by the whole was gained some time ago in the classic complementation experiments of Anfinsen, which showed that the isolated S-peptide of ribonuclease, which comprises residues 1–20 of the enzyme, is a random coil in isolation, whereas most of it forms an α helix in combination with the rest of the molecule. Similarly, the fragments of myoglobin released from the molecule after cyanogen bromide treatment, the so-called peptides 1, 2 and 3, have very little residual secondary structure after their removal from the myoglobin molecule. Peptide 1 aggregates, peptide 2 is a random coil and the large central peptide, peptide 3, has only residual α helical properties (Anfinsen, 1973).
Evidently the structures adopted by the different sections of the amino acid sequence are ‘‘context dependent.’’ The case of the prion proteins illustrates again that the same sequence may fold into two alternative structures depending on context (Prusiner, 1995; Manson, 1999). Studies on the Arc repressor molecule (Cordes et al., 2000) have revealed that one section of the protein can switch from a helical to a sheet conformation depending on minor environmental changes, including temperature and solvent conditions. The fact that the same or very similar sequences may adopt different secondary structures dependent on context is at the heart of the whole problem of predicting the 3D structure of a protein from its amino acid sequence. If the same sequences always adopted the same structures whatever the context, if in other words the substructures of proteins were like the building blocks of Lego, or the cogs of a watch, then the problem of prediction of ‘‘global form’’ from the form of the building blocks would be easy, but of course this is not the case and the problem is far from solved.
The linguistic analogy again springs to mind. The meaning of a word in a sentence may vary depending on the sentence in which it occurs, its context. In a spoken sentence of English for example, the sound represented by the letters ‘‘rite’’ or ‘‘right’’ may refer from anything from a medieval ritual to a movement or a moral judgement. Outside the context in which it occurs it is impossible to determine its meaning. Consequently it is impossible to determine the meaning of a ‘‘whole’’ sentence from the study of its constituent words in isolation. Proteins, like sentences, are intensely holistic entities. All the current evidence suggests that the various parts of the fold, the various constituent α helices and β sheet conformations and higher order submotifs, exert what appears to be a mutual and reciprocal formative influence on each other and on the whole, which itself in its turn exerts a reciprocal formative influence on all its constituent parts. In this characteristic proteins are unlike any other material objects with which we are familiar.

If the folds are indeed lawful natural forms arising out of the intrinsic physical properties of amino acid sequences, it is hard to see how selection for function can have played a significant role in their origin or evolution. The problem is somewhat like trying to provide a selectionist/functional explanation for the spherical shape of a cell, or the flat shape of the cell membrane! Selection may select a whole fold and then modify it for function and this is clearly

In the case of the folds, origin per saltum or in Finkelstein’s (1994) words ‘‘by choice from random [amino acid] sequences’’ is only a feasible mechanism if the protein folds are easy to find by chance and therefore common in amino sequence space. There is no doubt that individual α helices and β sheets are very common in sequence space and as the folds arise naturally from combinations of these subunits one might presume that the folds themselves would be relatively common. Whether they are or not has been a subject of considerable discussion (Finkelstein, 1994; Cordes et al., 1996; Sauer, 1996; Axe, 2000). Thermodynamic considerations of the random characteristics of fold sequences support the contention that stable folds are common in sequence space (Finkelstein & Ptitsyn, 1987; Finkelstein, 1994; Finkelstein et al., 1995). In Finkelstein’s (1994) words ‘‘little editing of a random sequence is necessary for the formation of the protein globule itself.’’ In libraries of random amino acid sequences, α helical proteins displaying cooperative thermal denaturation and specific oligomeric states have been recovered at frequencies of 1% (Cordes et al., 1996). Evidence that different stable structures may be close in sequence space is supported by the case of the prion proteins and other cases where different structures may be adopted by the same sequence, such as the Arc repressor mutant, referred to above. Discussing the implications of the conformational switch in the Arc repressor the authors (Cordes et al., 2000) comment: ‘‘The intermediate can adopt either fold ... Thus distinct protein folds need not be isolated islands in sequence space, but can be linked by evolutionary bridges where multiple

Another line of evidence which suggests that the folds must be relatively common in sequence space is the existence of overlapping genes. These appear to occur in the genomes of almost all organisms. New functional proteins could never have been discovered or evolved, embedded in existing gene sequences, if the folds were not relatively common in sequence space.

The current consensus view (Finkelstein, 1994; Plaxco et al., 1998; Brandon & Tooze, 1999) is that stable folds are in fact quite common in amino acid sequence space. Brandon & Tooze (1999) have even speculated that as many as one in a hundred random amino acid sequences may fold into a stable form. If indeed folds are that common in sequence space, occurring, say, at a frequency of one in a hundred random sequences, then, this might mean that wherever random polypeptides are synthesized anywhere in the cosmos, in the laboratory, in a pre-biotic soup or in a primeval cell, all the basic folds utilized by life on earth would be bound to be generated after only a few thousand trials. And this leads us to surmise that if at some stage in cellular evolution ‘‘random polypeptides’’ were synthesized in great numbers, then all the folds might have been discovered quite easily by chance. This would mean that in the right environment, just as atoms are assembled in the stars and crystals form when rocks cool slowly, so the protein folds would also be formed automatically, wherever conditions permitted the synthesis of any quantity of polypeptides to occur. And because the association of many proteins with their prosthetic groups is basically spontaneous, and does not require the intervention of an enzyme, this raises the possibility that many protein functions may also have been generated deterministically in the protocell without the necessity for selection, by the direct association of small organic compounds with particular protein folds. In effect this means that merely by synthesizing a few thousand random polypeptides in a ‘‘broth’’ containing the basic biochemicals used by life on earth, including enzymic prosthetic groups, all the necessary proto-functional enzymes needed for cellular metabolism might be generated per saltum by natural law. These primitive enzymes could then be fine tuned by selection to generate the highly efficient enzymes of modern cells.
In effect the origin and evolution of the protein-based biochemistry of modern cells may be ‘‘a free lunch.’’ Evidence that intermediary metabolism may also have been given deterministically in the protocell was obtained in a recent study (Morowitz et al., 2000) which concluded that: ‘‘The chemistry at the core of the metabolic chart is necessary and deterministic and would likely characterize any aqueous carbon based life anywhere it is found in this universe.’’

The alternative to per saltum models is to envisage physically determined ‘‘constructional series’’ or evolutionary pathways starting from say, a simple single α helix structure and leading via a series of small motifs to the final fold. One example of a simple, two-step ‘‘constructional sequence’’ for the evolution of the classic TIM barrel from a half barrel was reported recently (Lang et al., 2000). But how feasible might such constructional pathways be in the case of many folds? In the case of some folds, the globin fold for example, no one has yet been able to provide a credible constructional sequence from simple motif to final fold to show how the fold might have come about via a series of stable intermediate forms. Some folds, like perhaps the TIM fold, may lend themselves to construction from simpler motifs but this may not be true of all folds. Of course as any set of small stable motifs and constructional sequences in prefold space would also be a finite set and very much given by physics, then such constructional sequences if they exist, would be no less ‘‘built-in’’ than the final set of stable folds to which they lead. However as there would be different routes through this pre-fold space to the 1000 folds, there would inevitably be an element of contingency in the actual routes taken. Nonetheless, the 1000 protein folds would still represent a physically determined or ‘‘built in’’ bottle neck through which protein evolution had to pass and through which it would have to pass on any earth-like planet, where life uses proteins constructed out of the same 20 amino acids.

The discovery that the folds are natural forms, whose evolution is determined largely by physical law and which are bound to arise spontaneously, ‘‘for free’’, in any large set of random amino acid sequences, strongly supports the widely held belief among origin of life researchers (already mentioned above) that life is itself an inevitable end of chemistry, a phenomenon which is bound to arise in the correct environmental conditions, perhaps in space or perhaps on the surfaces of newly formed planets (Lehninger, 1982; De Duve, 1991; Sowerby et al., 2001). It also provides new support for the currently fashionable Anthropic view that the laws of nature appear to be fine tuned for life (Barrow & Tippler, 1986; Denton, 1998; Davies, 2001). For the lawful nature of the folds provides for the first time evidence that the laws of nature may not only be fine tuned to generate an environment fit for life (the stage) but may also be fine tuned to generate the organic forms (the actors) as well, in other words that the cosmos may be even more biocentric than is currently envisaged!

The protein folds clearly represent a finite set of about 1000 natural forms determined like atoms and crystals and other natural forms by the laws of physics. They can be classified into distinct structural types, their structures can be accounted for in terms of a rational morphology of constructional rules, they have remained invariant for billions of years, and in many cases their functional adaptations are clearly secondary modifications of what are evidently ahistoric primary forms. They are robust both in their capacity to resist long-term evolutionary mutational challenges and in their capacity to maintain and regain their proper form in the face of the destabilizing challenges posed by the buffeting and turbulence of the cell’s interior. Depending on their frequency in sequence space, their evolutionary origin may have occurred either by saltation, or via physically determined intermediates, in other words by pre-ordained constructional pathways. In short, they do not conform in any way to the Darwinian conception of organic forms as contingent ‘‘Lego-like’’ functionally contrived assemblages of matter. On the contrary, they are wonderful exemplars of the pre-Darwinian and Platonic conception of organic forms as abstract, lawful and rational features of the eternal world order, which will occur throughout the cosmos wherever the same 20 protogenic amino acids are used to make proteins.


Cornelius G. Hunter
Member
Member # 81

posted 10. March 2003 01:21
Rex:

I agree with Rex that this is interesting. Protein structure is another example of why molecular biology is interesting. Molecular biology gives us examples we can (almost) get our arms around. In organic chemistry things appear to be determined by fundamental natural laws. And at the morphological level there appear to be too many possibilities and arbitrary choices for the designs to be determined by law. But in molecular biology there is a glimmer of hope that we can generate the set of all possible solutions and therefore get some idea of the relative roles of natural law and choice.

It sounds like Denton is saying that the protein structure data suggest that natural law has a stronger role than evolution would otherwise have guessed. His reasoning is that relatively few (thousands of) canonical folds have been found, compared to what theoretically we might have found. [Note: he seems to omit studies which came up with higher estimates of the number of folds, in the tens of thousands, but I suppose even these higher estimates don't necessarily refute his thesis].

From the excerpts that Mike provided I wasn't quite clear whether Denton et al. are suggesting a tweak to evolution or something to replace evolution. If the latter, I would put the thesis more into the "philosophy of biology" box than the "evidential argument" box. That is, there are far bigger problems with evolution than the fact that there are only a few thousand protein folds, and in fact I'm sure evolutionists would have no problem incorporating such a conclusion. Instead, it would seem to be a more general claim about how best to interpret what we are learning about biology (molecular biology in this case).

I'm not clear on what Rex was getting at when he wrote: "I agree that if there are only a few thousand basic folds, arguments by structural homology are weakened. However, aren't arguments about the probability of random function simultaneously strengthened?" Perhaps more words would help.

Then Rex asked: "do you know if they distinguished the hypotheses that (1) the stable folds we find are a historical accident of all possible stable folds and have been maintained through descent (much like the particular choice of amino acids we have), (2) polypeptide sequences fold into one of these thousand or so forms simply via the laws of physics (chemistry), or (3) we can classify any sensible (e.g. non-intersecting) 3D structure of repeating subunits with a thousand exemplars?"

It is not "either-or" as these are not exclusive hypotheses. Of course, Denton et al would affirm #2. But then you say: "3D homology is not good evidence [for common descent]" under #2. This is not quite right.

First, it is probably good to keep in mind that structural homology is not a stand-alone concept in evolution. IOW, homologous structures are supposed to have been generated by the same genes, so we would typically expect to find the genetic similarity to parallel the structural similarity in homologous structures. Designs in which there is significant structural similarity but not genetic similarity are more likely to be convergent rather than homologous in evolution's view. So, evolution can explain a variety of cases:

A) similar structure and similar amino acid sequence = likely homology
B) similar structure and not too similar amino acid sequence = likely remote homology
C) similar structure and dissimilar amino acid sequence = analogy (ie, convergent)
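
Cases A-C above amount to a small decision rule. Here is a minimal sketch in Python, assuming a yes/no structural comparison and a fractional sequence identity as inputs. The function name and both cutoffs (0.30 and 0.15) are hypothetical, picked only to separate the three cases; they are not taken from any published criterion.

```python
def classify_pair(structure_similar: bool, sequence_identity: float,
                  homology_cutoff: float = 0.30,
                  remote_cutoff: float = 0.15) -> str:
    """Map a pair of proteins onto cases A-C.

    Both cutoffs are illustrative assumptions, not established thresholds.
    """
    if not structure_similar:
        # Cases A-C only cover structurally similar pairs.
        return "no structural similarity (outside cases A-C)"
    if sequence_identity >= homology_cutoff:
        return "A: likely homology"
    if sequence_identity >= remote_cutoff:
        return "B: likely remote homology"
    return "C: analogy (convergent)"


for identity in (0.60, 0.20, 0.05):
    print(identity, "->", classify_pair(True, identity))
```

Note that structure alone never decides the outcome here: the sequence comparison does all the work of separating homology from convergence.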

It is true that at the morphological level, structural similarity (eg, pentadactyl pattern) is used in a strong metaphysical argument for evolution. Not so at the molecular level where structural similarity is not typically used as evidence for evolution, so your comment that "3D homology is not good evidence [for common descent]" under your #2 isn't on track.

For the record, the typical evolution argument at the protein level is from sequence, not structure. Specifically, a strong view of contingency (and low view of necessity) is assumed and therefore the parallel between (i) a protein sequence variation across species, and (ii) that of a different protein across those same species, or (iii) the variations of morphological features across those same species, is seen as a great coincidence unexplained by design and nicely explained by evolution.

As for your #3, it is not clear what you mean here. What do you mean by "repeating subunits with a thousand exemplars"? Denton et al are talking about folds that, in general, do not have repeating subunits.

--Cornelius

Frances
Member
Member # 169

posted 10. March 2003 01:32
While I understand Denton's argument that the laws appear to be designed for life, it may be more appropriate to state that the laws 'designed' life. I find it fascinating that Mike started this thread since I am close to starting a thread on similar findings in RNA space.

I find Denton's claim so far to be somewhat question-begging. I have no problem with the reality that laws of nature determine the composition of chemicals. What I find to be a much more difficult assertion is the suggestion that this is somehow anti-Darwinian: "they do not conform in any way to the Darwinian conception of organic forms as contingent".

I will start my own thread to discuss the world of RNA and show how RNA (maybe similar to DNA/protein) consists of a few very common structures, which in many cases are close neighbors of each other, and many uncommon structures. Peter Schuster and his students have done a lot of work in this area and have shown how neutral evolution may have played a very important role in the evolution of RNA. What is so fascinating to me is that most of these common structures seem to form well connected networks and, as others have shown, these are not random networks but scale-free networks, something one may expect from an evolutionary history. I will also address some thesis work which suggests that similar findings could apply to proteins. If this is the case then we may understand why evolution was so successful.

I understand that Denton and perhaps also Mike may want to argue that these data show evidence of some front loading. I have no problem with the idea that some intelligence front loaded our world at the beginning of the Big Bang. As I have stated elsewhere, this would be the best of two worlds (science and religion).
One of the papers I quoted above suggests that proteins sustain evolutionary drift. It may indeed be harder to distinguish between convergent and divergent evolution. Given the success of Schuster with RNA, finding similar characteristics for protein space would help us understand some of the observed characteristics of evolution such as punctuated equilibrium. I find it fascinating that Mike and I may be looking at the same data although maybe from different perspectives and reach maybe very similar conclusions about what the data show.

The structure of the protein universe and genome evolution

EUGENE V. KOONIN, YURI I. WOLF & GEORGY P. KAREV
quote:

Despite the practically unlimited number of possible protein sequences, the number of basic shapes in which proteins fold seems not only to be finite, but also to be relatively small, with probably no more than 10,000 folds in existence. Moreover, the distribution of proteins among these folds is highly non-homogeneous — some folds and superfamilies are extremely abundant, but most are rare. Protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties, which also extend to the number of connections between domains in multidomain proteins. All these distributions follow asymptotic power laws, such as have been identified in a wide variety of biological and physical systems, and which are typically associated with scale-free networks. These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle.

quote:

Projection of the structure of the protein universe on genomes and quantitative analysis of the outcome seems to result in some unexpected insights into general principles of genome evolution. Remarkably, the size distributions of folds for the explored part of the protein universe and of domain families for all analysed genomes, as well as the distribution of the number of domain connections in multidomain architectures, are all described by the same type of mathematical functions, in which the power law appears as an asymptotic. This suggests that extremely general mechanisms of evolution, apparently based on the preferential attachment (proliferation) principle, are at work in all these contexts.
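
The "preferential attachment (proliferation)" idea in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the model from the Koonin et al. paper: it assumes that each new protein either founds a new family with a fixed probability `p_new` (a made-up parameter) or joins an existing family with probability proportional to that family's current size. The function name and all numbers are hypothetical.

```python
import random
import statistics


def grow_families(steps, p_new=0.1, seed=0):
    """Toy preferential-attachment process; returns the list of family sizes."""
    rng = random.Random(seed)
    members = [0]        # one entry per protein; the value is its family id
    n_families = 1
    for _ in range(steps):
        if rng.random() < p_new:
            # A new protein founds a brand-new family.
            members.append(n_families)
            n_families += 1
        else:
            # Copying a random existing protein picks a family with
            # probability proportional to its current size.
            members.append(rng.choice(members))
    sizes = [0] * n_families
    for family in members:
        sizes[family] += 1
    return sizes


sizes = grow_families(20000)
print("families:", len(sizes))
print("largest family:", max(sizes))
print("median family size:", statistics.median(sizes))
```

Under these assumptions a few families become very large while most stay tiny, which is the qualitative "some abundant, most rare" pattern the abstract describes. Whether real genomes follow this mechanism is exactly what the paper argues, not something the sketch can show.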

Did evolution leap to create the protein universe?
Burkhard Rost

quote:

The genomes of over 60 organisms from all three kingdoms of life are now entirely sequenced. In many respects, the inventory of proteins used in different kingdoms appears surprisingly similar. However, eukaryotes differ from other kingdoms in that they use many long proteins, and have more proteins with coiled-coil helices and with regions abundant in regular secondary structure. Particular structural domains are used in many pathways. Nevertheless, one domain tends to occur only once in one particular pathway. Many proteins do not have close homologues in different species (orphans) and there could even be folds that are specific to one species. This view implies that protein fold space is discrete. An alternative model suggests that structure space is continuous and that modern proteins evolved by aggregating fragments of ancient proteins. Either way, after having harvested proteomes by applying standard tools, the challenge now seems to be to develop better methods for comparative proteomics.

quote:

Conclusions

Is protein structure and/or sequence space continuous, or has nature leaped when inventing folds and functions? If proteins were assembled from fragments, does this imply modularity of sequences and folds, as, for example, seen in short peptide fragments that regulate the targeting of proteins through the cell? Does the existence of short motifs or modules imply fragment assembly? I doubt that we have the data to unambiguously answer these questions. In fact, the evidence from analyses of entirely sequenced organisms is equally spread between pro and con ‘natura non facit saltus’.

Others show some beautiful examples of protein evolution

Evolution of a Protein Fold in Vitro
Matthew H. J. Cordes, Nathan P. Walsh, C. James McKnight, Robert T. Sauer

quote:

A “switch” mutant of the Arc repressor homodimer was constructed by interchanging the sequence positions of a hydrophobic core residue, leucine 12, and an adjacent surface polar residue, asparagine 11, in each strand of an intersubunit β sheet. The mutant protein adopts a fold in which each β strand is replaced by a right-handed helix and side chains in this region undergo significant repacking. The observed structural changes allow the protein to maintain solvent exposure of polar side chains and optimal burial of hydrophobic side chains. These results suggest that new protein folds can evolve from existing folds without drastic or large-scale mutagenesis.

Others look at how theoretical models seem to match the data

Estimating the Total Number of Protein Folds
Sridhar Govindarajan, Ruben Recabarren, and Richard A. Goldstein

quote:

ABSTRACT Many seemingly unrelated protein families share common folds. Theoretical models based on structure designability have suggested that a few folds should be very common while many others have low probability. In agreement with the predictions of these models, we show that the distribution of observed protein families over different folds can be modeled with a highly stretched exponential. Our results suggest that there are approximately 4,000 possible folds, some so unlikely that only approximately 2,000 folds exist among naturally occurring proteins. Due to the large number of extremely rare folds, constructing a comprehensive database of all existent folds would be difficult. Constructing a database of the most-likely folds representing the vast majority of protein families would be considerably easier. Proteins 1999;35:408–414.

More when I finish reading up on the RNA experiments.

Why are some protein structures so common?
(tertiary structure/protein evolution/lattice models/fitness landscapes/spin glasses)
SRIDHAR GOVINDARAJAN AND RICHARD A. GOLDSTEIN

quote:

ABSTRACT Many biological proteins are observed to fold into one of a limited number of structural motifs. By considering the requirements imposed on proteins by their need to fold rapidly, and the ease with which such requirements can be fulfilled as a function of the native structure, we can explain why certain structures are repeatedly observed among proteins with negligible sequence similarity. This work has implications for the understanding of protein sequence–structure relationships as well as protein evolution.

The following quote is quite interesting:

quote:

In contrast to models that explain the limited number of observed structural motifs by postulating a limited number of possible motifs (7), in our model, essentially all structures are possible, given the right set of interactions. It is the uneven probabilities of finding any particular structure that results in the observation of relatively few folds. As a result, as more protein structures are solved, novel motifs will continue to be observed. This model would also predict that a few folds would be observed very frequently, while most other folds would be observed only once in unrelated proteins. This seems to be a common phenomenon in the data base of known protein structures, as the work of Orengo et al. (4) indicates.
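
The point about uneven probabilities can be checked with a toy sampling experiment. This is a minimal sketch under stated assumptions: 4,000 possible folds (the figure from the Govindarajan et al. abstract above), fold k drawn with weight exp(-k**beta), and an exponent `beta = 0.3` that is invented for illustration. The stretched-exponential form is borrowed from that abstract, but nothing here is fitted to real data, and the function name is hypothetical.

```python
import math
import random


def distinct_folds_seen(n_possible=4000, n_samples=10000, beta=0.3, seed=0):
    """Return the running count of distinct folds seen after each draw."""
    rng = random.Random(seed)
    # Stretched-exponential weights: low-numbered folds are far more likely.
    weights = [math.exp(-(k ** beta)) for k in range(1, n_possible + 1)]
    draws = rng.choices(range(n_possible), weights=weights, k=n_samples)
    seen = set()
    running = []
    for fold in draws:
        seen.add(fold)
        running.append(len(seen))
    return running


running = distinct_folds_seen()
for n in (100, 1000, 10000):
    print(n, "structures solved ->", running[n - 1], "distinct folds seen")
```

In this toy setup the number of distinct folds keeps creeping upward as more structures are "solved", yet stays well below the number that are possible, which is the behavior the quoted passage predicts.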



[ 10. March 2003, 01:55: Message edited by: Frances ]

Cornelius G. Hunter
Member
Member # 81

posted 10. March 2003 02:39
Mike:

Here are some comments on portions of the Denton et al. text you excerpted:

**************
Denton et al.
During folding the amino acid sequence of a protein appears to be searching conformation space for increasingly stable intermediates which lead it step wise toward the deepest energy minimum for that sequence,
**************

This is a strange, almost Aristotelian way of putting it. I would not say the unfolded structure is "searching conformation space" for more stable intermediates. True, the structure does sample a small range of perturbations due to thermal fluctuations, but I think it is clear that the folding pathway is motivated by bonding interactions (primarily hydrophobic in the early stages). IOW, the folding is being "pushed" by chemical interactions rather than "pulled" by the next stable intermediate.

**************
Denton et al.
The process is driven thermodynamically via a succession of free energy decreases (Dinner et al., 2000). The process of folding is often pictured as being analogous to a ball finding its way down the sides of a complex rather irregularly shaped bowl to the bottom of the bowl, its final preordained and natural resting place, where the bottom of the bowl represents the natural free energy minimum of the fold.
**************

First, we do not know for sure that native conformations are free energy minima. Second, the funnel model he describes is hardly convincing, nor is it the consensus.

**************
Denton et al.
Even the terms used in the literature reflect the Platonic concept of matter ‘‘finding’’ or ‘‘filling’’ a pre-existing mold. Thus the folding process is often described as a mechanism by which ‘‘sequence selects structure.’’ As a recent author commented: ‘‘Thus the notion that sequence determines structure might be more precisely formulated with the concept that sequence chooses between the limited number of secondary structures available to the polypeptide backbone’’ (Honig, 1999).
**************

To finish the quote, "Thus the notion that sequence determines structure might be more precisely formulated with the concept that sequence chooses between the limited number of secondary structures available to the polypeptide backbone and determines how they are ordered with respect to one another in space" (Honig, JMB, 293:283-293).

The problem is that the sequence-to-beta strand correlation is difficult to explain without considering the 3D tertiary structure, since the strand is stabilized by long-range (in sequence) interactions (brought about by collapse to tertiary structure), typically with other strands to form a sheet. Likewise, even the helix preference is hard to justify without considering the tertiary structure since there is a tremendous entropy barrier weighing against the enthalpically neutral trade between backbone-solvent vs backbone-backbone (i to i+4) H-bonds. IOW, there does not appear to be a free-energy preference for secondary structure in the absence of tertiary, or at least collapse, considerations.

**************
Denton et al.
In other words, it is not the sequence which specifies the mold but the mold which specifies which sequences can be accommodated. For the mold is prior to the sequence, although of course during folding each particular sequence is prior in time to the form which it finally makes manifest. The ubiquitous text book claim that ‘‘the amino acid sequence determines the 3D form of the protein’’ is a mechanistic interpretation of the folding process which might be more accurately stated Platonically as ‘‘the prior laws of form determine which amino acid sequences can fold into a stable 3D form.’’ If the sequence contains any information, it is not information to create or generate a unique artifact-like assemblage analogous to a Lego construct or a watch, but more of a guide through a preexisting Platonic landscape to an already prefigured end.
***************

But the majority of sequences don't even fold, and in those that do, the folding pathway appears to be critical. I, personally, am convinced (this week anyway) that the solution to the protein folding problem will reveal sequence encodings that force the unfolded chain down a pathway that results in the native conformation. At the risk of oversimplifying, there are two fundamental requirements here: (i) A series of coordinated sequence-driven interactions forcing the chain down the pathway, and (ii) sufficiently favorable late-stage tertiary contacts such that the final structure is at a lower free energy than the preceding intermediate (ie, no late-stage frustration). Put simply, I don't think (non-homologous) structure prediction will work very well without considerations of the folding pathway. If so, then the sequence of information encoded in the primary structure is what counts in structure determination, and the limited number of native structures we find in nature could easily be explained as some combination of (i) this is all nature needs, and (ii) this is all the variation you can achieve given that you're using helices and strands (which are necessary results of the fundamental peptide backbone design). IOW, there are only so many ways you can pack the chain into a high-density globular cluster.

**************
Denton et al.
We speculate that the fact that the robustness of the folds [which enables them to maintain their forms and dependent functions in the face of both mutational challenges and conformational disturbances due to the turbulence of the cell’s interior] is ‘‘natural’’ may have deep evolutionary implications. The robustness of biological systems is generally conceived of as being analogous to that of advanced machines utilizing such devices as feedback control, parallel circuitry, error fail-safe devices, redundancy and so forth (Keller, 2000; Kitano, 2002; Csete & Doyle, 2002). But such robustness which we suggest might be termed ‘‘artifactual robustness’’ is inherently complex and can only be arrived at after millions of years of evolution and is necessarily a secondary and derived feature of any biological system or structure. The robustness of the folds is a natural intrinsic feature of the folds themselves and not a secondarily evolved feature. Robustness of this sort is ‘‘for free’’ and does not require the intervention of natural selection.
**************

What? Is it just me, or are they contradicting themselves? On the one hand, they say the robustness comes from evolution, and in the very next sentence they say it is a natural intrinsic feature that evolution can use. What gives?

--Cornelius

yersinia
Member
Member # 324

posted 10. March 2003 03:28
The only thing I have to add to this interesting discussion is to emphasize one of the points of one of the articles that Frances mentioned, namely that the statistical distribution of folds in sequence space is key.

E.g. if the distribution is exponential then it may be that there are tens of thousands (or more) of possible folds but that only the 1,000 or so most common ones in the distribution got heavily sampled in the last 4 billion years or so.

OTOH, what if the distribution were more uniform, but once a fold has been happened upon for function A, cooption (rather than re-invention) becomes a much more likely way of getting the fold used in functions B, C, etc.? In this case the most common folds found in biology would simply be those that happened to be "found" first.

OTOH again, certain folds might be more useful/ adaptable than others, which might make them more common in biology, whatever their place in the sequence-space distribution.

It's all very complex but I will enjoy following the discussion...

Rex Kerr
Member
Member # 632

posted 10. March 2003 05:08
Cornelius, I agree that my comment #2 is not "on track" in that 3D protein structure is not typically used as a robust indicator of common descent. Denton's results (or, really, his summary of others' results), if they are #2, simply suggest that 3D structure is an even weaker indicator.

In terms of my first comment, I meant that if you randomly fall into one of a small number of folds, and folds are important for function, then your chance of randomly attaining a function is the product of the probabilities of getting the right folds--which is a lot higher than you might imagine if you thought the primary sequence of amino acids was the key feature.

The "repeating subunits" I was referring to were amino acids. If you have any poly-X chain, it is going to have some structure. It may be that carefully picking 1000 maximally distinct structures will give you at least one structure that is close to an arbitrary structure. In this way, we would have partitioned poly-X-structure space into 1000 pieces. Of course any structure will be close to one of those, because that is how we've constructed it!
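
The partitioning argument can be made concrete with a toy example. This is a minimal sketch, assuming "structure space" is just random points in a unit square (a stand-in with no physical meaning) and that exemplars are picked greedily, each one as far as possible from those already chosen. The function name and the point counts are hypothetical.

```python
import math
import random


def pick_exemplars(points, k):
    """Greedy farthest-point selection.

    Returns the k exemplars and the worst-case distance from any point
    to its nearest exemplar.
    """
    exemplars = [points[0]]
    nearest = [math.dist(p, points[0]) for p in points]
    while len(exemplars) < k:
        # The next exemplar is the point that is currently worst covered.
        far = max(range(len(points)), key=lambda i: nearest[i])
        exemplars.append(points[far])
        nearest = [min(d, math.dist(p, points[far]))
                   for p, d in zip(points, nearest)]
    return exemplars, max(nearest)


rng = random.Random(0)
structures = [(rng.random(), rng.random()) for _ in range(1000)]
for k in (10, 100):
    _, radius = pick_exemplars(structures, k)
    print(k, "exemplars -> worst-case distance to nearest exemplar:",
          round(radius, 3))
```

Every point ends up close to some exemplar simply because the exemplars were chosen to cover the space, and the worst-case distance shrinks as more exemplars are added. That is the sense in which a good fit to a thousand exemplars could be an artifact of the classification rather than evidence about physics.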

Upon reading the quotes that Mike selected from the Denton article, it looks as though my options #1 and #3 have not clearly been ruled out yet, but it's pretty hard to tell, since everything's a reference to other papers that I don't have time to look up.

I do find it perplexing that Denton concludes that "the cosmos may be even more biocentric than is currently envisaged!". I could understand the use of a term like biophilic or biofavorable, because indeed, conditions in the universe allow life as we know it, and it wouldn't take too great a change for it to not allow life. But calling it "biocentric" makes it sound like life is the goal--which is perhaps personally comforting, but is very hard to support. (Incidentally, even the biophilicity is difficult to support in the sense of relative-to-alternatives, because we don't know what fraction of possible universes could contain something that could categorize itself as life--and I'm not sure we even know if the concept of possible universes is sensible.)

Yersinia makes good points about the distribution of fold-stability. I would add that we have only crystallized a few thousand proteins, and as such, I would expect that the least-common folds out of the thousand have only single examples so far. This suggests that we will find more as we crystallize more proteins, and emphasizes the point that it is the distribution of folds that is most important--both among proteins found in extant organisms, and in random polypeptide chains.

Cornelius G. Hunter
Member
Member # 81

posted 10. March 2003 11:51
I agree with yersinia and Rex that the distribution of folds is another interesting dimension. However, to be fair to Denton et al., we should probably allow them the *premise* that there are only a few thousand folds. After all, several investigators are coming up with that number.

Also, when I say the limited number of observed folds could be simply due to geometrical packing constraints, again to be fair, this probably would be acceptable to their thesis. Their point being that natural laws => organic chemistry => the workings of the peptide backbone => secondary structure => packing / geometrical constraints => limited number of folds.

In fact, the precise number of folds possible isn't really so important to their thesis as the fact that the folds appear to be discrete rather than continuous in conformational space.

In my research I've studied this very question at the secondary structure (what I call "local structure") level. My conclusion was that though secondary structure is typically presented and thought of as discrete forms, this is true only for alpha helices. Otherwise, local structure appears to have a continuous distribution in conformational space, though with a broad concentration at the beta strand (extended) conformation.

Of course, the H-bonding pattern of helices vs sheets do give us a nice couple of motifs that could be thought of as Platonic forms, but what I found was that the beta strands take on a wide range of shapes (not surprising) and that the set of all observed local structures in globular proteins do not fall into nice, isolated clusters, but rather fill out a continuum (again, the helix being the exception which does have a pretty concentrated distribution in conformation space).

Nonetheless, this doesn't refute the Denton et al. thesis about the tertiary folds being Platonic-like forms. And at the morphological level, this approach could be used to explain remarkable convergences, such as the marsupials vs placentals. I haven't read Denton's book; does he talk about this there? He must, as this is such an obvious application.

Interestingly, it seems that Denton's thesis could play well in both evolution and ID camps. Evolution needs to be rescued from its hard over "contingency dominates necessity" view which, though in some cases provides support, in other cases creates problems. What better way to explain the marsupials and placentals, to name one example, than to say that, like protein folds, there are only a few possible solutions out there in design space. And obviously, the front-loading version of ID could resonate with Denton's appeal to laws which appear designed for life.

--Cornelius

charlie d.
Member
Member # 159

posted 10. March 2003 12:20
quote:
***********
The robustness of biological systems is generally conceived of as being analogous to that of advanced machines utilizing such devices as feedback control, parallel circuitry, error fail-safe devices, redundancy and so forth (Keller, 2000; Kitano, 2002; Csete & Doyle, 2002). But such robustness which we suggest might be termed ‘‘artifactual robustness’’ is inherently complex and can only be arrived at after millions of years of evolution and is necessarily a secondary and derived feature of any biological system or structure. The robustness of the folds is a natural intrinsic feature of the folds themselves and not a secondarily evolved feature. Robustness of this sort is ‘‘for free’’ and does not require the intervention of natural selection.
**************

What? Is it just me, or are they contradicting themselves? On the one hand, they say the robustness comes from evolution, and in the very next sentence they say it is a natural intrinsic feature that evolution can use. What gives?

Actually, I think that they are making a distinction between "system robustness" (such as that characteristic of biological pathways and "machines"), which is necessarily derived and (they seem to concede * ) the product of evolutionary history, and the "physical robustness" of protein folds, which in their model is an inherent, primary property of the chemistry of proteins (and the "platonic" forms they represent, I guess).

* I was under the impression that Denton originally subscribed to the ID view that engineering analogies are more than just useful descriptive terms for biological entities, and reflected bona fide design. Perhaps I am misinterpreting their thought here, and that passage only means that such systems are derived, but still designed (a form of derived, progressive design, so to speak). On the other hand, I know Denton has changed some of his other positions (most notably, about common descent), so it's possible that he's gone "mainstream" on this as well.

Mike Gene
Member
Member # 149

posted 11. March 2003 00:21
Cornelius: First, it is probably good to keep in mind that structural homology is not a stand alone concept in evolution. IOW, homologous structures are supposed to have been generated by the same genes so we would typically expect to find the genetic similarity to parallel the structural similarity in homologous structures. Designs in which there is significant structural similarity but not genetic similarity are more likely to be convergent rather than homologous in evolution's view.

That's how I would tend to view it, but the fact is that many biologists have increasingly relied on structural similarity alone to declare homology. The argument is that 3D form is likely to be more strongly conserved than sequence, thus giving us a better signal for ancient events. However, if there are only a few thousand possible functional folds, the inference to homology, resting primarily on structural similarity, is greatly weakened.

Rex: Second, although I will read the paper to try to answer the question for myself, do you know if they distinguished the hypotheses that (1) the stable folds we find are a historical accident of all possible stable folds and have been maintained through descent (much like the particular choice of amino acids we have), (2) polypeptide sequences fold into one of these thousand or so forms simply via the laws of physics (chemistry), or (3) we can classify any sensible (e.g. non-intersecting) 3D structure of repeating subunits with a thousand exemplars?

Regarding (1), I would point out there is no reason to think these folds are historical accidents, and neither is there any reason to think the particular choice of amino acids is a historical accident. On the contrary, the special class of folds that comes with this particular class of amino acids may constitute part of the teleological reason for the choice of those 20 amino acids. More on this later.

Frances: I understand that Denton and perhaps also Mike may want to argue that these data show evidence of some front loading.

I don't see it as "evidence of some front loading." The significance is twofold. First, it increases the plausibility of front-loading. In fact, over the last year or so, I have been steadily uncovering facts and observations that make front-loading evolution, through the design of life, quite plausible. Secondly, as I mentioned before, inferences to homology based on structural similarity become very shaky. The incredulous "why would a designer reuse this fold?" argument has been seriously damaged.

Yersinia: OTOH, what if the distribution was more uniform, but once a fold is happened upon for function A, cooption (rather than re-invention) becomes a much more likely way of getting the fold used in functions B, C, etc.? In this case the most common folds found in biology would simply be those that happened to be "found" first.

That's one possibility. On the other hand, the most common folds may be descendants of originally designed states. And as you note, it is more likely these would be reused and shuffled about than having new folds appear de novo. This is one aspect of my thesis of front-loading evolution that I explained almost a year ago. Furthermore, there is the interesting hypothesis that some folds are inherently more "co-optable" than others. This likewise feeds into front-loaded evolution.

yersinia
Member
Member # 324

posted 11. March 2003 01:00
quote:

Secondly, as I mentioned before, inferences to homology based on structural similarity become very shaky. The incredulous "why would a designer reuse this fold?" argument has been seriously damaged.

I think you would need to tie fold to function in order to begin to damage the homology-based-on-structural-similarity argument. Then you could argue "for this function only this fold can do the job".

However, it's not clear to me that there is much necessary mapping between fold and function. Based on my (very limited) knowledge it would appear that the same fold can perform many functions and that many different folds can perform the same function. It may even be that the major significance of folds is mostly that they form relatively stable structures/scaffolds, but that function and interactions are mostly determined by relatively few, morphologically "minor" side chains.

If something like this is the case, and if we go with the definition of homology as "similarity in excess of that required for similarity in function", then IMO the structural homology inference remains a strong one (and we are only talking about folds in any case; structural homologies can be based on more macro features, e.g. the dynein-AAA ATPase homology).

Finally, we should keep in mind that the structural homology inference is supported by the following (IMO fairly compelling) pattern found in many proteins:

100% sequence similarity -- superimposable structures

90% sequence similarity -- superimposable structures
.
.
.
20% sequence similarity -- superimposable structures
(approaching random sequence similarity)
10% sequence similarity -- superimposable structures

...and this decay in sequence similarity follows the usual patterns of closeness-of-lineage-relationship.
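For anyone who wants to put numbers on the left-hand column, here is a minimal sketch of one common way to score it: percent identity over the aligned, non-gap columns of two pre-aligned sequences. Note this measures identity, which is stricter than "similarity" (no substitution matrix is used). The function name, the gap character, and the toy strings are assumptions made for illustration; they are not real proteins and this is not taken from any particular alignment package.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned (non-gap) columns where the two residues match.

    Both inputs must already be aligned to the same length; columns in
    which either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy strings for illustration only (hypothetical, not real proteins).
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKL"))  # 100.0
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # 90.0
```

Whether two structures count as "superimposable" is a separate measurement (typically an RMSD after structural alignment) and is not covered by this sketch.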

[ 11. March 2003, 01:02: Message edited by: yersinia ]

Frances
Member
Member # 169

posted 11. March 2003 01:27
I do not understand why these findings would undermine the use of sequence similarities to infer homology. As I understand it there is a large space of sequences which map to a protein space which appears to be adhering to a power law. Basically this means that there are a few common folds and many rare folds. In fact what it does show is that what initially may have been evidence of convergent evolution may in fact be divergent evolution after all.
These findings, and the finding that these protein networks, like so many other networks in biology, are scale free, combined with successful evolutionary models that recreate such distributions, seem to strengthen rather than weaken the evolutionary case. In fact another aspect of these networks, degeneracy rather than redundancy, seems to be prevalent. While redundancy is a common engineering concept, degeneracy seems to be far more an 'invention' of nature.
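To make "adhering to a power law" concrete, here is a rough sketch of how one might estimate the exponent from fold usage counts, assuming the folds are ranked from most to least common and that count is proportional to rank raised to some negative power. It fits a straight line to the log-log plot by ordinary least squares, which is a crude estimator (maximum-likelihood fits are generally preferred for real data). The function name and the numbers are made up for illustration and are not taken from any fold database.

```python
import math


def loglog_slope(ranks, counts):
    """Least-squares slope of log(count) against log(rank).

    For an exact power law count = C * rank**(-alpha), the slope is -alpha.
    """
    xs = [math.log(r) for r in ranks]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, exactly power-law data (hypothetical): count = 1024 / rank**2.
ranks = [1, 2, 4, 8, 16]
counts = [1024.0, 256.0, 64.0, 16.0, 4.0]
print(round(loglog_slope(ranks, counts), 6))  # -2.0
```

A few very common folds and a long tail of rare ones shows up as a roughly straight, downward-sloping line on such a plot.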

Concepts such as punctuated equilibrium, neutral evolution, gene duplication, degeneracy and robustness all seem to flow naturally from these scale free networks. Furthermore, there may be other ways to infer homology; see for instance "Identification of homology in protein structure classification" by Sabine Dietmann & Liisa Holm.

quote:

Structural biology and structural genomics are expected to produce many three-dimensional protein structures in the near future. Each new structure raises questions about its function and evolution. Correct functional and evolutionary classification of a new structure is difficult for distantly related proteins and error-prone using simple statistical scores based on sequence or structure similarity. Here we present an accurate numerical method for the identification of evolutionary relationships (homology). The method is based on the principle that natural selection maintains structural and functional continuity within a diverging protein family. The problem of different rates of structural divergence between different families is solved by first using structural similarities to produce a global map of folds in protein space and then further subdividing fold neighborhoods into superfamilies based on functional similarities. In a validation test against a classification by human experts (SCOP), 77% of homologous pairs were identified with 92% reliability. The method is fully automated, allowing fast, self-consistent and complete classification of large numbers of protein structures. In particular, the discrimination between analogy and homology of close structural neighbors will lead to functional predictions while avoiding overprediction.
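As a reading aid for the "77% ... with 92% reliability" figures, here is a small sketch of how two such numbers are usually computed when predicted pairs are scored against a reference classification. It assumes that "identified" means coverage (the fraction of reference homologous pairs that were predicted) and that "reliability" means the fraction of predicted pairs that are in the reference; the abstract does not define either term, so treat both definitions as assumptions. The function name and the protein labels are hypothetical.

```python
def coverage_and_reliability(predicted_pairs, reference_pairs):
    """Return (coverage, reliability) for predicted homologous pairs.

    coverage    = true positives / number of reference pairs
    reliability = true positives / number of predicted pairs
    Pairs are unordered, so ("a", "b") and ("b", "a") are the same pair.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    reference = {frozenset(p) for p in reference_pairs}
    true_positives = len(predicted & reference)
    coverage = true_positives / len(reference) if reference else 0.0
    reliability = true_positives / len(predicted) if predicted else 0.0
    return coverage, reliability


# Toy labels for illustration only (hypothetical).
reference = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]
predicted = [("b", "a"), ("a", "c"), ("d", "f")]
cov, rel = coverage_and_reliability(predicted, reference)
print(cov)  # 0.5
```

In this toy case two of the four reference pairs are recovered, and two of the three predictions are correct.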

But once again we see no way to establish whether there was actual front loading involved or whether the initial state was determined by some common ancestor. So once again front loading seems to remain indistinguishable from descent from a common ancestor, and the 'evolution' from the initial state onwards seems to be fully guided by natural law. I do realize that Mike may not be arguing against methodological naturalism, but many ID proponents seem to propose ID as an alternative to methodological naturalism. As Murray argues in "NATURAL PROVIDENCE (OR DESIGN TROUBLE)", front loading suggests that ID would be no alternative to methodological naturalism.

quote:

So perhaps disciplinary territorialism should not rule out Intelligent Design as a genuinely scientific explanation. But we are not out of the woods yet. For even though countenancing design as an explanation might in principle count as genuine science, it cannot if the design hypothesis is not empirically distinguishable from explanations which appeal only to the natural powers of natural substances. If such empirical distinguishability is not possible, then there is no scientifically respectable way, by IDT’s own lights, to defend intelligent design as an explanation distinct from law and chance.

But we are getting sidetracked here on issues of front loading which are not really relevant to this thread on protein folds.

[ 11. March 2003, 02:04: Message edited by: Frances ]

Mike Gene
Member
Member # 149

posted 11. March 2003 03:26
Yersinia: I think you would need to tie fold to function in order to begin to damage the homology-based-on-structural-similarity argument. Then you could argue "for this function only this fold can do the job".

I don't agree. If there are only a relatively few folds, then the homology-based-on-structural-similarity argument is without much support. Why infer homology from structural similarity? And certainly a design perspective does not entail that each fold must have a specific function.

Look at it this way. If you want me to accept homology between two proteins, you'll need something better than structural similarity.

Frances: I do not understand why these findings would undermine the use of sequence similarities to infer homology.

We're not talking about sequence similarities. The question relates to inferring homology from structural similarities.

It is interesting that Conway Morris raised the same point:

quote:
The question we need to ask is whether a structure (molecular or organismal) is similar because it shares a common ancestry and thus is homologous or because there is no (or very few) alternative. The former approach, of course, underpins most evolutionary thinking and has potentially a strong historical component. Convergence, on the other hand, points towards adaptive constraint in which the historical dimension is relatively unimportant.
http://idthink.net/biot/scm/index.html


All times are East Coast

All content © ISCID and content contributor 2001-2003