ISCID Forums


Author Topic: Quantifying Biological Complexity
Moderator
Administrator
Member # 1

posted 06. December 2002 11:56
I'm not sure how well this will work, but in another thread Evan said the following:

quote:

If anyone else besides Warren has some ideas about how to quantify biological complexity (a critical step in actually applying the explanatory filter), I hope they will offer their “brainstorm” about this sometime.

Well, why don't people take a hack at it in this thread? Remember, the purpose of Brainstorms is not to present rock solid evidence (not necessarily at least), but preliminary ideas. Brainstorm!

So, here are the rules.

1. Say all you have to say about the thread topic without making the core of your post a criticism of someone else's comments. Let us know what YOU think about biological complexity and what the best way to approach it empirically is.

2. Each person can make only one post in this thread. If you need to make changes, edit your original post.

Post away. (BTW, Evan, thanks for the idea!)

[ 06. December 2002, 11:57: Message edited by: Moderator ]

Noel Rude
Member
Member # 516

posted 06. December 2002 16:00
Not being a biologist, maybe I can still take a stab at this topic. There is nothing more information-rich than language, but linguists have always worried more about describing and accounting for the system that codes that information than about quantifying the information coded therein. Maybe this is how biologists should react to DNA.

However, perhaps of interest to biology is that information in language is negotiated clause by clause. Each clause must relate in some way to what has gone before to be coherent, but without advancing the flow of information we have only redundancy (or tautology). I have a linguist friend who went away years ago to work on AI, and his interest was in detecting and measuring this advance of new information. How is it, for example, that a computer that was fed news items would be able to detect new information in them? This would have to come -- not just from detecting a novel string of 1s and 0s, but from computing from them the logical information they contain and then matching this against the information stored in its memory banks. No mean task this.

Maybe this pertains to biology, not only in the way DNA gets read and in the internal workings of the cell, but in evolution. I have found most biologists reluctant to admit to any evolutionary advances -- maybe this is what Warren Bergerson is getting at -- demonstrating increases in specified complexity.

[ 06. December 2002, 16:09: Message edited by: Noel Rude ]

warren_bergerson
Member
Member # 262

posted 06. December 2002 17:15
For those who haven’t read the Biological Information thread, I define complexity, or one dimension of the volume of biological information, in terms of the teleological or adaptive complexity of input-output functions. At a point in time, the teleological complexity of a processing unit is defined by N, the number of possible processing algorithms, divided by Nf, the number of processing algorithms which are adaptive.

To illustrate the concept involved, consider the teleological complexity of a single neuron. A single neuron can have something like 1,000 to 10,000 inputs from other neurons. Inputs to neurons are in the form of all-or-nothing impulses which produce chemicals that influence the electrical potential of the neuron receiving inputs. There is a maximum frequency of such impulses, and the impact on the electrical potential gradually fades over time.

For the sake of discussion, assume that a neuron has 1000 inputs, assume the duration during which there can be no more than 1 impulse from an input neuron is k, and assume the impact on electrical potential lasts for a duration of 3k. Given these assumptions, the generation of impulses at a point in time t depends on something like 3000 binary inputs and a processing algorithm Ft. Given the size of the input domain, there are something like 2^3000 (a very big number) possible processing algorithms. This is the value of N. The portion of processing algorithms which are adaptive or teleological will be some subset of the set of possible processing algorithms. While I don’t have direct measures of the size of Nf, it is much smaller than N. The level of complexity is very high.
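
As a rough illustration of the arithmetic above (a sketch, not the poster's own code): with 1000 inputs and a window of 3 time slots, the input space has 2^3000 patterns, and it is easiest to work in log2 so the numbers stay manageable. The function names and the value used for log2(Nf) are hypothetical placeholders, since the post gives no measurement of Nf. Note also that 2^3000 counts the distinct binary input patterns; the number of distinct yes/no firing rules over that domain would be 2^(2^3000), so the figure for N is simply taken as stated.

```python
def log2_input_patterns(n_inputs: int, window_slots: int) -> int:
    """log2 of the number of binary input patterns.

    Each input contributes one binary value per time slot in the window,
    so there are n_inputs * window_slots binary variables in total.
    """
    return n_inputs * window_slots


def teleological_complexity_bits(log2_n: float, log2_nf: float) -> float:
    """log2(N / Nf): the ratio from the post, expressed in bits."""
    return log2_n - log2_nf


# Figures from the post: 1000 inputs, potential lasting 3 slots of duration k.
log2_n = log2_input_patterns(1000, 3)          # 3000, i.e. N = 2**3000

# ASSUMPTION: the post gives no value for Nf. 2900 is an arbitrary
# placeholder used only to show how the ratio would be computed.
hypothetical_log2_nf = 2900
bits = teleological_complexity_bits(log2_n, hypothetical_log2_nf)

print(f"log2(N) = {log2_n}, log2(N/Nf) = {bits} bits (Nf is a placeholder)")
```

Any real use of this ratio would hinge on an independent estimate of Nf, which is the hard part.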

A second dimension of the volume of biological information is the speed at which a system can change if the set of adaptive processing algorithms changes. It is well known that the processing algorithm in a neuron can and does change in a matter of milliseconds.

Using my definition of complexity, the levels of biological complexity in biological systems are very, very high, and the processes responsible for producing changes in teleological complexity are extremely powerful and extremely fast.

My definition of complexity is clearly very different from the definitions used in genetics. The adaptive/evolutionary change processes associated with my definition of complexity are clearly much faster and much more powerful than the processes suggested by Darwinian and neo-Darwinian theory.

It will, however, be noted that my definitions are consistent with what is known about the speed and power of animal intelligence and human creative intelligence. The ability of animals and humans to generate creative designs is based on very powerful processes. The biological designs produced by non-human, non-animal systems are fully as complex as the designs produced by humans.

It seems exceedingly reasonable that the processes producing biological designs are fully as powerful as those associated with animals and humans. It is, to me, incredible that the scientific community can still accept theories which suggest that the complexity of life forms was produced by a process that can generate something like 1 bit of information for every 55 lives.

Daniel Edington
Member
Member # 421

posted 06. December 2002 22:23
[Wink] you could at least spell my name right. anyway my post was contentful and positive. Care to explain why you feel otherwise?

Moderator snip...

Mr. Edington snip...

[ 07. December 2002, 19:27: Message edited by: Daniel Edington ]

Frances
Member
Member # 169

posted 08. December 2002 14:23
Concepts of complexity and information are not new to science and not even to evolutionary science.

I would propose to follow in the footsteps of Shannon and define information in terms of Shannon entropy. This approach places complexity/information close to Dembski's definition, I(A) = -log_2 P(A), but with a major difference. Rather than restricting the analysis to uniform distribution functions, we need to determine the actual entropy reduction in order to determine information flow. Using this approach, one can determine the amount of entropy reduction due to the interaction with the environment.

Let's define some terms to simplify the discussion

X: ensemble of sequences
p_i: probability of occurrence of sequence S_i

Entropy of ensemble X

H(X)=-sum p_i log p_i

summed over all i

When the probabilities are drawn from a uniform distribution, entropy reaches its maximum value H_max

H_max(X) = L

where L is the sequence length (taking the logarithm to the base of the alphabet size, e.g. 4 for nucleotides, so that each position contributes at most one unit)

When selection is active, however, the probabilities of finding a given genotype become non-uniform. The information stored in X about the environment E is defined to be I(X:E)

I(X:E)=H_max(X) - H (X|E)= L + sum p_i log p_i

Thus one may argue that the correlation between the Environment and the ensemble is what determines the information.
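
The definitions above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from any of the cited papers: the probabilities p_i are estimated naively from sequence counts in a sample, logarithms are taken to the base of the alphabet size so that H_max = L, and the function names are hypothetical. The surprisal function is the I(A) = -log_2 P(A) form mentioned earlier in this post, included for comparison.

```python
import math
from collections import Counter
from itertools import product


def surprisal_bits(p: float) -> float:
    """I(A) = -log2 P(A): the single-event measure quoted above."""
    return -math.log2(p)


def ensemble_entropy(sequences, alphabet_size: int = 4) -> float:
    """H(X) = -sum_i p_i log p_i, with p_i estimated from counts.

    ASSUMPTION: the log base equals the alphabet size, so entropy is
    measured in units where a fully random site contributes 1.
    """
    counts = Counter(sequences)
    total = len(sequences)
    return -sum(
        (c / total) * math.log(c / total, alphabet_size)
        for c in counts.values()
    )


def information_about_environment(sequences, alphabet_size: int = 4) -> float:
    """I(X:E) = H_max(X) - H(X|E) = L + sum_i p_i log p_i."""
    length = len(sequences[0])
    return length - ensemble_entropy(sequences, alphabet_size)


# Uniform ensemble: every dinucleotide appears once, so I(X:E) is ~0.
uniform = ["".join(pair) for pair in product("ACGT", repeat=2)]
# Fully selected ensemble: one genotype has taken over, so I(X:E) is L.
fixed = ["AC"] * 16

print(information_about_environment(uniform))
print(information_about_environment(fixed))
```

Estimating p_i from counts only works when the sample is large relative to the number of possible sequences, which is rarely true for realistic L; per-site approximations are the usual workaround.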

ev: Evolution of Biological Information

What is Complexity by Adami

Schneider has successfully applied these concepts and extended them to include molecular information theory

One of the 'problems' with this approach is that complexity/information looks at the DNA/RNA rather than at the proteins coded for. As is well known, several different codons can all encode for the same amino-acid and many different codon sequences can all have similar protein foldings.
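
To make the redundancy point concrete, here is a small sketch (an illustration only, with hypothetical function names) that counts how many DNA sequences encode the same short peptide under the standard genetic code, and how much that lowers the number of bits needed to specify the peptide rather than one exact DNA string. It treats every codon as equally likely and ignores stop codons and codon bias, which are simplifying assumptions.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons in total).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def dna_level_bits(peptide: str) -> float:
    """Bits needed to pin down one exact coding sequence (2 bits per base)."""
    return 2.0 * 3 * len(peptide)


def synonymous_sequences(peptide: str) -> int:
    """How many distinct DNA sequences translate to this peptide."""
    return math.prod(CODON_DEGENERACY[aa] for aa in peptide)


def protein_level_bits(peptide: str) -> float:
    """Bits needed if any synonymous coding sequence is acceptable."""
    return dna_level_bits(peptide) - math.log2(synonymous_sequences(peptide))


peptide = "MLS"  # Met-Leu-Ser: 1 * 6 * 6 synonymous coding sequences
print(synonymous_sequences(peptide))
print(dna_level_bits(peptide), protein_level_bits(peptide))
```

The gap between the two figures grows further once different amino-acid sequences with similar folds are also counted as equivalent.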

A Testable Genotype-Phenotype Map: Modeling Evolution of RNA Molecules

This paper describes some exciting new findings in RNA evolution by looking not just at the genotype but also the phenotype.

The findings are
  • More sequences than structures
  • Few common and many rare structures
  • Common structures are found almost anywhere in sequence space
  • Neutral networks of common structures extend over whole sequence space
I believe that a combination of phenotype mappings (possible in RNA space, doubtful at the moment in DNA space) can help us further understand the relevance of adaptive versus neutral evolution.
This paper by Schuster shows how adaptive evolution and neutral evolution 'work together' to bridge 'gaps' in fitness.

[ 09. December 2002, 00:49: Message edited by: Frances ]

yersinia
Member
Member # 324

posted 08. December 2002 15:01
If it were up to me, I would define a first approximation of genetic information as:

The amount of space it takes a computer to hold the functional sequence.

E.g., more DNA is more information. Even a mere gene duplication is a bit more information, as (1) even unmodified gene duplications can be biologically relevant for increasing gene dosage etc. (e.g. pesticide resistance), and (2) even under a compression algorithm, it would take a bit of information to specify "X copies of gene Y" instead of just assuming only 1 copy.

Divergent copies of genes would of course have more information, being less compressible.
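
A quick way to try this first approximation is to use an off-the-shelf compressor as a stand-in for "the space it takes a computer to hold the sequence". The sketch below is only an illustration: the gene is a random string rather than real sequence data, zlib is a crude general-purpose compressor and not a biologically informed one, and the helper names and the 10% divergence rate are arbitrary assumptions.

```python
import random
import zlib


def compressed_size(seq: str) -> int:
    """Bytes needed to store the sequence after zlib compression."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def random_gene(length: int, rng: random.Random) -> str:
    """Stand-in for a gene: a random string over A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def diverge(gene: str, rate: float, rng: random.Random) -> str:
    """Copy a gene, substituting a different base at roughly `rate` of sites."""
    out = []
    for base in gene:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


rng = random.Random(0)          # fixed seed so the run is repeatable
gene = random_gene(2000, rng)

single = compressed_size(gene)
duplicated = compressed_size(gene + gene)                   # exact duplicate
diverged = compressed_size(gene + diverge(gene, 0.10, rng))  # divergent copy

print(single, duplicated, diverged)
```

Under these assumptions the exact duplicate costs only a few extra bytes, while the divergent copy costs noticeably more, which matches the ordering described above.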

Modifications could be made to expand the above, separating out e.g.:

1) Coding sequence information
2) Regulatory sequence information
3) Non-sequence-dependent function (e.g. "skeletal DNA")
4) Totally nonfunctional

One way to merge the above would be to modulate the information measurement depending on "allowed variability" within a given sequence. This would get very complicated very quickly, but there are some methods that would allow an approximation, e.g. looking at protein sequence variability for the same protein across taxa -- e.g. many proteins retain the same 3D structure with low sequence similarity. This would reduce the information metric drastically, but would also allow comparison, e.g. skeletal DNA might have almost no info value even in a large number of bases.
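
One possible reading of "allowed variability" is a per-column conservation score over an alignment of the same protein from different taxa: a column that tolerates any residue contributes almost nothing, while an invariant column contributes the full log2 of the alphabet size. The sketch below is a hypothetical illustration of that idea, not an established metric from this thread. The toy alignment is made up, and no small-sample correction is applied.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues observed in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conserved_information(alignment, alphabet_size: int = 20) -> float:
    """Sum over columns of (max entropy - observed entropy), in bits.

    ASSUMPTION: sequences are pre-aligned, gap-free, and of equal length.
    """
    max_bits = math.log2(alphabet_size)
    return sum(max_bits - column_entropy(col) for col in zip(*alignment))


# Made-up toy alignment: the first two columns are invariant,
# the third varies freely across the four "taxa".
toy_alignment = ["MKA", "MKC", "MKD", "MKE"]
print(conserved_information(toy_alignment))

# The same function on a DNA alphabet (4 letters, 2 bits per site).
print(conserved_information(["AA", "AC", "AG", "AT"], alphabet_size=4))
```

With real data the number of sequences per column is small compared with the alphabet, so the observed entropy is biased low and the score is biased high; any serious use would need a correction for that.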

To me, this seems the obvious route to go. Of course, information measured this way is readily increasable by known mutational and selective processes; perhaps this feature would account for the unpopularity of this metric in certain circles...
