|
Author
|
Topic: Shapiro on the Genome
|
gedanken
Member
Member # 594
|
posted 01. March 2003 18:36
My apologies to the moderator for not making clear the connection of my little detour into the definition of information and the original topic of Shapiro’s Genome Engineering collection of articles, and Mike’s additional introductory remarks including Davies on “information theory” and further remarks on “communication theory”. I intend this to pretty much wrap up my discussion of this, relating back to the topics.
Norbert Wiener, in his 1949 Cybernetics: or Control and Communication in the Animal and the Machine said in the concluding paragraph of chapter V on “Computing Machines and the Nervous System”:
quote: As a final remark, let me point out that a large computing machine, whether in the form of mechanical or electric apparatus or in the form of the brain itself, uses up a considerable amount of power, all of which is wasted and dissipated in heat. The blood leaving the brain is a fraction of a degree warmer than that entering it. No other computing machine approaches the economy of energy of the brain. In a large apparatus like the Eniac or Edvac, the filaments of the tubes consume a quantity of energy which may well be measured in kilowatts, and unless adequate ventilating and cooling apparatus is provided, the system will suffer from what is the mechanical equivalent of pyrexia, until the constants of the machine are radically changed by the heat, and its performance breaks down. Nevertheless, the energy spent per individual operation is almost vanishingly small, and does not even begin to form an adequate measure of the performance of the apparatus. The mechanical brain does not secrete thought “as the liver does bile,” as the earlier materialists claimed, nor does it put it out in the form of energy, as the muscle puts out its activity. Information is information, not matter or energy. No materialism which does not admit this can survive at the present day.
Information is a character of organization that goes beyond any simplistic characterization, and in this I need to modify somewhat my earlier presentation. We see, however, that “information” does require energy, for example, and that “information” is a property or description of physical states of matter and energy. (And since “matter” is a form of energy, “information” is a property of the organization of energy.) However, the connection between biological systems and concepts of information that Mike and Shapiro mention has origins that go back many years. I don’t think that this is a new recognition, however much its truthfulness has been extended and enhanced with recent knowledge.
Mike Gene, quoting Davies:
quote: This point goes back to something Paul Davies noted, as he approached the topic of biology from the perspective of a physicist:
Concepts like information and software do not come from the natural sciences at all, but from communication theory, and involve qualities like context and mode of description - notions that are quite alien to the physicist’s description of the world. Yet most scientists accept that informational concepts do legitimately apply to biological systems, and they cheerfully treat semantic information as if it were a natural quantity like energy. Unfortunately, “meaning” sounds perilously close to purpose, an utterly taboo subject in biology. So we are left with the contradiction that we need to apply concepts derived from purposeful human activities (communication, meaning, context, semantics) to biological processes that certainly appear purposeful, but are in fact not (or are not supposed to be).
I disagree somewhat with Davies here in that there is a direct connection of thermodynamic entropy (in fact the negative of that term) and a particular energy based understanding of information. That particular understanding deals in organization and disorganization, but strictly from an energy perspective. (Suggested reading: SecondLaw.com.) Since all information stored in physical systems exists as an organization of energy in some form, and that form deals in organization vs. disorganization, this connection is not insignificant -- and in that respect is not in the least foreign to physicists.
It is my contention that there is a continuum of degree of applicability of information concepts that extends from the simplest physics to the most complex of biological or man-made computer or control or processing systems.
I have, however, found the comments on definitions of information to be very useful, and will try to present a modified view to accommodate what I have learned.
Definitions so far:
- Gedanken (original): Information is “data” subject to interpretation. (Interpretation is some process that makes use of the “data” in some manner. “Data” is an assembly of physical states of energy, and how those states form the “data” is dependent on the process that will interpret the data and thus make use of it. I discussed examples like a command that causes action when the ‘information’ is interpreted. In this concept the “information” was a greater abstraction, and the “data” was the physical state of energy in the medium.)
- Charlie D: Information is the interpretable representation of something in a different format (physical medium).
- Frances: (From Shannon) Information is the reduction in uncertainty. (Here I presented details in a post above showing that Shannon discusses information in terms of a human ultimately interpreting what is called “information”, and thus Frances’s definition does not avoid the interpretation issue, even though it was left to ellipsis. Frances’s definition also brings in the notion of quantification of information.)
- ‘Nobody’: Computer Science. Processed, stored, or transmitted data. (When asked about “interpretation” ‘Nobody’ referred to uses of the human genome in terms of reverse-engineering of “software”, and pointing out the issue of “coding” of information. Here we still see the responsive action of a system outside of the organized energy state that determines the “information” by performing some sort of operation such as “process, store, transmit”.)
- Dembski (From here): “the actualization of one possibility to the exclusion of others”. (Dembski distinguishes from, but does not exclude, communication over a channel. After discussion of communication channels, Dembski concludes that they are not necessary; the essence is that one possibility happens and others are ruled out. Dembski identifies “specified” and “unspecified” information as distinctions within the general term “information”.)
- [I shall note Dictionary.com’s definition 3: A collection of facts or data: statistical information. (This is significant in that it also relates “information” as data rather than allowing for “information” to be an abstraction beyond or distinct from “data”. This is not particularly useful for our purposes IMO -- I only note it in regard to ‘Nobody’s definition and referral to computational and software processes as significant relationships of biology and man-made computing.)]
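Frances’s Shannon-style definition in the list above ("reduction in uncertainty") can be made concrete with a small sketch. The eight-message scenario and the function name entropy_bits are illustrative assumptions, not anything taken from Shannon’s paper or from the posts above.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum(p * log2(p)) of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical scenario: eight equally likely messages before anything is received.
prior = [1 / 8] * 8
# After a received signal rules out six of them, two remain equally likely.
posterior = [1 / 2] * 2

# "Information" in this sense is the drop in uncertainty caused by the signal.
information_gained = entropy_bits(prior) - entropy_bits(posterior)
print(entropy_bits(prior), entropy_bits(posterior), information_gained)
```

Note that the numbers only mean something once a set of possible messages and their probabilities has been chosen, which is the point about interpretation raised in the list.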
I think that we have almost universal agreement that some sort of process must extend the formatted energy of “information” to make some further use of it. Information is not the “energy” itself, and must extend to the formatting or organization of the energy (but often far beyond the simple “information” concept of negative thermodynamic entropy as well). Anything that is information must contribute to organization in some manner, and must include some level of identification of a system outside of the stored energy states that form the “information” storage medium. We have disagreement about the relationship of “data” to “information”, with some definitions defining information as data with additional requirements, and other definitions ignoring the term “data”. At this point I am willing to abandon the notion of the “information” as a further abstraction of the “data”, and rather to partially define information as data stored in a formatting of energy states. (I may propose an alternate definition at the end in which we reverse this decision.)
Given the apparent consensus on the need for an outside system to “interpret” the stored energy states in some manner, I would like to focus on the little parable I gave in the post above dated 24. February 2003. There I gave the case of a telephone conversation, in which we can consider the voltage (or a digitized version thereof) of the waveform on the telephone line as the “data”, but wherein there were two agents observing that “data”. One agent was a person listening to another speaking, and another agent was a technician detecting a noise source that generated waveforms on the telephone line. For each agent, a different aspect of the waveforms (or ‘data’) was of interest, and the other aspect was interference. The person listening on the telephone finds the waveforms added by a poor connection to be “noise” and interfering with the speech “information” that she is receiving. But the technician regards the voice waveforms as “noise” and wants to interpret the waveforms added by the poor connection so as to locate the source of the poor connection or gain some “information” about location from that waveform’s properties. Each agent finds the other’s noise is his/her information and the other’s information is her/his noise.
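The symmetry in the parable can be put in numbers with a minimal sketch. The two power values are made up for illustration, and treating signal-to-noise as a plain power ratio is a simplifying assumption.

```python
# Hypothetical average powers of the two components on the line (arbitrary units).
voice_power = 4.0   # the speech the listener cares about
fault_power = 0.5   # the waveform added by the poor connection

def snr(signal_power, noise_power):
    """Signal-to-noise ratio as a plain power ratio."""
    return signal_power / noise_power

# The listener treats the voice as signal and the fault waveform as noise.
listener_snr = snr(signal_power=voice_power, noise_power=fault_power)
# The technician treats the fault waveform as signal and the voice as noise.
technician_snr = snr(signal_power=fault_power, noise_power=voice_power)

print(listener_snr, technician_snr)
```

The same waveform yields reciprocal ratios for the two agents; nothing in the data itself says which component is the "information".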
This gets us to the physical nature of the interpretation of “information” in a biological system. That which is noise in one context can be “information” in another. Thus the “information” is only relevant in terms of the physical process that reacts in some manner. Information is a property of stimulus-response systems, however widely described. Specifically “information” is the aspect to which the system responds.
Now clearly from the definitions above we are not required to have a human interpreting the “information” at the end of the causality chain before the energy states are considered to contain “information”. Evidence for this is the references to computers processing or storing alone, and thus the “information” may never be sensed by a human (though potentially could be). But I think that it is still “information” even if it is just used in such an organizing process, and that the human is not required for “interpretation”. In this I still feel justified in claiming that “interpretation” is an aspect of a system making some use of the energy states of the “information” that are impinging on that system.
One way I disagree with Dembski’s definition (which Frances claimed was similar to the Shannon definition that he proposed) is that information is clearly in relation to a given process (or process model). Thus “information” is not simply “actualization of possibilities” since the possibilities must be defined in relation to the processes of interest in order to give any quantification of probabilities as is requested by Dembski. I agree on the actualization of one possibility, and the exclusion of other possibilities. But the measure of those in terms of probabilities requires context. Thus there is an element of the “communications channel” identifiable in every case of “information” because the “channel” can somehow be associated as the context in which the possibilities are given a probability model. This identification may be somewhat unusual, and not the standard “channel” that one would think of as a telephone line. But nevertheless the channel can be identified in every case in which information is present (stored, transmitted, interpreted). And if one considers a more expansive list of possibilities, including interactions from more distant systems, and also changes the probability model in terms of the initial state and context from which any probability is to be modeled, then the probabilities themselves change. Since they change, the “information” value changes -- thus it is dependent on context, which can be identified as an equivalent of a “channel”. The term “channel” is more appropriate to discussion of communication, and the term “context” may be substituted for “channel” without loss of the sense of my meaning. Here, once again, the “context” is the way the choices are interpreted. (E.g. a set of choices may all be given a single value in interpretation, and thus the probability model is based on the interpretation and not the microscopic analysis of all possible outcomes.)
One consequence of this is that there is no absolute probability or measure of information, but rather information must be measured in terms of a context which includes a starting state. That pathway from starting state to actualization of possibilities can be identified as the “channel” (reversing the identification of the “channel” as the “context”). From this I conclude that Frances is correct, that Dembski’s “information” is essentially Shannon information, but with consideration to a slight increase in generality. With the proper method of identifying the “channel” understood, there is no difference in the base definition. Dembski goes on to differentiate “specified” and “unspecified” information as distinct forms of information.
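A minimal sketch of this context dependence, using a fair six-sided die as a stand-in system (the die, the two interpreting functions, and all names here are my illustrative assumptions). The same physical outcome carries a different number of bits depending on how the interpreting process groups the possibilities.

```python
import math
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]  # assumed equally likely

def every_face(face):
    """Context 1: the interpreting process distinguishes all six faces."""
    return face

def parity_only(face):
    """Context 2: the process responds only to odd versus even."""
    return face % 2

def surprisal_bits(outcome, interpret):
    """-log2 of the probability of what the interpreter actually registers."""
    same_reading = [f for f in FACES if interpret(f) == interpret(outcome)]
    probability = Fraction(len(same_reading), len(FACES))
    return -math.log2(float(probability))

bits_fine = surprisal_bits(4, every_face)     # probability 1/6
bits_coarse = surprisal_bits(4, parity_only)  # probability 1/2
print(bits_fine, bits_coarse)
```

The outcome "4" is the same event in both runs; only the probability model attached to the interpretation changed, and the measured information changed with it.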
Something I had previously not realized is that thermodynamic or statistical mechanics “information” (negative thermodynamic or stat. mech. entropy) is a special case of Shannon (or Dembski) information. One simply identifies the “channel” (or context as above) in terms of an initial state of a system. Then the “interpretation” is a very specialized view of the entire set of all states in the system. This may have a sort of absolute nature in the sense that for a given system there is very nearly a single description of the states that can be computed, and thus a single “interpretation” in terms of thermodynamic order and complexity. There are in fact issues of granularity even in this type of model, and thus actually no single “interpretation” even in statistical mechanics “information” from physics; even here different communications channels can be identified in terms of different types of “interpretation” at the end of the channel as described above. This relates Dembski information back to the original statement by Davies, that “information” terminology would be foreign to Physics, and in this sense it is not. But clearly we have other forms of information, when we interpret the events in terms of effects within a model of how a subset of a biological system functions rather than in terms of all possible states in the system. Thus once again biological information is an extension or generalization of something that can occur in a simple physics particle system description. The physics description is a generalized notion of order, while the biological system view or interpretations refine this to additional views of system function.
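The special-case relationship can be written out explicitly, under the assumption (mine, for illustration) that the same probabilities p_i over microstates are used in both formulas. The Gibbs entropy S and the Shannon entropy H then differ only by a constant factor:

```latex
S = -k_B \sum_i p_i \ln p_i ,
\qquad
H = -\sum_i p_i \log_2 p_i
\quad\Longrightarrow\quad
S = (k_B \ln 2)\, H .
% For W equally likely microstates: S = k_B \ln W and H = \log_2 W.
```

The granularity issue mentioned above shows up here as the choice of what counts as a distinct microstate i; a different partition of the states gives different p_i and therefore a different value.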
Returning to Wiener’s Cybernetics (1949) chapter on “Computing Machines and the Nervous System”:
quote: Computing machines are essentially machines for recording numbers, operating with numbers, and giving the result in numerical form.
quote: It is a noteworthy fact that the human and animal nervous systems, which are known to be capable of the work of a computation system, contain elements which are ideally suited to act as relays. These elements are so-called neurons or nerve cells. While they show rather complicated properties under the influence of electrical currents, in their ordinary physiological action they conform very nearly to the “all-or-none” principle; that is, they are either at rest, or when they “fire” they go through a series of changes almost independent of the nature and intensity of the stimulus. …
quote: Thus the nerve may be taken to be a relay with essentially two states of activity: firing and repose. … This is perhaps an oversimplification of the picture: the “threshold” may not depend simply on the number of synapses but on their “weight” and their geometrical relations to one another with respect to the neuron into which they feed; and there is very convincing evidence that there exist synapses of a different nature, the so-called “inhibitory synapses,” which either completely prevent the firing of stimulation at the ordinary synapses. … A very important function of the nervous system, and, as we have said, a function equally in demand for computing machines, is that of memory, the ability to preserve the results of past operations for use in the future. …
Wiener notes very early the importance of numerical value (e.g. “weights” of threshold). But this predates a long history of AI research that concentrates on symbolic processing, methods that belie the statement that computers are primarily for working with “numbers”. But as I have discussed in other threads, the solution to problems like the AI “frame problem” will almost certainly be found in approximate reasoning and things that relate to “fuzzy logic”. Symbolic, theorem-proving style calculation is not sufficient for judgment. Wiener is prescient in noting prior to 1949 the importance of these methods of computation in information processing, including their importance to the issue of “interpretation” of information. His emphasis on numbers could reflect limited experience with the uses often made of those early systems, but such systems were used for code breaking and other symbolic computations, and interpretation of symbols by computer was hardly a foreign notion even in 1949. We note that Shannon’s “information theory” also dates from 1948.
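The relay picture in the Wiener passages (all-or-none firing, a threshold, weighted synapses, inhibitory synapses) can be sketched as a simple threshold unit. The weights and threshold below are invented for illustration, and modeling inhibition as a negative weight is one common simplification, not something Wiener specifies.

```python
def fires(active_synapses, weights, threshold):
    """All-or-none relay: True if the summed weight of active synapses reaches the threshold."""
    total = sum(w for active, w in zip(active_synapses, weights) if active)
    return total >= threshold

# Hypothetical neuron: two excitatory synapses and one inhibitory synapse,
# the latter represented here by a negative weight.
weights = [1.0, 1.0, -2.0]
threshold = 2.0

both_excitatory = fires([True, True, False], weights, threshold)
with_inhibition = fires([True, True, True], weights, threshold)
print(both_excitatory, with_inhibition)
```

The output is a two-state value, yet what decides it is numerical weighting, which is the sense in which the numeric and the "interpretive" aspects meet in such a unit.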
In Shapiro’s paper “Genome Organization and Reorganization in Evolution: Formatting for Computation and Function” in the set referenced above are the keywords:
quote: evolution, genome formatting, genome system, system architecture, natural genetic engineering, computation, signal transduction, DNA rearrangements, information storage, repetitive DNA
I completely agree with Mike’s point that these terms (and the other references to “information” and “communications” theory aspects being used in biological descriptions) make a firm association of information and communications theory and biological systems. I completely agree that some form of something that must be related to what we call “intelligence” must be involved. Where we disagree is on the physical relationships of the “intelligence” in temporal and philosophical levels of causality.
I personally find that the very processes of constrained random variation and selection according to principles of relationship to physical environment are related to my understanding of mental processes in the physical brain as it creates a “design”. As such I consider these associations of physical processing of “information” by physical systems to be entirely consistent with expectations for both human designed systems, and those that are considered to be the product of biological evolution according to current theory. As one Eastern philosopher said “The universe naturally peoples.” This does not remove the interest in such associations from the philosophical level -- why is the universe one which “naturally peoples”?
This for me is the important philosophical realization of this entire issue. Why does the universe, whether or not some of life was seeded here on Earth by an intelligent embodied or unembodied agent, naturally sustain these relationships? Those questions are not answered by making inferences of agent action in the physical chain of causality -- because they go beyond single events to the entire pattern of physical relationships of “information” and “processing” and the physical systems that we can observe today. (Not to mention that any such action by an agent just gives rise to questions of regress, wherein the same question may be asked again.)
But to return to the study of physical systems, I have not found the conflict that some seem to imply when considering the concurrence of information processing in human designed control and communications systems, and those complexities of interpreting energy patterns as “information” in physical systems that are considered to have evolved.
--
To discuss an alternate view, we can return to the issue of “information” as an abstraction. In this view physical systems process “data” in different levels of complexity. But humans create descriptions of those processes as abstractions. In this case the “data” becomes “information” only in an abstract model created by a human. The entirety of the constructs of “information” or “communications” presented by Shapiro are thus human constructs in human terms to explain operation of physical systems. Since all human models are on an equal footing, there is nothing significant in noting that “information” in this context is a human construct.
[Edit includes addition of “Dembski” definition of information, and discussion thereof. Also minor corrections to sentences. Since no one has replied to this post, I felt no harm would occur with updating.] [ 02. March 2003, 23:58: Message edited by: gedanken ]
IP: Logged
|
|
gedanken
Member
Member # 594
|
posted 02. March 2003 23:05
Prior post was updated somewhat as indicated. I thought this would be better as integrated rather than a separate post.
|
|
Erik
Member
Member # 160
|
posted 03. March 2003 06:01
Frances wrote: "I personally like the definition given by Shannon which is btw the same as used by Dembski to define information. Basically information is the reduction in uncertainty. This definition does allow for a rigorous mathematical foundation to address the concept of information (biological). In fact using Shannon's concepts of information and the second law of thermodynamics I believe one can derive the law of 'conservation' of CSI for closed systems and similarly show that for open systems (including systems with intelligent inputs) no such restrictions exist."
Dembski's definition is not the same as Shannon's. Shannon's definition depends on the probabilities of all messages (outcomes) that could have been sent, and nothing else. Dembski's definition depends on (i) the state of knowledge of the person using it (i.e. the ability to formulate "specifications"), and (ii) the probability of the specification of the event/outcome actually observed, but not on the probability of outcomes outside the specification. The law of conservation of CSI cannot be derived in any mathematical sense; it is incurably informal and non-rigorous. See my essay on the topic and Dembski's response.
Erik
|
|
gedanken
Member
Member # 594
|
posted 04. March 2003 01:27
Erik,
Dembski appears to be defining “specified information” for the term that he claims a “conservation”. (I mentioned this in my listing on Dembski’s definition.) So there are additional constraints to “specified information” from simple “information”. But I think this thread has been concerned mostly with general information as I shall call it here. It would be worth examining the comparisons.
There are aspects of this that I think make it highly relevant to this thread.
For one, conservation of information, if it were to have a scientific basis, could be used as an argument from some sort of analogy that both were created by the same type of mechanism. I in fact agree with this assessment, but do so because I associate the mechanism of evolution with a form of intelligent action because I see parallels in both that I think account for the similarity. But the only “conservation” argument is one that gets back to a philosophical issue of character of the properties (potentials) of the universe (observed in science), rather than an indication of involvement of an external agency.
The papers that I think are most relevant to this post are my original link to Dembski’s popular paper “Intelligent Design as a Theory of Information”, and Dembski’s response to Erik, “If Only Darwinists Scrutinized Their Own Work as Closely: A Response to ‘Erik’”. The first lacks precision in order to be a popular presentation, and thus cannot be relied upon for technical details. The latter ignores some issues from Erik’s essay, so we’ll have to punt. Also of relevance are Matt Young’s “How to Evolve Specified Complexity by Natural Means”, Dembski’s response “Refuted Yet Again! A Brief Reply To Matt Young”, and a response by Matt Young to Erik (which I don’t think I agree with). I do agree that “specified information” is (ill-)defined differently than general information, which I have described as an agreement of several definitions in a previous post.
What has caught my attention is that there may be an explanation of why “specification” can be found as a common theme in both things created by human “design”, and in biological evolution and biological systems. (Thus part of the description of biological systems, and part of the description of human invention in communications theory, for example.) My premise is that the human mind is limited, but has a good ability to make connections. Ideas are taken from observation of nature and then can form “specifications” for human designs.
From Dembski’s response:
quote: The Law of Conservation of Information in its deterministic form claims that specified complexity can’t be created from scratch by deterministic processes.
From this we have some interesting implications. A deterministic process can be tracked back through the causality chain to the point at which we examine a probabilistic or uncertain source of causation. I think I will ignore this beyond echoing Erik’s concerns for clarity, as it would correspond to a perfect (error free) communication channel. (And note that I have claimed that every information context can be identified as its communication channel, even if the concept of it being such a “channel” is somewhat unconventional.)
And later discussing stochastic processes:
quote: Yet as soon as stochastic processes come into play, chance can produce information -- indeed, unlimited amounts of it. Chance can also produce limited amounts of specified information. …
Clearly Dembski is not claiming in any way that general information (e.g. unspecified or unclassified) being communicated in a biological system would be subject to a “conservation” law. Thus the general information discussion is not relevant to discovery of a conformance of information processing in biology and in human design with respect to a “conservation” proposal.
Furthermore this distinguishes the general notion of “information” (or general information) from “specified information.” This is consistent with my post above, that Dembski further classifies information as “specified” or “unspecified”, and what we have been attempting to define and relate to communication is the more general information. But this is also of interest that information that is interpreted must have a character that goes beyond simple randomness or characterization simply as “data”. It will be interesting to investigate the correspondence of “specificity” to the notion of information having a character that is specifically related to how it is “interpreted”. So “specified information” must be a subset of general information.
I have been concentrating on the aspect of “interpretation” as a necessary concept of what we consider to be information in biology. Is this act of “interpretation” sufficient to provide information with the quality of “specification?”
I think not -- for example humans and animals react to random occurrence (say a noise of specific character) and that is interpreted by specific mechanisms, yet the “information” content, being relevant and interpreted by that mechanism, is not itself a “specification”. For if it were, all noises could be declared “specification”. So I believe that there is a concept of “information” that lies somewhere else beside simple “data” and Dembski’s notion of “specification”. (Or if this association can be made to hold, it belies the relevance of “specification”.)
But “specification” is likely to come from things that are part of human thought. Since human thought is based on our experience with ourselves and nature (including observation of nature) it is entirely likely that the body of “specification” available to us would include those things that are descriptions of functional systems that either nature “designed,” or that we can “design” by making connections to similar ideas (based in part on observation of that very nature).
Dembski continues: quote: … Any statement of a stochastic form of the Law of Conservation of Information therefore must incorporate a modulus of probability to control for the production of limited amounts of specified information by chance. The universal probability bound sets the limit to just how much specified information can be produced by chance. In particular, specified complexity is beyond its remit. The argument sketched above shows how the modulus of probability comes into play in the stochastic formulation of the law. This argument, with its focus on the limited free play of chance over the space of stochastic elements, was always implicit. Erik has helped make it explicit.
But here is the problem, and the association with my previous points.
I agree with Dembski that when a short sequence of action has as its explanation something that is less probable than the UPB, that explanation is probably flawed (i.e. another explanation outside of those given must be applied).
But this does not hold for long sequences of action, because there are contingencies present at each point of action. In the action, we can add “limited amounts of specified information by chance” at each juncture. But what about continuation indefinitely? What happens when we simply keep adding these “limited amounts” over a very, very long period of time?
Erik comments in his essay that “it is reasonable to conclude that at most 10^90 rejection functions can be identified.” This is in the context of a discussion in which the number of quantum transitions in the universe’s entire history is put in the range of 10^150, obtained by multiplying the number of elementary particles in the observable universe (10^80) by the lifetime of the universe in units of Planck-time transitions (10^70). In that discussion he mentions that Dembski’s argument “while flawed, actually yielded a good conclusion” (a fact remarked on in Dembski’s response). But Erik leaves to a footnote number “5” the question “Or should it be 2^(10^90) rejection functions?”
But specifications are not concomitant with elementary transitions -- there is no direct scalar relationship. Where there are relationships are in the number of possible assemblies (or very approximately permutations) of objects, and the number of possible ways of describing functional relationships in language to form “specifications”. Erik is correct in his footnote in noting the combinatorial explosion of possibilities. The number of possible assemblies has little to do with the number of elementary transitions, and each such transition is a possible point of quantum indeterminacy in the pathway. The number of actually possible pathways depends on the physical constraining nature of physical reality itself, which feeds back to the combinatorial explosion of possible pathways. This reduces the number of pathways to an almost infinitesimally small fraction of all possible cross-products of such transitions -- but that small fraction is a fraction of an almost unimaginably large number.
As much as Dembski tries, the number of potential possibilities for functional forms has little to do with the number of actual actions having been taken by the universe to this point. There are more than 10^150 (2^500) different forms in a series of 500 coin tosses alone, and the number of permutations of relationships of physical forms must be exceedingly large. There has been no demonstration that the fraction of functional forms is a small fraction of contingencies given limiting processes like natural selection. And furthermore the enormous number of possibilities for forms is demonstrated by our very observation of the exceptional variety of life itself.
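The magnitudes in the last few paragraphs can be checked with exact integer arithmetic. This sketch only reproduces the figures quoted above (10^80 particles, 10^70 transitions each, 500 coin tosses, and the 2^(10^90) of Erik’s footnote); the variable names are mine.

```python
import math

# Figures quoted in the discussion above.
particles = 10 ** 80
transitions_per_particle = 10 ** 70
total_transitions = particles * transitions_per_particle  # exactly 10**150

# Number of distinct outcomes of 500 coin tosses (about 3.27e150).
coin_sequences = 2 ** 500

# The same bound expressed in bits: log2(10**150) = 150 * log2(10).
bound_in_bits = 150 * math.log2(10)

# Erik's footnote: 2**(10**90) is far too large to write out, so count
# its decimal digits instead (an approximation via floating point).
footnote_digits = 10 ** 90 * math.log10(2)

print(total_transitions == 10 ** 150)
print(coin_sequences > total_transitions)
print(bound_in_bits)
print(footnote_digits)
```

So 500 coin tosses already have slightly more outcomes than the quoted count of transitions, and the combinatorial figure in the footnote has on the order of 10^89 decimal digits, which is the explosion referred to above.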
Way back on page 1 of this thread I said:
quote: In the thread algorithmic info, probability, etc. (including successive posts) I make the point that an arbitrary amount of the “software” of a Turing machine with a “program” (software) on tape can be transferred to the actual state machine representation of the Turing machine. Thus certain measures like Kolmogorov complexity are subject to certain viewpoint issues. (Also see my first post which is in more descriptive terms rather than highly technical terms of the thread.)
My point in the thread referred to is that a universal Turing machine (taken as a data compression scheme) is hardware dependent. In other words the size of the expressions depends on the TM itself. (Yes there are constants of translation, but those constants can be arbitrarily large between any two languages!) I think that our language naturally includes the subject matter of observed complexity of nature. Thus we have a natural basis in our language of short strings to represent invention, both biological and human.
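A rough analogy for this machine dependence, using Python’s zlib as a stand-in. This is not Kolmogorov complexity, only a compressor whose "built-in knowledge" can be varied through a preset dictionary; the message text and function name are illustrative assumptions.

```python
import zlib

# Hypothetical message to be "described" by two different machines.
message = b"constrained random variation and selection in a physical environment"

def compressed_length(data, preset_dictionary=None):
    """Length in bytes of the zlib encoding, optionally with a preset dictionary."""
    if preset_dictionary is None:
        compressor = zlib.compressobj(level=9)
    else:
        # The dictionary plays the role of knowledge built into the machine itself.
        compressor = zlib.compressobj(level=9, zdict=preset_dictionary)
    return len(compressor.compress(data) + compressor.flush())

plain_machine = compressed_length(message)
primed_machine = compressed_length(message, preset_dictionary=message)
print(plain_machine, primed_machine)
```

The message is identical in both runs; the description gets shorter only because part of it was moved into the machine, which is the sense in which a language that already contains the subject matter yields short "specifications".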
And that “specifications” relate to intelligence is to be expected in my opinion. [ 04. March 2003, 01:44: Message edited by: gedanken ]
|
|
|