Topic: Paper On Specificity
|
Jerry D. Bauer
Member
Member # 756
|
posted 23. May 2005 17:51
Thaxton, Yockey and Brillioun included the total number of proteins because they would not have been mathematically correct had they not. That's just high school probability mathematics.
Would you really think there is not more information content in two rolls of quarters than there is in one?
One roll of quarters contains 40 coins, and considering 2 microstates per coin, 2^40 ≈ 10^12 and log2(2^40) = 40 bits.
Two rolls contain 80 coins: 2^80 ≈ 10^24 and log2(2^80) = 80 bits, exactly twice the amount of information.
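The coin arithmetic above can be checked directly. This is a minimal sketch, assuming each coin is an independent two-state (H/T) system with equally likely states; the function name `bits_for_coins` is only illustrative and does not come from the thread.

```python
import math

def bits_for_coins(n_coins: int, states_per_coin: int = 2) -> float:
    """Return log2 of the number of equally likely configurations."""
    # n * log2(states) avoids building the huge integer states ** n
    return n_coins * math.log2(states_per_coin)

one_roll = bits_for_coins(40)   # one roll of 40 quarters
two_rolls = bits_for_coins(80)  # two rolls, 80 quarters

print(one_roll, two_rolls)
# Express 2^40 as a power of ten to compare with the "10^12" figure
print(f"2^40 is about 10^{40 * math.log10(2):.2f}")
```

Under these assumptions the bit count scales linearly with the number of coins, which is the doubling the post describes. Whether that capacity figure is the right measure of "information" is what the rest of the thread debates.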
This ain't exactly brain surgery, Chris. But I would be glad to recommend some pages on how this is done, if you wish.
I don't have that book in front of me. But I sincerely doubt Dembski thinks that two units of information contain no more information than one of those units would.
According to the Ken Miller paper I sent you to:
"When Dembski turns his attention to the chances of evolving the 30 proteins of the bacterial flagellum, he makes what he regards as a generous assumption. Guessing that each of the proteins of the flagellum have about 300 amino acids, one might calculate that the chances of getting just one such protein to assemble from "random" evolutionary processes would be 20^-300, since there are 20 amino acids specified by the genetic code. Dembski, however, concedes that proteins need not get the exact amino acid sequence right in order to be functional, so he cuts the odds to just 20^-30, which he tells his readers is "on the order of 10^-39" (Dembski 2002a, 301). Since the flagellum requires 30 such proteins, he explains that 30 such probabilities "will all need to be multiplied to form the origination probability" (Dembski 2002a, 301). That would give us an origination probability for the flagellum of 10^-1170, far below the universal probability bound."
20^30 ≈ 10^39, and (10^39)^30 = 10^1170. Gee. Dembski calculated it just as I did, and I have not read the book. Of course, he did a lot of estimation and omitted several of the proteins. I looked up the amino acid sequences and calculated all of the relevant proteins I could identify, so our estimates are not similar, but the math is identical.
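The exponents in the quoted calculation can be reproduced with logarithms. This is a rough sketch using only the numbers in the Miller quote (20 amino acids, 30 effective sites, 30 proteins); the variable names are illustrative, and the rounding to 10^-39 follows the quote rather than any independent estimate.

```python
import math

AMINO_ACIDS = 20      # residues specified by the genetic code
EFFECTIVE_SITES = 30  # reduced site count used in the quoted estimate
PROTEINS = 30         # proteins assumed for the flagellum in the quote

# log10 of the chance of one protein: 20^-30
log10_one = -EFFECTIVE_SITES * math.log10(AMINO_ACIDS)

# The quote rounds this to 10^-39 before multiplying 30 such probabilities
log10_rounded = round(log10_one)
log10_all = PROTEINS * log10_rounded

print(f"one protein: 10^{log10_one:.2f}, rounded to 10^{log10_rounded}")
print(f"all {PROTEINS} proteins: 10^{log10_all}")
```

Multiplying independent probabilities corresponds to adding their logarithms, which is why the exponent -39 is simply multiplied by 30.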
|
Mesk
Member
Member # 630
|
posted 25. May 2005 03:51
Jerry,
Do two identical copies of a DNA sequence contain more information than one copy of the same sequence?
If so, would you agree that a genetic duplication event spanning a stretch of (say) 500 nucleotides would constitute the generation of CSI according to your definition?
If not, then how can it be that two identical copies of a protein molecule contain more information than a single copy of the same molecule?
Mesk. [ 25. May 2005, 03:51: Message edited by: Mesk ]
|
Jerry D. Bauer
Member
Member # 756
|
posted 25. May 2005 08:04
Hello Mesk. Haven't talked to one of my favorite biologists for a while. Hope all is well.
Yes, two identical DNA sequences contain more information than would one. And yes, I would agree that a genetic event (think polyploidy if that's not what you're thinking) can produce CSI.
|
Mesk
Member
Member # 630
|
posted 29. May 2005 04:56
Hi Jerry,
So do you accept that certain mutations with a reasonably high probability of occurring without any intelligent intervention, such as polyploidy, can result in the generation of CSI?
Mesk. [ 29. May 2005, 05:00: Message edited by: Mesk ]
|
JohnAaron
Member
Member # 1554
|
posted 31. May 2005 11:36
So do two copies of the same dictionary contain more information than one? In one sense yes: there is physically more material containing CSI. However, as to the nature of the content, there is not any more. Rather, it is simply two copies of the same information. Copying information X gives you 2X, not X plus Y. Furthermore, if you have 2X, any subsequent change in one X doesn't give you X plus Y, but gives you X plus (X-A), where A is the subset of specified complexity that is lost in the mutation/change process. Otherwise I could just write the first chapter of my PhD thesis, copy it inaccurately 10 times and have an easy PhD, no?
|
Jerry D. Bauer
Member
Member # 756
|
posted 01. June 2005 02:15
I see your point, John. If you write a PhD thesis and copy it 9 times with the duplicator, you don't have 10 PhDs to stick on the old C.V. But I would argue that you have 10 times the amount of information as you could then diffuse that information to 9 more recipients.
I also think we may be confusing complexities. When one makes needless copies of information for no purpose, that information is not specified because if you threw a copy of the thesis in the trash, nothing would happen. But when one makes necessary copies as in 10 professors waiting to see the work, and you throw one copy in the trash, one professor will not see the thesis and you may not get the PhD.
Simple aggregate complexity as in adding or subtracting marbles from a bucket of marbles will result in C = xy where y represents the unit (a marble) and x is the number of marbles. But specified information is different in that C = a + b + c + d +.....n.
The latter is specified because if copy 1 is going to professor a, copy 2 to professor b, copy 3 to professor c.........we then have specificity as in C = 1a + 2b + 3c........
Is that clear as mud?
|
JohnAaron
Member
Member # 1554
|
posted 01. June 2005 10:39
Yes, that does make sense. I also don't know if we are on the same page with respect to 'specificities'. I accept that in one sense two strands of identical DNA contain twice as much information as one strand, much like two buckets of water contain more water than one bucket of water. But you still have fundamentally the same thing: water! Thus, although a genetic event (e.g. polyploidy) may produce twice the "volume" of DNA/information, it really doesn't change the "content". So if plant A and plant B have information X and Y respectively, and polyploidy gives a new plant, C, with information XY, there is not really any increase in the information, simply a new context (plant C) in which pre-existing information (XY) operates. Therefore polyploidy only produces new CSI in the most limited sense, that of "volume" and not of novel information. [ 01. June 2005, 10:40: Message edited by: JohnAaron ]
|
Stephen Wright
Member
Member # 195
|
posted 01. June 2005 14:38
quote:
“Although this means that the classical information storage capacity of a qubit is exactly one bit, there is no elementary entity in nature corresponding to a bit. It is qubits that occur in nature. Bits, Boolean variables, and classical computation are all emergent or approximate properties of qubits, manifested mainly when they undergo decoherence.” (see Deutsch 2002a). http://www.qubit.org/people/david/Articles/ItFromQubit.pdf
This is a very confusing subject area, and I admire Jerry for striding into the battle with confidence. The ISCID encyclopedia doesn’t have an entry for specified information. I am certain about one issue. In terms of Shannon Information, transmitting two dictionaries informs the receiver zero more than one dictionary. I think the criteria for the target are important, where a change in certainty/uncertainty needs to be assessed to address specificity: in this case, the definition of words. There is no additional reduction in the uncertainty of word definitions after two transmissions instead of one.
Same case with 10 term papers. Multiple identical term papers do not change the value of specified information with which uncertainty can be reduced by recovering the content from code. Each transfer of the paper to another intelligent agent reduces the uncertainty at the receiver. But the number of receivers does not impact the i measurement of the original paper. The number of transmissions affects the amount of “mutual information” created. The concept of mutual information is very interesting to me and I would appreciate reading other opinions of what it means.
A movie compressed on a DVD has a specific number of bits when physically stored. When broadcast it doesn’t increase to the multiple of the exact number of listeners.
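The point about duplicate copies can be illustrated with a general-purpose compressor. This is only a sketch under stated assumptions: compressed size is used as a rough proxy for information content (closer to an algorithmic notion than to Shannon's channel definition), and the 10,000 pseudo-random bytes are a hypothetical stand-in for "a dictionary", kept small enough to fit inside zlib's 32 KB matching window.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical stand-in for one message: 10,000 pseudo-random bytes
message = bytes(rng.getrandbits(8) for _ in range(10_000))

# Compressed size of one copy versus two identical copies back to back
one_copy = len(zlib.compress(message, 9))
two_copies = len(zlib.compress(message + message, 9))

print("one copy:  ", one_copy, "bytes")
print("two copies:", two_copies, "bytes")
```

The second copy is encoded mostly as back-references to the first, so the two-copy output should be only slightly larger than the one-copy output, while the raw storage needed for the uncompressed copies doubles. That mirrors the distinction drawn here between message content and physical storage.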
The information in the roll of quarters is also unclear in your delineation. Unless there is a code that comes from a series of Hs and Ts, how is the information content specified? If the uncertainty target is - how many quarters? - any single finite number answers the question with certainty, no matter how many rolls. (Also, tacitly reducing the uncertainty regarding mass, volume and purchasing power values.) Please refer to the citation above, first transmitted at ISCID’s hometown during Wheeler’s 90th birthday event, by D. Deutsch. It seems like you are conflating the virtual concept of codified information and the physical storage sites as magnetic particles (or as qubits). I think they are different. Maybe you could comment?
ps – referring to the melodious phrase 2 bits – 4 bits –6 bits - a dollar, before 1947 the answer for the 1 roll of quarters is 320 bits. The context of where/how uncertainty exists at the receiver, is part of the Gestalt of information transfer quantification.
|
Jerry D. Bauer
Member
Member # 756
|
posted 01. June 2005 18:31
You both make some good points and I enjoyed your posts. John says: "But you still have fundamentally the same thing: water!"
Can't argue with that. Then Stephen lays a line on me seeming so blatantly obvious that my 16-year-old would look at me with an "Oh...Duh...."
"A movie compressed on a DVD has a specific number of bits when physically stored. When broadcast it doesn’t increase to the multiple of the exact number of listeners."
So, I'll have to go deeper to stay in this discussion.
Shannon stated in his original paper: "The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey. A device with two stable positions, such as a relay or a flip-flop circuit, can store one bit of information. N such devices can store N bits, since the total number of possible states is 2^N and log2 2^N = N. If the base 10 is used the units may be called decimal digits."
Note that with his relay switches he is considering 2 possible states of existence, off or on, information or no information. He also noted in that same writing:
"One feels, for example, that two punched cards should have twice the capacity of one for information storage, and two identical channels twice the capacity of one for transmitting information."
There are several types of information and I think we have more than one sliding down this thread.
Information can be as simple as a pebble lying beside the roadway. The existence and discovery of that pebble is information because it conveys meaning to the observer that something is lying by the roadway and that something is a pebble. When one observes this information, an information channel develops, that channel being photons reflecting into the person's eyes from the pebble and they consciously or subconsciously record the fact there is a pebble into their memory.
But unlike active (kinetic) information that flows such as a rumor on the web, signals through a telephone line or radio waves from the station to receivers, the information concerning the pebble is passive and just like with potential energy, this is potential information because other than one observer happening onto that pebble, that information cannot develop a channel.
But I can input energy and intelligence and develop a channel through which the information can flow. I will take the pebble and show it to 100 people. Then the next week, I will take the pebble and show it to 900 more. And just like Shannon's relay switches that either store information or don't, I take the information from 10^2 to 10^3 states, and bits increase from 6.64 bits to 9.97 bits.
Was that John's water that doubled? Had two pebbles existed, I could add energy and intelligence into the system, employ two students to take each pebble around and each show a pebble to the same number of people as I did, and we would end up with twice the amount of information. This is called information diffusion.
When Longfellow composed a poem, he had that information concentrated in his own mind initially and it got there through work in the form of chemicals flowing through neurons guided by intelligence. I can further state that the information was initially contained in one state of location (his mind). When Longfellow shared his new poem with his wife that information was then located in two states, his mind and his wife's.
His wife then told the poem to two of her friends and that information became contained in four states, and so it goes as that information is spread.
We can grasp two things happening here. 1) Information is diffusing out from a point where it was concentrated, similar to the tendency which we see with the nature of energy/matter. 2) We are also observing resting states increase as the information diffuses, from one state in Longfellow's mind to two as he shares the poem with his wife and to four as she shares it with two of her friends and possibly to thousands of minds when he publishes it. In ID we call these resting states microstates or macrostates, depending on which we wish to study.
Information concentration -----> diffusion is the very purpose of a newscast. The reporters go around the world to gather information, then concentrate it in the mind of the anchorman or on a teleprompter in front of him, and he then diffuses it around the world again. This information goes through more diffusion as people begin to talk about what was on the six o'clock news.
I see this as the actively flowing information as in Stephen's example of a movie airing to several people. Those people may then buy the DVD and diffuse it around to a few more states.
But to consider the type of information I used in calculating the probabilities of state of amino acids in proteins in that paper, we need look no further than a standard dictionary. Information: A numerical measure of the uncertainty of an experimental outcome.
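That uncertainty is calculated by Shannon's formula:
H = -Σ p_i log(p_i), where p_i is the probability of state i and the sum runs over all states.
But if all things are equiprobable then this formula can be reduced (and usually is) to H = log N where N is the total number of states.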
I then used a formula also from Shannon's paper referenced above to calculate the information in bits:
Bits= log2(N) = log10(N) / log10(2)
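The reduction from Shannon's entropy formula, H = -Σ p_i log2(p_i), to log2(N) for equiprobable states can be checked numerically. This is a minimal sketch; the choice of 20 states (as for 20 amino acids assumed equally likely) and the skewed comparison distribution are illustrative assumptions, not figures from the paper under discussion.

```python
import math

def shannon_entropy_bits(probs):
    """H = -sum(p * log2(p)) over outcomes with nonzero probability."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

n_states = 20  # assumption: 20 equally likely states

# Equiprobable case: every state has probability 1/N
uniform = [1 / n_states] * n_states
h_uniform = shannon_entropy_bits(uniform)

# The change-of-base shortcut quoted in the post
h_shortcut = math.log10(n_states) / math.log10(2)

# A hypothetical skewed distribution: one state takes half the probability
skewed = [0.5] + [0.5 / (n_states - 1)] * (n_states - 1)
h_skewed = shannon_entropy_bits(skewed)

print(f"uniform:  {h_uniform:.4f} bits")
print(f"shortcut: {h_shortcut:.4f} bits")
print(f"skewed:   {h_skewed:.4f} bits")
```

The uniform case and the shortcut agree, while any non-uniform distribution over the same states gives a smaller value, so log2(N) is an upper bound that holds exactly only when the states really are equiprobable.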
And so, gentlemen, I would think John has apples, Stephen has oranges and in the paper, I had pears.
|
Jerry D. Bauer
Member
Member # 756
|
posted 01. June 2005 19:38
The above post was rather generic and Stephen's post needs a bit more detail.
It's true that transmitting two dictionaries to one recipient transmits no more information if we consider the words of the dictionary to be all of the information inherent in the dictionary (I know of one MIT guy that claims most of the bits available in a new lap-top are found in the hardware--the housing and such), but why would we ever want to do this? It would normally be the other way around. Two dictionaries will go to two recipients which represents twice as much information. And in a library setting, one may go to 100s and 1000s of recipients.
And there is a code of H and Ts in quarters just as Shannon's relays considering off/on: quarters have two possible states of existence, H or T. So this may be where you are seeing a conflation.
If I'm going for a certain pattern, say to flip 500 coins and have them all come up heads, then H represents on in that system and T off. If I toss 500 coins wanting all heads and only get 300 H and 200 T, then I have 300 bits of information and 300 coins that "store" this information and 200 that do not.
As to specificity in a coin system, there is none in a handful of change randomly tossed on the dresser. But there is if I am going for a specified system. If I shoot for 25 heads in a 25 coin system and eventually make it, each coin plays a specific role in that system to sustain it. If I flip one coin back to tails, I no longer have that system so that coin is specified. In fact, that coin's specificity is 2^25 = 10^7.52.
You may have a different take on this and if so, I would be interested in you furthering it.
|
Stephen Wright
Member
Member # 195
|
posted 02. June 2005 15:18
Jerry,
There are a lot of interesting issues here. But I don't want to go all over the place. You wrote, "And there is a code of H and Ts in quarters just as Shannon's relays considering off/on: quarters have two possible states of existence, H or T. So this may be where you are seeing a conflation."
Yes. Information storage as a bit cell on a hard drive is easy to relate to as "information". But it is a physical reality, not virtual information that specifies a Boolean yes/no. So this is NOT Shannon Information as I understand it. If the bit cell is polarized in concert with an app, the binary pattern will then be specified as a virtual bit.
Same with the quarters. Yes, flipping the heads and tails gives us a storage mechanism with a capacity for being organized into a code. But to get to specified information the code must be in place within the context of sender and receiver (where subjective views appear to come into this) and related to the physical pattern. Further the quarters must be displayed in a pattern that maps to the code. Then - not the quarters - but the pattern can be virtual bits of Shannon Info.
Please correct me, if this is off, but the bit value you present is really a storage capacity estimate for a physical device and not a message content in a virtual form ready for transmission.
|
Jerry D. Bauer
Member
Member # 756
|
posted 02. June 2005 15:26
Just in case one may think we're sometimes talking to ourselves in here, I get an email this morning stating that people are using my statement to Mesk that nature can generate CSI via polyploidy as an admission that Dembski was wrong, and lo and behold, they are:
http://www.theologyweb.com/campus/showthread.php?t=54476&page=1&pp=16
So, in the words of the great American statesman, Barney Fife, let's nip this in the bud:
Note that the biologist never followed up on his question in order that I could clarify for him.
The intelligent intervention had already been made in that DNA before mutations began happening to it. You are aware that the code in each cellular genome contains more information than all 30 volumes of the Encyclopedia Britannica, are you not?
Well, then I have a question for you: Do you think that Windows XP could just "poof" out of a rock? Perhaps Bill Gates should fire all those programmers he spends billions of bucks on every year and just take a walk in the desert hoping that a cactus will spit out a new CD with Longhorn on it.
Quite ludicrous, isn't it? And it's just as ludicrous to think something in nature besides intelligence could program the uber-complicated preprogrammed code in DNA. Where there is preprogrammed code, there was a programmer and common sense ought to tell you that this programmer was not some dumb stalagmite. Stalagmites do very little programming to my knowledge.
So there lies your answer. A plant can produce CSI through polyploidy without the intervention of an intelligent agent. But only because an intelligent agent already programmed the code for this to happen.
Bill Gates: “Human DNA is like a computer program, but far, far more advanced than any software we've ever created.”
On to your other posts later.
|
Jerry D. Bauer
Member
Member # 756
|
posted 03. June 2005 00:10
Stephen:
I feel we can't limit ourselves to an exclusive paradigm here as science is just too broad and authoritative concepts like these I would think need be given flexibility to be used in many areas and disciplines.
Shannon borrowed the term bits from Tukey and defined the term as the way we mathematically measure something. As I previously quoted: "The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey." Why complicate that useful definition any further?
We can, of course, view bits as stored energy, flipped switches, or as symbols such as letters or flipped quarters. Shannon did all three and many people have since then. Observe how Shannon begins with letters and comes out in bits in the same paper:
As a simple example of some of these results consider a source which produces a sequence of letters chosen from among A, B, C, D with probabilities.....successive symbols being chosen independently. We have H = *math excluded* bits per symbol
And we (me?) are far from the only people who recognize this in infodynamics:
"A bit is a binary digit (0 or 1), or the amount of information needed to give an answer to a single yes-or-no question. The outcome of one coin toss could be reported in one bit (0 for heads, 1 for tails). The result of two coin tosses could be reported in two bits, which can represent four possible combinations (00, 01, 10, 11)."
http://www.cns.nyu.edu/csh04/Articles/ReinagelTutorial2000.pdf
|
JohnAaron
Member
Member # 1554
|
posted 04. June 2005 06:49
Jerry, you are right when you say we are all discussing slightly different concepts/types of information. As far as you go in your paper, I think you make fair and reasonable claims about quantifying information in that system in terms of bits. It looks good to me. However, in addition to describing biological systems in terms of bits (essentially a Shannonian pursuit), what do you make of other aspects of "information/language processing" that are apparent in biology? I am thinking of features of language such as syntax, semantics, pragmatics and apobetics. Through my studies, especially in immunology, I have been struck by how many aspects of biological systems display these same features. For example, one could argue that any given immunoglobulin molecule represents a particular semantic arrangement (e.g. bind to antigen X and activate the classical complement cascade) derived from the unique grouping of relevant syntactic components (e.g. particular variable light and heavy genes being combined to enable binding to antigen X, along with appropriate genes being used to specify a complement-fixing Fc region). The syntax of the immunoglobulin molecules themselves is derived from the statistical aspect of information (i.e. the 'bits' component of the amino acid sequence or even the nucleotide sequence of the relevant genes). As far as I can see, no one goes beyond the Shannon concept of 'statistics' when thinking of information in biological systems. It strikes me that there are other aspects of information-processing (language) that biological systems display ubiquitously. Any thoughts, anyone?