ISCID Forums



   
Author Topic: The CSI Bit String puzzle
andyg
Member
Member # 415

posted 12. September 2004 15:44
Here are four bit strings, where the bits are taken from an ASCII keyboard.

1.

atactgcgtcataagtaatctctattagacaaagatttcattacctgttggcatattgca
aaaataacaccaatacggaatcgtcatgttcacgattaaaacagatgatctcacccatcc
agcagtgcaagcattagtggcttaccatatttccggcatgctgcagcagtctccccctga
aagcagtcatgctttagacgtgcaaaaattacgtaacccgacagtgacattctggtcagt
atgggaaggcgaacaactcgcaggaattggtgcgctgaagttgctggatgataaacatgg
cgaactgaaatcaatgcggaccgcgccaaattatttacgtcggggtgtcgccagtctgat
tttacgccacattttgcaggtcgcccatgacagatgccttcatcgcctgagtttagaaac
gggtacacaggctggatttacggcctgccatcaactttatttgaagcatggtttcgttga
ttgcgaaccgtttgccgattatcaacttgatccacacagtcgatttttgtcattgacgct
atgcgaagataatgagttgctttgagccagacgcagcacattcttgcattcgacgtgctg
cgtctttatttatcaccaacaggaaacgccttgtccatagacgccccttccacatgcgtc
acaagaaacctctattccagtgacacaattacgcctaattaattacatataatatttaat
tatgaattcctcaccatctattacatgctttttaaccatatcggaatatttatcataatc
ggcgggattcataacaatatattttcgctgcgatatttcatagcgaatccctgtaagggt
ccatggcattaaaaatgcctctttaataggattacatttcatacaaagtaattttaaatt
gccaggtatcgcaggaataacctcaatcttattatattcaatatacgcttctttcaaatt
tttggggaaccatctaatttctttaatattattctcactacaatcaaaaaccttagcggt

2.
gctatvcggatcgatcrgachacgcgccgttatctagcgatcg
acgatcgagcgccsgatcggcgtcatactgactcagtmcac
tcgatcagtcagvaaaacagtgragacgtcgagtasgtkag
cmhgsacacgtgatcygctagdtcgatcatgcatgcatgcg
atagtatgcgatghcatcgatgtcatgtgyacgttcatgtcaatn
cggscgatcgahtccavgctagcttgcagcachgctcgtgat
artantgatatgctgatgascgatcagtgatgagtcahgtcac
agtgtactacgatcgatgctgagctcttctttctagcvgatctagg
ctac

3.

actgnatgragswrathtggathytnytnathgcnatggayga
ratgwsnaarathtgygcnaayacngaygarttyathaayga
rtgywsnathacncarwsngcngtngarcayathwsnytn
athttygargatcnyrt

4.

atgccacctttaacaactaaaataacaggaagcaacaattacttttccttaatatctcttaacatcaatggtctcaactc
gccaataaaaagacatagactaacaaactggctacacaaacaagacccaacattttgctgcttacaggaaactcatctca
gagaaaaagatagacactacctcagaatgaaaggctggaaaacaattttccaagcaaatggtgtgaagaaacaagcagga
gtagccatcctaatatctgataagattgacttccaacccaaagtcatcaaaaaagacaaggagggacacttcattctcat
caaaggtaaaatcctccaagaggaactctcaattctgaatatctatgctccaaatacaagagcagccacattcactaaag
aaactttagtaaagctcaaagcacacattgcgcctcacacaataatagtgggagacttcaacacaccactttcaccaatg
gacagatcatggaaacagaaactaaacagggacacacggaaactaacagaagtggaaaaacattatgaactaaccagtac
ccctgagctcttgactctagctgcatatgtatcaaaagatggcctagtcggccatcactggaaagagaggcccattggac
acgcagactttgtgtgccccggtacaggggaacgccagggccaaagggggggagtgggtgatagaattgaacaaaaccat
ccaagatctaaaacaataaagaaatcacaaagggagacaactctggagatagaaatcctaggaaagaaatcaggaaccat
agatgtgagcatcagcaacagaatacaagatatgcaagagagaatctcaggtgcagaagattccatagaaaacatggaca
caacaatcaaagaaaatgcaaaatgcaaaaagatcctaactccaaacatccagaaaatccaggacacaatggtaagacca
aacctaaggataataggtatagatgagaatgaagattttcaacttaaagagccaataaatatcttcaaccaagttctaga
agaaatcttccctaaccaaaagaaagagatgcccatgaat

Two questions:

1. How, in principle, would one go about determining whether any of the strings contain complex specified information?

2. For the more ambitious - do any of the strings contain CSI? Do some contain more CSI than others?

[ 12. September 2004, 15:45: Message edited by: andyg ]

Scott
Member
Member # 1222

posted 12. September 2004 16:50
quote:
1. How, in principle, would one go about determining whether any of the strings contain complex specified information?

1. The number of possible symbols at each position in the string.

2. The length of the string.

Micah Sparacio
Member
Member # 6

posted 12. September 2004 21:44
The following tool might be worth using to standardize the bit strings:

http://www.roubaixinteractive.com/PlayGround/Binary_Conversion/Binary_To_Text.asp
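As a rough illustration of what such a conversion does (this is not the linked tool's code, and the function names are hypothetical), a character string can be mapped to 8 bits per character and back, assuming plain ASCII input:

```python
def text_to_bits(text: str) -> str:
    """Encode ASCII text as a string of '0'/'1', 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def bits_to_text(bits: str) -> str:
    """Decode a string of '0'/'1' (length a multiple of 8) back to ASCII text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    values = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return bytes(values).decode("ascii")


if __name__ == "__main__":
    encoded = text_to_bits("atac")
    print(encoded)
    print(bits_to_text(encoded))
```

The 8-bit width is a choice made here for illustration; a 7-bit encoding would serve equally well for these characters.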

Salvador T. Cordova
Member
Member # 959

posted 12. September 2004 23:09
quote:

Two questions:

1. How, in principle, would one go about determining whether any of the strings contain complex specified information?

2. For the more ambitious - do any of the strings contain CSI? Do some contain more CSI than others?


The default answer is inconclusive.

If we find two species of independent lineage having these sequences we presume genetic engineering (common design) or lateral gene transfer or something else.

Ironically, the whistle was blown on Monsanto because some of their genetically engineered food slipped through European import controls. Monsanto could have argued the design inference is unreliable therefore the DNA sequences can't be attributed to them, but no one will buy that explanation when real money is involved. I'm sure one could ping all the detractors of genetic engineering to find equally compelling examples.

So in answer to your question, "inconclusive" is the default answer. If the string matches a novel gene conjured up by another scientist and that gene has fairly unique features, it is a strong candidate for CSI.

As far as already living creatures go, that's where all the interesting debate is right now, but it seems to me that few really understand what has been laid out in ID literature regarding CSI. That's ok, that's what these boards are for. Perhaps together we can iron out any kinks.

Do strings 1 and 4 correspond to any known DNA sequence?

[ 12. September 2004, 23:18: Message edited by: Salvador T. Cordova ]

andyg
Member
Member # 415

posted 13. September 2004 22:33
Scott wrote:

quote:
1. The number of possible symbols at each position in the string.
2. The length of the string.

In this case, where I specified that the symbols come from an ASCII keyboard, (1) is certainly helpful. But such an approach is not useful in the general case of detecting CSI, as the investigator would not know the number of possible symbols/components in the message/system.

As far as point 2 goes, I'm not sure. Bill Dembski has talked about 1 in 10 to the 150th power as a probability bound, but on other occasions (IIRC) has said that a telephone number or ATM number can exhibit CSI. Perhaps Bill would like to comment?

Salvador's answer presupposes that some of my strings represent nucleotide sequences. That may or may not be the case, but to assume this again relies on background knowledge of the sort that is not (I think) appropriate to determining CSI in general.

Perhaps I should have been more clear at the outset: the four strings should be treated without any presuppositions, as if the strings had been received from outer space.

Scott
Member
Member # 1222

posted 14. September 2004 03:51
quote:
Perhaps I should have been more clear at the outset - the four strings should be treated without any presuppositions, as if the strings had been received from outer space.

Now you're just being inconsistent.

quote:
I specified that the symbols come from an ASCII keyboard.

Yes, you did. And now you're saying we should ignore that. So which is it?

If we are to treat the strings as if they have been received from outer space, then are you sure that ASCII characters are appropriate?

Did we receive ASCII characters, or did we receive some other string that we then translated into ASCII characters?

This brings up an interesting question: what is the probability that a received bit string, when translated into ASCII characters, would consist only of the characters shown in the OP?
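One hedged way to put a number on that question is to assume every received byte is an independent, uniformly random 8-bit value. Nothing in the thread establishes that assumption, and the function name is hypothetical. Under it, the chance that all bytes fall inside a small allowed set is (allowed / 256) raised to the string length, which is easiest to handle as a base-2 logarithm:

```python
import math


def log2_prob_all_in_subset(length: int, subset_size: int,
                            alphabet_size: int = 256) -> float:
    """Return log2 of the chance that `length` independent, uniformly random
    symbols all land inside a subset of `subset_size` allowed symbols."""
    if not 0 < subset_size <= alphabet_size:
        raise ValueError("subset_size must be between 1 and alphabet_size")
    return length * math.log2(subset_size / alphabet_size)


if __name__ == "__main__":
    # Hypothetical case: 1020 bytes that all decode to one of a, c, g, t.
    print(log2_prob_all_in_subset(1020, 4))  # -6120.0, i.e. a probability of 2**-6120
```

A different model of the source (for example, one that only ever emits four symbols) would give a completely different figure, which is why the assumptions have to be stated first.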

Revisiting the original question:

So do you agree that the exercise is to calculate probabilities?

And that in order to do so we need to know, or make some assumptions about, the number of symbols being employed? (If we can represent the result as 'bits' we can reduce this to two.)

And that we also need to know, or make some assumption about, whether each symbol is equiprobable at each location in the string?

And lastly, we need to know the length of the string.
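The three ingredients listed above (alphabet size, equiprobability, string length) can be turned into a bit count. The sketch below is only an illustration with hypothetical function names: `uniform_bits` assumes every symbol is equiprobable and independent, while `empirical_bits` drops the equiprobability assumption and uses the symbol frequencies observed in the string itself.

```python
import math
from collections import Counter


def uniform_bits(length: int, alphabet_size: int) -> float:
    """Bits needed to pick one string of `length` symbols, all equiprobable."""
    return length * math.log2(alphabet_size)


def empirical_bits(s: str) -> float:
    """Total surprisal of `s` in bits, using its own symbol frequencies
    as the (assumed independent) per-symbol probabilities."""
    n = len(s)
    counts = Counter(s)
    return -sum(c * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    sample = "atactgcgtcataagtaatc"  # first 20 characters of string 1
    print(uniform_bits(len(sample), 4))
    print(empirical_bits(sample))
```

Neither number says anything about specification; both only quantify improbability under the stated model.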

I should have known we were getting off to a poor start when you called these character strings "bit strings."

[Roll Eyes]

Serious discussion is invited, encouraged, even. Please revisit your OP and revise it accordingly.

thanks

andyg
Member
Member # 415

posted 14. September 2004 22:11
Scott writ:

quote:
So do you agree that the exercise is to calculate probabilities?

I'm not sure. As I wrote above, Dembski has variously said that CSI is present at a probability bound of 1 in 10 to the 150th power or less, but has also said that phone numbers contain CSI.

David L. Hagen
Member
Member # 323

posted 18. September 2004 00:32
Clues:
Strings 1 and 4 contain only the four characters agct so could nominally be DNA strings.
Using the online DNA decoding tool at: http://www.geneseo.edu/~eshamb/php/dna.php
(Can someone verify the following interpretation?)

String 1 nominally decodes to 340 amino acid codons,
BUT it does not begin with a start codon (ATG),
NOR does it end with a stop codon (TAA, TAG or TGA).
String 4 nominally decodes to 360 amino acid codons.
It does begin with the start codon ATG,
BUT it ends with AAT rather than a stop codon such as TAA.
String 2 has characters other than agct, so it is at least “noisy”.
The remaining characters nominally code to 106 amino acid codons with two characters left over.
String 3 also has characters other than agct, so it may also be “noisy”.
It nominally decodes to 28 amino acid codons with 2 characters left over.
(BioJava.org also appears to have DNA conversion routines.)
Possible further tests:
Compare the strings against public genome data bases to see if they match to known genomic strings. E.g. do they match gene coding or non-coding regions?
What happens if these strings are reversed?
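The checks described in this post can be sketched in a few lines. This is not the code behind the linked decoding tool; the function names are hypothetical, and dropping every character outside a, c, g, t before counting codons is an assumption about how the "remaining characters" were handled.

```python
STOP_CODONS = {"taa", "tag", "tga"}
COMPLEMENT = str.maketrans("acgt", "tgca")


def codon_summary(seq: str) -> dict:
    """Summarize a candidate DNA string read in frame from its first base."""
    seq = "".join(seq.split()).lower()            # drop line breaks and spaces
    usable = "".join(ch for ch in seq if ch in "acgt")
    whole = len(usable) - len(usable) % 3         # length covered by full codons
    codons = [usable[i:i + 3] for i in range(0, whole, 3)]
    return {
        "non_acgt": len(seq) - len(usable),
        "codons": len(codons),
        "leftover": len(usable) % 3,
        "starts_with_atg": bool(codons) and codons[0] == "atg",
        "ends_with_stop": bool(codons) and codons[-1] in STOP_CODONS,
    }


def reverse_complement(seq: str) -> str:
    """Reverse an a/c/g/t string and swap each base for its complement."""
    return seq.lower().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(codon_summary("atgccacctttaacaactaaaataa"))
    print(reverse_complement("atgc"))
```

Running `codon_summary` on each string and on its `reverse_complement` would cover both the forward reading and the "what if reversed" question for one reading frame.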

Scott
Member
Member # 1222

posted 18. September 2004 14:25
First andyg needs to tell us if we received bit strings, or ascii characters. He provides ascii strings, but then calls them bit strings, and then later says to pretend they arrived from outer space, but fails to specify in what format they arrived. I'm still waiting for him to modify the OP to clarify just what it is he is seeking an answer to, and just what the supposed puzzle is.

There are, of course, several levels of analysis one could use to determine CSI. If these were received as bit strings, I suppose there would be no surprise that they could be converted into ascii characters, but what are the probabilities of getting just these ascii characters?

Then, as you point out, if we notice a pattern indicative of DNA or RNA, we could do the sort of analysis you are engaged in.

quote:
What happens if these strings are reversed?

Good question. We could also convert to mRNA code.

Also, could we use Perl or something similarly capable to run regular expression searches for start codons and stop codons, adjusting for any possible reading frames?
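A minimal sketch of such a regular expression search, written in Python rather than Perl and using hypothetical names. It looks on the given strand only, starts at every ATG, and reads whole codons until the first in-frame stop codon; any character outside a, c, g, t ends the candidate.

```python
import re

# ATG, then as few whole codons as possible, then the first in-frame stop.
ORF_PATTERN = re.compile(r"atg(?:[acgt]{3})*?(?:taa|tag|tga)")


def find_orfs(seq: str) -> list:
    """Return (frame, start, end) for each ATG followed by an in-frame stop.

    `frame` is start % 3; `end` is the index just past the stop codon.
    """
    seq = "".join(seq.split()).lower()
    orfs = []
    for start in re.finditer(r"(?=atg)", seq):    # every ATG, overlaps included
        hit = ORF_PATTERN.match(seq, start.start())
        if hit:
            orfs.append((start.start() % 3, hit.start(), hit.end()))
    return orfs


if __name__ == "__main__":
    for frame, begin, end in find_orfs("ccatgaaatgattaa"):
        print(frame, begin, end)
```

Searching the opposite strand would mean running the same function on the reverse complement of the string.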

Questions for you math whizzes. Given only four characters, how long would a string of those four characters need to be (minimum length) to qualify as a specification?

Is the minimum information content supposed to be at least 500 bits? We could ask then how long a binary string would need to be and do the conversion?

Given an 8-bit character set (256 characters; plain ASCII defines only 128), how long would a string need to be to qualify as a specification?
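If the 500-bit figure is taken as the threshold and every symbol is assumed equiprobable and independent (both are assumptions for the sake of the arithmetic), the minimum length is 500 divided by the bits per symbol, rounded up. For scale, 2 to the 500th power is roughly 3 × 10 to the 150th. The helper name below is hypothetical.

```python
import math


def min_length_for_bits(alphabet_size: int, target_bits: float = 500.0) -> int:
    """Smallest string length whose uniform information content
    reaches `target_bits`, given `alphabet_size` equiprobable symbols."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    return math.ceil(target_bits / math.log2(alphabet_size))


if __name__ == "__main__":
    for size in (2, 4, 128, 256):
        print(size, min_length_for_bits(size))
    # 2 -> 500, 4 -> 250, 128 -> 72, 256 -> 63
```

This covers only the improbability side of the question; whether a string of that length also counts as a specification is a separate matter that the arithmetic does not settle.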

If we cannot use any of these strings as a specification, the whole exercise may be moot.

If I weren't busy moving, I might come up with a few procedures for manipulating these strings. One simple, and useful one, might be just to count the characters. I'm sure not going to spend the time to do it manually [Wink] .
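Counting the characters is a short job with Python's standard library. The sketch below uses a hypothetical helper name and ignores whitespace and letter case. Several of the extra letters in strings 2 and 3 (n, r, y, s, w, h and others) coincide with the IUPAC nucleotide ambiguity codes, though nothing in the thread confirms that this is what they mean.

```python
from collections import Counter


def char_counts(text: str) -> Counter:
    """Count each character, ignoring whitespace and letter case."""
    return Counter("".join(text.split()).lower())


if __name__ == "__main__":
    first_line_of_string_3 = "actgnatgragswrathtggathytnytnathgcnatggayga"
    counts = char_counts(first_line_of_string_3)
    for symbol, count in sorted(counts.items()):
        print(symbol, count)
    # Symbols outside a, c, g, t are the ones described as "noisy" in the thread.
    print(sorted(set(counts) - set("acgt")))
```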

andyg
Member
Member # 415

posted 20. September 2004 19:58
quote:
I'm still waiting for him to modify the OP to clarify just what it is he is seeking an answer to, and just what the supposed puzzle is.

The puzzle is to say which of the strings contain CSI. If you can't do that, tell me how you would do it in principle.

Scott
Member
Member # 1222

posted 20. September 2004 22:18
quote:
The puzzle is to say which of the strings contain CSI. If you can't do that, tell me how you would do it in principle.

In order for someone to tell which of the strings contain CSI, you need to resolve your prior inconsistent statements about those strings.

As for detecting CSI in principle, that's been covered.

michaelgoodrich
Member
Member # 393

posted 28. September 2004 11:22
Andy G. writes:

quote:
Here are four bit strings, where the bits are taken from an ASCII keyboard.

Methinks more context is needed; e.g., what is the specification of the probabilistic context of the raw data?

Jerry D. Bauer
Member
Member # 756

posted 06. November 2004 00:24
Sorry, I don't see any specified information anywhere in there. Although some strings are more or less complex when compared to the others when utilizing Berlyne's complexity as in 'a pattern can be considered more complex the larger the number of independently selected elements it contains,' this seems about as far as we can take it.

If information is to be considered specified then each piece of information must play a specific role in working with the others to produce a primary function in the whole. Now, one could assume at first examination that these are nucleotides coding for something in a living organism, and if this were true each group of nucleotides might serve in a specified role, in that if I were to remove any one group of them the overall function might cease or change.

But if they came from outer space then the odds this is pre-programmed code for earth life might seem pretty remote. Therefore, I must conclude that this is complex information, but not complex specified information.



All times are East Coast  

All content © ISCID and content contributor 2001-2003
