ISCID Forums


Author Topic: algorithmic info, probability, etc.
Groucho
Member
Member # 605

posted 18. December 2002 11:05
quote:

The problem is this is not a valid Turing machine, according to the theories presented. The reason is there is no “random” function in a TM!

If there's no "random" functions in a TM then there's no "random" functions in C. Anything that's deterministic is by definition pseudo-random.

quote:
Since the universe is essentially random, and a TM according to these theories are essentially deterministic, we have an essential conflict. Pseudo-random generators are not “random” in the Quantum Mechanics sense, since they are essentially predictable and repeatable if one knows the input state. The same TM with the same input tape always creates the same output tape (presuming it halts).

I somewhat assumed to begin with that the input mutation sequence m was being generated by quantum mechanics or the like. To assume that input m is completely nondeterministic does not in any way undermine my model. I don't care what's generating m. Where do my arguments in any way depend on m being generated by a deterministic process??? I do assume that e (i.e. the laws comprising natural selection) be conceived of as deterministic. I don't understand your point.

I'll give you this: In my initial post, I proposed a rationale for computing probability of a string X based not on its length, but rather the length of the shortest program that would generate it - H(X). I don't assume however, that m is necessarily being generated by a deterministic process. Are you proposing the length of a string be used to determine its probability?

If you want, you can assume that instead of the "real" universe, my arguments apply only to the most accurate simulation of the universe imaginable, running on a deterministic computer. If you think they're valid in that context, and yet want to dismiss them as irrelevant to discussions of the "real" universe, then I guess that's your prerogative.

(I'm still working on a reply to your previous post.)

[ 18. December 2002, 11:32: Message edited by: Groucho ]

gedanken
Member
Member # 594

posted 18. December 2002 11:16
Hi Groucho,

Please don’t take that (second) post as implying a criticism of your specific methodology. I was speaking in this case of the more general way that ID proponents make use of the K.. complexity variants, and problems therewith.

I’m still trying to understand the implications of your presentation. In fact we might just possibly reach the same conclusion, when combining what you are presenting and the point about probability. I certainly myself believe that random events add information, as it would be measured by K.. complexity variants.

So what I gave is intended to be an issue to be dealt with in discussing complexity as measured by these methods. As the theme develops I may be able to relate my posts to the theme.

Iain Strachan
Member
Member # 96

posted 19. December 2002 08:15
Gedanken,

I'd like to address your objections to Algorithmic Information Theory concerning the choice of language. While this is possibly a difficult issue to resolve because the length of a program (and hence the number of bits of "Information") will vary according to the programming language, I think that an equivalent problem arises in conventional information theory, and indeed in the Bayesian approach to statistical inference.

Consider

Keeping Neural Networks Simple by Minimizing the Description Length of the Weights

Hinton and Van Camp (1993) Proc 6th ACM Conference on Computational Learning Theory.

In this paper, H & VC apply the minimum description length principle to the critical problem of optimizing the complexity of a neural network model in order to provide the best generalization performance. The description length (the analogue of the Chaitin information measure as the length of the program) is the expected number of bits that you have to transmit over a channel from sender to receiver.

In such a process, a key step is that the sender and receiver have to agree on a probability distribution with which to encode the data. Roughly speaking, the number of bits one has to transmit for a value of probability p is about -log2(p): larger for low probability and smaller for high probability. For the best coding, one should choose the true probability distribution (which would be the Bayesian posterior distribution of the weights of the neural network learned in the training process), but in principle one can arbitrarily choose any distribution.

It is clear that an inappropriate choice of distribution (i.e. one that assigns low probability to values that are most probable), will lead to a greater amount of information that you need to transmit.

But this does not imply that there is a major problem with the MDL principle. Given an agreed probability distribution, for example for the model residuals (and all the hidden assumptions that implies), one can prove useful results. The paper shows that, under a Gaussian noise model, minimizing the description length corresponds to choosing a least-squares cost function in the non-linear regression. Similarly, choosing a Gaussian for encoding the weights (learned parameters) and minimizing the description length is shown to be equivalent to a simple form of regularization (weight decay), which can be shown empirically to improve generalization performance.
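The correspondence described above can be written out. This is only a sketch under assumed notation (not necessarily the paper's): targets d_i, network outputs y_i(w), a fixed noise standard deviation sigma, and a zero-mean Gaussian of standard deviation sigma_w for coding each of the M weights. Code lengths are in nats and ignore the constants that come from finite transmission precision.

```latex
\begin{align*}
-\ln p(d \mid w) &= \sum_{i=1}^{N} \frac{\bigl(d_i - y_i(w)\bigr)^2}{2\sigma^2}
                    + N \ln\bigl(\sigma\sqrt{2\pi}\bigr), \\
-\ln p(w)        &= \sum_{j=1}^{M} \frac{w_j^2}{2\sigma_w^2}
                    + M \ln\bigl(\sigma_w\sqrt{2\pi}\bigr).
\end{align*}
```

With sigma and sigma_w held fixed, minimizing the first line over w is ordinary least squares, and adding the second line contributes a penalty proportional to the sum of squared weights, which is the weight-decay term.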

Now of course one could choose a wholly inappropriate probability distribution, in which case minimizing the description length would give extremely bad results. IMO this is pretty much the same as choosing an inappropriate programming language to solve a particular problem (e.g. using COBOL to solve a set of partial differential equations).

In the same way (and I believe this is just another way of looking at the problem), within a Bayesian framework of machine learning, where learning from a data set allows one to deduce a posterior distribution for the model parameters, an inappropriate choice of prior will also lead to bad results.

Essentially, I see all of these as different ways of looking at the same problem; useful results can be derived if one takes the same programming language, the same probability distribution to encode the transmission of information, or the same prior.

It is always of course possible to choose a pathological way of coding; just as one could have a computer language that put out the works of Shakespeare with one machine instruction, one could assign the 1-bit code "0" to mean the number pi to a billion digits. But neither model would be of any practical use. The Shakespeare computer will only have an advantage if you want to regularly print out the works of Shakespeare; it will be no use if you want to print out the Authorized Version of the Bible. Likewise, the "pi" code offers no advantage if one wanted to transmit the weight parameters of a neural net to 32-bit precision, unless one of them just happened to be pi.
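To make the cost of a badly chosen coding distribution concrete, here is a minimal sketch. The four-symbol source and both distributions are invented for illustration; the function simply computes the expected number of bits per symbol when data drawn from one distribution is encoded with an ideal code built for another.

```python
import math

def expected_code_length(true_dist, coding_dist):
    """Expected bits per symbol when symbols drawn from true_dist are encoded
    with an ideal code designed for coding_dist (i.e. the cross-entropy)."""
    return sum(p * -math.log2(coding_dist[s])
               for s, p in true_dist.items() if p > 0)

# Hypothetical source: the probabilities are illustrative only.
true_dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
# A mismatched code that gives the rarest symbols the shortest codewords.
bad_dist = {"a": 0.125, "b": 0.125, "c": 0.25, "d": 0.5}

matched = expected_code_length(true_dist, true_dist)
mismatched = expected_code_length(true_dist, bad_dist)
print(matched, mismatched)
```

Sender and receiver can still communicate with the mismatched code; they just pay more bits per symbol, which is the sense in which the choice of distribution (or language, or prior) matters without undermining the principle.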

Groucho,

I'll try to respond to your points in another post if I get time; but things are busy right now.

Iain.

Groucho
Member
Member # 605

posted 19. December 2002 21:13
gedanken:

I probably should not have conveyed so much irritation at you merely because you forced me to defend my views.

Incidentally, let me emphasize that I'm not entirely sure all these views even originated with me. It's a situation where you're thinking along a certain line which guides your research. You read a lot of things, some of which you understand only marginally at the time. Then a few years later, lo and behold, you have a brilliant insight. But instead of brilliance, is it you just finally understanding something you no longer even specifically remember reading? Anyway, maybe my arguments will have more weight if you think there's a chance they didn't originate with me. (Or rather, are the ideas I have presented pretty commonplace, and you all have in fact just been humoring me?)

I started to do a recap here, but then I realized I was repeating things I'd already repeated several times. That doesn't mean what I've written is clear. If you want me to go over anything again I will. (It would be gratifying to know what I've written has at least been understood.)

As far as the language issue, I think I'll let you and Iain debate that, as for reasons I won't repeat, it's not of relevance to me.

Regards

gedanken
Member
Member # 594

posted 19. December 2002 22:29
Groucho, I am coming to agree with your point that my points about choice of TM are “not of relevance” to the mathematical results you are proposing. The reason (if I understand what you are proposing) is that it absolutely does not matter which particular TM you choose to make your measurement, if your points are correct then they hold for that particular TM as well. (And you don’t need your assertion that you are using a “simple” TM for purposes of the measurement!) Therefore if one found a TM that met conditions I would require as making a valid model, then that TM could still be used for your analysis without other change. I’m not saying that your overall method of modeling would be a good model of biological reality without finding and justifying such a TM as relevant. (And I furthermore assert that there is no such model that will satisfy my conditions, and therefore there can be no such meaningful modeling of probability of events based on K.. type measures -- but read on…)

But I think you are not concentrating on modeling of biological reality at the moment. I think you are concentrating on relationships and implications within the Chaitin measure, based upon assumptions that Dembski has made. As such I am interested in seeing where this leads.

Now I’m not sure that an expanded discussion from those results would not have relevance to my point on choice of TM, for my argument is with the “assumptions that Dembski has made.”

Also Iain’s last post is in agreement for the most part with what I would think, and to the extent I don’t agree he has not provided any argument against my point on choice of TM. (Go back and read my series of posts.) So at this point I would prefer to turn to other issues until it becomes relevant.

--

As to Groucho’s original thesis:

Groucho, you addressed your original post as relating to Dembski’s use of the Chaitin or similar measure, I would assume from some introductory remarks. I think that Iain made an essential observation that Dembski would call this a measure of “complexity”, and in Dembski’s terminology this would not be a measure of “information”. So to translate to the vocabulary of Dembski, substitute the word “complexity” at each place where Groucho used “information”. (Iain posed additional constraints and methods that Dembski would apply when attempting to measure “information”.)

I am interested in seeing where this leads. For example I could see that the relations still hold when applying the method Iain suggested to produce a measure of “information” as Dembski defines it. (But it would be more complex than I would wish to try to write myself at the moment, I am still studying this. I am still trying to understand the terms precisely as well.)

--

Brief PS:

“If there's no "random" functions in a TM then there's no "random" functions in C. Anything that's deterministic is by definition pseudo-random.”

But “C” is not necessarily a mathematically defined computer language in the sense of being equivalent to a TM. Real programs run on real machines are not necessarily deterministic, as are their mathematically defined counterparts. Specifically some machines and versions of C allow for a random seed to be established by way of the time of day the job was started, for example. That is not “deterministic” information purely from the program input. (Now if you add that TOD input as an additional input, then you get back to a reasonably deterministic result -- but that was not in the “C” job program + data submission by the programmer, which is the view I am taking. But of course the subsequent calls to rnd() would be deterministically dependent on the initial seed which was truly ‘random’ -- but in principle the computer could have a rnd() generator based on a quantum mechanical noise source and which is therefore algorithmically equivalent to a true random generator and not pseudo-random.)

Pseudo-random results are highly compressible (once you actually decode the generator function), while truly random results are highly incompressible. However they can be indistinguishable from the standpoint of the object being simulated by the particular program. I would claim this is a case where the choice of TM issue can become relevant.
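A small sketch of that contrast, assuming Python's built-in generator (a Mersenne Twister) as the pseudo-random source and zlib as a stand-in for "a compressor that has not decoded the generator function". The seed and length are arbitrary.

```python
import random
import zlib

def prng_bytes(seed, n):
    """Return n bytes from Python's Mersenne Twister started at `seed`."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))

N = 100_000
data = prng_bytes(2002, N)

# A general-purpose compressor finds no pattern in the stream...
compressed = zlib.compress(data, 9)
print(len(data), len(compressed))

# ...yet the pair (seed, N) together with the short generator above
# reproduces the stream exactly, so a very short description exists.
regenerated = prng_bytes(2002, N)
print(regenerated == data)
```

The stream looks incompressible to the compressor, but anyone who knows the generator can describe it with a seed and a length. A stream drawn from a physical noise source would give the same zlib result with no short description behind it.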

[ 20. December 2002, 00:15: Message edited by: gedanken ]

Groucho
Member
Member # 605

posted 20. December 2002 01:02
quote:
Groucho, you addressed your original post as relating to Dembski’s use of the Chaitin or similar measure, I would assume from some introductory remarks. I think that Iain made an essential observation that Dembski would call this a measure of “complexity”, and in Dembski’s terminology this would not be a measure of “information”. So to translate to the vocabulary of Dembski, substitute the word “complexity” at each place where Groucho used “information”. (Iain posed additional constraints and methods that Dembski would apply when attempting to measure “information”.)

I am interested in seeing where this leads. For example I could see that the relations still hold when applying the method Iain suggested to produce a measure of “information” as Dembski defines it. (But it would be more complex than I would wish to try to write myself at the moment, I am still studying this. I am still trying to understand the terms precisely as well.)

(Incidentally where are these quotes from Iain you're referring to???)

In reading Dembski's work, anytime he starts making assertions about mechanisms not being able to design, or functions not producing information, it's hard for me to stick with anything else he has to say. Certainly, programs, robots, etc. can be incredibly complex, and exhibit incredible ingenuity (although admittedly faltering at tasks which are trivial to humans). To me, a program is just a description, and saying that a mechanism can't design things is like saying something that can be specifically described can't design things. IOW, utterly preposterous. Anytime a mechanism is searching a solution space (e.g. a random environment) to find a solution that optimizes an evaluation function, "design" is taking place, AFAIC.

According to Dembski, design is a new category of "contingency", in addition to chance, and distinguished from either chance or mechanism. Design is evident when there is "complex specified information" (CSI) in evidence in a bit stream X, and in a number of bits that exceeds the "universal probability bound." CSI is evident if X has a "detachable" pattern, where "detachable" has a highly specific meaning, and does not include the pattern of being X itself.

I find the above approach esoteric, byzantine, non-intuitive, and utterly irrelevant to my analysis. There is a certain elegance and simplicity I think you strive for when searching for the truth. You may not arrive at the simplicity you want. But there's a certain apparent rightness to your ultimate results which make you say "Yes! Of Course!" and furthermore which is readily apparent to others as well (including non-specialists). Dembski's approach is just too convoluted.

As far as me using the term "information", I think I only used it in the phrase "algorithmic information", which H is a measure of, i.e. the size of the smallest program to generate a certain string.

I'd be curious if you could elaborate on exactly how the analysis I've presented (i.e. regarding algorithmic info, probability, etc.) coincides with Dembski's work. (I haven't read everything he's written - if I'm just rehashing things he's said, I'd be pretty embarrassed.)

quote:
But “C” is not necessarily a mathematically defined computer language in the sense of being equivalent to a TM. Real programs run on real machines are not necessarily deterministic, as are their mathematically defined counterparts. Specifically some machines and versions of C allow for a random seed to be established by way of the time of day the job was started, for example. That is not “deterministic” information purely from the program input. (Now if you add that TOD input as an additional input, then you get back to a reasonably deterministic result -- but that was not in the “C” job program + data submission by the programmer, which is the view I am taking. But of course the subsequent calls to rnd() would be deterministically dependent on the initial seed which was truly ‘random’ -- but in principle the computer could have a rnd() generator based on a quantum mechanical noise source and which is therefore algorithmically equivalent to a true random generator and not pseudo-random.)

I must tell you that your understanding of the term "deterministic" is idiosyncratic. It has nothing to do with inflexibility or restrictions regarding the specification of input. We could write a C compiler using a TM. It's too late for me to elaborate more than that this evening.

FTR, I will go back through this thread (ultimately), and see if I can derive a summary post that will clear up any ambiguities in my analysis, etc.

(Also, where are those Iaian quotes??? I'll edit this message if I find what you're talking about.)

[ 20. December 2002, 01:04: Message edited by: Groucho ]

Iain Strachan
Member
Member # 96

posted 20. December 2002 09:41
Groucho:

quote:

(Also, where are those Iaian quotes??? I'll edit this message if I find what you're talking about.)


I think some wires got crossed there. I don't recall mentioning "complexity" here. The only point where I referred directly to Dembski-an terms (and it's possible I hadn't quite got the right end of the stick) was specificity .

This was in response to your ideas concerning a sequence m plus a natural selection algorithm e producing a string b. You then compute the probability of m and e arising simultaneously and this being less than the probability of b arising by chance. However, the point was that the sequence m does not have to be precisely specified in order for e to produce b. I can run a genetic algorithm, for example, hundreds of times over with a different sequence m_i where i indexes the mutation sequence, and it will succeed in producing b every single time. It may take a different length of time for each time the algorithm runs, but it will always get there (assuming it's a problem soluble by a genetic algorithm).
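The claim about different mutation sequences can be tried on a toy problem. The sketch below is an assumption-laden stand-in, not a model of biology: the target b is an arbitrary 32-bit string, the "selection" rule compares directly against that target, and each seed plays the role of one mutation sequence m_i.

```python
import random

def evolve(target, seed, max_steps=100_000):
    """Toy stand-in for e: single-bit mutations drawn from a seeded stream
    (the role of m_i) are kept only if they do not reduce the match with
    target. Returns the number of steps used, or None if the budget runs out."""
    rng = random.Random(seed)
    current = [0] * len(target)
    score = sum(c == t for c, t in zip(current, target))
    for step in range(max_steps):
        if score == len(target):
            return step
        i = rng.randrange(len(target))
        flipped = 1 - current[i]
        new_score = score + (1 if flipped == target[i] else -1)
        if new_score >= score:          # selection: never accept a worse string
            current[i] = flipped
            score = new_score
    return max_steps if score == len(target) else None

# Hypothetical target string b; any fixed bit pattern would do.
target = [int(c) for c in "1011001110001111" * 2]

steps = [evolve(target, seed) for seed in range(200)]
print(all(s is not None for s in steps), min(steps), max(steps))
```

Every seed supplies a different mutation stream and the runs finish after different numbers of steps, which is the behaviour described above. Whether the set of "suitable" sequences should be summed over when computing a probability is the point under dispute in this thread, and the sketch does not settle it.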

I acknowledge that the point about Mersenne Twister may have been a red-herring. The point I was trying to make is that it's a deep philosophical problem as to just what is "random". True randomness is something that we can't detect. The only thing we can detect is a pattern, indicating lack of randomness. The Chaitin paper referenced by Gedanken earlier ends with a statement to the effect that we can't decide whether the universe is truly random or pseudo-random. IOW, the belief that the collapse of the wave function is a truly "random" process is just that, a belief. Our observations tell us that it looks random, but then our observations of a given sequence of 32-bit numbers that come out of the MT algorithm also appear random, and pass all known statistical tests for randomness.

Another point here that is possibly related is that the Kolmogorov complexity is something that cannot precisely be calculated, because of the halting problem. We may have a computer program that goes into an infinite loop, or it may produce the specified string b after 10^30 years of computation. There is no automated way of telling. One can therefore only calculate an upper bound on the complexity (by finding a simple program that produces the desired sequence). It is not in general possible to prove that one has found the shortest program. That is why Schmidhuber (in the paper on Neural Nets I cited above) introduces a time limit as well, where log(time) contributes to the complexity.
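As a rough illustration of how a time term changes the ranking of candidate programs, here is a sketch of a Levin-style cost, assuming the penalty is simply log2 of the step count added to the program length in bits. That is one reading of the description above; the cited paper may define the measure differently, and both candidate programs are hypothetical.

```python
import math

def time_bounded_cost(program_bits, steps):
    """Program length in bits plus log2 of the running time in steps."""
    return program_bits + math.log2(steps)

# Two hypothetical programs that both print the same string b.
short_but_slow = time_bounded_cost(program_bits=100, steps=2 ** 40)
long_but_fast = time_bounded_cost(program_bits=120, steps=2 ** 10)

print(short_but_slow, long_but_fast)
```

Under plain program length the 100-bit program wins; once running time is charged for, the 120-bit program is cheaper. Because every candidate is run only up to a finite budget, such a measure can be searched without having to decide whether a program halts.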

Gedanken wrote:
quote:

So to translate to the vocabulary of Dembski, substitute the word “complexity” at each place where Groucho used “information”. (Iain posed additional constraints and methods that Dembski would apply when attempting to measure “information”.)

Aha! Think I know where Gedanken got the wires crossed here. I was talking at the time about Schmidhuber's paper on Kolmogorov complexity of programs that find solutions to neural network problems & the additional time constraint as an addition to the complexity (which overcomes the halting problem). However, this has absolutely nothing to do with Dembski; it was a reference to Schmidhuber's work.

Iain

(sorry about the confusing spelling! Two "i"'s and one "a". Always gets people.)

[ 20. December 2002, 09:57: Message edited by: Iain Strachan ]

gedanken
Member
Member # 594

posted 20. December 2002 10:52
Iain has the identification of the source of my remarks correct in the first instance:

Iain’s first post on page 1 said:
quote:
(1) m is, as you say, a random sequence of events, causing mutations to occur at random. Therefore it must be generated by a long (and hence very unlikely) program & contain a large amount of algorithmic information. But the problem here is that m lacks what Dembski would describe as specificity . In other words any sequence of bits for m would adequately serve the purpose, if evolution is the way it happens. In other words, the information hasn't been specified, but e simply culls advantageous changes whenever the randomness of m throws up something interesting. So in your equation with Prob(m), one really needs to consider all possible, suitably random bit strings and add them all up. Since most bitstrings are random, this means Prob(suitable m) is close to 1.

(2) The "Tornado in a junkyard" is also a random process, which could be represented also by a bitstream, say t, which itself must also contain a huge amount of algorithmic information.

I am interpreting beyond Iain’s statement to reach conclusions -- which could be in error since I don’t have the copy of NFL I have been reading at hand now and can’t look them up.

The interpretation is two part: first from what I have seen (but don’t have at hand) Dembski refers to Chaitin complexity (or K.. complexity). Then Iain’s point about “specificity” -- random events can have complexity and are not what Dembski would call information unless they have what Dembski would call specificity. So I am simply interpreting from these points to the statement that Dembski calls Chaitin measure complexity and not information. It is entirely possible that Dembski is inconsistent in his usage -- and in that case incoherent on the issue.

Since we are discussing Dembski’s approach, I think that to be effective we must stick to Dembski’s terminology when expanding on the implications. As such I might recommend not using the term “information” (even in combination like “algorithmic information”) when we have not met the terms of Dembski’s usage which must include an identification of specificity. (Now I might also think that the “specificity” issue has problems of being well-defined -- but let’s deal with that later.)

I think that the summary post would be very helpful (even at the risk of some duplication -- I think the moderators will allow this due to the importance in understanding the subject).

Groucho
Member
Member # 605

posted 20. December 2002 15:26
quote:
(Iain)
(1) m is, as you say, a random sequence of events, causing mutations to occur at random. Therefore it must be generated by a long (and hence very unlikely) program & contain a large amount of algorithmic information. But the problem here is that m lacks what Dembski would describe as specificity . In other words any sequence of bits for m would adequately serve the purpose, if evolution is the way it happens. In other words, the information hasn't been specified, but e simply culls advantageous changes whenever the randomness of m throws up something interesting. So in your equation with Prob(m), one really needs to consider all possible, suitably random bit strings and add them all up. Since most bitstrings are random, this means Prob(suitable m) is close to 1.

Both Iain and gedanken identified the above quote as the Dembski reference.

quote:
(Iain)
However, the point was that the sequence m does not have to be precisely specified in order for e to produce b. I can run a genetic algorithm, for example, hundreds of times over with a different sequence m_i where i indexes the mutation sequence, and it will succeed in producing b every single time. It may take a different length of time for each time the algorithm runs, but it will always get there (assuming it's a problem soluble by a genetic algorithm).


First, let me give my understanding of a "specification" as Dembski utilizes the term, and the relevance of it to his arguments. Someone correct me if I'm wrong.

A specification for Dembski is any property of a string other than the property of being the string itself. If the calculated probability of such a property is less than the Universal Probability Bound [10^-150, i.e. 1 in 10^150] then the specification is termed "Complex Specified Information" and deemed the result of "intelligent design". As best I can tell, it is merely axiomatic for Dembski that the "intelligent" designer cannot be a mechanism.

So looking at b, the biological world as it exists today, Dembski would presumably calculate the probability of any string of equivalent size having properties such as metabolism, brains, giraffes, etc, and if it was lower than the U.P.B., immediately conclude "intelligent design". In Dembski's scheme "intelligence" is a whole separate category from mechanism, by definition. I don't know how to reconcile that with the fact that we could easily describe a mechanism that had as its goal state the binary equivalent of the aforementioned complex properties of life. Indeed, such a mechanism would be very complex - but it would be neither "intelligent" nor a "designer", not according to Dembski. [Presumably, the above could be a gross mischaracterization of Dembski's Design Inference, and if it is, hopefully someone will correct me. But this is honestly the best I can make of it.]

In my discussion, I used the occasion of your reference to Dembski specifications to introduce my own. The relevance of a specification to my arguments is subtly different [though not entirely] than in Dembski's arguments. However, my denotation of the term "specification" is the same as Dembski's, i.e. it denotes a "property" of a string, i.e. for which there exists a computable predicate which will return true if X has that property, and false otherwise. By a specification a set is inferred, of all strings that have that property. I introduced one specification, S_eb[x]: "every x such that e[x] = b". Thus for all x, if e[x]= b, then S_eb[x] is true. IOW, for e[x] = b to be true then the input mutation sequence x must exhibit the property S_eb.
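The definition of S_eb as a computable predicate, and the set it picks out, can be made concrete on a tiny example. Everything specific here is hypothetical: the function e below just sorts bits and has nothing to do with natural selection, inputs are limited to 8 bits so they can be enumerated, and the final fraction counts inputs uniformly rather than weighting them by 1/2^H.

```python
from fractions import Fraction
from itertools import product

def e(x):
    """Hypothetical stand-in for e: sorts the bits of x, ones first."""
    return tuple(sorted(x, reverse=True))

def make_spec(e, b):
    """Build the predicate S_eb: true exactly for those x with e(x) == b."""
    return lambda x: e(x) == b

b = (1, 1, 1, 0, 0, 0, 0, 0)
S_eb = make_spec(e, b)

# The set inferred by the specification: every 8-bit input x with S_eb[x] true.
inputs = list(product((0, 1), repeat=8))
satisfying = [x for x in inputs if S_eb(x)]

# Fraction of all 8-bit inputs that have the property (uniform counting).
prob_S = Fraction(len(satisfying), len(inputs))
print(len(satisfying), prob_S)
```

For this toy e the property is shared by many inputs, so the predicate constrains x without singling out one string. How large that set is for a realistic e is exactly what the two sides of this exchange disagree about.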

Thus Iain, it is not true as you say that, "the sequence m does not have to be precisely specified in order for e to produce b." m has to have the property S_eb, if e[m] is to yield b. [Note that in my arguments b refers specifically to the biological world as it exists today, and e refers specifically to natural selection, m to the actual mutation sequence. However, the arguments apply whatever e,m,b happen to be [if you catch my drift].]

The relevance of specification S_eb[x] is as follows:

I contend that if H[e] < H[b] then for all x, prob[e[x] = b] <= prob[b]. This is equivalent to prob[e]*prob[e[x] = b|e] <= prob[b], which is equivalent to prob[e]*prob[S_eb[x]|e] <= prob[b]. [As I stated previously, if H[e] > H[b], i.e. natural selection is more complicated than the biological world, then nothing more needs to be said.]

I believe the proof is as follows [and now I'm thinking I must have read it, but I don't know where]: Say Pb is the smallest program [without input] that generates b. If len[e] = m, with m < len[Pb] [here m is a length in bits, not the mutation sequence], then what is the most information that could be conveyed by e? IOW, what is max[prob[b|e]]? max[prob[b|e]] occurs when e corresponds exactly to any m bits of Pb, in which case prob[b|e] = 1/2^[H[b]-m]. However, prob[b|e] = prob[e[x] = b|e] = prob[S_eb[x]|e]. Thus,
prob[e]*prob[S_eb[x]|e] = [1/2^m]*[1/2^[H[b]-m]] = 1/2^H[b] = prob[b].
_______________

[From the above, it easily follows that
prob[S_eb[x]|e] <= prob[b]/prob[e].]
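The arithmetic in the argument above can be checked with exact fractions. The sizes are made up for illustration (H[b] = 20 bits, len[e] = 8 bits), and the sketch takes the argument's own premises as given: probabilities of the form 1/2^bits, and the best case in which e matches 8 bits of Pb exactly.

```python
from fractions import Fraction

def prob_from_bits(bits):
    """Probability assigned to a description of the given length: 1/2^bits."""
    return Fraction(1, 2 ** bits)

H_b = 20    # hypothetical length in bits of Pb, the shortest program for b
len_e = 8   # hypothetical length in bits of e, with len_e < H_b

prob_e = prob_from_bits(len_e)
max_prob_b_given_e = prob_from_bits(H_b - len_e)   # best case for e
prob_b = prob_from_bits(H_b)

joint = prob_e * max_prob_b_given_e
print(joint, prob_b, joint == prob_b)
print(max_prob_b_given_e == prob_b / prob_e)
```

The check confirms only that the exponents add up as stated. It says nothing about whether the premise (that e can help at most by supplying bits of Pb) is justified.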

Also, as stated previously, as H[e] increases relative to H[b], can we not start to make inferences regarding the intelligence of e [considering that e must have huge segments of code from b embedded in it]?

I want to conclude with a brief discussion of uniform probability. Since I introduced the notion of the probability of e [i.e. the probability of natural selection] it raises the question of what sort of mechanism could result in the creation of e. This is also relevant to some of gedanken's previous remarks that questioned whether the number of bits in H[b] for example indicated its true probability. Also, many of Dembski's detractors have taken him to task for the usage of uniform probability distributions in calculations relative to U.P.B., etc. [Note: This all may derive from the work of others which may be more rigorous and informed than my own analysis.]

Gedanken indicated for example that although b had a probability of 1/2^H[b] [according to my formula] its probability could be much higher given some background knowledge of the universe that we don't have currently [e.g. some types of operations being inherently easy for the universe, despite having a long binary description]. [Hopefully, that's a reasonable characterization of gedanken's arguments.]

How could the probability of some string r be in actuality greater than 1/2^H[r]? Only with additional background knowledge of other processes in operation. To return to the proof above, if we had knowledge that m bits of Pb existed, it would increase the probability of b existing from 1/2^H[b] to 1/2^[H[b]-m]. However, the probability of m bits of Pb existing is equal to 1/2^m. Thus in any situation where additional background knowledge k exists such that prob[r|k] > prob[r] we would still have to consider the probability of k, i.e. prob[k]*prob[r|k] = prob[r] = 1/2^H[r]. Thus, it seems that uniform probability distributions are warranted.

Now that I think of it though, if you accept uniform distributions in calculating the probability of b, we don't even have to consider natural selection or the mutation sequence and can do exactly what Dembski has been castigated for, i.e. identify a detachable specification S in b, and assuming a uniform probability distribution, compute prob[S] and see if it's lower than 10^-150. We can disregard natural selection or any mechanism that could account for S, given that any such mechanism [e] together with any input x such that S[e[x]] is TRUE, would have equal or lower probability than S.

Actually, the last statement was proved previously in relation to prob[b], not prob[S], so I think this needs to be established.

IOW, we previously showed that:

prob[e]*prob[e[x]=b|e] <= prob[b].

The above needs to be recast in terms of some specification s:

for all e,m,b,s,x,y:
if
e[m] = b and
H[e] < H[b] and
s[b] = TRUE and
s[y] = TRUE --> H[b] <= H[y]
then
prob[e]*prob[s[e[x]]=TRUE|e] <= prob[s[y] = TRUE]

Note in the above we assume that b is the smallest binary string such that s[b] is TRUE, where s[x] presumably equates to something like: "giraffes[x] and metabolism[x] and brains[x] and ..." [i.e. s is the conjunction of every relevant observable property of life we want explained.] However, we were already implicitly assuming that b was the smallest such string. At least it does not invalidate any previous aspects of the analysis to make such an assumption. We don't particularly care what specific binary string b corresponds to, as long as the string corresponds to s - the properties of life we're interested in.

I think it follows trivially from how we're defining probability that, if b is the smallest binary string that has property s, then for all x prob[s[x]] <= prob[b], and from that the above theorem easily follows [I think].
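A toy check of that step, as a sketch only. H is not computable, so the code below substitutes plain string length for H (my assumption, not part of the argument), and the specification s is a made-up stand-in for the real conjunction of properties. Under those assumptions it enumerates short binary strings, picks the smallest one satisfying s, and confirms that no other string satisfying s gets a higher probability.

```python
from fractions import Fraction
from itertools import product

def all_binary_strings(max_len):
    """Yield every binary string of length 1..max_len, shortest first."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def toy_H(x):
    # ASSUMPTION: the real H[x] is uncomputable; string length is a crude stand-in.
    return len(x)

def prob(x):
    """prob[x] = 1/2^H[x], using the stand-in for H."""
    return Fraction(1, 2 ** toy_H(x))

def s(x):
    # Hypothetical toy specification: "x contains the pattern 1101".
    return "1101" in x

# Every string up to 8 bits that satisfies the specification.
candidates = [y for y in all_binary_strings(8) if s(y)]

# b is the smallest string with property s.
b = min(candidates, key=toy_H)

# No string satisfying s is more probable than b.
print(b, prob(b))
print(all(prob(y) <= prob(b) for y in candidates))
```

With length standing in for H the result is true by construction, which is really the point: the inequality is a direct consequence of defining probability as 1/2^H and choosing b to minimize H.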

Thus, we can disregard any mechanism leading to b [or to use Dembski's terminology, any "chance hypothesis" leading to b], and just calculate the uniform probability of s[b] [for some specification s] and see if it's less than Dembski's Universal Probability Bound [1/10^150].
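For a sense of scale, here is a short sketch that converts the bound into bits under the 1/2^H formula used throughout. It only does exact integer arithmetic; the function names are mine, and the bound is taken as 1 in 10^150.

```python
from fractions import Fraction

# Universal Probability Bound, taken here as 1 in 10^150.
UPB = Fraction(1, 10 ** 150)

def below_upb(bits):
    """True if a probability of 1/2^bits is strictly below the bound."""
    return Fraction(1, 2 ** bits) < UPB

def min_bits_below_upb():
    """Smallest number of bits whose probability 1/2^bits falls below the bound."""
    bits = 0
    while not below_upb(bits):
        bits += 1
    return bits

print(min_bits_below_upb())  # 499
```

So under this formula, anything whose shortest description needs 499 or more bits falls below the bound, and anything needing 498 or fewer does not.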

[ 22. December 2002, 11:25: Message edited by: Groucho ]

Groucho
Member
Member # 605

posted 22. December 2002 16:16
I now realize that substantial segments of the above were derived from something I read at least three years ago. The situation was as I intimated previously, i.e. I was researching along similar lines, and encountered something similar to what I've presented above, I guess without realizing its full significance at the time. Because I haven't seen anything about it in the interim, and my mind has been on other things, I must have forgotten about it. In fact, I think Dembski is actually prevented from using any of the above by the guy who originally came up with it (because he's not in the ID crowd, I guess.) But the implication for Dembski is as I intimated in my previous post, i.e. you don't have to consider alternative chance hypotheses (if I'm using the term correctly) but just have to derive a specification assuming a uniform probability distribution.

Maybe you all already knew this (and didn't feel like correcting me because I was a newbie who crashed your party.) Surely someone in this forum must have a name or link to the person originally associated (at least in published form) with the ideas presented above. I've done a search myself, but haven't come up with anything.



All times are East Coast

All content © ISCID and content contributor 2001-2003

The ISCID Forums are aimed at generating insight into the nature of complex systems (e.g. biological complexity, organizational complexity, etc.) and the ontological status of purpose, especially from the vantage point of various information- and design-theoretic models.
