ISCID Forums



   
Author Topic: Refining the Generic Chance Elimination Argument
Rex Kerr
Member
Member # 632

posted 19. February 2003 21:01
I'm not sure if this problem has already been addressed and fixed, but I'll raise it and fix it anyway. It concerns a mathematical error that entered the rigorous form of the Generic Chance Elimination Argument between the publication of TDI and that of NFL.

Enumerating Specifications: Problems and Solutions

In Dembski's explanatory filter, probabilistic resources come in two types: replicational resources, which are the number of tries to hit a target, and specificational resources, which are the number of targets to be hit.

We will consider whether the current formulation of probabilistic resources adequately accounts for targets to be hit. First, the definition of the number of specificational resources:

(From NFL)
quote:
SpecRes is the set of all specifications T such that (1) T is a subset of W, (2) T is detachable from E for S with respect to all chance hypotheses {H_i}(i in I), (3) max(i in I) P(R|H_i) >= max(i in I) P(T|H_i), (4) phi(R) >= phi(T), and (5) T is set-theoretically maximal with respect to conditions (1) to (4).
Definitions: W is the set of all possible outcomes. E is the event that we actually observed. S is the subject who is doing the evaluating. I is an index set that enumerates the chance hypotheses H_i that might be used to explain E. R is the specification that we have chosen to specify at least E. phi is a measure of complexity of a specification.

Note that R is always either one of the specifications T, or is a subset of such a specification T.

This seems straightforward enough: you see an event, realize a specification R that it corresponds to, and then check to see how many specifications T are simpler (phi(R)>=phi(T)) and at least that rare (P(R|H_i)>=P(T|H_i)).

Unfortunately, there may be a potential problem, suggested by the (paraphrased quote): "I write faster than those who write better than me, and I write better than those who write faster than me." If you have a series of specifications, each of which is more complicated and more rare than the last, might you underestimate the true number of specifications that need to be considered?

To see that this is indeed possible, consider the following scenario. The process we will consider is, "How many times can you roll a die before it lands on a 1?" This is explicitly a random process with only one relevant chance hypothesis H.

In order to talk about the results of sequential terminating processes like these, we decide to use the word F(n) to denote a process that finishes after n steps. How complex is each of these words? Well, typically, early terminations will be more frequent than later ones, so inspired by Huffman encoding schemes we use simple symbols to denote F(1), F(2) and so on, and complex symbols to denote F(1923561895), F(1923561896), and so on. (Note: this is entirely natural, as you can see by counting the characters in my example.)

Now suppose we roll our die and it first lands on a 1 on the nth try. We can calculate the probability of this event: P(E|H) = (1/6)*(5/6)^(n-1). And we can come up with a specification that includes this event: R = F(n). Now let us calculate the specificational resources that are relevant. F(n) is itself detachable from E; we came up with F(n) and every other F before rolling any dice at all.

Each token F(m) is a specification. These are all singleton subsets of W and don't depend on E, so they are detachable. Now, P(R|H) >= P(F(m)|H) exactly when m >= n. There are also various sets T = {F(m_j)} that have this property; note that they must obey min(m_j) > n. These are the T that pass condition (3). As for condition (4), we already decided that phi(R) >= phi(F(m)) at least approximately when n >= m. And for our sets, a set is obviously a more complex description than its components, so we must have at least n > min(m_j).

What specifications pass both (3) and (4)? None of the sets do, since we must have both n > min(m_j) and min(m_j) > n. Similarly, the singletons must obey n >= m and m >= n, meaning m = n. Only one specification is relevant, F(n): the description of what actually happened. SpecRes = 1.
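The singleton part of this argument can be checked with a minimal sketch. It assumes a hypothetical complexity measure phi(F(m)) = m, a strictly increasing idealization of the "at least approximately" ordering above; the function names and the search limit are illustrative and not taken from NFL.

```python
from fractions import Fraction


def prob(m: int) -> Fraction:
    """P(F(m)|H): the first 1 on a fair die appears on roll m."""
    return Fraction(1, 6) * Fraction(5, 6) ** (m - 1)


def phi(m: int) -> int:
    """Hypothetical complexity of the token F(m); assumed strictly increasing in m."""
    return m


def singleton_spec_res(n: int, search_limit: int) -> list[int]:
    """List every m up to search_limit whose singleton F(m) passes (3) and (4) for R = F(n)."""
    return [
        m
        for m in range(1, search_limit + 1)
        if prob(n) >= prob(m)  # condition (3): F(m) is at least as rare as R
        and phi(n) >= phi(m)   # condition (4): F(m) is no more complex than R
    ]


survivors = singleton_spec_res(7, 200)
```

Under these assumptions the only survivor for n = 7 is m = 7, matching the count of one specification. A coarser phi (for example, the number of characters in the token) would let a few neighbouring m through, but not many.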

But this is clearly wrong. We roll the die once, so we have one replicational resource, and we have already calculated that there was only one specificational resource. Yet P(F(n)) is no greater than 1/6 for every n, which is less than 1/2 (Dembski's suggested value for alpha).
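As a rough check of this claim, the sketch below assumes the test is applied the way the post uses it: multiply the probability by the replicational and specificational resources and compare the result with alpha = 1/2. The function name `flagged` and the range of 1000 outcomes are illustrative choices.

```python
from fractions import Fraction

ALPHA = Fraction(1, 2)  # cutoff suggested by Dembski, as cited in the post


def prob(n: int) -> Fraction:
    """P(F(n)|H): the first 1 on a fair die appears on roll n."""
    return Fraction(1, 6) * Fraction(5, 6) ** (n - 1)


def flagged(n: int, spec_res: int = 1, repl_res: int = 1) -> bool:
    """True if F(n) stays below the cutoff after factoring in both kinds of resources."""
    return spec_res * repl_res * prob(n) < ALPHA


# Check the first 1000 possible outcomes; together they carry almost all the probability.
all_flagged = all(flagged(n) for n in range(1, 1001))
covered = sum(prob(n) for n in range(1, 1001))
```

With one replicational and one specificational resource, every outcome checked falls below the cutoff, even though the outcomes jointly exhaust essentially all of the probability.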

Conclusion: every possible outcome of this random process is an example of specified complexity!

How do we fix the estimate of probabilistic resources? There are a number of ways to proceed.

First, we might insist that SpecRes is at least however many specifications we pick before observing the outcome. But this renders the filter practically unusable on real-world data, because we don't know what to put through the filter until we've looked at the real world. Every time we are surprised (which are the very times we might suspect design), we would have to defer judgment. So that option is unworkable.

Second, we might try to inflate the number of replicational resources to include the entire universe of possible die rolls. This would overcome the counterexample, but it doesn't fix the problem; even if you allow 10^150 replicational resources, you can generate a new counterexample with any random process, such as a lottery, where each outcome has probability less than whatever limit you've set for your replicational resources. The problem is in accounting for specificational resources, and needs to be fixed there.

Third, we might try to relax one of the conditions (3) or (4). If we relax condition (4), we admit specifications of arbitrary complexity, giving us infinite specificational resources, which isn't what we want since almost all of those specifications have an effective probability of zero. If we relax (3), we allow a lot of high-probability things to count as targets in terms of specificational resources, even though we wouldn't count them if we hit them. Still, this will fix the problem, and if our specification R is short enough, we will still have a usable system. This is similar to the fixed complexity bound used in TDI, but allows the flexibility of being tailored to the specific situation it's being used in.

Let's consider the Caputo example. We can, perhaps, specify Caputo's results with the string "1R", which is 16 bits. There can't possibly be more than 2^16 simpler specifications, so we have about 64 thousand specificational resources; together with the replicational resources of elections (~120 million in Dembski's example), we can no longer conclude that Caputo's election-fixing antics were an example of specified complexity (although if he'd continued for another 10 elections or so, we could have). But this is not so bad; we are trying to use a completely general test to solve a very specific problem. We might expect to lose a bit of power along the way.
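The arithmetic behind this estimate can be sketched as follows. The 2^16 and 120 million figures come from the paragraph above; the remaining inputs are assumptions not stated in this post: that Caputo's record was 40 Democrat-first drawings out of 41, that the relevant probability is the chance of at most one Republican-first result in N fair drawings, (N + 1)/2^N, and that a longer streak would still contain exactly one Republican-first result.

```python
from fractions import Fraction

ALPHA = Fraction(1, 2)    # cutoff used in the post
SPEC_RES = 2 ** 16        # upper bound on specifications no longer than "1R" (16 bits)
REPL_RES = 120_000_000    # replicational resources, per the post


def tail_probability(drawings: int) -> Fraction:
    """Assumed model: P(at most one Republican-first result in `drawings` fair drawings)."""
    return Fraction(drawings + 1, 2 ** drawings)


def adjusted(drawings: int) -> Fraction:
    """Probability scaled by both kinds of probabilistic resources."""
    return SPEC_RES * REPL_RES * tail_probability(drawings)


def extra_drawings_needed(start: int = 41) -> int:
    """Additional drawings before the scaled probability drops below ALPHA."""
    n = start
    while adjusted(n) >= ALPHA:
        n += 1
    return n - start


scaled_now = adjusted(41)
extra = extra_drawings_needed()
```

Under these assumptions the scaled probability for 41 drawings is far above 1/2, and it takes roughly nine more drawings to get below the cutoff, in line with the "another 10 elections or so" estimate.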

There are still problems remaining (e.g., we need to put a few common-sense conditions on phi), but the step of removing condition (3) from the enumeration of SpecRes eliminates perhaps the most serious mathematical defect.

Note: UBB complains when I try to use less-than symbols or the HTML ampersand-lt-; code for it. I have therefore converted all formulae to use only greater-than. There may, however, have been errors in the conversion process. If something seems exactly backwards, it probably is.

gedanken
Member
Member # 594

posted 26. September 2003 11:05
I've suddenly become interested in this completely neglected thread, because it bears on other issues I have been discussing. I want to try to understand this argument--and am at the moment digesting the issues presented here and in the "Does intelligence imply 'motive'?" thread with regard to this issue.

But there are a couple of papers that should be read with regard to this. I warn that both contain some political/personal statements, and ask that those be ignored so that we can concentrate on the technical issues. (I'm presenting them not to emphasize anything political/personal, but for their technical content. The reader will have to do his/her own extraction of the proper subject matter.)

On the Talk Reason site is a paper by 'Erik': On Dembski's Law of Conservation of Information (PDF). Note that other formats are available by searching way down in the above link to Talk Reason for the heading of this paper.

The parts of this paper of interest to this discussion are the portions that deal with Dr. Dembski's GCE procedure; the aspects that move on to "conservation of information" are not relevant to the discussion here. As far as I know, this is the only formal treatment I have seen that attempts to deal closely with the issue that Rex brought up.

Then Dr. Dembski's reply: If Only Darwinists Scrutinized Their Own Work as Closely: A Response to "Erik". Once again all of interest here is anything specifically technical dealing in any way with Rex's issue here.

gedanken
Member
Member # 594

posted 26. September 2003 11:16
Rex, do you think we should present our two posts copied here as a start? I am also interested specifically in the problem of what it means to have the "specification independent of the event", as is more technically embodied in the "detachability" formal description and associated issues.

I don't want to go on at length about detachability and independence in the other thread, but rather would like to add that issue to this thread and see what we can learn. So I am asking for an addition to the subject matter, entirely compatible with "Refining GCE": the question of exactly what independence and detachability mean. In other words, when the issue gets back to any sort of comparative analysis, or "motive" issues, for example, that would go back to the other thread, but this thread sticks to understanding detachability, the alpha cutoff, and the phi complexity measure.

ALSO a request: There was an ARN discussion thread by Erik that discusses this issue--can anyone find that? Also are there any other ISCID threads that are relevant? Links could be provided for reference. Thanks.

[ 26. September 2003, 11:21: Message edited by: gedanken ]

Rex Kerr
Member
Member # 632

posted 26. September 2003 21:14
Interesting links. I'm not sure I like Erik's notation of Dembski's theory any better than I like Dembski's; Erik's looks more familiar (and somewhat clearer) to me, but is much harder to write in plain text.

I think Erik's attempted counterexample is getting at the same point that I illustrated above, but as far as I can tell, it doesn't actually work. Maybe I'm misunderstanding. (In particular, it seems that phi(T)=phi(R) for every rejection function, so each f defines a specificational resource, and we have 200 of them.)

In any case, whether we discuss things here or on the previous thread or on a new thread depends on what exactly we're going to discuss.

Although I appreciate having this post resurrected from the depths of the archive, it wouldn't be on topic to discuss anything other than the misestimation of specificational resources on this thread. (Erik brings up many other issues.) [Off-topic not because it doesn't match the title, but because people skimming the first post (which is highly detailed/technical) wouldn't necessarily have a good idea of what to find in later posts.]

A more general discussion of detachability, accounting for resources, whether specifications are limited by the number of bit-operations in the history of the universe, etc., would best be done on another thread. (If not the "motive" thread--it also seems off-topic there--then I would recommend a new thread.)

I am, of course, happy to discuss specificational resource mis-estimation right here.

[ 26. September 2003, 22:54: Message edited by: Rex Kerr ]



All times are East Coast  

All content © ISCID and content contributor 2001-2003
