Topic: Does intelligence imply “motive”?
|
gedanken
Member
Member # 594
|
posted 25. September 2003 18:22
quote: Which is the same thing. The pattern must match the event observed.
(My emphasis.)
And it also must be given independently of the event. These are competing issues. If you take information from the event, you are constructing a 'fabrication'.
(That is why I gave the 500 coin toss example above. If you take the pattern from the coin toss results, you are constructing a “fabrication”. In that case having the pattern “match the event observed” created a fabrication.)
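The difference between a fabrication and an independently given pattern can be sketched in a few lines of Python. This is only an illustration under simple assumptions (fair, independent tosses; the helper names `toss`, `fabricated_pattern` and `match_rate` are made up for this sketch), not part of the EF itself. A pattern copied from the event always matches, so the match tells us nothing; a pattern written down beforehand essentially never matches.

```python
import random

N_TOSSES = 500  # matches the 500-toss example in the post


def toss(rng, n=N_TOSSES):
    """Return one sequence of n fair coin tosses as a string of H/T."""
    return "".join(rng.choice("HT") for _ in range(n))


def fabricated_pattern(event):
    """A 'fabrication': the pattern is simply copied from the observed event."""
    return event


def match_rate(make_pattern, trials=200, seed=0):
    """Fraction of trials in which the chosen pattern equals the event."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        prior = toss(rng)   # a pattern written down before the event is seen
        event = toss(rng)   # the event whose design is in question
        pattern = make_pattern(prior, event)
        hits += pattern == event
    return hits / trials


# Pattern read off the event: it matches every time, so it is no evidence.
fabrication_rate = match_rate(lambda prior, event: fabricated_pattern(event))
# Pattern fixed independently: each match would have probability 2**-500.
independent_rate = match_rate(lambda prior, event: prior)

print(fabrication_rate, independent_rate)
```

The point of the sketch is only that "matches the event" is cheap when the pattern is taken from the event; the independence requirement is what makes a match informative.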
In the method of the EF, you must find that "specification" independent of the event, but you can find a specification that matches the event as closely as possible (having been located independent of the event). So once again it is not the pattern of the Easter Island figures "eyes" that are the "specification", it is a pattern that you can find independently that counts.
I want the reader to notice that Alonso has never given a specification that is truly “independent” of the event. For example:

quote: Since the event includes intricate shapes that match a human face, of course the probability will be low.
This is interesting since the facial features have sharp edges where our faces are smooth. The Easter Island figures’ patterns are representational, not accurate copies of the human figure.
Show your pattern. Show it was independent. Show it has low probability. But most of all, show your pattern in sufficient detail that we can objectively evaluate it. (That pattern will be termed the "specification" if it is truly independent of the event--but that is what we want to check!)
Show which features of the human face are designated, and which are missing, and where these features are found “independent of the event”.
(You might also want to read through my thread carefully. I have not claimed that the EF will not make a closer pattern match to the Easter Island figures than to the stone faces, for example. But you may have missed the subtlety if you don’t actually read the thread. What has not been demonstrated is that the probability is extremely low. And then ultimately there are other issues that are more important, such as whether the information that forms the “independently found pattern” must, in the examples used to demonstrate the correctness of the EF, always wind up supporting the physical evidence of a physical designer. And you might also notice that I have already given a superior pattern in part, which gives a better match and which is independent of the event. Can you find that?)
[Note edits!] [ 25. September 2003, 18:39: Message edited by: gedanken ]
|
|
Nel
Member
Member # 614
|
posted 25. September 2003 18:37
Ged writes:
quote:
And it also must be given independently of the event. These are competing issues. If you take information from the event, you are constructing a 'fabrication'.
This is another misinterpretation of Dembski's writings. From No Free Lunch:
quote:
a rejection function f is detachable from E if and only if a subject possesses background knowledge K that is conditionally independent of E...and such that K explicitly and univocally identifies the function f. Any rejection region R of the form T-gamma = {(omega is contained in Omega)| f(omega) >= gamma} or T-delta = {(omega is contained in Omega)| f(omega) <= delta} is then said to be detachable from E as well. Furthermore, R is then called a specification of E, and E is said to be specified.
In other words, my background information independently identified nose, mouth, eyes (complete with pupils), hat. However, I also know of the pattern nose, mouth. The problem that Dembski's math is trying to solve is identifying which of these maps to the event. So they are not competing issues.
Ged writes:
quote:
(That is why I gave the 500 coin toss example above. If you take the pattern from the coin toss results, you are constructing a “fabrication”. In that case having the pattern “match the event observed” created a fabrication.)
This is completely different from how the design inference is made with the Easter Island statues. I have observed eyes, noses, mouth, face, way before I saw the Easter Island statues. With the coin toss example, you must first see the event in order to specify it (a fabrication).
Ged writes:
quote:
In the method of the EF, you must find that "specification" independent of the event, but you can find a specification that matches the event as closely as possible (having been located independent of the event). So once again it is not the pattern of the Easter Island figures "eyes" that are the "specification", it is a pattern that you can find independently that counts.
It is precisely the pattern of the Easter Island statues' eyes that is the event. This event matches, independently, my background knowledge of eyes (complete with pupils). That's why a design inference is made.
Ged writes:
quote:
Show your pattern. Show it was independent.
The toddler test and the fact that I have seen eyes with pupils, mouths, and noses before I ever saw the Easter Island statues shows the pattern and shows that it was independent.
Ged writes:
quote:
Show it has low probability. But most of all, show your pattern in sufficient detail that we can objectively evaluate it. (That pattern will be termed the "specification" if it is truly independent of the event--but that is what we want to check!)
After you show that it has high probability and that the pattern is not sufficient in detail, I will respond.
Ged writes:
quote:
(You might also want to read through my thread carefully. I have not claimed that the EF will not make a closer pattern match to the Easter Island figures than to the stone faces, for example. But you may have missed the subtlety if you don’t actually read the thread. It is a matter of probability being extremely low that has not been matched.
I already responded to this, which is why I will strongly note that none of my points have been addressed in any part of this thread. For example, I will note that the eyes with pupils, mouths, and noses and hat, is more complex than eyes or something that ambiguously looks like an eye (one oval shape with no pupil). Obviously the former is of low probability.
[Added in edit:
Ged writes:
quote:
This is interesting since the facial features have sharp edges where our faces are smooth. The Easter Island figures’ patterns are representational, not accurate copies of the human figure.
Exact replicas of human figures are not required; simply, the closer the better. That is why a closer look showing the same type of humanoid face on Mars would have nailed down the design inference, and the complete absence of form yielded a negative result. It is a non sequitur to say that because it is not an exact match it is not independent of the event. The actual nostrils of the nose and the pupils of the eyes were seen way before I ever saw the event. This is why in the Caputo case, the specification was 40 or more Ds.
end edit ] [ 25. September 2003, 19:07: Message edited by: Nelson-Alonso ]
|
|
gedanken
Member
Member # 594
|
posted 25. September 2003 19:53
quote: quote: And it also must be given independently of the event. These are competing issues. If you take information from the event, you are constructing a 'fabrication'.
This is another misinterpretation of Dembski's writings. From No Free Lunch: ...
Alonso, I do strongly object to "misrepresentation" here.
I will let Dr. Dembski speak for himself (from NFL p.15)
quote: ... For a pattern to count as a specification, the important thing is not when it was identified but whether in a certain well-defined sense it is independent of the event it describes. Drawing a target around an arrow already embedded in a wall is not independent of the arrow’s trajectory. Consequently, such a pattern cannot be used to attribute the arrow’s trajectory to design. Patterns that are specifications cannot simply be read off the events whose design is in question--in other words, it is not enough to identify a pattern simply by inspecting an event and noting (i.e. “reading off”) its features. Rather, to count as specifications, patterns must be suitably independent of events. I refer to this relation of independence as detachability and say that a pattern is detachable if and only if it satisfies that relation (see section 2.5).
(Bold mine, italics in original.) And of course section 2.5 describes the same issue as Alonso quoted about the “detachability,” Alonso’s quote from 2.7 on the GCE procedure. This is a method of establishing “independence”. And this can be found in the index, under “Pattern, independently given”.
Dembski clearly shows how these are competing issues, and how one actually resolves them in the formal explanatory filter procedures (like GCE procedure). The GCE procedure’s “extremal set” is how the competing issues are resolved. The best possible match is found from independent sources, but no aspect that is not found independently can be used. Alonso if you don’t understand how the “extremal set” resolves the competing issues, you don’t understand the GCE procedure version of the EF.
quote: I will note that the eyes with pupils, mouths, and noses and hat, is more complex than eyes or something that ambiguously looks like an eye (one oval shape with no pupil). Obviously the former is of low probability.
Right! But where have you specified the “eyes with pupils, mouth, and noses and hat” as a “specification”? And we of course have a figure with a “hat”, a “nose”, and mouth, chin, various features, but randomly generated. Your method of estimating the closeness of these features to the human has not been specified—it must be part of the specification. Remember your own quote: “such that K explicitly and univocally identifies the function f.” Now explicitly and univocally give us the function “f”! I’m afraid that your specification allows a great many different functions “f”, and different “f” expressions would give very different answers.
Now Alonso, if you had read the thread you would realize how I have already discussed the issue of features and how close they are. I have discussed how the more refined versions of the GCE procedure, with its “independently” determined rejection function f, can do a better job than crudely worded sentences that can very ambiguously identify these features.
And you would read the distinction between the sharp edges that form “representations” of the eyes, etc., and an actual match for the face itself. Our faces only vaguely look like those Easter Island figures. They are abstract in many qualities rather than strict geometric matches. You could read that I already analyzed direct geometric comparison and noted that the Easter Island figures are distorted and thus are geometrically not a good match. You could read how I already analyzed using counts of listed features as a measure of match quality, and the problems with that for random figures. If we allow the abstractions we wind up with relatively high probability of matching randomly created figures, even if that probability is “medium” low rather than incredibly low. This would be “relatively” high as compared to the UPB of 10^-150, for example.
You would also notice that I already gave an independent pattern group that does have a much better match. This is the area of human created statuary. These human generated statues are observed by us completely independently of the Easter Island figures. And art books undoubtedly contain descriptions of statuary methods in which edges form sharp lines that exploit our human tendency to recognize “lines” that represent the face. I have not looked up an explicit statue or explicit art book description to use as “specification”, but I think these would agree more closely as targets for matching.
Then the actual description of the matching function ‘f’ still has to be resolved. Knowledge K cannot just be knowledge about the subject, it must generate the function ‘f’ and even this example does not give that yet. I could give that as a computer pattern recognition program, with both knowledge and function specified, and that could be done independently—then it could be tried on examples to see how the probability arises. Remember per se the function ‘f’ is not inherent in art books or statues themselves. In Caputo the count of “d”s is given as a function ‘f’, because counting per se is known separately. Where is comparison of closeness to either artwork or human figure given?
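To show what an explicitly and univocally given 'f' looks like, here is a minimal Python sketch of the Caputo-style count. The figure of 41 drawings is an assumption brought in for the illustration (the thread only says "40 or more Ds"), the ordering of the observed sequence is made up, and the names `f` and `tail_probability` are just labels for this sketch. Nothing comparable has yet been written down for "closeness to a face".

```python
from math import comb

# Assumption (not stated in the thread): 41 drawings, each equally likely
# to favour either party under the chance hypothesis H.
N_DRAWINGS = 41


def f(sequence):
    """Rejection function: the number of 'D' outcomes in the sequence."""
    return sequence.count("D")


def tail_probability(gamma, n=N_DRAWINGS):
    """P(f >= gamma) under H, i.e. the probability of the region T^gamma."""
    return sum(comb(n, k) for k in range(gamma, n + 1)) / 2 ** n


observed = "D" * 40 + "R"   # illustrative ordering only
gamma = 40                  # "40 or more Ds"
in_region = f(observed) >= gamma
p_region = tail_probability(gamma)

print(in_region, p_region)
```

Counting is known independently of any particular ballot sequence, so f is fixed before the event is inspected, and anyone can recompute the probability of the rejection region. That is the standard a face-matching 'f' would have to meet.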
The issue never was whether the specification could be found that would provide a good set of inputs to the EF. Rather the issue was whether the specification had been given. Because if a pattern has not yet been given, which is suitably “independent”, and which has demonstrably (by calculation) low probability of occurring, then the EF terms have not been demonstrated. It is a misrepresentation to claim that the conditions of the EF have been met, when these items are not present in the pattern that was presented!
Now let’s look at what we have found. Now the edges of the eyes forming sharp lines do give very close matches to other statuary features in which those same edges are emphasized, for example. What we have found is that we can match the EF conditions in this case when the evidence used for the “pattern” happens also to be evidence for human creation of similar figures. Read back how I apply these issues.
AND by the way, if you take a closer look at the Easter Island figures, you see that the surface is very rough and with a grainy surface, nothing like the human figure. And you realize, for example, that the apparent “arms” are attached at the side, nothing like the actual human. The list goes on and on. The match gets better when compared to art, not when compared to the actual human form, wherein the matching criterion is the number and degree of match to representational features of human generated artwork and sculpture. While not an unambiguously and univocally determined “f”, this is at least closer. [ 25. September 2003, 20:30: Message edited by: gedanken ]
|
|
Rex Kerr
Member
Member # 632
|
posted 25. September 2003 21:53
Hm, I'm not sure whether I can help with P(E|H&K)=P(E|H) since I've never found that to be a particularly instructive part of Dembski's methodology. And, actually, he broke his math in NFL when dealing with this issue. (I.e. it's not just unclear but provably false, and I've shown a counterexample and offered a correction in another thread that nobody posted to.)
So I'm going to return to TDI: quote:
Given an event E, a pattern D (which may or may not delimit E) [and a hypothesis H, probability measure P, side information I, and a complexity measure C]...we say D is detachable from E if and only if the following conditions are satisfied: CINDE: P(E|H&J) = P(E|H) for any information J generated by I; TRACT: C(D|I) < L [for some upper bound L].
The second part--dropped in NFL--is crucial for my understanding of why this might actually work.
Firstly, I think one must consider not only information generated by I but information K that might generate I. (This is, I suppose, implicit in the probability calculation of P(E|H&I), but it helps to make it explicit.)
For example, suppose I contains the information 'An example of a text string is "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"' with no further explanation. Oddly enough, you have just observed a random text generator spit out "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"! Now, if you go back through your information I and look at how that was generated, you might find generating knowledge K that says, 'The machine over there just spat out "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"'. Now clearly P("aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"| random text generator & text generator just spat out "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla")=1 and not 26^-44 (the string is 44 letters long).
So the condition would fail.
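For anyone who wants to check the arithmetic, here is a rough Python sketch of that failure. It assumes a uniform 26-letter generator for H and treats the side information J as a perfectly reliable report of the output; both are modelling assumptions for the illustration, and the function names are made up. The exponent is taken from the length of the string rather than typed in by hand.

```python
from math import log10

ALPHABET_SIZE = 26
TEXT = "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"


def log10_p_given_h(event, alphabet_size=ALPHABET_SIZE):
    """log10 P(E|H) for H = uniform, independent letters."""
    return -len(event) * log10(alphabet_size)


def log10_p_given_h_and_j(event, reported):
    """log10 P(E|H&J) when J says 'the machine just output `reported`'.

    Modelling assumption: J is a perfectly reliable report, so the
    conditional probability is 1 if it names E and 0 otherwise.
    """
    return 0.0 if reported == event else float("-inf")


without_j = log10_p_given_h(TEXT)
with_j = log10_p_given_h_and_j(TEXT, reported=TEXT)
cinde_holds = without_j == with_j   # CINDE needs P(E|H&J) = P(E|H)

print(without_j, with_j, cinde_holds)
```

Side information that was generated from the event moves the conditional probability all the way to 1, so it cannot be used to build a detachable pattern.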
So I think Dembski is assuming that the E and K are causally disconnected.
Fine, fine, but in every case you do the analysis, they're actually *not* causally disconnected in practice. Sure, you could come up with "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla" on first principles alone, but that's not how you actually do it.
That's where the complexity measure comes to the rescue (although Dembski forgot to properly include the complexity in the universal probability bound, but it's a subtle point so I'll ignore it for now).
Suppose you give me a knowledge set K that really is disconnected and you say, 'Well heck, it says here what a letter is, and it says that you can make strings by sticking them together, so I can generate "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla" from first principles from this knowledge!'
The complexity bound, C("aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla",K) will tell you that this has too high a complexity and thus that you'd never really get it from first principles. So we're okay.
NFL replaces complexity bounds with complexity comparisons, as far as I can tell. Again, in that case, the P(E|H&K)=P(E|H) says that in the knowledge set you're proposing, you can't have anything that gives any clues about E in this case, but here you say, 'Well, if I can generate "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla" from K, I can generate a heck of a lot of other stuff too, and without some way to distinguish them on the basis of complexity, I'm going to lump them all together.'
In principle, this also works, even if the implementation wasn't quite right.
I think this is what the process is supposed to look like: you start describing an object in general terms. As your description increases in length, you reduce the probability of something matching it, but you increase the number of things you could describe in a description that long. If you have a random string, you really can't do better than say every letter (and thus your description is as long as the probability is low) and the two cancel out. It's the large disparity which is a flag that something weird is going on.
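Here is a toy Python version of that comparison, offered only as a sketch. The "description length" below is a deliberately crude stand-in (spell out the shortest repeating unit, then state the total length); it is my own placeholder and not Dembski's complexity measure C, and the uniform 26-letter model is likewise an assumption.

```python
from math import log2

ALPHABET_SIZE = 26


def smallest_period(text):
    """Length of the shortest prefix that, repeated and truncated, gives text."""
    n = len(text)
    for p in range(1, n + 1):
        if all(text[i] == text[i % p] for i in range(n)):
            return p
    return n


def description_bits(text):
    """Toy description length: spell out one period, plus the total length."""
    bits_per_letter = log2(ALPHABET_SIZE)
    return smallest_period(text) * bits_per_letter + log2(len(text))


def improbability_bits(text):
    """-log2 P(text) under uniform, independent letters."""
    return len(text) * log2(ALPHABET_SIZE)


def disparity(text):
    """Bits by which the improbability exceeds the description length."""
    return improbability_bits(text) - description_bits(text)


random_looking = "aosdfhlahglnakxurammomxkshgkslhmxlaoixghrzla"
patterned = "ab" * 22   # same length, but describable as a repeated pair

print(disparity(random_looking))   # description is about as long as the string
print(disparity(patterned))        # short description, low probability
```

For the random-looking string the two quantities roughly cancel; for the repeated pair the description is far shorter than the improbability, which is the "large disparity" flag described above.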
Perhaps something in this will help, and perhaps not.
|
|
gedanken
Member
Member # 594
|
posted 25. September 2003 22:19
Thanks, Rex,
I had that bookmarked.
So far I was not sure that P(E|H&K)=P(E|H) was false when the event E was a particular sequence of 500 coin tosses, and knowledge K was simply the present knowledge of the coin's state.
Because P(E) under conditions of H is a weird beast. H is a statement (so-called hypothesis) but is not known to be the actual process. If the ID agent caused the event, then the process is not even active, rather a different process which might also be probabilistic in part but not that process stated in H.
So how does worldly knowledge of various kinds and times limit or illuminate event E? If one knew the physics of each coin toss in advance, one could completely predict the toss and it would not be random. It takes a specific kind of lack of knowledge, the kind that allows the 2 state event without correlation at the time of the flip. We know that the flip is going to occur (it did occur), but we don’t know sufficient detail to know its outcome among the Heads and Tails. There were other possibilities, like no flip whatsoever, so we had to at least have that partial knowledge of the event that the flip per se occurred.
Now take our knowledge of the results. How does that change the 2 state event without correlation at the time of the flip? That is H, the hypothesis that each flip occurred without knowing the details at the time of the flip. (Remember we can’t so limit our knowledge to nothing at all about the event, we have to know that the flip occurred or else we have changed the probability.) So present knowledge K of the flip results does not change probability of event E under the conditions of H. One can know both, no change. Knowing the present result does not change the flip itself in any way. Knowing the result is not causal to the original process of the event.
But I still don't completely know what 'causal' independence of E and K means. If I know presently what 500 coin flips were, that knowledge had no 'causal' effect on the past event. It's only a constraining knowledge about present state. That probability of present state is no longer 50/50 for each coin.
The only way I can figure it is that the meaning is there must be complete non-constraining of events in omega out of which the H could have had events, by present knowledge K. (e.g. knowledge of the flip results further constrains them.) That means that anything we know about the world and the event might have some constraint, and P(E|H) is never equal to P(E|H&K). So almost no knowledge is "independent of the event" if it has anything whatsoever to do with being close to the event. You can't pick any knowledge K that is anywhere close to providing a constrained function f unless it is somehow related to the knowledge of the current values--how could you select knowledge K out of the trillions++ of things you could talk about? And as soon as you select the knowledge K to narrow the function 'f', bingo you have constrained E just like present knowledge of the 500 coin flips. (I'm still thinking about that last point.)
So now to have K truly be independent of E, not constraining it by being related, K has to be derived by some automated procedure (or equivalent) that cannot have any limitation on E. Thus we search over all knowledge K element of Universal-knowledge available to subject S. But now we have the problem that S knows the result; how do we keep that from popping up? You have to blindly test all possible knowledge of subject S, and only knowledge that is as though S did not know of events E is allowed, as though gained separately. We are assuming a history of knowledge K is also known, so as to eliminate the connections. But we also assume (as above) that we know sufficient constraining issues of E that the setup was according to H, so we just eliminated all K.
So we actually allow K to be “causally” connected to E, in that it is just sufficiently causally connected so that it allows the randomness to be established as in H, but not more connected, either causally or in constraints to further change the view of the event E. This is what P(E|H)=P(E|H&K) means! K has causal and constraining knowledge of E, but not causal or constraining so as to change the probability from the original hypothesis.
Now we have the wideness of scope problem that I think Rex is referring to. Because subject S, though not knowing of all possible states of 500 coin flips individually, does have knowledge of the concept of any arbitrary coin flip sequence including E. Now I see where the "tractability" comes in--we can't allow S to search too long through her possible knowledges to get to the really esoteric ones or else almost anything goes. You can't allow the esoteric concept that "Oh each of these 2^500 possibilities could exist, let me simply enumerate them," wherein one of them happens to be the same pattern as our event E and it generates an f that perfectly constrains.
Now another alternative: Give up all abstractions like individual possibilities for the coin flips as part of knowledge K. Restrict K to strictly past states of events, or 'memory' of S. Now we have a different problem. Because each knowledge K is concrete. No intersection of H and K ever occurs, because presumably event E is an abstraction of possibilities, constrained by S's knowledge of the limitations and only those limitations in H. (Except for that one little bit, mentioned above, wherein subject S did happen to already know about event E directly--but remember we can't use that bit of knowledge. But that was the only non-abstract bit of knowledge about E that we had, the actual event corresponding to E itself.) That failed; it looks like the best option was the previous one: deal with the “wideness of scope problem”.
That last little bit was extremely important. All knowledge Ki of S that was based on the knowledge of the event E (so we don't have to spend forever searching independent knowledge and testing without knowledge of E) still has to be searched in some vaguely ordered manner. (At least in theory--e.g. we are listing all the possible abstract knowledge of S and abstractly searching that to find knowledge K that specify functions f and then also abstractly testing f to meet the criterion, all without S actually knowing of this per se due to independence.) Thus we are allowing all the abstractions that S could generate--thus the "tractability" concern to order them or else one simply waits for the knowledge matching E to pop out.
To make the last actually practical, we allow sub-optimal knowledge K to be used. In this case we allow knowledge K to depend on the event E outcome, so long as it is causally disconnected and only constraining such as containing knowledge of the outcome after the fact of the event. So the "tractability" disallows the direct knowledge of the event, and only allows patterns that are more "compressed" and still dependent only on outcome knowledge and not causal knowledge.
That is where Rex's comparison comes in! HA [ 26. September 2003, 01:57: Message edited by: gedanken ]
|
|
Nel
Member
Member # 614
|
posted 26. September 2003 11:29
Ged writes:
quote:
(Bold mine, italics in original.) And of course section 2.5 describes the same issue as Alonso quoted about the “detachability,” Alonso’s quote from 2.7 on the GCE procedure. This is a method of establishing “independence”. And this can be found in the index, under “Pattern, independently given”.
Your quote from No Free Lunch pretty much confirmed what I wrote here:
quote:
I have observed eyes, noses, mouth, face, way before I saw the easter island statues.
Yes, you must be able to identify the event without reading it off the event. Nevertheless, "bidirectional motor" is a specification as well; it is independent of the event, but it has nothing to do with the Easter Island statues!
That is why this quote shows that they are not competing issues. Notice that in the very next paragraph that you quote, Dembski writes:
quote:
Detachability can be understood as asking the following question: Given an event whose design is in question and a pattern describing it, would we be able to explicitly identify or exhibit that pattern if we had no knowledge which event occurred? Here is the idea. An event has occurred. A pattern describing the event is given. The event is one from a range of possible events. If all we knew was the range of possible events without any specifics about which event actually occurred (e.g. we know that tomorrow's weather will be rain or shine, but we do not know which), could we still identify the pattern describing the event? If so, the pattern is detachable from the event.
The event must fall within the range of possibilities, and if we can independently identify the event within this range, or as I said, if the independent pattern in this range of possibilities matches the observed event, then we have found the specification. Ged writes:
quote:
The GCE procedure’s “extremal set” is how the competing issues are resolved. The best possible match is found from independent sources, but no aspect that is not found independently can be used. Alonso if you don’t understand how the “extremal set” resolves the competing issues, you don’t understand the GCE procedure version of the EF.
I used the extremal set in its correct context when I quoted this part:
quote:
S identifies a rejection function f and therewith a rejection region R that includes E and that is an extremal set of f. R is therefore of the form T^gamma = {w in omega | f(w) >= gamma}....or delta...where gamma and delta are real numbers. Typically gamma is chosen as large as possible and delta as small as possible so that
Ged writes:
quote:
Right! But where have you specified the “eyes with pupils, mouth, and noses and hat” as a “specification”?
In my previous replies.
Ged writes:
quote:
And we of course have a figure with a “hat”, a “nose”, and mouth, chin, various features, but randomly generated.
But that's where the complexity criterion comes in. This pretty much addresses the rest of your post.
|
|
gedanken
Member
Member # 594
|
posted 26. September 2003 13:49
Alonso, "extremal set" is a form of mini-max. It balances competing issues. (Minimum rejection region that covers all or maximum cases that fit the function derived from K, just sufficient to cover event E. You adjust gamma or delta up and down with different competing objectives of covering E and rejecting as much of omega as possible. Tractability gives another set of competing interests.) I'm not going to waste more time on such trivialities.
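For readers who want to see the balancing act concretely, a small Python sketch follows. It uses a made-up setup (20 fair coin tosses, with f counting heads) purely as an illustration; the numbers and the helper names are assumptions and have nothing to do with the statues.

```python
from math import comb

N = 20  # hypothetical number of fair coin tosses


def f(outcome):
    """Rejection function: number of heads in the outcome string."""
    return outcome.count("H")


def region_probability(gamma, n=N):
    """P(f >= gamma) under the fair-coin hypothesis."""
    return sum(comb(n, k) for k in range(gamma, n + 1)) / 2 ** n


def extremal_gamma(event):
    """Largest gamma whose region T^gamma = {w : f(w) >= gamma} still holds E."""
    return max(g for g in range(N + 1) if f(event) >= g)


event = "H" * 18 + "TT"   # illustrative event with 18 heads
gamma = extremal_gamma(event)

# Raising gamma shrinks the region's probability, until it no longer covers E.
for g in (10, 15, gamma, gamma + 1):
    covers = f(event) >= g
    print(g, covers, region_probability(g))
```

Pushing gamma up makes the rejection region less probable, which is what one wants, but one step past f(E) the region no longer contains the event. The extremal choice sits exactly at that boundary.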
On your "specification" being spread across previous posts: That is my point. To have a single specification that was "given independent of the event" one has to find that specification given in a single place to evaluate its independence. Vagueness across multiple posts does not show independence. There still has not been any single specification that "univocally" determines function f. No "univocal" continuous function measure of degree of match has been specified. That is that point.
Then once again if 'f' is just a vague presence of a set of characteristics (true/false by vague notion of what the characteristic is) then a list has high probability of being met randomly, thus "low complexity". Alonso is exactly correct, the "complexity" measure handles this. So that specification's rejection region has not been demonstrated to meet criterion of an extremely small alpha probability cutoff, "design" is not inferred.
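To put rough numbers on the checklist point, here is a short Python sketch. The per-feature chances are pure placeholders chosen for illustration (nobody in this thread has measured them), the features are assumed independent, and the function names are made up. It only shows how far a short true/false feature list is from a 10^-150 bound.

```python
from math import ceil, log10

UPB_LOG10 = -150  # the 10^-150 universal probability bound cited in the thread


def log10_checklist_probability(feature_probs):
    """log10 of the chance that every listed feature appears, if independent."""
    return sum(log10(p) for p in feature_probs)


def features_needed(p, bound_log10=UPB_LOG10):
    """Independent features of chance p needed to reach the bound or below."""
    # round() guards against floating-point noise before taking the ceiling
    return ceil(round(bound_log10 / log10(p), 9))


# Hypothetical per-feature chances for a five-item list (eyes, pupils, nose,
# mouth, hat); these numbers are placeholders, not measurements.
checklist = [0.01] * 5
log_p = log10_checklist_probability(checklist)
below_bound = log_p < UPB_LOG10
needed = features_needed(0.01)

print(log_p, below_bound, needed)
```

Even granting each feature a generous one-in-a-hundred chance, five of them give about 10^-10, and it would take dozens of such independent features to approach the bound. A vague list does not get there by assertion; the probability has to be calculated.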
As a post idea I am working on will demonstrate, a "specification" can be written to match any event and appear "complex", if the alpha cutoff and constraint on the complexity of the "specification" are not controlled properly. (See Rex's and my discussion above; Rex has already demonstrated this. Even Dr. Dembski demonstrated this, NFL p.78--Rex is discussing errors in this presentation.) [ 26. September 2003, 15:53: Message edited by: gedanken ]
|
|
gedanken
Member
Member # 594
|
posted 30. September 2003 11:19
Rex, I’m going to work my present issues here. (I’ll get to the tractability error in the other thread soon because I am convinced they relate closely.)
But before I can get to the end of my big paper segment, I need to understand “ProbRes” and “detachability” completely.
At the risk of repeating some of what I said before, I want to use a little more precise mathematical symbology. And this is work in progress; I know it is not logically correct yet, and it is not exactly going where I want to go.
Concept brainstorming:
quote: [Subject] S identifies background knowledge K that explicitly and univocally identifies the rejection function f. Moreover, S confirms that K satisfies the conditional independence condition for each chance hypothesis in {H_i} (i in I), i.e., P(E|H_i&K) = P(E|H_i) for all i in the index set I ...
NFL p.72.
Essentially S identifies this information K out of a larger set of information L possessed by S. (K subset of L.) Now L constrains the real world, e.g. L constrains event E, and establishes both possibilities and probabilities for event E in W (‘Omega’ set of possible events).
Now Hypothesis H_i also constrains event E (and has additional causal aspects which in essence do not concern us because all we will discuss is the probabilistic constraint of E in W).
So I claim that H_i, K, and L are all subsets of all possible relevant knowledge V. (V does not have to include all knowledge, for example we don’t have to deal in set escalation problems, only distributions affecting events in W.)
Now I would like to give a relational operator “=:” wherein “knowledge k =: r” means that k subset of V, r subset of W, k “constrains” r or k “induces relationship in” r. (The ‘r’ could be a unitary subset of W, as in event E as well, taken as unitary subset rather than element of W. I’m pretty much going to ignore the distinction between unitary subset and element of W in this symbology.)
Very specifically H_i & K can thus be represented as H_i intersect K.
Now the “constraining” aspect is very important, because it establishes a probability model or distribution on r in W. (‘r’ represents any event or event set in W, such as E, or R.) If one knows too much, then one over-constrains r, increasing its probability. If one knows too little, then one frees up constraints to r too much, and thus reduces its probability. (Or equivalently expands the set r for the same probability integral.)
But if one only has a particular knowledge k that only constrains to a certain degree, then k establishes density function that I will call ‘g’ (since ‘f’ is in use).
Now P(R|H_i) = g * dU wherein ‘*’ is “dot” product operator and U is “privileged measure” induced by hypothesis H_i. (e.g. H_i =: U.) [Concept starting to have problem here, I’m not being consistent on sets, but perhaps you get the idea.]
Perhaps we should “induce” ordered pair: H_i =: (U, g), wherein then P(R|H_i) = g * dU.
Now P(E|H_i intersect K) = P(E|H_i) says about exactly the correct thing to denote ‘independence’ (being virtually the definition of ‘independence’), and exactly no more or less than was needed. The problem is, as Rex noted, the statement is not very informative. What I am trying to do is construct a more mathematical model of ‘knowledge’ that is more informative of that.
Now I think that in modifying the event from an ‘element’ of W to a ‘subset’, I can make E bold like other sets. Rather than treating the event as an element, we could treat it as a small “ball” in event space, making it comparable to R. Rather than E element of R, we simply change to E subset of R.
The usefulness is that now we can say
K =: ( g1: E X Real ), where g1 is a probability density function over event ball E.
and therefore P(E|K) = g1 * dE.
Then also H_i =: ( g2: E X Real )
Wherein g1 = g2. (or at least g1 * dE = g2 * dE).
Thus P(E|H_i intersect K) = P(E|H_i).
Key point is that each of L, K, and H_i (all subsets of V) all induce constraints on events, such as density functions g. But the “knowledge” can be “intersected” in the sense of the constraints induced probability density ‘g_x’ remain the same under the specified domains like event ball E.
Rex, I’m trying to construct a presentation of “background knowledge” that is more formal. The point is supposed to be that the background knowledge establishes the probability distribution; without that level of background knowledge we have not even limited ourselves to (for example) a particular coin flip having happened and having two states that are not further constrained, etc. Then the hypothesis (not specifically part of the present knowledge of S, but part of possible knowledge) also constrains event E. The key point is to make the “AND” condition in detachability ‘independence’ into an “intersection” operator, either on sets within knowledge itself, or on probability distribution functions induced by knowledge sets.
Where I got into trouble is deciding where the intersection should occur. It certainly can be required to occur in “knowledge” held by S, (and that might make sense), but it might possibly be in the induced functions. Right now I’m just presenting as is so far as a true “brainstorm”.
Also knowledge may be distinguished into two categories. But it is very important that we understand that these are closely related or overlapping categories. I am in fact intentionally combining them under the knowledge of subject S that I label L. The categories are distinct in the sense that some “knowledge” is of subjects of discussion. Other knowledge is of actual observation, possibly “facts”. For example any constraints that affect actual known aspects of event E are in the “fact” category. But hypotheses about E also in principle constrain event E (and its probability), so fall in the very same set structure. In fact it is an interaction of these kinds of “knowledge” that allow for the conjunction statement to even be meaningful to say “H_i&K” and talk about probabilities varying due to that conjunction.
BTW getting this down may be very useful to some of my point, rather than leaving it in more vague terms. My “big paper” so far has identified this area (“ProbRes” and “detachability”) as key in understanding difficulties in the difference between eliminative and comparative inference! And this is key in making a technical presentation of how “motive” and other aspects of knowledge L of subject S about event E can be related in comparative or eliminative inferences.
PS: I was thinking about “inducing” a triple:
r =: (g, U, c) where g is probability distribution, U is “privileged measure”, and c is “certainty” (e.g. “true”/”false” or “fairly certain”) in terms of S’s actual belief in correctness or truth of g, U. ‘c’ is only used to determine area of application, but knowledge can be intersected in some manner wherein only ‘c’ is variable, but particular limited g and U do not change if conditionally “independent”. I need to look at “detachable” again. [ 30. September 2003, 17:48: Message edited by: gedanken ]
|
|
gedanken
Member
Member # 594
|
posted 30. September 2003 13:08
Another aspect is to recognize that and how “ProbRes” is all about controlling for coincidence.
If one simply looks for patterns and events simultaneously, one has a higher chance of making a match than if one simply takes a given pattern and checks for events alone. And on the other side, given a pattern, if one looks over all of the universe or more limited area of the Earth for a particular pattern, then one may have a much greater opportunity to find such a pattern, even if in each instance that probability was only P of a match.
Thus if there are M opportunities for a match of probability P, one will find a very different distribution if the question is whether match will occur anywhere in all M opportunities, rather than in a specific case. So to control for coincidence, one must control by reducing the acceptable probability of match by 1/M.
Then if there are N opportunities to choose at random among possible specifications (which must be bounded somehow) from knowledge K, then to eliminate coincidence one must again reduce the acceptable probability of match by 1/N.
(N “bounded” in that elements of SpecRes must be limited, or one could easily create all possible specifications and thus always find both complexity and specification—simply by allowing the specification knowledge K to be sufficiently “complex” in a different sense. This is where the “tractability” issue psi comes into play, so as to limit that side. Rex has identified a mistake in this aspect, in a different thread.)
Since these aspects might be considered independent (they might be independent) then it is prudent to multiply them, and control for coincidence by reducing acceptable probability of match by 1/MN.
(ProbRes set is factored into “ReplRes” of size M and “SpecRes” of size N, forming a cross product set of size MN.)
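A minimal numerical sketch of the coincidence control described above, assuming the M·N opportunities are independent and each has the same match probability. The function names and the sample values of z, M and N are illustrative assumptions, not taken from NFL.

```python
def prob_any_match(p: float, opportunities: int) -> float:
    """Chance of at least one match among independent opportunities,
    each with match probability p."""
    return 1.0 - (1.0 - p) ** opportunities


def corrected_alpha(z: float, m: int, n: int) -> float:
    """Per-case cutoff alpha = z / (M N), chosen so that the overall
    chance of a coincidental match stays below z."""
    return z / (m * n)


if __name__ == "__main__":
    z, m, n = 0.5, 1000, 100                 # illustrative values only
    alpha = corrected_alpha(z, m, n)
    print(alpha)                             # per-case cutoff
    print(prob_any_match(alpha, m * n))      # overall chance, below z
```

The sketch relies on the bound 1 - (1 - p)^k <= k p, so scaling the cutoff by 1/MN is conservative when the opportunities really are independent.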
Now my question: Does this apply equally to an eliminative method, and to a comparative method? Suppose one has chosen a case to analyze, and one in fact has all the information necessary to make a completely Bayesian argument. The Bayesian argument, in a particular case, says that posterior probability favors “intelligent process” as cause, or alternately in a different case it favors “chance-non-intelligent process” as classifications of the particular hypotheses in each of the respective cases.
One aspect to consider: We desire “no false positives” at some inherent level that I shall call ‘z’, so does the probability P of event class T under the conditions need to be reduced so that z / (M N) > P in both the eliminative and the comparative analysis? (In other words, in the GCE procedure, alpha = z / (M N) > P, wherein Dr. Dembski chose 0.5 > z but did not specify it further.) Are there aspects of a comparative analysis (say Bayesian) that mitigate either the M or the N factor (assuming we want to keep the inherent “reliability” z constant)?
--
AAAK!! (Added in edit)
Of course! I just answered my question.
Being comparative of two assumptions does not give any help over an “eliminative” case per se in the “ProbRes” issue.
Why? Because the problem of coincidence is not based on a problem of whether one is doing comparison! Rather it is based on whether one has a mechanism or other property that can be described as relating to events and which happens or should happen according to the concept in multiple instances. It is multiple instances requiring a similar relationship that can eliminate both the “ReplRes” and “ProbRes” considerations, otherwise they are necessary.
The mechanistic relationship or other repeatable pattern is an argument that the same systematics are or must be occurring in the multiple instances being tested. If the concept being determined is an ad hoc concept, then finding it once in, say, a million trials could be due to coincidences that occur once in a million. But if the concept being investigated has relationships of connection to the multiple instances being investigated, then one has a criterion for determining that the concept should (or should not) apply in each of the ‘million’ cases scanned. If the concept was a rule of physics, for example, it should be found in each of the ‘million’ cases. But the ad hoc concept may be considered relevant even if it is found in only that one of a million cases.
Thus the control for coincidences must control for false positives by the additional factor of MN to account for its ad hoc nature without detail. When one has the detail, one knows in each of the MN potential cases whether the studied relationship should apply, and would thus require it of each of the MN cases. It would not be accepted if it only occurred in one of the MN cases, it would have to be consistent in its “reliability” across the board in the MN cases. Thus the “reliability” factor I’m calling ‘z’ does not need to be multiplied by 1/MN to account for coincidences.
For example if one looks through biology for something that lacks an evolutionary step connection (because there is a statistical distribution of lost information needed for the connection), then if one can look at millions of cases one could find such a case by coincidence only. But if one is looking at a mechanism or relationship that must be uniformly present, one must find that relationship consistent in each of the millions of cases. ID only needs be found in the ad hoc few cases in the way ID enthusiasts search, while evolutionary mechanism relationships must be compatible with each case. Thus with respect to the “reliability” factor desired, the MN occurrences of “ProbRes” cases need to be factored into the significance factor for the test. And once again, “ProbRes” has nothing to do with whether the test is comparative like a Bayesian or likelihood approach, but whether it is based on mechanism or repeating relationship that is required to be observable across the ensemble of MN cases. Are we looking for one case, or MN cases, across MN cases in the “ProbRes” resources?
Now of course this still does have everything to do with the nature of the EF “eliminative” inference, in that it has no details about the case to which the so-called “inference” is to be drawn. Since there are no details of the ID hypothesis, we have no basis for requiring those detailed instances to hold across the MN cases of the “ProbRes” possibilities. The ID inference is supposed to hold if it is ‘inferred’ even in only one of the MN cases, for that particular case. Thus the ID inference must consider the 1/MN factor in its reliability criterion. (And of course a forthcoming presentation will show that precisely that requirement makes it incredibly sensitive to ‘noise’ from missing distributions!)
OK, that answers that question, but the above issue of identifying and understanding the necessary relatedness yet conditional independence of specification and event called “detachable” needs to be understood. Thanks for everyone’s patience. [ 30. September 2003, 22:14: Message edited by: gedanken ]
|
|
Rex Kerr
Member
Member # 632
|
posted 01. October 2003 01:56
The more I think about it, the less I am convinced that detachability makes any sense at all. The basic problem that I am running into is that knowledge is not detachable from the world, and every method for classifying events or ascribing meaning/identity is ultimately self-referential.
I've been trying to think of a way to come up with appropriate conditions to salvage the situation, but there continue to be details that elude me. (Hopefully this is just due to a lack of time to think about them.)
Anyway, you've presented an interesting formulation here. I'll think about it for a bit before saying anything in particular.
|
|
gedanken
Member
Member # 594
|
posted 04. October 2003 16:14
Before proceeding, I think we need to understand some issues in Dr. Dembski’s presentation in NFL chapter 2, pages 50, 51, 54, and surrounding, and then its use on p.73.
There he discusses events in a space W (using Rex’s terminology where W is Dr. Dembski’s “Omega” or set of possible events, where E element of W).
The function ‘f’ is generalized as to be any function suitably identified on elements w of W, but the distribution is forgotten about and only used in summary statements like P(E|H_i). While we generalize the function ‘f’ to any suitably identified function for the analysis, the existence of the original density function is ignored in favor of these generalized statements. I would like to call suitably informed density functions ‘g’, defined over the same domains as generalized functions ‘f’. In other words f could be g, for the strictly “Fisherian” method, but is not necessarily so. And furthermore ‘g’ can be indexed or subscripted, as there may be different ‘g’ functions identified with different bits of knowledge.
Now a conditional probability distribution function on a parameter like gamma can be found. For example if we know P(T_gamma|H_i) that is calculated in essence by g*dU where U is in essence T_gamma, and H_i determines g.
‘*’ means a dot product. Here I want only to deal with notation and not the other problems, which I will return to in next post. Rather I want to present some ideas on notation in case there are problems.
What I am really wanting to do is to represent P(T_gamma|H_i) = g_i*d T_gamma. Here whether T_gamma is discrete or continuous, we simply get a sum or an integral of the density function g_i over T_gamma, to get the probability function. But here in many cases W itself determines g probability density, and H_i is simply identifying a distribution over a subset of W. If H_i is simply identifying the subset, then g_i = g, where g is determined by event set W itself! Thus various hypotheses H_i only change subsets of W that are active, and do not change the underlying “chance” distribution over the entire set at all.
Then in different cases, the g_i function would depend upon the hypothesis H_i, and the “active” subset of W might not change at all, as in all events were possible to some degree.
Then the distribution function G_fi(gamma) could be defined as per T_gamma = {w in W | f(w) >= gamma}, where G_fi(gamma) = g_i * d T_gamma, wherein T_gamma is a function of gamma and g_i is the distribution dependent on H_i. This ‘f’ and inequality could be generalized to a relation F (defined by function f and associated parameter) such that T_F(x) = {w in W | F(w,x)}. Therein F could be x>f(w), or f(w)>x, or any other parametric relation. Dr. Dembski only gives the two, less than and greater than, as the parametric relationships to identify the constraints, but in principle any relation F could be given.
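A minimal sketch of the “g * d T” notation for a discrete W, assuming a uniform density and a rejection function that counts “D” symbols. The helper names `dot` and `tail_set`, and the use of 5 draws as a small stand-in for 41, are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def dot(g, subset):
    """The 'g * d T' notation for a discrete set: sum the density g over the set."""
    return sum((g(w) for w in subset), Fraction(0))


def tail_set(omega, f, gamma):
    """T_gamma = {w in W | f(w) >= gamma}."""
    return [w for w in omega if f(w) >= gamma]


if __name__ == "__main__":
    n = 5                                                # small stand-in for 41 draws
    omega = ["".join(s) for s in product("DR", repeat=n)]
    g = lambda w: Fraction(1, 2 ** n)                    # uniform density under H_i
    f = lambda w: w.count("D")                           # rejection function: count of "D"
    for gamma in range(n + 1):
        print(gamma, dot(g, tail_set(omega, f, gamma)))  # G_f(gamma)
```

For a continuous W the same `dot` would have to become an integral; the sketch covers only the summation case.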
Thus to speak about independence, we talk about whether the information that produces the identification of relation F of the event and a parameter somehow also constrains the distributions themselves.
But most important to this post, I am wondering about the acceptability of the notation “g * d T” for distribution g and set T subset W, wherein “dot” product ‘*’ represents either summation or integration as necessary. I don’t know if this will help next section yet, either, but I thought I would throw that out before using it next post. [ 04. October 2003, 16:22: Message edited by: gedanken ]
|
|
gedanken
Member
Member # 594
|
posted 05. October 2003 12:50
For reference:
quote: [Subject] S identifies background knowledge K that explicitly and univocally identifies the rejection function f. Moreover, S confirms that K satisfies the conditional independence condition for each chance hypotheses in {H_i} (i in I), i.e., P(E|H_i&K) = P(E|H_i) for all i in the index set I ...
NFL p.72. (I shall be referring in part to my post a few above with a lot of symbology, and in part to Rex’s response just above.)
Thanks Rex, on previous comments.
quote: ... the less I am convinced that detachability makes any sense at all.
Alas, the essential problem seems to be representing possible connectedness of knowledge.
But of course the issue in “detachability” is disconnectedness from the processes of the “non-intelligent” process hypothesis, not the intelligent based process. And that only with respect to the step of process hypothesized, not all historical processes before that.
For example an intelligent agent sets up an experiment to have a certain distribution. (e.g. Caputo sets up 50/50 distribution). The issue is strictly the hypothesis of the non-intelligent process supposed at the process step, not the entire history. I can see how, for example, one might take characteristics from prior history of biological development as the “pattern” and thus they are not “independent” of prior historical development. As long as the non-intelligent process hypothesized does not in its “random” step leave out such a relevant distribution from the historical sequence that might be active, the independence could still be suitably defined.
That is of course what I am trying to get a more formal presentation, to discover or possibly make more sense of. (Of course if the "formal" presentation turns out to not be general enough, it would not deductively imply anything useful, but I am hoping to construct a symbolic representation that is general enough to investigate deductively.)
Now the strict "conditional independence" rule makes sense probabilistically, of course, if any real-world conditions can actually match that independence and still formulate other useful constraints used.
(I’ve got to point out the very tenuous nature of my post on symbology, as it was really a true “brainstorm” without necessarily fitting with either realistic means of judging “knowledge” or really necessarily going in the direction to capture what I wanted either.)
Here I wanted to go back to the intuitions that I wanted to capture:
Take an example of a chance device like a roulette wheel. This is a model of an equally-distributing randomizing process. The distribution of the slots are equally likely because of symmetry and lack of knowledge. (“Lack of knowledge” being a key point!)
Now consider a modified roulette wheel. This wheel has an ability for various slots to be blocked. And for the sake of simplicity, let’s consider that if the ball tries a “blocked” slot, it simply bounces randomly among all other slots. (I don’t want to consider detailed “physics” here, I’m constructing a mental model of how knowledge affects situations so I want to simplify. The “roulette” wheel could model any number of random symbols and integer symmetrical duplications of symbols chances since the wheel could be of any size and still have the nature described, and any symbol could be repeated any number of times, so I consider it a fairly general model.) In essence this is an equal probability distribution model, except that multiple slots map to the same symbol. Thus while the slots are “equal probability”, the symbols in W can have multiple occurrences and thus are scaled.
The essential model is that blocking leaves N slots unblocked. Each of those slots is therefore (by the symmetry assumptions of a normal roulette wheel) equally probable, so P(unblocked_slot_i) = 1/N. Note that N varies with any knowledge of “blocked” slots. Now if a given symbol i appears M_i times among the unblocked slots, then the probability of symbol i is P(i)=M_i/N. Of course if one of the symbol’s slots were blocked, then M_i would also change for that symbol, as well as N.
So without even considering any processes outside of the random drop of the ball into open slots, we can see that knowledge about “blocking” state of slots can either increase or decrease the probability of a given symbol being selected.
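A minimal sketch of the blocked-slot wheel, assuming every unblocked slot is equally likely. The function name and the four-slot wheel are hypothetical, chosen only to show that blocking can move a symbol’s probability in either direction.

```python
from collections import Counter
from fractions import Fraction


def symbol_probabilities(slots, blocked=frozenset()):
    """Return P(symbol) = M_i / N over the unblocked slots.

    slots   -- list of symbols, one per slot (a symbol may repeat)
    blocked -- set of slot indices the ball cannot land in
    """
    open_symbols = [s for i, s in enumerate(slots) if i not in blocked]
    n = len(open_symbols)                      # N: number of unblocked slots
    counts = Counter(open_symbols)             # M_i: unblocked slots per symbol
    return {sym: Fraction(m_i, n) for sym, m_i in counts.items()}


if __name__ == "__main__":
    wheel = ["a", "a", "b", "c"]               # hypothetical 4-slot wheel
    print(symbol_probabilities(wheel))         # no slots blocked
    print(symbol_probabilities(wheel, {3}))    # blocking "c" raises P(a) and P(b)
    print(symbol_probabilities(wheel, {0}))    # blocking one "a" slot lowers P(a)
```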
Initially at least this may not seem to have anything to do with the kind of “knowledge” that Dr. Dembski is talking about as generating specifications. But then think more generally; just think about the “roulette” wheel model increasing in size. The “wheel” simply provides a probability model for events, it simply is a descriptor for more general probability distribution. I’m trying to get at how “knowledge” could in any way constrain the probability we consider, and the “wheel” model is just a substitute for events with a particular distribution before we consider additional information.
For example consider a “fabrication”. That would be knowledge that the ball dropped into a particular slot, but knowledge obtained after the fact, after the spin. Remember that this is simply knowledge relating to the event E, just knowledge with a particular time stamp of after the event, obtained by directly viewing the event situation. So how would we otherwise have any knowledge that could constrain patterns T without knowing about the slot setup? We simply cannot have any pattern T without knowing something about the slot setup. The difference is that the “fabrication” knew more than the slot setup; it knew about the ball roll itself, after it occurred, which is further knowledge about the wheel.
So once again, if we were to restrict our knowledge so as to not know about the wheel, we could not construct a pattern T at all! That does not work. We need at least some knowledge of the number of slots and their symbols. So let’s try restricting to knowledge “before” the event. While that is overly restrictive for the kinds of cases we want to analyze, it turns out that even that restriction is not sufficient! Suppose we simply knew about slots that were blocked, before the roll. But that is knowledge that should be considered in H_i, the set of all relevant chance distributions! Thus for our knowledge L of subject S to include that knowledge would require that it be considered in H_i. Such knowledge should also be included in “all known probability distributions”! That does not preclude it from being knowledge for pattern T.
We might be able to consider our “wheel” model in terms of a case we are more familiar with: Caputo’s drawing of “D” and “R” symbols. Here the “wheel” slots are labeled with a sequence of 41 “D” and “R” symbols, all 2^41 combinations. (This illustrates that the “wheel” model was intended to include all sub-parts of the “event” under a single symbol, and we simply apportion slots to equivalent probability of compound event parts if present.) And the distribution in this case is even on all unblocked slots.
Now if we knew that Caputo fixed the drawings, that would be equivalent to knowing that Caputo blocked off all the slots except for, say, the 42 slots that consisted of 1 or 0 “R” symbols. But of course if we knew that, we would have to include that knowledge in our “relevant probability distributions”! So what if we knew that was a possibility? Here we must interpret carefully. Knowledge of the “intelligent agent” action is excluded in the EF. So knowledge that Caputo might have fixed the drawing is excluded by the structure of the EF test. It is not that such knowledge must fail to be part of the larger body of knowledge L of subject S as I have described it several posts above; rather, it is not allowed by the classification scheme of the EF to be part of either K or H_i.
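A short check of the slot count used above, assuming 41 draws and all 2^41 sequences equally likely on the unblocked wheel. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

DRAWS = 41  # number of ballot-order drawings in the Caputo example


def count_at_most(r_max: int, draws: int = DRAWS) -> int:
    """Number of D/R sequences of the given length with at most r_max 'R' symbols."""
    return sum(comb(draws, r) for r in range(r_max + 1))


def prob_at_most(r_max: int, draws: int = DRAWS) -> Fraction:
    """Probability of at most r_max 'R' symbols when all 2**draws sequences
    are equally likely."""
    return Fraction(count_at_most(r_max, draws), 2 ** draws)


if __name__ == "__main__":
    print(count_at_most(1))          # 42 slots with 1 or 0 "R" symbols
    print(float(prob_at_most(1)))    # chance of landing in one of them on a fair wheel
```

The count is 1 sequence with no “R” plus 41 sequences with exactly one “R”, which gives the 42 slots.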
BUT we are allowed independent knowledge that patterns of Caputo fixing the drawing are possible. By that we must separate that knowledge from the probability of the event by some wave of the hand, e.g. knowledge of possible intelligent agent action is excluded from the probability of the event, but included in pattern T.
I think this may be the concept I was starting to get at with knowledge k “inducing” a tuple in my symbology: knowledge k =: (U , g, F, c), where c is a classification or classification structure of the knowledge, U is in our case simply a subset of W, and f is a function from W to Real. F is a relation with ‘f’ and a parameter, such as f(w)>gamma, w in W. U in our case of the “wheel” model must necessarily be a discrete set, but in other models could be continuous. U is more or less a subset of W, but possibly the whole set, intended to represent the domain of density function g. In other words ‘g’ is defined for domain U, where U is a subset of W.
Referring to my previous post, the idea is that F is some parametric relation, as in f(w)>gamma, along with its parameter. ‘g’ is the probability distribution generated from the knowledge.
Now this tuple (U , g, F, c) is induced by knowledge k from all knowledge L of subject S. The tuple (U , g, F, c) does not have to actually do any “restricting”. For example if the set U were empty, then g mapping there would have no elements, and thus could be a “non issue”. Likewise constraint F could simply be true all the time, thus providing no constraint. The idea is that the tuple allows the knowledge k to induce part or all of the various relationships we deal with on the event or event set.
Some examples: k =: (U , g, F, c), wherein F is always true, U=W, and g sets a distribution. This occurs in a couple of cases. One is the hypothesis H of random occurrence of all symbols. Thus hypothesis H of random draw over all 41 draws, 50/50 distribution, sets function ‘g’ to 2^-41 (that is, 1/2^41) for all symbols. But its classification ‘c’ is that it is a possibility, not that it is “known” by subject S. Thus it is only possible knowledge, not actual knowledge. Thus the hypothesis H falls in our knowledge structure as induced from particular bit k of subject S’s knowledge.
Another example: k =: (U , g, F, c), wherein F is always true, U=W, and g sets a distribution. In this case, suppose we know that the result of the draw sequence that occurred is “DDDDDDDDDDDDDDDDDDDDDDRDDDDDDDDDDDDDDDDDD”. Thus the knowledge of the actual draw falls in our knowledge structure as induced from particular bit k of subject S’s knowledge. Distribution ‘g’ is 0.0 for all symbols except for the known answer, wherein ‘g’ is 1.0 (or nearly so). We are virtually certain of the final draw results at the present time. Classification ‘c’ is that the knowledge comes after the event and is of the event, and it is known with certainty. (And in our “roulette wheel” model, this corresponds to Caputo blocking off all the slots except for the low “R” count slots.)
Another example: k =: (U , g, F, c), wherein U={} and g is empty. In this case F is not trivially true; it is determined from function f(x), x element of W, wherein f counts the “R” symbols. F is simply setting gamma>f(x). Thus the knowledge of counting “R” symbols falls in our knowledge structure as induced from particular bit k of subject S’s knowledge. (We must note that counting “R” symbols is more specific than simply counting in general. We have had to adapt our knowledge of counting in general to the specific case in question, to make it relevant!)
Can we have an example k that affects all terms of the tuple? How about if S had knowledge that Caputo fixed the ballot setting drawings so as to have a small count of “R” symbols in our symbol string measurement of the event? This case is of knowledge not that it was a possibility, rather that it had occurred! But this is still knowledge without knowing the actual draws, and lack of “independence” has to be shown probabilistically as the causal chain is the same as Caputo setting up a fair draw. k =: (U , g, F, c), wherein count function f is determined as in the last example as a counting function, but also ‘g’ is determined so as to weight the low “R” count symbols with high probability, and higher count symbols with low probability. Thus in this case the whole knowledge-induced tuple is filled out with meaningful entries. Classification ‘c’ is of present knowledge after the event! And this affects the probability distribution, e.g. P(E|H) /= P(E|H&k), which I will demonstrate in more detail later.
Now how about knowledge of the possibility of above case? Then the classification ‘c’ is only of “possibility”. But the distribution is still changed. Thus that knowledge k is excluded by the rules.
By the way, I’m hoping we can get along with the “generalized function” concept for f, so as to not have to get me involved in “measure theory”. Rather accept that g * dU is a meaningful dot product that gives either an integral or a sum or some admixture, depending on details of U domain continuity. I don’t think this is important beyond recognizing that if U is discrete then g * dU is simply the sum of function g applied to each discrete element of set U, and if U is continuous then g * dU is simply the integral of function g over region U.
The combination of knowledge winds up as some form of overriding of functions and sets, such that the actual distributions are calculated with the overridden and thus assembled functions and sets. Combine the hypothesis distribution and override that with some direct knowledge of events, and the distribution changes to a new g’ and W’ (W’ derived from the combination of original W and g of hypothesis, and overridden with added knowledge, somehow using classification ‘c’ to determine substitution rules). So P(E|conditions) would be the following. E = {E}, making E into a set. Then P(E|conditions) = g’*d E. Likewise P(R|conditions) would be g’ * d R. So from the “tuple” generated, we can develop rules to directly calculate the probabilities being discussed, directly in terms of the tuple entries. This is not yet carefully developed; I’m simply trying to demonstrate that it would be possible to do so if the concept was refined and made more rigorous.
The tuple is induced from a sub-portion of our knowledge: k =: (U , g, F, c). And that tuple can completely represent a number of statements. For example P(E|H). The tuple has U=W, g is the hypothesized distribution, and F constrains to true for symbol E, false otherwise. (‘c’ is classified as a hypothesis; it is still unclear how to handle this.) Thus P(E|H) = g * d (U intersect E), wherein E was determined by relation F in the tuple. Likewise for fixed region R, F specifies R, either directly or with a parametric relation as above. P(R|H) is completely defined as g * d (U intersect R), wherein R was determined by relation F in the tuple.
Key point is that all the probability statements made in the EF can be represented by the tuple structure, if the concept is properly refined. In each case of tuple (U , g, F, c), relation F determines a set r subset of W. The derived probability relation is uniformly P(r|condition) = g * d (U intersect r), wherein r was determined by relation F in the tuple. The “condition” is whatever ‘g’ and ‘U’ are derived from, given in appropriate symbolic form which may be determined from outside of the tuple itself, but which must be mathematically equivalent to the conditions described in distribution ‘g’ over subset ‘U’. For ensemble statements like pattern T with parameter ‘gamma’, for example, the relation F is simply parameterized with gamma, the method is the same and is represented by yet another tuple (U , g, F, c).
Statement combination can be fairly rigorously defined. Let me represent “statement relates to tuple” with “:=:” in the following.
Event sets ‘T’ are representable as
T :=: ({}, {}, w in T, c_T)
In this case we have no difficulty combining T in the “given” or “|” relation:
T | B :=: ({}, {}, w in T, c_T) | (Ub, g_b, F_b, c_b) = (Ub, g’_b, F, c_tb),
where F(w) = F_b(w) & (w in T). Relation F(w) thus gives the set r = {w in W | F(w)}. We must normalize g’_b by g’_b(w) = g_b(w) / (g_b * d Ub). So P(T|B) = g’_b * d (Ub intersect r).
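The “given” construction can be sketched the same way, again assuming a finite W so that the normalizer g_b * d Ub is a plain sum. The function names, the tuple layout as a Python 4-tuple, and the label used for c_tb are assumptions made for illustration only.

```python
from fractions import Fraction


def condition(T, B):
    """Build the tuple for T | B = (Ub, g'_b, F, c_tb)."""
    U_b, g_b, F_b, c_b = B
    total = sum((g_b[w] for w in U_b), Fraction(0))   # g_b * d Ub
    g_norm = {w: g_b[w] / total for w in U_b}         # g'_b(w)

    def F(w):
        # F(w) = F_b(w) & (w in T)
        return F_b(w) and w in T

    return (U_b, g_norm, F, ("given", c_b))           # c_tb label is assumed


def prob(k, W):
    """P = g * d (U intersect r), with r = {w in W | F(w)}."""
    U, g, F, _ = k
    r = {w for w in W if F(w)}
    return sum((g[w] for w in U & r), Fraction(0))


# Assumed example: a fair die, B = "the roll is even", T = {4, 5, 6}.
W = frozenset(range(1, 7))
B = (
    frozenset({2, 4, 6}),                 # Ub
    {w: Fraction(1, 6) for w in W},       # g_b
    lambda w: True,                       # F_b imposes no further constraint
    "background",                         # c_b
)
T = frozenset({4, 5, 6})
p_T_given_B = prob(condition(T, B), W)
```

In this example Ub intersect r is {4, 6} and each normalized weight is 1/3, which agrees with the usual conditional probability for a fair die.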
Combinations A & B:
A & B :=: (Ua, g_a, F_a, c_a) & (Ub, g_b, F_b, c_b) = (U, g, F, c)
U = (Ua union Ub)
g: W X Real = {g | g(w) = g_a(w) if rule(w, c_a, c_b), else g_b(w)}
where “rule” is a combinational priority operator based on classifications “c_a” and “c_b” that determines which overrides which.
c = who knows the combined classification?
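One way the override could work is sketched below. The text leaves “rule”, the combined F, and the combined c open, so everything specific here is an assumption: the PRIORITY ranking (an observation outranks a hypothesis), the choice to let A win wherever B says nothing, and the renormalization step borrowed from the “given” construction. The relation F is omitted for that reason.

```python
from fractions import Fraction

# Assumed ranking: higher number overrides lower. Not specified in the text.
PRIORITY = {"hypothesis": 0, "observation": 1}


def make_rule(U_a, U_b):
    """Return rule(w, c_a, c_b): True when A's distribution wins at w."""
    def rule(w, c_a, c_b):
        if w not in U_b:
            return True            # B says nothing about w
        if w not in U_a:
            return False           # only B covers w
        return PRIORITY[c_a] >= PRIORITY[c_b]
    return rule


def combine(A, B):
    """U = Ua union Ub; g(w) = g_a(w) if rule(w, c_a, c_b) else g_b(w)."""
    U_a, g_a, c_a = A
    U_b, g_b, c_b = B
    rule = make_rule(U_a, U_b)
    U = U_a | U_b
    g = {w: (g_a[w] if rule(w, c_a, c_b) else g_b[w]) for w in U}
    return U, g


def renormalize(g):
    """Rescale weights to sum to 1 (an added assumption, see lead-in)."""
    total = sum(g.values(), Fraction(0))
    return {w: p / total for w, p in g.items()}


# Assumed example: a uniform hypothesis over six slots, overridden by the
# observation that slot 6 is blocked off.
W = frozenset(range(1, 7))
A = (W, {w: Fraction(1, 6) for w in W}, "hypothesis")
B = (frozenset({6}), {6: Fraction(0)}, "observation")
U, g = combine(A, B)
g = renormalize(g)
p_even = sum((g[w] for w in U if w % 2 == 0), Fraction(0))
```

Blocking one slot changes the probability of every region that touched it, which is the sense in which added knowledge changes the distribution.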
In this concept, we see that a uniform “tuple” of symbolically representable aspects induced from knowledge bit k can lead to a uniform representation of event conditionals. The meaning of combining different bits of knowledge, as in the combination “H&k”, can probably be represented with a little more study. The combination is meaningful through some sort of override: one piece of knowledge overrides the other’s distribution. The classification ‘c’ of the knowledge could be used in developing the override rules.
Thus P(T|A&B) = g * d (U intersect r), where r is determined by the relation F, and U and g are the combined set and distribution given above for A & B.
Knowledge k can include aspects like knowledge of the opening up or blocking off of “roulette wheel” symbol locations. All relevant knowledge is a discussion of some such equivalent: it must discuss “wheel slot” equivalent induced functions or it is not relevant. Knowledge includes revelations of possible slot identifications, as well as slot configuration equivalents. All of this can change distributions, and fails to change them only in carefully controlled conditions that could fairly be declared “conditionally independent”. That “conditional independence” holds only with respect to certain conditions and is not necessarily general, but “conditional” independence is all that is required.
The important point in this development is that hypotheses, events, and other kinds of background knowledge can potentially be represented in a uniform manner. The key and fundamental point is that the various kinds of knowledge k that induce various conditions, hypotheses, and events are not of dramatically different character, but are all of very nearly similar character. Thus I think that the result of a study of this sort would support Rex’s contention that it is difficult to separate background knowledge from events in many cases. For example, in biological systems the background knowledge depends on our biological experience, as well as the events being studied. So to keep independence, all events must be constrained to state changes wherein the previous state is assumed, or else the distributions will easily be changed by the knowledge chosen as the pattern. The principal development here is not a completed system, but a demonstration that knowledge of subject S can be given a fairly uniform representation for the various parts of the explanatory filter procedure. And the main point is that the knowledge “independence” issue is not insignificant, and cannot so readily be assumed away as not a problem.
At this point I am going to abandon this approach. The symbology is too difficult for the small gain in understanding that it could produce. In my opinion the issues can be explained in plain English, without further clarification through highly technical development. But I thought this exercise was very interesting and informative. [ 05. October 2003, 15:02: Message edited by: gedanken ]
|
|
gedanken
Member
Member # 594
|
posted 05. November 2003 14:58
I hope the moderator will excuse this wholesale copy of Dr. Dembski's argument, but I think it will become important in forthcoming entries here. I want the whole presentation to be readily available.
ARN topic: "Theft over toil", comment by Dr. Dembski
quote: Forthcoming in The Design Revolution (due out January):
In an article titled “The Advantages of Theft over Toil: The Design Inference and Arguing from Ignorance” for the journal Biology and Philosophy (2001), John Wilkins and Wesley Elsberry argue that the filter is not a reliable indicator of design. Central to their argument is that if we fail to characterize the full range of natural necessities and chance processes that might have been operating to account for a phenomenon, we may omit an undirected natural cause that renders the phenomenon likely and thereby adequately accounts for it apart from design. Thus, with the combination lock example given above, they consider a poorly constructed lock for which the probability of opening it by chance is much larger than one in 10 billion. Granted, this could happen. But it could also happen that the lock requires dialing the right combination so precisely that the chance of opening it by chance is in fact far smaller than one in 10 billion. Further investigation of the lock could therefore upset or reinforce a design inference.
The prospect of further knowledge upsetting a design inference poses a risk for the Explanatory Filter. But it is a risk endemic to all of scientific inquiry. Indeed, it merely restates the problem of induction—namely, that we may be wrong about the regularities (be they probabilistic or necessitarian) that operated in the past and apply in the present. Wilkins and Elsberry act as if no amount of investigation into a phenomenon is enough to reasonably rule out natural necessities and chance processes as its cause. Yet if design in nature is real, their recommendation ensures we’ll never see it (see chapters 26 and 32).
Contrary to Wilkins and Elsberry, the risk of further knowledge upsetting a design inference has nothing to do with the filter’s reliability. The filter’s reliability refers to its accuracy in detecting design provided we have accurately assessed the probabilities in question (see chapter 12). Wilkins and Elsberry purport to criticize the filter’s reliability but are in fact criticizing its applicability (see chapter 14). They’re like someone who dismisses a calculator as unreliable after a friend, seeking to know what “9 times 9” is, gets the wrong answer by accidentally punching “6 times 6.” If that person were dead-set on dismissing the calculator’s usefulness but were pressed to admit that the friend had made the error, then the calculator hater might insist that nobody can be trusted to use the calculator accurately. That’s essentially what Wilkins and Elsberry have done with the Explanatory Filter.
To so refuse the Explanatory Filter’s applicability irrationally privileges undirected natural causes and renders them immune to disconfirmation. Science is supposed to be a risky enterprise. We go to nature to discover her secrets because we do not know what those secrets are till we look. It follows that what nature reveals to us can be unexpected and even disconcerting (witness the difficulties physicists faced making sense of quantum mechanics in the 1920s and 30s). Yet when it comes to design, Wilkins and Elsberry want a risk-free science. They want their science safely cosseted within a naturalistic cocoon that excludes any place for design in the natural sciences. But such a risk-free science is no science at all. It knows the truth without looking. So when evidence comes that challenges it, it arbitrarily rules that evidence inadmissible.
-------------------- Bill Dembski
|
|
Ermete22
Member
Member # 970
|
posted 07. November 2003 08:57
Well, I'm a new member, and I still have some trouble identifying the style and habits of the forum, so I apologize for possible "gaffes". On the subject, my opinion is quite precise. Whatever motivation is, it is a linguistic-descriptional concept, which does not refer (and cannot refer) to the computations carried out by the system whose motivations we are discussing. Computations are, even for super-Turing machines, purely formal, effective procedures. We can attribute motivations to formal procedures, but this is again one of our linguistic moves. So intelligence does not imply motive; the identification of apparent descriptional motives can support the attribution of intelligence to some system, but only with a lot of care and attention to circularity traps. One could, or maybe should, be more rigid, but in my opinion a priori rigidity risks being a form of stupidity.
|
|
warren_bergerson
Member
Member # 262
|
posted 07. November 2003 09:52
Ermete,
You are, IMO, looking at the question from the wrong direction. Logical operations, as you point out, do not inherently contain motive, purpose, or goal. But if you are to create a set of logical operations or a program to simulate intelligent behavior, you MUST assume a goal, purpose, or motive. Even in simulating the fringes of intelligent behavior, such as visual recognition or control of motion, you must assume purpose or motive. All robotic programs are based on goals or motives. All genetic algorithms are based on goals or motives. If you assume goals and motives, it appears to be theoretically possible to simulate any intelligent behavior. Fail to assume goals or motives, and it is impossible to simulate any form of intelligent behavior.
Whether or not motive or goals are an inherent feature of the universe is irrelevant. In order to perform scientific analysis of intelligent behavior you must assume purpose or motive.
|
|
|