Topic: “Design-centric ID” as statistical hypothesis testing
brauer
Member
Member # 398
posted 07. October 2002 16:10
“Design-centric ID” as statistical hypothesis testing
It occurs to me that the program of the so-called “Design-centric ID” is to reject the following null hypothesis with a high enough degree of confidence:
“H0: All empirically observable phenomena in the universe are products of chance processes, regularity processes or combinations of the two.”
The statistic used to test for the rejection of the null hypothesis is the quantity of “CSI” in the system. By setting the rejection region to be highly conservative (the “universal probability bound” (UPB) is exceedingly small), one hopes that the region lies at the very far extreme of any possible distribution of the test statistic. There are three problems with this approach that I think will need to be addressed in a rigorous manner before “Design-centric ID” can be equipped to achieve its goals.
Null Distributions of the Test Statistic
First (and this has been brought up by others), the distribution of the test statistic for even very simple phenomena is entirely unknown. Consider the example discussed elsewhere on this board: rejecting hypotheses of chance and necessity for the angular diameter of the sun relative to the moon. Bill Dembski offered a first approximation to the distribution of the CSI test statistic when he assumed a uniform distribution of disk diameters and defined a reasonable tolerance for the difference between them. He noted correctly that we have no way of concocting a distribution that takes into account a large number of unknown (unknowable?) events, and that the simple distribution was probably the best that we could do.
But note that assuming a uniform distribution (or any simple distribution, for that matter) of disk size reduces a complex system in 4-dimensional planetary dynamics to a simple question in planar geometry. It ignores what we know (or think we know) about the earth-moon system, as well as those parameters whose values we don’t know but think might be important. That may be the best that we can do. The question is: is it good enough? My suspicion is that it’s not.
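To make the sensitivity concrete, here is a minimal sketch of that kind of first-approximation calculation. It is not Dembski's actual computation: the function name `match_probability`, the interval widths, and the tolerance are hypothetical values chosen purely for illustration. It assumes the two angular diameters are independent draws from the same uniform distribution and asks how likely they are to agree within a tolerance.

```python
import math

def match_probability(tolerance, range_width):
    """P(|X - Y| <= tolerance) for X, Y independent and uniform
    on an interval of width range_width (an illustrative assumption)."""
    r = min(tolerance / range_width, 1.0)
    # Area of the band |x - y| <= r inside the unit square: 1 - (1 - r)^2
    return 2.0 * r - r * r

# Hypothetical numbers: the same tolerance under three assumed ranges.
for width in (1.0, 10.0, 100.0):
    p = match_probability(0.01, width)
    print(f"range width {width:6.1f}: p = {p:.3e}, -log2(p) = {-math.log2(p):.1f} bits")
```

For small tolerances the probability scales roughly in inverse proportion to the assumed range, so the result is driven by a modeling choice made before any of the dynamics enter.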
The UPB as a Conservative Fudge Factor to Account for Ignorance of the Relevant Distribution
The second question I have is related. It seems to me that Dembski chooses the cutoff value for his test statistic in order to be extremely conservative. And correctly so: he does not want to bias his results to confirm his expectations! But my feeling is that adopting such an extreme cutoff value as the UPB is a way to sweep the uncertainties about the test statistic’s distribution under the rug. It’s certainly true that using the UPB to delimit the rejection region would probably be a safe bet for just about any distribution of CSI that one could imagine. However, it is not clear to me that doing so is a panacea for complete ignorance of the actual distribution.
As an example, suppose I wanted to reject the hypothesis that a particular sample of birds came from a population with a certain bill length. I don’t know anything about the bill lengths of birds within the population. So I assume, say, a gamma distribution with an arbitrary mean and variance. Since I realize I’m contriving the distribution, I decide to be conservative and reject the null hypothesis only if the sample mean is beyond x sample standard deviations from zero. No matter how big I make x, the fact remains that I am covering up my ignorance of the distribution of bill length by inflating the significance value of this particular test. How robust are my results to these assumptions? There’s no way of knowing.
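A minimal sketch of how much the assumed shape matters, under simplifying assumptions that are not part of the example itself: it looks at a single observation rather than a sample mean, measures the cutoff in standard deviations above the distribution's own mean, and restricts the gamma distribution to integer shape (the Erlang case) so that the tail has a closed form. The function names are hypothetical.

```python
import math

def normal_upper_tail(x):
    """P(Z > x) for a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def erlang_upper_tail(shape, x):
    """P(G > mean + x * sd) for a gamma variable with integer shape and scale 1.

    Such a variable has mean = shape and sd = sqrt(shape).
    """
    t = shape + x * math.sqrt(shape)
    # Closed-form survival function of the Erlang distribution
    return math.exp(-t) * sum(t**i / math.factorial(i) for i in range(shape))

# Same nominal cutoff (x standard deviations), different assumed shapes.
for x in (3, 5, 8):
    print(f"x = {x}: normal {normal_upper_tail(x):.2e}, "
          f"gamma(shape=1) {erlang_upper_tail(1, x):.2e}, "
          f"gamma(shape=4) {erlang_upper_tail(4, x):.2e}")
```

The same "x standard deviations" cutoff corresponds to tail probabilities that differ by orders of magnitude across these shapes, and the gap widens as x grows. A larger x does not remove the dependence on the assumed distribution.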
Multiple Tests of Design
My final objection relates to the fact that a test to reject H0 is actually a composite test to reject the comparable hypothesis for each and every phenomenon examined. That is: we examine phenomenon 1 and determine if we can reject all hypotheses of chance and necessity. We then go on to phenomenon 2. This continues until we get tired or until we accumulate enough examples of “designed” phenomena that we feel comfortable in asserting that the design of some natural phenomenon exists. If for any given phenomenon the chance/necessity hypothesis is rejected, then not only can we say that we believe that phenomenon to be designed, but also that, therefore, some designed phenomenon exists (which conclusion is the goal of “design-centric” ID).
Rejecting the hypothesis H0 above depends upon a large number of tests. Furthermore, the tests are continued until, presumably, some one test reveals design. This is a no-no in experimental design. The multiple-test problem could be addressed by introducing a Bonferroni correction to the significance value, say, by accepting only instances where the CSI is much smaller than that indicated by the UPB. But the practice of making repeated statistical tests until the desired outcome is found is known to be fraught with potential for artifactual bias. The standard practice is to determine, a priori, what tests will be done, and then decide on the rejection of H0 at the conclusion of the experiment. If this is not done, the statistical power of the test of the composite hypothesis rapidly approaches zero. (Note that this problem is avoided in a likelihood framework. See Royall, RM. 1997. Statistical Evidence: A Likelihood Paradigm. Chapman & Hall, NY. An excellent book.)
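The inflation from repeated testing can be sketched with the textbook family-wise error rate. This assumes n independent tests, each run at the same level alpha, which is an idealization. The function names and the values 0.05 and n are illustrative, and nothing here is specific to CSI or the UPB.

```python
import math

def familywise_error_rate(alpha, n_tests):
    """P(at least one false rejection) over n independent tests at level alpha.

    Computes 1 - (1 - alpha)**n via log1p/expm1 so that it stays
    accurate even when alpha is extremely small.
    """
    return -math.expm1(n_tests * math.log1p(-alpha))

def bonferroni_level(alpha, n_tests):
    """Per-test level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests

for n in (1, 10, 100, 1000):
    raw = familywise_error_rate(0.05, n)
    corrected = familywise_error_rate(bonferroni_level(0.05, n), n)
    print(f"n = {n:4d}: uncorrected {raw:.4f}, Bonferroni-corrected {corrected:.4f}")
```

The Bonferroni level alpha/n holds the family-wise rate at or below alpha even without independence, since it follows from the union bound, but it requires fixing n in advance. It offers no protection when testing simply continues until something is rejected.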
An additional problem arises when one considers that the phenomena being sent through the explanatory filter may or may not be independent, and that the independence may depend in large part on whether design happened or not! We thus have multiple tests of potentially dependent samples, with no a priori defined decision point. This experimental design would not pass a clinical statistical review, no matter what the context.
Conclusion
There are difficulties with treating “design-centric ID” within a statistical framework of standard hypothesis testing. Furthermore, these problems serve to bias the conclusion of the test towards one of design. While the developers of the statistical theory for detecting design have been admirably (perhaps excessively!) conservative in some assumptions, these questions, left unanswered, are enough to diminish the value of this conservatism.
[typo edits] [ 07. October 2002, 16:14: Message edited by: brauer ]
YZ2
Member
Member # 91
posted 10. October 2002 13:24
Let's assume that design occurs in nature and that detecting it is the goal. I can think of 3 ways of detecting design. There may be more, but let's focus on these for the sake of discussion:
1) Consider attributes such as IC and run "knock-out" experiments on the object's parts. If the target object demonstrates design qualities through deletion, then conclude that the object is designed.
2) Compare the target object to an idealized design object. If the target object is "sufficiently" close to the idealized object, then conclude that the target is designed.
These two ways require that the target object is very likely designed for the evaluation to be fruitful. Otherwise the evaluation may end up with a negative result.
A third way is by means of statistical inference in a search space of potential design objects. The goal is to narrow the search in a very large space by estimating the presence of design qualities. The advantage of the statistical approach is that it can balance "false positives" against "false negatives". A false positive is undesirable in science because it claims that an object is designed when it is not. But a false negative is undesirable too, because it concludes that an object is not designed when it is. Whether current statistical methods are adequate for such a test is beside the point. The issue is whether statistical inference can, in principle, narrow down the search for design by estimating design qualities in the objects, possibly for further confirmation. This is a different proposal from the conclusive test that you have mentioned. The question is whether statistical inference, or some form of it, is capable of relating an object to design qualities through estimation. I think that, in principle, it can.
Thus the contribution of ID to science is both the development of "design detection" methodologies and the detection of design itself as a way of understanding how design in nature works, such as the interactions of its parts. I think the statistical approach can be involved in both.
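The trade-off can be sketched with a toy model. Everything in it is a hypothetical assumption for illustration: a scalar "design score" that is normally distributed with mean 0 for undesigned objects and mean 2 for designed ones, both with unit variance, and an object flagged as designed when its score exceeds a threshold.

```python
import math

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal variable."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2)))

def error_rates(threshold, designed_mean=2.0):
    """Return (false_positive, false_negative) for a given score threshold.

    Undesigned objects score N(0, 1); designed objects score
    N(designed_mean, 1). Both distributions are illustrative assumptions.
    """
    false_positive = 1.0 - normal_cdf(threshold, 0.0)       # undesigned, but flagged
    false_negative = normal_cdf(threshold, designed_mean)   # designed, but missed
    return false_positive, false_negative

for threshold in (0.0, 1.0, 2.0, 3.0, 4.0):
    fp, fn = error_rates(threshold)
    print(f"threshold {threshold:.1f}: false positive {fp:.4f}, false negative {fn:.4f}")
```

Raising the threshold lowers the false positive rate and raises the false negative rate. A screening use of the statistic would pick a threshold with that balance in mind, whereas a conclusive test pushes the threshold as far out as possible.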
Quote: "If we knew what it was we were doing, it would not be called research, would it?" -- Albert Einstein. [ 13. June 2003, 10:39: Message edited by: YZ2 ]
brauer
Member
Member # 398
posted 14. October 2002 11:58
I'm kind of hoping that someone who understands the concept of CSI better than I do can correct me if my assumptions about it are wrong.
I guess the one question I'd most like answered is this:
Is the "Explanatory Filter" in fact intended to be equivalent to a statistical hypothesis test, in which the null hypothesis of "no design" is rejected based upon the value of a test statistic represented by "CSI"?
Anyone?