Topic: Validation (from Stone Circles thread)
|
brauer
Member
Member # 398
|
posted 29. January 2003 12:22
I'm a little bit frustrated at my inability to communicate the concept of validation in the Stone Circles thread. Please permit me to try again here with an analogy from a different field. Also, forgive the cartoonish nature of the example – I’m trying to keep the discussion light and simple, while capturing the core of the problem.
My Claim: I have invented a "psychopathy discriminant function" (PDF). If you submit to me the name of a person, my method will be able to tell you, based on various personality traits, whether or not that person is a psychopath. You may of course submit my method to any number of tests, by giving me the personality characteristics of a person. Ready?
First Test: You submit: "Adolf Hitler" My discriminatory filter replies: "PSYCHOPATH"
You submit: "Saddam Hussein" My discriminatory filter replies: "PSYCHOPATH"
Looks good so far!
You submit: "Mother Teresa" My discriminatory filter replies...
Wait a minute! We both know that Mother Teresa is not a psychopath. There's no point in submitting her to the test.
You at this point rightly complain, and I go back to the drawing board to develop "Psychopathy Discriminatory Function v.2.0."
Second Test: As before: You submit: "Adolf Hitler" My discriminatory filter replies: "PSYCHOPATH"
You submit: "Saddam Hussein" My discriminatory filter replies: "PSYCHOPATH"
You submit: "Person X" My discriminatory filter replies: "PSYCHOPATH"
Oops. "Person X" was again Mother Teresa. The filter needs work.
Third Test: Adolf Hitler: "PSYCHOPATH" Saddam Hussein: "PSYCHOPATH"
You submit: "Mother Teresa"
My filter works for a while. During the development phase for v.3.0, however, I added in a “Saint Module” which explicitly looks at whether the person is on a list of those widely considered to be saints. This incorporation of “background information” changes the performance of the PDF. Before the PDF does its thing, it passes the name through the Saint Module, to eliminate problematic cases such as Mother Teresa.
So, after much grinding of gears, and extensive use of the Saint Module, my discriminatory filter replies: "NOT a psychopath".
…
Now the question is this: how good is my psychopath discrimination function? It claims psychopathy for the people we’re pretty sure are psychopaths. But can it truly discriminate between hypotheses of psychopathy and pro-sociality? We can’t tell, because its “not a psychopath” labeling ability has not been validated. If I either do not apply the PDF to “known” saints, or I incorporate external knowledge of special cases into the “saint module”, I am not providing my filter with an adequate test.
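To make the validation point concrete, here is a minimal Python sketch. Nothing in it comes from an actual PDF implementation: the function names and the tiny labelled sample are assumptions made up for illustration. It shows that a filter which answers "PSYCHOPATH" for every name passes any test built only from known positives, and that the flaw shows up only once known negatives are scored as well.

```python
def always_psychopath(name):
    """A deliberately useless filter (hypothetical): it flags everyone."""
    return True


def rates(classifier, labelled_cases):
    """Return (sensitivity, specificity) over (name, is_psychopath) pairs.

    A rate is None when the sample contains no cases of that class,
    i.e. when that ability has simply not been tested.
    """
    tp = fn = tn = fp = 0
    for name, truth in labelled_cases:
        verdict = classifier(name)
        if truth and verdict:
            tp += 1
        elif truth and not verdict:
            fn += 1
        elif not truth and verdict:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Labels follow the cartoon example above; they are not real assessments.
positives_only = [("Adolf Hitler", True), ("Saddam Hussein", True)]
with_negatives = positives_only + [("Mother Teresa", False)]

print(rates(always_psychopath, positives_only))  # specificity is untested
print(rates(always_psychopath, with_negatives))  # specificity is zero
```

On the positives-only sample the useless filter looks perfect, because the specificity column is empty rather than bad. That empty column is the unvalidated "not a psychopath" ability described above.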
Similarly, the “Explanatory Filter” purports to discriminate between hypotheses of “design” and “not-design”. But if it’s never tested (blind) on systems that we know are not designed, we can never know how well it really works.
So again, I think it’s entirely fair to submit the Stone Circles to the EF, without “telling” the EF ahead of time that we expect a result of “no design”. If the EF responds with a result of “design” for the Stone Circles, then the claim of “no false positives” will be put to rest once and for all. If the EF returns a verdict of “not designed” (again, without “knowing” that we expect, for other reasons, that this will be the judgment) then confidence in the EF will marginally increase.
But many such trials are necessary if the EF is to be anything more than a Rorschach test for the expectations of the investigator.
[typo edits] [ 29. January 2003, 12:28: Message edited by: brauer ]
|
|
Paul A. Nelson
Member
Member # 26
|
posted 29. January 2003 13:08
Matt wrote:
quote: Similarly, the “Explanatory Filter” purports to discriminate between hypotheses of “design” and “not-design”. But if it’s never tested (blind) on systems that we know are not designed, we can never know how well it really works.
So again, I think it’s entirely fair to submit the Stone Circles to the EF, without “telling” the EF ahead of time that we expect a result of “no design”. If the EF responds with a result of “design” for the Stone Circles, then the claim of “no false positives” will be put to rest once and for all.
But the EF is not a mindless algorithm that could be run by a machine. It's a rational reconstruction of how intelligent beings (namely, us) infer design. It's also a proposal for a design detection method, which needs to be tested. (On that point, we agree completely.)
But someone -- some person -- has to use the EF. If that person is using the EF correctly, and we give him the stone circles as test data, he should first ask if a natural cause explains the pattern. Off he goes to the geology library; enters "arctic stone circles" in the search box; journal articles (actually, an extensive literature) pop up; the natural cause node of the EF intercepts the pattern. Design never arises as a plausible cause.
Now, could stone circles be used in another sort of EF-validation experiment? Of course: find some way to encode the patterns, so they could be expressed as bit strings. Do the same with radioactive decay, wave outlines on beaches, DNA sequences, Robert Browning sonnets, automobile manuals, Bach concerti, you name it. Ask the EF, or other candidate design detection methods, to sift these bit string patterns, avoiding false positives.
|
|
gedanken
Member
Member # 594
|
posted 29. January 2003 14:13
It may be a mistake to have two threads running on the same subject. I think that Paul A Nelson's remarks are relevant to "stone circles" themes, for example. I'm not saying this thread is a mistake so much as expressing a desire for differentiation, or for the creation of a more encompassing thread for all of these, so we don't have two separate threads to follow for the same argument.
Thanks
|
|
brauer
Member
Member # 398
|
posted 29. January 2003 16:51
Paul writes:
quote:
[The EF is] a rational reconstruction of how intelligent beings (namely, us) infer design.
and then, in describing the ideal way to use the EF: quote:
"...find some way to encode the patterns, so they could be expressed as bit strings. [...] Ask the EF, or other candidate design detection methods, to sift these bit string patterns, avoiding false positives."
Don't these seem contradictory? I mean, who encodes their problem into bit strings before trying to figure out something about it?
Paul: is the EF just a formalization of what we (as scientists) might do, or is it a mindless algorithm that can be applied to bit strings? If the former, how do we avoid encoding the preconceptions of the researcher? If the latter, when can we expect to see the results of an application of the algorithm made public?
I think advocates of the power of the method need to straighten these questions out before they hype the method any more.
[edit to add:] I'd be willing to abandon this thread in favor of continuing discussion on the Stone Circles thread. [ 29. January 2003, 18:57: Message edited by: brauer ]
|
|
Cre8ionist
Member
Member # 140
|
posted 30. January 2003 08:33
Hi Brauer,
Just a quick one here,
I wonder how your analogy would hold up when applied to say SETI, or Bloomfield's plagiarism program, or even NASA's MAB test.
Perhaps someone should contact SETI and ask them to run the circles through their detector. They may be a message from space! Oh wait, do we already know that they're not? Hmmm, well for the sake of science maybe we should pretend that they possibly are a message?
But then again, maybe we could admit that it would be a fruitless exercise. Likewise, maybe we could even see why it would be a fruitless exercise to submit these circles to a Complex Specified Information detector. Maybe.................................Cre8
|
|
Paul A. Nelson
Member
Member # 26
|
posted 30. January 2003 09:54
Matt,
I'll reply here, although we should probably have only one thread running right now on this topic.
Encoding stone circle patterns as bit strings is not meant to add any unnecessary complications to the EF-validation experiment. Indeed exactly the opposite is the case: By representing the circular patterns in the "language" of ones and zeros -- the very same mode of representation that my computer uses (at a fundamental level) for meaningful English text -- one would mask the identity of the pattern, and force design detectors to discover its natural origin. Just giving a design detector a photograph of an arctic stone circle would make his analytical task far too easy.
Bit strings as test data have many advantages, the main one being that all different types of data (naturally and intelligently caused) can be expressed as ones and zeros. The bit strings could be placed in a central web archive, and anyone who wanted should be able to download them for analysis. The strings would be identified to design detectors simply by "nametags" (string 1, string 2, string 3). Only the string compilers would know the sources, e.g.,
String 1 [decay of radium]
String 2 [Bach violin concerto]
String 3 [daily temperatures in Greenland for 1994]
String 4 [Robert Frost poem]
String 5 [base sequence of E. coli]
String 6 [radiotelescope static]
String 7 [talus slope distribution of boulders in Canadian Rockies]
And so on.
But what would the design detectors (i.e., human beings, whether assisted by computers or not) see in the web archive? Only something like this:
String 37 1100001100010010110000111001001001001000100101010...
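A rough Python sketch of the blinding step might look like the following. It is only an illustration under stated assumptions: `to_bits`, `build_archive`, and the two placeholder sources are hypothetical names and made-up data, and the sketch says nothing about how a physical pattern such as a stone circle would be turned into bytes in the first place.

```python
import random


def to_bits(data: bytes) -> str:
    """Write each byte as eight binary digits, most significant bit first."""
    return "".join(format(byte, "08b") for byte in data)


def build_archive(sources, seed=None):
    """Shuffle the sources and hand out neutral nametags.

    Returns (public, key):
      public maps nametag -> bit string (what the design detectors see),
      key    maps nametag -> source label (kept by the string compilers).
    """
    items = list(sources.items())
    random.Random(seed).shuffle(items)  # hide any ordering of the sources
    public, key = {}, {}
    for number, (label, data) in enumerate(items, start=1):
        tag = f"String {number}"
        public[tag] = to_bits(data)
        key[tag] = label
    return public, key


# Placeholder inputs, invented for this example only.
sources = {
    "short English text": b"Hi",
    "temperature readings (made-up values)": bytes([12, 9, 7, 15]),
}
public, key = build_archive(sources, seed=0)

for tag, bits in sorted(public.items()):
    print(tag, bits)
```

Only `public` would go into the web archive; `key` stays sealed until the results are scored.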
I think it would be a fun experiment, and the outcome would be telling, one way or another. [ 30. January 2003, 10:12: Message edited by: Paul A. Nelson ]
|
|
charlie d.
Member
Member # 159
|
posted 30. January 2003 11:05
That does seem like an unnecessary complication to me. I actually have no idea how one would digitally encode stone circles. Can you, just as an example, show us how you would encode them, or the bacterial flagellum? [Not the entire code, of course, just how you'd go about doing it: would you encode the molecular coordinates of every atom in the entire flagellar complex, or its 3D amino acid structure, the primary sequence of the individual proteins, or of the genes, or digitize an image, or encode a text description... you see my point]
Since Dembski did not feel the need to do any encoding before doing his flagellum calculations, and as I expect much disagreement will arise over the appropriate encoding method (such as whether the filter would be biased somehow by the encoding algorithm), I think this would likely be entirely superfluous, and quite possibly even counterproductive.
Besides, even if the EF were thus proven to be reliable, the validation would then be limited to digitally encoded strings (not, for instance, to Dembski's original flagellum calculation), meaning that any further use of the filter would have to go through the same argument about appropriate digital translation, etc etc etc. Talk about a methodological pain in the butt!
|
|
Paul A. Nelson
Member
Member # 26
|
posted 30. January 2003 14:28
Designing good (significant) experiments usually is a pain in the butt. Reading about them (e.g., the Michelson-Morley experiment, the 19th century French debates/experiments about spontaneous generation, or for that matter pretty much any novel experiment where folks have to work hard to set up the conditions) always makes my head ache.
Having said that, I don't think encoding natural patterns would be so hard. Whether Bill Dembski did this or not is entirely beside the point. [ 30. January 2003, 14:59: Message edited by: Paul A. Nelson ]
|
|
charlie d.
Member
Member # 159
|
posted 30. January 2003 15:03
quote: Designing good (significant) experiments usually is a pain in the butt. Reading about them (e.g., the Michelson-Morley experiment, the 19th century French debates about spontaneous generation, or for that matter pretty much any novel experiment where folks have to work hard to set up the conditions) always makes my head ache.
I sure do agree. My point was actually that digitally encoding the data before analysis would likely not be a well designed experiment, because it adds another layer of methodological confusion and ambiguity, with numerous consequent potential sources of error.
quote: Having said that, I don't think encoding natural patterns would be so hard.
I am not at all mathematically minded, but I am still trying to figure out even the first step of doing something like that, i.e., what exactly would you encode in, for instance, the flagellum or stone circles (an abstract geometric pattern? A 3D spatial reconstruction, and at what level of resolution? A primary sequence for the flagellum?). Maybe you should elaborate on this methodological aspect, because I think this is very much the core of your proposal.
quote: Whether Bill Dembski did this or not is entirely beside the point.
If you mean your "string" experiment as a validation of Dembski's approach to infer design for the flagellum (and of the EF in general as used so far), I think it is in fact very much the point, because Dembski did not encode his data for the EF analysis of the flagellum, nor of anything else so far. You can't run your control and experimental samples in different conditions. (Of course, you could run the digitized control experiment first, and then again the flagellum in digital format, if you like that better). [ 30. January 2003, 15:04: Message edited by: charlie d. ]
|
|
andyg
Member
Member # 415
|
posted 30. January 2003 17:09
I am a little confused by some of Paul's comments on bit strings and EFs, in particular the following:
quote: But the EF is not a mindless algorithm that could be run by a machine. It's a rational reconstruction of how intelligent beings (namely, us) infer design.
quote: Encoding stone circle patterns as bit strings is not meant to add any unnecessary complications to the EF-validation experiment. Indeed exactly the opposite is the case: By representing the circular patterns in the "language" of ones and zeros -- the very same mode of representation that my computer uses (at a fundamental level) for meaningful English text -- one would mask the identity of the pattern, and force design detectors to discover its natural origin.
What I don't understand is this: if you are intending to test an EF by using bit strings, why could you not use a computer to run the analysis? What more would an intelligent investigator need to do to the data to infer design or not? What form would the analysis of the string take?
AndyG
|
|
gedanken
Member
Member # 594
|
posted 31. January 2003 00:02
A point I made in “Stone Circles” (but embedded in a much longer post) concerns the following issue, put here in a slightly different way:
In the “irreducible complexity” claim about the bacterial flagellum, ID advocates say that there is an inference of “design”. The claim is that the flagellum can’t have come about by natural processes of evolution, or at least could have done so only with a very low probability. I presume that the ID advocates are implying that the flagellum therefore fits the conditions of the EF.
Now the condition for a “design inference” of the EF is that the probability of the event coming about by natural (non-intelligent) processes is less than 10^-150 (or other very low “universal probability bound” in that range). But of course, one way that there could be greater than 10^-150 chance of the flagellum coming about by a natural process of evolution is if there was greater than 10^-150 chance that the ID advocates making the calculation got it wrong. I don’t see how the chance of an error by the people making the calculation (missing the potentially “true” cause) is any less a source of probability for the event to have occurred than are the processes which are being calculated. But Paul A Nelson and John Bracht ask that we don’t consider the probability of an error being made in the ID advocates’ calculation, because all scientific theories are subject to revision when we gain new information (such as discovering what such an error might be).
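The arithmetic behind this point can be sketched in a few lines of Python. The two probabilities below are hypothetical numbers chosen only to show the shape of the argument; they are not estimates taken from anyone's calculation. The sketch also makes explicit one factor the paragraph leaves implicit, namely the chance of a natural origin given that the calculation did miss something.

```python
UPB = 1e-150  # the universal probability bound quoted above

# Hypothetical values, invented for illustration.
p_calc_wrong = 1e-6           # chance the calculation overlooked a natural pathway
p_natural_given_wrong = 1e-3  # chance of a natural origin if it did

# Law of total probability, dropping the non-negative "calculation right" term:
#   P(natural) >= P(natural | calc wrong) * P(calc wrong)
lower_bound = p_natural_given_wrong * p_calc_wrong

print(f"lower bound on P(natural): {lower_bound:.1e}")
print("exceeds the UPB:", lower_bound > UPB)
```

Under these assumed numbers the lower bound sits many orders of magnitude above 10^-150, so any realistic allowance for analyst error dominates the comparison with the bound.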
By the way, someone might complain that I referred to “ID advocates” running the test -- if it is a repeatable test then anyone should be able to make the inference using the steps of the test. If it is “scientific”, then those investigating should not have a personal bias that the test come out a certain way. If you choose to make that point -- remember that you did so!!!!!
Now instead, consider stone circles before a physical cause was known (I know, we can’t go back -- but we can do a gedanken experiment). Also consider the rings of Saturn. (As an aside specific to the rings of Saturn, they are not predicted by physical law as it is currently understood. See NASA link.)
So in these cases, we have complexity by normal measures (as opposed to meaning “low probability”), we have specification. All we have left is to check on “low probability” of less than the UPB as the condition of the EF (in Dembski’s terms).
But in the stone circles (even before the answers were better known), and in the rings of Saturn, we suspect that there is an answer to be found. Because of this, we cast off the stone circles (even before knowing their cause to greater certainty) and the rings of Saturn as not falling in the conditions of the EF.
But now it is simply the educated suspicion that we will find a reasonable physical pathway in these cases that casts them out -- even though at the time we might not have identified that pathway. Clearly in these cases, the field has not been “swept clean” of possibilities.
So now to the flagellum. Clearly scientists have presented possible pathways, and similarly we have an educated suspicion that we will find a reasonable physical pathway. How do these differ?
The only difference is a subjective one: who holds the suspicion that we will find a reasonable physical pathway, and how they came to hold it.
This makes RBH’s and others’ point about the EF being subjective in its application.
Now another point I make is the distinction between a scientific theory and a test. The EF is a “test”, it is not a “theory”. We can validate whether a “test” works by comparing to “theory” and also by experimental verification. In many cases we have a “test” that we don’t fully know why it works, yet we have experimental verification -- so theory is not absolutely necessary.
Now science is based on repeatability, and is supposed to be independent of all but the barest philosophical foundations of a form of “naïve realism”. Experiments and tests are supposed to be objective -- and to be repeatable. A test that is not repeatable, or which is overly subjective, is not a viable scientific test.
We don’t need to reduce the EF to strictly a computer “algorithm”. Remember that the first “computers” were people in rooms full of calculators; they were not state machines. An “algorithm” is simply a repeatable procedure broken down into extremely small steps. The steps and the inputs don’t need to be put in the form of “bits” for the steps to be clear and simple. The EF is written in that manner to a large degree. The variable part is the finding of the probability of the event by the known means. It is also the part subject to mistakes, and the part that would be difficult to convert to a bit stream. The consideration given to pathways that could be expected to be found is precisely where the subjectivity of the test exists.
And because of that, it would both be difficult to convert the EF inputs to “bits”, and simultaneously it demonstrates that the EF is not objective. If it were strictly objective, then it would be possible to convert the inputs to bits, no personal judgment would be necessary -- just answers to a sequence of questions. [ 31. January 2003, 00:40: Message edited by: gedanken ]
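One way to picture the argument above is to write the filter as a fixed sequence of questions whose only free inputs are the investigator's judgments. The Python sketch below is one hedged reading, not Dembski's published formulation: the verdict names, the `HIGH_PROBABILITY` cutoff, and the function `explanatory_filter` are all assumptions introduced for illustration.

```python
from enum import Enum


class Verdict(Enum):
    NECESSITY = "explained by natural law"
    CHANCE = "attributed to chance"
    DESIGN = "design inferred"


UPB = 1e-150            # universal probability bound mentioned in the thread
HIGH_PROBABILITY = 0.5  # assumed cutoff for "a natural cause explains it"


def explanatory_filter(p_natural: float, specified: bool) -> Verdict:
    """Apply the same steps to whatever estimates the investigator supplies."""
    if p_natural >= HIGH_PROBABILITY:
        return Verdict.NECESSITY
    if p_natural >= UPB or not specified:
        return Verdict.CHANCE
    return Verdict.DESIGN


# Same event, same steps, two hypothetical investigators with different
# estimates of the probability of a natural origin.
print(explanatory_filter(1e-160, specified=True))  # Verdict.DESIGN
print(explanatory_filter(1e-20, specified=True))   # Verdict.CHANCE
```

The steps themselves are mechanical and repeatable. The verdict changes only because `p_natural` changes, which is where the personal judgment enters.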
|
|
RBH
Member
Member # 380
|
posted 31. January 2003 00:27
I'm not sure whether this should go in this thread or the original Stone Circles thread. I'll put it here and see what happens. Since these are my own words, written several months ago, that seem particularly appropriate to this discussion I trust the Moderator will allow a posting that is mostly (self-) quotation.
Rather than merely refer generally to the Validating Design Discrimination Methodologies thread, I'll quote three paragraphs from the OP that summarize my argument for the necessity to validate novel research methods in general and design/designer detection methodologies in particular:
quote: The proposal of MDT is that those kinds of judgments by humans be systematized and formalized in order to meet the requirement for intersubjective testability. In science we require that given a research paper reporting the results of an experiment, field study, or observational study, the report must contain enough information about the observational methods that another researcher with the requisite technical skills and instruments could replicate the procedures and find the same observed result. That requires that the research methods be researcher-independent, in the sense that application and use of the methodology cannot depend on idiosyncratic characteristics of the person doing the research. Given appropriate instructions for staining a preparation, centering a slide, and focusing, a microscope should present the same image in its eyepiece regardless of who is looking through it. Hence systematization, formalization, and empirical validation of research methods is critical. That is especially the case when proposing and using a novel methodology. The rise and fall of N-rays and Martian 'canals' are instructive cautionary tales in this respect.
In developing a design discrimination methodology, MDT has the same task as [Single Designer Theory]. First the methodology must be systematized and formalized. Then it must be empirically validated on test materials for which we already know the histories. As is the case for [Single Designer Theory], the first goal of MDT is to develop a formalized researcher-independent methodology that, when it is applied to phenomena whose provenance and history we do not know, can be legitimately expected to reliably tell us something of interest about the phenomena.
Whether one is trying to discriminate design from no design or distinguish the work of one designer from another, the task of developing, systematizing, formalizing, and empirically validating the methodology is critical. Absent that, claims about one's theory are not (yet) testable. A theory may have all the promise in the world, but without empirically validated methodologies for making observations and gathering data to test hypotheses, it will remain merely an interesting conjecture with no empirical content. (Emphasis added, and an offending acronym replaced)
Absent that kind of validation, repeatability in gedanken's terms, or replicability, will fail. The outcomes of observations (not interpretations, but the observations themselves) will depend on who is doing the observation, not how they are doing it. And that ain't science.
RBH [ 31. January 2003, 00:35: Message edited by: RBH ]
|
|
Paul A. Nelson
Member
Member # 26
|
posted 31. January 2003 10:29
Andy asked:
quote: if you are intending to test an EF by using bit strings, why could you not use a computer to run the analysis?
I expect that people would use computers, very much as modern cryptanalysis does. But here, as elsewhere, the computer would simply be a tool, and the final decision to say that a bit string was "intelligently caused" versus "naturally caused" would still rest with the investigator(s).
As for encoding the bacterial flagellum -- I don't know how one would do that. Bacterial motors and flagella are pretty complex systems. But that shouldn't be an obstacle, because we've got an infinitely rich pool of other natural patterns, much easier to encode, that could be used in the experiment. The point of the experiment is to see if the EF and other design detection methods (such as that being developed by Wesley Elsberry and Jeff Shallit) avoid or fall prey to false positives. This doesn't seem to me to be such a complicated matter:
1. Encode intelligently caused and naturally caused patterns and events, as bit strings;
2. Ask a design detector (a method, employed by humans) to sift the strings.
3. Score the results.
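The scoring step in the list above could be sketched as follows. This is an illustration only: the function `score`, the three allowed answers, and the example key and verdicts are assumptions, not part of any agreed protocol.

```python
def score(verdicts, key):
    """Compare blind verdicts with the sealed key.

    verdicts: nametag -> "designed", "natural", or "undetermined"
    key:      nametag -> True if the source was intelligently caused
    Strings with no submitted verdict count as "undetermined".
    """
    tally = {"hits": 0, "false_positives": 0, "misses": 0,
             "correct_rejections": 0, "undetermined": 0}
    for tag, designed in key.items():
        verdict = verdicts.get(tag, "undetermined")
        if verdict == "undetermined":
            tally["undetermined"] += 1
        elif verdict == "designed":
            tally["hits" if designed else "false_positives"] += 1
        elif verdict == "natural":
            tally["misses" if designed else "correct_rejections"] += 1
        else:
            raise ValueError(f"unknown verdict for {tag}: {verdict!r}")
    return tally


# Made-up example: three strings, one of them intelligently caused.
key = {"String 1": False, "String 2": True, "String 3": False}
verdicts = {"String 1": "natural", "String 2": "designed", "String 3": "designed"}

print(score(verdicts, key))
```

The `false_positives` count is the number that bears directly on the "no false positives" claim. Keeping `undetermined` as a separate tally means a cautious detector is not scored as if it had made a wrong call.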
Frankly I didn't expect quibbles about this proposal to come from ID skeptics, but from ID proponents who weren't keen on having to face a test (yet, anyway). But, heck -- the world is a surprising place.
Gedanken and RBH -- I've read through your posts twice, and I confess that I just don't understand your worries. Put that down to my stupidity if you like. The bit string proposal really isn't that difficult or ambiguous. Here's a parallel situation. Some years ago, a physicist (I think he was a physicist) claimed to be able to identify LP recordings, by composer, opus, etc., simply from examining the vinyl discs (with the labels covered, of course). James Randi was skeptical, so he arranged a test, assembling a collection of LPs with their labels carefully masked. The physicist correctly identified every single recording -- that is, every single recording of classical music. Randi had included some rock albums, as wild cards, and the physicist said, "I don't know what these are -- sorry." Randi wrote up the results for Skeptical Inquirer, acknowledging that the guy could do what he claimed to do.
If the EF really does detect design, and avoids false positives, let's test it. I'd like CSICOP to run the test, or, if CSICOP won't do it, Michael Shermer's skeptics organization. Why not? [ 31. January 2003, 10:46: Message edited by: Paul A. Nelson ]
|
|
GP
Member
Member # 570
|
posted 31. January 2003 11:15
quote: 1. Encode intelligently caused and naturally caused patterns and events, as bit strings;
2. Ask a design detector (a method, employed by humans) to sift the strings.
3. Score the results.
Paul, what do you envision is the time-frame for implementing step #2? I ask because earlier you said: quote: But someone -- some person -- has to use the EF. If that person is using the EF correctly, and we give him the stone circles as test data, he should first ask if a natural cause explains the pattern. Off he goes to the geology library; enters "arctic stone circles" in the search box; journal articles (actually, an extensive literature) pop up; the natural cause node of the EF intercepts the pattern. Design never arises as a plausible cause.
When should we be sufficiently satisfied with fulfilling the natural cause node of the EF with regards to these bit strings that you are proposing? Could it be possible that someone going through the EF never gets past this node?
Thanks in advance for your replies. [ 31. January 2003, 11:16: Message edited by: GP ]
|
|
Paul A. Nelson
Member
Member # 26
|
posted 31. January 2003 11:29
GP asked:
quote: When should we be sufficiently satisfied with fulfilling the natural cause node of the EF with regards to these bit strings that you are proposing? Could it be possible that someone going through the EF never gets past this node?
Yes.
As I've discussed the proposal with Bill Dembski, Wesley Elsberry, and others, we've talked about a one-year time frame for the bit string experiment. Put the archive of strings on the web, and one year later, ask for results. Since the composition of the archive would be entirely at the discretion of the sponsoring organization (let's say CSICOP), the design detectors would have no way of knowing whether any "intelligently caused" strings were present, or just a few, or many.
I could imagine an archive with all "naturally caused" strings, for instance. The world, after all, might be like that. In that case, the design detectors (if successful) would come back either with nothing or with "this is radioactive decay, these are solar flares, this is a stone circle, this is the period of a pulsar [and so on] -- but we couldn't identify any intelligently caused patterns." [ 31. January 2003, 11:30: Message edited by: Paul A. Nelson ]