ISCID Forums


Post New Topic  Post A Reply
my profile | search | faq | forum home
  next oldest topic   next newest topic
» ISCID Forums   » General   » Brainstorms   » David Owen: A Shot in the Dark

   
Author Topic: David Owen: A Shot in the Dark
Moderator
Administrator
Member # 1

Icon 1 posted 02. June 2003 23:36      Profile for Moderator   Email Moderator   Send New Private Message       Edit/Delete Post 
A Shot in the Dark
by David Owen

Abstract- What can we learn from a shot in the dark? What can a relatively simple, random procedure tell us about an unknown and potentially complex space? Simple techniques based on randomized algorithms have been surprisingly successful in solving very dificult computational problems. This may be due to a phase transition observed for certain problems: most problem cases are easy to solve or easily shown unsolvable; few approach the worst case. Or it may be because of funnels: small sets of key variables, found in many systems, that largely determine the behavior of the entire system.

We summarize experiments motivated by these ideas, experiments in which 1) a simple random search outperformed more sophisticated strategies in finding subtle errors in software models, and 2) machine learning was used to determine which models are best and worst candidates for random search. We conclude with a broader application of these ideas, suggesting that the success of random search procedures, sometimes cited as evidence that intelligent design is unnecessary, may actually be evidence of intelligent design present in the search space.

To read the entire paper, click here.

[ 02. June 2003, 23:37: Message edited by: Moderator ]

IP: Logged
RBH
Member
Member # 380

Icon 1 posted 03. June 2003 15:28      Profile for RBH     Send New Private Message       Edit/Delete Post 
Owen has provided an interesting paper in both search and software validation, and demonstrates (among other things) that I have been away from that literature too long! I am particularly intrigued by the paper because Owens' "shot in the dark" technique is identical to a procedure I spent several months testing back in 1991 when I was working on using GAs and other search techniques to train neural nets. I called it "fly casting," but it was the same technique: toss random probes out into the phase space of possibilities. I failed to find any benefit due to fly casting, but I know now (waaay too late!) that the failure was not due to the technique itself but was due to an inadequate representation of the relevant variables of the problem space. Fly casting didn't have a chance because my representation was flawed.

Main Study

The body of the paper - all but the last section - describes comparative research on a software validation/search method called "Lurch." Lurch employs random "shots in the dark" to evaluate the ability of a given program to perform the function(s) the specifications for the program require. What is particularly interesting about Lurch from the point of view of problem-solving and search is that it is characterized by what appears to be an exponentially decreasing probability of finding a solution over iterations of random attempts. That implies that if Lurch doesn't find an answer (determine whether the program under test is flawed) in relatively few iterations, it won't find one and there's a high probability that there isn't one. In other words, Lurch 'knows' when a problem is too hard for it and to give up, producing an output that says 'I haven't found a solution and therefore there probably isn't one'! Or, to be more precise, Lurch's user knows how to set a stopping criterion for Lurch based on iterations or some other variable (called saturation below) that indicates that Lurch has found all it probably will on the problem.

Lurch is clearly superior to a couple of other search techniques designed for the same sort of problem in the universe of tasks employed in the research, to determine whether a given tic-tac-toe board (ranging in size from 2x2 to 15x15) represents a position that is winnable for one or the other player. The search techniques must evaluate all possible sequences of play from the given board configuration to ascertain whether the configuration represents a winnable position for one player or the other, or not. (Note that Lurch's task is not to determine which player will win, but only that there is or is not the possibility of a winner.) The number of potential sequences scales exponentially with the dimensions of the board, so this problem rapidly exhausts the practical capacity of a real system to perform an exhaustive search.

The stopping rule set for Lurch - when it decided that it has failed to find a winning sequence and therefore probabilistically concluded that a board configuration has no winning outcome - was a function of "saturation." Saturation is the percentage of new sequences tested by the random shots in the dark, and decreases rapidly with the number of iterations. As Owen notes, this stopping rule (as distinguished from Lurch's generation of random probes) requires that Lurch keep some kind of record of what has been previously tested, but since a 'satisficing' (in Herbert Simon's sense) stopping rule does not require detailed records of every single sequence, Lurch used a memory-conserving hashing scheme to provide a rough estimate of saturation. That estimate served as the termination condition against an experimenter-defined criterion.

The data show that Lurch outperforms the two other search techniques tested, SPIN (two varieties) and SMV. Lurch solved the problems more quickly and with fewer errors that either SPIN (both varieties) and SMV. SPIN and SMV differ from Lurch in several respects. Both SPIN and SMV are exhaustive search algorithms - they check every single potential sequence of play, while Lurch, as noted above, tests random sequences. An important implication of that difference is that SPIN and SMV must keep track of states that were tested previously, while Lurch does not. Keeping track of states in a problem space that scales exponentially puts an exponentially increasing load on the machine performing the search, and indeed, SPIN in its 'normal' mode and SMV both exhausted the memory of the machine on which the research was run at board sizes around 9x9 or 10x10. SPIN has a "supertrace" memory-saving approximation mode that was roughly comparable to Lurch in its demand for memory over board sizes.

Constraining the Search Space

Owen used a machine learning technique, TAR-2, to determine parameter ranges with which to constrain the space of potential solutions that Lurch tested. When trained on a training set of data, TAR-2 provides the restricted ranges of values for attributes that are associated with solutions of varying 'goodness.' Thus one can ascertain the range(s) of values for input parameters that are most important in determining 'good' versus 'bad' outputs. When problems whose input ranges of attributes so constrained were presented to Lurch, Lurch's performance was substantially better than when it performed unconstrained searches. Thus, "intelligently" constraining the search space results in more efficient and more successful searches.

The success of search path constraints in improving the performance of a random search technique is explained by the idea that pathways in search spaces are not homogenous; that some pathways are "funnels" defined in attribute terms. Recall that the test algorithms must follow each sequence of moves from a given configuration of a tic-tac-toe board to determine whether there can be a winner. Those sequences are pathways, defined on problem attributes, in the search space of all sequences. If there are some ranges of values of problem attributes that are more likely to 'contain' potentially fruitful solutions (in the sense of being more likely to lead to correct solutions) then constraining the 'random' probes of Lurch to those ranges is more likely to produce a successful outcome - a correct classification of the board as winnable or not.

Relating the research to Intelligent Design

In the last section of the paper Owen tries to relate the findings to the question of intelligent design in biology. This is where I was disappointed. Owen derives two related implications from the research on Lurch. First Owen infers that although unconstrained random Lurch is more effective at the task set for it than two other search techniques explicitly designed for that kind of task, nevertheless
quote:
... the success of Lurch on a software model ... in no way precludes intelligent design of the overall system. (p. 11)
In essence, Owen says that just because a simple random system outperforms a designed system doesn't rule out the hypothesis that the simple system (or the larger system in which the simple system operates) was designed. In the absence of any constraints on the properties of "intelligent design" (and there are none described in the paper), that is a strikingly unremarkable conclusion.

The second (apparent) implication that Owen draws is more interesting and less trivial. Implicitly (it seems) depending on the results of the TAR-2-derived search constraints, Owen suggests that
quote:
In fact, as we have seen above, it is possible to design a system in such a way that the right sort of behavior emerges. Given a range of possible software models we can sort them according to their tendency to reveal information and determine which sorts of models are amenable to random search. (p. 11)
I am not completely certain that Owen is here referring to the TAR-2 experiment. The vagueness is regrettable since it allows substantial (unconstrained!) wiggle room for interpretations. Not only is Owen not clear about the basis for that assertion, he does not explicitly relate the TAR-2 findings to intelligent design at all. But that is the genuinely interesting finding!

The question for biological evolution that is raised by the TAR-2 experiment is this: Are the pathways 'searched' by random mutations in conjunction with natural selection naturally constrained in a manner analogous to the way the pathways that Lurch randomly searched were constrained by the TAR-2 machine learning results? Is there a naturally occurring analogue to the TAR-2 constraining mechanism in natural evolution? In evolutionary theory this translates directly to the question of the evolution of evolvability. There is a growing literature on just that topic, and any implications Owen's research might have for intelligent design in biological evolution must be evaluated in the light of that literature. Because Owen is so coy about making the connection, though, it's difficult to directly evaluate his notion of the connection. For a recent example of that literature see Caporale's "Foresight in Genome Evolution" in the most recent (May-June 2003) American Scientist. Caporale argues "... that under selective pressure, the probability of different classes of mutations can change, with consequences for survival" (p. 241). That natural process of evolving constraints is precisely analogous to the kind of constraints on Lurch's search space that Owen derived from the TAR-2 machine learning procedure. I strongly recommend that the two papers be read together, as I serendipitously did.

RBH

Edited for (random!) typos.

Edited again real late for a truly weird paragraph reversal error. I mean, dyslexia is one thing, but a whole paragraph? Uh oh!

[ 03. June 2003, 17:11: Message edited by: RBH ]

IP: Logged
Rex Kerr
Member
Member # 632

Icon 1 posted 03. June 2003 21:31      Profile for Rex Kerr     Send New Private Message       Edit/Delete Post 
I found this to be an interesting paper, although section 10 could have been written a little more clearly.

However, I would note that even random models were searched efficiently with Lurch a significant fraction of the time. This suggests that design of searchability may not be necessary; searchability may not itself be a rare criterion.

(Indeed, I suspect that Lurch would most frequently perform poorly on certain classes of human-designed systems where we have predicted rare behaviors and rely upon them for the function of the system. For instance, off-by-one errors are common in programming, but Lurch would be unlikely to find them efficiently by querying input space at random.)

IP: Logged
warren_bergerson
Member
Member # 262

Icon 1 posted 04. June 2003 10:54      Profile for warren_bergerson   Email warren_bergerson   Send New Private Message       Edit/Delete Post 
I think it is useful to separate the discussion of this article into 1)mathematical findings and 2)implications of mathematical findings for genetic evolution.

IMO, the mathematical findings are not exactly new, but they are worth noting including:

1. The term ‘random search’ covers a very wide range of different search techniques, involving a wide range of different processes and mechanisms. Different random search techniques can produce very different results. The Lurch program might more accurately be described as a simple modifiable search algorithm with feedback features and random walk features. It is at the very least imprecise to treat all ‘random search’ or ‘genetic algorithm’ search systems as a single type of search process.

2. Different types of search routines have different degrees of success in solving different types of problems. The effectiveness or efficiency of a search is a function of both the type of search routine and the type of problem. Two important corollaries. 1) No one search routine can efficiently solve all problems. and 2)The right combination of the search routine and problem can result in very fast, very efficient problem solving.

3. Using search routines in problem solving requires a direct relationship between ‘search objects’ and ‘solution implementation’.

The implications for genetic evolution in biological systems include:

1. If genetic evolutionary change involves a single random search process, then there should be some types of adaptive problems which biological systems have extreme difficulty in solving. I am aware of no evidence suggesting a single random search process.

2. If genetic evolutionary processes are dynamic, and if the same types of evolutionary changes occur over and over again, then it is possible that biological systems have the capacity to ‘evolve search routines which are appropriate to specific types of adaptive problems’. The speed and efficiency with which organisms adapt to many problems suggests that genetic evolutionary processes are indeed dynamic, and genetic evolutionary processes have evolved to match search process to type of adaptive problem.

3. The relationship between ‘problem solving search routines’ and ‘genetic evolutionary change’ is difficult to evaluate if we assume that the search object is the gene or string of genetic code. This is in large part because the relationships between search object-gene and search implementation- phenotype is so ambiguous.

4. The relationship between ‘problem solving search routine’ and ‘genetic evolutionary change appears to be far more direct and analyzable if the search object/search implementation is the ‘process or program controlling gene expression events’. Based on current knowledge, there appears to be a direct relationship between ‘gene expression’ and phenotype/survival related trait. The impact of dynamic, evolving search routines would seem to be directly relevant to the process responsible for reprogramming gene expression processes.

IP: Logged
David_Owen
Member
Member # 715

Icon 1 posted 10. June 2003 00:56      Profile for David_Owen   Email David_Owen   Send New Private Message       Edit/Delete Post 
Thanks to all of you for your careful reading and comments. I am new to the ISCID site, but I'm learning more as I try to follow some of the posted discussions. Below I've tried to expand and better explain the final section of the paper, in which I tried to relate our software testing results and ideas to Intelligent Design:

Non-technical popular science books written in defense of naturalistic evolution (like Dennet's "Darwin's Dangerous Idea", quoted in the paper) sometimes use the analogy of a nondeterministic computer algorithm as
circumstantial evidence for evolution: if it works on the computer, it works in biology too.

So the first (perhaps "unremarkable") point I tried to make is just that the success of a nondeterministic search algorithm may have something to do with the structure of the search space. And, by analogy, the success of an apparently un-designed evolutionary process may have to do with the structure of the space of biological possibilities. At the very least the success of an un-designed search does not preclude design present in the search space.

The TAR2 machine learning experiments provide a paradigm for thinking about how the search space can be modified in order to facilitate search by simple, random shot-in-the-dark procedures. In these experiments the basic question we want to answer is: how would we design a space that would tend to reveal itself to a shot-in-the-dark search procedure?

The point is not that random (apparently un-designed) spaces won't occasionally be good for random search, but: is it possible to learn how to design better search spaces? In software modeling, this corresponds to answering questions about various tradeoffs. For example, would it be better to design a system with many small modules or a few larger modules? Should global or local data be used to store certain information...

Back to the analogy: it might be that the space of biological possibilities could be designed so that a simple, random evolutionary procedure is surprisingly successful at finding complex life forms.

The first two points below summarize the paragraphs above, and the third is really the point I am trying to make (assuming domain-specific definitions of "design" and "success"):

  • The success of a random search doesn't preclude design present in the search space;
  • Spaces may be designed so that random search is more likely to succeed;
  • When we observe a successful random search, we should ask whether the search space was designed (and not just assume it's the worst-case unconstrained space).
This represents, for me at least, a new and different way to think about design. The old way was: how is it that such a hard problem could be solved without help from a designer's mind? The new way is something like: given that a hard problem has been solved, by an apparently mindless procedure, could it be that the problem wasn't so hard after all? Could it be that the problem was itself designed to facilitate a mindless solution procedure?
IP: Logged
Rex Kerr
Member
Member # 632

Icon 1 posted 10. June 2003 02:57      Profile for Rex Kerr     Send New Private Message       Edit/Delete Post 
David, I agree with you that "designing to be easily randomly searched" is a novel and intriguing way to think about design.

However, the most obvious question is then whether design is necessary for searchability. It seems as though your data answers this question in the following way: it helps, but is not necessary. Please correct me if my interpretation is wrong, but it looked to me that a majority of random spaces were fairly effectively randomly searched. Thus, it would seem that random search can be "surprisingly successful" in the absence of design--our surprise is a result of poor intuition about random search, not because of good intuition but an unexpectedly searchable space.

IP: Logged
RBH
Member
Member # 380

Icon 1 posted 10. June 2003 15:39      Profile for RBH     Send New Private Message       Edit/Delete Post 
David wrote
quote:
Could it be that the problem was itself designed to facilitate a mindless solution procedure?
There's an alternate wording for that question, too, in light of the evolution of evolvability literature: 'Can a problem (re)design itself to facilitate a mindless solution procedure?' If one looks at the 'problem' as jointly defined by the search space and the algorithm, David's suggestion is, as Rex says, a very interesting way to reformulate the question.

RBH

IP: Logged
David_Owen
Member
Member # 715

Icon 1 posted 10. June 2003 17:59      Profile for David_Owen   Email David_Owen   Send New Private Message       Edit/Delete Post 
So it looks like there are (at least) two ways to explain why a random search might be surprisingly successful: 1) the size of the search space was overestimated (it may have been designed to facilitate the random search), or 2) the power of the random search was underestimated.

It's true that in the TAR2 experiment the random search is often successful even in the first round. Therefore, given a single example model in which the search was successful, we would not be able to determine whether it came from the first round of models or the second "more designed" round. We'd have no reason to opt for explanation 1 above.

But in some cases explanation 1 might be the right one. It would depend on domain-specific definitions of "design," "success," and, most importantly, what the first-round distribution looks like (for example, is there a large difference between the probability of success in the first-round distribution and the final round distribution?).

This suggests an evolution-of-evolvability vs. designed-search-space debate. Perhaps the debate is already going on? I need to read the papers you've suggested.

IP: Logged


All times are East Coast  
Post New Topic  Post A Reply Close Topic    Move Topic    Delete Topic    Top Topic next oldest topic   next newest topic
 - Printer-friendly view of this topic
Hop To:

Contact Us | ISCID

All content © ISCID and content contributor 2001-2003

The ISCID Forums are aimed at generating insight into the nature of complex systems (e.g. biological complexity, organizational complexity, etc.) and the ontological status of purpose, especially from the vantage point of various information- and design-theoretic models.

Indexed by UBB Spider Hack  |  Powered by Infopop Corporation UBB.classicTM 6.3.1.1

PCID | Encyclopedia | Brainstorms | The Archive | News | Essay Contests | Chat Events | Membership