Intelligent Search

Introduction

Problem solving requires two prime considerations: first representation of
the problem by an appropriately organized state space and then testing the
existence of a well-defined goal state in that space. Identification of the goal
state and determination of the optimal path, leading to the goal through one
or more transitions from a given starting state, will be addressed in this
chapter in sufficient details. The chapter, thus, starts with some well-known
search algorithms, such as the depth first and the breadth first search, with
special emphasis on their results of time and space complexity. It then
gradually explores the ‘heuristic search’ algorithms, where the order of
visiting the states in a search space is supported by thumb rules, called
heuristics, and demonstrates their applications in complex problem solving.
It also discusses some intelligent search algorithms for game playing.

Common
experience reveals that a search problem is associated with two important issues: first ‘what to search’ and secondly ‘where to search’. The first one is
generally referred to as ‘the key’, while the second one is termed ‘search
space’. In AI the search space is generally referred to as a collection of states
and is thus called state space. Unlike common search space, the state space in
most of the problems in AI is not completely known, prior to solving the
problem. So, solving a problem in AI calls for two phases: the generation of
the space of states and the searching of the desired problem state in that space.
Further, since the whole state space for a problem is quite large, generation of
the whole space prior to search may cause a significant blockage of storage,
leaving a little for the search part. To overcome this problem, the state space
is expanded in steps and the desired state, called “the goal”, is searched after
each incremental expansion of the state space.

Depending on the methodology of expansion of the state space and
consequently the order of visiting the states, search problems are differently
named in AI. For example, consider the state space of a problem that takes the
form of a tree. Now, if we search the goal along each breadth of the tree,
starting from the root and continuing up to the largest depth, we call it
breadth first search. On the other hand, we may sometimes search the goal
along the largest depth of the tree, and move up only when further traversal
along the depth is not possible. We then attempt to find alternative offspring
of the parent of the node (state) last visited. If we visit the nodes of a tree
using the above principles to search the goal, the traversal made is called
depth first traversal and consequently the search strategy is called depth first
search. We will shortly explore the above schemes of traversal in a search
space. One important issue, however, needs mention at this stage. We may
note that the order of traversal and hence search by breadth first or depth first
manner is generally fixed by their algorithms. Thus once the search space,
here the tree, is given, we know the order of traversal in the tree. Such types
of traversal are generally called ‘deterministic’. On the other hand, there exists
an alternative type of search, where we cannot definitely say which node will
be traversed next without computing the details in the algorithm. Further, we
may have transition to one of many possible states with equal likelihood at an
instance of the execution of the search algorithm. Such a type of search, where
the order of traversal in the tree is not definite, is generally termed ‘nondeterministic’. Most of the search problems in AI are non-deterministic.

General Problem Solving Approaches

There exist quite a large number of problem solving techniques in AI that rely
on search. The simplest among them is the generate and test method. The
algorithm for the generate and test method can be formally stated as follows:

Procedure Generate & Test

Begin

Repeat

Generate a new state and call it current-state;

Until current-state = Goal;

End.

It is clear from the above algorithm that the algorithm continues the
possibility of exploring a new state in each iteration of the repeat-until loop
and exits only when the current state is equal to the goal. Most important part
in the algorithm is to generate a new state. This is not an easy task. If
generation of new states is not feasible, the algorithm should be terminated.
In our simple algorithm, we, however, did not include this intentionally to
keep it simplified.

But how does one generate the states of a problem? To formalize this, we
define a four tuple, called state space, denoted by
{ nodes, arc, goal, current },
where
nodes represent the set of existing states in the search space;
an arc denotes an operator applied to an existing state to cause
transition to another state;
goal denotes the desired state to be identified in the nodes; and
current represents the state, now generated for matching with the goal.

We will now present two typical algorithms for generating the state
space for search. These are depth first search and breadth first search.

Breadth First Search

The breadth first search algorithm visits the nodes of the tree along its
breadth, starting from the level with depth 0 to the maximum depth. It can be
easily realized with a queue. For instance, consider the tree, given in figure.
Here, the nodes in the tree are traversed following their ascending ordered labels.

The order of traversal in a tree of depth 3 by
breadth first manner.

The algorithm for traversal in a tree by depth first manner can be
presented with a queue as follows:

Procedure Breadth-first-search

Begin

i) Place the starting node in a queue;
ii) Repeat

Delete queue to get the front element;
If the front element of the queue = goal,
return success and stop;
Else do
Begin

insert the children of the front element,
if exist, in any order at the rear end of
the queue;

End

Until the queue is empty;

End.

The breadth first search algorithm, presented above, rests on a simple
principle. If the current node is not the goal add the offspring of the
current in any order to the rear end of the queue and redefine the front
element of the queue as the current. The algorithm terminates, when the goal
is found.

First few steps of breadth first search on the tree of fig.

Time Complexity
For the sake of analysis, we consider a tree of equal branching factor from
each node = b and largest depth = d. Since the goal is not located within
depth (d-1), the number of false search is given by

Further, the first state within the fringe nodes could be the goal. On the
other hand, the goal could be the last visited node in the tree. Thus, on an
average, the number of fringe nodes visited is given by

Consequently, the total number of nodes visited in an average case becomes

Since the time complexity is proportional to the number of nodes visited,
therefore, the above expression gives a measure of time complexity.

Space Complexity
The maximum number of nodes will be placed in the queue, when the
leftmost node at depth d is inspected for comparison with the goal. The
queue length under this case becomes bd. The space complexity of the
algorithm that depends on the queue length, in the worst case, thus, is of the
order of bd.
In order to reduce the space requirement, the generate and test algorithm
is realized in an alternative manner, as presented below.

Depth First Search

The depth first search generates nodes and compares them with the goal along
the largest depth of the tree and moves up to the parent of the last visited
node, only when no further node can be generated below the last visited node.
After moving up to the parent, the algorithm attempts to generate a new
offspring of the parent node. The above principle is employed recursively to
each node of a tree in a depth first search. One simple way to realize the
recursion in the depth first search algorithm is to employ a stack. A stackbased
realization of the depth first search algorithm is presented below.

Procedure Depth first search

Begin

Push the starting node at the stack,
pointed to by the stack-top;
While stack is not empty do

Begin

Pop stack to get stack-top element;
If stack-top element = goal, return

success and stop

Else push the children of the stack-top
element in any order into the stack;

End.

End while;

End.

Fig. A:Depth first search on a tree, where the node numbers denote
the order of visiting that node.

In the above algorithm, a starting node is placed in the stack, the top of
which is pointed to by the stack-top. For examining the node, it is popped
out from the stack. If it is the goal, the algorithm terminates, else its children
are pushed into the stack in any order. The process is continued until the stack
is empty. The ascending order of nodes in fig. A represents its traversal on
the tree by depth first manner. The contents of the stack at the first few
iterations are illustrated below in fig. B. The arrowhead in the figure denotes
the position of the stack-top.

Fig. B: A snapshot of the stack at the first few iterations.

Space Complexity
Maximum memory in depth first search is required, when we reach the
largest depth at the first time. Assuming that each node has a branching
factor b, when a node at depth d is examined, the number of nodes saved in
memory are all the unexpanded nodes up to depth d plus the node being
examined. Since at each level there are (b-1) unexpanded nodes, the total
number of memory required = d (b -1) +1. Thus the space complexity of
depth first search is a linear function of b, unlike breadth first search, where
it is an exponential function of b. This, in fact, is the most useful aspect of
the depth first search.

Time Complexity
If we find the goal at the leftmost position at depth d, then the number of
nodes examined = (d +1). On the other hand, if we find the goal at the
extreme right at depth d, then the number of nodes examined include all the
nodes in the tree, which is

So, the total number of nodes examined in an average case

This is the average case time complexity of the depth first search algorithm.
Since for large depth d, the depth first search requires quite a large
runtime, an alternative way to solve the problem is by controlling the depth
of the search tree. Such an algorithm, where the user mentions the initial depth cut-off at each iteration, is called an Iterative Deepening Depth First
Search or simply an Iterative deepening search.

Iterative Deepening Search

When the initial depth cut-off is one, it generates only the root node and
examines it. If the root node is not the goal, then depth cut-off is set to two
and the tree up to depth 2 is generated using typical depth first search.
Similarly, when the depth cut-off is set to m, the tree is constructed up to
depth m by depth first search. One may thus wonder that in an iterative
deepening search, one has to regenerate all the nodes excluding the fringe
nodes at the current depth cut-off. Since the number of nodes generated by
depth first search up to depth h is

the total number of nodes expanded in failing searches by an iterative
deepening search will be

The last pass in the algorithm results in a successful node at depth d, the
average time complexity of which by typical depth first search is given by

Thus the total average time complexity is given by

Consequently, the ratio of average time complexity of the iterative deepening
search to depth first search is given by

The iterative deepening search thus does not take much extra time, when
compared to the typical depth first search. The unnecessary expansion of the
entire tree by depth first search, thus, can be avoided by iterative deepening. A
formal algorithm of iterative deepening is presented below.

Procedure Iterative-deepening

Begin

1. Set current depth cutoff =1;
2. Put the initial node into a stack, pointed to by stack-top;
3. While the stack is not empty and the depth is within the
given depth cut-off do

Begin

Pop stack to get the stack-top element;
if stack-top element = goal, return it and stop
else push the children of the stack-top in any order
into the stack;

End.

End While;
4. Increment the depth cut-off by 1 and repeat
through step 2;

End.

The breadth first, depth first and the iterative deepening search can be
equally used for Generate and Test type algorithms. However, while the
breadth first search requires an exponential amount of memory, the depth first
search calls for memory proportional to the largest depth of the tree. The
iterative deepening, on the other hand, has the advantage of searching in a

depth first manner in an environment of controlled depth of the tree.

Hill Climbing

The ‘generate and test’ type of search algorithms presented above only
expands the search space and examines the existence of the goal in that space.
An alternative approach to solve the search problems is to employ a function
f(x) that would give an estimate of the measure of distance of the goal from
node x. After f(x) is evaluated at the possible initial nodes x, the nodes are sorted in ascending order of their functional values and pushed into a stack in
the ascending order of their ‘f’ values. So, the stack-top element has the least f
value. It is now popped out and compared with the goal. If the stack-top
element is not the goal, then it is expanded and f is measured for each of its
children. They are now sorted according to their ascending order of the
functional values and then pushed into the stack. If the stack-top element is
the goal, the algorithm exits; otherwise the process is continued until the
stack becomes empty. Pushing the sorted nodes into the stack adds a depth
first flavor to the present algorithm. The hill climbing algorithm is formally
presented below.

Procedure Hill-Climbing

Begin

1. Identify possible starting states and measure the distance (f) of their
closeness with the goal node; Push them in a stack according to the
ascending order of their f ;
2. Repeat

Pop stack to get the stack-top element;
If the stack-top element is the goal, announce it and exit
Else push its children into the stack in the ascending order of their
f values;

Until the stack is empty;

End.

Moving along a ridge in two steps (by two successive
operators) in hill climbing.

The hill climbing algorithm too is not free from shortcomings. One
common problem is trapping at local maxima at a foothill. When trapped at
local maxima, the measure of f at all possible next legal states yield less
promising values than the current state. A second drawback of the hill
climbing is reaching a plateau. Once a state on a plateau is reached, all legal next states will also lie on this surface, making the search ineffective. A
new algorithm, called simulated annealing, discussed below could easily
solve the first two problems. Besides the above, another problem that too
gives us trouble is the traversal along the ridge. A ridge (vide figure above) on
many occasions leads to a local maxima. However, moving along the ridge is
not possible by a single step due to non-availability of appropriate operators.
A multiple step of movement is required to solve this problem.

Simulated Annealing

“Annealing” is a process of metal casting, where the metal is first melted at a
high temperature beyond its melting point and then is allowed to cool down,
until it returns to the solid form. Thus in the physical process of annealing,
the hot material gradually loses energy and finally at one point of time reaches
a state of minimum energy. A common observation reveals that most physical
processes have transitions from higher to lower energy states, but there still
remains a small probability that it may cross the valley of energy states [2]
and move up to a energy state, higher than the energy state of the valley. The
concept can be verified with a rolling ball. For instance, consider a rolling
ball that falls from a higher (potential) energy state to a valley and then moves
up to a little higher energy state (vide figure below). The probability of such

A rolling ball passes through a valley to a higher
energy state.
transition to a higher energy state, however, is very small and is given by

where p is the probability of transition from a lower to a higher energy state,
ΔE denotes a positive change in energy, K is the Boltzman constant and T is
the temperature at the current thermal state. For small ΔE, p is higher than the
value of p, for large ΔE. This follows intuitively, as w.r.t the example of ball
movement, the probability of transition to a slightly higher state is more than
the probability of transition to a very high state.
An obvious question naturally arises: how to realize annealing in search?
Readers, at this stage, would remember that the need for simulated annealing
is to identify the direction of search, when the function f yields no better
next states than the current state. Under this circumstance, ΔE is computed for
all possible legal next states and p’ is also evaluated for each such next state
by the following formula:

A random number in the closed interval of [0,1] is then computed and p’
is compared with the value of the random number. If p’ is more, then it is
selected for the next transition. The parameter T, also called temperature, is
gradually decreased in the search program. The logic behind this is that as T
decreases, p’ too decreases, thereby allowing the algorithm to terminate at a
stable state. The algorithm for simulated annealing is formally presented
below.

Procedure Simulated Annealing

Begin

1. Identify possible starting states and measure the distance (f) of their
closeness with the goal; Push them in a stack according to the
ascending order of their f ;
2. Repeat

Pop stack to get stack-top element;
If the stack-top element is the goal,

announce it and exit;

Else do
Begin

a) generate children of the stack-top element N and
compute f for each of them;
b) If measure of f for at least one child of N is improving
Then push those children into stack in ascending order of
their f;
c) If none of the children of N is better in f
Then do
Begin

a) select any one of them randomly, compute its p’ and test
whether p’ exceeds a randomly generated number in the interval
[0,1]; If yes, select that state as the next state; If no, generate
another alternative legal next state and test in this way until one
move can be selected; Replace stack-top element by the selected
move (state);
b) Reduce T slightly; If the reduced value is negative, set it to
zero;

End;

End.

Until the stack is empty;

End.

The algorithm is similar to hill climbing, if there always exists at least
one better next state than the state, pointed to by the stack-top. If it fails, then
the last begin-end bracketed part of the algorithm is invoked. This part
corresponds to simulated annealing. It examines each legal next state one by
one, whether the probability of occurrence of the state is higher than the
random value in [0,1]. If the answer is yes, the state is selected, else the next
possible state is examined. Hopefully, at least one state will be found whose
probability of occurrence is larger than the randomly generated probability.

Another important point that we did not include in the algorithm is the
process of computation of ΔE. It is computed by taking the difference of the
value of f of the next state and that of the current (stack-top) state.

The third point to note is that T should be decreased once a new state with
less promising value is selected. T is always kept non-negative. When T
becomes zero, p’ will be zero and thus the probability of transition to any
other state will be zero.

Heuristic Search

This section is devoted to solve the search problem by a new technique, called
heuristic search. The term “heuristics” stands for ” thumb rules”, i.e., rules
which work successfully in many cases but its success is not guaranteed.

In fact, we would expand nodes by judiciously selecting the more promising
nodes, where these nodes are identified by measuring their strength compared
to their competitive counterparts with the help of specialized intuitive
functions, called heuristic functions.

Heuristic search is generally employed for two distinct types of
problems:
i) forward reasoning and
ii) backward reasoning.
We have already
discussed that in a forward reasoning problem we move towards the goal state
from a pre-defined starting state, while in a backward reasoning problem, we
move towards the starting state from the given goal. The former class of
search algorithms, when realized with heuristic functions, is generally called
heuristic Search for OR-graphs or the Best First search Algorithms. It may be
noted that the best first search is a class of algorithms, and depending on the
variation of the performance measuring function it is differently named. One
typical member of this class is the algorithm A*. On the other hand, the
heuristic backward reasoning algorithms are generally called AND-OR graph
search algorithms and one ideal member of this class of algorithms is the
AO* algorithm. We will start this section with the best first search algorithm.

Heuristic Search for OR Graphs

Most of the forward reasoning problems can be represented by an OR-graph,
where a node in the graph denotes a problem state and an arc represents an
application of a rule to a current state to cause transition of states. When a
number of rules are applicable to a current state, we could select a better state
among the children as the next state. We remember that in hill climbing, we
ordered the promising initial states in a sequence and examined the state
occupying the beginning of the list. If it was a goal, the algorithm was
terminated. But, if it was not the goal, it was replaced by its offsprings in any
order at the beginning of the list. The hill climbing algorithm thus is not free
from depth first flavor. In the best first search algorithm to be devised
shortly, we start with a promising state and generate all its offsprings. The
performance (fitness) of each of the nodes is then examined and the most
promising node, based on its fitness, is selected for expansion. The most
promising node is then expanded and the fitness of all the newborn children is
measured. Now, instead of selecting only from the generated children, all the
nodes having no children are examined and the most promising of these
fringe nodes is selected for expansion. Thus unlike hill climbing, the best
first search provides a scope of corrections, in case a wrong step has been
selected earlier. This is the prime advantage of the best first search algorithm
over hill climbing. The best first search algorithm is formally presented
below.

Procedure Best-First-Search

Begin

1. Identify possible starting states and measure the distance (f) of their
closeness with the goal; Put them in a list L;
2. While L is not empty do

Begin

a) Identify the node n from L that has the minimum f; If there
exist more than one node with minimum f, select any one of them
(say, n) arbitrarily;
b) If n is the goal
Then return n along with the path from the starting node,
and exit;
Else remove n from L and add all the children of n to the list L,
with their labeled paths from the starting node;

End.

End While;

End.

As already pointed out, the best first search algorithm is a generic
algorithm and requires many more extra features for its efficient realization.
For instance, how we can define f is not explicitly mentioned in the
algorithm. Further, what happens if an offspring of the current node is not a
fringe node. The A* algorithm to be discussed shortly is a complete
realization of the best first algorithm that takes into account these issues in
detail. The following definitions, however, are required for presenting the
A* algorithm. These are in order.

Definition: A node is called open if the node has been generated and
the h’ (x) has been applied over it but it has not been expanded yet.

Definition: A node is called closed if it has been expanded for
generating offsprings.

In order to measure the goodness of a node in A* algorithm, we require
two cost functions: i) heuristic cost and ii) generation cost. The heuristic cost
measures the distance of the current node x with respect to the goal and is
denoted by h(x). The cost of generating a node x, denoted by g(x), on the
other hand measures the distance of node x with respect to the starting node in
the graph. The total cost function at node x, denoted by f(x), is the sum of
g(x) plus h(x).

The generation cost g(x) can be measured easily as we generate node x
through a few state transitions. For instance, if node x was generated from
the starting node through m state transitions, the cost g(x) will be
proportional to m (or simply m). But how does one evaluate the h(x)? It may
be recollected that h(x) is the cost yet to be spent to reach the goal from the
current node x. Obviously, any cost we assign as h(x) is through prediction.
The predicted cost for h(x) is generally denoted by h’(x). Consequently, the
predicted total cost is denoted by f’(x), where
f ’(x) = g(x) + h’ (x).
Now, we shall present the A* algorithm formally.

Procedure A*

Begin

1. Put a new node n to the set of open nodes (hereafter open); Measure its
f'(n) = g(n) + h’ (n); Presume the set of closed nodes to be a null set
initially;
2. While open is not empty do

Begin

If n is the goal, stop and return n and the path of n from the
beginning node to n through back pointers;
Else do
Begin

a) remove n from open and put it under closed;
b) generate the children of n;
c) If all of them are new (i.e., do not exist in the graph
before generating them Then add them to open and
label their f’ and the path from the root node through
back pointers;
d) If one or more children of n already existed as open
nodes in the graph before their generation Then those
children must have multiple parents; Under this
circumstance compute their f’ through current path and
compare it through their old paths, and keep them
connected only through the shortest path from the
starting node and label the back pointer from the
children of n to their parent, if such pointers do not
exist;
e) If one or more children of n already existed as closed
nodes before generation of them, then they too must have multiple parents; Under this circumstance, find the
shortest path from the starting node, i.e., the path (may be
current or old) through which f’ of n is minimum; If the
current path is selected, then the nodes in the sub-tree rooted
at the corresponding child of n should have revised f’ as the
g’ for many of the nodes in that sub-tree changed; Label the
back pointer from the children of n to their parent, if such
pointers do not exist;

End.

End.

End While;

End.

To illustrate the computation of f’ (x) at the nodes of a given tree or
graph, let us consider a forward reasoning problem, say the water-jug
problem, discussed in chapter 3. The state-space for such problems is often
referred to as OR graphs / trees. The production rules we would use here are
identical with those in chapter 3, considered for the above problem.

Example: Let us consider the following heuristic function, where X
and Y denote the content of 4-liter and 3-liter jugs respectively and x denotes
an arbitrary node in the search space.

h’ (x) = 2, when 0 < X < 4 AND 0 < Y < 3,
= 4, when 0 < X < 4 OR 0 < Y < 3,
=10, when i) X = 0 AND Y = 0
OR ii) X = 4 AND Y = 3
= 8, when i) X = 0 AND Y = 3
OR ii) X =4 AND Y = 0

Assume that g(x) at the root node = 0 and g(x) at a node x with
minimum distance n, measured by counting the parent nodes of each node
starting from x till the root node, is estimated to be g(x) = n. Now let us
illustrate the strategy of the best first search in an informal manner using the
water-jug problem, vide figure.

Fig.: Expanding the state-space by the A* algorithm.

Another important issue that needs to be discussed is how to select a
path,when an offspring of a currently expanded node is an already existing node. Under this circumstance, the parent that yields a lower value of g+h’
for the offspring node is chosen and the other parent is ignored, sometimes by
de-linking the corresponding arc from that parent to the offspring node.

Neuro AI

Artificial Neural Networks, Algorithms, tutorials and sofware