# A sticking point

People often ask how to retrieve the principal variation line after a search has completed.  The principal variation is the line (predicted sequence of moves) that the program thinks represents best play for both sides.  This is not returned by an un-modified alpha-beta function -- all alpha-beta returns is a score.

What follows are modifications to a normal alpha-beta search, which collect the principal variation.  Changes are highlighted.

typedef struct tagLINE {

int cmove;              // Number of moves in the line.

MOVE argmove[moveMAX];  // The line.

}   LINE;

int AlphaBeta(int depth, int alpha, int beta, LINE * pline)
{

LINE line;

if (depth == 0) {

pline->cmove = 0;
return Evaluate();

}
GenerateLegalMoves();
while (MovesLeft()) {
MakeNextMove();
val = -AlphaBeta(depth - 1, -beta, -alpha, &line);
UnmakeMove();
if (val >= beta)
return beta;
if (val > alpha) {
alpha = val;

pline->argmove[0] = ThisMove();

memcpy(pline->argmove + 1, line.argmove,

line.cmove * sizeof(MOVE));

pline->cmove = line.cmove + 1;

}
}
return alpha;
}

This is somewhat inefficient because it uses a lot of stack space, but the code would work and would not be slow.  With these changes, if the function returns a value between alpha and beta, it returns not only the value, but the line (series of predicted moves) that resulted in the value.  This is true for any node in the tree, including the root, which is why this is worth doing.  You may some call to alpha-beta:

val = AlphaBeta(depth, -INFINITY, INFINITY, &line);

What you get is the value of the position, and in the "line" buffer are the moves that constitute the predicted line.  The "line" data structure is just an array of moves that were made in this line, and an integer that lets you know how many there were in the line.

If you call AlphaBeta with depth = 0, the function calls "Evaluate()" and returns that value.  There are no moves searched, so there is no move selected, so along with the score is a line of length zero.

If you call the search with some other depth, and you find a move whose score is between alpha and beta, you receive not just the score, but the moves that resulted in the score.  In order to create the line for this instance of AlphaBeta, you take the move that was just searched, stick it in the line buffer that was passed in, and append the moves from the local line buffer.

You might be asking, "Why is there both a line buffer passed in, and a new one allocated on the stack with each recursive call?"  The reason is that you don't want to wreck a line that has already been created, if you find a partial line somewhere later in the tree, and that line gets refuted.  It is not true that if you find a node whose score is between alpha and beta, that that node will work its way all the way to the root of the tree.  There are lots of fragmentary PV's that get discarded.

For those of you who are aghast because I used "memcpy" in an inner loop, it costs nothing since it is rarely executed.

# Another idea

The other obvious way to collect the PV is to retrieve it from the main transposition table after the score comes back.  One of the fields in each transposition table element is the best move found in the position.  Since each element in the PV contains a score that is between alpha and beta, there should be a "best move" stored in each hash element.

It should be possible to follow the hash element chain and extract the PV from the transposition table, easy as pie.

I know that many programs (including some professional programs) do this, but I have not tried it.