How do I implement Negascout with dynamic tree generation?

35 Views Asked by At

I'm trying to implement Negascout for Connect4, where I am generating the tree as I am searching. Here is my code in Python (procena is the evaluation value):

def negascout(node, player, alpha, beta, max_depth):
    if(node.has_children() == False or max_depth == node.depth):
        procena = calc_procena_global(node.state)
        if (player == 1): #min
            procena = -procena
        node.procena = procena
        return procena
    score = -np.inf
    if(node.has_children() and max_depth >= node.depth):
        for i in range(len(node.expansion_order)):
            node.children[node.expansion_order[i]] = Node(1 - node.player, node.state.generate_successor_state(node.expansion_order[i]), node.root)
            node.children[node.expansion_order[i]].depth = node.depth + 1
            node.children[node.expansion_order[i]].root = node.root
            if(i == 0):
                val = -negascout(node.children[node.expansion_order[i]], 1 - player, -beta, -alpha, max_depth)
            else:
                val = -negascout(node.children[node.expansion_order[i]], 1 - player, -alpha - 1, -alpha, max_depth)
                if (((alpha < val) and (val < beta)) 
                #or (abs(alpha) == np.inf and abs(beta)==np.inf)
                ):
                    val = -negascout(node.children[node.expansion_order[i]], 1 - player, -beta, -alpha, max_depth)
            if(val > score):
                score = val
            if(score > alpha):
                alpha = score
            if(alpha >= beta):
                break
    node.procena = score
    return score

I have tested this implementation against Negamax and Negamax with alpha-beta pruning, and it acts differently for some values of depth (the evaluations get calculated incorrectly for some nodes, because the values that were supposed to progress to lower levels get cut off by the empty window iterations).

Edit:

Example1: with depth=5 at one point on a fuller board the Negascout (NS) takes a different move than Negamax alpha-beta (NAB), NS calculates the estimation values as 2 5 2 None None 3 4 while NAB as -5 -5 2 None None -3 -5 (None is a full column in the game), and given the expansion order of 2 6 1 0 5, NS selects column 2 as the optimal move while NAB selects column 6.

Example2: With the depth of 5 and on a board with the state 3 3 (max played column 3 (middle), and min played col 3), NS places the next piece in col 3 while NAB places it in 2. Estimates for the root's children in this case are 3 -6 -7 -7 -7 -6 3 for NS and 3 -6 -10 -7 -10 -6 3 for NAB. Expansion order here is 3 2 4 1 5 0 6.

0

There are 0 best solutions below