Generating all possible strategies for iterated prisoner's dilemma in python

213 Views Asked by At

The game will / can be played multiple times, i.e. the depth of the game is a fixed number n>0. What I want to do is, given that the depth of the game is fixed at n, generate all possible unique and consistent strategies for this game.

Game Overview

The game is played between two strategies X and Y. Each turn X and Y will have a history of all previous moves played, but they have have no information about what move the other side is going to play until they have played their own move. The move they can pick are either +1 or -1, and the result of what would happen if they pick either of these moves does not really matter for this question.

What do I mean by "unique and consistent strategies?"

An example of two "different" strategies that are actually the same is this: 1st strategy is to pick -1 10 times in a row and then on the 11th turn and onwards it would only pick +1, and the 2nd strategy is to pick -1 9 times in a row and then +1 onwards, but let's say the depth of the game is only n=8, so in this case both strategies would produce the same answer against any other strategies. Thus a strategy X is different from a strategy Y if there exist a strategy M where X and Y deviates at some point in their answer against the strategy M within the predefined depth of the game n. By consistent I mean that given the history of the game, the answer from the strategy should always be the same (random is not allowed).

My question

My main question is, if I want a setup where I can simulate all different strategies, what would be a good way to formulate these "strategies" in python? Say we are playing this game at depth 5, which would produce 1024 unique games. What would be a good way to define a strategy or think about a strategy in such a way that I can feed a strategy X two lists [1, 1], [1, -1], where the first list are the two moves X have played, and the second list contains the two moves the other strategy Y have played, and then X would have to pick a move based on this information. Like, let's say the strategy for X is to play +1 only if Y haven't played -1 the last two turns, and the strategy for Y is to just alternate between +1 and -1. What is a good way to represent those two strategies in such a way that I can easily evaluate them, but most importantly, What is a good interpretation of a strategy in code such that I generate all possible strategies that are unique and consistent?

Note: I am not interested in generating all possible games, what I am interested in is generating all possible strategies.

Example of different strategies

Examples of strategies based on this video by Veritasium, which was my main motivation for asking this question. A strategy called Tit for Tat is to copy your opponent's previous move, where your first move is to always play +1. Another strategy is to play +1 and then on the fifth move always play -1 regardless of your opponent. Another strategy similar to Tit for Tat is to only play -1 if your opponent has played -1 two times in a row. The question is what is a good way to formulate these strategies in code such that all possible unique and consistent strategies can be generated.

1

There are 1 best solutions below

8
On

OP wants a way to formulate all possible strategies. In the setup suggested a strategy is selecting one of the two options [1, -1] This makes the setup a binary tree, with X and Y alternating between each other. As the winning situation is not defined the only stop parameter is the tree depth. All strategies here are guaranteed to be different as a binary tree would have only one path to root via parents.

Following examples takes from the small set OP suggested above

leaves = []


class Node:
    def __init__(self, parent, depth, data, root_data):
        self.value = data
        self.left = None
        self.right = None
        self.parent = parent
        if len(root_data) > 0:
            new_data = root_data[0]
            if new_data == -1:
                self.left = Node(self, depth - 1, new_data, root_data[1:])
            elif new_data == 1:
                self.right = Node(self, depth - 1, new_data, root_data[1:])
        else:
            if depth == 0:
                leaves.append(self)
                return
            else:
                self.left = Node(self, depth - 1, -1, [])
                self.right = Node(self, depth - 1, 1, [])


def print_tree(node, level=0, prefix="Root: ", connector="    "):
    if node is not None:
        print(" " * (level * 4) + prefix + str(node.value))
        if node.right is not None:
            print_tree(node.right, level + 1, "├──R: ", "│   ")
        if node.left is not None:
            print_tree(node.left, level + 1, "└──L: ", "    ")


def traverse_to_parent(node):
    path = []
    while node is not None:
        path.append(node.value)
        node = node.parent
    path.reverse()
    return path


if __name__ == '__main__':
    max_depth = 6
    X = [1, 1]
    Y = [1, -1]
    till_now = [e for d in zip(X, Y) for e in d]
    previous_depth = len(X) + len(Y)
    tree = Node(None, max_depth-1, till_now[0], till_now[1:])
    print_tree(tree)
    print("Strategies::")
    for leaf in leaves:
        path = traverse_to_parent(leaf)
        print(f"X:: {path[::2]}  -----   Y:: {path[1::2]}")

Output

/usr/local/bin/python3.9 nodes.py 
Root: 1
    ├──R: 1
        ├──R: 1
            └──L: -1
                ├──R: 1
                    ├──R: 1
                    └──L: -1
                └──L: -1
                    ├──R: 1
                    └──L: -1
Strategies::
X:: [1, 1, -1]  -----   Y:: [1, -1, -1]
X:: [1, 1, -1]  -----   Y:: [1, -1, 1]
X:: [1, 1, 1]  -----   Y:: [1, -1, -1]
X:: [1, 1, 1]  -----   Y:: [1, -1, 1]

Process finished with exit code 0