How to optimize scoring in a match


I'm designing a decision-making system for a contest in which players aim at different targets. The probability of scoring differs from target to target, and each time a player scores on a target, the probability of scoring on that target again decreases. Players have a limited number of attempts.

The only tools that come to mind are Markov chains and game theory, but I don't know how to apply them here, and I wonder whether there are other mathematical techniques for maximizing my score.

I would deeply appreciate any piece of guidance.

There are 2 answers below.

BEST ANSWER

Markov Processes: A Non-Solution

I don't think a Markov process is going to work here. A stochastic process has the Markov property if the conditional probability distribution of its future states (conditional on both past and present states) depends only on the present state, not on the sequence of events that preceded it. Since the probability of hitting a target decreases with each successful hit, the future of your process depends on its past, and therefore your process is not Markov.

Recursive Brute-Force Search: An Adequate Solution

One way to solve this is via exploration of a search tree. The following C++ code describes the operation:

#include <limits>
#include <iostream>
#include <cstdlib>  //For rand() and RAND_MAX
#include <vector>

std::vector<float> ScoreOn(const std::vector<float> &probs, int target){
  std::vector<float> temp = probs; //Copy original array to avoid corrupting it
  temp[target]          *= 0.9;    //Decrease the probability
  return temp;                     //Return the new array
}

std::pair<float,int> Choice(
  const std::vector<float> &probs,
  const std::vector<float> &values,
  int depth
){
  if(depth==0)                      //We gotta cut this off somewhere
    return std::make_pair(0.0,-1);  //Return 0 value at the end of time

  //Any real choice will have a value greater than this
  float valmax = -std::numeric_limits<float>::infinity();
  //Will shortly be filled with a real choice value
  int choice = -1;

  //Loop through all targets
  for(int t=0;t<(int)probs.size();t++){
    float hit_value  = values[t]+Choice(ScoreOn(probs,t),values,depth-1).first;
    float miss_value = 0        +Choice(probs           ,values,depth-1).first;
    float val        = probs[t]*hit_value+(1-probs[t])*miss_value;
    if(val>valmax){ //Is current target a better choice?
      valmax = val;
      choice = t;
    }
  }
  return std::make_pair(valmax,choice);
}

int main(){
  //Generate sample data and print the current best answer
  int target_count = 8; //Number of targets
  std::vector<float> probs,values;
  for(int t=0;t<target_count;t++){
    probs.push_back(rand()/(float)RAND_MAX);
    values.push_back(80.0*(rand()/(float)RAND_MAX));
  }

  std::cout<<Choice(probs,values,6).first<<std::endl;
}

Now, consider trying to hit a target. If we hit it, the value of our action (call it H) is the value of the target plus the value of all of our future actions. If we miss (M), then the value is zero plus the value of all of our future actions. We weight these values by the probability p of each happening, to get the equation:

Value = pH+(1-p)M

We calculate the future values in the same way, thereby generating a recursive function. Since this could go on forever, we bound the depth of the recursion to some number of levels. At each level the tree branches two ways (hit and miss) for each of t targets, so a search of depth d makes ((2t)^(d+1)-1)/(2t-1) recursive calls and reaches (2t)^d leaves. Therefore, choose your depth wisely to avoid thinking forever.

The above code, with optimizations, runs in 0.044s for depth=5 and 0.557s for depth=6 on my Intel i5 M480 CPU (which is now about five years old). For depth=6 with t=8 targets, that is about 17.9 million calls and 16,777,216 leaves, so any individual leaf outcome contributes very little to the answer. Unless your value function is weird, there's no need to consider the future any farther than that.

Branch and Bound: An Improved Solution

But if you need to explore a larger space, or to go faster, you could consider Branch and Bound methods. These work the same way, except that we decline to expand any subtree that is provably worse than a solution we've already found. Deriving a tight upper bound then becomes the primary challenge.
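As a sketch of what that pruning could look like here (this is one possible bound, not the only one): no subtree of depth d can ever be worth more than d times the largest target value, since each future shot scores at most that value with probability at most 1. We can also hoist the miss subtree out of the loop, since it is identical for every target. The 0.9 decay factor is carried over from the `ScoreOn` above.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

//Decrease a target's hit probability after a successful hit
//(0.9 decay factor, matching the brute-force code above)
std::vector<float> ScoreOn(const std::vector<float> &probs, int target){
  std::vector<float> temp = probs;
  temp[target] *= 0.9f;
  return temp;
}

//Branch and bound: same recursion as before, but skip any target whose
//optimistic bound cannot beat the best value already found at this node
std::pair<float,int> ChoiceBB(
  const std::vector<float> &probs,
  const std::vector<float> &values,
  int depth
){
  if(depth==0)
    return std::make_pair(0.0f,-1);

  //No subtree of depth d is worth more than d*vmax: each future shot
  //scores at most vmax with probability at most 1
  const float vmax = *std::max_element(values.begin(), values.end());

  float valmax = -std::numeric_limits<float>::infinity();
  int   choice = -1;

  //The miss subtree is the same for every target, so compute it once
  const float miss_value = ChoiceBB(probs, values, depth-1).first;

  for(int t=0;t<(int)probs.size();t++){
    //Upper bound on this target's value, assuming a perfect hit subtree:
    //val = p*hit + (1-p)*miss <= p*values[t] + (depth-1)*vmax
    float bound = probs[t]*values[t] + (depth-1)*vmax;
    if(bound<=valmax) continue;  //Provably no better: prune the hit subtree

    float hit_value = values[t] + ChoiceBB(ScoreOn(probs,t),values,depth-1).first;
    float val       = probs[t]*hit_value + (1-probs[t])*miss_value;
    if(val>valmax){
      valmax = val;
      choice = t;
    }
  }
  return std::make_pair(valmax,choice);
}
```

The prune is safe because the bound dominates the true value of the subtree; how much it saves depends entirely on how tight the bound is for your particular probabilities and values.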


Why not use a greedy algorithm?

You can't do better at each point in time than to choose the target with the highest expected value (probability of hit multiplied by value of target).
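A minimal sketch of that greedy rule, assuming the same 0.9 decay factor as the accepted answer's model (the deterministic decay after every attempt in `GreedyValue` is a further simplification of that model, not something stated in this answer):

```cpp
#include <cassert>
#include <vector>

//Greedy policy: aim at the target with the highest expected value p*v
int GreedyChoice(const std::vector<float> &probs, const std::vector<float> &values){
  int   best    = -1;
  float bestval = -1.0f;
  for(int t=0;t<(int)probs.size();t++){
    float ev = probs[t]*values[t];  //Expected value of one shot at target t
    if(ev>bestval){
      bestval = ev;
      best    = t;
    }
  }
  return best;
}

//Total expected score of playing the greedy policy for `shots` attempts,
//decaying the chosen target's probability after each attempt
float GreedyValue(std::vector<float> probs, const std::vector<float> &values, int shots){
  float total = 0.0f;
  for(int s=0;s<shots;s++){
    int t = GreedyChoice(probs,values);
    total    += probs[t]*values[t];  //Expected gain of this shot
    probs[t] *= 0.9f;                //Hitting lowers future odds on this target
  }
  return total;
}
```

Note that because each hit lowers future probabilities, the greedy choice is not guaranteed to match the value-maximizing choice found by the recursive search above; it is a fast heuristic rather than a proven optimum.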