I am designing an experiment in which a system receives 5 inputs [x1, x2, x3, x4, x5] and records target data associated with this "state". Given a new input state, the system transitions from the old state to the new one at fixed speeds (constants) and records the target value along the path. Given a constraint on total run time, what is the best subset of states to select, and the optimal trajectory through them, to maximize the total diversity of the data points generated?
The following are the variables we use:
x1 = np.arange(5, 37, 2)
x2 = np.arange(5, 53, 3)
x3 = np.arange(0, 3, 1)
x4 = np.arange(0, 4.3, 0.3)
x5 = np.arange(-3, 3.2, 0.2)
- There are a total of 357120 possible state combinations
- Total experiment time cannot exceed 20 minutes (or 1200 seconds)
- When a new state is reached, the system remains stationary for 5 seconds before beginning to move to a new state
- The following are the rates of change (per 1/10 of a second) for each variable:
x1_speed = 0.1
x2_speed = 0.1
x3_speed = 0.005
x4_speed = 0.13
x5_speed = 0.1
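For reference, the state count and the transition time between two states can be checked directly. This is a small sketch; the `transition_time` helper is my own name, and it assumes all axes move simultaneously so the slowest axis dominates (which matches the `max(...)` in my code below):

```python
import numpy as np

# Grids and speeds as defined above
x1 = np.arange(5, 37, 2)
x2 = np.arange(5, 53, 3)
x3 = np.arange(0, 3, 1)
x4 = np.arange(0, 4.3, 0.3)
x5 = np.arange(-3, 3.2, 0.2)
grids = [x1, x2, x3, x4, x5]
speeds = np.array([0.1, 0.1, 0.005, 0.13, 0.1])  # units per tenth of a second

n_states = int(np.prod([len(g) for g in grids]))
print(n_states)  # 357120 = 16 * 16 * 3 * 15 * 31

def transition_time(a, b, dwell=50):
    """Time in tenths of a second to move from state a to state b,
    plus the 5 s (50 tenths) stationary period at arrival.
    Axes move simultaneously, so the slowest axis dominates."""
    return np.max(np.abs(np.asarray(a) - np.asarray(b)) / speeds) + dwell
```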
The best strategy I have come up with so far is to randomly generate subsets of states within the total-time constraint, calculate the total Euclidean distance between all points in each subset, and keep the subset that maximizes that distance as a measure of data diversity. However, inspecting some of the interactions in the 2D plane still shows chunks of the space left unmapped (per the picture below).
Result of random state generation & optimization for Euclidian distance
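Part of why chunks stay unmapped is the objective itself: the summed pairwise distance is maximized by piling points at the extremes, while the minimum pairwise distance rewards even coverage. A toy 1-D comparison (my own example) makes the difference concrete:

```python
import numpy as np
from scipy.spatial.distance import pdist

# Two toy 1-D designs with 4 points on [0, 1]:
clustered = np.array([[0.0], [0.0], [1.0], [1.0]])   # extremes only
spread    = np.array([[0.0], [1/3], [2/3], [1.0]])   # even coverage

# Summed pairwise distance prefers the clustered design...
print(pdist(clustered).sum())  # 4.0
print(pdist(spread).sum())     # ~3.33
# ...while the minimum pairwise distance prefers even coverage.
print(pdist(clustered).min())  # 0.0
print(pdist(spread).min())     # ~0.33
```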
I've looked into path-optimization problems, but given the number of possible states, simulating all possible edges between all possible nodes seemed like the wrong approach. The fact that the points to visit must themselves be optimized, rather than being pre-defined, complicates things as well. If this is a diversity-maximization LP problem with some constraints, how do you make it recursive to map out the whole path? Alternatively, I've been reading more about a reinforcement-learning approach.
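One alternative worth trying before RL: a greedy maximin (farthest-point) heuristic. Instead of scoring whole random subsets by summed pairwise distance, grow a single path state by state: at each step, sample a pool of random candidates and append the one farthest from everything already selected that still fits in the time budget. This is a sketch under my own assumptions (axes normalized by their range so the narrow x3 axis isn't drowned out, a candidate pool of 500, the 5 s dwell folded into each transition); the names `greedy_path` and `sample_states` are mine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Grids and speeds from the question; `scale` is my assumption
# (normalize each axis by its range when measuring diversity).
grids = [np.arange(5, 37, 2), np.arange(5, 53, 3), np.arange(0, 3, 1),
         np.arange(0, 4.3, 0.3), np.arange(-3, 3.2, 0.2)]
speeds = np.array([0.1, 0.1, 0.005, 0.13, 0.1])  # units per tenth of a second
scale = np.array([g.max() - g.min() for g in grids])

def sample_states(n):
    # n random grid states, one row per state, shape (n, 5)
    return np.column_stack([rng.choice(g, size=n) for g in grids])

def greedy_path(budget=11500, dwell=50, pool=500):
    """Greedy maximin: repeatedly append the sampled candidate that is
    farthest from its nearest already-selected state, as long as the
    running time (transition + 5 s dwell, in tenths of a second)
    stays within the budget."""
    path = [sample_states(1)[0]]
    t = 0.0
    while True:
        cand = sample_states(pool)
        sel = np.array(path)
        # distance of every candidate to its nearest selected state
        d = np.linalg.norm((cand[:, None, :] - sel[None, :, :]) / scale,
                           axis=2).min(axis=1)
        for i in np.argsort(d)[::-1]:            # most novel candidate first
            dt = np.max(np.abs(cand[i] - path[-1]) / speeds) + dwell
            if t + dt <= budget:
                path.append(cand[i])
                t += dt
                break
        else:                                    # no candidate fits: stop
            return np.array(path), t
```

The returned array can populate states_df exactly as before; restarting greedy_path with a few seeds and keeping the run with the largest *minimum* pairwise distance (rather than the largest sum) should reward filling the gaps visible in the picture.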
The code I used to accomplish the above is as follows. I skipped the part where I record the states; there are extra lines that populate states_df, which I then use to calculate the total Euclidean distance, and I can add them if necessary. Please let me know if there is a better strategy/algorithm to solve this problem given the constraints, thanks!
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist
def state_gen():
    var1 = np.random.choice(x1)
    var2 = np.random.choice(x2)
    var3 = round(np.random.choice(x3), 1)
    var4 = round(np.random.choice(x4), 1)
    var5 = round(np.random.choice(x5), 1)
    state = [var1, var2, var3, var4, var5]
    return state

def state_change(state_old):
    state_new = state_gen()
    #x1
    x1_travel = state_old[0] - state_new[0]
    x1_time = abs(x1_travel/x1_speed) #in tenths of a second (speeds are per 0.1 s)
    #x2
    x2_travel = state_old[1] - state_new[1]
    x2_time = abs(x2_travel/x2_speed) #in tenths of a second
    #x3
    x3_travel = state_old[2] - state_new[2]
    x3_time = abs(x3_travel/x3_speed) #in tenths of a second
    #x4
    x4_travel = state_old[3] - state_new[3]
    x4_time = abs(x4_travel/x4_speed) #in tenths of a second
    #x5
    x5_travel = state_old[4] - state_new[4]
    x5_time = abs(x5_travel/x5_speed) #in tenths of a second
    #Record values to **states_df** here, add to return statement
    #Axes move simultaneously, so the slowest one sets the transition time
    t_change = max(x1_time, x2_time, x3_time, x4_time, x5_time)
    return state_new, t_change

def objective():
    #Initialize first state
    state_old = state_gen()
    t_tot = 0
    while t_tot < 11500: #Little under 20 minutes, in tenths of a second
        #Note: the final transition can overshoot this cap slightly
        state_new, t_change = state_change(state_old)
        #System stationary for 5 seconds
        t_change += 50
        t_tot += t_change
        state_old = state_new #was `state_init = state_new`, so the walk never advanced
    tot_euc = pdist(states_df.values, metric='euclidean').sum()
    return tot_euc, t_tot, states_df

iter_nums = 250000
tot_euc_max = 0
for i in range(iter_nums):
    tot_euc, tot_time, states_df = objective()
    if tot_euc > tot_euc_max:
        tot_euc_max = tot_euc
        print('Maximum {} attained'.format(tot_euc_max))
        print('Total run time: {}'.format(tot_time))