Working on a reinforcement learning VRPTW variant with a GNN: seeking advice on the step function implementation


I am working on a variant of the VRPTW (vehicle routing problem with time windows) using an actor-critic agent with a graph neural network and an attention mechanism. I am having trouble deciding at what stage to apply the VRPTW constraints: should I build them into the step function, or implement them separately and then call that code from both the step function and the actor-critic?
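To make the second option concrete, here is a minimal sketch of a standalone feasibility check that both the environment and the actor could call. It is only an illustration under assumptions: the instance data `NODES`, the helper names `travel_time`, `feasible_mask` and `masked_logits`, and the choice to treat capacity and time windows as hard constraints (with travel time equal to Euclidean distance and no service time) are all hypothetical, not taken from my actual code.

```python
import math

# Hypothetical toy instance: index 0 is the depot. Each node has coordinates,
# a demand, and a time window [ready, due]. Travel time equals distance.
NODES = [
    {"xy": (0.0, 0.0), "demand": 0, "ready": 0.0, "due": 100.0},  # depot
    {"xy": (3.0, 4.0), "demand": 4, "ready": 0.0, "due": 10.0},
    {"xy": (6.0, 8.0), "demand": 5, "ready": 0.0, "due": 8.0},
    {"xy": (0.0, 5.0), "demand": 9, "ready": 2.0, "due": 30.0},
]


def travel_time(i, j):
    """Euclidean distance between nodes i and j."""
    (x1, y1), (x2, y2) = NODES[i]["xy"], NODES[j]["xy"]
    return math.hypot(x2 - x1, y2 - y1)


def feasible_mask(current, clock, load_left, visited):
    """Return one bool per node: True if moving there breaks no hard constraint."""
    mask = []
    for j, node in enumerate(NODES):
        if j == 0:
            # Returning to the depot is allowed unless we are already there.
            mask.append(current != 0)
            continue
        arrival = clock + travel_time(current, j)
        start = max(arrival, node["ready"])  # wait if we arrive early
        ok = (
            j not in visited
            and node["demand"] <= load_left
            and arrival <= node["due"]
            # The vehicle must still reach the depot before it closes.
            and start + travel_time(j, 0) <= NODES[0]["due"]
        )
        mask.append(ok)
    return mask


def masked_logits(logits, mask):
    """Set infeasible actions to -inf so a softmax gives them zero probability."""
    return [l if ok else -math.inf for l, ok in zip(logits, mask)]


# At the depot, time 0, with 10 units of capacity and nothing visited yet.
mask_at_depot = feasible_mask(current=0, clock=0.0, load_left=10, visited=set())
logits = masked_logits([0.1, 0.2, 0.3, 0.4], mask_at_depot)
```

With a layout like this, the step function would use the same mask to reject (or assert against) infeasible actions, while the actor applies it to its logits before sampling. One case the sketch does not handle is a state where every entry is False; the episode would presumably need to end there, otherwise the softmax is undefined.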

This is my first time working with reinforcement learning. I am using TensorFlow and a custom environment that extends gym.

So far I have written the cost calculation outside the step function, and the reward is the negative of the total cost (reward = -total_costs).
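For comparison, the alternative would be to compute the cost incrementally inside the step function, so that each step returns the negative cost of the leg just travelled and the episode return still sums to -total_costs. The sketch below is an assumption-laden illustration, not my real environment: `TinyVRPTWEnv`, the `late_penalty` weight, the single-vehicle simplification, and treating lateness as a soft penalty are all hypothetical. It mimics the older gym `step` signature returning four values without importing gym; newer Gym and Gymnasium versions return five values (`terminated` and `truncated` instead of `done`).

```python
import math


class TinyVRPTWEnv:
    """Gym-style reset/step interface, written without gym to stay self-contained."""

    def __init__(self, coords, windows, late_penalty=10.0):
        self.coords = coords              # index 0 is the depot
        self.windows = windows            # (ready, due) per node
        self.late_penalty = late_penalty  # hypothetical soft-constraint weight

    def _dist(self, i, j):
        (x1, y1), (x2, y2) = self.coords[i], self.coords[j]
        return math.hypot(x2 - x1, y2 - y1)

    def reset(self):
        self.current, self.clock = 0, 0.0
        self.unvisited = set(range(1, len(self.coords)))
        return (self.current, self.clock)

    def step(self, action):
        # Cost of this single move: travel distance plus a lateness penalty.
        leg = self._dist(self.current, action)
        arrival = self.clock + leg
        ready, due = self.windows[action]
        lateness = max(0.0, arrival - due)
        cost = leg + self.late_penalty * lateness

        # Advance the state; wait at the customer if we arrive early.
        self.clock = max(arrival, ready)
        self.current = action
        self.unvisited.discard(action)

        # Episode ends once every customer is served and we are back at the depot.
        done = not self.unvisited and action == 0
        return (self.current, self.clock), -cost, done, {"lateness": lateness}


# Roll out a fixed route and accumulate the per-step rewards.
env = TinyVRPTWEnv(
    coords=[(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)],
    windows=[(0.0, 100.0), (0.0, 10.0), (0.0, 9.0)],
)
env.reset()
total_reward = 0.0
for action in [1, 2, 0]:
    _, reward, done, info = env.step(action)
    total_reward += reward
```

Rolling out a fixed route like this and comparing `total_reward` against a hand-computed route cost is also a cheap way to test the environment before any training is involved.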

I have not been able to test it and see results yet, but something about this approach looks wrong to me, which is why I am asking for help.

