I'm trying to model a software agent with the following specification: it simulates a labeling robot that has to label three boxes, identified by positions 1, 2 and 3. It cannot label the same box twice, and it always starts in position 1. It can perform two actions: go_to(X), which moves the robot to position X, and label, which labels the box in front of it. Note that the robot has to figure out the best plan itself, so every path must be allowed (always without labeling the same box twice).
I modeled the problem as the DFA shown here, where an L on an edge corresponds to the action label, and 1, 2, 3 on an edge correspond to the actions go_to(1), go_to(2) and go_to(3). It works, but I need a version with around 6 to 8 states.
[Image: the DFA I modeled]
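For readers without the image, here is a minimal Python sketch of how I read the specification; it is not necessarily the exact DFA in the picture. The names `step`, `accepts` and `reachable_states` are hypothetical. It assumes a state is the pair (current position, set of boxes already labeled), that go_to(X) is always allowed (even to the current position), and that a word is accepted once all three boxes are labeled.

```python
from collections import deque

POSITIONS = (1, 2, 3)
ALPHABET = ("1", "2", "3", "L")   # "1".."3" stand for go_to(X), "L" for label
START = (1, frozenset())          # robot at position 1, nothing labeled yet


def step(state, symbol):
    """Return the successor state, or None (dead state) if the action is illegal."""
    if state is None:
        return None
    position, labeled = state
    if symbol == "L":
        if position in labeled:
            return None           # labeling the same box twice is forbidden
        return (position, labeled | {position})
    return (int(symbol), labeled)  # go_to(X) only changes the position


def accepts(word):
    """Assumption: a plan is accepted when every box has been labeled exactly once."""
    state = START
    for symbol in word:
        state = step(state, symbol)
    return state is not None and state[1] == frozenset(POSITIONS)


def reachable_states():
    """Breadth-first search over all live (non-dead) states reachable from START."""
    seen = {START}
    queue = deque([START])
    while queue:
        state = queue.popleft()
        for symbol in ALPHABET:
            nxt = step(state, symbol)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

For example, `accepts("L2L3L")` follows the plan label, go_to(2), label, go_to(3), label, while `accepts("LL")` is rejected because box 1 would be labeled twice.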
I have already minimized the DFA with the standard algorithms, but I was wondering whether it's possible to get even fewer states by using a different approach.
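To make "minimized with the standard algorithms" concrete, the following is a rough sketch of Moore-style partition refinement, which is one of the textbook minimization methods and not necessarily the one I used. It rests on assumptions of mine: states are (position, labeled boxes) pairs plus one dead state, go_to(X) is always allowed, and a state is accepting when all three boxes are labeled. All function names are hypothetical.

```python
from itertools import combinations

ALPHABET = ("1", "2", "3", "L")   # "1".."3" stand for go_to(X), "L" for label
ALL_BOXES = frozenset({1, 2, 3})
DEAD = "dead"                     # sink state for illegal action sequences


def step(state, symbol):
    """Transition function of the assumed DFA."""
    if state == DEAD:
        return DEAD
    position, labeled = state
    if symbol == "L":
        return DEAD if position in labeled else (position, labeled | {position})
    return (int(symbol), labeled)


def all_states():
    """Every (position, labeled-set) pair plus the dead state."""
    subsets = [frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)]
    return [(p, s) for p in (1, 2, 3) for s in subsets] + [DEAD]


def is_accepting(state):
    return state != DEAD and state[1] == ALL_BOXES


def minimize(states):
    """Moore's algorithm: refine blocks until no block can be split further."""
    block_of = {s: int(is_accepting(s)) for s in states}
    while True:
        # Two states stay together only if they sit in the same block and
        # every symbol sends them to the same block.
        signature = {
            s: (block_of[s],) + tuple(block_of[step(s, a)] for a in ALPHABET)
            for s in states
        }
        ids = {}
        refined = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(set(refined.values())) == len(set(block_of.values())):
            return refined
        block_of = refined


def minimal_state_count():
    return len(set(minimize(all_states()).values()))
```

Calling `minimal_state_count()` gives the number of equivalence classes for this particular encoding, which can be compared against a hand-minimized automaton. A different alphabet or acceptance condition would give a different count.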