Vectorizing a loop via numpy for qlearner/dyna-q implementation


I have a 100 x 4 2D NumPy array A (the Q-table) and another array B (the experience table) that continuously has 4-element tuples appended to it (state, action, state_prime, reward). I need to randomly select a row from B, extract its 4 elements, and call a function update(s, a, s', r) that updates A with those 4 elements as arguments. I need to do this x times.

My current implementation involves a for loop and python list:

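import random

B = []                          # experience table
# given a new experience (s, a, s_prime, r):
B.append((s, a, s_prime, r))    # store it as a single tuple
for _ in range(x):              # replay x randomly chosen experiences
    exp = random.choice(B)
    update(*exp)                # update(s, a, s_prime, r) modifies A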

I would like to get rid of the for loop and vectorize the solution, but only if it's faster than using Python lists. I've tried starting B as an empty np.array([]), starting from a (0, 4)-shaped array and vstacking each new row onto it, shuffling, etc., but my current approach with Python lists is the fastest by far.
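For reference, here is a rough sketch of what a fully vectorized replay step might look like. It assumes update() applies the standard Q-learning rule, and the names batch_update, alpha, and gamma are hypothetical since the post does not show them. It is also not equivalent to the loop: every target is computed from the Q-table as it stood before the batch, and a repeated (s, a) pair is only written once.

```python
import numpy as np

def batch_update(Q, batch, alpha=0.2, gamma=0.9):
    """Apply one Q-learning step to every sampled row at once.

    Assumed rule (not shown in the post):
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (r + gamma * max(Q[s_prime]))
    """
    # The float table stores states and actions as floats; cast back to indices.
    s = batch[:, 0].astype(np.intp)
    a = batch[:, 1].astype(np.intp)
    s_prime = batch[:, 2].astype(np.intp)
    r = batch[:, 3]
    # All targets are computed from the Q-table before any write happens.
    target = r + gamma * Q[s_prime].max(axis=1)
    # With repeated (s, a) pairs only one of the writes survives.
    Q[s, a] = (1 - alpha) * Q[s, a] + alpha * target
    return Q

rng = np.random.default_rng(0)
Q = np.zeros((100, 4))
B = np.array([(0, 1, 5, 1.0), (5, 2, 7, -1.0), (7, 0, 0, 0.5)])
# Draw all 50 row indices in a single call, with replacement.
batch = B[rng.integers(0, len(B), size=50)]
batch_update(Q, batch)
```

Whether this beats the list-based loop depends on x and on whether the changed update order is acceptable for the learner, so it is worth timing both.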

Edit 1: I don't know what to initialize the rows of B to, so I can't preallocate B and write each tuple in by index.
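One possible way around the initialization problem is to preallocate with np.empty and track how many rows are actually filled, so the initial contents never matter. The sketch below is only an illustration: ExperienceBuffer, capacity, and sample are made-up names, and the doubling strategy is an assumption rather than anything from the post.

```python
import numpy as np

class ExperienceBuffer:
    """Preallocated experience table; only rows [0, size) hold real data."""

    def __init__(self, capacity=1024):
        # Unfilled rows contain arbitrary values; `size` marks what is valid.
        self.data = np.empty((capacity, 4), dtype=np.float64)
        self.size = 0

    def append(self, s, a, s_prime, r):
        if self.size == len(self.data):
            # Double the capacity when full so appends stay cheap on average.
            grown = np.empty((2 * len(self.data), 4), dtype=self.data.dtype)
            grown[:self.size] = self.data
            self.data = grown
        self.data[self.size] = (s, a, s_prime, r)
        self.size += 1

    def sample(self, n, rng):
        # Draw all n row indices in one call, with replacement,
        # restricted to the filled part of the table.
        idx = rng.integers(0, self.size, size=n)
        return self.data[idx]

rng = np.random.default_rng(0)
buf = ExperienceBuffer(capacity=2)
for row in [(0, 1, 5, 1.0), (5, 2, 7, -1.0), (7, 0, 0, 0.5)]:
    buf.append(*row)
samples = buf.sample(10, rng)  # shape (10, 4)
```

Because the table is float64, states and actions come back as floats and need casting to int before they are used to index A.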
