While iterating over a numpy array, I can't call methods of the objects stored in it

Basic Goal of this part of the code: A number of balls (no_balls) move in random directions.

I am trying to move from python lists to numpy arrays for better performance. Here is the reduced code.

Basic problem: my iterator gives me objects of type ndarray, not vpy.sphere, so accessing sphere.pos on the objects I am iterating over fails. Or is this not possible at all, since NumPy is built for numbers? Are there alternatives for performance?

import vpython as vpy
import numpy as np

# Create and fill a numpy array with random-size balls
balls = np.empty([no_balls], dtype=vpy.sphere)

with np.nditer(balls, flags=['refs_ok'], op_flags=['readwrite']) as b_it:
    for b in b_it:
        b[...] = (vpy.sphere( radius=random_in_range(ball_min_r,ball_max_r), 
    debug_msg('populated balls list')

#Main Loop
debug_msg('Starting Main Loop')
while True:
    with np.nditer(balls, flags=['refs_ok'], op_flags=['readwrite']) as b_it:
        # The actual loop manipulates the position, but the problem is that I can't access the position of the sphere objects. type(b) returns numpy.ndarray.
        for b in b_it:
#Above outputs
<class 'numpy.ndarray'>
Traceback (most recent call last):
  File "path", line 93, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'pos'

How do I call methods and access members of the objects in the array? And as a side note, why do I need to write b[...] instead of b? It seems redundant.


A simple class:

In [149]: class Foo():
     ...:     def __init__(self,i):
     ...:         self.i = i
     ...:     def __repr__(self):
     ...:         return f'<FOO {self.i}>'
In [150]: Foo(323)
Out[150]: <FOO 323>

A list of such objects:

In [151]: alist = [Foo(i) for i in range(10)]

An equivalent object dtype array:

In [152]: arr = np.array(alist)
In [153]: arr.dtype
Out[153]: dtype('O')
In [154]: arr
array([<FOO 0>, <FOO 1>, <FOO 2>, <FOO 3>, <FOO 4>, <FOO 5>, <FOO 6>,
       <FOO 7>, <FOO 8>, <FOO 9>], dtype=object)

Fetching the attribute from the list:

In [155]: [f.i for f in alist]
Out[155]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [156]: timeit [f.i for f in alist]
826 ns ± 8.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

and from the array (slower):

In [157]: timeit [f.i for f in arr]
1.66 µs ± 15.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Using nditer: you studied the docs enough to get the flags right, but didn't grasp that b is a 0-d object array, not a Foo. b.item() extracts the element:

In [158]: with np.nditer(arr, flags=['refs_ok'], op_flags=['readwrite']) as b_it:
     ...:     for b in b_it:
     ...:         print(b, b.dtype, b.shape, b.item())
<FOO 0> object () <FOO 0>
<FOO 1> object () <FOO 1>
<FOO 2> object () <FOO 2>
<FOO 3> object () <FOO 3>
<FOO 4> object () <FOO 4>
<FOO 5> object () <FOO 5>
<FOO 6> object () <FOO 6>
<FOO 7> object () <FOO 7>
<FOO 8> object () <FOO 8>
<FOO 9> object () <FOO 9>
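
On the b[...] side note from the question, a minimal sketch may help. Inside the loop, b is a 0-d array viewing one slot of the array, so `b = value` only rebinds the local name, while `b[...] = value` writes through to the slot. The `Item` class here is a hypothetical stand-in, not anything from vpython.

```python
import numpy as np

class Item:
    """Hypothetical stand-in for any Python object stored in the array."""
    def __init__(self, i):
        self.i = i

arr = np.array([Item(i) for i in range(3)], dtype=object)

# Rebinding the loop variable leaves the array untouched.
with np.nditer(arr, flags=['refs_ok'], op_flags=['readwrite']) as it:
    for b in it:
        b = Item(-1)
unchanged = [x.i for x in arr]

# Assigning through b[...] writes into the slot that b views.
with np.nditer(arr, flags=['refs_ok'], op_flags=['readwrite']) as it:
    for b in it:
        b[...] = Item(b.item().i * 10)
changed = [x.i for x in arr]
```

So the `[...]` is not redundant: it is what turns the statement into an in-place write rather than a plain Python name assignment.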

Fetching a list of the attribute:

In [159]: res = []
     ...: with np.nditer(arr, flags=['refs_ok'], op_flags=['readwrite']) as b_it:
     ...:     for b in b_it:
     ...:         res.append(b.item().i)
In [160]: res
Out[160]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

And a poor timing:

In [161]: %%timeit
     ...: res = []
     ...: with np.nditer(arr, flags=['refs_ok'], op_flags=['readwrite']) as b_it:
     ...:     for b in b_it:
     ...:         res.append(b.item().i)

7.25 µs ± 60.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

One of the cleaner ways of performing an action on elements of an object array is with frompyfunc:

In [162]: f = np.frompyfunc(lambda b:b.i,1,1)
In [163]: f(arr)
Out[163]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=object)
In [164]: timeit f(arr)
2.1 µs ± 8.58 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

That is still slower than the plain list comprehension, though if we want an array instead of just a list, it is better than:

In [165]: timeit np.array([f.i for f in arr])
5.79 µs ± 21.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
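
Applied to the original problem, the two approaches above might look like the sketch below. `Ball` is a hypothetical stand-in for vpy.sphere with a plain float `pos` and an invented `move` method, so this only illustrates how to reach methods and attributes, not the vpython API.

```python
import numpy as np

class Ball:
    """Hypothetical stand-in for vpy.sphere; `pos` is just a float here."""
    def __init__(self, pos):
        self.pos = pos

    def move(self, step):
        self.pos += step
        return self.pos

balls = np.array([Ball(float(i)) for i in range(4)], dtype=object)

# Plain iteration over an object array yields the objects themselves,
# so methods and attributes are available directly.
for ball in balls:
    ball.move(1.0)

# frompyfunc wraps the method call; the step array is paired
# element by element with the balls array.
move = np.frompyfunc(lambda ball, step: ball.move(step), 2, 1)
new_pos = move(balls, np.array([0.5, 0.5, 0.5, 0.5]))

positions = [ball.pos for ball in balls]
```

The result of `move` is again an object dtype array, so any numeric work on it still needs a conversion such as `new_pos.astype(float)`.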

The nditer docs need a stronger performance disclaimer. nditer is useful and fast when used in C or Cython code, but when accessed from Python code it is inferior to more obvious alternatives. Its extra bells and whistles may be useful in some cases, but mostly I see it as a bridge to properly compiled code, not as an end in itself.

At the heart of the performance issue is that Foo is a Python class, so accessing the i attribute goes through ordinary Python attribute lookup for every element. It can't make use of any of the fast compiled numpy numeric methods.
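
One way to get some benefit from numpy anyway, sketched here under assumptions, is to keep the numeric state (positions, velocities) in float arrays and leave the objects in an ordinary list. The arithmetic is then vectorized, and only the final copy into each object is a Python loop. `Ball`, `n_balls`, `vel` and `dt` are hypothetical names; with real vpython the assignment to `pos` would presumably need a vpy.vector rather than a tuple.

```python
import numpy as np

class Ball:
    """Hypothetical stand-in for vpy.sphere."""
    def __init__(self):
        self.pos = (0.0, 0.0, 0.0)

n_balls = 5
rng = np.random.default_rng(0)

balls = [Ball() for _ in range(n_balls)]          # objects stay in a list
pos = np.zeros((n_balls, 3))                      # numeric state as floats
vel = rng.uniform(-1.0, 1.0, size=(n_balls, 3))   # random directions
dt = 0.1

for _ in range(10):
    pos += vel * dt                    # one vectorized update for all balls
    for ball, p in zip(balls, pos):    # copy the results into the objects
        ball.pos = tuple(p)
```

Whether this is faster overall depends on how much arithmetic happens per step compared with the cost of updating the objects, which still happens one at a time.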