Is ndarray access faster than recarray access?

1k Views Asked by kcw78 At 03 November 2018 at 00:42

I was able to copy my recarray data to an ndarray, do some calculations, and return the ndarray with updated values.

Then, I discovered the append_fields() capability in numpy.lib.recfunctions, and thought it would be a lot smarter to simply append 2 fields to my original recarray to hold my calculated values.

When I did this, I found the operation was much, much slower. I didn't even have to time it: the ndarray-based process takes a few seconds, compared to over a minute with the recarray, and my test arrays are small (fewer than 10,000 rows).

Is this typical? Is ndarray access really that much faster than recarray access? I expected some performance degradation due to access by field name, but not this much.
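For reference, here is a minimal sketch of the kind of append_fields() call described above. It is not the original code: the array contents and the field names (id, x, x_squared, x_doubled) are made up for illustration. One detail worth checking in any such call is the return type, since append_fields() returns a masked array unless usemask=False is passed.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical two-field record array standing in for the real data.
rec = np.rec.fromarrays(
    [np.arange(5), np.linspace(0.0, 1.0, 5)],
    names="id,x",
)

# Calculated values to attach as new fields (names are illustrative only).
x_squared = rec.x ** 2
x_doubled = rec.x * 2.0

# With the default arguments (usemask=True) the result is a MaskedArray.
default_result = rfn.append_fields(rec, "x_squared", x_squared)
print(type(default_result).__name__)

# usemask=False skips the mask; asrecarray=True returns a recarray
# instead of a plain structured array.
extended = rfn.append_fields(
    rec,
    names=["x_squared", "x_doubled"],
    data=[x_squared, x_doubled],
    usemask=False,
    asrecarray=True,
)
print(type(extended).__name__)
print(extended.dtype.names)
```

Printing the type of the returned array is a quick way to confirm which kind of array the rest of the code is actually looping over.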


There is 1 best solution below

kcw78 On 05 November 2018 at 19:51

Updated 15-November-2018
I expanded my timing tests to clarify the differences in performance between ndarray, structured array, recarray, and masked array (a type of record array?). There are subtle differences among them. See the discussion here:
numpy-discussion:structured-arrays-recarrays-and-record-arrays

Here are the results of my performance tests. I built a very simple example (using one of my HDF5 data sets) to compare performance with the same data stored in 4 types of arrays: ndarray, structured array, recarray, and masked array. After the arrays are constructed, each is passed to a function that simply loops through the rows and extracts 12 values from each row. The functions are called from the timeit function with a single pass (number=1). This test measures only the array read function and avoids all other calculations.
Results for 9,000 rows (times in seconds, as returned by timeit):

for ndarray:          0.034137165047070615
for structured array: 0.1306827116913577
for recarray:         0.446010040784266
for masked array:     31.33269560998199

Based on this test, access performance decreases with each type in the order listed. Structured array and recarray access are roughly 4x and 13x slower than ndarray access, respectively, although all three finish in a fraction of a second. Masked array access, however, is roughly 900x slower than ndarray access. That explains the seconds-versus-minutes difference I see in my complete example, since append_fields() returns a masked array unless usemask=False is passed. Hopefully this data is useful to others who encounter this issue.
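A rough sketch of a test along these lines is shown here. It is not the original test code: it uses random data instead of an HDF5 data set, hypothetical field names (f0 to f11), and fewer rows than the 9,000 used above so that it runs quickly. Absolute timings will vary with the machine and the NumPy version.

```python
import timeit
import numpy as np

N_ROWS = 2_000    # assumption: smaller than the 9,000 rows above
N_FIELDS = 12     # 12 values are read from each row
FIELD_NAMES = [f"f{i}" for i in range(N_FIELDS)]  # hypothetical names

# The same values stored four ways.
rng = np.random.default_rng(0)
plain = rng.random((N_ROWS, N_FIELDS))                  # ndarray

structured = np.empty(N_ROWS, dtype=[(name, "f8") for name in FIELD_NAMES])
for col, name in enumerate(FIELD_NAMES):
    structured[name] = plain[:, col]                    # structured array

records = structured.view(np.recarray)                  # recarray
masked = np.ma.array(structured)                        # masked array


def read_by_index(arr):
    """Loop over rows and read every value by column index."""
    last = None
    for row in arr:
        last = [row[col] for col in range(N_FIELDS)]
    return last


def read_by_name(arr):
    """Loop over rows and read every value by field name."""
    last = None
    for row in arr:
        last = [row[name] for name in FIELD_NAMES]
    return last


cases = {
    "ndarray": (read_by_index, plain),
    "structured array": (read_by_name, structured),
    "recarray": (read_by_name, records),
    "masked array": (read_by_name, masked),
}

timings = {}
for label, (reader, arr) in cases.items():
    # Single pass, as in the test described above (number=1).
    timings[label] = timeit.timeit(lambda: reader(arr), number=1)
    print(f"for {label}: {timings[label]:.4f} s")
```

The row-by-row loop is deliberate here because it mirrors the access pattern being measured. Reading a whole field at once (for example structured["f0"]) avoids creating a per-row object and is usually the better choice when the calculation allows it.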
