I'd like to sort a matrix of shape (N, 2) on the first column, where N >> system memory.
With in-memory numpy you can do:
import numpy as np
x = np.array([[2, 10], [1, 20]])
sortix = x[:, 0].argsort()  # indices that order the rows by the first column
x = x[sortix]
But that appears to require that x[:,0].argsort() fit in memory, which won't work for a memmap where N >> system memory (please correct me if this assumption is wrong).
Can I achieve this sort in-place with numpy memmap?
(assume heapsort is used for sorting and simple numeric data types are used)
The solution may be simple: use the order argument of an in-place sort. Of course, order requires field names, so those have to be added first by viewing the array with a structured dtype. The field names are strings corresponding to the column indices. Sorting can then be done in place by calling sort on that view with order set to the name of the first field.
Then convert back to a regular array with the original datatype.
That should work, although you may need to change how the view is taken depending on how the data is stored on disk; see the docs.
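The steps above can be sketched as follows. This is only an illustration under stated assumptions, not a tested recipe for very large files: the helper name sort_memmap_by_first_column, the int64 dtype, and the temporary file are all made up for the example, and the data is assumed to be stored row by row (C order) so that each row can be viewed as one two-field record.

```python
import os
import tempfile

import numpy as np


def sort_memmap_by_first_column(path, n_rows, dtype=np.int64):
    """Sort an (n_rows, 2) C-ordered array stored in `path` by column 0.

    Hypothetical helper for illustration; assumes both columns share `dtype`.
    """
    x = np.memmap(path, dtype=dtype, mode="r+", shape=(n_rows, 2))

    # Add field names: view each row as one record with fields "0" and "1".
    # The view shares memory with x and has shape (n_rows, 1).
    rec = x.view([("0", dtype), ("1", dtype)])

    # In-place sort of the records along the row axis, keyed on field "0"
    # (ties are broken by the remaining field).
    rec.sort(axis=0, order="0", kind="heapsort")

    # Convert back to a regular array with the original datatype.
    # This is again a view on the same memory, with shape (n_rows, 2).
    plain = rec.view(dtype)
    assert plain.shape == (n_rows, 2)

    x.flush()  # write the sorted data back to disk


# Small demo on a temporary file (tiny data, just to show the call).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.dat")
demo = np.memmap(demo_path, dtype=np.int64, mode="w+", shape=(2, 2))
demo[:] = [[2, 10], [1, 20]]
demo.flush()
del demo

sort_memmap_by_first_column(demo_path, 2)
print(np.fromfile(demo_path, dtype=np.int64).reshape(2, 2))
# [[ 1 20]
#  [ 2 10]]
```

Note the explicit axis=0: because the record view has shape (n_rows, 1), sorting along the default last axis would leave the data unchanged. If the file were stored column by column instead, this view would not line up with the rows, which is the case the caveat about how the view is taken refers to.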