MXNet parameter serialisation with numpy

I want to use a pre-trained MXNet model on the s390x architecture, but it doesn't work. The pre-trained models are stored in little-endian format, whereas s390x is big-endian. So I'm trying to use the numpy file format (https://numpy.org/devdocs/reference/generated/numpy.lib.format.html), which works on both little-endian and big-endian machines.

One approach I've found is to load the model parameters on an x86 machine, call asnumpy, and save them through numpy, then load the parameters on the s390x machine using numpy and convert them back to MXNet. But I'm not really sure how to code it. Can anyone please help me with that?

UPDATE

It seems the question is unclear, so I'm adding an example that better explains what I want to do, in 3 steps:

  1. Load a preexisting model from MXNet, something like this -
net = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True, ctx=mx.cpu())
  2. Export the model. The following code saves the model parameters in a .params file, but this binary file has endian issues. So, instead of saving the model directly with the MXNet API, I want to save the parameters using numpy (https://numpy.org/devdocs/reference/generated/numpy.lib.format.html), because that would make the binary file (.npy) endian-independent. I am not sure how to convert the parameters of an MXNet model into numpy format and save them.
gluon.contrib.utils.export(net, path="./my_model")
  3. Load the model. The following code loads the model from the .params file.
net = gluon.contrib.utils.import(symbol_file="my_model-symbol.json",
                                     param_file="my_model-0000.params",
                                     ctx = 'cpu')

Instead of loading with the MXNet API, I want to use numpy to load the .npy file created in step 2. After loading the .npy file, the data needs to be converted back to MXNet so that I can finally use the model in MXNet.

Han-Kwang Nienhuys

Starting from the code snippets posted in the other question, "Save/Load MXNet model parameters using NumPy":

It appears that mxnet has an option to store data internally as numpy arrays:

mx.npx.set_np(True, True)

Unfortunately, this option doesn't do what I hoped (my IPython session crashed).

The parameters are a dict of mxnet.gluon.parameter.Parameter instances, each of them containing attributes of other special datatypes. Disentangling this so that you can store it as a large number of pure numpy arrays (or a collection of them in an .npz file) is a hopeless task.

Fortunately, python has pickle to convert complex data structures into something more or less portable:

# (mxnet/resnet setup skipped)
parameters = resnet.collect_params()

import pickle
with open('foo.pkl', 'wb') as f:
    pickle.dump(parameters, f)

To restore the parameters:

with open('foo.pkl', 'rb') as f:
    parameters_loaded = pickle.load(f)

Essentially, it looks like resnet.save_parameters(), as defined in mxnet/gluon/block.py, gets the parameters (using _collect_params_with_prefix()) and writes them to a file using a custom write function which appears to be compiled from C (I didn't check the details).

You can save the parameters using pickle instead.

For loading, load_parameters (also in block.py) contains this code (with sanity checks removed):

for name in loaded:
    params[name]._load_init(loaded[name], ctx, cast_dtype=cast_dtype, dtype_source=dtype_source)

Here, loaded is a dict as loaded from the file. From examining the code, I don't fully grasp exactly what is being loaded - params seems to be a local variable in the function that is not used anymore. But it's worth a try to start from here, by writing a replacement for the load_parameters function. You can "monkey-patch" a function into an existing class by defining a function outside the class like this:

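import mxnet as mx

def my_load_parameters(self, filename, *args, **kwargs):
    # Put your modified implementation here.
    raise NotImplementedError

# Replace the method on the class, so that every Block instance uses it.
mx.gluon.Block.load_parameters = my_load_parameters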
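
If every parameter can in fact be read out with data().asnumpy(), such a replacement could bypass the MXNet binary format and go through a NumPy .npz archive instead. The following is only a sketch and has not been tested against MXNet or on s390x. The helper names (to_native_endian, save_arrays, load_arrays, export_gluon_params, import_gluon_params) are hypothetical, and it assumes that the network is already initialised and that collect_params() yields the same parameter names on both machines, which depends on the name prefix the network receives when it is constructed.

```python
import numpy as np


def to_native_endian(arr):
    """Return a copy of arr whose dtype uses this machine's byte order."""
    return arr.astype(arr.dtype.newbyteorder("="))


def save_arrays(arrays, path):
    """Write a {name: ndarray} dict to one .npz archive.

    The .npy format records each array's byte order in its header,
    so the file can be read back correctly on a different architecture.
    """
    np.savez(path, **arrays)


def load_arrays(path):
    """Read a .npz archive into a {name: ndarray} dict in native byte order."""
    with np.load(path) as archive:
        return {name: to_native_endian(archive[name]) for name in archive.files}


def export_gluon_params(net, path):
    """Assumption: net is an initialised Gluon block (run on the x86 side)."""
    arrays = {name: p.data().asnumpy()
              for name, p in net.collect_params().items()}
    save_arrays(arrays, path)


def import_gluon_params(net, path):
    """Assumption: net has the same architecture and parameter names."""
    import mxnet as mx  # imported lazily; only needed for this step

    arrays = load_arrays(path)
    for name, p in net.collect_params().items():
        arr = arrays[name]
        p.set_data(mx.nd.array(arr, dtype=arr.dtype))
```

On the x86 machine one would call export_gluon_params(net, "resnet18.npz"), copy the file over, build the same network on s390x, initialise it, and call import_gluon_params(net, "resnet18.npz"). If the parameter names differ between the two machines, the name lookup in import_gluon_params raises a KeyError rather than loading the wrong weights.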

Disclaimers/warnings:

  • Even if you get save/load via pickle to work on a single big-endian system, it's not guaranteed to work between systems of different endianness. The pickle protocol itself is endian-neutral, but if floating-point values (deep inside the mxnet.gluon.parameter.Parameter) were stored as a raw data buffer in machine-endian convention, then pickle is not going to magically guess that groups of 4 or 8 bytes in the buffer need to be reversed. I think numpy arrays are endian-safe when pickled.
  • Pickle is not very robust if the underlying class definitions change between pickling and unpickling.
  • Never unpickle untrusted data.
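
The first warning can be illustrated with the standard library alone. The sketch below does not use MXNet: it packs three float32 values into a little-endian byte buffer (standing in for a raw data buffer written on x86), sends it through pickle, and then interprets it with both byte orders. The helper swap_float32 is a hypothetical name used only for this illustration.

```python
import pickle
import struct

values = (1.0, 2.5, -3.0)

# A raw float32 buffer in little-endian layout, as an x86 machine would write it.
raw_le = struct.pack("<3f", *values)

# pickle round-trips a bytes object unchanged; it knows nothing about
# the floats hidden inside the buffer.
restored = pickle.loads(pickle.dumps(raw_le))

# Interpreting the buffer with the byte order it was written in works.
as_le = struct.unpack("<3f", restored)

# Interpreting it as big-endian (what a naive reader on s390x would do)
# silently yields different numbers.
as_be = struct.unpack(">3f", restored)


def swap_float32(buf):
    """Reverse every group of 4 bytes, i.e. flip the byte order of float32 data."""
    return b"".join(buf[i:i + 4][::-1] for i in range(0, len(buf), 4))


# After an explicit byte swap, a big-endian read recovers the values.
fixed = struct.unpack(">3f", swap_float32(restored))

print(as_le)
print(as_be)
print(fixed)
```

The byte swap has to be done by whoever knows the element size of the buffer; pickle cannot infer it.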