I'm seeing up to 3x-4x slower results on the first prediction with my MLModel. It seems that running the first prediction warms up the model, but I couldn't find any official documentation about this behavior.
What is the right way to pre-warm my model? Should I call it on a background thread at app launch with dummy data?
As @Alladinian mentioned, starting from iOS 14 you can use the `MLModel.load(contentsOf:...)` function to pre-load a local `.mlmodel` file. If you want to pre-load an externally fetched model, make sure it's compiled first using `MLModel.compileModel(at:)`.
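Here is a minimal sketch of pre-loading at app launch using the completion-handler form of `MLModel.load(contentsOf:configuration:completionHandler:)` (iOS 14+). The model name `MyModel` and the stored property are placeholders; adapt them to your generated model class.

```swift
import CoreML

final class ModelWarmer {
    // Keep a strong reference so the loaded model isn't discarded.
    private(set) var model: MLModel?

    /// Call this early (e.g. from application(_:didFinishLaunchingWithOptions:)).
    func preload() {
        // Xcode compiles bundled .mlmodel files to an .mlmodelc directory,
        // which is what MLModel.load(contentsOf:) expects.
        guard let modelURL = Bundle.main.url(forResource: "MyModel",
                                             withExtension: "mlmodelc") else {
            return
        }
        MLModel.load(contentsOf: modelURL,
                     configuration: MLModelConfiguration()) { [weak self] result in
            switch result {
            case .success(let loadedModel):
                self?.model = loadedModel
                // Optionally run one dummy prediction here to also warm up
                // the compute units, not just the model load itself.
            case .failure(let error):
                print("Model pre-load failed:", error)
            }
        }
    }
}
```

For an externally downloaded `.mlmodel`, call `MLModel.compileModel(at:)` first and pass the resulting compiled URL to `MLModel.load(contentsOf:...)` instead.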