How to work with machine learning algorithms in embedded systems?


I'm doing a project to detect (classify) human activities using an ARM Cortex-M0 microcontroller (Freedom board, FRDM-KL25Z) with an accelerometer. I intend to predict the activity of the user using machine learning.

The problem is that the Cortex-M0 is not capable of running training or prediction algorithms, so I would probably have to collect the data, train a model on my computer, and then embed the trained model somehow, which I don't really know how to do.

I saw some posts on the internet saying that you can generate a matrix of weights on the PC and embed it in a microcontroller, so that prediction becomes a straightforward function of the data you feed it. Would that be the right way of doing it?

Anyway, my question is: how can I embed a classification algorithm in a microcontroller?

I hope you guys can help me and give me some guidance; I'm kind of lost here.

Thank you in advance.

3 Answers

Accepted answer (0 votes):

I've been thinking about doing this myself to solve a problem that I've had a hard time developing a heuristic for by hand.

You're going to have to write your own machine-learning methods, because there aren't any machine learning libraries out there suitable for low-end MCUs, as far as I know.

Depending on how hard the problem is, it may still be possible to develop and train a simple machine-learning algorithm that performs well on a low-end MCU. After all, some of the older/simpler machine-learning methods were used with satisfactory results on hardware with similar constraints.

Very generally, this is how I'd go about doing this:

  1. Get the (labelled) data to a PC (through UART, SD-card, or whatever means you have available).
  2. Experiment with the data and a machine learning toolkit (scikit-learn, Weka, Vowpal Wabbit, etc.). Make sure an off-the-shelf method is able to produce satisfactory results before moving forward.
  3. Experiment with feature engineering and selection. Try to get the smallest feature set possible to save resources.
  4. Write your own machine learning method that will eventually be used on the embedded system. I would probably choose perceptrons or decision trees, because these don't necessarily need a lot of memory. Since you have no FPU, I'd only use integers and fixed-point arithmetic.
  5. Do the normal training procedure, i.e. use cross-validation to find the best tuning parameters, integer bit widths, radix positions, etc.
  6. Run the final trained predictor on the held-out testing set.
  7. If the performance of your trained predictor was satisfactory on the testing set, move your relevant code (the code that calculates the predictions) and the model you trained (e.g. the weights) to the MCU. The model/weights will not change, so they can be stored in flash (e.g. as a const array); a minimal sketch follows below.
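
To make steps 4 and 7 concrete, here is a minimal sketch of what the on-MCU prediction side could look like for a perceptron, assuming Q15 fixed-point features and weights; the feature count, names, and weight values are illustrative placeholders, not a trained model.

```c
#include <stdint.h>

#define N_FEATURES 4  /* illustrative: size of your final feature set */

/* Trained parameters exported from the PC, in Q15 fixed point
 * (1 sign bit, 15 fractional bits). Declaring them const keeps them
 * in flash and out of the KL25Z's small RAM. The values here are
 * placeholders, not a real trained model. */
static const int16_t weights_q15[N_FEATURES] = { 9830, -4915, 2458, 13107 };
static const int16_t bias_q15 = -3277;

/* Binary perceptron prediction on Q15-scaled features.
 * Each Q15 * Q15 product is Q30; the bias is rescaled to Q30 to match.
 * A 64-bit accumulator avoids overflow when summing the products. */
static int predict(const int16_t features_q15[N_FEATURES])
{
    int64_t acc = (int64_t)bias_q15 * 32768;  /* Q15 -> Q30 */
    for (int i = 0; i < N_FEATURES; i++) {
        acc += (int32_t)weights_q15[i] * (int32_t)features_q15[i];
    }
    return acc >= 0;  /* 1 = one class, 0 = the other */
}
```

A multi-class activity classifier keeps one weight vector per class and returns the argmax; a small decision tree compiles down to nested if/else statements over the same integer features, with no weight array at all.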

Answer (0 votes):

For human activity recognition (HAR) on an ARM Cortex-M0, the implementation has to be rather efficient, especially given the small RAM size. A good alternative is to train a RandomForestClassifier from scikit-learn on your PC, and then use a library like emlearn or m2cgen to generate C code that you can run on the microcontroller.
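
As a rough illustration of that workflow: emlearn, for example, can write the trained forest into a C header that you compile into the firmware. The header name and the model_predict entry point below are assumptions (based on naming the model "model" at export time); check the file your tool actually generates for the real identifiers and signature.

```c
/* Hypothetical firmware-side use of a model header generated on the PC
 * by emlearn or m2cgen. "model.h" and model_predict() are assumed names;
 * inspect the generated file for the actual API. */
#include "model.h"

int classify_window(const float features[], int n_features)
{
    /* features: e.g. per-axis mean and variance of one accelerometer
     * window. The Cortex-M0 has no FPU, so float math is emulated in
     * software; keep the feature count small or use integer features. */
    return model_predict(features, n_features);
}
```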

Answer (6 votes):

I think you may be limited by your hardware; you may want to get something a little more powerful. For your project you've chosen an M-series processor from ARM. This is the simplest platform they offer, and the architecture doesn't lend itself to the kind of processing you're trying to do. ARM has three basic classifications, as follows:

  1. M - microcontroller
  2. R - real-time
  3. A - applications

You want to get something that has strong hardware support for these complex calculations. Your starting point should be an A-series part for this. If you need to do floating-point arithmetic, you'll definitely need to start with the A-series, and probably get one with a NEON FPU.

ST's Discovery series is a nice place to start, or maybe just use a Raspberry Pi (at least for the development part)?

However, if you insist on using the M0, I think you might be able to pull it off using something lightweight like ROS-C. I know there are packages for ROS that can do it; even though it's mainly for robotics, you may be able to adapt it to what you're doing.

Dependency Free ROS

Neural Networks and Machine Learning with ROS