The Big Benchmarking Roundup
Over the last six months I’ve been looking at machine learning on the edge, publishing a series of articles trying to answer some of the questions that people have been asking about inferencing on embedded hardware.
But after half a year of posts, talks, and videos, it’s all a bit of a sprawling mess, and the overall picture of what’s really happening is rather confusing.
So here’s a great big benchmarking roundup!
Although some people have dismissed the idea of benchmarks for inferencing as irrelevant because “…it’s training times that matter,” that doesn’t really seem justified. If you take an academic approach to machine learning you will often train thousands of different models to find one that is ‘paper worthy,’ but this does not seem to be how things work out in the world.
Instead, for embedded systems, training is a sunk cost, with the final model being used thousands, perhaps even millions, of times depending on how many systems make use of it. Those models will also tend to hang around, potentially for decades if you’re talking about hardware that’s going into factories, homes, or public spaces. So in the long term it’s how fast those models run on the embedded hardware that’s important, not how long they took to train.
The methodology behind the benchmarks is discussed in the original post in the series. The latest results can be found below, and are also discussed in both the first and the final posts in the series.
While inferencing speed is probably our most important measure, these are devices intended to do machine learning at the edge. That means we also need to pay attention to environmental factors.