Merging TensorFlow Lite and μTensor

A new inference engine for micro-controllers?

Alasdair Allan
3 min read · May 8, 2019

In a joint announcement today, the TensorFlow Lite team at Google and the μTensor team at Arm revealed that the two major inference engine platforms for micro-controllers will be joining forces.

The SparkFun Edge board announced at the TensorFlow Dev Summit. (📷: Alasdair Allan)

“The ability to deploy machine learning models on edge devices opens doors for smarter devices that are power- and bandwidth-efficient. We believe this will usher in an era of TinyML battery-powered devices that can take advantage of low-power wireless communication”

Machine learning development is done in two stages. An algorithm is initially trained on a large set of sample data, on a fast powerful machine or cluster; then the trained network is deployed into an application that needs to interpret real data. It is this deployment, or “inference,” stage where both these projects play their part, as both TensorFlow Lite for Microcontrollers and μTensor are light-weight inferencing engines intended to run on “bare metal” systems such as the SparkFun Edge board, within only a few kilobytes of memory.
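To give a flavour of what that inference stage involves, here is a toy sketch of the trick that lets these engines fit into a few kilobytes: the trained model’s weights are frozen, quantized down to 8-bit integers, and evaluated with integer arithmetic. This is illustrative only; the names, numbers, and quantization scheme below are made up for the example and are not the actual TensorFlow Lite for Microcontrollers or μTensor APIs.

```python
def quantize(values, scale):
    """Map float values to int8 with a simple symmetric scheme."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dense_int8(inputs_q, weights_q, bias, in_scale, w_scale):
    """One fully connected neuron evaluated in integer arithmetic."""
    # The multiply-accumulate loop runs entirely on small integers,
    # which is what a micro-controller without an FPU does quickly.
    acc = sum(i * w for i, w in zip(inputs_q, weights_q))
    # Dequantize once at the end, then add the bias.
    return acc * in_scale * w_scale + bias

# "Trained" float weights and a sample input (illustrative values only).
weights = [0.5, -0.25, 1.0]
inputs = [0.2, 0.4, 0.6]
w_scale, in_scale = 0.01, 0.01

y = dense_int8(quantize(inputs, in_scale), quantize(weights, w_scale),
               0.1, in_scale, w_scale)
```

The float version of this neuron would compute 0.5×0.2 − 0.25×0.4 + 1.0×0.6 + 0.1 = 0.7, and the quantized version recovers the same answer here while storing each weight in a single byte.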

Today’s news comes in the wake of the first experimental support for embedded platforms in TensorFlow Lite being announced on stage by Pete Warden, part of the TensorFlow Lite team at Google, during the TensorFlow Dev Summit in Santa Clara back in March.

Pete Warden from the TensorFlow Lite team at Google talking about the SparkFun Edge. (📹: Google)

Although μTensor has been around a while longer, and was amongst the first projects to bring the current generation of machine learning to micro-controller-sized devices, the arrival of TensorFlow Lite for Microcontrollers has brought a lot more visibility to the idea that we can run “tiny models for tiny computers” on the edge. No cloud, or networking, needed.

TensorFlow trained 3-layer MNIST model running on Mbed micro-controller using uTensor. (📹: stolenDoggy)

The arrival of hardware designed to run machine learning models at vastly increased speeds, inside relatively low power envelopes, and without needing a connection to the cloud, makes edge-based computing that much more of an attractive proposition. The ecosystem around edge computing is starting to feel far more mature.

That means that the biggest growth area in machine learning practice over the next year or two could well be around inferencing, rather than training.

“Both μTensor and TensorFlow Lite for Microcontrollers are at their early stages. μTensor is great for rapid prototyping. All the experimental and conceptual features are being matured in μTensor. We will share the same kernels, file-formats and optimisation routines with TensorFlow Lite for Microcontrollers. Over time, we will be merging the most useful features into TensorFlow Lite for Microcontroller. In the meantime, we are establishing the collaboration between the teams, and will transition to work on the same codebase.”

Today’s deal was presumably brokered at the TinyML Summit, held a couple of weeks after the TensorFlow Dev Summit at the Googleplex, and it sounds very much like μTensor will serve as an experimental trial platform for features that will be merged back into TensorFlow Lite once they’re stable.

Zach Shelby, Vice President of Developers at Arm, had this to say, “…our mission is to make TinyML available to lots of developers. We have already started contributing Mbed integration and support to TensorFlow Lite for Microcontrollers to maximize the range of hardware targets and dev usability. I believe giving the 350k+ Mbed embedded developer community access to TensorFlow will enable incredible applications. As soon as possible our goal is to work on a single TensorFlow Lite for Microcontrollers code base, focusing on integrating dev friendly APIs and tools, along with flash, RAM and operator efficiency.”

However this is going to work going forward, it does sound very much like there will now only be one major inferencing engine for micro-controllers, rather than two. Pooling effort like this, and minimising code and feature duplication, can only be a good thing for all of us working in the space.