Benchmarking Edge Computing

Comparing Google, Intel, and NVIDIA accelerator hardware

Alasdair Allan


Over the last year, custom silicon intended to speed up machine learning inferencing on the edge has started to appear. No cloud needed. First to market was Intel with their Movidius-based hardware. However, over the last couple of months we’ve seen the arrival of both Google, with their EdgeTPU-based hardware called Coral, and NVIDIA, with their GPU-based offering, the Jetson Nano.

An edge computing hardware zoo. Here we have the Intel Neural Compute Stick 2 (left, top), a Movidius Neural Compute Stick (left, bottom), the NVIDIA Jetson Nano (middle, top), a Raspberry Pi 3, Model B+ (middle, bottom), a Coral USB Accelerator (right, top), and finally the Coral Dev Board (right, bottom).

The arrival of new hardware designed to run machine learning models at vastly increased speeds, inside a relatively low power envelope and without needing a connection to the cloud, makes edge-based computing a much more attractive proposition. That is especially true as, alongside this new hardware, we’ve seen the release of TensorFlow 2.0, as well as TensorFlow Lite for microcontrollers and new ultra-low-powered hardware like the SparkFun Edge.

The ecosystem around edge computing is starting to feel far more mature, which means that the biggest growth area in machine learning practice over the next year or two could well be around inferencing, rather than training.

The only question now is which of the new acceleration platforms can run inference fastest. Time to run some benchmarks and find out.

Headline results from benchmarking

We’re going to go ahead and compare inferencing on the following platforms: the Coral Dev Board, the NVIDIA Jetson Nano, the Coral USB Accelerator with a Raspberry Pi, the original Movidius Neural Compute Stick with a Raspberry Pi, and the second generation Intel Neural Compute Stick 2, again with a Raspberry Pi. Finally, just to add a yardstick, we’ll also run the same models on my Apple MacBook Pro (2016), which has a quad-core 2.9 GHz Intel Core i7, and on a vanilla Raspberry Pi 3, Model B+ without any acceleration.

Inferencing speeds in milliseconds for MobileNet SSD V1 (orange) and MobileNet SSD V2 (red) across all tested platforms. Low numbers are good!

This initial benchmark run was with the MobileNet v2 SSD and MobileNet v1 SSD models, both trained on the Common Objects in Context (COCO) dataset. A single 3888×2916 pixel test image was used which contained two…