What is Inference at the Edge?

In summary, Inference at the Edge enables a data-gathering device in the field to provide actionable intelligence using Artificial Intelligence (AI) techniques. These devices use a multitude of sensors, and over time the resolution and accuracy of these sensors have vastly improved, leading to increasingly large volumes of data being captured. Historically, this data was extracted and analysed afterwards in a manual process, and more recently by using AI with a trained neural network. It is now possible to provide sufficient computing capacity to run an optimized AI model at the point of data capture, and this is referred to as ‘Inference at the Edge’. In some defense, exploration and security applications, this has the benefit of accelerating the OODA (observe, orient, decide, act) loop so that actions can be taken much more quickly. It also improves security, as the data is held on the local device.

How is it achieved?

Inference requires a pre-trained deep neural network model. The model is trained by feeding as many data points as possible into a framework, with the aim of achieving the best possible accuracy. Typical frameworks used as the basis of these deep neural network models include TensorFlow, MXNet and Caffe.
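To make the split between training and inference concrete, here is a minimal sketch in plain Python. It is not a deep neural network and does not use any of the frameworks named above: it assumes a toy single-neuron classifier, and the function names train and infer, the data and the learning rate are all illustrative choices. The point is only that training produces a fixed set of parameters, and inference is a cheap forward pass that reuses them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, lr=0.5):
    """Fit a single-neuron classifier with plain gradient descent (toy example)."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(w * x + b) - y  # prediction error for this sample
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b  # the "pre-trained model" is just these frozen parameters


def infer(model, x):
    """Inference: one forward pass with fixed parameters, no learning."""
    w, b = model
    return sigmoid(w * x + b) >= 0.5


# Hypothetical sensor readings: negative values are class 0, positive are class 1.
samples = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [0, 0, 0, 1, 1, 1]

model = train(samples, labels)   # done once, ahead of deployment
print(infer(model, 1.5))         # done repeatedly, at the point of data capture
print(infer(model, -1.5))
```

Training loops over the whole data set many times, whereas each call to infer is a single multiply, add and comparison, which is why the inference step is the part that fits on a device in the field.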

Once the model is trained, a couple of intermediate steps are required before it can be used for inference. In our case, we provide a range of Intel processor-based boards that are suitable for AI applications, and the Intel OpenVINO toolkit is an ideal tool to use with them.

One of the constituents of OpenVINO is the Model Optimizer, a tool that facilitates the transition between the training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal execution on the target device, which could be one of the Concurrent Technologies rugged server boards. The Model Optimizer produces an Intermediate Representation (IR) of the network, which can be read, loaded, and inferred with the Inference Engine, another key part of the OpenVINO toolkit.
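As an illustration of what a static adjustment to a trained model can look like, the sketch below folds a batch normalization step into the layer that precedes it, so that two operations become one at inference time. This is a generic technique used by deployment tools, shown here in plain Python under our own assumptions; it is not the Model Optimizer's code, and the function names dense, batch_norm and fold_batch_norm are hypothetical.

```python
import math


def dense(weights, bias, x):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Batch normalization at inference time, using fixed statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Return new weights and bias that absorb the batch normalization."""
    folded_w, folded_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)
        folded_w.append([w * scale for w in row])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b


# Illustrative values only.
weights = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.3]
gamma, beta = [1.5, 0.8], [0.2, -0.1]
mean, var = [0.4, -0.2], [2.0, 0.5]
x = [1.0, 3.0]

original = batch_norm(dense(weights, bias, x), gamma, beta, mean, var)
fw, fb = fold_batch_norm(weights, bias, gamma, beta, mean, var)
optimized = dense(fw, fb, x)  # one layer instead of two, same result
```

Because the statistics are fixed once training is complete, the fold can be computed ahead of time, and the deployed model does less work per input while producing the same output to within floating-point rounding.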

The Inference Engine is a library with a set of classes that run inference on input data and return a result. It uses a flexible plugin architecture, with implementations for inference on a number of Intel hardware devices in addition to processors. This design means that inference performance can be dramatically improved by adding OpenVINO-compatible hardware resources, such as Intel’s Arria FPGA accelerators.
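The idea behind a plugin architecture can be sketched in a few lines of Python. This is a conceptual illustration only and does not reproduce the Inference Engine API: the class names, the register_plugin and load_plugin helpers, and the device strings are all hypothetical. It shows how application code can stay the same while the hardware-specific implementation is selected by name.

```python
import math
from abc import ABC, abstractmethod


class InferencePlugin(ABC):
    """Common interface that every device backend implements (hypothetical)."""

    @abstractmethod
    def infer(self, weights, inputs):
        ...


class CpuPlugin(InferencePlugin):
    def infer(self, weights, inputs):
        return sum(w * x for w, x in zip(weights, inputs))


class AcceleratorPlugin(InferencePlugin):
    """Stand-in for an accelerator backend; a real one would offload the work."""

    def infer(self, weights, inputs):
        return math.fsum(w * x for w, x in zip(weights, inputs))


_PLUGINS = {}


def register_plugin(device, plugin_cls):
    _PLUGINS[device] = plugin_cls


def load_plugin(device):
    try:
        return _PLUGINS[device]()
    except KeyError:
        raise ValueError(f"no plugin registered for device {device!r}") from None


register_plugin("CPU", CpuPlugin)
register_plugin("ACCELERATOR", AcceleratorPlugin)

# The application code is identical; only the device name changes.
for device in ("CPU", "ACCELERATOR"):
    plugin = load_plugin(device)
    print(device, plugin.infer([1.0, 2.0], [3.0, 4.0]))
```

With this kind of structure, adding a new hardware resource means registering one more backend, and the calling code does not need to change.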

Where do I start?

A good place to start is the OpenVINO toolkit resource, as it contains a wealth of information including demonstrations and pre-trained models.
