The Essence of Edge AI

Many applications today need edge computing. Privacy concerns, the need for real-time responses, and unreliable network connectivity often compromise applications that depend on a centralized cloud. Network scalability supports the same case for edge computing: the cost and latency of network backhaul can be problematic, especially given the massive data volumes of our data-driven world.

Processing requirements in the data-driven world are quickly outpacing the capabilities of existing and projected systems, for both raw sensor data and generated analytics. Not only is the volume of data increasing exponentially, but the algorithmic complexity needed to analyze that data at the level applications require is also rising dramatically. Low-SWaP (size, weight, and power) budgets are already a recognized requirement for edge AI applications such as drones, IoT devices, and wearables.

In the previous edition of this blog series, we discussed why a centralized computing system (the cloud) is not scalable, especially when network throughput is severely limited or when the user application requires a near real-time response. The best approach is to leverage computing power that is local in terms of latency, available throughput, or similar measures. Simply stated, the overhead of moving data to centralized compute is best minimized by processing the data locally. The energy and time otherwise spent on data transport can go to AI processing itself. This is the essence of edge AI.

The Future of Distributed Machine Learning: Federated AI Learning

Today, edge AI solutions are distributed. Typical solutions consist of a processor integrated with the sensor(s). From one perspective, the centralized computing is scattered spatially in a way that relaxes overall network storage and peak bandwidth requirements. This approach allows IoT devices to operate in environments where network connectivity is highly variable or even degraded.

The level of decentralization does not need to stop there. For example, using a new dispersed computing paradigm, we can opportunistically move AI models to data, rather than data to AI models. An AI model is many orders of magnitude smaller than the sensor data (e.g., kilobytes or megabytes for the AI model, compared to petabytes of sensor data), so latency and network bandwidth can be radically reduced for edge AI. In this dispersed AI paradigm, AI inference can happen on the sensor platform or on the network infrastructure. In the near future, this paradigm can support on-chip training for one-shot learning and federated AI learning.
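To make the size gap concrete, here is a rough sketch of the ideal transfer time for a model versus the sensor data it would analyze. The 10 Mbit/s uplink, the 1 MB model, and the 1 PB data set are illustrative assumptions chosen to match the magnitudes mentioned above, and protocol overhead is ignored.

```python
# Illustrative only: link speed and exact sizes are assumptions, not measurements.

MB = 10**6   # bytes in a megabyte (decimal)
PB = 10**15  # bytes in a petabyte (decimal)
LINK_BITS_PER_SECOND = 10 * 10**6  # assumed 10 Mbit/s uplink


def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to send size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


model_time = transfer_seconds(1 * MB, LINK_BITS_PER_SECOND)  # move the model to the data
data_time = transfer_seconds(1 * PB, LINK_BITS_PER_SECOND)   # move the data to the model

print(f"model to data: {model_time:.1f} seconds")
print(f"data to model: {data_time / (86400 * 365):.1f} years")
```

Under these assumptions the model moves in under a second, while the data would occupy the link for decades. The ratio between the two is simply the ratio of the sizes, so it holds at any link speed.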

Our research has shown that advances in network architecture search, parameter quantization, and pruning can shrink neural network models to below 100 kB. To put that into perspective, the average smartphone image is about 3 MB, assuming an iPhone 6s Plus with a 12-megapixel camera, which is as many bytes as 30 such models. That means, in the time you can upload your selfie to Instagram, our edge AI system can resolve 30 different AI models that are best for your particular environment.
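The figure of 30 follows directly from the two sizes quoted above. A minimal check, assuming decimal units (1 kB = 1,000 bytes) and treating the 100 kB bound as the model size:

```python
MODEL_BYTES = 100 * 1000       # "below 100 kB" bound from the text
IMAGE_BYTES = 3 * 1000 * 1000  # "about 3 MB" smartphone image from the text


def models_per_image(image_bytes: int, model_bytes: int) -> int:
    """How many whole models fit in the byte budget of one image upload."""
    return image_bytes // model_bytes


print(models_per_image(IMAGE_BYTES, MODEL_BYTES))  # 30
```

Because the models are below 100 kB, 30 is a lower bound on how many fit in the same transfer.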

Latent AI supports edge AI solutions that scale by sharply reducing latency and network bandwidth. With Adaptive AI™ technologies, we bring newfound capabilities to compress AI models for heterogeneous sensor and network platforms. Recent results from LEIP Compress show we can quantize deep neural nets down to 10% of their original size without loss of accuracy. We’re continuing to expand the LEIP framework for AI developers to deploy their products and will have some new announcements in early 2020.
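For readers unfamiliar with parameter quantization, the sketch below shows the general idea using plain uniform quantization. It is not the LEIP Compress method, which this post does not describe; the function names and sample weights are hypothetical. Note that 8-bit codes alone bring 32-bit weights to 25% of their size, so reaching 10% would take roughly 3 bits per weight on average, or quantization combined with pruning.

```python
# Generic uniform quantization sketch; NOT the LEIP Compress algorithm.

def quantize(weights, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] with a shared scale and offset."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from integer codes."""
    return [c * scale + lo for c in codes]


weights = [-0.82, -0.10, 0.0, 0.33, 0.91]  # hypothetical float32 weights
codes, scale, lo = quantize(weights, bits=8)
restored = dequantize(codes, scale, lo)

# Rounding to the nearest level bounds the per-weight error by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
size_ratio = 8 / 32  # 8-bit codes versus 32-bit floats

print(codes)
print(f"max error {max_error:.5f}, step {scale:.5f}, size ratio {size_ratio:.2f}")
```

Whether a network keeps its accuracy after this kind of rounding depends on the model and the bit width, which is why quantization is typically validated against a held-out data set.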

For more information, please see Scaling Edge AI in a Data Driven World, Part 1