AI offers the biggest opportunity for many endpoint or edge IoT device manufacturers to add value to their products and achieve meaningful differentiation. But because of edge AI devices’ limited resources – power, processor bandwidth and size – manufacturers are continually bumping up against the ceiling of what they can do in AI with a microcontroller.
This is not to say that there is no scope to implement AI on edge MCUs today: MCUs that are optimized for AI functionality, such as the Ensemble and Balletto family products from Alif Semiconductor, are enabling valuable breakthroughs in the deployment of voice, image, and motion analysis, particularly using convolutional neural networks (CNNs).
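To illustrate what such a CNN deployment involves, the sketch below shows roughly how a small quantized keyword-spotting CNN might be run on a Cortex-M-class MCU with TensorFlow Lite for Microcontrollers, a runtime commonly used for this kind of workload. The model array, arena size, and operator list are illustrative assumptions rather than details of any Alif product, and in real firmware the interpreter would be created once at boot rather than on every inference.

```cpp
#include <cstdint>
#include <cstring>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Hypothetical quantized keyword-spotting CNN, exported as a C array at build time.
extern const unsigned char g_keyword_model_data[];

namespace {
constexpr int kTensorArenaSize = 64 * 1024;          // working memory for activations
alignas(16) uint8_t tensor_arena[kTensorArenaSize];  // statically allocated, no heap
}  // namespace

// Runs one inference over a window of int8 audio features and writes per-class scores.
// Returns 0 on success, -1 on failure.
int ClassifyAudioWindow(const int8_t* features, size_t feature_bytes,
                        int8_t* scores, size_t num_classes) {
  const tflite::Model* model = tflite::GetModel(g_keyword_model_data);

  // Register only the operators this CNN actually uses, to keep code size down.
  tflite::MicroMutableOpResolver<5> resolver;
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();
  resolver.AddReshape();

  tflite::MicroInterpreter interpreter(model, resolver, tensor_arena, kTensorArenaSize);
  if (interpreter.AllocateTensors() != kTfLiteOk) return -1;

  // Copy the feature window into the model's input tensor and run the network.
  TfLiteTensor* input = interpreter.input(0);
  if (feature_bytes != input->bytes) return -1;
  std::memcpy(input->data.int8, features, feature_bytes);
  if (interpreter.Invoke() != kTfLiteOk) return -1;

  // Read back the per-keyword scores.
  const TfLiteTensor* output = interpreter.output(0);
  if (num_classes > output->bytes) return -1;
  std::memcpy(scores, output->data.int8, num_classes);
  return 0;
}
```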
But looking to the future, there is more that edge AI can do beyond the capabilities of a CNN. Generative AI and new transformer-based Small Language Models (SLMs), as well as emerging models including Graph Neural Networks (GNNs) and Spiking Neural Networks (SNNs), offer the opportunity to create exciting new features. In agriculture, for instance, a smart digital assistant could capture a picture of a diseased leaf on a crop plant, diagnose the disease on the device, and instantly provide recommendations for treatment. In the factory, a smart camera could inspect assembled products for defects such as missing or damaged parts.
But the deployment of generative AI via SLMs, GNNs, or SNNs is currently beyond the reach of even today’s AI-oriented MCUs.
To take full advantage of the possibilities of generative AI, future MCUs will have a radically different set of specifications from the legacy MCUs in broad use today. The first and absolutely essential change is to integrate specialized neural networking compute capability in the form of a neural processing unit (NPU). Alif has been the pioneer here: the first generation of Ensemble and Balletto MCUs features one or more Arm® Ethos®-U55 NPUs. And there is more to come: in the future, embedded devices will need to offer performance of more than 1 TOPS while consuming so little power that they can be deployed in products such as earbuds with very small batteries, while still delivering day-long use between charges. The second generation of Ensemble MCUs will lead the industry in this direction, as they will be based on the uprated Ethos-U85 NPU, which supports transformer-based models.
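To give a feel for how NPU offload typically looks from the developer’s side, here is a minimal sketch of the usual Ethos-U workflow; the file names, fallback operators, and compilation target are assumptions about a generic Arm toolchain flow (and assume a TensorFlow Lite for Microcontrollers build that includes the Ethos-U kernel and NPU driver), not Alif-specific documentation. The quantized model is first compiled offline with Arm’s Vela compiler, which fuses the layers the NPU can accelerate; at runtime those fused subgraphs appear as a single custom “ethos-u” operator that the op resolver hands to the NPU driver, with any remaining layers falling back to CPU kernels.

```cpp
// Offline step (placeholder file name and target): compile the quantized model so
// that NPU-capable layers are fused into Ethos-U command streams, e.g.
//   vela model_int8.tflite --accelerator-config ethos-u85-256
//
// Runtime step: register the custom Ethos-U operator so those fused subgraphs are
// dispatched to the NPU, plus CPU kernels for any layers Vela left behind.
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"

static tflite::MicroMutableOpResolver<3> npu_resolver;

void RegisterNpuOps() {
  npu_resolver.AddEthosU();    // Vela-fused subgraphs -> Ethos-U NPU driver
  npu_resolver.AddSoftmax();   // example CPU fallbacks for layers left on the CPU
  npu_resolver.AddReshape();
}
```

In principle, because the heavy compute moves into the Vela-generated command streams, the application code stays much the same whether the layers run on an Ethos-U55 today or an Ethos-U85 in the next generation; mainly the offline compilation target changes.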
Second, the new edge AI systems running models such as SLMs will have much bigger memory requirements, both on- and off-chip: even a compact SLM with a few hundred million parameters needs tens to hundreds of megabytes of weight storage once quantized, far more than any MCU integrates on-chip. This means that MCU manufacturers will need to introduce faster and lower-power interfaces between the MCU and external memories. The architecture of tomorrow’s MCU will also need to evolve, with on-chip data pipelines fast enough to keep pace with these bigger and faster memories.
Last, the new MCUs will need to integrate more of the functionality required in edge AI systems, such as interfaces for voice, vision, and motion sensing, so that a complete system design can fit within an edge node’s small footprint.
A new future: compute embedded in the fabric of people’s everyday lives
As this move towards higher-performance AI at the edge gathers speed, a tectonic shift in demand for compute capability could take place: away from CPU- and GPU-powered hubs – the smartphone and the PC – and towards a world in which most people’s interaction with the digital domain is via an MCU in an edge device.
In this new, more distributed compute environment, AI-driven functions will be implemented seamlessly and with low latency at the edge, in devices such as smart glasses, or earbuds with enhanced audio. They will provide a more natural user experience in which technology is embedded in the fabric of the user’s life, rather than constantly drawing their attention to their smartphone or PC – a world in which embedded devices take center stage.
The potential impact is profound: as more tasks are delegated to smart devices that are embedded in the environment and can communicate with each other – whether in health monitoring, energy optimization, security, or other applications – our reliance on smartphone apps and PC-based internet services will diminish. This diffused, decentralized intelligence challenges the monopoly that centralized devices and services currently hold, allowing technology to become more personalized and less intrusive.
And at the center of this world will be a new generation of MCUs, exemplified by the Ensemble and Balletto products from Alif, based on architectures optimized for the new technologies of AI and machine learning.