
Alif Semiconductor Expands Embedded AI with ExecuTorch 

Collaborations in technology often mean more efficiency and more power for developers. Alif Semiconductor’s latest partnership with the PyTorch Foundation is no exception: it brings the ExecuTorch framework to the Ensemble® and Balletto™ families of microcontrollers (MCUs). Through this collaboration, Alif is enhancing its AI-enabled MCUs by giving developers a powerful tool for deploying advanced machine learning models directly on endpoint devices with greater efficiency.

Read on to learn how combining the strengths of ExecuTorch and Alif’s Ensemble family helps reshape the possibilities of AI at the edge, and read the official press release here.

Alif’s Ensemble Family 

Since the release of Ensemble, Alif has seen a huge level of interest in moving machine learning closer to the source of the data. As Alif’s president and co-founder, Reza Kazerounian, put it, “ExecuTorch will significantly broaden the use cases that can be realized by this transition, and Alif is very excited to collaborate with PyTorch on bringing this to microcontroller devices.”

Alif’s MCUs are built with a heterogeneous computing architecture that cohesively integrates multiple processing cores and dedicated hardware accelerators. The Ensemble and Balletto families include the Arm® Cortex®-M55 core, which manages general processing tasks, and the Ethos™-U55 neural processing unit (NPU), which takes over machine learning computations. These NPUs are specifically designed for AI workloads and handle inference tasks at extremely low power consumption, allowing for real-time data processing directly on endpoint devices.

The Ensemble and Balletto families are Arm-based microcontrollers that scale from single-core to a new class of multi-core devices.

A differentiating aspect of Alif’s architecture is the interplay between its AI accelerator and general-purpose cores. The architecture supports heterogeneous multiprocessing, which allows different cores to simultaneously execute tasks based on their specific capabilities. For instance, the Cortex-M55 core can handle less intensive tasks like sensor management, while the AI accelerators focus on inference and computation-heavy operations. Such an architecture provides flexibility and enables developers to optimize workloads for performance and power efficiency.
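The division of labor described above can be pictured with a small sketch. It is only an illustration of the idea, not Alif’s scheduler or SDK: the unit labels, the `Task` type, and the task names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the two kinds of compute units described above.
CPU = "cortex-m55"
NPU = "ethos-u-npu"


@dataclass
class Task:
    name: str
    kind: str  # "control" (sensor polling, I/O) or "inference" (neural network work)


def assign_unit(task: Task) -> str:
    """Route inference work to the NPU and everything else to the CPU core."""
    return NPU if task.kind == "inference" else CPU


def partition(tasks):
    """Group task names by the unit that would run them."""
    plan = {CPU: [], NPU: []}
    for task in tasks:
        plan[assign_unit(task)].append(task.name)
    return plan


# Hypothetical workload mixing light control tasks with one inference task.
tasks = [
    Task("read_accelerometer", "control"),
    Task("keyword_spotting", "inference"),
    Task("update_display", "control"),
]
plan = partition(tasks)
print(plan)
```

On real hardware the two groups would run concurrently, which is where the performance and power benefits come from; the sketch only shows how work is split by capability.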

ExecuTorch and Seamless AI Model Deployment

The ExecuTorch framework complements Alif’s hardware by offering a streamlined path to design, optimize, train, and deploy complex AI models on endpoint devices.

Alif is partnering with Arm to bring hardware acceleration for transformer-based models to microcontroller devices. This breakthrough will allow language models and other advanced applications to run seamlessly on edge devices using ExecuTorch. Traditionally, transformers and other advanced models have been difficult to implement at the edge because of their high computational demands. ExecuTorch addresses this issue by providing optimized tools for model compression, quantization, and pruning, allowing engineers to reduce the size and complexity of models without sacrificing performance.
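To make the quantization step concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is not ExecuTorch’s quantizer; the function names and the sample weights are assumptions chosen for illustration. The point is that each 32-bit float is replaced by a small integer plus one shared scale factor, at the cost of a bounded rounding error.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the shared scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, used only to exercise the functions.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The reconstruction error per value is at most about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing `q` takes one byte per weight instead of four, which is the kind of size reduction that makes larger models fit in MCU memory.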

The three-step process of exporting a PyTorch program using ExecuTorch.
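A minimal sketch of that export flow is shown below, assuming the three steps correspond to `torch.export.export`, `to_edge`, and `to_executorch` as described in the ExecuTorch documentation, and that `torch` and `executorch` are installed. The helper name `export_to_pte` is hypothetical, and the exact API may differ between ExecuTorch releases. Targeting the Ethos-U NPU involves an additional backend delegation step that is not shown here.

```python
def export_to_pte(model, example_inputs, out_path):
    """Export a PyTorch module to an ExecuTorch program file (hypothetical helper).

    `example_inputs` is a tuple of example tensors matching the model's forward().
    Imports are deferred so that defining this function does not require
    torch or executorch to be installed.
    """
    import torch
    from executorch.exir import to_edge

    # Step 1: capture the model as a graph with torch.export.
    exported_program = torch.export.export(model, example_inputs)

    # Step 2: lower the captured graph to the Edge dialect.
    edge_program = to_edge(exported_program)

    # Step 3: convert to an ExecuTorch program and write the serialized bytes.
    executorch_program = edge_program.to_executorch()
    with open(out_path, "wb") as f:
        f.write(executorch_program.buffer)
    return out_path
```

A call might look like `export_to_pte(my_model.eval(), (torch.randn(1, 16),), "model.pte")`, where `my_model` and the input shape are placeholders for the reader’s own model.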

The framework integrates tightly with Alif’s Ensemble family by leveraging the hardware acceleration capabilities of the Ethos-U55 and Ethos-U85 NPUs. The NPU architecture includes enhanced support for matrix multiplications and dot products, both central operations in transformer models.
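To see why dot products and matrix multiplications dominate transformer workloads, consider scaled dot-product attention, the core operation of a transformer layer. The sketch below is a plain-Python illustration with made-up inputs, not code that runs on the NPU: every score is a dot product, and every output is a weighted sum of value vectors.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # One dot product per (query, key) pair, scaled by sqrt(d).
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Made-up inputs: two identical keys receive equal attention weights.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[2.0, 0.0], [4.0, 0.0]]
result = attention(queries, keys, values)
print(result)
```

The number of dot products grows with the product of query and key counts, which is why dedicated multiply-accumulate hardware matters for these models.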

By offloading these operations to the NPU, the Ensemble family and ExecuTorch achieve model inference speeds comparable to those seen in cloud-based systems but with a much smaller power footprint. For instance, ExecuTorch can reduce power consumption by up to 80% in some applications compared to cloud-based AI, while also cutting latency to under 10 milliseconds for real-time tasks.  

Power & Cost Efficiency 

Embedded systems, particularly those in industrial automation, smart cities, and medical devices, require continuous operation with minimal power consumption. Traditional cloud-based solutions often require multiple round-trip data transmissions that significantly increase power draw and latency. 

Using on-device inference reduces energy consumption and lowers cost for companies deploying AI applications. By eliminating the need for constant cloud connectivity, enterprises can minimize bandwidth and data center costs while improving system responsiveness. In power-critical applications, the Ensemble MCUs’ reduced energy demand can extend device lifespans and reduce the frequency of battery replacements, contributing to long-term operational savings.
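The battery-life argument can be worked through with a simple duty-cycle estimate. All figures below are hypothetical placeholders, not measurements of Alif devices; the sketch only shows how a shorter, lower-current active phase translates into longer battery life.

```python
def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Average current when the device is active for `duty_cycle` of the time."""
    return active_ma * duty_cycle + sleep_ma * (1.0 - duty_cycle)


def battery_life_hours(capacity_mah, avg_current_ma):
    """Idealized battery life, ignoring self-discharge and voltage effects."""
    return capacity_mah / avg_current_ma


# Hypothetical numbers: a radio-heavy design that streams data to the cloud
# versus a design that runs inference locally and transmits rarely.
cloud_streaming = average_current_ma(active_ma=20.0, sleep_ma=0.01, duty_cycle=0.10)
on_device = average_current_ma(active_ma=5.0, sleep_ma=0.01, duty_cycle=0.02)

capacity_mah = 220.0  # hypothetical coin-cell capacity
print(battery_life_hours(capacity_mah, cloud_streaming))
print(battery_life_hours(capacity_mah, on_device))
```

Plugging in measured currents and duty cycles for a real design gives a first-order estimate of how often batteries would need replacing.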

Demonstrations at Electronica and CES

Alif will showcase its Ensemble family and ExecuTorch integration at two upcoming shows: Electronica 2024 in Munich, Germany, in November, and CES 2025 in Las Vegas in January. The demonstrations will highlight real-world applications powered by the MCUs, such as wearable technology, real-time speech processing, and smart industrial sensors.

To learn more about Alif Semiconductor’s MCUs, please contact Alif Semiconductor.
