Founder Insights: Building the Future of Edge AI Microcontrollers

Reza Kazerounian, Co-Founder and President at Alif Semiconductor, shares how Alif is advancing edge AI with ultra-low power microcontrollers, Arm-based NPUs, and scalable MCU architectures for embedded AI and generative AI at the edge.

How is Alif approaching edge AI microcontroller design differently?

Edge AI is a broad term, so I think of it as a spectrum. On the low end you have wearables like fitness bands; on the high end you’re into smart routers or automotive systems. The connectivity profile at each end is different, and applications are highly fragmented across that spectrum. About six years ago we decided to focus on the lower half first—up through things like smart glasses—because that’s where the constraints are toughest. You’re dealing with very small physical size, yet you still need meaningful compute and AI acceleration. In that lower half we view it as roughly 0.25 TOPS at the bottom up to about 5 TOPS in the middle, all battery-operated. Then there’s memory size because the models themselves aren’t small, plus tiny batteries—think of the battery in the arm of smart glasses—and security for remote, distributed devices. Those are the five hardware pillars: size, compute/NPU, memory, power, and security. Software expectations are ahead of hardware—people want LLMs on glasses, pruned SLMs help but are still a challenge—so the companies that master these constraints first will win. That’s what we’re building for.

How do edge AI constraints shape your MCU and NPU architecture?

Our north star is the lowest energy per unit of work. That’s why we started on GlobalFoundries 22FDX FD-SOI, which offers great low leakage and solid dynamic current. From the beginning we divided the SoC into regions: a high-efficiency “always-on” domain that can classify or recognize events, and a high-performance domain that wakes to do heavy work, then goes back to sleep. Even within these regions we subdivide further, with many individual power domains on the die that automatically turn on only what’s needed, when it’s needed. We built a transactional interconnect so we can “plug in” processor cores, NPUs, peripherals, and memory blocks efficiently. That gives us flexibility to decide how many and what types of cores live in each domain. On the real-time side we chose Cortex-M55 with an Ethos-U55 NPU; on the high-performance side we add another M55+NPU pair and, depending on the part number, one or two Cortex-A class cores to run high-level operating systems. That lets a customer, for example, handle graphics or networking on the A-cores while the real-time M-cores plus NPUs handle compute, control, and ML inferencing. It’s a way to serve both RTOS and Linux customers across one scalable family of over 40 part numbers so far. I’ve just described our first-generation products, and we’re now rolling out our second generation with significant enhancements, including the integration of Arm’s latest ML accelerator, Ethos-U85, enabling on-chip generative AI without requiring a cloud connection. The bottom line is that we took a scalable platform approach because we started with a clean sheet of paper, and we still build on that today.
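To make the “always-on / wake the high-performance domain” idea concrete, here is a minimal back-of-the-envelope sketch of how duty cycling affects average power and battery life. It is an editorial illustration only, not Alif firmware or measured data: the `Domain` class, the function names, and every number in the usage note are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """A power domain with hypothetical active and sleep power figures."""
    name: str
    active_mw: float  # power drawn while the domain is running
    sleep_mw: float   # residual power while the domain is gated off


def average_power_mw(always_on: Domain, high_perf: Domain,
                     wake_fraction: float) -> float:
    """Average power when the always-on domain runs continuously and the
    high-performance domain is awake for `wake_fraction` of the time."""
    if not 0.0 <= wake_fraction <= 1.0:
        raise ValueError("wake_fraction must be between 0 and 1")
    high_perf_avg = (wake_fraction * high_perf.active_mw
                     + (1.0 - wake_fraction) * high_perf.sleep_mw)
    return always_on.active_mw + high_perf_avg


def battery_life_hours(capacity_mwh: float, avg_power_mw: float) -> float:
    """Idealized runtime: battery energy divided by average power."""
    if avg_power_mw <= 0.0:
        raise ValueError("avg_power_mw must be positive")
    return capacity_mwh / avg_power_mw


if __name__ == "__main__":
    # All figures below are made up purely for illustration.
    sensing = Domain("always-on", active_mw=1.0, sleep_mw=0.0)
    heavy = Domain("high-performance", active_mw=50.0, sleep_mw=0.0)
    for fraction in (1.0, 0.1, 0.01):
        avg = average_power_mw(sensing, heavy, fraction)
        hours = battery_life_hours(300.0, avg)
        print(f"awake {fraction:>5.0%}: {avg:6.2f} mW, {hours:7.1f} h")
```

With these assumed figures, the always-on domain sets the floor on average power, and shrinking the fraction of time the high-performance domain is awake moves the total toward that floor. That is the sense in which waking heavy compute only on demand lowers energy per unit of useful work.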

Why standardize on Arm CPUs and NPUs for embedded AI?

Think of how 8051s/6800s/68000/AVRs and many other CPU cores gave way to Cortex-M/A for good reasons: developers can easily scale processor capability across their entire end-product range, and they leverage a broad and mature ecosystem. We want that on the NPU side too. If your bare-metal neural accelerator is proprietary, the tool stack above it gets proprietary as well. Then a customer who scales from a low-end design to something higher often must start over with different tools. We believe standardization wins over “I can do a few more MACs per joule but you must adopt my proprietary stack.” So, we license Arm Ethos for the NPU and innovate in the system around it—power management, data movement and storage, connectivity, security—so customers get portability without niche tools.

You mentioned being early with Arm’s neural acceleration blocks. Where are you now on that roadmap?

We were first to market years ago in pairing Cortex-M55 CPUs with Ethos-U55 NPUs in our product platform. We’ve since announced our latest generation of products using the new Arm Ethos-U85 NPU, which is capable of executing generative AI workloads at the edge. We’re first in the MCU/MPU market with this GenAI combination, and by enabling the use of transformer models it will change how developers think about what they can achieve. We also pioneered the partitioned “always-on / wake the high-performance domain” pattern with AI acceleration; we now see competitors entering this space and attempting to do something similar. The market’s gotten tight, but our goal is to stay ahead each generation. Looking forward, we’ll continue to innovate and advance the products with on-die non-volatile memory, and just behind that we will be pushing above the midpoint of the edge-AI spectrum: think GHz+ microprocessors with external DDR and NPUs rated at considerably higher TOPS levels. Keep in mind that lowest energy per unit of work will dictate everything that we do.

How do customers adopt and deploy Alif’s edge AI platform? How does Alif support the customer’s design cycle?

We’re a young company, but many of us came from top competitors like ST, Atmel, Freescale/NXP, and Renesas, so we know the space and the customers. We run a hybrid model: direct OEM engagements with hyperscalers and household names, plus global distribution through Arrow, EDOM, Mouser, and DigiKey. We back that with FAEs and offices in the USA, Finland, Taiwan, and India so we can share support coverage across time zones. There’s a support ticketing system, multiple dev kits with reference designs, GitHub repositories with sample code and drivers, and thousands of pages of datasheets and hardware reference manuals. In other words, it’s the standard big-company enablement, but delivered fast.

How are you thinking about manufacturing and supply chain strategy, including your use of 22FDX at GlobalFoundries and any potential node transitions or capacity considerations?

We don’t see capacity issues. That node is processed in Europe, and there’s a USA fab capable of it as well. Over time we’ll migrate to a smaller node primarily to keep driving dynamic power down as devices and their batteries shrink, not because of capacity. Early on we also chose not to rely on China or Taiwan capacity in our supply chain, so we’re outside that area.

How does Alif maintain an edge over incumbent solutions?

Large incumbents move slower and must carry legacy families; they can’t just start from a clean sheet and rethink the architecture. You end up jumping device families and losing scalability and compatibility. We tend to win on pace of innovation, modern tech in market, and customer satisfaction—people tell us they get personal support quickly rather than “we’ll get back to you.” That agility matters.

How does Alif compete with low-cost solutions coming out of Asia?

Many of those solutions are fine for a single application but don’t scale across a customer’s portfolio. We also hear consistent complaints about documentation, software, and support. We may not hit the rock-bottom price in every slot, but overall we win on lower risk, stronger docs, real support, and a scalable platform. That’s how we protect margin without racing to the bottom. It’s not easy, but that’s the formula.

What is Alif Semiconductor’s long-term vision for edge AI?

It’s a bold proposition to go against titans in the market. From day one the vision was clear: we know how this market works, and we set out to build a better, more scalable platform—the “ultimate” microcontroller/microprocessor foundation for the edge. That’s the mission and the execution path.
