aiWare™

Hardware IP for Automotive AI

CNN Acceleration for Automotive AI

The aiWare NPU (Neural Processing Unit) is developed by engineers working side-by-side with our automated driving teams, creating a unique solution targeting high-performance, automotive-grade, real-time AI inference for L2-L4 AD/ADAS. Delivering up to 256 TOPS and industry-leading efficiency of up to 98% for a wide range of CNN workloads, aiWare's exceptionally high efficiency translates into the lowest possible power consumption. A highly autonomous engine placing minimal demands on the host CPU, aiWare can either be integrated within an SoC or serve as the primary computation unit of a scalable companion NPU accelerator chip. Core performance, on-chip memory, floorplan, external memory bandwidth and other parameters are all configurable to optimize performance for a wide range of customer requirements. The latest aiWare4 IP core is a 4th-generation design, the result of more than 75 engineer-years of development in close collaboration with our aiDrive software and NN algorithm R&D teams.

Industry-leading efficiency

Up to 98% efficiency for a wide range of automotive CNNs

Scalable solution

256+ TOPS per chip using standard libraries and memories

Safety-first

Hardware and software designed from the ground-up for ASIL-B compliance as a SEooC, complemented by comprehensive safety documentation

Production-proven hardware IP

Implemented in Nextchip's Apache5 automotive production SoC; this strong partnership led to Nextchip licensing aiWare4 for its next-generation Apache6 SoC

Highly deterministic solution

Hardware determinism eases certification for demanding production safety environments; it also enables offline performance estimation accurate to within 5% of final silicon, months before chips are available

Easy integration

Designed for ease of layout, simple software interfacing, and advanced tools for system integration and validation

Excellent performance with low power consumption? Yes, we did it!

Watch the overview of aiWare™ features

Efficient & Scalable

256+ TOPS per chip; up to 98% efficiency

High efficiency and a patented dataflow deliver 2x to 5x higher performance than NPUs with the same claimed TOPS, across a wide range of workloads, minimizing system power consumption. Up to 2GHz operation in 5nm across the full AEC-Q100 Grade 2 temperature range.

Tools & Solutions

Unique SDK from system prototyping to production optimization

aiWare Studio tools are widely acclaimed by OEMs and Tier1s, thanks to a unique approach focused on optimization of NNs by AI engineers. No need for low-level software engineers hand-optimizing NPU code – that’s all done by our advanced offline compiler. Thanks to aiDrive, we can supply fully optimized AI applications too.

Automotive Grade

Powering the most demanding L2-L4 AD applications

Designed from the ground up to be integrated into ASIL-B and higher certified solutions covering a broad range of automotive use cases. From ultra-small edge sensor enhancement up to high-performance central processors, aiWare is the ideal solution for L2 to L4 automotive applications with the most demanding performance requirements.

Driving your development

Features & benefits

Our products are designed to accelerate the realization of your automated driving goals. Click the button below to download our latest aiWare benchmark document and scroll down to see what we can bring to the table. 

Download the benchmark document

High Performance, Efficiency and Scalability

The aiWare hardware IP Core delivers high performance at low power, thanks to its class-leading efficiency and highly optimized hardware architecture. With up to 64 effective TOPS per core (not just claimed TOPS), aiWare is highly scalable, so it can be used in applications requiring anything from 1-2 TOPS to 100+ TOPS using multiple cores. Thanks to a design focused on optimizing every data transfer in every clock cycle, aiWare makes full use of highly distributed on-chip local memory plus dense on-chip RAM to keep data on-chip as much as possible, minimizing or eliminating external memory traffic. The innovative dataflow design minimizes external memory bandwidth while maintaining high efficiency regardless of input data size or NN depth.

Unique SDK plus optimized
aiDrive-based AI

The aiWare Studio SDK has been recognized and acclaimed by OEMs and Tier1s around the world for its innovative approach to embedded NN optimization, focusing on iterating the NN itself, not just the NPU code used to execute it. This gives NN designers far more flexibility when implementing their NNs for production hardware platforms than other NPU solutions. The offline performance estimator within aiWare Studio is another highlight, delivering accurate performance estimation within 5% of final silicon on any desktop PC. And thanks to the comprehensive portfolio of aiDrive modular software for automotive AI, a wide range of software solutions fully optimized for aiWare are also available.

Powering the most demanding L2-L4 automated driving applications

The aiWare hardware IP Core delivers the features needed for production automotive deployment, not just claiming high theoretical benchmark results in the lab. The underlying technology for aiWare has been engineered from the ground up for automotive production deployment. With all design processes fully audited to automotive standards, aiWare is designed to be integrated into ASIL-B and higher certified solutions through comprehensive documentation and safety analysis. The RTL is fully characterized for operation to AEC-Q100 Grade 2 operating conditions and is complemented by comprehensive integration documentation to help you achieve optimal PPA layout quickly and easily.

Want to know more about the technical details?

Request additional NN workload files

Efficiency is the best way to assess any NPU's ability – download the additional benchmark data to see how aiWare4 achieves industry-leading efficiencies of up to 98% over a wide range of automotive NN workloads.

Request additional benchmark data

Interested in aiWare's hardware innovations?

Don't hesitate
to contact us!

Our team is always ready to work with exciting and ambitious clients. If you're ready to start your partnership with us, get in touch.

Contact Us

Interested in more of aiWare's capabilities?

aiWare™ specification

aiWare is a state-of-the-art NPU for automotive inference, with many features built-in to maximize performance for a wide range of automotive AI applications. 

Scalable
performance

The aiWare4 RTL is fully synthesizable, delivering up to 256 TOPS @ 2GHz using standard libraries and memory compilers. The RTL is designed for easy integration: no large buses, and tiled computation modules that enable implementation at high clock speeds.

Up to 98%
efficiency

Since aiWare was conceived to power multi-camera, heterogeneous sensor-based L2-L4 automated driving applications, it delivers optimal performance with large inputs such as 2M-8M pixel cameras, with no upper limit. NN depth is also unconstrained, while efficiency remains as high as 98% on real workloads.

Wide range of built-in functions

One design goal for aiWare was to keep host CPU overhead to the minimum possible: point aiWare to the start location of the input data, tell it which NN workload to execute, and it generates an interrupt when done. That's it! In addition to highly optimized 3D and 2D convolution, aiWare natively supports a wide range of activation and pooling functions, enabling many CNN workloads to be executed 100% within the aiWare NPU with no host CPU intervention.