aiWare™ specifications

Hardware IP for Automotive AI


Key features

Performance: 1-24 TOPS per core @ 1.5 GHz
MAC/Cycle (8x8): 512-4096 INT8 (32-bit internal)
Utilization: Up to 98% for NNs such as VGG16 or YOLO; >85% for most automotive CNNs
NN Convolution: Up to 100% efficiency achievable. MAC arrays optimized for 2D and 3D convolution and deconvolution. Matrix multipliers not used, so there is no need for Winograd or other transforms
Support functions: Wide range of Activation, Pooling, Unary, Binary, Tensor Introduction/Shaping and Linear operations to ensure 100% CNN execution within the aiWare core with no host CPU intervention
Configurability: Wide range of workload optimizations possible without compromising flexibility
NN capability: Any INT8 CNN (no depth limit)
Data types: INT8 native; 32-bit internal precision and dynamic per-layer scaling
Sparsity: Not needed for well-optimized NNs
Multicore capability: No limit
Safety: Designed to be part of an ASIL-B and higher certified subsystem; full safety manual
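
The "Data types" entry above (INT8 with 32-bit internal precision and per-layer scaling) can be illustrated with a minimal sketch of generic INT8 arithmetic. This is not the aiWare implementation, whose scaling scheme is not described in this document. The function names and the fixed-point scale (`multiplier` / 2^`shift`) are assumptions chosen for illustration only.

```python
def saturate_int8(x: int) -> int:
    """Clamp an integer to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, x))


def int8_dot(activations, weights) -> int:
    """Multiply-accumulate INT8 values into a wide accumulator.

    Each 8x8 product needs up to 16 bits, and summing many of them
    needs more, which is why a 32-bit internal accumulator is typical.
    """
    acc = 0
    for a, w in zip(activations, weights):
        assert -128 <= a <= 127 and -128 <= w <= 127
        acc += a * w
    assert -(2**31) <= acc < 2**31  # result would fit in 32 bits
    return acc


def requantize(acc: int, multiplier: int, shift: int) -> int:
    """Scale the accumulator back to INT8 using a hypothetical per-layer
    fixed-point scale (multiplier / 2**shift), rounding to nearest.
    Requires shift >= 1."""
    scaled = (acc * multiplier + (1 << (shift - 1))) >> shift
    return saturate_int8(scaled)


# Example values (illustrative only)
acts = [127, -128, 64, 32]
wts = [127, 127, -64, 16]
acc = int8_dot(acts, wts)                       # wide intermediate result
out = requantize(acc, multiplier=3, shift=10)   # back to INT8 for the next layer
print(acc, out)
```

In a scheme like this, each layer would carry its own `multiplier` and `shift`, so the output range of every layer can be mapped onto the full INT8 range independently.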


Embedded SRAM: 10-200 Mbits per core
External Memory: Dedicated off-chip DRAM or shared SoC memory
Bandwidth reduction: On-chip compression
Main interface: AXI4 to LPDDR; AXI4 to host
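
The compute and memory figures listed above can be turned into rough sizing numbers. The sketch below is back-of-envelope arithmetic only. It assumes the common convention of two operations (one multiply, one add) per MAC, which this document does not state, so the results need not match the headline TOPS range. It also assumes 1 Mbit = 10^6 bits. All function names are hypothetical.

```python
def peak_tops(macs_per_cycle: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Peak throughput in tera-operations per second.

    ops_per_mac=2 is an assumed counting convention, not a datasheet figure.
    """
    return macs_per_cycle * ops_per_mac * clock_hz / 1e12


def effective_tops(peak: float, utilization: float) -> float:
    """Sustained throughput at a given MAC utilization (0.0 to 1.0)."""
    return peak * utilization


def sram_megabytes(mbits: float) -> float:
    """Convert on-chip SRAM size from Mbits to MB (assuming 1 Mbit = 10**6 bits)."""
    return mbits / 8


CLOCK_HZ = 1.5e9  # clock frequency quoted in the Performance entry

for macs in (512, 4096):
    peak = peak_tops(macs, CLOCK_HZ)
    print(f"{macs} MAC/cycle: peak {peak:.3f} TOPS, "
          f"{effective_tops(peak, 0.85):.3f} TOPS at 85% utilization")

for mbits in (10, 200):
    print(f"{mbits} Mbits SRAM = {sram_megabytes(mbits):.2f} MB "
          f"(about {sram_megabytes(mbits):.2f} million INT8 values)")
```

Because one INT8 value occupies one byte, the SRAM figure in MB also gives a rough upper bound on how many weights or activations can be held on-chip at once, before any compression.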

Neural Network Development Frameworks

Frameworks supported: Caffe/Caffe2, TensorFlow, PyTorch, ONNX, Khronos NNEF
Inference deployment: Binary compiled offline using aiWare Studio or command-line tools; a single binary contains one or multiple NNs, weights and all scheduling info
Software runtime: Minimal host CPU management required during execution. Simple, generic, portable runtime API runs on any RTOS or HLOS; wrappers to popular APIs available on request
Development Tools: aiWare Studio provides comprehensive tools to import, analyse and optimize any NN with an easy-to-use interactive UI
Evaluation Tools: aiWare Studio features an offline performance estimator accurate to within 5% of final silicon; FPGA implementations also available

Target applications

Automotive: Inference for automated driving
High-performance automotive multi-camera perception
Large camera NN processing (no upper limit on input resolution)
High data rate heterogeneous multi-sensor fusion
