The aiWare hardware IP core is highly customizable and was developed by engineers working side-by-side with our automated driving teams. It can be deployed within an SoC or as a stand-alone NN accelerator. On-chip and external memory sizes are highly configurable, allowing performance to be optimized for customer requirements. aiWare maximizes host CPU offload, using on-chip SRAM and external DRAM to keep execution and dataflow within the core. aiWare was designed for volume production in L2/L2+ and higher ADAS systems. The first version of the mature IP core was released over three years ago. Building on this expertise, the aiWare IP is more sophisticated than a leading automotive OEM's recently announced accelerator.
The aiWare IP core is fully synthesizable RTL requiring no special libraries, enabling neural network acceleration cores from 0.5 TOPS to 16 TOPS. The IP is layout-friendly thanks to its tile-based modular design. Optimized for efficiency at low clock speeds, the aiWare IP core can operate anywhere from 100 MHz to 1 GHz. The hardware IP core is also highly deterministic, increasing safety by removing the complexity of caches and programmable cores. aiWare delivers more than 2 TMAC/s per W (4 TOP/s per W, 7 nm estimated) while sustaining >95% efficiency under continuous operation. The IP core offers a range of ASIL-B to ASIL-D compliant implementation options, either on-chip with a host CPU SoC or as a dedicated NN accelerator.
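The relationship between the TMAC/s and TOP/s figures quoted above follows a standard convention: one multiply-accumulate (MAC) counts as two operations (one multiply plus one add). A minimal sketch of this arithmetic (illustrative only, not part of any aiWare tooling):

```python
def macs_to_ops(tmacs_per_watt: float) -> float:
    """Convert a TMAC/s per W figure to TOP/s per W.

    Each MAC = one multiply + one add = 2 operations, so the
    OP/s figure is simply double the MAC/s figure.
    """
    return tmacs_per_watt * 2.0


# 2 TMAC/s per W corresponds to 4 TOP/s per W
print(macs_to_ops(2.0))  # -> 4.0
```

The same convention explains why a 200 GMAC/s platform is quoted at roughly 400 GOPS.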
All implementations and aiWare-based solutions are supported by the same software APIs and SDK. The IP's ultra-low-level architecture ensures that all scheduling complexity is handled by the SDK tools, minimizing the need for manual code or memory optimization. The aiWare Network Compiler accepts any NN as input via the NNEF™ standard (available now) or ONNX (Q2 2019), converting it into a binary ready for execution on the aiWare core. Support for these standards enables the acceleration of NNs from a wide range of frameworks, such as Caffe, TensorFlow, or PyTorch. The SDK includes a growing portfolio of tools, from compilers and FP32-to-INT8 conversion to performance analysis and NN optimization assistants.
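The aiWare SDK's FP32-to-INT8 conversion tools are proprietary, but the underlying idea of symmetric post-training quantization can be sketched generically. Everything below (function name, per-tensor scaling choice) is an illustrative assumption, not the SDK's actual API:

```python
import numpy as np


def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization of FP32 weights to INT8 (sketch).

    Maps the largest absolute weight to 127 and returns the INT8 tensor
    plus the scale needed to approximately recover the FP32 values:
        approx_fp32 = int8_values * scale
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


w = np.array([0.5, -1.27, 0.01], dtype=np.float32)
q, scale = quantize_int8(w)
# Each dequantized value lies within half a quantization step of the original
print(np.abs(q * scale - w).max() <= scale)
```

Production flows typically refine this with per-channel scales and calibration data, which is the kind of detail such SDK tools automate.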
AImotive offers aiWare Evaluation Systems to enable benchmarking of the hardware IP core. The FPGA-based evaluation platform delivers up to 200 GMAC/s (400-500 GOPS). Our proof-of-concept silicon implementation of aiWare, created with VeriSilicon™ and GlobalFoundries™, provides up to 1.6 TMAC/s (>3 TOPS). All evaluation systems can run our partners' own neural networks using the aiWare SDK and AImotive's benchmarks. Like the aiWare SDK, the evaluation systems rely on NNEF for flexibility. Thus, our partners can independently verify our benchmark results, generated on our closely specified benchmark framework, while gaining insight into the performance they can expect from their technologies when using aiWare.