The aiWare IP core is fully synthesizable RTL requiring no special libraries, enabling neural network acceleration cores scaling from 1 TMAC/s to 50+ TMAC/s. It can be deployed on-chip alongside a host CPU in an SoC or as a dedicated NN accelerator. The application-agnostic IP is optimized for low-latency automotive applications, and the architecture maximizes host CPU offload: using on-chip SRAM and external DRAM, it keeps execution and dataflow within the aiWare core.
The aiWare SDK provides tools to maximize efficiency. It enables the acceleration of NNs built in a wide range of frameworks such as Caffe, TensorFlow, or PyTorch. As an early implementation of the Khronos NNEF™ standard, the SDK offers an NNEF importer that translates networks from these frameworks into binaries executable on aiWare-based systems. The SDK also includes tools to convert CNNs based on FP32, FP16, or INT16 to INT8 with minimal loss of precision.
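To illustrate the kind of conversion involved, the sketch below shows symmetric per-tensor INT8 quantization, a common approach to mapping FP32 weights to INT8 with bounded error. This is a generic illustration under assumed conventions, not the aiWare SDK's actual algorithm or API; the function names are hypothetical.

```python
# Hypothetical sketch: symmetric per-tensor FP32 -> INT8 quantization.
# Not the aiWare SDK API; function names are illustrative only.

def quantize_int8(values):
    """Map FP32 values to INT8 codes using a single symmetric scale."""
    # Scale so the largest magnitude maps to 127; guard the all-zero case.
    scale = max(abs(v) for v in values) / 127.0 or 1.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return [c * scale for c in codes]

weights = [0.82, -1.27, 0.05, 0.40]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Rounding error per element is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Because the error per value is at most half the quantization step (scale / 2), well-conditioned weight tensors typically survive this conversion with minimal accuracy loss, which is the property the SDK's tools are designed to preserve.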
AImotive has seen many partners suffer the consequences of relying on inappropriate benchmarks for NN acceleration. We have therefore created an inference environment for benchmarking aiWare. This tightly specified suite is published openly. It runs on Caffe, with a set of well-defined NN workloads derived from industry-standard benchmarks. The results of tests run on aiWare are also publicly available.
AImotive offers an aiWare v2 FPGA Evaluation System delivering up to 200 GMAC/s (400-500 GOPS). The system runs both sample neural networks created by AImotive and customers' own networks. It includes the aiWare SDK and relies on NNEF for flexibility, so partners can gain an understanding of how their approaches perform on aiWare-based systems. A high-performance custom aiWare chip was created in Q4 2018.