The aiWare IP core is fully synthesizable RTL that needs no special libraries, enabling neural network acceleration cores from 1 TMAC/s to 50+ TMAC/s. aiWare can be used either on-chip alongside a host CPU SoC or as a dedicated NN accelerator. Although optimized for low-latency automotive applications, aiWare is application-agnostic: it can be used for any NN acceleration task where low power consumption, high efficiency, and high sustained throughput are crucial. aiWare’s architecture maximizes host CPU offload. Using on-chip SRAM and external DRAM, it keeps execution and dataflow within the aiWare core. The patented dataflow techniques used in aiWare were developed using real-world workloads such as aiDrive, as well as benchmarks based on industry-standard workloads.
The comprehensive aiWare SDK provides the tools to maximize the core's efficiency. It enables the acceleration of NNs from a wide range of frameworks such as Caffe, TensorFlow or PyTorch. As one of the first implementations of the Khronos Neural Network Exchange Format (NNEF), the aiWare SDK offers an NNEF importer to translate networks from these frameworks into binaries ready for execution on aiWare-based systems. As a result, aiWare hardware solutions offer unparalleled freedom in the choice of deep learning frameworks. The SDK also includes tools to quickly translate CNNs based on FP32, FP16 or INT16 into an INT8 implementation with little or no loss of precision. Thus, the SDK drastically reduces the overhead of switching to aiWare from existing inference engines.
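To give a feel for what translating FP32 values into INT8 involves, here is a minimal sketch of symmetric per-tensor quantization. This is a generic textbook technique chosen for illustration, not the aiWare SDK's actual method; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization (illustrative, not the aiWare SDK API).

    Maps floats onto integers in [-127, 127] using a single scale factor.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid division by zero for an all-zero (or empty) tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


# Hypothetical FP32 weights, used only to show the round trip.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The round-trip error of any in-range value is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Values much smaller than the scale factor, such as `0.003` here, collapse to zero. This is why practical conversion tools typically calibrate scales on representative data rather than relying on a single global maximum.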
AImotive has seen many of its partners and customers suffer the consequences of incorrect decisions about NN acceleration that were based on inappropriate or outdated benchmarks. Therefore, AImotive engineers have applied their many years of experience creating professional industry benchmarks to develop a controlled inference environment for benchmarking aiWare. This tightly specified suite is published openly for anyone to use. The benchmark framework runs on the Caffe deep learning framework, with a set of well-defined NN workloads derived from industry-standard, publicly available benchmarks. The results of tests run with aiWare on the framework are also publicly available, showcasing how aiWare outperforms high-end desktop GPUs.
AImotive offers an aiWare v2 FPGA Evaluation System, delivering up to 200 GMAC/s (400-500 GOPS). The system runs both sample neural networks created by AImotive and customers’ own networks. The evaluation system includes the aiWare SDK and uses NNEF to enable users to take neural networks from different frameworks such as Caffe, TensorFlow or PyTorch and translate them into binaries ready for execution on aiWare-based hardware platforms. This allows partners to gain an understanding of the performance they can expect from their existing approaches when run on aiWare-based systems. A high-performance custom chip-based version of the evaluation system is currently under development and expected to ship to lead customers in Q4 2018.
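The relationship between the MAC/s and OPS figures quoted above can be sketched as simple arithmetic, assuming the common convention that one multiply-accumulate counts as two operations (one multiply plus one add). The helper name `gmacs_to_gops` is hypothetical. The text does not explain the upper 500 GOPS figure, so this sketch covers only the MAC-derived part.

```python
OPS_PER_MAC = 2  # assumed convention: one multiply plus one add per MAC


def gmacs_to_gops(gmacs: float) -> float:
    """Convert a GMAC/s throughput figure to GOPS under the 2-ops-per-MAC convention."""
    return gmacs * OPS_PER_MAC


# Evaluation system: up to 200 GMAC/s.
fpga_gops = gmacs_to_gops(200)

# IP core range: 1 TMAC/s to 50 TMAC/s, expressed in GMAC/s.
core_min_gops = gmacs_to_gops(1_000)
core_max_gops = gmacs_to_gops(50_000)

print(fpga_gops, core_min_gops, core_max_gops)
```

Under this convention, 200 GMAC/s corresponds to the lower bound of the quoted 400-500 GOPS range.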