In recent years, the Convolutional Neural Network (CNN) has received widespread attention in the field of machine learning because of its high accuracy in character recognition and image classification. However, CNNs are both compute-intensive and memory-intensive, which poses a serious challenge to general-purpose processors that must support a wide variety of workloads. As a result, a large number of CNN-specific hardware accelerators have emerged to improve efficiency. Although highly efficient, previous accelerators are not flexible enough. In this study, classical CNN models are analyzed and a domain-specific instruction set of 10 matrix instructions, called RV-CNN, is designed based on the promising RISC-V architecture. By abstracting CNN computation into instructions, the proposed design provides sufficient flexibility for CNNs and achieves higher code density than a general-purpose ISA. On this basis, a code-to-instruction mapping mechanism is proposed. Different CNN models were built with RV-CNN on the Xilinx ZC702. Compared with x86 processors, RV-CNN achieves on average 141 times the energy efficiency and 8.91 times the code density; compared with a GPU, it achieves on average 1.25 times the energy efficiency and 1.95 times the code density. In addition, compared with previous CNN accelerators, the design supports typical CNN models while maintaining high energy efficiency.
Wenqi Lou, Chao Wang, Lei Gong, Xuehai Zhou. Neural Network Instruction Set Extension and Code Mapping Mechanism. International Journal of Software and Informatics, 2021,11(2):243~258