
A training method for low rank convolutional neural networks based on alternating tensor compose-decompose method

Title
A training method for low rank convolutional neural networks based on alternating tensor compose-decompose method
Authors
Lee S.; Kim H.; Jeong B.; Yoon J.
Ewha Authors
윤정호 (Yoon J.); 정병선 (Jeong B.)
Issue Date
2021
Journal Title
Applied Sciences (Switzerland)
ISSN
2076-3417
Citation
Applied Sciences (Switzerland), vol. 11, no. 2, pp. 1-23
Keywords
Convolutional neural networks; Deep compression; Deep learning; Low rank; MobileNet
Publisher
MDPI AG
Indexed
SCIE; SCOPUS
Document Type
Article
Abstract
Over the past decade, deep learning-based computer vision methods have been shown to surpass previous state-of-the-art computer vision techniques in various fields, and have made great progress on various computer vision problems, including object detection, object segmentation, and face recognition. Major IT companies are now adding new deep learning-based computer technologies to edge devices such as smartphones. However, since the computational cost of deep learning-based models is still high for edge devices, research is being actively carried out to compress deep learning-based models without sacrificing performance. Recently, many lightweight architectures based on low-rank approximation have been proposed for deep learning-based models. In this paper, we propose an alternating tensor compose-decompose (ATCD) method for the training of low-rank convolutional neural networks. The proposed training method trains a compressed low-rank deep learning model better than the conventional fixed-structure training method, so that a compressed deep learning model with higher performance is obtained at the end of training. As a representative model to which the proposed training method can be applied, we propose a rank-1 convolutional neural network (CNN) whose structure alternately contains 3-D rank-1 filters and 1-D filters in the training stage and has a 1-D structure in the testing stage. After training, the 3-D rank-1 filters can be permanently decomposed into 1-D filters to achieve fast inference at test time. The 1-D filters are not trained directly in 1-D form because the 3-D rank-1 filters are easier to train owing to better gradient flow, which makes training possible even in cases where a fixed-structure network with consecutive 1-D filters cannot be trained at all.
We also show that the same training method can be applied to the well-known MobileNet architecture, so that better parameters can be obtained than with the conventional fixed-structure training method. Furthermore, we show that the 1-D filters in a ResNet-like structure can also be trained with the proposed method, which shows that the proposed method can be applied to various network structures. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
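The compose/decompose relationship described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and does not reproduce the ATCD training procedure. It only checks the underlying identity: a 3-D rank-1 filter, composed as the outer product of three 1-D filters, gives the same output as applying those 1-D filters one after another. The function names, the axis order (channel, height, width), and the use of unpadded cross-correlation are assumptions made for this illustration.

```python
import numpy as np


def compose_rank1(a, b, c):
    """Compose a 3-D rank-1 filter as the outer product of three 1-D filters.

    a acts across channels, b along height, c along width (assumed order).
    """
    return np.einsum("c,u,v->cuv", a, b, c)


def correlate3d_valid(x, w):
    """Apply a 3-D filter w to input x of shape (C, H, W), without padding."""
    _, height, width = x.shape
    _, kh, kw = w.shape
    out = np.empty((height - kh + 1, width - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w)
    return out


def apply_1d_sequence(x, a, b, c):
    """Apply the three 1-D filters consecutively (the decomposed form)."""
    y = np.tensordot(a, x, axes=(0, 0))          # channel filter -> (H, W)
    height, width = y.shape
    kh, kw = len(b), len(c)
    # Vertical 1-D filter: z[i, j] = sum_u b[u] * y[i + u, j]
    z = sum(b[u] * y[u:u + height - kh + 1, :] for u in range(kh))
    # Horizontal 1-D filter: out[i, j] = sum_v c[v] * z[i, j + v]
    return sum(c[v] * z[:, v:v + width - kw + 1] for v in range(kw))


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 9, 7))               # toy input: 4 channels, 9 x 7
a = rng.standard_normal(4)
b = rng.standard_normal(3)
c = rng.standard_normal(5)

w = compose_rank1(a, b, c)
same = np.allclose(correlate3d_valid(x, w), apply_1d_sequence(x, a, b, c))
print("3-D filter parameters:", w.size, "| 1-D parameters:", a.size + b.size + c.size)
print("outputs agree:", same)
```

In this toy setting the composed filter stores 4 × 3 × 5 values while the decomposed form stores 4 + 3 + 5, which is the kind of saving that makes the 1-D structure attractive at test time.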
DOI
10.3390/app11020643
Appears in Collections:
College of Natural Sciences > Mathematics > Journal papers