Theory of adaptive SVD regularization for deep neural networks

By: Mohammad Mahdi Bejani & Mehdi Ghatee

Published: Neural Networks, Volume 128, August 2020, Pages 33-46.

Abstract:

Deep networks can learn complex problems; however, they suffer from overfitting. To address this problem, various regularization methods have been proposed, but they are not adaptable to dynamic changes in the training process. Taking a different approach, this paper presents a regularization method based on the Singular Value Decomposition (SVD) that adjusts the learning model adaptively. To this end, overfitting is evaluated by the condition numbers of the synaptic matrices. When overfitting is high, the matrices are substituted with their SVD approximations. Some theoretical results are derived to show the performance of this regularization method. It is proved that SVD approximation alone cannot resolve overfitting after several iterations. Thus, a new Tikhonov term is added to the loss function to drive the synaptic weights toward the SVD approximation of the best-found results. Following this approach, an Adaptive SVD Regularization (ASR) is proposed that adjusts the learning model with respect to the dynamic characteristics of training. ASR results are visualized to show how ASR overcomes overfitting. Different configurations of Convolutional Neural Networks (CNN) are implemented with different augmentation schemes to compare ASR with state-of-the-art regularization methods. The results show that on MNIST, F-MNIST, SVHN, CIFAR-10 and CIFAR-100, the accuracies of ASR are 99.4%, 95.7%, 97.1%, 93.2% and 55.6%, respectively. Although ASR reduces overfitting and the validation loss, its elapsed time is not significantly greater than that of learning without regularization.
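The core mechanism described in the abstract — monitoring the condition number of a synaptic (weight) matrix and, when it grows too large, replacing the matrix with a truncated-SVD approximation — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the condition-number threshold and the retained rank `k` are assumed hyperparameters, not values from the paper.

```python
import numpy as np

def condition_number(W):
    # Ratio of the largest to the smallest singular value; a large
    # value signals an ill-conditioned, overly complex synaptic matrix.
    s = np.linalg.svd(W, compute_uv=False)
    return s[0] / s[-1]

def svd_truncate(W, k):
    # Keep only the k largest singular components of W,
    # yielding the best rank-k approximation in the Frobenius norm.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Example: replace an ill-conditioned weight matrix with its
# rank-k approximation (threshold 100 and k=4 are assumed).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
if condition_number(W) > 100.0:
    W = svd_truncate(W, k=4)
```

In an actual training loop this check would run periodically over each layer's weight matrix, simplifying only the layers whose condition numbers indicate excessive complexity.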

Highlights:

  • Applying the condition number of synaptic matrices to evaluate learning model complexity.
  • Regularizing neural networks by SVD approximation.
  • Simplifying the synaptic matrices using the most important SVD components.
  • Applying a new Tikhonov term in the loss function to preserve the best-found results.
  • Proposing an adaptive SVD regularization for CNNs to improve training and validation errors.
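The Tikhonov term mentioned in the highlights penalizes the distance between the current weights and the SVD approximation of the best-found weights. A hedged sketch of such a penalized loss, assuming a simple squared Frobenius-norm penalty with an illustrative strength `lam` (the exact form and value in the paper may differ):

```python
import numpy as np

def asr_loss(base_loss, W, W_best_svd, lam=1e-3):
    # Hypothetical Tikhonov-style penalty: pull the current weights W
    # toward W_best_svd, the truncated-SVD approximation of the
    # best-found weights. lam is an assumed regularization strength.
    return base_loss + lam * np.sum((W - W_best_svd) ** 2)

# Usage: combine with the task loss during training.
W = np.ones((2, 2))
W_best_svd = np.zeros((2, 2))
total = asr_loss(1.0, W, W_best_svd, lam=0.5)  # 1.0 + 0.5 * 4 = 3.0
```

This term keeps the network from drifting away from the simplified (regularized) solution once overfitting has been detected and corrected.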
