
Online Block Layer Decomposition schemes for training Deep Neural Networks

Author

Listed:
  • Laura Palagi (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)
  • Ruggiero Seccia (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)

Abstract

Estimating the weights of Deep Feedforward Neural Networks (DFNNs) requires solving a very large nonconvex optimization problem that may have many local (non-global) minimizers, saddle points, and large plateaus. Furthermore, the time needed to find good solutions to the training problem depends heavily on both the number of samples and the number of weights (variables). In this work, we show how Block Coordinate Descent (BCD) methods can be applied to improve the performance of state-of-the-art algorithms by avoiding bad stationary points and flat regions. We first describe a batch BCD method able to effectively tackle the difficulties due to the network's depth; we then extend the algorithm by proposing an online BCD scheme able to scale with respect to both the number of variables and the number of samples. We report extensive numerical results on standard datasets using several deep networks, and we show that applying (online) BCD methods to the training phase of DFNNs outperforms standard batch/online algorithms, improving both the training behavior and the generalization performance of the networks.
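
To make the layer-wise decomposition concrete, the sketch below illustrates the general idea of block coordinate descent over a network's layers: cycle over one layer block at a time, freeze all other weights, and take a few gradient steps on the current block only. This is a minimal PyTorch sketch, not the authors' implementation; the toy data, architecture, learning rate, and number of inner steps per block are illustrative assumptions.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy regression data (hypothetical stand-in for the paper's benchmark datasets)
    X = torch.randn(256, 20)
    y = torch.randn(256, 1)

    model = nn.Sequential(
        nn.Linear(20, 64), nn.Tanh(),
        nn.Linear(64, 64), nn.Tanh(),
        nn.Linear(64, 1),
    )
    loss_fn = nn.MSELoss()

    # One block per layer: the parameters of a single Linear module
    blocks = [m for m in model if isinstance(m, nn.Linear)]

    for epoch in range(50):
        for block in blocks:
            # Freeze all weights except the current block's
            for p in model.parameters():
                p.requires_grad_(False)
            for p in block.parameters():
                p.requires_grad_(True)
            opt = torch.optim.SGD(block.parameters(), lr=1e-2)
            for _ in range(5):  # a few inner gradient steps on this block
                opt.zero_grad()
                loss = loss_fn(model(X), y)
                loss.backward()
                opt.step()

In the online scheme proposed in the paper, the inner updates would be computed on minibatches of samples rather than on the full batch, which is what allows the method to scale in both the number of variables and the number of samples.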

Suggested Citation

  • Laura Palagi & Ruggiero Seccia, 2019. "Online Block Layer Decomposition schemes for training Deep Neural Networks," DIAG Technical Reports 2019-06, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
  • Handle: RePEc:aeg:report:2019-06

    Download full text from publisher

    File URL: http://users.diag.uniroma1.it/~biblioteca/sites/default/files/documents/2019-06.pdf
    File Function: First version, 2019
    Download Restriction: no

    More about this item

    Keywords

    Deep Feedforward Neural Networks; Block coordinate decomposition; Online optimization; Large-scale optimization

