
Online Block Layer Decomposition schemes for training Deep Neural Networks

Author

Listed:
  • Laura Palagi

    (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)

  • Ruggiero Seccia

    (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)

Abstract

Estimating the weights of Deep Feedforward Neural Networks (DFNNs) requires solving a very large nonconvex optimization problem that may have many local (non-global) minimizers, saddle points and large plateaus. Furthermore, the time needed to find good solutions to the training problem depends heavily on both the number of samples and the number of weights (variables). In this work, we show how Block Coordinate Descent (BCD) methods can be applied to improve the performance of state-of-the-art algorithms by avoiding bad stationary points and flat regions. We first describe a batch BCD method that effectively tackles the difficulties due to the network's depth; we then extend the algorithm by proposing an online BCD scheme that scales with respect to both the number of variables and the number of samples. We report extensive numerical results on standard datasets using different deep networks, and we show that applying (online) BCD methods to the training phase of DFNNs makes it possible to outperform standard batch/online algorithms, improving both the training phase and the generalization performance of the networks.
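The abstract does not spell out the update rules, so the following is only a minimal sketch of the general idea of an online block layer decomposition, not the authors' scheme. It assumes that each layer's weight matrix is one block, that each block is updated by a single plain gradient step on the current mini-batch while the other layers are held fixed, and that the network has tanh hidden layers, a linear output and no biases. All names (`forward`, `layer_gradient`, `online_block_layer_descent`) and all hyperparameters are hypothetical and chosen for illustration.

```python
import numpy as np

def forward(weights, X):
    """Return all layer activations: tanh hidden layers, linear output."""
    acts = [X]
    for i, W in enumerate(weights):
        z = acts[-1] @ W
        acts.append(z if i == len(weights) - 1 else np.tanh(z))
    return acts

def loss(weights, X, y):
    """Half mean squared error of the network output."""
    return 0.5 * np.mean((forward(weights, X)[-1] - y) ** 2)

def layer_gradient(weights, X, y, layer):
    """Gradient of the loss with respect to the weights of one layer only."""
    acts = forward(weights, X)
    delta = (acts[-1] - y) / y.size               # dLoss / d(output)
    for i in range(len(weights) - 1, layer, -1):
        delta = delta @ weights[i].T              # back through linear map i
        delta = delta * (1.0 - acts[i] ** 2)      # back through tanh
    return acts[layer].T @ delta

def online_block_layer_descent(weights, X, y, epochs=50, batch_size=16,
                               lr=0.05, seed=0):
    """Assumed scheme: per mini-batch, update the layers one block at a time."""
    weights = [W.copy() for W in weights]         # leave the caller's list intact
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            for layer in range(len(weights)):     # one block = one layer
                g = layer_gradient(weights, X[idx], y[idx], layer)
                weights[layer] = weights[layer] - lr * g
    return weights

# Synthetic regression data (illustrative only, not from the paper).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(128, 2))
y = np.sin(X[:, :1]) + 0.5 * X[:, 1:2]
sizes = [2, 8, 8, 1]
init = [rng.normal(scale=0.5, size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]

initial_loss = loss(init, X, y)
trained = online_block_layer_descent(init, X, y)
final_loss = loss(trained, X, y)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Because each block update uses the weights already refreshed for the earlier layers in the same sweep, one sweep over the layers differs from a standard mini-batch gradient step, which would compute all layer gradients at the same point. Replacing the single gradient step with several inner iterations per block is another possible design, again as an assumption rather than something stated in the abstract.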

Suggested Citation

  • Laura Palagi & Ruggiero Seccia, 2019. "Online Block Layer Decomposition schemes for training Deep Neural Networks," DIAG Technical Reports 2019-06, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
  • Handle: RePEc:aeg:report:2019-06

    Download full text from publisher

    File URL: http://users.diag.uniroma1.it/~biblioteca/sites/default/files/documents/2019-06.pdf
    File Function: First version, 2019
    Download Restriction: no

    Citations


    Cited by:

    1. Ruggiero Seccia & Daniele Gammelli & Fabio Dominici & Silvia Romano & Anna Chiara Landi & Marco Salvetti & Andrea Tacchella & Andrea Zaccaria & Andrea Crisanti & Francesca Grassi & Laura Palagi, 2020. "Considering patient clinical history impacts performance of machine learning models in predicting course of multiple sclerosis," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-18, March.

    More about this item

    Keywords

    Deep Feedforward Neural Networks; Block coordinate decomposition; Online Optimization; Large scale optimization