
Online Block Layer Decomposition schemes for training Deep Neural Networks

Author

Listed:
  • Laura Palagi (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)
  • Ruggiero Seccia (Department of Computer, Control and Management Engineering Antonio Ruberti (DIAG), University of Rome La Sapienza, Rome, Italy)

Abstract

Estimating the weights of Deep Feedforward Neural Networks (DFNNs) requires solving a very large nonconvex optimization problem that may have many local (non-global) minimizers, saddle points, and large plateaus. Furthermore, the time needed to find good solutions to the training problem depends heavily on both the number of samples and the number of weights (variables). In this work, we show how Block Coordinate Descent (BCD) methods can be applied to improve the performance of state-of-the-art algorithms by avoiding bad stationary points and flat regions. We first describe a batch BCD method able to effectively tackle the difficulties due to the network's depth; we then extend the algorithm by proposing an online BCD scheme able to scale with respect to both the number of variables and the number of samples. We report extensive numerical results on standard datasets using several deep networks, and we show that applying (online) BCD methods to the training phase of DFNNs outperforms standard batch/online algorithms, improving both the training behavior and the generalization performance of the networks.
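
To make the layer-wise decomposition concrete, the sketch below illustrates the general idea of block coordinate descent over a network's layers: cycle over one layer block at a time, freeze all other weights, and take a few gradient steps on the current block only. This is a minimal PyTorch sketch, not the authors' implementation; the toy data, architecture, learning rate, and number of inner steps per block are illustrative assumptions.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy regression data (hypothetical stand-in for the paper's benchmark datasets)
    X = torch.randn(256, 20)
    y = torch.randn(256, 1)

    model = nn.Sequential(
        nn.Linear(20, 64), nn.Tanh(),
        nn.Linear(64, 64), nn.Tanh(),
        nn.Linear(64, 1),
    )
    loss_fn = nn.MSELoss()

    # One block per layer: the parameters of a single Linear module
    blocks = [m for m in model if isinstance(m, nn.Linear)]

    for epoch in range(50):
        for block in blocks:
            # Freeze all weights except the current block's
            for p in model.parameters():
                p.requires_grad_(False)
            for p in block.parameters():
                p.requires_grad_(True)
            opt = torch.optim.SGD(block.parameters(), lr=1e-2)
            for _ in range(5):  # a few inner gradient steps on this block
                opt.zero_grad()
                loss = loss_fn(model(X), y)
                loss.backward()
                opt.step()

In the online scheme proposed in the paper, the inner updates would be computed on minibatches of samples rather than on the full batch, which is what allows the method to scale in both the number of variables and the number of samples.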

Suggested Citation

  • Laura Palagi & Ruggiero Seccia, 2019. "Online Block Layer Decomposition schemes for training Deep Neural Networks," DIAG Technical Reports 2019-06, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
  • Handle: RePEc:aeg:report:2019-06

    Download full text from publisher

    File URL: http://users.diag.uniroma1.it/~biblioteca/sites/default/files/documents/2019-06.pdf
    File Function: First version, 2019
    Download Restriction: no

    More about this item

    Keywords

    Deep Feedforward Neural Networks; Block coordinate decomposition; Online optimization; Large-scale optimization

