Printed from https://ideas.repec.org/p/tse/wpaper/123630.html

An Inertial Newton Algorithm for Deep Learning

Author

Listed:
  • Bolte, Jérôme
  • Castera, Camille
  • Pauwels, Edouard
  • Févotte, Cédric

Abstract

We devise a learning algorithm for possibly nonsmooth deep neural networks featuring inertia and Newtonian directional intelligence only by means of a backpropagation oracle. Our algorithm, called INDIAN, has an appealing mechanical interpretation, making the role of its two hyperparameters transparent. An elementary phase space lifting allows both for its implementation and its theoretical study under very general assumptions. We handle in particular a stochastic version of our method (which encompasses usual mini-batch approaches) for nonsmooth activation functions (such as ReLU). Our algorithm shows high efficiency and reaches state of the art on image classification problems.
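The abstract does not spell out the update rule, so the following is only a minimal sketch of what a phase-space lifting with two hyperparameters could look like. It assumes the iteration is an explicit Euler discretization of a lifted first-order system in a pair (theta, psi), with damping parameters alpha and beta and step size gamma; this form is an assumption to check against the full text, not a statement taken from this page. The names `indian_step`, `run_indian` and `grad_fn`, the constant step size, the initialization of psi and the toy quadratic loss are all hypothetical choices made for illustration. The point it tries to show is that only a gradient (backpropagation) oracle is called; no Hessian is ever formed.

```python
import numpy as np


def indian_step(theta, psi, grad, alpha, beta, gamma):
    """One explicit-Euler step on the assumed lifted (theta, psi) system.

    `grad` is whatever the backpropagation oracle returns at `theta`
    (a full or mini-batch gradient); no second-order information is used.
    """
    # Term shared by both updates (assumed form of the lifted dynamics).
    drift = (1.0 / beta - alpha) * theta - psi / beta
    theta_next = theta + gamma * (drift - beta * grad)
    psi_next = psi + gamma * drift
    return theta_next, psi_next


def run_indian(grad_fn, theta0, alpha=0.5, beta=0.1, gamma=0.1, n_steps=2000):
    """Iterate the step above with a constant step size (a simplification)."""
    theta = np.asarray(theta0, dtype=float)
    # Hypothetical initialization: makes the shared drift term vanish at start.
    psi = (1.0 - alpha * beta) * theta
    for _ in range(n_steps):
        theta, psi = indian_step(theta, psi, grad_fn(theta), alpha, beta, gamma)
    return theta


# Toy problem (an assumption, not from the paper): J(theta) = 0.5 * ||theta||^2,
# whose gradient is theta itself and whose minimizer is the origin.
theta_final = run_indian(lambda t: t, theta0=[3.0, -2.0])
print(np.linalg.norm(theta_final))
```

In a stochastic setting `grad_fn` would return a mini-batch gradient and the step size would typically decrease over iterations; the constant `gamma` here is only meant to keep the sketch short.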

Suggested Citation

  • Bolte, Jérôme & Castera, Camille & Pauwels, Edouard & Févotte, Cédric, 2019. "An Inertial Newton Algorithm for Deep Learning," TSE Working Papers 19-1043, Toulouse School of Economics (TSE).
  • Handle: RePEc:tse:wpaper:123630

    Download full text from publisher

    File URL: https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2019/wp_tse_1043.pdf
    File Function: Full Text
    Download Restriction: no

    References listed on IDEAS

    1. Hédy Attouch & Jérôme Bolte & Patrick Redont & Antoine Soubeyran, 2010. "Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Łojasiewicz Inequality," Mathematics of Operations Research, INFORMS, vol. 35(2), pages 438-457, May.

    Citations

    Cited by:

    1. Samir Adly & Hedy Attouch & Van Nam Vo, 2023. "Convergence of Inertial Dynamics Driven by Sums of Potential and Nonpotential Operators with Implicit Newton-Like Damping," Journal of Optimization Theory and Applications, Springer, vol. 198(1), pages 290-331, July.
    2. Bolte, Jérôme & Le, Tam & Pauwels, Edouard & Silveti-Falls, Antonio, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
    3. Claire Boyer & Antoine Godichon-Baggioni, 2023. "On the asymptotic rate of convergence of Stochastic Newton algorithms and their Weighted Averaged versions," Computational Optimization and Applications, Springer, vol. 84(3), pages 921-972, April.
    4. Emilie Chouzenoux & Jean-Baptiste Fest, 2022. "SABRINA: A Stochastic Subspace Majorization-Minimization Algorithm," Journal of Optimization Theory and Applications, Springer, vol. 195(3), pages 919-952, December.
    5. Bolte, Jérôme & Pauwels, Edouard, 2019. "Conservative set valued fields, automatic differentiation, stochastic gradient methods and deep learning," TSE Working Papers 19-1044, Toulouse School of Economics (TSE).
    6. Bolte, Jérôme & Glaudin, Lilian & Pauwels, Edouard & Serrurier, Matthieu, 2021. "A Hölderian backtracking method for min-max and min-min problems," TSE Working Papers 21-1243, Toulouse School of Economics (TSE).
    7. Bolte, Jérôme & Pauwels, Edouard, 2021. "A mathematical model for automatic differentiation in machine learning," TSE Working Papers 21-1184, Toulouse School of Economics (TSE).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Maryam Yashtini, 2022. "Convergence and rate analysis of a proximal linearized ADMM for nonconvex nonsmooth optimization," Journal of Global Optimization, Springer, vol. 84(4), pages 913-939, December.
    2. Silvia Bonettini & Peter Ochs & Marco Prato & Simone Rebegoldi, 2023. "An abstract convergence framework with application to inertial inexact forward–backward methods," Computational Optimization and Applications, Springer, vol. 84(2), pages 319-362, March.
    3. Le Thi Khanh Hien & Duy Nhat Phan & Nicolas Gillis, 2022. "Inertial alternating direction method of multipliers for non-convex non-smooth optimization," Computational Optimization and Applications, Springer, vol. 83(1), pages 247-285, September.
    4. Francesco Rinaldi & Damiano Zeffiro, 2023. "Avoiding bad steps in Frank-Wolfe variants," Computational Optimization and Applications, Springer, vol. 84(1), pages 225-264, January.
    5. Emilie Chouzenoux & Jean-Christophe Pesquet & Audrey Repetti, 2016. "A block coordinate variable metric forward–backward algorithm," Journal of Global Optimization, Springer, vol. 66(3), pages 457-485, November.
    6. Kely D. V. Villacorta & Paulo R. Oliveira & Antoine Soubeyran, 2014. "A Trust-Region Method for Unconstrained Multiobjective Problems with Applications in Satisficing Processes," Journal of Optimization Theory and Applications, Springer, vol. 160(3), pages 865-889, March.
    7. Zhili Ge & Zhongming Wu & Xin Zhang & Qin Ni, 2023. "An extrapolated proximal iteratively reweighted method for nonconvex composite optimization problems," Journal of Global Optimization, Springer, vol. 86(4), pages 821-844, August.
    8. Bo Jiang & Tianyi Lin & Shiqian Ma & Shuzhong Zhang, 2019. "Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis," Computational Optimization and Applications, Springer, vol. 72(1), pages 115-157, January.
    9. Zehui Jia & Jieru Huang & Xingju Cai, 2021. "Proximal-like incremental aggregated gradient method with Bregman distance in weakly convex optimization problems," Journal of Global Optimization, Springer, vol. 80(4), pages 841-864, August.
    10. Dominikus Noll, 2014. "Convergence of Non-smooth Descent Methods Using the Kurdyka–Łojasiewicz Inequality," Journal of Optimization Theory and Applications, Springer, vol. 160(2), pages 553-572, February.
    11. Radu Ioan Bot & Dang-Khoa Nguyen, 2020. "The Proximal Alternating Direction Method of Multipliers in the Nonconvex Setting: Convergence Analysis and Rates," Mathematics of Operations Research, INFORMS, vol. 45(2), pages 682-712, May.
    12. Peter Ochs, 2018. "Local Convergence of the Heavy-Ball Method and iPiano for Non-convex Optimization," Journal of Optimization Theory and Applications, Springer, vol. 177(1), pages 153-180, April.
    13. Glaydston Carvalho Bento & João Xavier Cruz Neto & Antoine Soubeyran & Valdinês Leite Sousa Júnior, 2016. "Dual Descent Methods as Tension Reduction Systems," Journal of Optimization Theory and Applications, Springer, vol. 171(1), pages 209-227, October.
    14. Bolte, Jérôme & Le, Tam & Pauwels, Edouard & Silveti-Falls, Antonio, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
    15. Guoyin Li & Tianxiang Liu & Ting Kei Pong, 2017. "Peaceman–Rachford splitting for a class of nonconvex optimization problems," Computational Optimization and Applications, Springer, vol. 68(2), pages 407-436, November.
    16. S. Bonettini & M. Prato & S. Rebegoldi, 2018. "A block coordinate variable metric linesearch based proximal gradient method," Computational Optimization and Applications, Springer, vol. 71(1), pages 5-52, September.
    17. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "A Block Inertial Bregman Proximal Algorithm for Nonsmooth Nonconvex Problems with Application to Symmetric Nonnegative Matrix Tri-Factorization," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 234-258, July.
    18. Alexander Y. Kruger & Nguyen H. Thao, 2015. "Quantitative Characterizations of Regularity Properties of Collections of Sets," Journal of Optimization Theory and Applications, Springer, vol. 164(1), pages 41-67, January.
    19. Jing Zhao & Qiao-Li Dong & Michael Th. Rassias & Fenghui Wang, 2022. "Two-step inertial Bregman alternating minimization algorithm for nonconvex and nonsmooth problems," Journal of Global Optimization, Springer, vol. 84(4), pages 941-966, December.
    20. Fornasier, Massimo & Maly, Johannes & Naumova, Valeriya, 2021. "Robust recovery of low-rank matrices with non-orthogonal sparse decomposition from incomplete measurements," Applied Mathematics and Computation, Elsevier, vol. 392(C).
