Suboptimal Local Minima Exist for Wide Neural Networks with Smooth Activations

Authors

  • Tian Ding

    (Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong)

  • Dawei Li

    (Department of Industrial and Enterprise Systems Engineering and Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801)

  • Ruoyu Sun

    (Department of Industrial and Enterprise Systems Engineering and Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801)

Abstract

Does a large width eliminate all suboptimal local minima for neural nets? An affirmative answer was given by a classic result published in 1995 for one-hidden-layer wide neural nets with a sigmoid activation function, but this result has not been extended to the multilayer case. Recently, it was shown that, with piecewise linear activations, suboptimal local minima exist even for wide nets. Given the classic positive result for a smooth activation and the recent negative result for nonsmooth activations, an interesting open question is: Does a large width eliminate all suboptimal local minima for deep neural nets with smooth activation? In this paper, we give a largely negative answer to this question. Specifically, we prove that, for neural networks with generic input data and smooth nonlinear activation functions, suboptimal local minima can exist no matter how wide the network is (as long as the last hidden layer has at least two neurons). Therefore, the classic result of no suboptimal local minimum for a one-hidden-layer network does not hold. Whereas this classic result assumes sigmoid activation, our counterexample covers a large set of activation functions (dense in the set of continuous functions), indicating that the limitation is not a result of the specific activation. Together with recent progress on piecewise linear activations, our result indicates that suboptimal local minima are common for wide neural nets.
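As an informal illustration of the claim above (a toy numerical sketch under assumed settings, not the authors' construction), the snippet below trains a wide one-hidden-layer tanh network — a smooth activation — on generic random data from several random initializations, then flags runs that stall with a near-zero gradient but a loss well above the best run. The dataset, width, step size, and iteration count are all hypothetical illustration choices.

```python
# A minimal sketch (not the paper's construction): probe for candidate
# suboptimal stationary points of a wide one-hidden-layer tanh network.
# All sizes, data, step sizes, and tolerances are hypothetical choices.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 8, 3, 50                 # samples, input dim, hidden width (m >> n)
X = rng.standard_normal((n, d))    # generic input data, as in the abstract
y = rng.standard_normal(n)

def unpack(theta):
    W = theta[:m * d].reshape(m, d)    # first-layer weights, shape (m, d)
    v = theta[m * d:]                  # second-layer weights, shape (m,)
    return W, v

def loss(theta):
    W, v = unpack(theta)
    return 0.5 * np.mean((np.tanh(X @ W.T) @ v - y) ** 2)

def grad(theta):
    # Analytic gradient of the squared loss for the tanh network above.
    W, v = unpack(theta)
    H = np.tanh(X @ W.T)               # hidden activations, shape (n, m)
    r = H @ v - y                      # residuals, shape (n,)
    gv = H.T @ r / n                   # d loss / d v
    gA = np.outer(r, v) * (1.0 - H ** 2) / n   # d loss / d pre-activations
    gW = gA.T @ X                      # d loss / d W
    return np.concatenate([gW.ravel(), gv])

results = []
for trial in range(5):
    theta = 0.5 * rng.standard_normal(m * d + m)
    for _ in range(5000):              # plain gradient descent
        theta -= 0.05 * grad(theta)
    results.append((loss(theta), np.linalg.norm(grad(theta))))

best = min(l for l, _ in results)
for l, g in results:
    # Tiny |grad| with a clear gap to the best loss marks a candidate
    # suboptimal stationary point.
    print(f"loss={l:.6f}  |grad|={g:.2e}  gap={l - best:.6f}")
```

A stalled run only suggests a candidate: distinguishing a genuine local minimum from a saddle or a flat plateau would additionally require checking Hessian eigenvalues at the end point, and the paper's result itself rests on an analytic construction rather than on numerical evidence of this kind.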

Suggested Citation

  • Tian Ding & Dawei Li & Ruoyu Sun, 2022. "Suboptimal Local Minima Exist for Wide Neural Networks with Smooth Activations," Mathematics of Operations Research, INFORMS, vol. 47(4), pages 2784-2814, November.
  • Handle: RePEc:inm:ormoor:v:47:y:2022:i:4:p:2784-2814
    DOI: 10.1287/moor.2021.1228

    Download full text from publisher

File URL: https://doi.org/10.1287/moor.2021.1228
    Download Restriction: no
