
A New Likelihood Ratio Method for Training Artificial Neural Networks

Author

Listed:
  • Yijie Peng

    (Department of Management Science and Information Systems, Guanghua School of Management, Peking University, Beijing 100871, China)

  • Li Xiao

    (Key Laboratory of Intelligent Information Processing, Advanced Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China)

  • Bernd Heidergott

    (Department of Operations Analytics, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, Netherlands)

  • L. Jeff Hong

    (Department of Management Science, School of Management, Fudan University, Shanghai 200433, China)

  • Henry Lam

    (Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027)

Abstract

We investigate a new approach to computing the gradients of artificial neural networks (ANNs), based on the so-called push-out likelihood ratio method. Unlike the widely used backpropagation (BP) method, which requires continuity of the loss function and the activation function, our approach bypasses this requirement by injecting artificial noise into the signals passed along the neurons. We show that this approach has computational complexity similar to that of BP, while offering the advantages of removing the backward recursion and yielding transparent formulas. We also formalize the connection between BP, a pivotal technique for training ANNs, and infinitesimal perturbation analysis, a classic path-wise derivative estimation approach, so that both our newly proposed method and BP can be better understood in the context of stochastic gradient estimation. Our approach allows efficient training of ANNs with more flexibility in the loss and activation functions, and shows empirical improvements in the robustness of ANNs under adversarial attacks and corruption by natural noise.

Summary of Contribution: Stochastic gradient estimation has been studied actively in simulation for decades and has become more important in the era of machine learning and artificial intelligence. Stochastic gradient descent is a standard technique for training artificial neural networks (ANNs), a pivotal problem in deep learning. The most popular stochastic gradient estimation technique is the backpropagation method. We show that the backpropagation method belongs to the family of infinitesimal perturbation analysis, a path-wise gradient estimation technique in simulation. Moreover, we develop a new likelihood ratio-based method, drawn from another popular family of gradient estimation techniques in simulation, for training more general ANNs, and demonstrate that the new training method can improve the robustness of the ANN.
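The core idea described in the abstract can be illustrated with a simplified sketch (our own construction for illustration, not the authors' exact algorithm): for a single neuron with a discontinuous step activation, a path-wise (backprop-style) derivative does not exist, but injecting Gaussian noise into the pre-activation and differentiating the noise density (the score function) still yields an unbiased gradient estimator. The function name `lr_gradient` and all parameter choices below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def lr_gradient(w, x, target, sigma, n_samples=200_000):
    """Score-function (likelihood ratio) estimate of the gradient of
    E[(step(w @ x + eps) - target)^2] with respect to w, where
    eps ~ N(0, sigma^2) is artificial noise injected into the
    pre-activation. The step activation is discontinuous, so a
    path-wise derivative fails, but the LR estimator only needs to
    differentiate the Gaussian density of the noisy pre-activation."""
    eps = rng.normal(0.0, sigma, size=n_samples)
    out = (w @ x + eps > 0).astype(float)   # discontinuous step activation
    loss = (out - target) ** 2
    score = eps / sigma**2                  # d/d(mean) log N(mean, sigma^2)
    return np.mean(loss * score) * x        # chain rule: mean = w @ x
```

For target 1 the smoothed loss in this toy model is Φ(−w·x/σ), so the exact gradient is −φ(w·x/σ)/σ · x (with φ the standard normal density), which the estimator approaches as the sample size grows; no derivative of the step function is ever taken.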

Suggested Citation

  • Yijie Peng & Li Xiao & Bernd Heidergott & L. Jeff Hong & Henry Lam, 2022. "A New Likelihood Ratio Method for Training Artificial Neural Networks," INFORMS Journal on Computing, INFORMS, vol. 34(1), pages 638-655, January.
  • Handle: RePEc:inm:orijoc:v:34:y:2022:i:1:p:638-655
    DOI: 10.1287/ijoc.2021.1088

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijoc.2021.1088
    Download Restriction: no

    File URL: https://libkey.io/10.1287/ijoc.2021.1088?utm_source=ideas
    LibKey link: If access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peter W. Glynn & Yijie Peng & Michael C. Fu & Jian-Qiang Hu, 2021. "Computing Sensitivities for Distortion Risk Measures," INFORMS Journal on Computing, INFORMS, vol. 33(4), pages 1520-1532, October.
    2. Bernd Heidergott & Warren Volk-Makarewicz, 2016. "A Measure-Valued Differentiation Approach to Sensitivities of Quantiles," Mathematics of Operations Research, INFORMS, vol. 41(1), pages 293-317, February.
    3. Yongqiang Wang & Michael C. Fu & Steven I. Marcus, 2012. "A New Stochastic Derivative Estimator for Discontinuous Payoff Functions with Application to Financial Derivatives," Operations Research, INFORMS, vol. 60(2), pages 447-460, April.
    4. Zhaolin Hu & Dali Zhang, 2018. "Utility‐based shortfall risk: Efficient computations via Monte Carlo," Naval Research Logistics (NRL), John Wiley & Sons, vol. 65(5), pages 378-392, August.
    5. Zhenyu Cui & Michael C. Fu & Jian-Qiang Hu & Yanchu Liu & Yijie Peng & Lingjiong Zhu, 2020. "On the Variance of Single-Run Unbiased Stochastic Derivative Estimators," INFORMS Journal on Computing, INFORMS, vol. 32(2), pages 390-407, April.
    6. Jiaqiao Hu & Yijie Peng & Gongbo Zhang & Qi Zhang, 2022. "A Stochastic Approximation Method for Simulation-Based Quantile Optimization," INFORMS Journal on Computing, INFORMS, vol. 34(6), pages 2889-2907, November.
    7. Yijie Peng & Michael C. Fu & Bernd Heidergott & Henry Lam, 2020. "Maximum Likelihood Estimation by Monte Carlo Simulation: Toward Data-Driven Stochastic Modeling," Operations Research, INFORMS, vol. 68(6), pages 1896-1912, November.
    8. L. Jeff Hong & Sandeep Juneja & Jun Luo, 2014. "Estimating Sensitivities of Portfolio Credit Risk Using Monte Carlo," INFORMS Journal on Computing, INFORMS, vol. 26(4), pages 848-865, November.
    9. Yijie Peng & Chun-Hung Chen & Michael C. Fu & Jian-Qiang Hu & Ilya O. Ryzhov, 2021. "Efficient Sampling Allocation Procedures for Optimal Quantile Selection," INFORMS Journal on Computing, INFORMS, vol. 33(1), pages 230-245, January.
    10. Makam, Vaishno Devi & Millossovich, Pietro & Tsanakas, Andreas, 2021. "Sensitivity analysis with χ2-divergences," Insurance: Mathematics and Economics, Elsevier, vol. 100(C), pages 372-383.
    11. Pesenti, Silvana M. & Tsanakas, Andreas & Millossovich, Pietro, 2018. "Euler allocations in the presence of non-linear reinsurance: Comment on Major (2018)," Insurance: Mathematics and Economics, Elsevier, vol. 83(C), pages 29-31.
    12. Silvana M. Pesenti & Pietro Millossovich & Andreas Tsanakas, 2023. "Differential Sensitivity in Discontinuous Models," Papers 2310.06151, arXiv.org.
    13. Xi Chen & Kyoung-Kuk Kim, 2016. "Efficient VaR and CVaR Measurement via Stochastic Kriging," INFORMS Journal on Computing, INFORMS, vol. 28(4), pages 629-644, November.
    14. Pesenti, Silvana M. & Millossovich, Pietro & Tsanakas, Andreas, 2019. "Reverse sensitivity testing: What does it take to break the model?," European Journal of Operational Research, Elsevier, vol. 274(2), pages 654-670.
    15. Katja Schilling & Daniel Bauer & Marcus C. Christiansen & Alexander Kling, 2020. "Decomposing Dynamic Risks into Risk Components," Management Science, INFORMS, vol. 66(12), pages 5738-5756, December.
    16. J P C Kleijnen & W C M van Beers, 2013. "Monotonicity-preserving bootstrapped Kriging metamodels for expensive simulations," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 64(5), pages 708-717, May.
    17. Weihuan Huang, 2023. "Estimating Systemic Risk within Financial Networks: A Two-Step Nonparametric Method," Papers 2310.18658, arXiv.org.
    18. M. Merz & R. Richman & T. Tsanakas & M. V. Wuthrich, 2021. "Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles," Papers 2103.11706, arXiv.org.
    19. Guangxin Jiang & Michael C. Fu, 2015. "Technical Note—On Estimating Quantile Sensitivities via Infinitesimal Perturbation Analysis," Operations Research, INFORMS, vol. 63(2), pages 435-441, April.
    20. Weihuan Huang & Nifei Lin & L. Jeff Hong, 2022. "Monte-Carlo Estimation of CoVaR," Papers 2210.06148, arXiv.org.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.