Smart “Predict, then Optimize”

Author

Listed:
  • Adam N. Elmachtoub

    (Department of Industrial Engineering and Operations Research and Data Science Institute, Columbia University, New York, New York 10027)

  • Paul Grigas

    (Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720)

Abstract

Many real-world analytics problems involve two significant challenges: prediction and optimization. Because of the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart “Predict, then Optimize” (SPO), which directly leverages the optimization problem structure—that is, its objective and constraints—for designing better prediction models. A key component of our framework is the SPO loss function, which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and, thus, we derive, using duality theory, a convex surrogate loss function, which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest-path and portfolio-optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular, when the prediction model being trained is misspecified. We find that linear models trained using SPO+ loss tend to dominate random-forest algorithms, even when the ground truth is highly nonlinear.
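
The abstract names the SPO and SPO+ losses without stating them. As a minimal sketch, assume the usual formulation for a linear objective: for a true cost vector c, a predicted cost vector ĉ, a feasible set S, an optimal decision w*(·), and optimal value z*(c), the SPO loss is c·w*(ĉ) − z*(c), and the SPO+ loss is max over w in S of (c − 2ĉ)·w, plus 2ĉ·w*(c), minus z*(c). The feasible set below is a hypothetical toy example given as an explicit list of decisions, and the function names are illustrative only. Ties in the argmin are broken by taking the first row, which is a simplifying assumption.

```python
import numpy as np

# Hypothetical toy feasible set: each row is one feasible decision w
# (for instance, incidence vectors of three candidate paths over four edges).
FEASIBLE = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])


def argmin_decision(cost, feasible=FEASIBLE):
    """Return a feasible decision minimizing cost @ w (ties: first row)."""
    return feasible[np.argmin(feasible @ cost)]


def spo_loss(pred_cost, true_cost, feasible=FEASIBLE):
    """Decision error: true cost of the decision induced by the prediction,
    minus the best achievable true cost."""
    w_pred = argmin_decision(pred_cost, feasible)
    z_star = np.min(feasible @ true_cost)
    return float(true_cost @ w_pred - z_star)


def spo_plus_loss(pred_cost, true_cost, feasible=FEASIBLE):
    """Convex surrogate: max_w (c - 2*c_hat) @ w + 2*c_hat @ w*(c) - z*(c)."""
    w_true = argmin_decision(true_cost, feasible)
    z_star = float(true_cost @ w_true)
    max_term = np.max(feasible @ (true_cost - 2.0 * pred_cost))
    return float(max_term + 2.0 * (pred_cost @ w_true) - z_star)


true_cost = np.array([1.0, 2.0, 3.0, 4.0])
pred_cost = np.array([3.0, 3.0, 1.0, 1.0])  # a deliberately poor prediction

print(spo_loss(pred_cost, true_cost))       # 4.0
print(spo_plus_loss(pred_cost, true_cost))  # 12.0
```

In this toy instance the prediction selects the second decision (true cost 7) instead of the first (true cost 3), so the SPO loss is 4, and the surrogate value of 12 sits above it. Both losses are zero when the prediction equals the true cost vector.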

Suggested Citation

  • Adam N. Elmachtoub & Paul Grigas, 2022. "Smart “Predict, then Optimize”," Management Science, INFORMS, vol. 68(1), pages 9-26, January.
  • Handle: RePEc:inm:ormnsc:v:68:y:2022:i:1:p:9-26
    DOI: 10.1287/mnsc.2020.3922

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/mnsc.2020.3922
    Download Restriction: no

    Citations

    Cited by:

    1. Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
    2. Notz, Pascal M. & Pibernik, Richard, 2024. "Explainable subgradient tree boosting for prescriptive analytics in operations management," European Journal of Operational Research, Elsevier, vol. 312(3), pages 1119-1133.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shaochong Lin & Youhua (Frank) Chen & Yanzhi Li & Zuo‐Jun Max Shen, 2022. "Data‐Driven Newsvendor Problems Regularized by a Profit Risk Constraint," Production and Operations Management, Production and Operations Management Society, vol. 31(4), pages 1630-1644, April.
    2. Nam Ho-Nguyen & Fatma Kılınç-Karzan, 2022. "Risk Guarantees for End-to-End Prediction and Optimization Processes," Management Science, INFORMS, vol. 68(12), pages 8680-8698, December.
    3. Erkip, Nesim Kohen, 2023. "Can accessing much data reshape the theory? Inventory theory under the challenge of data-driven systems," European Journal of Operational Research, Elsevier, vol. 308(3), pages 949-959.
    4. Tinglong Dai & Sridhar Tayur, 2020. "OM Forum—Healthcare Operations Management: A Snapshot of Emerging Research," Manufacturing & Service Operations Management, INFORMS, vol. 22(5), pages 869-887, September.
    5. Christian Mandl & Selvaprabu Nadarajah & Stefan Minner & Srinagesh Gavirneni, 2022. "Data‐driven storage operations: Cross‐commodity backtest and structured policies," Production and Operations Management, Production and Operations Management Society, vol. 31(6), pages 2438-2456, June.
    6. Yinchu Zhu & Ilya O. Ryzhov, 2022. "Optimal data-driven hiring with equity for underrepresented groups," Papers 2206.09300, arXiv.org.
    7. Bernardo K. Pagnoncelli & Domingo Ramírez & Hamed Rahimian & Arturo Cifuentes, 2023. "A Synthetic Data-Plus-Features Driven Approach for Portfolio Optimization," Computational Economics, Springer;Society for Computational Economics, vol. 62(1), pages 187-204, June.
    8. Qi Feng & J. George Shanthikumar, 2022. "Developing operations management data analytics," Production and Operations Management, Production and Operations Management Society, vol. 31(12), pages 4544-4557, December.
    9. Viet Anh Nguyen & Fan Zhang & Shanshan Wang & Jose Blanchet & Erick Delage & Yinyu Ye, 2021. "Robustifying Conditional Portfolio Decisions via Optimal Transport," Papers 2103.16451, arXiv.org, revised Apr 2024.
    10. Meng Qi & Ying Cao & Zuo-Jun (Max) Shen, 2022. "Distributionally Robust Conditional Quantile Prediction with Fixed Design," Management Science, INFORMS, vol. 68(3), pages 1639-1658, March.
    11. José-Manuel Peña & Fernando Suárez & Omar Larré & Domingo Ramírez & Arturo Cifuentes, 2023. "A Modified CTGAN-Plus-Features Based Method for Optimal Asset Allocation," Papers 2302.02269, arXiv.org, revised Feb 2023.
    12. Dmitry B. Rokhlin, 2021. "Relative utility bounds for empirically optimal portfolios," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 93(3), pages 437-462, June.
    13. Yang, Cheng-Hu & Wang, Hai-Tang & Ma, Xin & Talluri, Srinivas, 2023. "A data-driven newsvendor problem: A high-dimensional and mixed-frequency method," International Journal of Production Economics, Elsevier, vol. 266(C).
    14. Andrew Butler & Roy H. Kwon, 2021. "Integrating prediction in mean-variance portfolio optimization," Papers 2102.09287, arXiv.org, revised Nov 2022.
    15. Turgay Ayer & Can Zhang & Anthony Bonifonte & Anne C. Spaulding & Jagpreet Chhatwal, 2019. "Prioritizing Hepatitis C Treatment in U.S. Prisons," Operations Research, INFORMS, vol. 67(3), pages 853-873, May.
    16. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    17. Liu, Congzheng & Letchford, Adam N. & Svetunkov, Ivan, 2022. "Newsvendor problems: An integrated method for estimation and optimisation," European Journal of Operational Research, Elsevier, vol. 300(2), pages 590-601.
    18. Corredera, Alberto & Ruiz, Carlos, 2023. "Prescriptive selection of machine learning hyperparameters with applications in power markets: Retailer’s optimal trading," European Journal of Operational Research, Elsevier, vol. 306(1), pages 370-388.
    19. Max Biggs & Rim Hariss & Georgia Perakis, 2023. "Constrained optimization of objective functions determined from random forests," Production and Operations Management, Production and Operations Management Society, vol. 32(2), pages 397-415, February.
    20. Shuaian Wang & Xuecheng Tian, 2023. "A Deficiency of the Predict-Then-Optimize Framework: Decreased Decision Quality with Increased Data Size," Mathematics, MDPI, vol. 11(15), pages 1-9, July.
