
HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models

Author

Listed:
  • Luan Carlos de Sena Monteiro Ozelim
  • Dimas Betioli Ribeiro
  • José Antonio Schiavon
  • Vinicius Resende Domingues
  • Paulo Ivo Braga de Queiroz

Abstract

Surrogate models are frequently used to replace costly engineering simulations. A single surrogate is usually chosen based on previous experience, or by fitting multiple surrogates and selecting one based on mean cross-validation errors. This paper presents a novel stacking strategy that results from reinterpreting the model selection process in terms of the generalization error. For the first time, the problem is translated into a well-studied financial problem: portfolio management and optimization. In short, it is demonstrated that the individual residuals calculated by leave-one-out procedures are samples from a random variable ϵi, whose second non-central moment is the i-th model’s generalization error. Thus, a stacking methodology based solely on evaluating the behavior of linear combinations of the random variables ϵi is proposed. First, several surrogate models are calibrated. The Directed Bubble Hierarchical Tree (DBHT) clustering algorithm is then used to determine which models are worth stacking. The stacking weights can be calculated using any financial approach to the portfolio optimization problem. This alternative understanding of the problem enables practitioners to use established financial methodologies to calculate the models’ weights, significantly improving the out-of-sample performance of the ensemble. A case study is carried out to demonstrate the applicability of the new methodology. In total, 124 models were trained on a specific dataset: 40 Machine Learning models and 84 Polynomial Chaos Expansion models (combining 3 types of base random variables, 7 least-squares algorithms for fitting the expansion coefficients, and expansion orders up to four). Among those, 99 models could be fitted without convergence or other numerical issues.
The DBHT algorithm with Pearson correlation distance and generalization error similarity selected a subgroup of 23 models from the 99 fitted ones, a reduction of about 77% in the total number of models, which represents a good filtering scheme that still preserves diversity. Finally, it is demonstrated that the weights obtained by building a Hierarchical Risk Parity (HRP) portfolio perform better for various input random variables, indicating better out-of-sample performance. In this way, an economic stacking strategy has demonstrated its worth in improving the out-of-sample capabilities of stacked models, which illustrates how the new understanding of model stacking methodologies may be useful.
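
The core idea in the abstract can be illustrated with a minimal Python sketch, under several assumptions that are not taken from the paper: the data set is synthetic, the "surrogates" are three toy polynomial fits, and the weights come from the simplest closed-form portfolio (the analogue of a global minimum-variance portfolio, with the second non-central moment matrix in place of the covariance matrix) rather than from DBHT filtering or Hierarchical Risk Parity. All names (`loo_residuals`, `E`, `M`, `w`) are hypothetical. Because the weights sum to one, the ensemble residual is the same weighted combination of the individual residuals, so its second moment is `w @ M @ w`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data set (illustrative only, not the paper's data).
n = 40
x = np.linspace(-1.0, 1.0, n)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(n)

# Three toy surrogates: polynomials of different degree (an assumption).
degrees = [1, 3, 5]


def loo_residuals(x, y, degree):
    """Leave-one-out residuals y_k - f_{-k}(x_k) for a polynomial fit."""
    res = np.empty(len(x))
    for k in range(len(x)):
        mask = np.arange(len(x)) != k          # drop the k-th sample
        coeffs = np.polyfit(x[mask], y[mask], degree)
        res[k] = y[k] - np.polyval(coeffs, x[k])
    return res


# Column i holds the leave-one-out samples of the random variable eps_i.
E = np.column_stack([loo_residuals(x, y, d) for d in degrees])

# Second non-central moment matrix of the residuals; its diagonal is the
# leave-one-out estimate of each model's generalization error.
M = E.T @ E / n
individual_errors = np.diag(M)

# Closed-form minimiser of w^T M w subject to sum(w) = 1.
# Weights may be negative; this is not the HRP scheme used in the paper.
ones = np.ones(len(degrees))
w = np.linalg.solve(M, ones)
w /= w.sum()

# Estimated second moment of the stacked model's residual.
ensemble_error = w @ M @ w

print("individual errors:", individual_errors)
print("weights:", w)
print("ensemble error:", ensemble_error)
```

Since each single model corresponds to a feasible weight vector, the stacked estimate can never exceed the best individual estimate on the same residuals. That estimate is optimistic, however, because the weights are tuned on the residuals used to evaluate them, which is one reason more robust allocation schemes may be preferable in practice.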

Suggested Citation

  • Luan Carlos de Sena Monteiro Ozelim & Dimas Betioli Ribeiro & José Antonio Schiavon & Vinicius Resende Domingues & Paulo Ivo Braga de Queiroz, 2023. "HPOSS: A hierarchical portfolio optimization stacking strategy to reduce the generalization error of ensembles of models," PLOS ONE, Public Library of Science, vol. 18(8), pages 1-43, August.
  • Handle: RePEc:plo:pone00:0290331
    DOI: 10.1371/journal.pone.0290331

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0290331
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0290331&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zanini, Fabio C. & Irwin, Scott H. & Schnitkey, Gary D. & Sherrick, Bruce J., 2000. "Estimating Farm-Level Yield Distributions For Corn And Soybeans In Illinois," 2000 Annual meeting, July 30-August 2, Tampa, FL 21720, American Agricultural Economics Association (New Name 2008: Agricultural and Applied Economics Association).
    2. Taoufik Bouezmarni & Mohamed Doukali & Abderrahim Taamouti, 2024. "Testing Granger non-causality in expectiles," Econometric Reviews, Taylor & Francis Journals, vol. 43(1), pages 30-51, January.
    3. Graham Elliott & Ivana Komunjer & Allan Timmermann, 2008. "Biases in Macroeconomic Forecasts: Irrationality or Asymmetric Loss?," Journal of the European Economic Association, MIT Press, vol. 6(1), pages 122-157, March.
    4. Luca Benati & Paolo Surico, 2008. "Evolving U.S. Monetary Policy and The Decline of Inflation Predictability," Journal of the European Economic Association, MIT Press, vol. 6(2-3), pages 634-646, 04-05.
    5. Sanders, Dwight R. & Manfredo, Mark R., 2006. "Forecasting Basis Levels in the Soybean Complex: A Comparison of Time Series Methods," Journal of Agricultural and Applied Economics, Cambridge University Press, vol. 38(3), pages 513-523, December.
    6. Song, Zhi & Mukherjee, Amitava & Zhang, Jiujun, 2021. "Some robust approaches based on copula for monitoring bivariate processes and component-wise assessment," European Journal of Operational Research, Elsevier, vol. 289(1), pages 177-196.
    7. Gómez-Puig, Marta & Sosvilla-Rivero, Simón, 2014. "Causality and contagion in EMU sovereign debt markets," International Review of Economics & Finance, Elsevier, vol. 33(C), pages 12-27.
    8. Erie Febrian & Aldrin Herwany, 2009. "Volatility Forecasting Models and Market Co-Integration: A Study on South-East Asian Markets," Working Papers in Economics and Development Studies (WoPEDS) 200911, Department of Economics, Padjadjaran University, revised Sep 2009.
    9. David Murrell & Weiqiu Yu, 2000. "The Effect of the Harmonized Sales Tax on Consumer Prices in Atlantic Canada," Canadian Public Policy, University of Toronto Press, vol. 26(4), pages 451-460, December.
    10. Thomas Dohmen & Hartmut F. Lehmann & Mark E. Schaffer, 2014. "Wage Policies of a Russian Firm and the Financial Crisis of 1998: Evidence from Personnel Data, 1997 to 2002," ILR Review, Cornell University, ILR School, vol. 67(2), pages 504-531, April.
    11. Jun Ma & Mark E. Wohar, 2013. "An Unobserved Components Model that Yields Business and Medium-Run Cycles," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 45(7), pages 1351-1373, October.
    12. Pami Dua & Anirvan Banerji, 2011. "Predicting Recessions and Slowdowns: A Robust Approach," Working Papers id:4391, eSocialSciences.
    13. Pär Österholm, 2005. "The Taylor Rule: A Spurious Regression?," Bulletin of Economic Research, Wiley Blackwell, vol. 57(3), pages 217-247, July.
    14. Fritsche, Ulrich & Pierdzioch, Christian & Rülke, Jan-Christoph & Stadtmann, Georg, 2015. "Forecasting the Brazilian real and the Mexican peso: Asymmetric loss, forecast rationality, and forecaster herding," International Journal of Forecasting, Elsevier, vol. 31(1), pages 130-139.
    15. Torstein Bye & Alexandra Katz, 1995. "Returns to Publicly Owned Transport Infrastructure Investment . A Cost Function/Cost Share Approach for Norway, 1971-1991," Discussion Papers 154, Statistics Norway, Research Department.
    16. Corradi, Valentina & Swanson, Norman R., 2004. "Some recent developments in predictive accuracy testing with nested models and (generic) nonlinear alternatives," International Journal of Forecasting, Elsevier, vol. 20(2), pages 185-199.
    17. Korbinian Dress & Stefan Lessmann & Hans-Jorg von Mettenheim, 2017. "Residual Value Forecasting Using Asymmetric Cost Functions," Papers 1707.02736, arXiv.org.
    18. Luong, Phat V. & Xu, Xiaowei, 2020. "Pass-through of commodity price shocks in distribution channels with risk-averse agents," International Journal of Production Economics, Elsevier, vol. 226(C).
    19. Lahiri, Kajal & Yang, Liu, 2013. "Forecasting Binary Outcomes," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 1025-1106, Elsevier.
    20. Guillén, Osmani Teixeira de Carvalho & Issler, João Victor & Franco-Neto, Afonso Arinos de Mello, 2014. "On the welfare costs of business-cycle fluctuations and economic-growth variation in the 20th century and beyond," Journal of Economic Dynamics and Control, Elsevier, vol. 39(C), pages 62-78.
