IDEAS home Printed from https://ideas.repec.org/a/eee/jomega/v104y2021ics0305048321000888.html

Big data and portfolio optimization: A novel approach integrating DEA with multiple data sources

Authors

Listed:
  • Zhou, Zhongbao
  • Gao, Meng
  • Xiao, Helu
  • Wang, Rui
  • Liu, Wenbin

Abstract

The existing literature suggests that the out-of-sample performance of traditional mean-variance portfolio strategies is not robust and is even inferior to that of the equal weight strategy. To address this problem, this paper first clarifies that a complete investment process consists of two parts: stock selection and investment weight formulation. We then design a stock selection scheme that integrates Data Envelopment Analysis (DEA) with multiple data sources, including historical stock trading data, technical indicators, social media data and news data, to assess the investment value of stocks in terms of historical return, asset correlation and investor sentiment performance. In addition, we use a Support Vector Machine (SVM) with the multi-source stock data to predict stock price movements, and we combine these predictions with the proposed stock selection scheme to construct the portfolio optimization model. We also carry out an out-of-sample test of the proposed stock selection scheme and investment strategies, using the constituents of the CSI 300 index as the test sample. The empirical results show that the proposed stock selection scheme effectively improves the out-of-sample performance of all investment strategies. Moreover, the proposed investment strategy has better out-of-sample performance than the traditional global minimum variance strategy, the tangency portfolio strategy, and the equal weight strategy. Finally, we test the robustness of these findings using an additional dataset.
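
For readers unfamiliar with the three benchmark strategies named in the abstract, the following is a minimal Python sketch of how their weights are commonly computed in textbook form. It is not the authors' implementation: the function names, the expected returns `mu`, the covariance matrix `cov`, and the risk-free rate `rf` are all hypothetical, and short sales are assumed to be allowed. The tangency normalization shown here also assumes that the unnormalized weights sum to a positive number.

```python
import numpy as np

def equal_weight(n):
    """1/N strategy: every asset receives the same weight."""
    return np.full(n, 1.0 / n)

def global_min_variance(cov):
    """Weights minimizing w' cov w subject to sum(w) = 1 (short sales allowed)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)      # proportional to cov^{-1} 1
    return raw / raw.sum()

def tangency(mu, cov, rf=0.0):
    """Weights maximizing the Sharpe ratio subject to sum(w) = 1 (short sales allowed)."""
    raw = np.linalg.solve(cov, mu - rf)   # proportional to cov^{-1} (mu - rf)
    return raw / raw.sum()

# Hypothetical inputs for three assets (illustrative numbers, not data from the paper).
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([[0.040, 0.006, 0.004],
                [0.006, 0.090, 0.010],
                [0.004, 0.010, 0.160]])
rf = 0.02

w_ew = equal_weight(len(mu))
w_gmv = global_min_variance(cov)
w_tan = tangency(mu, cov, rf)

for name, w in [("equal weight", w_ew), ("global min variance", w_gmv), ("tangency", w_tan)]:
    variance = w @ cov @ w
    sharpe = (w @ mu - rf) / np.sqrt(variance)
    print(f"{name:>20}: weights={np.round(w, 3)}, variance={variance:.4f}, Sharpe={sharpe:.3f}")
```

These are in-sample quantities computed from the inputs themselves. The abstract's point concerns out-of-sample behavior, where `mu` and `cov` must be estimated from past data and the estimation error can erase the in-sample advantage of the optimized portfolios over equal weighting.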

Suggested Citation

  • Zhou, Zhongbao & Gao, Meng & Xiao, Helu & Wang, Rui & Liu, Wenbin, 2021. "Big data and portfolio optimization: A novel approach integrating DEA with multiple data sources," Omega, Elsevier, vol. 104(C).
  • Handle: RePEc:eee:jomega:v:104:y:2021:i:c:s0305048321000888
    DOI: 10.1016/j.omega.2021.102479

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0305048321000888
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Arvanitis, Stelios & Scaillet, Olivier & Topaloglou, Nikolas, 2020. "Spanning tests for Markowitz stochastic dominance," Journal of Econometrics, Elsevier, vol. 217(2), pages 291-311.
    2. Liu, Wenbin & Zhou, Zhongbao & Liu, Debin & Xiao, Helu, 2015. "Estimation of portfolio efficiency via DEA," Omega, Elsevier, vol. 52(C), pages 107-118.
    3. Victor DeMiguel & Lorenzo Garlappi & Raman Uppal, 2009. "Optimal Versus Naive Diversification: How Inefficient is the 1-N Portfolio Strategy?," The Review of Financial Studies, Society for Financial Studies, vol. 22(5), pages 1915-1953, May.
    4. Branda, Martin, 2015. "Diversification-consistent data envelopment analysis based on directional-distance measures," Omega, Elsevier, vol. 52(C), pages 65-76.
    5. Ravi Jagannathan & Tongshu Ma, 2003. "Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps," Journal of Finance, American Finance Association, vol. 58(4), pages 1651-1683, August.
    6. Lin, Ruiyue & Li, Zongxin, 2020. "Directional distance based diversification super-efficiency DEA models for mutual funds," Omega, Elsevier, vol. 97(C).
    7. Lim, Sungmook & Oh, Kwang Wuk & Zhu, Joe, 2014. "Use of DEA cross-efficiency evaluation in portfolio selection: An application to Korean stock market," European Journal of Operational Research, Elsevier, vol. 236(1), pages 361-368.
    8. Dhanya Jothimani & Ravi Shankar & Surendra S. Yadav, 2018. "A Big data analytical framework for portfolio optimization," Papers 1811.07188, arXiv.org, revised Nov 2018.
    9. Phelim Boyle & Lorenzo Garlappi & Raman Uppal & Tan Wang, 2012. "Keynes Meets Markowitz: The Trade-Off Between Familiarity and Diversification," Management Science, INFORMS, vol. 58(2), pages 253-272, February.
    10. Annaert, Jan & Osselaer, Sofieke Van & Verstraete, Bert, 2009. "Performance evaluation of portfolio insurance strategies using stochastic dominance criteria," Journal of Banking & Finance, Elsevier, vol. 33(2), pages 272-280, February.
    11. Renault, Thomas, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Journal of Banking & Finance, Elsevier, vol. 84(C), pages 25-40.
    12. Scaillet, Olivier & Topaloglou, Nikolas, 2010. "Testing for Stochastic Dominance Efficiency," Journal of Business & Economic Statistics, American Statistical Association, vol. 28(1), pages 169-180.
    13. Raman Uppal & Tan Wang, 2003. "Model Misspecification and Underdiversification," Journal of Finance, American Finance Association, vol. 58(6), pages 2465-2486, December.
    14. Michela Nardo & Marco Petracco-Giudici & Minás Naltsidis, 2016. "Walking Down Wall Street With A Tablet: A Survey Of Stock Market Predictions Using The Web," Journal of Economic Surveys, Wiley Blackwell, vol. 30(2), pages 356-369, April.
    15. Thomas Renault, 2017. "Intraday online investor sentiment and return patterns in the U.S. stock market," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03205113, HAL.
    16. Basak, Suryoday & Kar, Saibal & Saha, Snehanshu & Khaidem, Luckyson & Dey, Sudeepa Roy, 2019. "Predicting the direction of stock market prices using tree-based classifiers," The North American Journal of Economics and Finance, Elsevier, vol. 47(C), pages 552-567.
    17. Xi Zhang & Yunjia Zhang & Senzhang Wang & Yuntao Yao & Binxing Fang & Philip S. Yu, 2018. "Improving Stock Market Prediction via Heterogeneous Information Fusion," Papers 1801.00588, arXiv.org.
    18. Guidolin, Massimo & Liu, Hening, 2016. "Ambiguity Aversion and Underdiversification," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 51(4), pages 1297-1323, August.
    19. Joe Zhu, 2014. "DEA Cross Efficiency," International Series in Operations Research & Management Science, in: Quantitative Models for Performance Evaluation and Benchmarking, edition 3, chapter 4, pages 61-92, Springer.
    20. Tihana Skrinjaric, 2014. "Investment Strategy on the Zagreb Stock Exchange Based on Dynamic DEA," Croatian Economic Survey, The Institute of Economics, Zagreb, vol. 16(1), pages 129-160, April.
    21. Kerstens, Kristiaan & Mounir, Amine & de Woestyne, Ignace Van, 2011. "Non-parametric frontier estimates of mutual fund performance using C- and L-moments: Some specification tests," Journal of Banking & Finance, Elsevier, vol. 35(5), pages 1190-1201, May.
    22. Harry Markowitz, 1952. "Portfolio Selection," Journal of Finance, American Finance Association, vol. 7(1), pages 77-91, March.
    23. Alexander Kempf & Christoph Memmel, 2006. "Estimating the global Minimum Variance Portfolio," Schmalenbach Business Review (sbr), LMU Munich School of Management, vol. 58(4), pages 332-348, October.
    24. Kai-Hua Wang & Chi-Wei Su & Ran Tao & Hsu-Ling Chang, 2019. "Does the Efficient Market Hypothesis Fit Military Enterprises in China?," Defence and Peace Economics, Taylor & Francis Journals, vol. 30(7), pages 877-889, November.
    25. Edirisinghe, N.C.P. & Zhang, X., 2007. "Generalized DEA model of fundamental analysis and its application to portfolio optimization," Journal of Banking & Finance, Elsevier, vol. 31(11), pages 3311-3335, November.
    26. Charles, Amélie & Darné, Olivier, 2009. "The random walk hypothesis for Chinese stock markets: Evidence from variance ratio tests," Economic Systems, Elsevier, vol. 33(2), pages 117-126, June.
    27. Wei, Yu-Chen & Lu, Yang-Cheng & Chen, Jen-Nan & Hsu, Yen-Ju, 2017. "Informativeness of the market news sentiment in the Taiwan stock market," The North American Journal of Economics and Finance, Elsevier, vol. 39(C), pages 158-181.
    28. Ledoit, Olivier & Wolf, Michael, 2003. "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection," Journal of Empirical Finance, Elsevier, vol. 10(5), pages 603-621, December.
    29. Ravi Jagannathan & Tongshu Ma, 2003. "Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps," Journal of Finance, American Finance Association, vol. 58(4), pages 1651-1684, August.
    30. Lamb, John D. & Tee, Kai-Hong, 2012. "Data envelopment analysis models of investment funds," European Journal of Operational Research, Elsevier, vol. 216(3), pages 687-696.
    31. Stelios Arvanitis & Nikolas Topalogou, 2017. "Testing for Prospect and Markowitz stochastic dominance efficiency," Working Papers 201701, Athens University Of Economics and Business, Department of Economics.
    32. Tu, Jun & Zhou, Guofu, 2011. "Markowitz meets Talmud: A combination of sophisticated and naive diversification strategies," Journal of Financial Economics, Elsevier, vol. 99(1), pages 204-215, January.
    33. Siganos, Antonios & Vagenas-Nanos, Evangelos & Verwijmeren, Patrick, 2017. "Divergence of sentiment and stock market trading," Journal of Banking & Finance, Elsevier, vol. 78(C), pages 130-141.
    34. Bjarne Florentsen & Ulf Nielsson & Peter Raahauge & Jesper Rangvid, 2019. "The aggregate cost of equity underdiversification," The Financial Review, Eastern Finance Association, vol. 54(4), pages 833-856, November.
    35. Zhou, Zhongbao & Gao, Meng & Liu, Qing & Xiao, Helu, 2020. "Forecasting stock price movements with multiple data sources: Evidence from stock market in China," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 542(C).
    36. Zhou, Zhongbao & Xiao, Helu & Jin, Qianying & Liu, Wenbin, 2018. "DEA frontier improvement and portfolio rebalancing: An application of China mutual funds on considering sustainability information disclosure," European Journal of Operational Research, Elsevier, vol. 269(1), pages 111-131.
    37. Helu Xiao & Tiantian Ren & Teng Ren, 2020. "Estimation of fuzzy portfolio efficiency via an improved DEA approach," Post-Print hal-03281789, HAL.
    38. Oleg Malafeyev & Achal Awasthi & Kaustubh S. Kambekar, 2017. "Random walks and market efficiency in Chinese and Indian equity markets," Papers 1709.04059, arXiv.org.
    39. Jorion, Philippe, 1991. "Bayesian and CAPM estimators of the means: Implications for portfolio selection," Journal of Banking & Finance, Elsevier, vol. 15(3), pages 717-727, June.
    40. Zhou, Zhongbao & Jin, Qianying & Xiao, Helu & Wu, Qian & Liu, Wenbin, 2018. "Estimation of cardinality constrained portfolio efficiency via segmented DEA," Omega, Elsevier, vol. 76(C), pages 28-37.
    41. Choi, Hyung-Suk & Min, Daiki, 2017. "Efficiency of well-diversified portfolios: Evidence from data envelopment analysis," Omega, Elsevier, vol. 73(C), pages 104-113.
    42. Topaloglou, Nikolas & Tsionas, Mike G., 2020. "Stochastic dominance tests," Journal of Economic Dynamics and Control, Elsevier, vol. 112(C).
    43. Liu, Hong, 2014. "Solvency Constraint, Underdiversification, and Idiosyncratic Risks," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 49(2), pages 409-430, April.
    44. Arvanitis, Stelios & Topaloglou, Nikolas, 2017. "Testing for prospect and Markowitz stochastic dominance efficiency," Journal of Econometrics, Elsevier, vol. 198(2), pages 253-270.
    45. Prasad Sankar Bhattacharya & Dimitrios D. Thomakos, 2018. "Robust model rankings of forecasting performance," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 37(6), pages 676-690, September.

    Citations


    Cited by:

    1. Abdelouahed Hamdi & Arezou Karimi & Farshid Mehrdoust & Samir Brahim Belhaouari, 2022. "Portfolio Selection Problem Using CVaR Risk Measures Equipped with DEA, PSO, and ICA Algorithms," Mathematics, MDPI, vol. 10(15), pages 1-26, August.
    2. Ma, Yilin & Wang, Yudong & Wang, Weizhong & Zhang, Chong, 2023. "Portfolios with return and volatility prediction for the energy stock market," Energy, Elsevier, vol. 270(C).
    3. Chen, Wei & Zhang, Haoyu & Jia, Lifen, 2022. "A novel two-stage method for well-diversified portfolio construction based on stock return prediction using machine learning," The North American Journal of Economics and Finance, Elsevier, vol. 63(C).
    4. Pejman Peykani & Mojtaba Nouri & Mir Saman Pishvaee & Camelia Oprean-Stan & Emran Mohammadi, 2023. "Credibilistic Multi-Period Mean-Entropy Rolling Portfolio Optimization Problem Based on Multi-Stage Scenario Tree," Mathematics, MDPI, vol. 11(18), pages 1-23, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiao, Helu & Zhou, Zhongbao & Ren, Teng & Liu, Wenbin, 2022. "Estimation of portfolio efficiency in nonconvex settings: A free disposal hull estimator with non-increasing returns to scale," Omega, Elsevier, vol. 111(C).
    2. Kerstens, Kristiaan & Mazza, Paolo & Ren, Tiantian & Van de Woestyne, Ignace, 2022. "Multi-Time and Multi-Moment Nonparametric Frontier-Based Fund Rating: Proposal and Buy-and-Hold Backtesting Strategy," Omega, Elsevier, vol. 113(C).
    3. Chavez-Bedoya, Luis & Rosales, Francisco, 2021. "Reduction of estimation risk in optimal portfolio choice using redundant constraints," International Review of Financial Analysis, Elsevier, vol. 78(C).
    4. Chavez-Bedoya, Luis & Rosales, Francisco, 2022. "Orthogonal portfolios to assess estimation risk," International Review of Economics & Finance, Elsevier, vol. 80(C), pages 906-937.
    5. Xiao, Helu & Ren, Tiantian & Zhou, Zhongbao & Liu, Wenbin, 2021. "Parameter uncertainty in estimation of portfolio efficiency: Evidence from an interval diversification-consistent DEA approach," Omega, Elsevier, vol. 103(C).
    6. Zeng, Ximei & Zhou, Zhongbao & Gong, Yeming & Liu, Wenbin, 2022. "A data envelopment analysis model integrated with portfolio theory for energy mix adjustment: Evidence in the power industry," Socio-Economic Planning Sciences, Elsevier, vol. 83(C).
    7. Hautsch, Nikolaus & Voigt, Stefan, 2019. "Large-scale portfolio allocation under transaction costs and model uncertainty," Journal of Econometrics, Elsevier, vol. 212(1), pages 221-240.
    8. Pinar, Mehmet & Stengos, Thanasis & Topaloglou, Nikolas, 2020. "On the construction of a feasible range of multidimensional poverty under benchmark weight uncertainty," European Journal of Operational Research, Elsevier, vol. 281(2), pages 415-427.
    9. Hsu, Po-Hsuan & Han, Qiheng & Wu, Wensheng & Cao, Zhiguang, 2018. "Asset allocation strategies, data snooping, and the 1 / N rule," Journal of Banking & Finance, Elsevier, vol. 97(C), pages 257-269.
    10. Kourtis, Apostolos & Dotsis, George & Markellos, Raphael N., 2012. "Parameter uncertainty in portfolio selection: Shrinking the inverse covariance matrix," Journal of Banking & Finance, Elsevier, vol. 36(9), pages 2522-2531.
    11. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    12. Maillet, Bertrand & Tokpavi, Sessi & Vaucher, Benoit, 2015. "Global minimum variance portfolio optimisation under some model risk: A robust regression-based approach," European Journal of Operational Research, Elsevier, vol. 244(1), pages 289-299.
    13. Füss, Roland & Miebs, Felix & Trübenbach, Fabian, 2014. "A jackknife-type estimator for portfolio revision," Journal of Banking & Finance, Elsevier, vol. 43(C), pages 14-28.
    14. Paolella, Marc S. & Polak, Paweł & Walker, Patrick S., 2021. "A non-elliptical orthogonal GARCH model for portfolio selection under transaction costs," Journal of Banking & Finance, Elsevier, vol. 125(C).
    15. Istvan Varga-Haszonits & Fabio Caccioli & Imre Kondor, 2016. "Replica approach to mean-variance portfolio optimization," Papers 1606.08679, arXiv.org.
    16. Simaan, Majeed & Simaan, Yusif & Tang, Yi, 2018. "Estimation error in mean returns and the mean-variance efficient frontier," International Review of Economics & Finance, Elsevier, vol. 56(C), pages 109-124.
    17. Víctor Adame-García & Fernando Fernández-Rodríguez & Simón Sosvilla-Rivero, 2017. "Resolution of optimization problems and construction of efficient portfolios: An application to the Euro Stoxx 50 index," IREA Working Papers 201702, University of Barcelona, Research Institute of Applied Economics, revised Feb 2017.
    18. Jonathan Fletcher, 2009. "Risk Reduction and Mean‐Variance Analysis: An Empirical Investigation," Journal of Business Finance & Accounting, Wiley Blackwell, vol. 36(7‐8), pages 951-971, September.
    19. Kolm, Petter N. & Tütüncü, Reha & Fabozzi, Frank J., 2014. "60 Years of portfolio optimization: Practical challenges and current trends," European Journal of Operational Research, Elsevier, vol. 234(2), pages 356-371.
    20. Lan, Wei & Wang, Hansheng & Tsai, Chih-Ling, 2012. "A Bayesian information criterion for portfolio selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(1), pages 88-99, January.
