Printed from https://ideas.repec.org/a/eee/jomega/v96y2020ics0305048318312131.html

Feature Selection in Data Envelopment Analysis: A Mathematical Optimization approach

Author

Listed:
  • Benítez-Peña, Sandra
  • Bogetoft, Peter
  • Romero Morales, Dolores

Abstract

This paper proposes an integrative approach to feature (input and output) selection in Data Envelopment Analysis (DEA). The DEA model is enriched with zero-one decision variables modelling the selection of features, yielding a Mixed Integer Linear Programming formulation. This single-model approach can handle different objective functions as well as constraints to incorporate desirable properties from the real-world application. Our approach is illustrated on the benchmarking of electricity Distribution System Operators (DSOs). The numerical results highlight the advantages that our single-model approach provides to the user in terms of choosing the number of features, as well as modelling their costs and their nature.
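
As a rough illustration of what such a model can look like, the sketch below attaches zero-one selection variables to a standard multiplier-form DEA model with constant returns to scale. It is a minimal sketch under stated assumptions, not necessarily the formulation used in the paper. All notation is assumed for illustration: n units indexed by j and k, m candidate inputs x_{ij}, s candidate outputs y_{rj}, unit-specific weights v_i^k and u_r^k, selection variables z_i^I and z_r^O shared by all units, a target number of features p, a big-M constant M, and average efficiency as the objective.

```latex
\begin{align*}
\max_{u,\,v,\,z} \quad & \frac{1}{n} \sum_{k=1}^{n} \sum_{r=1}^{s} u_r^{k}\, y_{rk} \\
\text{s.t.} \quad
 & \sum_{i=1}^{m} v_i^{k}\, x_{ik} = 1, && k = 1,\dots,n, \\
 & \sum_{r=1}^{s} u_r^{k}\, y_{rj} - \sum_{i=1}^{m} v_i^{k}\, x_{ij} \le 0, && j,k = 1,\dots,n, \\
 & 0 \le v_i^{k} \le M z_i^{I}, && i = 1,\dots,m,\; k = 1,\dots,n, \\
 & 0 \le u_r^{k} \le M z_r^{O}, && r = 1,\dots,s,\; k = 1,\dots,n, \\
 & \sum_{i=1}^{m} z_i^{I} + \sum_{r=1}^{s} z_r^{O} = p, \\
 & z_i^{I},\, z_r^{O} \in \{0,1\}, && i = 1,\dots,m,\; r = 1,\dots,s.
\end{align*}
```

In this sketch a feature whose selection variable is zero is forced to have zero weight for every unit, so it drops out of all efficiency scores. If the data are strictly positive, the normalization and ratio constraints imply v_i^k <= 1/x_{ik} and u_r^k <= 1/y_{rk}, so M can be derived from the data rather than guessed. Other objectives or side constraints on z (for example, a budget on feature costs) would be added in the same way.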

Suggested Citation

  • Benítez-Peña, Sandra & Bogetoft, Peter & Romero Morales, Dolores, 2020. "Feature Selection in Data Envelopment Analysis: A Mathematical Optimization approach," Omega, Elsevier, vol. 96(C).
  • Handle: RePEc:eee:jomega:v:96:y:2020:i:c:s0305048318312131
    DOI: 10.1016/j.omega.2019.05.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0305048318312131
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Inmaculada Sirvent & José L. Ruiz & Fernando Borrás & Jesús T. Pastor, 2005. "A Monte Carlo Evaluation Of Several Tests For The Selection Of Variables In DEA Models," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 4(03), pages 325-343.
    2. Emrouznejad, Ali & Yang, Guo-liang, 2018. "A survey and analysis of the first 40 years of scholarly literature in DEA: 1978–2016," Socio-Economic Planning Sciences, Elsevier, vol. 61(C), pages 4-8.
    3. Adler, Nicole & Yazhemsky, Ekaterina, 2010. "Improving discrimination in data envelopment analysis: PCA-DEA or variable reduction," European Journal of Operational Research, Elsevier, vol. 202(1), pages 273-284, April.
    4. Charnes, A. & Cooper, W. W. & Rhodes, E., 1978. "Measuring the efficiency of decision making units," European Journal of Operational Research, Elsevier, vol. 2(6), pages 429-444, November.
    5. Jesús T. Pastor & José L. Ruiz & Inmaculada Sirvent, 2002. "A Statistical Test for Nested Radial DEA Models," Operations Research, INFORMS, vol. 50(4), pages 728-735, August.
    6. Stefano Benati, 2015. "Using medians in portfolio optimization," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 66(5), pages 720-731, May.
    7. R. Allen & A. Athanassopoulos & R.G. Dyson & E. Thanassoulis, 1997. "Weights restrictions and value judgements in Data Envelopment Analysis: Evolution, development and future directions," Annals of Operations Research, Springer, vol. 73(0), pages 13-34, October.
    8. Nataraja, Niranjan R. & Johnson, Andrew L., 2011. "Guidelines for using variable selection techniques in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 215(3), pages 662-669, December.
    9. Joe Zhu, 2014. "Data Envelopment Analysis," International Series in Operations Research & Management Science, in: Quantitative Models for Performance Evaluation and Benchmarking, edition 3, chapter 1, pages 1-9, Springer.
    10. Dimitris Bertsimas & Angela King, 2016. "OR Forum—An Algorithmic Approach to Linear Regression," Operations Research, INFORMS, vol. 64(1), pages 2-16, February.
    11. Wagner, Janet M. & Shimshak, Daniel G., 2007. "Stepwise selection of variables in data envelopment analysis: Procedures and managerial perspectives," European Journal of Operational Research, Elsevier, vol. 180(1), pages 57-67, July.
    12. Cook, Wade D. & Tone, Kaoru & Zhu, Joe, 2014. "Data envelopment analysis: Prior to choosing a model," Omega, Elsevier, vol. 44(C), pages 1-4.
    13. Tarja Joro & Pekka J. Korhonen, 2015. "Extension of Data Envelopment Analysis with Preference Information," International Series in Operations Research and Management Science, Springer, edition 127, number 978-1-4899-7528-7, September.
    14. Podinovski, Victor V., 2016. "Optimal weights in DEA models with weight restrictions," European Journal of Operational Research, Elsevier, vol. 254(3), pages 916-924.
    15. Agrell, Per J. & Bogetoft, Peter, 2017. "Regulatory Benchmarking: Models, Analyses and Applications," Data Envelopment Analysis Journal, now publishers, vol. 3(1-2), pages 49-91, November.
    16. Golany, B & Roll, Y, 1989. "An application procedure for DEA," Omega, Elsevier, vol. 17(3), pages 237-250.
    17. Ramón, Nuria & Ruiz, José L. & Sirvent, Inmaculada, 2010. "On the choice of weights profiles in cross-efficiency evaluations," European Journal of Operational Research, Elsevier, vol. 207(3), pages 1564-1572, December.
    18. Tarja Joro & Pekka J. Korhonen, 2015. "Data Envelopment Analysis," International Series in Operations Research & Management Science, in: Extension of Data Envelopment Analysis with Preference Information, edition 127, chapter 0, pages 15-26, Springer.
    19. Green, Rodney H. & Doyle, John R. & Cook, Wade D., 1996. "Preference voting and project ranking using DEA and cross-evaluation," European Journal of Operational Research, Elsevier, vol. 90(3), pages 461-472, May.
    20. Yongjun Li & Xiao Shi & Min Yang & Liang Liang, 2017. "Variable selection in data envelopment analysis via Akaike’s information criteria," Annals of Operations Research, Springer, vol. 253(1), pages 453-476, June.
    21. Ruiz, José L. & Sirvent, Inmaculada, 2016. "Common benchmarking and ranking of units with DEA," Omega, Elsevier, vol. 65(C), pages 1-9.
    22. Cook, Wade D. & Ramón, Nuria & Ruiz, José L. & Sirvent, Inmaculada & Zhu, Joe, 2019. "DEA-based benchmarking for performance evaluation in pay-for-performance incentive plans," Omega, Elsevier, vol. 84(C), pages 45-54.

    Citations



    Cited by:

    1. Labbé, Martine & Landete, Mercedes & Leal, Marina, 2023. "Dendrograms, minimum spanning trees and feature selection," European Journal of Operational Research, Elsevier, vol. 308(2), pages 555-567.
    2. Carrizosa, Emilio & Kurishchenko, Kseniia & Marín, Alfredo & Romero Morales, Dolores, 2022. "Interpreting clusters via prototype optimization," Omega, Elsevier, vol. 107(C).
    3. He Jiang, 2023. "Robust forecasting in spatial autoregressive model with total variation regularization," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(2), pages 195-211, March.
    4. Raul Moragues & Juan Aparicio & Miriam Esteve, 2023. "Ranking the Importance of Variables in a Nonparametric Frontier Analysis Using Unsupervised Machine Learning Techniques," Mathematics, MDPI, vol. 11(11), pages 1-24, June.
    5. Imad Bou-Hamad & Abdel Latef Anouze & Ibrahim H. Osman, 2022. "A cognitive analytics management framework to select input and output variables for data envelopment analysis modeling of performance efficiency of banks using random forest and entropy of information," Annals of Operations Research, Springer, vol. 308(1), pages 63-92, January.
    6. Dai, Sheng, 2023. "Variable selection in convex quantile regression: L1-norm or L0-norm regularization?," European Journal of Operational Research, Elsevier, vol. 305(1), pages 338-355.
    7. Georgios Tsaples & Jason Papathanasiou & Andreas C. Georgiou, 2022. "An Exploratory DEA and Machine Learning Framework for the Evaluation and Analysis of Sustainability Composite Indicators in the EU," Mathematics, MDPI, vol. 10(13), pages 1-27, June.
    8. Anderson Hoose & Víctor Yepes & Moacir Kripka, 2021. "Selection of Production Mix in the Agricultural Machinery Industry Considering Sustainability in Decision Making," Sustainability, MDPI, vol. 13(16), pages 1-14, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peyrache, Antonio & Rose, Christiern & Sicilia, Gabriela, 2020. "Variable selection in Data Envelopment Analysis," European Journal of Operational Research, Elsevier, vol. 282(2), pages 644-659.
    2. Imad Bou-Hamad & Abdel Latef Anouze & Ibrahim H. Osman, 2022. "A cognitive analytics management framework to select input and output variables for data envelopment analysis modeling of performance efficiency of banks using random forest and entropy of information," Annals of Operations Research, Springer, vol. 308(1), pages 63-92, January.
    3. Afsharian, Mohsen & Ahn, Heinz & Harms, Sören Guntram, 2021. "A review of DEA approaches applying a common set of weights: The perspective of centralized management," European Journal of Operational Research, Elsevier, vol. 294(1), pages 3-15.
    4. Villanueva-Cantillo, Jeyms & Munoz-Marquez, Manuel, 2021. "Methodology for calculating critical values of relevance measures in variable selection methods in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 290(2), pages 657-670.
    5. Charles, Vincent & Aparicio, Juan & Zhu, Joe, 2019. "The curse of dimensionality of decision-making units: A simple approach to increase the discriminatory power of data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 279(3), pages 929-940.
    6. Toloo, Mehdi & Hančlová, Jana, 2020. "Multi-valued measures in DEA in the presence of undesirable outputs," Omega, Elsevier, vol. 94(C).
    7. Esteve, Miriam & Aparicio, Juan & Rodriguez-Sala, Jesus J. & Zhu, Joe, 2023. "Random Forests and the measurement of super-efficiency in the context of Free Disposal Hull," European Journal of Operational Research, Elsevier, vol. 304(2), pages 729-744.
    8. Jamal Ouenniche & Skarleth Carrales, 2018. "Assessing efficiency profiles of UK commercial banks: a DEA analysis with regression-based feedback," Annals of Operations Research, Springer, vol. 266(1), pages 551-587, July.
    9. Kottas, Angelos T. & Madas, Michael A., 2018. "Comparative efficiency analysis of major international airlines using Data Envelopment Analysis: Exploring effects of alliance membership and other operational efficiency determinants," Journal of Air Transport Management, Elsevier, vol. 70(C), pages 1-17.
    10. Nataraja, Niranjan R. & Johnson, Andrew L., 2011. "Guidelines for using variable selection techniques in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 215(3), pages 662-669, December.
    11. Liu, John S. & Lu, Louis Y.Y. & Lu, Wen-Min, 2016. "Research fronts in data envelopment analysis," Omega, Elsevier, vol. 58(C), pages 33-45.
    12. Toloo, Mehdi & Keshavarz, Esmaeil & Hatami-Marbini, Adel, 2021. "Selecting data envelopment analysis models: A data-driven application to EU countries," Omega, Elsevier, vol. 101(C).
    13. Pereira, Miguel Alves & Camanho, Ana Santos & Figueira, José Rui & Marques, Rui Cunha, 2021. "Incorporating preference information in a range directional composite indicator: The case of Portuguese public hospitals," European Journal of Operational Research, Elsevier, vol. 294(2), pages 633-650.
    14. Dyckhoff, Harald & Souren, Rainer, 2022. "Integrating multiple criteria decision analysis and production theory for performance evaluation: Framework and review," European Journal of Operational Research, Elsevier, vol. 297(3), pages 795-816.
    15. Harald Dyckhoff, 2018. "Multi-criteria production theory: foundation of non-financial and sustainability performance evaluation," Journal of Business Economics, Springer, vol. 88(7), pages 851-882, September.
    16. Victor V. Podinovski & Tatiana Bouzdine-Chameeva, 2021. "Optimal solutions of multiplier DEA models," Journal of Productivity Analysis, Springer, vol. 56(1), pages 45-68, August.
    17. Adler, Nicole & Yazhemsky, Ekaterina, 2010. "Improving discrimination in data envelopment analysis: PCA-DEA or variable reduction," European Journal of Operational Research, Elsevier, vol. 202(1), pages 273-284, April.
    18. Anna Łozowicka & Bartłomiej Lach, 2022. "CI-DEA: A Way to Improve the Discriminatory Power of DEA—Using the Example of the Efficiency Assessment of the Digitalization in the Life of the Generation 50+," Sustainability, MDPI, vol. 14(6), pages 1-22, March.
    19. Raul Moragues & Juan Aparicio & Miriam Esteve, 2023. "Ranking the Importance of Variables in a Nonparametric Frontier Analysis Using Unsupervised Machine Learning Techniques," Mathematics, MDPI, vol. 11(11), pages 1-24, June.
    20. Valentin Zelenyuk, 2019. "Data Envelopment Analysis and Business Analytics: The Big Data Challenges and Some Solutions," CEPA Working Papers Series WP072019, School of Economics, University of Queensland, Australia.
