
Guidelines for using variable selection techniques in data envelopment analysis

Authors

  • Nataraja, Niranjan R.
  • Johnson, Andrew L.

Abstract

Model misspecification has significant impacts on data envelopment analysis (DEA) efficiency estimates. This paper discusses the four most widely used approaches to guide variable specification in DEA. We analyze the efficiency contribution measure (ECM), principal component analysis (PCA-DEA), a regression-based test, and bootstrapping for variable selection via Monte Carlo simulations to determine each approach's advantages and disadvantages. For a three-input, one-output production process, we find that: PCA-DEA performs well with highly correlated inputs (greater than 0.8), even for small data sets (fewer than 300 observations); both the regression and ECM approaches perform well under low correlation (less than 0.2) and relatively larger data sets (at least 300 observations); and bootstrapping performs relatively poorly. Bootstrapping requires hours of computational time, whereas the other three methods require minutes. Based on the results, we offer guidelines for effectively choosing among the four selection methods.
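
The thresholds reported in the abstract can be read as a rough decision rule. The following Python sketch is illustrative only and is not taken from the paper: the function names (`pearson`, `max_abs_input_correlation`, `suggest_method`) are hypothetical, summarizing input correlation by the largest absolute pairwise Pearson correlation is an assumption, and the abstract gives no guidance for correlations between 0.2 and 0.8 or for low-correlation data sets with fewer than 300 observations, so the sketch returns no recommendation in those cases.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def max_abs_input_correlation(inputs):
    """Largest absolute pairwise correlation among the input columns.

    Assumption: the abstract does not say how input correlation is
    summarized; the maximum absolute pairwise value is used here.
    """
    return max(abs(pearson(a, b)) for a, b in combinations(inputs, 2))


def suggest_method(max_corr, n_obs):
    """Map the abstract's reported thresholds to a suggested method."""
    # Highly correlated inputs: PCA-DEA, reported to work even for small samples.
    if max_corr > 0.8:
        return "PCA-DEA"
    # Low correlation and at least 300 observations: regression test or ECM.
    if max_corr < 0.2 and n_obs >= 300:
        return "regression-based test or ECM"
    # Everything else is not covered by the abstract.
    return "no clear recommendation from the abstract"
```

For example, `suggest_method(0.9, 50)` returns `"PCA-DEA"`, while `suggest_method(0.5, 500)` returns the no-recommendation message. Bootstrapping is never suggested, since the abstract reports that it performs relatively poorly and takes hours rather than minutes.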

Suggested Citation

  • Nataraja, Niranjan R. & Johnson, Andrew L., 2011. "Guidelines for using variable selection techniques in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 215(3), pages 662-669, December.
  • Handle: RePEc:eee:ejores:v:215:y:2011:i:3:p:662-669

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221711006011
    Download Restriction: Full text for ScienceDirect subscribers only

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jamal Ouenniche & Skarleth Carrales, 2018. "Assessing efficiency profiles of UK commercial banks: a DEA analysis with regression-based feedback," Annals of Operations Research, Springer, vol. 266(1), pages 551-587, July.
    2. Toloo, Mehdi & Tone, Kaoru & Izadikhah, Mohammad, 2023. "Selecting slacks-based data envelopment analysis models," European Journal of Operational Research, Elsevier, vol. 308(3), pages 1302-1318.
    3. Villanueva-Cantillo, Jeyms & Munoz-Marquez, Manuel, 2021. "Methodology for calculating critical values of relevance measures in variable selection methods in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 290(2), pages 657-670.
    4. Peyrache, Antonio & Rose, Christiern & Sicilia, Gabriela, 2020. "Variable selection in Data Envelopment Analysis," European Journal of Operational Research, Elsevier, vol. 282(2), pages 644-659.
    5. Eskelinen, Juha, 2017. "Comparison of variable selection techniques for data envelopment analysis in a retail bank," European Journal of Operational Research, Elsevier, vol. 259(2), pages 778-788.
    6. Charles, Vincent & Aparicio, Juan & Zhu, Joe, 2019. "The curse of dimensionality of decision-making units: A simple approach to increase the discriminatory power of data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 279(3), pages 929-940.
    7. Anna Łozowicka & Bartłomiej Lach, 2022. "CI-DEA: A Way to Improve the Discriminatory Power of DEA—Using the Example of the Efficiency Assessment of the Digitalization in the Life of the Generation 50+," Sustainability, MDPI, vol. 14(6), pages 1-22, March.
    8. Raul Moragues & Juan Aparicio & Miriam Esteve, 2023. "Ranking the Importance of Variables in a Nonparametric Frontier Analysis Using Unsupervised Machine Learning Techniques," Mathematics, MDPI, vol. 11(11), pages 1-24, June.
    9. Imad Bou-Hamad & Abdel Latef Anouze & Ibrahim H. Osman, 2022. "A cognitive analytics management framework to select input and output variables for data envelopment analysis modeling of performance efficiency of banks using random forest and entropy of information," Annals of Operations Research, Springer, vol. 308(1), pages 63-92, January.
    10. Qiwei Xie & Yuanyuan Li & Lizheng Wang & Chao Liu, 2018. "Improving discrimination in data envelopment analysis without losing information based on Renyi’s entropy," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 26(4), pages 1053-1068, December.
    11. Esteve, Miriam & Aparicio, Juan & Rodriguez-Sala, Jesus J. & Zhu, Joe, 2023. "Random Forests and the measurement of super-efficiency in the context of Free Disposal Hull," European Journal of Operational Research, Elsevier, vol. 304(2), pages 729-744.
    12. Yongjun Li & Xiao Shi & Min Yang & Liang Liang, 2017. "Variable selection in data envelopment analysis via Akaike’s information criteria," Annals of Operations Research, Springer, vol. 253(1), pages 453-476, June.
    13. Adler, Nicole & Yazhemsky, Ekaterina, 2010. "Improving discrimination in data envelopment analysis: PCA-DEA or variable reduction," European Journal of Operational Research, Elsevier, vol. 202(1), pages 273-284, April.
    14. Kyuseok Lee & Kyuwan Choi, 2010. "Cross redundancy and sensitivity in DEA models," Journal of Productivity Analysis, Springer, vol. 34(2), pages 151-165, October.
    15. Toloo, Mehdi & Keshavarz, Esmaeil & Hatami-Marbini, Adel, 2021. "Selecting data envelopment analysis models: A data-driven application to EU countries," Omega, Elsevier, vol. 101(C).
    16. Valentin Zelenyuk, 2019. "Data Envelopment Analysis and Business Analytics: The Big Data Challenges and Some Solutions," CEPA Working Papers Series WP072019, School of Economics, University of Queensland, Australia.
    17. Massimo Finocchiaro Castro & Calogero Guccio, 2014. "Searching for the source of technical inefficiency in Italian judicial districts: an empirical investigation," European Journal of Law and Economics, Springer, vol. 38(3), pages 369-391, December.
    18. Lee, Chia-Yen & Cai, Jia-Ying, 2020. "LASSO variable selection in data envelopment analysis with small datasets," Omega, Elsevier, vol. 91(C).
    19. Benítez-Peña, Sandra & Bogetoft, Peter & Romero Morales, Dolores, 2020. "Feature Selection in Data Envelopment Analysis: A Mathematical Optimization approach," Omega, Elsevier, vol. 96(C).
    20. Liu, John S. & Lu, Louis Y.Y. & Lu, Wen-Min, 2016. "Research fronts in data envelopment analysis," Omega, Elsevier, vol. 58(C), pages 33-45.
