IDEAS home Printed from https://ideas.repec.org/a/eee/ejores/v274y2019i3p1047-1054.html

Data envelopment analysis and big data

Author

Listed:
  • Khezrimotlagh, Dariush
  • Zhu, Joe
  • Cook, Wade D.
  • Toloo, Mehdi

Abstract

In the traditional data envelopment analysis (DEA) approach for a set of n Decision Making Units (DMUs), a standard DEA model is solved n times, once for each DMU. As the number of DMUs increases, the running time needed to solve the standard model rises sharply. In this study, a new framework is proposed that significantly decreases the required DEA calculation time, in comparison with existing methodologies, when a large set of DMUs (e.g., 20,000 DMUs or more) is present. The framework includes five steps: (i) selecting a subsample of DMUs using a proposed algorithm, (ii) finding the best-practice DMUs in the selected subsample, (iii) finding the DMUs exterior to the hull of the selected subsample, (iv) identifying the set of all efficient DMUs, and (v) measuring the performance scores of all DMUs so that they match those arising from the traditional DEA approach. The variable returns to scale technology is assumed, and several simulation experiments are designed to estimate the running time of the proposed method for big data. The results indicate that the running time is decreased by up to 99.9% in comparison with existing techniques. In addition, we illustrate the computation time required by the proposed method as a function of the number of DMUs (cardinality), the number of inputs and outputs (dimension), and the proportion of efficient DMUs (density). The methods are also compared on a real data set consisting of 30,099 electric power plants in the United States from 1996 to 2016.
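
The five steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the authors' implementation: it handles only one input and one output, it assumes an input-oriented radial score (the abstract states only that variable returns to scale is assumed), and it replaces the paper's subsample-selection algorithm with a plain random draw. All function names are hypothetical. In the one-input, one-output case the optimum of the underlying linear program is attained by a single peer or a mix of two peers, so a brute-force search over pairs stands in for an LP solver.

```python
import random

def vrs_input_score(x_o, y_o, ref):
    """Input-oriented VRS score of (x_o, y_o) against the reference DMUs `ref`.

    Exact only for one input and one output (illustrative assumption).
    Returns float('inf') when no reference point reaches output level y_o.
    """
    best = float("inf")
    for xj, yj in ref:
        if yj >= y_o:                      # a single peer producing at least y_o
            best = min(best, xj)
    for xj, yj in ref:
        for xk, yk in ref:
            if yj > y_o > yk:              # convex mix of two peers hitting y_o exactly
                lam = (y_o - yk) / (yj - yk)
                best = min(best, lam * xj + (1 - lam) * xk)
    return best / x_o

def traditional_scores(dmus):
    """Traditional approach: evaluate every DMU against the full set."""
    return [vrs_input_score(x, y, dmus) for x, y in dmus]

def framework_scores(dmus, sample_size, seed=0, tol=1e-9):
    """Sketch of steps (i)-(v); returns (scores, efficient DMUs)."""
    rng = random.Random(seed)
    idx = list(range(len(dmus)))
    # (i) subsample: a random draw, standing in for the paper's algorithm
    sample = set(rng.sample(idx, sample_size))
    sub = [dmus[i] for i in sample]
    # (ii) best-practice DMUs within the subsample
    best = [d for d in sub if vrs_input_score(d[0], d[1], sub) >= 1 - tol]
    # (iii) DMUs outside the subsample that are not inside its hull
    exterior = [dmus[i] for i in idx
                if i not in sample
                and vrs_input_score(dmus[i][0], dmus[i][1], best) >= 1 - tol]
    # (iv) efficient DMUs of the full set, found among the candidates
    cand = best + exterior
    efficient = [d for d in cand if vrs_input_score(d[0], d[1], cand) >= 1 - tol]
    # (v) score every DMU against the efficient set only
    return [vrs_input_score(x, y, efficient) for x, y in dmus], efficient

# Hypothetical data: (input, output) pairs
dmus = [(2, 1), (3, 3), (6, 5), (4, 2), (5, 3), (8, 5), (7, 4), (3, 1)]
scores, efficient = framework_scores(dmus, sample_size=3)
```

In this sketch the saving comes from step (v): each DMU is compared with the efficient set rather than with all n DMUs. When few DMUs are efficient, the reference set is much smaller than in the traditional approach.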

Suggested Citation

  • Khezrimotlagh, Dariush & Zhu, Joe & Cook, Wade D. & Toloo, Mehdi, 2019. "Data envelopment analysis and big data," European Journal of Operational Research, Elsevier, vol. 274(3), pages 1047-1054.
  • Handle: RePEc:eee:ejores:v:274:y:2019:i:3:p:1047-1054
    DOI: 10.1016/j.ejor.2018.10.044

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221718309123
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Richard Barr & Matthew Durchholz, 1997. "Parallel and hierarchical decomposition approaches for solving large-scale Data Envelopment Analysis models," Annals of Operations Research, Springer, vol. 73(0), pages 339-372, October.
    2. Wen-Chih Chen & Sheng-Yung Lai, 2017. "Determining radial efficiency with a large data set by solving small-size linear programs," Annals of Operations Research, Springer, vol. 250(1), pages 147-166, March.
    3. Dula, J. H. & Helgason, R. V., 1996. "A new procedure for identifying the frame of the convex hull of a finite collection of points in multidimensional space," European Journal of Operational Research, Elsevier, vol. 92(2), pages 352-367, July.
    4. J.H. Dulá & R.M. Thrall, 2001. "A Computational Framework for Accelerating DEA," Journal of Productivity Analysis, Springer, vol. 16(1), pages 63-78, July.
    5. J. H. Dulá & R. V. Helgason & N. Venugopal, 1998. "An Algorithm for Identifying the Frame of a Pointed Finite Conical Hull," INFORMS Journal on Computing, INFORMS, vol. 10(3), pages 323-330, August.
    6. R. D. Banker & A. Charnes & W. W. Cooper, 1984. "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis," Management Science, INFORMS, vol. 30(9), pages 1078-1092, September.
    7. Ali, Agha Iqbal, 1993. "Streamlined computation for data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 64(1), pages 61-67, January.
    8. Seiford, Lawrence M. & Zhu, Joe, 2005. "A response to comments on modeling undesirable factors in efficiency evaluation," European Journal of Operational Research, Elsevier, vol. 161(2), pages 579-581, March.
    9. William W. Cooper & Lawrence M. Seiford & Kaoru Tone, 2007. "Data Envelopment Analysis," Springer Books, Springer, edition 0, number 978-0-387-45283-8, November.

    Citations



    Cited by:

    1. Tao Jie, 2020. "Parallel processing of the Build Hull algorithm to address the large-scale DEA problem," Annals of Operations Research, Springer, vol. 295(1), pages 453-481, December.
    2. Khezrimotlagh, Dariush, 2022. "Simulation designs for production frontiers," European Journal of Operational Research, Elsevier, vol. 303(3), pages 1321-1334.
    3. Borrás, Fernando & Ruiz, José L. & Sirvent, Inmaculada, 2023. "Peer evaluation through cross-efficiency based on reference sets," Omega, Elsevier, vol. 114(C).
    4. Kaffash, Sepideh & Nguyen, An Truong & Zhu, Joe, 2021. "Big data algorithms and applications in intelligent transportation system: A review and bibliometric analysis," International Journal of Production Economics, Elsevier, vol. 231(C).
    5. Ricardo F. Díaz & Blanca Sanchez-Robles, 2020. "Non-Parametric Analysis of Efficiency: An Application to the Pharmaceutical Industry," Mathematics, MDPI, vol. 8(9), pages 1-27, September.
    6. Tavana, Madjid & Izadikhah, Mohammad & Toloo, Mehdi & Roostaee, Razieh, 2021. "A new non-radial directional distance model for data envelopment analysis problems with negative and flexible measures," Omega, Elsevier, vol. 102(C).
    7. Osman, Ibrahim H. & Zablith, Fouad, 2021. "Re-evaluating electronic government development index to monitor the transformation toward achieving sustainable development goals," Journal of Business Research, Elsevier, vol. 131(C), pages 426-440.
    8. Carayannis, Elias G. & Grigoroudis, Evangelos & Wurth, Bernd, 2022. "OR for entrepreneurial ecosystems: A problem-oriented review and agenda," European Journal of Operational Research, Elsevier, vol. 300(3), pages 791-808.
    9. Ya Chen & Mike Tsionas & Valentin Zelenyuk, 2020. "LASSO DEA for small and big data," CEPA Working Papers Series WP022020, School of Economics, University of Queensland, Australia.
    10. Tone, Kaoru & Toloo, Mehdi & Izadikhah, Mohammad, 2020. "A modified slacks-based measure of efficiency in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 287(2), pages 560-571.
    11. Dai, Qianzhi & Li, Yongjun & Lei, Xiyang & Wu, Dengsheng, 2021. "A DEA-based incentive approach for allocating common revenues or fixed costs," European Journal of Operational Research, Elsevier, vol. 292(2), pages 675-686.
    12. Bingqing Li & Zhanqi Wang & Feng Xu, 2022. "Does Optimization of Industrial Structure Improve Green Efficiency of Industrial Land Use in China?," IJERPH, MDPI, vol. 19(15), pages 1-18, July.
    13. Valentin Zelenyuk, 2019. "Data Envelopment Analysis and Business Analytics: The Big Data Challenges and Some Solutions," CEPA Working Papers Series WP072019, School of Economics, University of Queensland, Australia.
    14. Chen, Ya & Tsionas, Mike G. & Zelenyuk, Valentin, 2021. "LASSO+DEA for small and big wide data," Omega, Elsevier, vol. 102(C).
    15. Orji, Ifeyinwa Juliet & Kusi-Sarpong, Simonov & Huang, Shuangfa & Vazquez-Brust, Diego, 2020. "Evaluating the factors that influence blockchain adoption in the freight logistics industry," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 141(C).
    16. Zelenyuk, Valentin, 2020. "Aggregation of inputs and outputs prior to Data Envelopment Analysis under big data," European Journal of Operational Research, Elsevier, vol. 282(1), pages 172-187.
    17. Khezrimotlagh, Dariush & Cook, Wade D. & Zhu, Joe, 2020. "A nonparametric framework to detect outliers in estimating production frontiers," European Journal of Operational Research, Elsevier, vol. 286(1), pages 375-388.
    18. Joe Zhu, 2022. "DEA under big data: data enabled analytics and network data envelopment analysis," Annals of Operations Research, Springer, vol. 309(2), pages 761-783, February.
    19. Valero-Carreras, Daniel & Aparicio, Juan & Guerrero, Nadia M., 2021. "Support vector frontiers: A new approach for estimating production functions through support vector machines," Omega, Elsevier, vol. 104(C).
    20. An, Qingxian & Tao, Xiangyang & Xiong, Beibei & Chen, Xiaohong, 2022. "Frontier-based incentive mechanisms for allocating common revenues or fixed costs," European Journal of Operational Research, Elsevier, vol. 302(1), pages 294-308.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tao Jie, 2020. "Parallel processing of the Build Hull algorithm to address the large-scale DEA problem," Annals of Operations Research, Springer, vol. 295(1), pages 453-481, December.
    2. Wen-Chih Chen & Sheng-Yung Lai, 2017. "Determining radial efficiency with a large data set by solving small-size linear programs," Annals of Operations Research, Springer, vol. 250(1), pages 147-166, March.
    3. J. H. Dulá, 2011. "An Algorithm for Data Envelopment Analysis," INFORMS Journal on Computing, INFORMS, vol. 23(2), pages 284-296, May.
    4. Zelenyuk, Valentin, 2020. "Aggregation of inputs and outputs prior to Data Envelopment Analysis under big data," European Journal of Operational Research, Elsevier, vol. 282(1), pages 172-187.
    5. Valentin Zelenyuk, 2019. "Data Envelopment Analysis and Business Analytics: The Big Data Challenges and Some Solutions," CEPA Working Papers Series WP072019, School of Economics, University of Queensland, Australia.
    6. Alexander P. Afanasiev & Vladimir E. Krivonozhko & Andrey V. Lychev & Oleg V. Sukhoroslov, 2020. "Multidimensional frontier visualization based on optimization methods using parallel computations," Journal of Global Optimization, Springer, vol. 76(3), pages 563-574, March.
    7. López, Francisco J., 2011. "Generalizing cross redundancy in data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 214(3), pages 716-721, November.
    8. Dulá, J.H. & López, F.J., 2013. "DEA with streaming data," Omega, Elsevier, vol. 41(1), pages 41-47.
    9. Goodness C. Aye & Giray Gozgor & Rangan Gupta, 2020. "Dynamic and Asymmetric Response of Inequality to Income Volatility: The Case of the United Kingdom," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 147(3), pages 747-762, February.
    10. J.H. Dulá & R.M. Thrall, 2001. "A Computational Framework for Accelerating DEA," Journal of Productivity Analysis, Springer, vol. 16(1), pages 63-78, July.
    11. Peter Fernandes Wanke & Rebecca de Mattos, 2014. "Capacity Issues and Efficiency Drivers in Brazilian Bulk Terminals," Brazilian Business Review, Fucape Business School, vol. 11(5), pages 72-98, October.
    12. Chiu, Yung-Ho & Lee, Jen-Hui & Lu, Ching-Cheng & Shyu, Ming-Kuang & Luo, Zhengying, 2012. "The technology gap and efficiency measure in WEC countries: Application of the hybrid meta frontier model," Energy Policy, Elsevier, vol. 51(C), pages 349-357.
    13. Khezrimotlagh, Dariush & Cook, Wade D. & Zhu, Joe, 2020. "A nonparametric framework to detect outliers in estimating production frontiers," European Journal of Operational Research, Elsevier, vol. 286(1), pages 375-388.
    14. Chien-Ming Chen, 2014. "Evaluating eco-efficiency with data envelopment analysis: an analytical reexamination," Annals of Operations Research, Springer, vol. 214(1), pages 49-71, March.
    15. Franz R. Hahn, 2007. "Determinants of Bank Efficiency in Europe. Assessing Bank Performance Across Markets," WIFO Studies, WIFO, number 31499, February.
    16. Matthias Klumpp & Dominic Loske, 2021. "Sustainability and Resilience Revisited: Impact of Information Technology Disruptions on Empirical Retail Logistics Efficiency," Sustainability, MDPI, vol. 13(10), pages 1-20, May.
    17. Atkinson, Scott E. & Tsionas, Mike G., 2021. "Generalized estimation of productivity with multiple bad outputs: The importance of materials balance constraints," European Journal of Operational Research, Elsevier, vol. 292(3), pages 1165-1186.
    18. Mehdiloozad, Mahmood & Zhu, Joe & Sahoo, Biresh K., 2018. "Identification of congestion in data envelopment analysis under the occurrence of multiple projections: A reliable method capable of dealing with negative data," European Journal of Operational Research, Elsevier, vol. 265(2), pages 644-654.
    19. Mohsen Afsharian & Anna Kryvko & Peter Reichling, 2011. "Efficiency and Its Impact on the Performance of European Commercial Banks," FEMM Working Papers 110018, Otto-von-Guericke University Magdeburg, Faculty of Economics and Management.
    20. Trinks, Arjan & Mulder, Machiel & Scholtens, Bert, 2020. "An Efficiency Perspective on Carbon Emissions and Financial Performance," Ecological Economics, Elsevier, vol. 175(C).
