
Quantile regression in big data: A divide and conquer based strategy

Author

Listed:
  • Chen, Lanjue
  • Zhou, Yong

Abstract

Quantile regression, which analyzes the conditional distribution of outcomes given a set of covariates, has been widely used in many fields. However, the volume and velocity of big data make estimating a quantile regression model extremely difficult because of intensive computation and limited storage. Based on a divide-and-conquer strategy, a simple and efficient method is proposed to address this problem. The proposed approach keeps only summary statistics of each data block and then uses them to reconstruct the estimator for the entire data set with asymptotically negligible approximation error. This property makes the proposed method particularly appealing when data blocks are stored on multiple servers or arrive as a data stream. Furthermore, under certain conditions, the proposed estimator is shown to be consistent and asymptotically as efficient as the estimating equation estimator computed from the entire data set. The merits of the proposed method are illustrated using both simulation studies and real data analysis.
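
The abstract does not spell out which summary statistics are kept or how they are combined, so the following is only a minimal sketch of the general divide-and-conquer idea, not the authors' estimator. It assumes a single covariate, fits each block exactly by enumerating lines through pairs of points (a quantile regression fit with two coefficients can always be chosen to pass through two observations), keeps only the block size and the two fitted coefficients, and combines blocks by a size-weighted average. The names `fit_block`, `combine`, and `simulate_and_estimate`, the simulated model y = 1 + 2x + noise, and the naive averaging rule are all illustrative assumptions.

```python
import random
from itertools import combinations


def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_block(xs, ys, tau):
    """Exact one-covariate quantile regression on a small block.

    Tries every line through two observations and keeps the one with
    the smallest total check loss. Cost is cubic in the block size,
    so this is only suitable for small blocks.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, slope undefined
        b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b0 = ys[i] - b1 * xs[i]
        loss = sum(check_loss(y - b0 - b1 * x, tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    return best[1], best[2]


def combine(summaries):
    """Size-weighted average of per-block (n, intercept, slope) summaries.

    Assumption: naive averaging stands in for the paper's aggregation rule.
    """
    total = sum(n for n, _, _ in summaries)
    b0 = sum(n * a for n, a, _ in summaries) / total
    b1 = sum(n * b for n, _, b in summaries) / total
    return b0, b1


def simulate_and_estimate(n_blocks=40, block_size=30, tau=0.5, seed=0):
    """Stream simulated blocks, storing only a 3-number summary per block."""
    rng = random.Random(seed)
    summaries = []
    for _ in range(n_blocks):
        xs = [rng.random() for _ in range(block_size)]
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        b0, b1 = fit_block(xs, ys, tau)
        summaries.append((block_size, b0, b1))  # raw block can now be discarded
    return combine(summaries)


b0_hat, b1_hat = simulate_and_estimate()
print("intercept:", b0_hat, "slope:", b1_hat)
```

Because each block is reduced to three numbers before the next one is read, the same loop would apply whether the blocks sit on separate servers or arrive one at a time as a stream; only the `combine` step needs to see all summaries.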

Suggested Citation

  • Chen, Lanjue & Zhou, Yong, 2020. "Quantile regression in big data: A divide and conquer based strategy," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
  • Handle: RePEc:eee:csdana:v:144:y:2020:i:c:s0167947319302476
    DOI: 10.1016/j.csda.2019.106892

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947319302476
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Moshe Buchinsky, 1998. "Recent Advances in Quantile Regression Models: A Practical Guideline for Empirical Research," Journal of Human Resources, University of Wisconsin Press, vol. 33(1), pages 88-126.
    2. Pakes, Ariel & Pollard, David, 1989. "Simulation and the Asymptotics of Optimization Estimators," Econometrica, Econometric Society, vol. 57(5), pages 1027-1057, September.
    3. Runze Li & Dennis K.J. Lin & Bing Li, 2013. "Statistical inference in massive data sets," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 29(5), pages 399-409, September.
    4. Xi Chen & Weidong Liu & Yichen Zhang, 2018. "Quantile Regression Under Memory Constraint," Papers 1810.08264, arXiv.org.
    5. Peter Hall & Jeff Racine & Qi Li, 2004. "Cross-Validation and the Estimation of Conditional Probability Densities," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 1015-1026, December.
    6. Koenker, Roger W & Bassett, Gilbert, Jr, 1978. "Regression Quantiles," Econometrica, Econometric Society, vol. 46(1), pages 33-50, January.
    7. Buchinsky, Moshe, 1994. "Changes in the U.S. Wage Structure 1963-1987: Application of Quantile Regression," Econometrica, Econometric Society, vol. 62(2), pages 405-458, March.
    8. Roger Koenker & Kevin F. Hallock, 2001. "Quantile Regression," Journal of Economic Perspectives, American Economic Association, vol. 15(4), pages 143-156, Fall.
    9. Buchinsky, Moshe, 1995. "Quantile regression, Box-Cox transformation model, and the U.S. wage structure, 1963-1987," Journal of Econometrics, Elsevier, vol. 65(1), pages 109-154, January.

    Citations



    Cited by:

    1. Wang, Kangning & Li, Shaomin & Zhang, Benle, 2021. "Robust communication-efficient distributed composite quantile regression and variable selection for massive data," Computational Statistics & Data Analysis, Elsevier, vol. 161(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sungwon Lee & Joon H. Ro, 2020. "Nonparametric Tests for Conditional Quantile Independence with Duration Outcomes," Working Papers 2013, Nam Duck-Woo Economic Research Institute, Sogang University (Former Research Institute for Market Economy).
    2. Alex Coad & Rekha Rao, 2007. "The employment effects of innovation," Documents de travail du Centre d'Economie de la Sorbonne r07036, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    3. Joachim Wagner, 2014. "Exports, foreign direct investments and productivity: are services firms different?," The Service Industries Journal, Taylor & Francis Journals, vol. 34(1), pages 24-37, January.
    4. Joachim Wagner, 2006. "Export Intensity and Plant Characteristics: What Can We Learn from Quantile Regression?," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 142(1), pages 195-203, April.
    5. Daniel Pollmann & Thomas Dohmen & Franz Palm, 2020. "Robust Estimation of Wage Dispersion with Censored Data: An Application to Occupational Earnings Risk and Risk Attitudes," De Economist, Springer, vol. 168(4), pages 519-540, December.
    6. Alexander Coad, 2008. "Distance to Frontier and Appropriate Business Strategy," Papers on Economics and Evolution 2008-07, Philipps University Marburg, Department of Geography.
    7. Jayeeta Bhattacharya, 2020. "Quantile regression with generated dependent variable and covariates," Papers 2012.13614, arXiv.org.
    8. Joachim Wagner, 2016. "From Estimation Results to Stylized Facts: Twelve Recommendations for Empirical Research in International Activities of Heterogeneous Firms," World Scientific Book Chapters, in: Microeconometrics of International Trade, chapter 15, pages 479-514, World Scientific Publishing Co. Pte. Ltd..
    9. Stijn Kelchtermans & Reinhilde Veugelers, 2011. "The great divide in scientific productivity: why the average scientist does not exist," Industrial and Corporate Change, Oxford University Press and the Associazione ICC, vol. 20(1), pages 295-336, February.
    10. Zheng Fang & Chris Sakellariou, 2011. "A Case of Sticky Floors: Gender Wage Differentials in Thailand," Asian Economic Journal, East Asian Economic Association, vol. 25(1), pages 35-54, March.
    11. Kelly LABAR, 2007. "Intergenerational Mobility in China," Working Papers 200729, CERDI.
    12. Alex Coad & Rekha Rao, 2006. "Innovation and firm growth in "complex technology" sectors: a quantile regression approach," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-00118797, HAL.
    13. Daniel Pollmann & Thomas Dohmen & Franz Palm, 2020. "Dispersion estimation; Earnings risk; Censoring; Quantile regression; Occupational choice; Sorting; Risk preferences; SOEP; IABS," ECONtribute Discussion Papers Series 028, University of Bonn and University of Cologne, Germany.
    14. Brown, Christian & Routon, P. Wesley, 2018. "On the distributional and evolutionary nature of the obesity wage penalty," Economics & Human Biology, Elsevier, vol. 28(C), pages 160-172.
    15. Powell, David & Wagner, Joachim, 2010. "The Exporter Productivity Premium along the Productivity Distribution: First Evidence from a Quantile Regression Approach for Fixed Effects Panel Data Models," IZA Discussion Papers 5112, Institute of Labor Economics (IZA).
    16. Yu-Yen Ku & Tze-Yu Yen, 2016. "Heterogeneous Effect of Financial Leverage on Corporate Performance: A Quantile Regression Analysis of Taiwanese Companies," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 19(03), pages 1-33, September.
    17. Stacy, Brian, 2014. "Left with Bias? Quantile Regression with Measurement Error in Left Hand Side Variables," EconStor Preprints 104744, ZBW - Leibniz Information Centre for Economics.
    18. Szilvia Hamori & Anna Lovasz, 2011. "Can a fifty percent increase in public sector wages improve the position of public sector employees in the long run? An assessment of the public-private income gap in Hungary," Budapest Working Papers on the Labour Market 1106, Institute of Economics, Centre for Economic and Regional Studies.
    19. Burchi, Francesco, 2010. "Child nutrition in Mozambique in 2003: The role of mother's schooling and nutrition knowledge," Economics & Human Biology, Elsevier, vol. 8(3), pages 331-345, December.
    20. Ulrich Reuter, 2006. "What Kind of Education Does China Need?: The Impact of Educational Attainment on Local Growth and Disparities," WIDER Working Paper Series RP2006-127, World Institute for Development Economic Research (UNU-WIDER).
