
Information theoretic-based sampling of observations

Author

Listed:
  • van Cranenburgh, Sander
  • Bliemer, Michiel C.J.

Abstract

Due to the surge in the amount of data that are being collected, analysts are increasingly faced with very large data sets. Estimation of sophisticated discrete choice models (such as Mixed Logit models) based on these typically large data sets can be computationally burdensome, or even infeasible. Hitherto, analysts tried to overcome these computational burdens by reverting to less computationally demanding choice models or by taking advantage of the increase in computational resources. In this paper we take a different approach: we develop a new method called Sampling of Observations (SoO) which scales down the size of the choice data set, prior to the estimation. More specifically, based on information-theoretic principles this method extracts a subset of observations from the data which is much smaller in volume than the original data set, yet produces statistically nearly identical results. We show that this method can be used to estimate sophisticated discrete choice models based on data sets that were originally too large to conduct sophisticated choice analysis.

Suggested Citation

  • van Cranenburgh, Sander & Bliemer, Michiel C.J., 2019. "Information theoretic-based sampling of observations," Journal of Choice Modelling, Elsevier, vol. 31(C), pages 181-197.
  • Handle: RePEc:eee:eejocm:v:31:y:2019:i:c:p:181-197
    DOI: 10.1016/j.jocm.2018.02.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1755534517301124
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Frank Witlox, 2015. "Beyond the Data Smog?," Transport Reviews, Taylor & Francis Journals, vol. 35(3), pages 245-249, May.
    2. Bliemer, Michiel C.J. & Collins, Andrew T., 2016. "On determining priors for the generation of efficient stated choice experimental designs," Journal of Choice Modelling, Elsevier, vol. 21(C), pages 10-14.
    3. Esther Bekker-Grob & Bas Donkers & Marcel Jonker & Elly Stolk, 2015. "Sample Size Requirements for Discrete-Choice Experiments in Healthcare: a Practical Guide," The Patient: Patient-Centered Outcomes Research, Springer;International Academy of Health Preference Research, vol. 8(5), pages 373-384, October.
    4. Bliemer, Michiel C.J. & Rose, John M. & Hensher, David A., 2009. "Efficient stated choice experiments for estimating nested logit models," Transportation Research Part B: Methodological, Elsevier, vol. 43(1), pages 19-35, January.
    5. van Cranenburgh, Sander & Rose, John M. & Chorus, Caspar G., 2018. "On the robustness of efficient experimental designs towards the underlying decision rule," Transportation Research Part A: Policy and Practice, Elsevier, vol. 109(C), pages 50-64.
    6. David Revelt & Kenneth Train, 1998. "Mixed Logit With Repeated Choices: Households' Choices Of Appliance Efficiency Level," The Review of Economics and Statistics, MIT Press, vol. 80(4), pages 647-657, November.
    7. Hess, Stephane & Train, Kenneth E. & Polak, John W., 2006. "On the use of a Modified Latin Hypercube Sampling (MLHS) method in the estimation of a Mixed Logit Model for vehicle choice," Transportation Research Part B: Methodological, Elsevier, vol. 40(2), pages 147-163, February.
    8. Ferrini, Silvia & Scarpa, Riccardo, 2007. "Designs with a priori information for nonmarket valuation with choice experiments: A Monte Carlo study," Journal of Environmental Economics and Management, Elsevier, vol. 53(3), pages 342-363, May.
    9. Bliemer, Michiel C.J. & Rose, John M. & Chorus, Caspar G., 2017. "Detecting dominance in stated choice data and accounting for dominance-based scale differences in logit models," Transportation Research Part B: Methodological, Elsevier, vol. 102(C), pages 83-104.
    10. Nielsen, Otto Anker, 2000. "A stochastic transit assignment model considering differences in passengers utility functions," Transportation Research Part B: Methodological, Elsevier, vol. 34(5), pages 377-402, June.
    11. John Rose & Michiel Bliemer, 2013. "Sample size requirements for stated choice experiments," Transportation, Springer, vol. 40(5), pages 1021-1041, September.

    Citations



    Cited by:

    1. Stöckel, Jannis & Bom, Judith, 2022. "Revisiting longer-term health effects of informal caregiving: Evidence from the UK," The Journal of the Economics of Ageing, Elsevier, vol. 21(C).
    2. Andrea Tolentino Herrera & José Gerardo De la Vega Meneses, 2020. "Responsabilidad Social Corporativa como la clave para las empresas exitosas," Revista de Investigación en Ciencias Contables y Administrativas, Universidad Michoacana de San Nicolás de Hidalgo, Facultad de Contaduría y Ciencias Administrativas, vol. 6(1), pages 116-129, December.
    3. Molloy, Joseph & Becker, Felix & Schmid, Basil & Axhausen, Kay W., 2021. "mixl: An open-source R package for estimating complex choice models on large datasets," Journal of Choice Modelling, Elsevier, vol. 39(C).
    4. S. Van Cranenburgh & S. Wang & A. Vij & F. Pereira & J. Walker, 2021. "Choice modelling in the age of machine learning -- discussion paper," Papers 2101.11948, arXiv.org, revised Nov 2021.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. van Cranenburgh, Sander & Collins, Andrew T., 2019. "New software tools for creating stated choice experimental designs efficient for regret minimisation and utility maximisation decision rules," Journal of Choice Modelling, Elsevier, vol. 31(C), pages 104-123.
    2. Joan L. Walker & Yanqiao Wang & Mikkel Thorhauge & Moshe Ben-Akiva, 2018. "D-efficient or deficient? A robustness analysis of stated choice experimental designs," Theory and Decision, Springer, vol. 84(2), pages 215-238, March.
    3. van Cranenburgh, Sander & Rose, John M. & Chorus, Caspar G., 2018. "On the robustness of efficient experimental designs towards the underlying decision rule," Transportation Research Part A: Policy and Practice, Elsevier, vol. 109(C), pages 50-64.
    4. Bliemer, Michiel C.J. & Rose, John M., 2010. "Construction of experimental designs for mixed logit models allowing for correlation across choice observations," Transportation Research Part B: Methodological, Elsevier, vol. 44(6), pages 720-734, July.
    5. Danaf, Mazen & Atasoy, Bilge & de Azevedo, Carlos Lima & Ding-Mastera, Jing & Abou-Zeid, Maya & Cox, Nathaniel & Zhao, Fang & Ben-Akiva, Moshe, 2019. "Context-aware stated preferences with smartphone-based travel surveys," Journal of Choice Modelling, Elsevier, vol. 31(C), pages 35-50.
    6. Tagliafierro, C. & Boeri, M. & Longo, A. & Hutchinson, W.G., 2016. "Stated preference methods and landscape ecology indicators: An example of transdisciplinarity in landscape economic valuation," Ecological Economics, Elsevier, vol. 127(C), pages 11-22.
    7. Esther W. de Bekker‐Grob & Mandy Ryan & Karen Gerard, 2012. "Discrete choice experiments in health economics: a review of the literature," Health Economics, John Wiley & Sons, Ltd., vol. 21(2), pages 145-172, February.
    8. Haghani, Milad & Bliemer, Michiel C.J. & Hensher, David A., 2021. "The landscape of econometric discrete choice modelling research," Journal of Choice Modelling, Elsevier, vol. 40(C).
    9. Haghani, Milad & Sarvi, Majid, 2018. "Hypothetical bias and decision-rule effect in modelling discrete directional choices," Transportation Research Part A: Policy and Practice, Elsevier, vol. 116(C), pages 361-388.
    10. Greiner, Romy & Bliemer, Michiel & Ballweg, Julie, 2014. "Design considerations of a choice experiment to estimate likely participation by north Australian pastoralists in contractual biodiversity conservation," Journal of Choice Modelling, Elsevier, vol. 10(C), pages 34-45.
    11. John Gibson & Riccardo Scarpa & Halahingano Rohorua, 2013. "Respiratory Health of Pacific Island Immigrants and Preferences for Indoor Air Quality Determinants in New Zealand," Working Papers in Economics 13/09, University of Waikato.
    12. Zhang, Rong & Zhu, Lichao, 2019. "Threshold incorporating freight choice modeling for hinterland leg transportation chain of export containers," Transportation Research Part A: Policy and Practice, Elsevier, vol. 130(C), pages 858-872.
    13. Zhou, Heng & Norman, Richard & Xia, Jianhong(Cecilia) & Hughes, Brett & Kelobonye, Keone & Nikolova, Gabi & Falkmer, Torbjorn, 2020. "Analysing travel mode and airline choice using latent class modelling: A case study in Western Australia," Transportation Research Part A: Policy and Practice, Elsevier, vol. 137(C), pages 187-205.
    14. Amador, Francisco Javier & González, Rosa Marina & Ramos-Real, Francisco Javier, 2013. "Supplier choice and WTP for electricity attributes in an emerging market: The role of perceived past experience, environmental concern and energy saving behavior," Energy Economics, Elsevier, vol. 40(C), pages 953-966.
    15. Großmann, Heiko, 2019. "A practical approach to designing partial-profile choice experiments with two alternatives for estimating main effects and interactions of many two-level attributes," Journal of Choice Modelling, Elsevier, vol. 32(C), pages 1-1.
    16. Bliemer, Michiel C.J. & Rose, John M., 2011. "Experimental design influences on stated choice outputs: An empirical study in air travel choice," Transportation Research Part A: Policy and Practice, Elsevier, vol. 45(1), pages 63-79, January.
    17. Canessa, Carolin & Venus, Terese E. & Wiesmeier, Miriam & Mennig, Philipp & Sauer, Johannes, 2023. "Incentives, Rewards or Both in Payments for Ecosystem Services: Drawing a Link Between Farmers' Preferences and Biodiversity Levels," Ecological Economics, Elsevier, vol. 213(C).
    18. Frings, Oliver & Abildtrup, Jens & Montagné-Huck, Claire & Gorel, Salomé & Stenger, Anne, 2023. "Do individual PES buyers care about additionality and free-riding? A choice experiment," Ecological Economics, Elsevier, vol. 213(C).
    19. Fosgerau, Mogens & Bierlaire, Michel, 2007. "A practical test for the choice of mixing distribution in discrete choice models," Transportation Research Part B: Methodological, Elsevier, vol. 41(7), pages 784-794, August.
    20. Kolarova, Viktoriya & Steck, Felix & Bahamonde-Birke, Francisco J., 2019. "Assessing the effect of autonomous driving on value of travel time savings: A comparison between current and future preferences," Transportation Research Part A: Policy and Practice, Elsevier, vol. 129(C), pages 155-169.
