
Hidden Markov Model-based population synthesis

Authors

  • Saadi, Ismaïl
  • Mustafa, Ahmed
  • Teller, Jacques
  • Farooq, Bilal
  • Cools, Mario

Abstract

Micro-simulation travel demand and land use models require a synthetic population, which consists of a set of agents characterized by demographic and socio-economic attributes. Two main families of population synthesis techniques can be distinguished: (a) fitting methods (iterative proportional fitting, updating) and (b) combinatorial optimization methods. Over the last few years, a third, better-performing family of population synthesis procedures has emerged: Markov process-based methods such as Markov Chain Monte Carlo (MCMC) simulations. In this paper, an extended Hidden Markov Model (HMM)-based approach is presented, which can serve as a better alternative to the existing methods. The approach is characterized by great flexibility and efficiency in terms of data preparation and model training. The HMM is able to reproduce the structural configuration of a given population from an unlimited number of micro-samples and a marginal distribution. A single marginal distribution of the considered population can serve as the boundary condition that “guides” the synthesis of the whole population. Model training and testing are performed using the Survey on the Workforce of 2013 and the Belgian National Household Travel Survey of 2010. Results indicate that the HMM method captures the complete heterogeneity of the micro-data, in contrast to standard fitting approaches. The method provides accurate results, as it is able to reproduce the marginal distributions and their corresponding multivariate joint distributions with an acceptable error rate (i.e., SRMSE = 0.54 for 6 synthesized attributes). Furthermore, the HMM outperforms IPF for small sample sizes, even though it requires less input data than IPF. Finally, simulations show that the HMM can merge information provided by multiple data sources to produce good population estimates.
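
The idea of letting one marginal distribution guide the synthesis while a micro-sample supplies the structure between attributes can be illustrated with a deliberately simplified sketch. It is not the authors' extended HMM: it uses a plain first-order Markov chain over an assumed attribute ordering, with no hidden states. The attribute names, the toy micro-sample, the marginal, and the function names `fit_chain`, `synthesize`, and `srmse` are all hypothetical. The `srmse` function follows one common form of the standardized root mean square error (root mean square error of cell frequencies divided by the mean reference frequency); the paper's exact definition may differ.

```python
import random
from collections import Counter, defaultdict

ORDER = ["age", "employment", "car"]  # assumed attribute ordering (hypothetical)

def fit_chain(sample, order):
    """Count transitions between consecutive attributes in a micro-sample."""
    transitions = [defaultdict(Counter) for _ in order[1:]]
    for person in sample:
        for k in range(len(order) - 1):
            transitions[k][person[order[k]]][person[order[k + 1]]] += 1
    return transitions

def synthesize(n, first_marginal, transitions, order, rng):
    """Draw the first attribute from the marginal, then walk the chain.

    Every value in the marginal must also occur in the micro-sample,
    otherwise there is no observed transition to sample from.
    """
    first_values = list(first_marginal)
    first_weights = [first_marginal[v] for v in first_values]
    population = []
    for _ in range(n):
        value = rng.choices(first_values, weights=first_weights)[0]
        agent = {order[0]: value}
        for k, table in enumerate(transitions):
            counts = table[value]
            value = rng.choices(list(counts), weights=list(counts.values()))[0]
            agent[order[k + 1]] = value
        population.append(agent)
    return population

def srmse(reference, synthetic):
    """One common SRMSE form, computed on relative cell frequencies."""
    ref, syn = Counter(reference), Counter(synthetic)
    cells = set(ref) | set(syn)
    n_ref, n_syn = len(reference), len(synthetic)
    squared = sum((ref[c] / n_ref - syn[c] / n_syn) ** 2 for c in cells)
    mean_ref = sum(ref[c] / n_ref for c in cells) / len(cells)
    return (squared / len(cells)) ** 0.5 / mean_ref

def as_cells(people):
    """Represent each agent as a tuple, i.e. a cell of the joint table."""
    return [tuple(p[a] for a in ORDER) for p in people]

# Toy micro-sample and a single guiding marginal (both invented for illustration).
sample = [
    {"age": "young", "employment": "student", "car": "no"},
    {"age": "young", "employment": "employed", "car": "yes"},
    {"age": "adult", "employment": "employed", "car": "yes"},
    {"age": "adult", "employment": "employed", "car": "no"},
    {"age": "senior", "employment": "retired", "car": "yes"},
    {"age": "senior", "employment": "retired", "car": "no"},
]
age_marginal = {"young": 0.3, "adult": 0.5, "senior": 0.2}

rng = random.Random(0)
transitions = fit_chain(sample, ORDER)
population = synthesize(1000, age_marginal, transitions, ORDER, rng)
age_share = Counter(agent["age"] for agent in population)
score = srmse(as_cells(sample), as_cells(population))
```

In this sketch the age shares of the synthetic population follow the supplied marginal rather than the micro-sample, while the age-to-employment and employment-to-car relationships come from the sample. A chain of this kind only captures dependencies between adjacent attributes, which is one reason a richer model such as the HMM described in the abstract may be preferable.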

Suggested Citation

  • Saadi, Ismaïl & Mustafa, Ahmed & Teller, Jacques & Farooq, Bilal & Cools, Mario, 2016. "Hidden Markov Model-based population synthesis," Transportation Research Part B: Methodological, Elsevier, vol. 90(C), pages 1-21.
  • Handle: RePEc:eee:transb:v:90:y:2016:i:c:p:1-21
    DOI: 10.1016/j.trb.2016.04.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0191261515300904
    Download Restriction: Full text for ScienceDirect subscribers only


