
Hidden Markov Model-based population synthesis

Listed author(s):
  • Saadi, Ismaïl
  • Mustafa, Ahmed
  • Teller, Jacques
  • Farooq, Bilal
  • Cools, Mario

    Micro-simulation travel demand and land use models require a synthetic population, which consists of a set of agents characterized by demographic and socio-economic attributes. Two main families of population synthesis techniques can be distinguished: (a) fitting methods (iterative proportional fitting (IPF) and updating) and (b) combinatorial optimization methods. During the last few years, a third, better-performing family of population synthesis procedures has emerged: Markov process-based methods such as Markov Chain Monte Carlo (MCMC) simulations. In this paper, an extended Hidden Markov Model (HMM)-based approach is presented, which can serve as a better alternative to the existing methods. The approach is characterized by great flexibility and efficiency in terms of data preparation and model training. The HMM is able to reproduce the structural configuration of a given population from an unlimited number of micro-samples and a marginal distribution. Only one marginal distribution of the considered population is used as a boundary condition to "guide" the synthesis of the whole population. Model training and testing are performed using the Survey on the Workforce of 2013 and the Belgian National Household Travel Survey of 2010. Results indicate that the HMM method captures the complete heterogeneity of the micro-data, contrary to standard fitting approaches. The method provides accurate results, as it is able to reproduce the marginal distributions and their corresponding multivariate joint distributions with an acceptable error rate (i.e., SRMSE = 0.54 for 6 synthesized attributes). Furthermore, the HMM outperforms IPF for small sample sizes, even though it uses less input data than IPF. Finally, simulations show that the HMM can merge information provided by multiple data sources to produce good population estimates.
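
    The sequential idea described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' extended HMM: it assumes a plain first-order chain in which each attribute depends only on the previous one, with transition counts taken from a toy micro-sample and the first attribute drawn from a single marginal distribution. All names (`fit_transitions`, `synthesize`, `srmse`), the toy records, and the marginal shares are hypothetical. The SRMSE shown is one common definition (RMSE over the cells of the joint frequency table divided by the mean reference cell frequency), restricted here to cells observed in either data set.

```python
import random
from collections import Counter, defaultdict
from math import sqrt


def fit_transitions(sample):
    """Count transitions between consecutive attributes of each record.

    Returns one table per attribute position k >= 1; each table maps the
    previous attribute value to a Counter of observed next values.
    """
    n_attrs = len(sample[0])
    transitions = [defaultdict(Counter) for _ in range(n_attrs - 1)]
    for record in sample:
        for k in range(1, n_attrs):
            transitions[k - 1][record[k - 1]][record[k]] += 1
    return transitions


def synthesize(marginal, transitions, n_agents, rng):
    """Draw agents: first attribute from the marginal, the rest in sequence.

    Assumes every value in the marginal also occurs in the micro-sample,
    otherwise there is no transition row to sample from.
    """
    first_values = list(marginal)
    first_weights = [marginal[v] for v in first_values]
    agents = []
    for _ in range(n_agents):
        agent = rng.choices(first_values, weights=first_weights)
        for table in transitions:
            counts = table[agent[-1]]
            values = list(counts)
            weights = [counts[v] for v in values]
            agent.append(rng.choices(values, weights=weights)[0])
        agents.append(tuple(agent))
    return agents


def srmse(reference, synthetic):
    """RMSE between joint relative frequencies, scaled by the mean reference cell."""
    ref_counts, syn_counts = Counter(reference), Counter(synthetic)
    cells = set(ref_counts) | set(syn_counts)
    n_ref, n_syn = len(reference), len(synthetic)
    squared = sum(
        (ref_counts[c] / n_ref - syn_counts[c] / n_syn) ** 2 for c in cells
    )
    mean_ref = 1.0 / len(cells)  # reference frequencies sum to 1
    return sqrt(squared / len(cells)) / mean_ref


# Hypothetical micro-sample: (age group, employment status, car ownership).
sample = [
    ("young", "student", "no_car"),
    ("young", "employed", "car"),
    ("adult", "employed", "car"),
    ("adult", "employed", "car"),
    ("adult", "unemployed", "no_car"),
    ("senior", "retired", "car"),
    ("senior", "retired", "no_car"),
]
# Hypothetical marginal distribution of the first attribute (the "guide").
marginal = {"young": 0.3, "adult": 0.5, "senior": 0.2}

rng = random.Random(0)
transitions = fit_transitions(sample)
agents = synthesize(marginal, transitions, 2000, rng)
print("SRMSE vs. micro-sample:", srmse(sample, agents))
```

    Because the first attribute follows the marginal rather than the sample shares, the synthetic joint distribution is expected to differ from the micro-sample; the SRMSE printed here only illustrates how such a comparison could be computed.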


    File URL: http://www.sciencedirect.com/science/article/pii/S0191261515300904
    Download Restriction: Full text for ScienceDirect subscribers only


    Article provided by Elsevier in its journal Transportation Research Part B: Methodological.

    Volume (Year): 90 (2016)
    Issue (Month): C
    Pages: 1-21


    Handle: RePEc:eee:transb:v:90:y:2016:i:c:p:1-21
    DOI: 10.1016/j.trb.2016.04.007
    Contact details of provider: Web page: http://www.elsevier.com/wps/find/journaldescription.cws_home/548/description#description


