Printed from https://ideas.repec.org/a/inm/ortrsc/v47y2013i2p266-279.html

Synthetic Population Generation Without a Sample

Author

Listed:
  • Johan Barthelemy

    (Namur Research Center for Complex Systems (NAXYS), FUNDP-University of Namur, B-5000 Namur, Belgium, johan.barthelemy@fundp.ac.be)

  • Philippe L. Toint

    (Namur Research Center for Complex Systems (NAXYS), FUNDP-University of Namur, B-5000 Namur, Belgium)

Abstract

The advent of microsimulation in the transportation sector has created the need for extensive disaggregated data concerning the population whose behavior is modeled. Because of the cost of collecting these data and the existing privacy regulations, this need is often met by the creation of a synthetic population on the basis of aggregate data. Although several techniques for generating such a population are known, they suffer from a number of limitations. The first is the need for a sample of the population for which fully disaggregated data must be collected, although such samples may not exist or may not be financially feasible. The second limiting assumption is that the aggregate data used must be consistent, a situation that is most unusual because these data often come from different sources and are collected, possibly at different moments, using different protocols. The paper presents a new synthetic population generator in the class of the Synthetic Reconstruction methods, whose objective is to obviate these limitations. It proceeds in three main successive steps: generation of individuals, generation of household-type joint distributions, and generation of households by gathering individuals. The main idea in these generation steps is to use data at the most disaggregated level possible to define joint distributions, from which individuals and households are randomly drawn. The method also makes explicit use of both continuous and discrete optimization and uses the χ² metric to estimate distances between estimated and generated distributions. The new generator is applied for constructing a synthetic population of approximately 10,000,000 individuals and 4,350,000 households localized in the 589 municipalities of Belgium. The statistical quality of the generated population is discussed using criteria extracted from the literature, and it is shown that the new population generator produces excellent results.
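
As a rough illustration of two ideas named in the abstract (drawing individuals at random from a joint distribution, then measuring the distance between the estimated and the generated distribution), here is a minimal Python sketch. It is not the authors' generator: the attribute categories, the probabilities, the helper names `draw_individuals` and `chi_square_distance`, and the choice of the Pearson form of the χ² statistic are all assumptions made for this example, since the abstract does not specify them.

```python
import random
from collections import Counter

# Hypothetical joint distribution over (age class, gender) for one municipality.
# Categories and probabilities are illustrative assumptions, not data from the paper.
JOINT = {
    ("0-17", "F"): 0.10,
    ("0-17", "M"): 0.11,
    ("18-64", "F"): 0.30,
    ("18-64", "M"): 0.31,
    ("65+", "F"): 0.10,
    ("65+", "M"): 0.08,
}


def draw_individuals(joint, n, rng):
    """Draw n individuals at random from a discrete joint distribution."""
    cells = list(joint)
    weights = [joint[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=n)


def chi_square_distance(expected, observed):
    """Pearson-style chi-square: sum of (O - E)^2 / E over cells with E > 0.

    Assumed form; the abstract only says that a chi-square metric is used.
    """
    return sum(
        (observed.get(cell, 0) - e) ** 2 / e
        for cell, e in expected.items()
        if e > 0
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 10_000
population = draw_individuals(JOINT, n, rng)

observed = Counter(population)                          # generated counts per cell
expected = {cell: p * n for cell, p in JOINT.items()}   # estimated counts per cell
distance = chi_square_distance(expected, observed)
print(f"chi-square distance: {distance:.3f}")
```

A smaller distance means the generated counts track the estimated distribution more closely. A full generator in the spirit of the abstract would repeat this per municipality and then assemble the drawn individuals into households, which this sketch does not attempt.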

Suggested Citation

  • Johan Barthelemy & Philippe L. Toint, 2013. "Synthetic Population Generation Without a Sample," Transportation Science, INFORMS, vol. 47(2), pages 266-279, May.
  • Handle: RePEc:inm:ortrsc:v:47:y:2013:i:2:p:266-279
    DOI: 10.1287/trsc.1120.0408

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/trsc.1120.0408
    Download Restriction: no

    File URL: https://libkey.io/10.1287/trsc.1120.0408?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Fred Glover, 1989. "Tabu Search---Part I," INFORMS Journal on Computing, INFORMS, vol. 1(3), pages 190-206, August.
    2. Fred Glover, 1990. "Tabu Search—Part II," INFORMS Journal on Computing, INFORMS, vol. 2(1), pages 4-32, February.
    3. Beckman, Richard J. & Baggerly, Keith A. & McKay, Michael D., 1996. "Creating synthetic baseline populations," Transportation Research Part A: Policy and Practice, Elsevier, vol. 30(6), pages 415-429, November.

    Citations



    Cited by:

    1. Stefano Guarino & Enrico Mastrostefano & Massimo Bernaschi & Alessandro Celestini & Marco Cianfriglia & Davide Torre & Lena Rebecca Zastrow, 2021. "Inferring Urban Social Networks from Publicly Available Data," Future Internet, MDPI, vol. 13(5), pages 1-45, April.
    2. Jason Hawkins & Khandker Nurul Habib, 2023. "A multi-source data fusion framework for joint population, expenditure, and time use synthesis," Transportation, Springer, vol. 50(4), pages 1323-1346, August.
    3. Andrew Bwambale & Charisma F. Choudhury & Stephane Hess & Md. Shahadat Iqbal, 2021. "Getting the best of both worlds: a framework for combining disaggregate travel survey data and aggregate mobile phone data for trip generation modelling," Transportation, Springer, vol. 48(5), pages 2287-2314, October.
    4. Saadi, Ismaïl & Mustafa, Ahmed & Teller, Jacques & Farooq, Bilal & Cools, Mario, 2016. "Hidden Markov Model-based population synthesis," Transportation Research Part B: Methodological, Elsevier, vol. 90(C), pages 1-21.
    5. Jian Liu & Xiaosu Ma & Yi Zhu & Jing Li & Zong He & Sheng Ye, 2021. "Generating and Visualizing Spatially Disaggregated Synthetic Population Using a Web-Based Geospatial Service," Sustainability, MDPI, vol. 13(3), pages 1-16, February.
    6. Trond Husby & Olga Ivanova & Mark Thissen, 2018. "Simulating the Joint Distribution of Individuals, Households and Dwellings in Small Areas," International Journal of Microsimulation, International Microsimulation Association, vol. 11(2), pages 169-190.
    7. Ballis, Haris & Dimitriou, Loukas, 2020. "Revealing personal activities schedules from synthesizing multi-period origin-destination matrices," Transportation Research Part B: Methodological, Elsevier, vol. 139(C), pages 224-258.
    8. Farooq, Bilal & Bierlaire, Michel & Hurtubia, Ricardo & Flötteröd, Gunnar, 2013. "Simulation based population synthesis," Transportation Research Part B: Methodological, Elsevier, vol. 58(C), pages 243-263.
    9. Sun, Lijun & Erath, Alexander & Cai, Ming, 2018. "A hierarchical mixture modeling framework for population synthesis," Transportation Research Part B: Methodological, Elsevier, vol. 114(C), pages 199-212.
    10. Yu Han & Changjie Chen & Zhong-Ren Peng & Pallab Mozumder, 2022. "Evaluating impacts of coastal flooding on the transportation system using an activity-based travel demand model: a case study in Miami-Dade County, FL," Transportation, Springer, vol. 49(1), pages 163-184, February.
    11. Ma, Lu & Srinivasan, Sivaramakrishnan, 2016. "An empirical assessment of factors affecting the accuracy of target-year synthetic populations," Transportation Research Part A: Policy and Practice, Elsevier, vol. 85(C), pages 247-264.
    12. Boakye, Jessica & Guidotti, Roberto & Gardoni, Paolo & Murphy, Colleen, 2022. "The role of transportation infrastructure on the impact of natural hazards on communities," Reliability Engineering and System Safety, Elsevier, vol. 219(C).
    13. Suesse, Thomas & Namazi-Rad, Mohammad-Reza & Mokhtarian, Payam & Barthélemy, Johan, 2017. "Estimating Cross-Classified Population Counts of Multidimensional Tables: An Application to Regional Australia to Obtain Pseudo-Census Counts," Journal of Official Statistics, Sciendo, vol. 33(4), pages 1021-1050, December.
    14. Nicholas Fournier & Eleni Christofa & Arun Prakash Akkinepally & Carlos Lima Azevedo, 2021. "Integrated population synthesis and workplace assignment using an efficient optimization-based person-household matching method," Transportation, Springer, vol. 48(2), pages 1061-1087, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nicholas Fournier & Eleni Christofa & Arun Prakash Akkinepally & Carlos Lima Azevedo, 2021. "Integrated population synthesis and workplace assignment using an efficient optimization-based person-household matching method," Transportation, Springer, vol. 48(2), pages 1061-1087, April.
    2. Haochen Zhang & Shaowei Cai & Chuan Luo & Minghao Yin, 2017. "An efficient local search algorithm for the winner determination problem," Journal of Heuristics, Springer, vol. 23(5), pages 367-396, October.
    3. Mohammad Javad Feizollahi & Igor Averbakh, 2014. "The Robust (Minmax Regret) Quadratic Assignment Problem with Interval Flows," INFORMS Journal on Computing, INFORMS, vol. 26(2), pages 321-335, May.
    4. C N Potts & V A Strusevich, 2009. "Fifty years of scheduling: a survey of milestones," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(1), pages 41-68, May.
    5. Cazzaro, Davide & Fischetti, Martina & Fischetti, Matteo, 2020. "Heuristic algorithms for the Wind Farm Cable Routing problem," Applied Energy, Elsevier, vol. 278(C).
    6. Fiondella, Lance & Lin, Yi-Kuei & Pham, Hoang & Chang, Ping-Chen & Li, Chendong, 2017. "A confidence-based approach to reliability design considering correlated failures," Reliability Engineering and System Safety, Elsevier, vol. 165(C), pages 102-114.
    7. Huang, Yeran & Yang, Lixing & Tang, Tao & Gao, Ziyou & Cao, Fang, 2017. "Joint train scheduling optimization with service quality and energy efficiency in urban rail transit networks," Energy, Elsevier, vol. 138(C), pages 1124-1147.
    8. B Dengiz & C Alabas-Uslu & O Dengiz, 2009. "Optimization of manufacturing systems using a neural network metamodel with a new training approach," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(9), pages 1191-1197, September.
    9. S-W Lin & K-C Ying, 2008. "A hybrid approach for single-machine tardiness problems with sequence-dependent setup times," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(8), pages 1109-1119, August.
    10. Shao, Saijun & Xu, Su Xiu & Huang, George Q., 2020. "Variable neighborhood search and tabu search for auction-based waste collection synchronization," Transportation Research Part B: Methodological, Elsevier, vol. 133(C), pages 1-20.
    11. Joseph B. Mazzola & Robert H. Schantz, 1997. "Multiple‐facility loading under capacity‐based economies of scope," Naval Research Logistics (NRL), John Wiley & Sons, vol. 44(3), pages 229-256, April.
    12. Abdmouleh, Zeineb & Gastli, Adel & Ben-Brahim, Lazhar & Haouari, Mohamed & Al-Emadi, Nasser Ahmed, 2017. "Review of optimization techniques applied for the integration of distributed generation from renewable energy sources," Renewable Energy, Elsevier, vol. 113(C), pages 266-280.
    13. Masoud Yaghini & Mohammad Karimi & Mohadeseh Rahbar, 2015. "A set covering approach for multi-depot train driver scheduling," Journal of Combinatorial Optimization, Springer, vol. 29(3), pages 636-654, April.
    14. Fred W. Glover, 2022. "Unforeseen Consequences of “Tabu” Choices—A Retrospective," INFORMS Journal on Computing, INFORMS, vol. 34(3), pages 1306-1308, May.
    15. Chris S. K. Leung & Henry Y. K. Lau, 2018. "Multiobjective Simulation-Based Optimization Based on Artificial Immune Systems for a Distribution Center," Journal of Optimization, Hindawi, vol. 2018, pages 1-15, May.
    16. Ilfat Ghamlouche & Teodor Gabriel Crainic & Michel Gendreau, 2003. "Cycle-Based Neighbourhoods for Fixed-Charge Capacitated Multicommodity Network Design," Operations Research, INFORMS, vol. 51(4), pages 655-667, August.
    17. Olli Bräysy & Michel Gendreau, 2005. "Vehicle Routing Problem with Time Windows, Part II: Metaheuristics," Transportation Science, INFORMS, vol. 39(1), pages 119-139, February.
    18. Servranckx, Tom & Vanhoucke, Mario, 2019. "A tabu search procedure for the resource-constrained project scheduling problem with alternative subgraphs," European Journal of Operational Research, Elsevier, vol. 273(3), pages 841-860.
    19. Drexl, Andreas & Juretzka, Jan & Salewski, Frank, 1993. "Academic course scheduling under workload and changeover constraints," Manuskripte aus den Instituten für Betriebswirtschaftslehre der Universität Kiel 337, Christian-Albrechts-Universität zu Kiel, Institut für Betriebswirtschaftslehre.
    20. Christina Iliopoulou & Konstantinos Kepaptsoglou & Eleni Vlahogianni, 2019. "Metaheuristics for the transit route network design problem: a review and comparative analysis," Public Transport, Springer, vol. 11(3), pages 487-521, October.
