A novel data fusion method to leverage passively-collected mobility data in generating spatially-heterogeneous synthetic population

Authors

  • Vo, Khoa D.
  • Kim, Eui-Jin
  • Bansal, Prateek

Abstract

Conventional population synthesis methods rely on household travel survey (HTS) data. Because they generate sociodemographic and spatial attributes sequentially, they produce many infeasible attribute values, and the low sampling rate of HTS data leads to low spatial heterogeneity. Passively collected mobility (PCM) data (e.g., cellular traces) provide extensive spatial coverage but are hard to integrate with HTS data because the two sources differ in spatial resolution and attributes. This study introduces a novel cluster-based data fusion method that addresses these limitations and simultaneously generates synthetic populations with accurate sociodemographics and home–work locations at high spatial heterogeneity. Spatial clustering is adopted to align the spatial resolution of HTS and PCM data, facilitating effective data integration. The data fusion process is reformulated into cluster-specific low-dimensional optimization subproblems to ensure computational tractability. Analytical properties are derived so that the fused distribution retains essential distributional characteristics from both datasets. The spatial clustering process is optimized to ensure such distributional consistency while balancing the feasibility and heterogeneity of the synthetic population. The data fusion properties are validated using HTS and LTE/5G cellular signaling data from Seoul, South Korea. Validation against census data confirms the method’s efficacy in maintaining distributional consistency while increasing spatial heterogeneity, with 97% of the generated population being unobserved in the HTS data. This research advances population synthesis methods by leveraging the complementary strengths of HTS and PCM data, providing a robust framework for generating the spatially diverse synthetic populations essential for urban planning.
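
The cluster-level idea can be illustrated with a minimal Python sketch. This is not the paper's optimization-based method. It swaps in a much simpler rule and assumes that, within a spatial cluster, sociodemographics are independent of home zone, so the fused probability of a (group, zone) cell is the PCM share of the zone times the HTS share of the group in that zone's cluster. The function name `fuse_by_cluster`, the group labels, the zone IDs, and the toy counts are all hypothetical.

```python
from collections import Counter, defaultdict


def fuse_by_cluster(hts_records, pcm_counts, zone_to_cluster):
    """Fuse survey and passive data under a within-cluster independence assumption.

    hts_records:     list of (group, zone) pairs from the survey sample
    pcm_counts:      dict mapping zone -> number of devices homed in that zone
    zone_to_cluster: dict mapping zone -> cluster id
    Returns a dict mapping (group, zone) -> fused probability.
    """
    # HTS side: sociodemographic group counts pooled per cluster.
    group_counts = defaultdict(Counter)
    for group, zone in hts_records:
        group_counts[zone_to_cluster[zone]][group] += 1

    grand_total = sum(pcm_counts.values())

    fused = {}
    for zone, n_devices in pcm_counts.items():
        groups = group_counts[zone_to_cluster[zone]]
        n_hts = sum(groups.values())
        if n_hts == 0:
            # Assumption: clusters without survey respondents are skipped here;
            # a real method would have to handle them explicitly.
            continue
        for group, k in groups.items():
            # P(group, zone) = P_PCM(zone) * P_HTS(group | cluster of zone)
            fused[(group, zone)] = (n_devices / grand_total) * (k / n_hts)
    return fused


# Toy data (hypothetical): five zones grouped into two clusters.
zone_to_cluster = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
hts_records = [("young", "A"), ("young", "A"), ("old", "B"), ("old", "D")]
pcm_counts = {"A": 40, "B": 30, "C": 10, "D": 15, "E": 5}

fused = fuse_by_cluster(hts_records, pcm_counts, zone_to_cluster)

# Share of fused probability mass on (group, zone) cells never seen in the survey.
observed = set(hts_records)
unobserved_share = sum(p for cell, p in fused.items() if cell not in observed)
print(f"cells: {len(fused)}, unobserved share: {unobserved_share:.3f}")
```

In this toy setup the zone marginal of the fused distribution matches the PCM shares, and the group mix within each cluster matches the HTS sample, which mirrors in a simplified way the goal of retaining distributional characteristics from both sources. Zones such as "C" and "E", which have no survey respondents, still receive synthetic population because they borrow the group mix of their cluster.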

Suggested Citation

  • Vo, Khoa D. & Kim, Eui-Jin & Bansal, Prateek, 2025. "A novel data fusion method to leverage passively-collected mobility data in generating spatially-heterogeneous synthetic population," Transportation Research Part B: Methodological, Elsevier, vol. 191(C).
  • Handle: RePEc:eee:transb:v:191:y:2025:i:c:s0191261524002522
    DOI: 10.1016/j.trb.2024.103128

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0191261524002522
    Download Restriction: Full text for ScienceDirect subscribers only
