
Imputing trip purposes for long-distance travel


  • Yijing Lu
  • Lei Zhang

Planning and policy analysis at the national, state, and inter-regional corridor levels depends on reliable information and forecasts about long-distance travel. Emerging passive data collection technologies such as GPS, smartphones, and social media give researchers and practitioners the opportunity to supplement or replace traditional long-distance travel surveys. However, certain important trip information, such as trip purpose, travel mode, and travelers’ socio-demographic characteristics, is missing from passively collected travel data. One promising solution is to impute the missing information using supplementary data (e.g., land use) and advanced statistical or data mining algorithms. This paper develops machine learning methods, including decision tree and meta-learning, to estimate trip purposes for long-distance passenger travel. A passively collected long-distance trip dataset is simulated from the 1995 American Travel Survey for the development and validation of the machine learning methods. The predictive accuracy of the proposed methods is evaluated for several scenarios that vary the number of trip purposes and the extent of data available as inputs. This research design not only provides a practically useful approach for long-distance trip purpose imputation, but also generates valuable insights for future long-distance travel surveys. Results show that the accuracy of the trip purpose imputation methods based on all available data decreases from 95% with two purposes (business and non-business) to 77% with four purposes (business, personal business, social visit, and leisure). Under the two-purpose scheme, the predictive accuracy of the imputation algorithms decreases from 95% when all input data are used (a full-information model) to 72% with a minimum-information model that uses only the passively collected data. If travelers’ socio-demographic characteristics are available (possibly through other imputation models), the predictive accuracy decreases only from 95% to 91%.

Copyright Springer Science+Business Media New York 2015
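
The accuracy figures in the abstract depend on how finely trip purposes are distinguished. The following is a minimal sketch of that evaluation step only, not the authors' classifier or data. It assumes that the two-purpose scheme treats every purpose other than "business" as "non-business" (the abstract does not spell out the mapping), and the trip labels are invented for illustration. The names `FOUR_TO_TWO`, `accuracy`, and `collapse` are hypothetical.

```python
# Hypothetical mapping from the four-purpose scheme to the two-purpose scheme.
# Assumption: everything except "business" is treated as "non-business".
FOUR_TO_TWO = {
    "business": "business",
    "personal business": "non-business",
    "social visit": "non-business",
    "leisure": "non-business",
}


def accuracy(reported, imputed):
    """Share of trips whose imputed purpose equals the reported purpose."""
    if len(reported) != len(imputed):
        raise ValueError("label lists must have the same length")
    if not reported:
        raise ValueError("need at least one trip")
    hits = sum(r == i for r, i in zip(reported, imputed))
    return hits / len(reported)


def collapse(labels):
    """Map four-purpose labels onto the two-purpose scheme."""
    return [FOUR_TO_TWO[label] for label in labels]


# Toy labels, invented for illustration (not from the 1995 American Travel Survey).
reported = ["business", "leisure", "social visit", "personal business",
            "leisure", "business", "social visit", "leisure"]
imputed = ["business", "social visit", "social visit", "leisure",
           "leisure", "business", "leisure", "leisure"]

acc_four = accuracy(reported, imputed)                      # 5 of 8 trips match
acc_two = accuracy(collapse(reported), collapse(imputed))   # 8 of 8 trips match

print(f"four-purpose accuracy: {acc_four:.3f}")
print(f"two-purpose accuracy:  {acc_two:.3f}")
```

In this toy case every error is a confusion among the three non-business purposes, so collapsing the labels removes all of them. The paper presumably trains a separate model per scheme rather than collapsing predictions after the fact, so the sketch only illustrates why a coarser scheme tends to score higher.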

Suggested Citation

  • Yijing Lu & Lei Zhang, 2015. "Imputing trip purposes for long-distance travel," Transportation, Springer, vol. 42(4), pages 581-595, July.
  • Handle: RePEc:kap:transp:v:42:y:2015:i:4:p:581-595
    DOI: 10.1007/s11116-015-9595-0





