
An innovative approach for traffic crash estimation and prediction on accommodating unobserved heterogeneities

Authors

Listed:
  • Dong, Chunjiao
  • Shao, Chunfu
  • Clarke, David B.
  • Nambisan, Shashi S.

Abstract

Traffic crashes involve complex interactions among drivers, vehicles, roadway, traffic, and environmental elements, and not all of the factors that could determine crash occurrence can be observed and measured. New methods are therefore needed to improve traffic crash estimation and prediction and to address unobserved heterogeneity in crash data. Conventional methods are generally statistical models with the observed crash counts as the dependent variables and the factors affecting the likelihood of a traffic crash as the independent variables. In contrast, a dynamic state-space model with deep learning is proposed to analyze traffic crashes. The proposed model includes three modules: an unsupervised feature learning module that identifies the functional network between the explanatory variables and the feature representations, a supervised fine-tuning module that estimates crash occurrence likelihood, and a dynamic state-space module that predicts crash counts. A multivariate Tobit model is incorporated in the supervised fine-tuning module as the regression layer to account for heterogeneity in correlated crash data. The results of deep learning are fed to the dynamic state-space model, which contains a dynamic equation governing the state dynamics, to improve estimation and prediction performance. The proposed model was applied to a dataset from Knox County, Tennessee, to validate its effectiveness and efficiency. The results show that the proposed model outperforms the SVM and Random Forest (RF) models in estimation and prediction power. For all crashes, the proposed model shows a 50.559% RMSD improvement over the SVM models and a 57.867% RMSD improvement over the RF models.
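The abstract reports RMSD improvements as percentages but does not state how they are computed. A minimal sketch, assuming the common convention of a relative reduction in RMSD against the baseline model, could look like this. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def rmsd(observed, predicted):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_improvement(rmsd_baseline, rmsd_proposed):
    """Percentage reduction in RMSD relative to the baseline (assumed convention)."""
    return 100.0 * (rmsd_baseline - rmsd_proposed) / rmsd_baseline


# Toy numbers for illustration only; they are NOT from the paper.
observed = [2, 0, 1, 3]
baseline_pred = [4, 2, 3, 5]  # every prediction off by 2
proposed_pred = [3, 1, 2, 4]  # every prediction off by 1
improvement = rmsd_improvement(
    rmsd(observed, baseline_pred), rmsd(observed, proposed_pred)
)
```

Under this convention, a figure such as 50.559% would mean the proposed model's RMSD is roughly half that of the baseline.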
The findings indicate that the feature learning module identifies relational information between the explanatory variables and feature representations, which reduces the dimensionality of the input while preserving the original information. With a multivariate Tobit regression layer in the supervised fine-tuning module, the proposed model better accounts for differential distribution patterns in traffic crashes across injury severities and provides superior crash occurrence likelihood estimates. The findings suggest that the proposed model better addresses heterogeneity in correlated crash data and is a superior alternative for traffic crash estimation and prediction.
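To make the Tobit regression layer more concrete, the following sketch computes the log-likelihood of a standard univariate Tobit model that is left-censored at zero. This is a simplification and an assumption: the paper uses a multivariate Tobit across injury severities, and its exact specification is not given in the abstract, so the cross-equation correlation is ignored here. The helper names (`norm_logpdf`, `norm_cdf`, `tobit_loglik`) are hypothetical.

```python
import math


def norm_logpdf(z):
    """Log-density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, mu, sigma, lower=0.0):
    """Log-likelihood of a univariate Tobit model left-censored at `lower`.

    y     : observed outcomes (values at `lower` are treated as censored)
    mu    : linear predictor for each observation, e.g. x_i' * beta
    sigma : standard deviation of the normal error term
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= lower:
            # Censored: probability that the latent outcome is at or below the limit.
            total += math.log(norm_cdf((lower - mi) / sigma))
        else:
            # Uncensored: normal density of the observed outcome.
            total += norm_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total


# Illustrative values only: one censored and one uncensored observation.
example_ll = tobit_loglik(y=[0.0, 1.0], mu=[0.0, 1.0], sigma=1.0)
```

In a network with a Tobit output layer, `mu` would come from the learned feature representations, and the negative of this log-likelihood could serve as the training loss during fine-tuning. Whether the authors train the layer this way is not stated in the abstract.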

Suggested Citation

  • Dong, Chunjiao & Shao, Chunfu & Clarke, David B. & Nambisan, Shashi S., 2018. "An innovative approach for traffic crash estimation and prediction on accommodating unobserved heterogeneities," Transportation Research Part B: Methodological, Elsevier, vol. 118(C), pages 407-428.
  • Handle: RePEc:eee:transb:v:118:y:2018:i:c:p:407-428
    DOI: 10.1016/j.trb.2018.10.020

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0191261517308640
    Download Restriction: Full text for ScienceDirect subscribers only




