Random Covariance Heterogeneity in Discrete Choice Models
Discrete choice modelling has developed rapidly in recent years. In particular, continuing refinements of the Generalised Extreme Value (GEV) model family have permitted the representation of increasingly complex patterns of substitution, and parallel advances in estimation capability have led to the increased use of model forms requiring simulation in estimation and application. One model form in particular, the Mixed Multinomial Logit (MMNL) model, is being used ever more widely. Aside from allowing for random variations in tastes across decision-makers in a Random Coefficients Logit (RCL) framework, this model also allows for the representation of inter-alternative correlation and heteroscedasticity in an Error Components Logit (ECL) framework, enabling it to approximate any Random Utility model arbitrarily closely.
While these developments have led to gradual gains in modelling flexibility, little effort has gone into developing model forms that represent heterogeneity across respondents in the correlation structure between alternatives. Such correlation heterogeneity is, however, potentially a crucial factor in the variation of choice-making behaviour across decision-makers, given the potential presence of individual-specific terms in the unobserved part of the utility of multiple alternatives. To the authors' knowledge, there has so far been only one application of a model allowing for such heterogeneity, by Bhat (1997). In this Covariance Nested Logit (NL) model, the logsum parameters are themselves a function of socio-demographic attributes of the decision-makers, so that the correlation heterogeneity is explained with the help of these attributes. While the results by Bhat show the presence of statistically significant levels of covariance heterogeneity, the improvements in model performance are almost negligible.
While it is possible to interpret this as a lack of covariance heterogeneity in the data, another explanation is possible. A major part of the covariance heterogeneity may not be explainable in a deterministic fashion, either because of data limitations or because of the presence of actual random variation, analogous to the case of random taste heterogeneity that cannot be explained deterministically. In this paper, we propose two different ways of modelling such random variations in the correlation structure across individuals. The first approach is based on the use of an underlying GEV structure, while the second approach consists of an extension of the ECL model.
In the former approach, the choice probabilities are given by integration of underlying GEV choice probabilities, such as Nested Logit, over the assumed distribution of the structural parameters. In the most basic specification, the structural parameters are specified as simple random variables, where appropriate choices of statistical distributions and/or mathematical transforms guarantee that the resulting structural parameters fall into the permissible range of values. Several extensions are then discussed in the paper that allow for a mixture of random and deterministic variations in the correlation structure.
In an ECL model, correlation across alternatives is introduced with the help of normally distributed error-terms with a mean of zero that are shared by alternatives that are closer substitutes for each other, with the extent of correlation being determined by the estimates of the standard deviations of the error-components. The extension of this model to a structure allowing for random covariance heterogeneity is again divided into two parts.
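As a rough illustration of the GEV-based approach, the following minimal sketch simulates choice probabilities for a two-level Nested Logit model whose structural (logsum) parameters vary randomly across individuals. It is a sketch under stated assumptions rather than the specification used in the paper: the logistic transform of a normal draw (used to keep each structural parameter between 0 and 1), the function names `nested_logit_probs` and `cgev_probs`, and the parameters `mu` and `sigma` are all hypothetical choices made for the example.

```python
import numpy as np

def nested_logit_probs(V, nests, lam):
    """Two-level Nested Logit probabilities.

    V     : (J,) systematic utilities
    nests : (J,) integer nest index of each alternative
    lam   : (M,) structural parameter of each nest, in (0, 1]
    """
    V = np.asarray(V, dtype=float)
    nests = np.asarray(nests)
    lam = np.asarray(lam, dtype=float)
    scaled = np.exp(V / lam[nests])  # exp(V_j / lambda_m(j))
    nest_sums = np.bincount(nests, weights=scaled, minlength=lam.size)
    numer = scaled * nest_sums[nests] ** (lam[nests] - 1.0)
    denom = np.sum(nest_sums ** lam)
    return numer / denom

def cgev_probs(V, nests, mu, sigma, n_draws=500, seed=0):
    """Simulated probabilities with random structural parameters.

    Assumption for this sketch: lambda_m = logistic(mu_m + sigma_m * z),
    z ~ N(0, 1), so every draw lies strictly between 0 and 1.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    z = rng.standard_normal((n_draws, mu.size))
    lam_draws = 1.0 / (1.0 + np.exp(-(mu + sigma * z)))
    # Average the Nested Logit probabilities over the draws of lambda.
    probs = np.array([nested_logit_probs(V, nests, lam) for lam in lam_draws])
    return probs.mean(axis=0)

# Illustrative values only: alternatives 0 and 1 share a nest.
V = [0.5, 0.2, 0.0]
nests = [0, 0, 1]
p = cgev_probs(V, nests, mu=[0.0, 0.0], sigma=[1.0, 0.0])
print(p, p.sum())
```

Setting `sigma` to zero recovers a standard Nested Logit model with fixed structural parameters, and fixing every structural parameter at 1 recovers Multinomial Logit. The sketch does not guard against numerical overflow for large utilities or very small structural parameters.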
In the first approach, correlation is assumed to vary purely randomly; this is obtained through simple integration over the distribution of the standard deviations of the error-terms, with this outer integration sitting above the integration over the distribution of the error-components, which is conditional on a specific draw for the standard deviations. The second extension is similar to the one used in the GEV case, with the standard deviations being composed of a deterministic term and a random term, either as a pure deviation or in the form of random coefficients in the parameterisation of the distribution of the standard deviations.
We next show that our Covariance GEV (CGEV) model generalises all existing GEV model structures, while the Covariance ECL (CECL) model can theoretically approximate all Random Utility models arbitrarily closely. Although this also means that the CECL model can closely replicate the behaviour of the CGEV model, there are some differences between the two models, which can be related to the differences in the underlying error-structure of the base models (GEV vs ECL). The CECL model has the advantage of implicitly allowing for heteroscedasticity, although this is also possible with the CGEV model by adding appropriate error-components, leading to an EC-CGEV model. In terms of estimation, the CECL model has a run-time advantage for basic nesting structures, when the number of error-components, and hence dimensions of integration, is low enough not to counteract the gains from a more straightforward integrand (MNL vs advanced GEV). However, in more complicated structures, this advantage disappears, in a situation analogous to that of Mixed GEV models compared to ECL models. A final disadvantage of the CECL model structure comes in the form of an additional set of identification conditions. The paper presents applications of these model structures to both cross-sectional and panel datasets from the field of travel behaviour analysis.
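The purely random variant of the ECL extension can be sketched in a similar way. In the minimal example below, each simulation draw first samples the standard deviations of the error-components and then samples the error-components themselves, so that the average of the resulting MNL probabilities approximates the double integral. The lognormal form for the standard deviations, the function names `mnl_probs` and `cecl_probs`, and the parameters `a` and `b` are assumptions made for the illustration, not details taken from the paper.

```python
import numpy as np

def mnl_probs(U):
    """Multinomial Logit probabilities along the last axis."""
    U = U - U.max(axis=-1, keepdims=True)  # stabilise the exponentials
    e = np.exp(U)
    return e / e.sum(axis=-1, keepdims=True)

def cecl_probs(V, D, a, b, n_draws=2000, seed=0):
    """Simulated probabilities for an ECL model with random standard deviations.

    V : (J,)   systematic utilities
    D : (J, K) 0/1 indicator, D[j, k] = 1 if alternative j carries component k
    a : (K,)   mean of log(sigma_k)            (hypothetical parameterisation)
    b : (K,)   std. dev. of log(sigma_k)       (hypothetical parameterisation)
    """
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    D = np.asarray(D, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    K = a.size
    eta = rng.standard_normal((n_draws, K))  # drives the standard deviations
    xi = rng.standard_normal((n_draws, K))   # standard normal error-components
    sigma = np.exp(a + b * eta)              # positive by construction
    U = V + (sigma * xi) @ D.T               # (n_draws, J) utilities per draw
    return mnl_probs(U).mean(axis=0)

# Illustrative values only: alternatives 0 and 1 share one error-component.
V = [0.5, 0.2, 0.0]
D = [[1], [1], [0]]
p = cecl_probs(V, D, a=[0.0], b=[0.5])
print(p, p.sum())
```

With `b` set to zero, the standard deviations are fixed at `exp(a)` and the sketch collapses to a standard ECL model with a homogeneous covariance structure.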
The applications illustrate the gains in model performance that can be obtained with our proposed structures when compared to models governed by a homogeneous covariance structure assumption. As expected, the gains in performance are more important in the case of data with repeated observations for the same individual, where the notion of individual-specific substitution patterns applies more directly. The applications also confirm the slight differences between the CGEV and CECL models discussed above. The paper concludes with a discussion of how the two structures can be extended to allow for random taste heterogeneity. The resulting models thus allow for random variations in choice behaviour both in the evaluation of measured attributes and in the correlation across alternatives in the unobserved utility terms. This further increases the flexibility of the two model structures, and their potential for analysing complex behaviour in transport and other areas of research.