Accounting for stochastic variables in discrete choice models

Authors: Díaz F., Cantillo V., Arellana J. and Ortúzar J. de D. (2015)

Journal: Transportation Research, 78B(2), 222-237

Keywords: Stochastic variables, Errors in variables, Discrete choice models, Mixed logit


Transportation Research Part B: Methodological
Volume 78, August 2015, Pages 222-237
Federico Díaz, Víctor Cantillo, Julian Arellana, Juan de Dios Ortúzar

Explanatory variables in discrete choice models (DCM) cannot be assumed to be deterministic.

We explore the inclusion of stochastic variables in DCM through econometric analyses.

We also test alternative model structures using simulated data and a real case databank.

With large sample sizes, an error components model can usually deal with stochasticity.

Stochasticity related to the magnitude of a variable can be captured using a random parameters model (RPM).

Mixed logit (ML) models should be interpreted carefully because of confounding effects.

The estimation of discrete choice models requires measuring the attributes describing the alternatives within each individual’s choice set. Even though some attributes are intrinsically stochastic (e.g. travel times) or are subject to non-negligible measurement errors (e.g. waiting times), they are usually assumed fixed and deterministic. Indeed, even an accurate measurement can be biased as it might differ from the original (experienced) value perceived by the individual.

Experimental evidence suggests that discrepancies between the values measured by the modeller and experienced by the individuals can lead to incorrect parameter estimates. On the other hand, there is an important trade-off between data quality and collection costs. This paper explores the inclusion of stochastic variables in discrete choice models through an econometric analysis that allows identifying the most suitable specifications. Various model specifications were experimentally tested using synthetic data; comparisons included tests for unbiased parameter estimation and computation of marginal rates of substitution. Model specifications were also tested using a real case databank featuring two travel time measurements, associated with different levels of accuracy.
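The effect of a discrepancy between measured and experienced values can be illustrated with a small synthetic experiment. The sketch below is not the authors' experimental design: the sample size, the parameter values, the noise level and the helper `fit_binary_logit` are all assumptions chosen for illustration. It generates binary choices from experienced travel times, then estimates a simple binary logit twice, once with the experienced times and once with noisy measurements of them.

```python
import numpy as np

def fit_binary_logit(X, y, iters=25):
    """Maximum likelihood for a binary logit via Newton-Raphson (hypothetical helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))        # choice probabilities
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(0)
n = 20000                               # assumed sample size
beta_time, beta_cost = -0.10, -0.05     # assumed "true" parameters

# Attribute differences between two alternatives (assumed distributions)
d_time_true = rng.normal(0.0, 10.0, n)  # experienced travel time difference
d_cost = rng.normal(0.0, 20.0, n)       # cost difference, measured without error

# Choices are generated from the experienced values
v = beta_time * d_time_true + beta_cost * d_cost
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-v))).astype(float)

# The modeller only observes a noisy measurement of travel time
d_time_meas = d_time_true + rng.normal(0.0, 10.0, n)

b_true = fit_binary_logit(np.column_stack([d_time_true, d_cost]), y)
b_meas = fit_binary_logit(np.column_stack([d_time_meas, d_cost]), y)

print("estimated with experienced times:", b_true)
print("estimated with measured times:   ", b_meas)
```

Under these assumptions, the time coefficient estimated from the noisy measurements is pulled towards zero relative to the one estimated from the experienced values, which is the kind of incorrect parameter estimate the paragraph above refers to.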

Results show that in most cases an error components model can effectively deal with stochastic variables. A random coefficients model can only effectively deal with stochastic variables when their randomness is directly proportional to the value of the attribute. Another interesting result is the presence of confounding effects that are very difficult, if not impossible, to isolate when more flexible models are used to capture stochastic variations. Due to the presence of confounding effects when estimating flexible models, the estimated parameters should be carefully analysed to avoid misinterpretations. Also, as in previous misspecification tests reported in the literature, the Multinomial Logit model proves to be quite robust for estimating marginal rates of substitution, especially when models are estimated with large samples.
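One way to see why the two specifications suit different kinds of stochasticity is the following sketch. The notation is assumed for illustration and is not taken from the paper: $x^{*}_{iq}$ is the value of an attribute experienced by individual $q$ for alternative $i$, $x_{iq}$ is the value measured by the modeller, $\beta$ is the attribute's coefficient, and $\varepsilon_{iq}$ is the usual random error term.

```latex
% Additive discrepancy, independent of the attribute's magnitude:
%   x^{*}_{iq} = x_{iq} + \eta_{iq}, \quad E[\eta_{iq}] = 0
\begin{align}
U_{iq} &= \beta x^{*}_{iq} + \varepsilon_{iq}
        = \beta x_{iq} + \underbrace{\beta \eta_{iq}}_{\text{extra error component}} + \varepsilon_{iq}
\end{align}

% Discrepancy proportional to the attribute's magnitude:
%   x^{*}_{iq} = x_{iq}\,(1 + \nu_{iq}), \quad E[\nu_{iq}] = 0
\begin{align}
U_{iq} &= \beta (1 + \nu_{iq})\, x_{iq} + \varepsilon_{iq}
        = \underbrace{(\beta + \beta \nu_{iq})}_{\text{random coefficient}}\, x_{iq} + \varepsilon_{iq}
\end{align}
```

In the first case the discrepancy enters utility as an additional error term whose variance does not depend on $x_{iq}$, which matches the structure of an error components model. In the second case it behaves like a coefficient that varies randomly around $\beta$, which matches a random coefficients specification.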