Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models. (July 2020)
- Record Type:
- Journal Article
- Title:
- Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models. (July 2020)
- Main Title:
- Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models
- Authors:
- Zhao, Xilei
Yan, Xiang
Yu, Alan
Van Hentenryck, Pascal
- Abstract:
- Highlights: A comprehensive comparison between machine learning and logit models is provided. Random forest model has much higher predictive accuracy compared to multinomial logit model and mixed logit model. Machine learning and logit models agree on many aspects of the behavioral outputs. Applying a standard approach to generate marginal effects and arc elasticities for random forest leads to unreasonable estimates. Random forest captures nonlinear relationships between the input features and choice outcomes.
Abstract: Some recent studies have shown that machine learning can achieve higher predictive accuracy than logit models. However, existing studies rarely examine behavioral outputs (e.g., marginal effects and elasticities) that can be derived from machine-learning models and compare the results with those obtained from logit models. In other words, there has not been a comprehensive comparison between logit models and machine learning that covers both prediction and behavioral analysis, two equally important subjects in travel-behavior study. This paper addresses this gap by examining the key differences in model development, evaluation, and behavioral interpretation between logit and machine-learning models for mode-choice modeling. We empirically evaluate the two approaches using stated-preference survey data. Consistent with the literature, this paper finds that the best-performing machine-learning model, random forest, has significantly higher predictive accuracy than multinomial logit and mixed logit models. The random forest model and the two logit models largely agree on several aspects of the behavioral outputs, including variable importance and the direction of association between independent variables and mode choice. However, we find that the random forest model produces behaviorally unreasonable arc elasticities and marginal effects when these behavioral outputs are computed from a standard approach. After the introduction of some modifications that overcome the limitations of tree-based models, the results are improved to some extent. There appears to be a tradeoff between predictive accuracy and behavioral soundness when choosing between machine learning and logit models in mode-choice modeling.
- Is Part Of:
- Travel behaviour and society. Volume 20(2020)
- Journal:
- Travel behaviour and society
- Issue:
- Volume 20(2020)
- Issue Display:
- Volume 20, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 20
- Issue:
- 2020
- Issue Sort Value:
- 2020-0020-2020-0000
- Page Start:
- 22
- Page End:
- 35
- Publication Date:
- 2020-07
- Subjects:
- Machine learning -- Logit models -- Mobility-on-demand -- Stated-preference survey -- Travel mode choice -- Travel behavior
Transportation -- Periodicals
Population geography -- Periodicals
303.48305
- Journal URLs:
- http://www.sciencedirect.com/science/journal/2214367X
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.tbs.2020.02.003
- Languages:
- English
- ISSNs:
- 2214-367X
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13359.xml