Model Validation and Reasonableness Checking Manual
5.0 Mode Choice / Auto Occupancy
5.1 Model Description
The treatment of modal choice can vary a great deal by region. For regions with limited transit facilities, it may be sufficient to apply a mode split factor to person trips to account for the percentage using transit. It may even be possible to ignore public transportation trips completely if they constitute a very small portion of regional travel. Where a mode split factor is used, it should be reviewed against available local transit ridership figures for reasonableness.
Appendix A contains travel to work characteristics for the 50 largest metropolitan areas in the U.S. The portion of work trips using transit varies from less than 1% in Fort Worth, Texas to nearly 50% in New York City. Thus, local characteristics are very important in determining mode split.
The remainder of this chapter will focus on mode choice models which constitute best practice for metropolitan areas with significant transit service. Mode choice models represent traveler decisions about which vehicular mode to use as a function of level-of-service (LOS) characteristics of the mode and traveler and household characteristics. The mode choice component should be adequately designed and constructed to address the data and informational requirements of regional system planning. The level of detail and precision required in the mode choice model needs to be sufficient to answer policy issues such as the impacts of rail, HOV, pricing strategies, and non-motorized travel.
Two types of discrete choice models are prevalent today: multinomial logit models and nested logit models. A multinomial logit model assumes equally competing alternatives, which allows the "shifting" of trips to and from other modes in proportion to the initial estimate of these modes. A nested logit model recognizes the potential for something other than equal competition among modes. This structure assumes that modes and submodes are distinctly different types of alternatives that present distinct choices to travelers. Its most important departure from the multinomial structure is that the lower level choices are more elastic than they would be in the multinomial structure. For example, this model structure would assume that a person is more sensitive to the mode of access to the transit system than to the decision between auto and transit. Discrete choice models may be estimated on aggregate (zone-level) data or disaggregate (household-level) data, and the most recent modeling efforts have focused on disaggregate nested logit models.
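The difference between the two structures can be illustrated with a small numeric sketch. The utilities, the two-nest structure, and the nesting coefficient below are hypothetical values chosen only for illustration; they are not taken from any of the models cited in this manual. The sketch improves the walk-access transit utility and compares how each structure draws trips from the remaining modes.

```python
import math

def mnl_shares(utils):
    """Multinomial logit: P_i = exp(V_i) / sum_j exp(V_j)."""
    denom = sum(math.exp(v) for v in utils.values())
    return {mode: math.exp(v) / denom for mode, v in utils.items()}

def nested_shares(utils, nests, theta):
    """Two-level nested logit.

    nests maps nest name -> list of modes; theta maps nest name -> nesting
    coefficient in (0, 1]. With every theta equal to 1 this reduces to MNL.
    """
    logsum = {
        nest: math.log(sum(math.exp(utils[m] / theta[nest]) for m in modes))
        for nest, modes in nests.items()
    }
    denom = sum(math.exp(theta[n] * logsum[n]) for n in nests)
    shares = {}
    for nest, modes in nests.items():
        p_nest = math.exp(theta[nest] * logsum[nest]) / denom
        for m in modes:
            p_mode_given_nest = math.exp(utils[m] / theta[nest] - logsum[nest])
            shares[m] = p_nest * p_mode_given_nest
    return shares

# Hypothetical utilities and nesting structure (illustrative values only).
BASE = {"auto": 0.0, "walk_transit": -1.0, "drive_transit": -1.5}
IMPROVED = dict(BASE, walk_transit=-0.5)  # better walk-access transit service
NESTS = {"auto": ["auto"], "transit": ["walk_transit", "drive_transit"]}
THETA = {"auto": 1.0, "transit": 0.5}

if __name__ == "__main__":
    for label, utils in (("base", BASE), ("improved", IMPROVED)):
        print(label, "MNL   ", mnl_shares(utils))
        print(label, "nested", nested_shares(utils, NESTS, THETA))
```

In the multinomial case the ratio of drive-access transit to auto is unchanged by the improvement, so both lose trips in proportion to their initial shares. In the nested case the drive-access submode loses proportionally more than auto, which is the greater lower-level elasticity described above.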
Mode choice models require a number of inputs, many of which are produced in earlier steps in the modeling process. Variables which are typically included are transit travel time (out-of-vehicle, in-vehicle, walk time, wait time), number of transfers, highway travel time, transit fare, auto costs, household income and/or auto ownership, household size, number of workers, and land use characteristics. All of these inputs should be reviewed for reasonableness and compared with observed values. The New Orleans model validation included a comparison of the system variables, such as time and cost, by trip purpose.
As part of the model estimation process, it is useful to check the reasonableness of mode choice parameters by comparing with other regions. Tables 5-1 and 5-2 list some parameters from a number of cities for work and non-work models.
5.2 Disaggregate Validation
Disaggregate validation provides a means of exploring in detail how well a candidate mode choice model fits the observed data. It involves defining subgroups of observations, based, for example, on ranges of trip distance and household auto ownership levels. The model-predicted choices for these subgroups are then compared with the observed choices. Systematic biases revealed by these comparisons suggest the need for new variables or other changes in the utility functions for each mode. Thus, the model estimation and disaggregate validation subtasks are best carried out iteratively before final model specifications are selected.
Ideally, disaggregate validation is performed using a sample of travel observations which is independent of that used for model estimation. For the validation of the Southern California models, the data set from a large household survey and on-board survey was split into two parts, one for model estimation and one for validation. In some cases, a validation data set might be available from other sources (e.g. PUMS).
Even if a separate data set is not available, disaggregate validation can be performed using the same data set used for model estimation. Models can be applied to segments of the data set using the model estimation program to identify biases. For example, suppose a mode choice model is validated by auto ownership level. The validation might show that transit share is overestimated for zero car households in the suburbs. A possible solution would be to add variables where auto ownership interacts with area type (possibly replacing existing separate variables for area type and auto ownership).
Table 5-1
Review of Mode Choice Coefficients For Home Based Work Trips
| Coefficients on Service Level Variables From a Sample of Home Based Work Mode Choice Model | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| City | Survey Year | In-Vehicle Time | Tran Drv Acc Time | Out-of-Vehicle Time | Hwy Term Time | Trn Walk Time | Tran Xfer Time | Cost | Auto Oper Cost | Tran Fare | Park Cost |
| New Orleans | 1960 | -0.015 | -0.100 | (M) | -0.033 | -0.077 | -0.032 | -0.0080 | |||
| Minn/St. Paul | 1970 | -0.031 | (M) | -0.044 | -0.030 | -0.044 | -0.0140 | ||||
| Chicago | 1970 | -0.028 | -0.030 | -0.114 | -0.023 | -0.114 | -0.0121 | ||||
| Los Angeles | 1975 | -0.020 | -0.112 | -0.0144 | |||||||
| Seattle | 1977 | -0.040 | -0.286 | (M) | -0.044 | -0.03 | -0.044 | -0.0140 | |||
| Cincinnati | 1978 | -0.019 | -0.028 | -0.0045 | |||||||
| Washington | 1980 | -0.017 | -0.058 | -0.004 | -0.004 | -0.009 | |||||
| San Francisco | 1980 | -0.025 | -0.058 | -0.0039 | |||||||
| Dallas | 1984 | -0.030 | -0.055 | -0.055 | -0.055 | -0.055 | -0.059 | -0.005 | -0.005 | -0.012 | |
| Shirley (low) | 1984 | -0.022 | -0.035 | -0.0037 | |||||||
| Shirley (high) | 1984 | -0.034 | -0.044 | -0.0046 | |||||||
| Value of Time with the CPI Adjusted to 1979 | ||||||
|---|---|---|---|---|---|---|
| City | Survey Year | CPI Index | C(ivt) / C(cost) | C(ivt) / C(oper) | C(ivt) / C(fare) | C(ivt) / C(park) |
| New Orleans | 1960 | 29.6 | 2.76 | |||
| Minn/St. Paul | 1970 | 38.8 | 2.48 | |||
| Chicago | 1970 | 38.8 | 2.56 | |||
| Los Angeles | 1975 | 53.8 | 1.12 | |||
| Seattle | 1977 | 59.5* | 2.09 | |||
| Cincinnati | 1978 | 65.2 | 2.84 | |||
| Washington | 1980 | 82.4 | 2.61 | 2.08 | 0.97 | |
| San Francisco | 1980 | 82.4 | 3.47 | |||
| Dallas | 1984 | 103.9 | 2.68 | 2.68 | 1.07 | |
| Shirley (low) | 1984 | 103.9 | 2.29 | |||
| Shirley (high) | 1984 | 103.9 | 3.74 | |||
| Value of Time as Percent of Median Income | ||||||
|---|---|---|---|---|---|---|
| City | Survey Year | 1979 Median Income | C(ivt) / C(cost) | C(ivt) / C(oper) | C(ivt) / C(fare) | C(ivt) / C(park) |
| New Orleans | 1960 | 18,933 | 30.31 | |||
| Minn/St. Paul | 1970 | 24,879 | 20.77 | |||
| Chicago | 1970 | 24,301 | 21.92 | |||
| Los Angeles | 1975 | 22,041 | 10.60 | |||
| Seattle | 1977 | 21,000* | 20.31 | |||
| Cincinnati | 1978 | 21,552 | 27.43 | |||
| Washington | 1980 | 27,885 | 19.49 | 15.50 | 7.26 | |
| San Francisco | 1980 | 24,599 | 29.36 | |||
| Dallas | 1984 | 22,033 | 25.25 | 25.25 | 10.11 | |
| Shirley (low) | 1984 | 27,885 | 16.75 | |||
| Shirley (high) | 1984 | 27,885 | 27.36 | |||
(M) Multiple Coefficients Depending on Car Occupancy
* Estimated
CPI for 1979 was 72.6
Sources:
- Parsons, Brinckerhoff Quade & Douglas, Inc., "Review of Best Practices," Washington, DC (1992)
- KPMG Peat Marwick, "Compendium of Travel Demand Forecasting Methodologies," Prepared for Federal Transit Administration, Washington, DC (February 1992)
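The derived ratios in Table 5-1 can be checked with a short calculation. The sketch below assumes that time coefficients are per minute and cost coefficients are per cent, and that the percent-of-income figure uses a 2,080-hour work year. These units are not stated in the table; they are inferred because they reproduce the New Orleans and Los Angeles entries to within rounding. Several other rows do not reproduce exactly from the rounded coefficients shown, so the formula should be read as an inferred interpretation rather than the documented method.

```python
def value_of_time(c_ivt, c_cost, cpi_survey, cpi_base=72.6):
    """Value of in-vehicle time in base-year dollars per hour.

    Assumes c_ivt is per minute and c_cost is per cent (inferred units).
    cpi_base defaults to the 1979 CPI noted under Table 5-1.
    """
    cents_per_minute = c_ivt / c_cost
    dollars_per_hour = cents_per_minute * 60.0 / 100.0
    return dollars_per_hour * cpi_base / cpi_survey

def vot_percent_of_income(vot, median_income, hours_per_year=2080):
    """Value of time as a percent of an hourly rate derived from annual income."""
    hourly_income = median_income / hours_per_year
    return 100.0 * vot / hourly_income

if __name__ == "__main__":
    # New Orleans row of Table 5-1: C(ivt) = -0.015, C(cost) = -0.0080, CPI = 29.6
    vot = value_of_time(-0.015, -0.0080, 29.6)
    pct = vot_percent_of_income(vot, 18933)
    print(f"New Orleans: ${vot:.2f}/hour, {pct:.2f}% of hourly median income")
```

A check of this kind is useful when comparing a newly estimated model against the table: a value of time far outside the reported range suggests reviewing the time and cost coefficients.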
Table 5-2
Review of Mode Choice Coefficients For Home Based Non-Work and Non-Home Based Trips
| Coefficients on Service Level Variables From a Sample of Home Based Other Mode Choice Model | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| City | Survey Year | In-Vehicle Time | Tran Drv Acc Time | Out-of-Vehicle Time | Hwy Term Time | Trn Walk Time | Tran Xfer Time | Cost | Auto Oper Cost | Tran Fare | Park Cost |
| New Orleans | 1960 | -0.0066 | -0.0165 | -0.340 | -0.012 | -0.012 | -0.0319 | ||||
| Minn/St. Paul | 1970 | -0.0080 | -0.0200 | (M) | -0.818* | -0.012 | |||||
| Seattle | 1977 | -0.0080 | -0.200 | -0.0200 | (M) | -0.135* | -0.035 | ||||
| St. Louis | N/A | -0.238 | -0.0595 | -0.018 | |||||||
| Honolulu | N/A | -0.101 | -0.041 | -0.041 | |||||||
| San Juan | N/A | -0.0050 | -0.060 | -0.061 | -0.061 | -0.005 | |||||
| * Coefficient on the number of transfers | |||||||||||
| Value of Time (Using only the original coefficients) | ||||||
|---|---|---|---|---|---|---|
| City | Survey Year | C(ivt) / C(cost) | C(ivt) / C(oper) | C(ivt) / C(fare) | C(ivt) / C(park) | C(cost) Work / C(cost) Non-Work |
| New Orleans | 1960 | 0.33 | 0.33 | 0.12 | 0.67/0.25 | |
| Minn/St. Paul | 1970 | 0.40 | 1.17 | |||
| Seattle | 1977 | 0.14 | 0.40 | |||
| St. Louis | N/A | 0.79 | 0.46 | |||
| Honolulu | N/A | N/A | N/A | |||
| San Juan | N/A | 0.60 | 0.48 | |||
| Coefficients on Service Level Variables From a Sample of Non-Home Based Mode Choice Model | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| City | Survey Year | In-Vehicle Time | Tran Drv Acc Time | Out-of-Vehicle Time | Hwy Term Time | Trn Walk Time | Tran Xfer Time | Cost | Auto Oper Cost | Tran Fare | Park Cost |
| New Orleans | 1960 | -0.0131 | -0.0328 | -0.242 | -0.075* | -0.005 | -0.005 | -0.0291 | |||
| Minn/St. Paul | 1970 | -0.0100 | -0.0250 | (M) | -0.004 | ||||||
| Seattle | 1977 | -0.0200 | -0.198 | -0.0250 | (M) | -0.031 | |||||
| St. Louis | N/A | -0.0230 | -0.0575 | -0.011 | |||||||
| Honolulu | N/A | N/A | -0.126 | -0.040 | -0.040 | ||||||
| San Juan | N/A | -0.0100 | -0.119 | -0.026 | -0.026 | -0.002 | |||||
| * Coefficient on the number of transfers | |||||||||||
| Value of Time (Using only the original coefficients) | ||||||
|---|---|---|---|---|---|---|
| City | Survey Year | C(ivt) / C(cost) | C(ivt) / C(oper) | C(ivt) / C(fare) | C(ivt) / C(park) | C(cost) Work / C(cost) Non-Work |
| New Orleans | 1960 | 15.72 | 15.72 | 0.27 | 1.60/0.275 | |
| Minn/St. Paul | 1970 | 1.50 | 3.50 | |||
| Seattle | 1977 | 0.39 | 0.45 | |||
| St. Louis | N/A | 1.25 | 0.76 | |||
| Honolulu | N/A | N/A | N/A | |||
| San Juan | N/A | 2.00 | ||||
Sources:
- Parsons, Brinckerhoff Quade & Douglas, Inc., "Review of Best Practices," Washington, DC (1992)
- KPMG Peat Marwick, "Compendium of Travel Demand Forecasting Methodologies," Prepared for Federal Transit Administration, Washington, DC (February 1992)
Disaggregate validation can be performed using subsets of the observations based on ranges of the following variables:
- Household characteristics such as household size, income level, number of workers, and auto ownership;
- Traveler characteristics such as age, gender, driver license status, and employment status;
- Zonal characteristics such as geographical location, area type, population density, and parking costs; and
- Trip characteristics such as trip distance, time, and cost.
Tables 5-3 and 5-4 present an example of disaggregate validation performed for a mode choice model in the Los Angeles area. A multinomial logit mode choice model with nine alternatives was estimated for home based work trips from a combined data set from household and on-board surveys. This model was validated by applying it to the estimation data set. The results (the number of survey respondents who chose each mode versus the number predicted by the model) were tabulated for market segments representing auto ownership and income levels. This type of validation procedure was available in the model estimation software.
The row totals of each table show that the overall performance of the model in estimating mode shares across the population is good. Although there are cells in both tables where the predicted number of users of a mode differs significantly from the number who chose that mode in the surveys, there are no systematic biases. For example, although the predicted number of users of each auto mode differs from the observed for 1-car households as shown in Table 5-3, the model slightly overpredicts auto use for the drive alone and shared ride 2 modes while it slightly underpredicts auto passengers and shared ride 3+. The predicted shares for auto for both 0-car and 2-car households, however, are very close to observed values. This indicates a lack of systematic bias. If, for example, the model showed that auto use was consistently overpredicted for multiple car households, additional auto ownership-related variables could be tested in the model structure.
It should be noted that the non-integer values for the number chosen in each cell reflect the weighting done in the expansion of the survey data set.
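The tabulation described above can be sketched in a few lines. The record layout (a segment variable, the chosen mode, an expansion weight, and model probabilities for each mode) and the sample values are hypothetical; the Los Angeles validation used the model estimation software for this step. Observed counts are sums of expansion weights for the chosen mode, and predicted counts are sums of weight times predicted probability, which is why both can be non-integer.

```python
from collections import defaultdict

def tabulate_by_segment(records, segment_key):
    """Return (observed, predicted) weighted counts by segment and mode."""
    observed = defaultdict(lambda: defaultdict(float))
    predicted = defaultdict(lambda: defaultdict(float))
    for rec in records:
        segment = rec[segment_key]
        weight = rec["weight"]
        # Observed: the full expansion weight goes to the mode actually chosen.
        observed[segment][rec["chosen"]] += weight
        # Predicted: the weight is spread over modes by model probability.
        for mode, prob in rec["probs"].items():
            predicted[segment][mode] += weight * prob
    return observed, predicted

# Hypothetical survey records with model-predicted probabilities.
RECORDS = [
    {"autos": 0, "chosen": "transit", "weight": 1.5,
     "probs": {"drive_alone": 0.2, "transit": 0.8}},
    {"autos": 0, "chosen": "drive_alone", "weight": 2.0,
     "probs": {"drive_alone": 0.5, "transit": 0.5}},
    {"autos": 1, "chosen": "drive_alone", "weight": 1.0,
     "probs": {"drive_alone": 0.9, "transit": 0.1}},
]

if __name__ == "__main__":
    obs, pred = tabulate_by_segment(RECORDS, "autos")
    for segment in sorted(obs):
        for mode in sorted(pred[segment]):
            print(segment, mode,
                  round(obs[segment][mode], 2), round(pred[segment][mode], 2))
```

Differences that share the same sign across several segments of one variable are the systematic biases to look for; isolated differences in single cells are less of a concern.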
Table 5-3
HBW Classification by Automobiles per Household

Table 5-4
HBW Classification by Household Income

Sensitivity Tests
Typically, when mode choice models are estimated, the model coefficients, derived ratios, and model elasticities are compared to those from other regions. The comparison of model coefficients and derived variables can be considered both a validation check and a sensitivity check. If model coefficients (and constants) and derived ratios are in the range of what has been reported elsewhere, the model sensitivity should be similar to models used in other regions.
A common sensitivity test for mode choice models is the direct or cross elasticities of the model. Elasticities can be used to estimate the percent change in demand given a percent change in supply. As with the values of the model coefficients and derived ratios, elasticities can be considered as both validation and sensitivity tests. For example, a well-known rule-of-thumb for transit fare elasticity is the Simpson-Curtin Rule. This states that transit fare elasticity is about -0.3. In other words, a 10 percent increase in transit fare will result in about a 3 percent decrease in transit ridership. While the report is somewhat dated, elasticities derived from models and from empirical studies can be found in Patronage Impacts of Changes in Transit Fares and Services, Ecosometrics (1980).
Sensitivity tests can be made on model elasticities for fares, in-vehicle travel time, out-of-vehicle travel time, and transfers. Additional mode choice model sensitivity tests examine changes in transit mode shares relative to changes in transit fares and travel time. Sensitivity tests are performed by applying the model with unit changes in variables, e.g. a $0.25 increase in transit fare or a 10% increase in auto travel time.
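A fare sensitivity test of the kind described above can be sketched as follows. The binary logit form, the cost coefficient, the non-fare utility difference, and the base fare are all hypothetical values chosen for illustration. Total person trips are held fixed, so the percent change in transit share stands in for the percent change in ridership. The point elasticity uses the standard logit expression, coefficient times variable times (1 - share).

```python
import math

def transit_share(fare_cents, c_cost=-0.005, other_utility=-1.0):
    """Binary logit transit share, with auto utility normalized to zero."""
    v_transit = other_utility + c_cost * fare_cents
    return 1.0 / (1.0 + math.exp(-v_transit))

def fare_sensitivity(base_fare, fare_increase, c_cost=-0.005, other_utility=-1.0):
    """Apply the model before and after a fare change.

    Returns (base share, new share, percent change in share divided by
    percent change in fare).
    """
    p0 = transit_share(base_fare, c_cost, other_utility)
    p1 = transit_share(base_fare + fare_increase, c_cost, other_utility)
    pct_share = (p1 - p0) / p0
    pct_fare = fare_increase / base_fare
    return p0, p1, pct_share / pct_fare

def point_elasticity(fare_cents, c_cost, share):
    """Direct point elasticity of a logit share with respect to fare."""
    return c_cost * fare_cents * (1.0 - share)

if __name__ == "__main__":
    # A $0.25 increase on a hypothetical $1.00 base fare.
    p0, p1, elasticity = fare_sensitivity(base_fare=100.0, fare_increase=25.0)
    print(f"share {p0:.3f} -> {p1:.3f}, elasticity {elasticity:.2f}")
    print(f"point elasticity {point_elasticity(100.0, -0.005, p0):.2f}")
```

The resulting elasticity can then be compared with reported values such as the Simpson-Curtin figure of about -0.3.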
Although disaggregate validation has been discussed in the Mode Choice section of this manual, it should be done for all disaggregately estimated models. Examples of other discrete choice models where this applies include visitor or destination choice models, and auto ownership models.
5.3 Aggregate Validation
To validate the models at the aggregate level, the models should be applied to calibration year person trip tables and LOS input data. Mode shares by trip purpose should be subdivided into submode shares by purpose if, for example, the mode choice model estimates transit trips for walk access and drive access trips separately. The resulting trips by mode should be compared with secondary data sources such as:
- Available transit ridership, highway vehicle, and auto occupancy counts at screenlines by time of day;
- 1990 Census Journey-to-Work data on trips by mode and origin and destination district;
- Total patronage by transit mode; and
- Counts of transit patrons by access mode at major stations serving transfers between auto and feeder bus and express transit services.
These comparisons may lead to the specification of adjustments to the models' modal constants and market segmentation procedures to ensure that aggregate versions of the models accurately replicate the observed data.
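One commonly used heuristic for adjusting modal constants, which this manual does not prescribe, is to shift each constant by the natural log of the ratio of observed to modeled share, measured relative to a reference mode whose constant is held at zero. The sketch below applies that heuristic to a single aggregate market with hypothetical utilities and target shares. In a real application the shares would come from applying the model to the full person trip tables, and several iterations would normally be needed.

```python
import math

def modeled_shares(constants, base_utils):
    """Logit shares from base utilities plus mode-specific constants."""
    expu = {m: math.exp(base_utils[m] + constants[m]) for m in base_utils}
    total = sum(expu.values())
    return {m: e / total for m, e in expu.items()}

def calibrate_constants(base_utils, target, reference, iterations=5):
    """Adjust constants so modeled shares approach target (observed) shares.

    The reference mode's constant stays at zero; every other constant is
    shifted by ln(target/modeled) relative to the reference mode.
    """
    constants = {m: 0.0 for m in base_utils}
    for _ in range(iterations):
        shares = modeled_shares(constants, base_utils)
        ref_adjust = math.log(target[reference] / shares[reference])
        for m in constants:
            if m != reference:
                constants[m] += math.log(target[m] / shares[m]) - ref_adjust
    return constants

# Hypothetical utilities (excluding constants) and observed regional shares.
BASE_UTILS = {"drive_alone": 0.0, "shared_ride": -0.8, "transit": -1.2}
OBSERVED = {"drive_alone": 0.75, "shared_ride": 0.18, "transit": 0.07}

if __name__ == "__main__":
    constants = calibrate_constants(BASE_UTILS, OBSERVED, reference="drive_alone")
    print(constants)
    print(modeled_shares(constants, BASE_UTILS))
```

Large adjustments from a procedure like this are themselves a warning sign, since they suggest the utility functions are not explaining the observed behavior.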
Additional aggregate validation checks which should be made of mode choice models are:
- Average auto occupancies by trip purpose (see Table 5-5)
- Percent single occupant vehicles (SOVs) by trip purpose
- Home-based work transit trips as a percent of total transit trips
- Mode shares to/from area types or major districts
- Average auto occupancies to/from area types or major districts
An example of mode share by market segment is the share of transit trips using walk access versus auto access. An example of mode shares to/from a particular area type is mode shares of work trips destined for the CBD. Conversely, the share of total transit trips destined for the CBD can be checked.
In analysis of future year alternatives, travel models are often used to evaluate the introduction of a new mode, such as a light rail system. The introduction of a new mode will clearly have an impact on the mode choice validation. One would expect the new rail mode to shift trips from existing transit modes, and (possibly to a lesser extent) shift trips from auto to transit. In evaluating the reasonableness of mode choice results, it is important to consider the underlying model structure.
5.4 Auto Occupancy
Changes in auto occupancy can result in significant changes in the number of vehicle trips assigned. If auto occupancy rates are used to convert person trips to vehicle trips, these rates can easily be adjusted in validation. Increasing auto occupancy decreases the number of vehicle trips. As shown in Table 5-5, auto occupancies have generally been decreasing. Auto occupancy factors are typically developed from household travel surveys based on the reported number of person trips divided by auto driver trips.
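The conversion and the survey-based factor described above amount to two simple ratios, sketched here. The person trip totals are hypothetical. The occupancy values are the 1990 figures from Table 5-5, used only for illustration, since that table reports person miles per vehicle mile rather than trip-based factors.

```python
def occupancy_from_survey(person_trips, auto_driver_trips):
    """Auto occupancy factor: reported person trips per auto driver trip."""
    return person_trips / auto_driver_trips

def vehicle_trips(person_trips_by_purpose, occupancy_by_purpose):
    """Convert auto person trips to vehicle trips by trip purpose."""
    return {
        purpose: trips / occupancy_by_purpose[purpose]
        for purpose, trips in person_trips_by_purpose.items()
    }

# Hypothetical auto person trips; occupancies are the 1990 Table 5-5 values.
PERSON_TRIPS = {"home_to_work": 110000.0, "shopping": 85000.0}
OCCUPANCY_1990 = {"home_to_work": 1.1, "shopping": 1.7}

if __name__ == "__main__":
    for purpose, trips in vehicle_trips(PERSON_TRIPS, OCCUPANCY_1990).items():
        print(purpose, round(trips))
```

Because the occupancy factor is the divisor, raising it during validation lowers the assigned vehicle trips for that purpose, and lowering it does the opposite.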
Table 5-5
Average Vehicle Occupancy for Selected Trip Purposes
(person miles per vehicle mile)
| Trip Purpose | 1977 | 1983 | 1990 | Percent Change (77-90) |
|---|---|---|---|---|
| Home to Work | 1.3 | 1.3 | 1.1 | -15 |
| Shopping | 2.1 | 1.8 | 1.7 | -19 |
| Other family or personal business | 2.0 | 1.8 | 1.8 | -10 |
| Social and recreation | 2.4 | 2.1 | 2.1 | -13 |
| All Purposes | 1.9 | 1.7 | 1.6 | -16 |
| Source: 1977, 1983, and 1990 NPTS | ||||

