# Fundamentals of Transportation/Destination Choice/Background

Some additional background on Fundamentals of Transportation/Destination Choice

## History

Over the years, modelers have used several different formulations of trip distribution. The first was the Fratar or Growth model (which did not differentiate trips by purpose). This structure extrapolated a base year trip table to the future based on growth, but took no account of changing spatial accessibility due to increased supply or changes in travel patterns and congestion.

The next models developed were the gravity model and the intervening opportunities model. The most widely used formulation is still the gravity model.

While studying traffic in Baltimore, Maryland, Alan Voorhees developed a mathematical formula to predict traffic patterns based on land use. This formula has been instrumental in the design of numerous transportation and public works projects around the world. His paper "A General Theory of Traffic Movement" (Voorhees, 1956) applied the gravity model to trip distribution. Trip distribution translates the trips generated in an area into a matrix that gives the number of trips from each origin to each destination, which can then be loaded onto the network.

Evaluation of several model forms in the 1960s concluded that "the gravity model and intervening opportunity model proved of about equal reliability and utility in simulating the 1948 and 1955 trip distribution for Washington, D.C." (Heanue and Pyers 1966). The Fratar model was shown to be weak in areas experiencing land use changes. Comparisons showed that the gravity and intervening opportunities models could be calibrated equally well to match observed conditions, so computational ease decided the matter, and gravity models became more widely used than intervening opportunities models. Whitaker and West (1968) discussed some theoretical problems with the intervening opportunities model, in particular its inability to account for all trips generated in a zone, which makes it more difficult to calibrate, although Ruiter (1967) developed techniques for dealing with these limitations.

With the development of logit and other discrete choice techniques, new, demographically disaggregate approaches to travel demand were attempted. Including variables other than travel time in determining the probability of making a trip is expected to give a better prediction of travel behavior. Wilson (1967) showed that the logit model and the gravity model are of essentially the same form as the entropy maximization model used in statistical mechanics. The application of these models differs in concept: the gravity model uses impedance by travel time, perhaps stratified by socioeconomic variables, in determining the probability of trip making, while a discrete choice approach brings those variables inside the utility or impedance function. Discrete choice models require more information to estimate and more computational time.

Ben-Akiva and Lerman (1985) developed combined destination choice and mode choice models using a logit formulation for work and non-work trips. Because of computational intensity, these formulations tended to aggregate traffic zones into larger districts or rings in estimation. In current application, some models, including for instance the transportation planning model used in Portland, Oregon, use a logit formulation for destination choice. Allen (1984) used utilities from a logit-based mode choice model in determining composite impedance for trip distribution. However, that approach, using mode choice log-sums, implies that destination choice depends on the same variables as mode choice. Levinson and Kumar (1995) employ mode choice probabilities as a weighting factor and develop a specific impedance function or “f-curve” for each mode for work and non-work trip purposes.

## Mathematics

At this point in the transportation planning process, the information for zonal interchange analysis is organized in an origin-destination table. On the left is listed trips produced in each zone. Along the top are listed the zones, and for each zone we list its attraction. The table is n x n, where n = the number of zones.

Each cell in our table is to contain the number of trips from zone i to zone j. We do not have these within-cell numbers yet, although we have the row and column totals. With data organized this way, our task is to fill in the cells for tables headed t = 1 through, say, t = n, where t indexes the time period of each table.

Actually, from home interview travel survey data and attraction analysis we have the cell information for t = 1. The data are a sample, so we generalize the sample to the universe. The techniques used for zonal interchange analysis explore the empirical rule that fits the t = 1 data. That rule is then used to generate cell data for t = 2, t = 3, t = 4, etc., to t = n.

The first technique developed to model zonal interchange involves a model such as this:

$T_{ij}=T_{i}{\frac {T_{j}f\left({C_{ij}}\right)K_{ij}}{\sum _{j=1}^{n}{T_{j}f\left({C_{ij}}\right)K_{ij}}}}$ where:

• $T_{ij}$ : trips from i to j.
• $T_{i}$ : trips from i, as per our generation analysis
• $T_{j}$ : trips attracted to j, as per our generation analysis
• $f(C_{ij})$ : travel cost friction factor, say = $C_{ij}^{b}$
• $K_{ij}$ : calibration parameter

Zone i generates $T_{i}$ trips; how many will go to zone j? That depends on the attractiveness of j compared to the attractiveness of all places; attractiveness is tempered by the distance a zone is from zone i. We compute the fraction comparing j to all places and multiply $T_{i}$ by it.
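The calculation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gravity_singly_constrained`, the two-zone data, the exponent b = -2, and the default $K_{ij}=1$ are all made up for the example and are not calibrated values.

```python
def gravity_singly_constrained(productions, attractions, cost, b=-2.0, k=None):
    """T_ij = T_i * (T_j f(C_ij) K_ij) / sum_j (T_j f(C_ij) K_ij), with f(C) = C**b.

    b = -2 is an assumed exponent; K_ij defaults to 1 (no adjustment).
    """
    trips = []
    for i, t_i in enumerate(productions):
        # Attractiveness of each destination j as seen from origin i.
        weights = []
        for j, t_j in enumerate(attractions):
            k_ij = 1.0 if k is None else k[i][j]
            weights.append(t_j * cost[i][j] ** b * k_ij)
        total = sum(weights)
        # Share of zone i's trips going to each j, times trips produced in i.
        trips.append([t_i * w / total for w in weights])
    return trips


# Hypothetical two-zone example.
productions = [100.0, 200.0]          # T_i
attractions = [150.0, 150.0]          # T_j
cost = [[1.0, 2.0], [2.0, 1.0]]       # C_ij
trips = gravity_singly_constrained(productions, attractions, cost)

for row in trips:
    print([round(t, 1) for t in row])
print("column sums:", [round(sum(col), 1) for col in zip(*trips)])
```

In this sketch each row sums to the trips produced in that zone, but the column sums (120 and 180) do not match the attractions (150 and 150). The formula guarantees only the origin totals.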

The rule is often of a gravity form:

$T_{ij}=a{\frac {P_{i}P_{j}}{C_{ij}^{b}}}$ where:

• $P_{i};P_{j}$ : populations of i and j
• $a;b$ : parameters

But in the zonal interchange model, we use numbers related to trip origins ($T_{i}$) and trip destinations ($T_{j}$) rather than populations.

There are lots of model forms because we may use weights and special calibration parameters, e.g., one could write say:

$T_{ij}=a{\frac {T_{i}^{c}T_{j}^{d}}{C_{ij}^{b}}}$ or $T_{ij}={\frac {cT_{i}dT_{j}}{C_{ij}^{b}}}$ where:

• a, b, c, d are parameters
• $C_{ij}$ : travel cost (e.g. distance, money, time)
• $T_{j}$ : inbound trips, destinations
• $T_{i}$ : outbound trips, origin

## Entropy Analysis

Wilson (1970) gives us another way to think about the zonal interchange problem. This section treats Wilson’s methodology to give a grasp of the central ideas. To start, consider some trips where we have seven people in origin zones commuting to seven jobs in destination zones. One configuration of such trips will be:

Table: Configuration of Trips

| Origin \ Destination | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 2 | 1 | 1 |
| 2 | 0 | 2 | 1 |

$w\left({T_{ij}}\right)={\frac {7!}{2!1!1!0!2!1!}}=1260$

where $0!=1$. That configuration can appear in 1,260 ways. We have calculated the number of ways that configuration of trips might have occurred, and to explain the calculation, let’s recall those coin tossing experiments talked about so much in elementary statistics. The number of ways a two-sided coin can come up is $2^{n}$, where n is the number of times we toss the coin. If we toss the coin once, it can come up heads or tails, $2^{1}=2$. If we toss it twice, it can come up HH, HT, TH, or TT, 4 ways, and $2^{2}=4$. To ask the specific question about, say, four coins coming up all heads, we calculate $4!/(4!\,0!)=1$. Two heads and two tails would be $4!/(2!\,2!)=6$. We are solving the equation:

$w={\frac {n!}{\prod _{i}{n_{i}!}}}$

where $n_{i}$ is the number of items falling in category i and the $n_{i}$ sum to n. An important point is that as n gets larger, our distribution gets more and more peaked, and it is more and more reasonable to think of a most likely state.
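The counts quoted above can be checked with a short Python sketch. The helper names `ways` and `share_near_half` are made up for this illustration, and the 40% to 60% band used to show peakedness is an arbitrary choice.

```python
from math import comb, factorial


def ways(counts):
    """Number of arrangements n! / (n_1! n_2! ... n_k!) for the given counts."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)  # each intermediate division is exact
    return result


print(ways([2, 1, 1, 0, 2, 1]))  # the seven-commuter configuration
print(ways([4, 0]))              # four tosses, all heads
print(ways([2, 2]))              # four tosses, two heads and two tails


def share_near_half(n):
    """Share of the 2**n toss sequences with between 40% and 60% heads."""
    lo = (4 * n + 9) // 10  # smallest integer >= 0.4 n
    hi = (6 * n) // 10      # largest integer <= 0.6 n
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n


# The share concentrated around the most likely outcome grows with n.
for n in (10, 100, 1000):
    print(n, share_near_half(n))
```

The last loop is one way to see the peaking: the larger n is, the larger the share of all possible sequences that lies close to the most likely split.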

However, the notion of most likely state comes not from this thinking; it comes from statistical mechanics, a field well known to Wilson and not so well known to transportation planners. The result from statistical mechanics is that a descending series is most likely. Think about the way the energy from lights in the classroom is affecting the air in the classroom. If the effect resulted in an ascending series, many of the atoms and molecules would be affected a lot and a few would be affected a little. The descending series would have a lot affected not at all or not much and only a few affected very much. We could take a given level of energy and compute excitation levels in ascending and descending series. Using the formula above, we would compute the ways particular series could occur, and we would conclude that descending series dominate.

That’s more or less Boltzmann’s Law,

$p_{j}=p_{0}e^{-\beta e_{j}}$

That is, the number of particles at any particular excitation level, j, is the number of particles in the ground state, $p_{0}$, multiplied by a negative exponential function of the excitation level, $e_{j}$, and a parameter $\beta$, which is a function of the (average) energy available to the particles in the system.

The two paragraphs above have to do with ensemble methods of calculation developed by Gibbs, a topic well beyond the reach of these notes.

Returning to our O-D matrix, note that we have not used as much information as we would have from an O and D survey and from our earlier work on trip generation. For the same travel pattern in the O-D matrix used before, we would have row and column totals, i.e.:

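Table: Illustrative O-D Matrix with row and column totals

| Origin zone | $T_{i}$ | Destination 1 | Destination 2 | Destination 3 |
|---|---|---|---|---|
| $T_{j}$ | | 2 | 3 | 2 |
| 1 | 4 | 2 | 1 | 1 |
| 2 | 3 | 0 | 2 | 1 |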

Consider the ways the four folks from zone 1 might travel, 4!/(2!1!1!) = 12; consider the three folks from zone 2, 3!/(0!2!1!) = 3. All travel can be combined in 12 × 3 = 36 ways. The possible configurations of trips are thus seen to be much constrained by the column and row totals.
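To make the effect of the totals concrete, the following Python sketch lists every table of non-negative integers that matches the row totals (4, 3) and column totals (2, 3, 2), and counts the ways for each table row by row as in the paragraph above. The helper names are made up for this illustration, and the enumeration is written only for this two-row case.

```python
from itertools import product
from math import factorial


def multinomial(counts):
    """n! / (n_1! ... n_k!) for one origin zone's row of trips."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


def ways(table):
    """Ways a table can occur: product of the row-by-row counts."""
    count = 1
    for row in table:
        count *= multinomial(row)
    return count


def feasible_tables(row_totals, col_totals):
    """Yield every two-row table matching the given row and column totals."""
    for first_row in product(*(range(c + 1) for c in col_totals)):
        if sum(first_row) != row_totals[0]:
            continue
        # The second row is whatever remains in each column.
        second_row = tuple(c - x for c, x in zip(col_totals, first_row))
        yield (first_row, second_row)


row_totals = [4, 3]
col_totals = [2, 3, 2]
tables = list(feasible_tables(row_totals, col_totals))
observed = ((2, 1, 1), (0, 2, 1))
best = max(tables, key=ways)

print(len(tables), "feasible tables")
print("observed table:", ways(observed), "ways")
print("most likely table:", best, "with", ways(best), "ways")
```

Only a handful of tables satisfy the totals, and among them one can occur in more ways than any other. That table is the "most likely state" for this small example.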

We put this point together with the earlier work with our matrix and the notion of most likely state to say that we want to

$\max w\left({T_{ij}}\right)={\frac {T!}{\prod _{ij}{T_{ij}!}}}$ subject to

$\sum _{j}{T_{ij}}=T_{i};\quad \sum _{i}{T_{ij}}=T_{j}$

where $T=\sum _{j}{\sum _{i}{T_{ij}}}=\sum _{i}{T_{i}}=\sum _{j}{T_{j}}$, and this is the problem that we have solved above.

Wilson adds another consideration; he constrains the system to the amount of energy available (i.e., money), and we have the additional constraint,

$\sum _{i}{\sum _{j}{T_{ij}C_{ij}}}=C$

where C is the quantity of resources available and $C_{ij}$ is the travel cost from i to j.

The discussion thus far contains the central ideas in Wilson’s work, but we are not yet to the place where the reader will recognize the model as it is formulated by Wilson.

First, writing the function to be maximized using Lagrangian multipliers, we have:

${\frac {T!}{\prod _{ij}{T_{ij}!}}}+\sum _{i}{\lambda _{i}\left({T_{i}-\sum _{j}{T_{ij}}}\right)}+\sum _{j}{\lambda _{j}\left({T_{j}-\sum _{i}{T_{ij}}}\right)}+\beta \left({C-\sum _{i}{\sum _{j}{T_{ij}C_{ij}}}}\right)$

where $\lambda _{i}$, $\lambda _{j}$, and $\beta$ are the Lagrange multipliers, $\beta$ having an energy sense.

Second, it is convenient to maximize the natural log, $\ln w(T_{ij})$, rather than $w(T_{ij})$ itself, for then we may use Stirling's approximation:

$\ln N!\approx N\ln N-N$

so

${\frac {\partial \ln N!}{\partial N}}\approx \ln N$

Third, evaluating the maximum, we have

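${\frac {\partial {\mathcal {L}}}{\partial T_{ij}}}=-\ln T_{ij}-\lambda _{i}-\lambda _{j}-\beta C_{ij}=0$

where ${\mathcal {L}}$ denotes the Lagrangian function with $\ln w(T_{ij})$ in place of $w(T_{ij})$, with solution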

$\ln T_{ij}=-\lambda _{i}-\lambda _{j}-\beta C_{ij}$

$T_{ij}=e^{-\lambda _{i}-\lambda _{j}-\beta C_{ij}}$

Finally, substituting this value of $T_{ij}$ back into our constraint equations, we have:

$\sum _{j}{e^{-\lambda _{i}-\lambda _{j}-\beta C_{ij}}}=T_{i};\quad \sum _{i}{e^{-\lambda _{i}-\lambda _{j}-\beta C_{ij}}}=T_{j}$

and, taking the constant multiples outside of the summation sign

$e^{-\lambda _{i}}={\frac {T_{i}}{\sum _{j}{e^{-\lambda _{j}-\beta C_{ij}}}}};\quad e^{-\lambda _{j}}={\frac {T_{j}}{\sum _{i}{e^{-\lambda _{i}-\beta C_{ij}}}}}$

Letting

${\frac {e^{-\lambda _{i}}}{T_{i}}}=A_{i};\quad {\frac {e^{-\lambda _{j}}}{T_{j}}}=B_{j}$

we have

$T_{ij}=A_{i}B_{j}T_{i}T_{j}e^{-\beta C_{ij}}$

which says that the most probable distribution of trips has a gravity model form: $T_{ij}$ is proportional to trip origins and destinations. $A_{i}$, $B_{j}$, and $\beta$ ensure constraints are met.
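The derivation leans on Stirling's approximation, $\ln N!\approx N\ln N-N$, so it is worth seeing how good that approximation is. The Python sketch below compares it with the exact value of $\ln N!$ obtained from the standard library's log-gamma function; the function names are made up for this illustration.

```python
from math import lgamma, log


def ln_factorial(n):
    """Exact ln(n!), using the identity ln(n!) = lgamma(n + 1)."""
    return lgamma(n + 1)


def stirling(n):
    """Stirling's approximation in the form used in the text: n ln n - n."""
    return n * log(n) - n


def relative_error(n):
    """Relative shortfall of the approximation compared with the exact value."""
    exact = ln_factorial(n)
    return (exact - stirling(n)) / exact


for n in (10, 100, 1000, 10000):
    print(n, round(ln_factorial(n), 3), round(stirling(n), 3), relative_error(n))
```

The relative error shrinks as N grows, so the approximation suits cells with many trips better than cells with only a few.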

Turning now to computation, we have a large problem. First, we do not know the value of C, which earlier on we said had to do with the money available; it was a cost constraint. Consequently, we have to set $\beta$ to different values and then find the best set of values for $A_{i}$ and $B_{j}$. We know what $\beta$ means: the greater the value of $\beta$, the lower the average cost (or distance) of travel. (Compare $\beta$ in Boltzmann’s Law noted earlier.) Second, the values of $A_{i}$ and $B_{j}$ depend on each other. So for each value of $\beta$, we must use an iterative solution. There are computer programs to do this.
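One simple iterative scheme, sketched below in Python, alternates between updating the $A_{i}$ from the current $B_{j}$ and the $B_{j}$ from the new $A_{i}$ until they stop changing. It is a minimal sketch and not necessarily the procedure used in any particular program. The function names, the cost matrix, the two trial values of $\beta$, and the stopping rule are all assumptions made for the example.

```python
from math import exp


def doubly_constrained(origins, destinations, cost, beta, max_iter=1000, tol=1e-12):
    """Return T_ij = A_i B_j T_i T_j exp(-beta C_ij) matching both sets of totals."""
    n, m = len(origins), len(destinations)
    f = [[exp(-beta * cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(max_iter):
        # Row constraint: A_i = 1 / sum_j B_j T_j f_ij
        a_new = [1.0 / sum(b[j] * destinations[j] * f[i][j] for j in range(m))
                 for i in range(n)]
        # Column constraint: B_j = 1 / sum_i A_i T_i f_ij
        b_new = [1.0 / sum(a_new[i] * origins[i] * f[i][j] for i in range(n))
                 for j in range(m)]
        change = max(abs(x - y) / y for x, y in zip(a + b, a_new + b_new))
        a, b = a_new, b_new
        if change < tol:  # assumed stopping rule: relative change is negligible
            break
    return [[a[i] * b[j] * origins[i] * destinations[j] * f[i][j]
             for j in range(m)] for i in range(n)]


def mean_cost(trips, cost):
    """Average travel cost per trip for a trip table."""
    total = sum(sum(row) for row in trips)
    spent = sum(t * c for t_row, c_row in zip(trips, cost)
                for t, c in zip(t_row, c_row))
    return spent / total


origins = [4.0, 3.0]                        # T_i from the illustrative matrix
destinations = [2.0, 3.0, 2.0]              # T_j from the illustrative matrix
cost = [[1.0, 2.0, 3.0], [2.0, 1.0, 2.0]]   # hypothetical C_ij

for beta in (0.1, 1.0):
    trips = doubly_constrained(origins, destinations, cost, beta)
    print(beta, [[round(t, 3) for t in row] for row in trips],
          round(mean_cost(trips, cost), 3))
```

Running the sketch for the two values of $\beta$ shows the behavior described above: the row and column totals are reproduced in both cases, and the larger $\beta$ gives the lower average cost per trip.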

Wilson’s method has been applied to the Lowry model.