Model Driver Screening and Evaluation Program
Volume II: Maryland Pilot Older Driver Study


Chapter 5.
Summary and Conclusions

The Maryland Pilot Older Driver Study collected and analyzed data describing the functional status of a total of 2,508 drivers age 55 and older between November 1998 and October 2001, sampled in three different venues: 1,876 License Renewal applicants, tested in Motor Vehicle Administration (MVA) field offices; 366 Medically Referred drivers, also tested in MVA field offices statewide; and 266 older drivers in a Residential Community, tested at Leisure World in Montgomery County, MD. The larger, License Renewal sample was deemed sufficiently representative of its age cohort to permit generalization to the broad population of older drivers, with respect to crash and violation experience; it served as the test bed for project data analyses examining the relationship between functional ability and a number of traffic safety outcome measures. Self-reported mobility restrictions and estimates of exposure also were collected and analyzed among all three study samples.

Ten measures of functional capacity were included in the research design. These were selected based upon prior, independent studies relating specific procedures and/or more general measurement constructs to safe driving ability and driving impairment, and upon a pre-pilot study suggesting that they could meet additional project criteria concerned with feasibility of test administration. All ten measures could be completed in approximately 20 minutes, on average.

Six screening procedures addressed perceptual-cognitive abilities. The Motor-Free Visual Perception Test/Visual Closure Subtest assessed visuospatial skills, including the ability to visualize missing information as needed when only part of a threat object or other critical target is visible to a driver. Trail-making (Part B) used a paper-and-pencil exercise to measure directed visual search and divided attention capabilities, both essential to way-finding as well as rapid recognition of safety threats. Dynamic Trails measured the same directed visual search and divided attention abilities, but used a PC-based methodology with an added element of distraction provided by a moving traffic scene in the background. Useful Field of View Subtest 2 used a PC to measure divided attention and information processing speed, specifically the peripheral target duration at which a person can correctly localize the target while maintaining attention with central vision, a key to safe intersection negotiation. Delayed Recall assessed "working memory" ability, needed for proper response to all manner of driving situations and traffic control devices, and for basic navigation. Finally, the Scan Test sought evidence of visual field neglect and erratic scanning patterns.

Four screening procedures addressed physical abilities. The Rapid Pace Walk and Foot Tap tests measured lower limb strength and mobility as needed to sustain steady control over brake and accelerator operation, and to quickly shift from one pedal to the other as circumstances may require. Head/Neck Rotation measured whether or not an individual could look directly over his/her shoulder as needed to safely change lanes or merge, with the lower torso fixed in place with a seatbelt as when driving. The Arm Reach test measured upper limb strength and flexibility as needed for effective steering control.

Safety outcome measures analyzed in this research included three levels of crash data and three levels of convictions for moving violations, applying progressively more stringent criteria to evaluate the relationship between functional loss and the risk of injury due to a motor vehicle crash. In the crash analyses, at-fault crashes were segregated from the larger set including crashes where fault was unknown, and from all police-reported crashes (i.e., without regard to fault). Assignment of fault was based on the report of the investigating police officer; to be reported, a crash must have been serious enough to require a vehicle to be towed from the scene.

In the conviction analyses, "all moving violations" were further sorted to exclude those for speeding--a behavior not typically associated with older drivers--and also to exclude violations of passenger restraint system laws. The latter pertain to behaviors that, while critically important in determining the severity of injuries experienced in a crash, are arguably of less concern as precursors of a crash than infractions such as running a stop sign or traffic signal, failure to yield, one-way and wrong-way violations, etc.

Among the crash analyses, the strongest relationships with functional status were uniformly found when examining at-fault crashes only. Among the analyses of moving violations, the strongest relationships were most often found for that category of events described by "all moving violations without speeding and occupant restraint citations." This is important from the standpoint of "construct validation"--the behaviors signified by these particular subcategories of events are those bearing the strongest a priori relationships to crash risk. And while the relationships based on conviction data were weighted less heavily than those based on crash analysis outcomes, they nevertheless provided key convergent evidence in identifying the best predictors among the screening measures included in the Pilot Study.

It was recognized that the analyses of safety outcomes, as related to drivers' functional status, were subject to several potential sources of bias. First, because test dates varied while a common cutoff date for driving history observations was applied to the analysis sample, the period during which drivers could have accumulated adverse safety outcomes also varied. However, a comparison of the amount of time (in months) comprising the analysis intervals for crash-involved versus crash-free drivers showed no significant differences. Next, the question was raised of whether exposure differences (i.e., apart from differences in functional status) might account for differences in crash experience among the study sample, but the only source of such information was self-reports. Internal checks between weekly and annual estimates of miles driven, by the same individuals, underscored concerns about the reliability of these subjective data, as almost 50 percent of the responses demonstrated a 50 percent error rate in estimated miles driven. Though correcting for exposure is appealing in concept, no "corrections" for individual differences in exposure could be justified in these analyses without an objective index of how much driving occurs, and under what conditions.

Descriptive statistics revealed broad differences among the three study samples. The License Renewal sample was nearly 9 years younger (mean age = 68.3) than the Medical Referral (mean age = 76.8) and Residential Community (mean age = 77.1) samples. However, the Residential Community sample mirrored the population-based License Renewal sample much more closely in terms of functional ability, especially with respect to perceptual-cognitive tests. This result reinforces the notion that functional status, not age per se, is of primary importance. In terms of self-reported mobility restrictions, the Residential Community and Medical Referral samples were more alike, particularly with respect to how often they "never" and "always" avoided problem situations (nighttime, bad weather, heavy traffic, etc.). Thus, drivers of similar age but differing in functional ability may nevertheless make similar behavioral adaptations in their driving habits, to compensate for a perceived increase in driving risk. This finding is useful in designing educational and counseling components of a screening and evaluation program.

The analysis results obtained in the Maryland Pilot Older Driver Study have provided perhaps the best evidence to date that functional capacity screening, conducted quickly and efficiently, in diverse settings, can yield scientifically valid predictions about the risk of driving impairment experienced by older individuals. The evidence that a person's ability to drive safely has been impaired, at a given level of functional decline, is based on "odds ratio" calculations. These calculated values express how much greater the odds are of being involved in a crash (and of committing moving violations) if a driver fails a test than if he or she passes it.
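
The odds ratio calculation described above can be illustrated with a short Python sketch. The function name, the `min_cell` validity threshold, and the counts are assumptions introduced purely for illustration; the study's own validity criterion and cell counts are not reproduced here.

```python
from typing import Optional


def odds_ratio(fail_crash: int, fail_no_crash: int,
               pass_crash: int, pass_no_crash: int,
               min_cell: int = 1) -> Optional[float]:
    """Odds of a crash among drivers who fail a test, divided by the
    odds of a crash among drivers who pass it.

    Returns None when any cell is smaller than min_cell. The threshold
    is a hypothetical stand-in for the report's "valid" OR criterion.
    """
    cells = (fail_crash, fail_no_crash, pass_crash, pass_no_crash)
    if min(cells) < min_cell:
        return None
    # Cross-product form of (fail_crash / fail_no_crash) / (pass_crash / pass_no_crash)
    return (fail_crash * pass_no_crash) / (fail_no_crash * pass_crash)


# Invented counts for illustration only (not study data).
example = odds_ratio(fail_crash=6, fail_no_crash=94,
                     pass_crash=20, pass_no_crash=1880)
print(round(example, 2))  # 6.0
```

With these invented counts, drivers who fail the hypothetical test have six times the odds of crash involvement of drivers who pass it. A table containing an empty cell returns None instead of a ratio.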

The results of the analyses relating functional status to crash involvement in this research are summarized in table 7 below, in terms of the peak (valid) odds ratio value calculated for each included screening measure. These odds ratio (OR) values highlight the most predictive levels attained by the various functional screens examined in the Pilot Study. At a value of 1.0, a driver has the same odds of being crash-involved if he/she passes a test as if he/she fails it; higher OR values connote greater predictive value. For comparison purposes, peak valid OR values for the same measures are also shown based on calculations using prospective data only. The inclusion of one year of retrospective driving experience data (keyed to each individual's test date) in the primary analyses was justified earlier, on medical grounds; however, it is reasonable to question how the results might have varied if restricted to the smaller data set described by a purely prospective analysis. Across both data sets (i.e., with and without the added year of retrospective driving experience), the strongest relationships were consistently demonstrated between functional status and at-fault crashes.

Table 7.
Peak Valid Odds Ratios for Prediction of Crashes

                                             Peak Valid Odds Ratio
                                          ----------------------------
Functional Capacity Screening Measure     Prospective     Prospective
                                          + 1 Year Retro  Only

Perceptual-cognitive measures
  Motor-Free Visual Perception Test,
    Visual Closure Subtest                4.96            6.22
  Trail-making, Part B                    3.50            2.21
  Delayed Recall                          2.92            1.05
  Useful Field of View, Subtest 2         2.48            3.11
  Dynamic Trails                          1.45            ‡
  Scan Test                               ‡               ‡

Physical measures
  Rapid Pace Walk                         2.64            1.70
  Head/Neck Rotation                      2.56            4.46
  Foot Tap                                1.50            1.06
  Arm Reach                               ‡               ‡

‡ One or more cell counts were too small to permit a valid odds ratio calculation.

As indicated, the Motor-Free Visual Perception Test/Visual Closure subtest was most predictive of (at-fault) crash involvement by drivers in the License Renewal sample, by a wide margin. Three additional perceptual-cognitive measures--Trail-making, Part B; Delayed Recall; and Useful Field of View, subtest 2--also were shown to be potentially useful predictors for identifying at-risk drivers. Among the physical measures, the Rapid Pace Walk and Head/Neck Rotation appear to have the greatest potential value as predictors of driving impairment.

The results of analyses relating functional status to convictions for three categories of moving violations are summarized in table 8 below, in terms of the peak valid odds ratio (OR) value calculated for each included screening measure. As before, the OR values express how much more likely drivers who fail a test are to experience a particular (negative) safety outcome--in this case a conviction for a moving violation--versus drivers who pass the test. While the behaviors associated with moving violations do not necessarily lead to crashes, they are clearly of concern to traffic safety professionals. Accordingly, these indications of driving negligence serve as secondary outcome measures for gauging the relative utility of different screening procedures.

Table 8.
Peak Valid Odds Ratios for Prediction of Moving Violations

Functional Capacity Screening Measure     Peak Valid Odds Ratio

Perceptual-cognitive measures
  Motor-Free Visual Perception Test,
    Visual Closure Subtest                4.53 (a)
  Trail-making, Part B                    1.72 (b)
  Delayed Recall                          1.72 (a)
  Useful Field of View, Subtest 2         1.67 (a)
  Dynamic Trails                          1.27 (b, a)
  Scan Test                               ‡

Physical measures
  Rapid Pace Walk                         1.48 (b)
  Head/Neck Rotation                      ‡
  Foot Tap                                2.14 (c)
  Arm Reach                               ‡

‡ One or more cell counts were too small to permit a valid odds ratio calculation.
(a) Peak valid OR was calculated for analysis of moving violations without speeding and occupant restraint citations.
(b) Peak valid OR was calculated for analysis of moving violations without speeding.
(c) Peak valid OR was calculated for analysis of all moving violations.

As shown, the peak valid OR value was demonstrated for analyses of moving violations without speeding and occupant restraint citations, in a majority of cases. This was expected because this, the most restrictive analysis category, focused upon behaviors believed to be--but for random good fortune--the logical precursors of crashes, e.g., reckless, careless, and negligent operation; stop and yield violations; improper turning, passing, following, lane changing, and backing maneuvers; lane exceedance; and wrong-way and one-way movements.

In fewer cases the peak OR was found when only speeding violations were removed from the analysis and occupant restraint violations remained; and, in one case, the peak valid OR was found for the analysis of "all moving violations." No special importance is attached to these findings. With weaker relationships overall compared to those revealed in the crash analyses, there is more random fluctuation or "noise" in the violation data that can result in an anomalous high OR value at a particular performance level, for any given measure. The calculated odds ratio values presented in appendix H indicate that, across all performance levels, the strongest relationships (even if not statistically significant) obtain for the analyses focusing upon moving violations excluding speeding and occupant restraint citations.

Again, the Motor-Free Visual Perception Test/Visual Closure subtest was most predictive of negative safety outcomes--convictions for moving violations, in this case. The ordering of the remaining perceptual-cognitive measures in terms of peak valid OR values was the same as for the full crash analysis (including 1 year of retrospective experience), but weaker relationships were demonstrated across the board. The only other statistically reliable result was found for Trail-making, Part B, for the relationship between this measure and moving violations except speeding. For the physical measures, no statistically reliable relationships with the conviction measures were demonstrated, even at the performance levels where the peak OR was calculated.

The crash and conviction analysis results lead to the consideration of candidate "cutpoints," or pass/fail criteria, for measures that appear to be of potential value in identifying at-risk drivers.

It may be argued that judgments about the best cutpoints for pass/fail decisions should be pegged to the functional ability (test performance) level where a clear spike in OR is observed. If performance levels for the predictor variable are examined at very fine gradations, however, what may appear as a "spike" in calculated OR could actually be a spurious result that gives a misleading interpretation of the larger predictor-criterion relationship. Other problems include reversals in the OR curves, and curves in which the calculated OR for a specific measure changes so gradually that it is difficult to single out a candidate cutpoint on this basis. This is not surprising, given the sensitivity of OR calculations to the shift of a very small number of observations from one cell to another in the four-cell (2 x 2) classification table defined by "pass" or "fail," versus "crash" or "no crash."
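
The sensitivity noted above can be seen in a small numerical sketch. The counts below are invented for illustration and are not study data; the only thing varied is the classification of two crash-involved drivers.

```python
def odds_ratio(fail_crash, fail_no_crash, pass_crash, pass_no_crash):
    """Cross-product odds ratio for a 2 x 2 pass/fail by crash/no-crash table."""
    return (fail_crash * pass_no_crash) / (fail_no_crash * pass_crash)


# Invented counts (not study data): 1,000 drivers, 20 of them crash-involved.
baseline = odds_ratio(fail_crash=4, fail_no_crash=96,
                      pass_crash=16, pass_no_crash=884)

# Reclassify just two crash-involved drivers from "pass" to "fail",
# as might happen when the cutpoint moves by a single performance level.
shifted = odds_ratio(fail_crash=6, fail_no_crash=96,
                     pass_crash=14, pass_no_crash=884)

print(f"baseline OR = {baseline:.2f}, after shifting two drivers = {shifted:.2f}")
```

In this hypothetical table, moving two drivers out of 1,000 raises the odds ratio from roughly 2.3 to roughly 3.9, which is the kind of swing that can produce an apparent "spike" at one performance level.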

Thus, one conclusion of this work is that broader trends in the distributions of crash-involved versus non-crash-involved drivers also deserve consideration when identifying candidate cutpoints. Specifically, analysis outcomes must be scrutinized to determine where there is a clear performance-versus-safety transition in the distributions of crash-involved versus non-crash-involved drivers, i.e., to pinpoint a level of functional loss where the percentage of drivers in the former group begins to consistently exceed the latter.
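
One way to make the "consistently exceed" criterion concrete is sketched below. The function name and the percentages are hypothetical, and the rule used (the crash-involved share must be higher at the returned level and at every worse level) is one possible reading of the criterion, not necessarily the procedure applied in the study.

```python
from typing import Optional, Sequence


def transition_level(crash_pct: Sequence[float],
                     no_crash_pct: Sequence[float]) -> Optional[int]:
    """Return the first performance level (index) from which crash_pct
    exceeds no_crash_pct at that level and at every worse level.
    Returns None if the worst level itself does not show an excess."""
    if len(crash_pct) != len(no_crash_pct):
        raise ValueError("distributions must cover the same performance levels")
    candidate = None
    # Walk from the worst performance level back toward the best one.
    for level in range(len(crash_pct) - 1, -1, -1):
        if crash_pct[level] > no_crash_pct[level]:
            candidate = level
        else:
            break
    return candidate


# Invented percentages by number of incorrect responses (0 to 7); not study data.
crash = [22.0, 20.0, 12.0, 9.0, 12.0, 14.0, 6.0, 5.0]
no_crash = [30.0, 24.0, 17.0, 11.0, 8.0, 5.0, 3.0, 2.0]
print(transition_level(crash, no_crash))  # 4
```

For these made-up distributions the transition falls at 4 incorrect responses: the crash-involved percentage is lower through 3 incorrect and higher from 4 incorrect onward.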

A good example is provided by the at-fault crash analysis for the best-performing screening measure in the Pilot Study, MVPT/VC. Elements from an earlier plot of these analysis results that are most germane to this discussion are reproduced in figure 45. The ratio of crash-involved to non-crash-involved drivers, illustrated by the relative height of the black bars to the white bars, peaks at two different performance levels: 5 incorrect and 7 incorrect responses. Based on cell counts in the OR calculation matrix, however, only the analysis result for 5 incorrect is valid. Meanwhile, the transition point where the proportion of crash-involved drivers begins to "consistently exceed" the proportion of non-crash-involved drivers occurs between 3 and 4 incorrect responses.

Figure 45.
Motor-Free Visual Perception/Visual Closure Subtest Results 
Illustrating the Disparity between Using Isolated, Peak OR Values and Shifts 
in the Distribution of Crash-Involved, versus Non-Crash-Involved, 
Drivers as a Basis for Selecting Pass/Fail Cutpoints

But it is the overall shape of the two distributions that may be most revealing. While the distribution of non-crash-involved drivers shows a monotonic decline from zero incorrect (perfect performance) to 7 incorrect responses, the distribution of crash-involved drivers is distinctly bimodal--as if two separate distributions of crash-involved drivers are represented in the same plot.

An explanation for this analysis outcome with clear implications for cutpoint identification may be suggested. Among the non-crash-involved drivers, who constitute the vast majority--97.7 percent--of the total number screened, a valid test should detect the distribution of functional ability that is normal (in the sense of typical, not Gaussian) for the population. For the MVPT, on a population basis, higher frequencies are observed with fewer errors, and a steady decline in frequency is observed as number of errors increases. This is precisely the monotonic curve demonstrated for the non-crash-involved drivers in the Pilot Study, bolstering the assertion that a representative sample was obtained for this study.

With regard to crash-involved drivers, only some would be expected to experience a crash because of this particular functional loss. The frequency distribution of their scores might be expected to differ from that of the rest, who, logically, would have experienced their crashes because of a different kind of impairment, or simply by chance.

If this premise is valid, two separate distributions could indeed be represented among the crash-involved drivers. For the group whose driving has been impaired because of this specific functional deficit, a frequency distribution centered around a mean performance level representing significant functional decline can be postulated. For the other group--i.e., the drivers involved in crashes because of another type of deficit, or for reasons that have nothing to do with functional ability--there is no reason why the frequency distribution of scores on this measure should not follow the same pattern as for the general population.

The idealized set of curves presented in figure 46 may help to illustrate this suggested explanation for the observed analysis outcome.

Figure 46.
Idealized Frequency Distribution Plot Segregating Crash-Involved Drivers 
into One Group that is at Risk Because of the Specific Functional Ability 
Under Consideration, versus Another Group that Crashes Because of 
Other Sources of Impairment or Random Events
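
The two-group explanation can also be sketched numerically. In the Python example below, the distribution shapes and the 40 percent share of crash-involved drivers attributed to this specific deficit are invented assumptions chosen only to show how blending a monotonic curve with a second curve centered on poor performance yields a bimodal result.

```python
def mix(general, impaired, impaired_share):
    """Weighted blend of two discrete score distributions (each sums to 1)."""
    return [(1 - impaired_share) * g + impaired_share * i
            for g, i in zip(general, impaired)]


def local_peaks(dist):
    """Indices whose value is higher than both neighbors
    (the first and last positions are compared with their one neighbor)."""
    peaks = []
    for k, value in enumerate(dist):
        left = dist[k - 1] if k > 0 else float("-inf")
        right = dist[k + 1] if k < len(dist) - 1 else float("-inf")
        if value > left and value > right:
            peaks.append(k)
    return peaks


# Invented shapes over 0 to 7 incorrect responses; not study data.
general = [0.30, 0.24, 0.17, 0.11, 0.08, 0.05, 0.03, 0.02]   # monotonic decline
impaired = [0.00, 0.01, 0.04, 0.15, 0.30, 0.28, 0.15, 0.07]  # centered near 4 to 5 errors
crash_involved = mix(general, impaired, impaired_share=0.4)  # assumed 40 percent share

print(local_peaks(general))         # [0]
print(local_peaks(crash_involved))  # [0, 4]
```

The general-population curve has a single peak at zero incorrect, while the blended curve has peaks at zero and at 4 incorrect, mirroring the bimodal pattern described for crash-involved drivers.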

This interpretation lends support to the identification of not one cutpoint per screening measure, but two. A cutpoint connoting an "early warning" that an individual's level of functional decline is just beginning to place him/her at higher risk of driving impairment may be distinguished from an "immediate danger" cutpoint, where functional decline has reached a level associated with the highest relative risk¹ of crash involvement compared to functionally intact drivers. The former may trigger prevention efforts; the latter signals a need for intervention.

For a majority of screening measures included in the Pilot Study, the analysis outcomes may be applied within this framework to identify candidate cutpoints as shown in table 9 below.

Table 9.
Candidate Cutpoints for Screening Measures in the
Pilot Study That Are Supported by Present Crash Analysis Results

                                               Candidate Cutpoint
                                          ----------------------------
Functional Capacity Screening Measure     Prevention      Intervention

Perceptual-cognitive measures
  Motor-Free Visual Perception Test,
    Visual Closure Subtest                3 incorrect     5 incorrect
  Trail-making, Part B                    80 seconds      180 seconds
  Delayed Recall                          1 incorrect     2 incorrect
  Useful Field of View, Subtest 2         200 msec        300 msec
  Dynamic Trails                          ‡               ‡
  Scan Test                               †               †

Physical measures
  Rapid Pace Walk                         7.5 seconds     9.0 seconds
  Head/Neck Rotation                      †               †
  Foot Tap                                ‡               ‡
  Arm Reach                               †               †

‡ Analysis outcomes were not statistically reliable and/or there were too few observations to support cutpoint identification.
† N/A (binary measure)
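
The two-tier framework can be expressed as a simple lookup, sketched below in Python using the candidate cutpoints from Table 9. The dictionary keys, the function name, and especially the assumption that a score at or above a cutpoint triggers that tier are illustrative choices; the report does not state whether the cutpoints are inclusive.

```python
# Candidate cutpoints from Table 9 as (prevention, intervention).
# For every measure listed, a higher score means poorer performance.
CUTPOINTS = {
    "MVPT/VC (incorrect)": (3, 5),
    "Trail-making B (seconds)": (80, 180),
    "Delayed Recall (incorrect)": (1, 2),
    "UFOV Subtest 2 (msec)": (200, 300),
    "Rapid Pace Walk (seconds)": (7.5, 9.0),
}


def screen(measure: str, score: float) -> str:
    """Classify one score as 'pass', 'prevention', or 'intervention'.

    Assumption (not stated in the report): a score at or above a
    cutpoint triggers that tier.
    """
    prevention, intervention = CUTPOINTS[measure]
    if score >= intervention:
        return "intervention"
    if score >= prevention:
        return "prevention"
    return "pass"


print(screen("MVPT/VC (incorrect)", 4))  # prevention
```

Under this reading, a driver with 4 incorrect MVPT/VC responses falls in the "early warning" tier, while one with 5 or more would be flagged for intervention.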

In conclusion, the results of the Maryland Pilot Older Driver Study reinforce the proposition that loss of key functional abilities predicts an increase in driving impairment and higher risk of crash involvement. There is also evidence that it would be feasible to conduct functional capacity screening in a "production" (driver licensing) setting, at a cost in the range of $5 to $10 per driver screened. Implementing only a subset of the battery of measures included in the Pilot Study would drive the cost per driver screened even lower. Caution still must be exercised in using the study's findings to select "best" measures, however. It is the domains of functional ability, not particular measurement techniques, that should be the focus of attention given our present understanding of how well functional screening can detect high-risk drivers. While certain procedures yielded stronger relationships with crashes and moving violations than others in the Pilot Study, a need for methodological refinements and increased sample sizes to bolster confidence in the reliability of these findings, and to solidify cutpoint determination, is paramount. And it may confidently be assumed that better technology, as well as better understanding of the sought-after relationships between functional status and safety, will lead to superior screening and assessment tools in the future.

Finally, there are broader implications for developing and implementing a driver screening program that can be drawn from this experience in Maryland. Most importantly, to "fail" a screen does not necessarily mean that an individual should stop driving. It means that the individual's functional status places him or her at greater risk of a motor vehicle crash, and may establish a need for follow-up to more accurately diagnose underlying medical problems; to undergo, in some cases, a formal (on-road) driving evaluation; to consider changes in driving habits that reduce exposure to the most risky conditions; and to explore the potential for remediation to counter the indicated functional loss. Thus, the application of findings in the Maryland Pilot Older Driver Study, described herein, must be gauged in relation to a larger, integrated set of activities devoted to enhancing public safety while allowing older persons to continue driving as long as they can safely do so. This expanded discussion is a part of Volume I of the Final Technical Report submitted in this project.

¹ For reference, calculations using the Relative Risk analytical technique yield results nearly identical to the Odds Ratio calculation when the critical outcome event (a crash) is rare.
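
The convergence of the two statistics for rare outcomes can be checked with a brief Python sketch. Both sets of counts are invented for illustration and are not study data; the "common outcome" table is included only as a contrast.

```python
def odds_ratio(fail_crash, fail_no_crash, pass_crash, pass_no_crash):
    """Ratio of crash odds: failing drivers versus passing drivers."""
    return (fail_crash * pass_no_crash) / (fail_no_crash * pass_crash)


def relative_risk(fail_crash, fail_no_crash, pass_crash, pass_no_crash):
    """Ratio of crash probabilities: failing drivers versus passing drivers."""
    risk_fail = fail_crash / (fail_crash + fail_no_crash)
    risk_pass = pass_crash / (pass_crash + pass_no_crash)
    return risk_fail / risk_pass


# Invented counts (not study data), in the order:
# fail_crash, fail_no_crash, pass_crash, pass_no_crash
rare = (4, 96, 16, 884)       # about 2 percent of drivers have the outcome
common = (60, 40, 340, 560)   # about 40 percent of drivers have the outcome

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {relative_risk(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

With the rare-outcome counts the relative risk (about 2.25) and odds ratio (about 2.30) nearly coincide. With the common-outcome counts they diverge noticeably (about 1.59 versus 2.47).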