Driver Screening and Evaluation Program
Volume II: Maryland Pilot Older Driver Study
The findings of the Maryland Pilot Older Driver Study are presented in this section, including data analysis summary tables, graphs and figures, and statistical test results. Additional detail, typically in the form of raw data tables, is deferred to report appendixes.
Separate subsections are devoted to each of the following analysis topics:
At the end of this section, information bearing on the feasibility of test administration is summarized, including an account of time and resources committed by Maryland MVA staff during the pilot study, and research team observations concerning difficulties encountered during data collection.
This section of the report identifies the sources of crash and (moving) violation data that served as the primary safety outcome measures, and the steps involved in creating analysis files to describe frequency distributions of these events and to test the strength of their relationships with functional status as measured by the screening procedures applied during the pilot study.
Primary data collection and database design were performed by staff at the MVA. The process of obtaining driver records began with the development of a master list of "Soundex" numbers; the Soundex is the Maryland driver's license number. The Soundex for each subject was acquired at the time of screening and keypunched into a local database along with screening results. The Soundexes from that compilation were first submitted to an MVA cross-reference table to identify any that may have changed since screening. Since the first character of the Soundex is derived from the driver's last name, an individual's Soundex changes whenever his or her name changes. Further, after the initial request was made for unique Soundexes in the Maryland state database, the local database was periodically updated with the current Soundex to maintain up-to-date crash and conviction records.
Conviction records were extracted from the Maryland Motor Vehicle Production Database which is the principal data repository for the licensing agency. The updated Soundex list was submitted to the MVA Production Database to determine which Soundexes were valid. When Soundexes as recorded in the local database were determined to be invalid, an individual review of the driver's record was made to correct the numbers used for tracking the individual. A unique list of updated and valid Soundexes was then resubmitted to the MVA Production Database for data extraction.
Crash records were extracted from the Maryland Automated Accident Reporting System (MAARS) Database. Each MAARS record originates as a paper crash report submitted to the Maryland State Police by any one of the more than 125 police jurisdictions within the State of Maryland. The contents of the paper report that can be stored in a database are keypunched by a unit of the State Police. A copy of the MAARS data is supplied to the Maryland State Highway Administration (SHA), where the location of the crash is edited to provide a reference to the highway system. MVA submits requests to SHA to make extractions.
The validity of the Soundex number for study participants was paramount, as this was the linking variable between databases containing the analysis outcomes of primary interest in this research. Where questions existed about the validity of an individual Soundex, all available evidence pertaining to that individual subject was reviewed to confirm/correct the Soundex. Validity for the License Renewal sample--most critical because it was the source of data for analyses relating driver functional status to safety outcomes--was assessed by comparing Test Date to Issue Date, along with date of birth and gender.
The analyses addressing functional status and safety, as noted earlier, were based upon data acquired from the License Renewal sample. These data were manually recorded for the most part, but also included measures derived from automated test procedures administered on a PC. Manually recorded screening data were typically entered locally at each test location from paper records, and stored centrally through the same network connection. Data products from automated screens were stored both locally and centrally through a network connection for each test device.
Some aspects of compiling the analysis database differed among the samples. For the Medical Referral sample, data collected manually at MVA offices around the state were sent to MVA headquarters, where they were keypunched centrally into MS Access tables by agency staff. In addition, the Medical Referral database included separate entries for one of its outcome measures--the determination of driving fitness by the examining physician (Daily Duty Doctor). Normally, during the MAB review process this determination is based solely upon driving history and medical history information; in this study, however, the added effect of access to functional screening data--if any--was evaluated. The first entry by the examining physician was therefore made before functional data from the screening measures were revealed; the second was made after this information became available to the physician. This permitted later analysis of how functional screening data may influence a Medical Advisory Board's decision-making process.
Once data were entered into Access tables for all samples--License Renewal, Medical Referral, and Residential Community--the MVA sent out preliminary versions for review by research team members. This review identified errors and other needs for changes by MVA to facilitate subsequent analyses. A final version of the database was then assigned a version number and sent to members of the research team for analysis. Version numbers were critical in this process because crash and conviction information was regularly updated. The interval between crash and conviction updates for the study samples coincided with other modifications to the subject databases or with specific requests by users of the data system.
Additional steps were involved after the project database was received from MVA, before planned analyses could proceed. Generally, data were imported into Access as text and converted to numerical, date, or logical (yes/no) format as appropriate. Since Access permits conversion to a more suitable format at almost any time, data were maintained in their original format until the final table of to-be-analyzed data was created. Original data tables as received from MVA were not altered during this process.
Initial data processing was performed using Microsoft Access 97. The Access tables were first linked together using the Soundex numbers as the key variable for each driver in the database. See appendix C for Access data structure and variables. Variables were selected from the Access tables and some variables were recoded prior to creating a rectangular file for analysis, i.e., where each row contains all data for one driver.
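The linking and flattening steps described above can be sketched in miniature. This is an illustrative reconstruction only: the table names, column names, and Soundex values below are hypothetical placeholders, not the actual MVA/Access schema, and SQLite stands in for Access.

```python
import sqlite3

# Hypothetical mini-schema standing in for the Access tables;
# the Soundex serves as the key variable linking tables, as in the study.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE screening (soundex TEXT PRIMARY KEY, mvpt_errors INTEGER)")
cur.execute("CREATE TABLE crashes (soundex TEXT, crash_date TEXT)")
cur.executemany("INSERT INTO screening VALUES (?, ?)",
                [("A123456789012", 1), ("B234567890123", 4)])
cur.executemany("INSERT INTO crashes VALUES (?, ?)",
                [("A123456789012", "1999-06-15")])

# One row per driver -- screening results joined with a crash count --
# mirroring the "rectangular file" described in the text.
rows = cur.execute("""
    SELECT s.soundex, s.mvpt_errors, COUNT(c.soundex) AS crash_count
    FROM screening s LEFT JOIN crashes c ON s.soundex = c.soundex
    GROUP BY s.soundex
""").fetchall()
```

The LEFT JOIN keeps drivers with no crash records in the rectangular file (with a count of zero), which is essential when the outcome of interest is crash frequency rather than crash presence.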
At this stage of processing, filters were applied to the exhaustive records for each driver in the MVA database to: (1) exclude crash events associated with the use of alcohol; and (2) restrict the observation period during which crash and conviction data would be compiled for each driver to test relationships with functional status indicators obtained during screening. In the first instance, filtering was justified because of a desire to--within the limits of the data quality afforded by State police reports--identify incidents where fault could be attributed specifically to a measured decline in functional ability. While alcohol use by a crashing driver does not rule out negligence or performance failure due to functional decline, of course, the confounding of these factors makes it impossible to reliably assess their relative contributions.
Next, within the domain of crash incidents without any indication of alcohol involvement, additional sorting was performed according to MVA system codes that distinguish crashes where fault has been assigned from crashes where fault is unknown in the judgment of the investigating officer. It is likely that in at least some instances where fault status was coded as "unknown," the driver was at fault1; single-vehicle, run-off-road crashes are sometimes coded in this category. Finally, a third crash category connoting an even lower probability of fault may also be inferred, from the absence of either of these codes. The result of this sorting process was to define three levels of (non-alcohol-related) crash involvement as primary safety outcomes in this research: (1) at-fault crashes; (2) crashes where fault was assigned or where an "unknown" code was assigned; and (3) all crashes. The subsequent interpretation of study results was keyed, in part, to this analytic approach.
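The two-stage filter above--first excluding alcohol-involved crashes, then sorting the remainder into three nested fault levels--can be expressed as a short sketch. The fault codes used here ("AT_FAULT", "UNKNOWN", "NONE") are illustrative stand-ins for the actual MVA system codes.

```python
def crash_levels(crashes):
    """Partition a driver's crashes into the three nested outcome levels:
    (1) at-fault; (2) at-fault or fault-unknown; (3) all non-alcohol crashes."""
    non_alcohol = [c for c in crashes if not c["alcohol"]]
    at_fault = [c for c in non_alcohol if c["fault"] == "AT_FAULT"]
    fault_or_unknown = [c for c in non_alcohol
                        if c["fault"] in ("AT_FAULT", "UNKNOWN")]
    return at_fault, fault_or_unknown, non_alcohol

crashes = [
    {"alcohol": False, "fault": "AT_FAULT"},
    {"alcohol": False, "fault": "UNKNOWN"},   # e.g., single-vehicle run-off-road
    {"alcohol": False, "fault": "NONE"},
    {"alcohol": True,  "fault": "AT_FAULT"},  # excluded by the alcohol filter
]
level1, level2, level3 = crash_levels(crashes)
```

Note that the three levels are nested by construction: every at-fault crash also appears in levels 2 and 3, which is why interpretation of results was keyed to this ordering.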
In a related processing activity, data identifying convictions for traffic violations on each driver's record were sorted according to their judged importance in understanding the relationship between functional status and safety. First, events coded in Maryland's system as "moving" violations were separated from all other violation codes including, for example, convictions for licensing matters (hearings, suspensions), parking infractions, and so forth. Within the category of moving violations, additional sorting was performed with the intention of excluding behaviors that are not prevalent among older drivers, based on the technical literature, or that may have been included in this category due to some peculiarity of the State's coding system but hold less credibility as the potential cause of a crash. This sorting activity defined three conviction categories for the present analyses: (1) all moving violations; (2) moving violations excluding speeding convictions; and (3) moving violations excluding speeding convictions and convictions for occupant restraint violations.
The relative proportions of convictions making up each analyzed category are displayed in figure 7. The (grouped) violation types remaining in the analyses after setting aside speeding and occupant restraint violations are also indicated.
One remaining filter that established boundaries on the to-be-analyzed dataset was the amount of time each driver's crash and violation experience was observed. There were two competing priorities in setting such boundaries--capturing as large an interval of experience as could reasonably be associated with differences in functional ability, as measured in the pilot study, and equalizing observation periods across all drivers in the (License Renewal) sample.
On the first point, choices included looking only at prospective data (crash and violation experience after each driver's date of testing) or, also including some extent of retrospective experience in the observation period; and if retrospective data were to be included, how far in the past could (differences in) drivers' functional ability reasonably be gauged by their performance on the included screening measures? In consultation with the Chief of the Maryland Medical Advisory Board2 it was determined that one year of retrospective data would be evaluated, while duplicating selected analyses using only prospective data to look for differences in the pattern of results that might alter any of the study's conclusions. A query was subsequently written in Access to bracket each driver's screening date with one year of experience before his/her test date and as much time after the date as allowed by the final extraction of crash and violation information from the Maryland system.
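The bracketing query's inclusion logic amounts to a simple date test, sketched below. One year is approximated here as 365 days, and extraction_date stands in for the date of the final crash/violation extraction from the Maryland system.

```python
from datetime import date, timedelta

def in_observation_window(event_date, test_date, extraction_date):
    """True if an event falls within one year retrospective of the driver's
    test date through the final data extraction (all prospective experience)."""
    return test_date - timedelta(days=365) <= event_date <= extraction_date

# A crash six months before the test date is retained...
kept = in_observation_window(date(1999, 1, 1), date(1999, 6, 1), date(2001, 3, 1))
# ...but one two years before the test date falls outside the window.
dropped = in_observation_window(date(1997, 6, 1), date(1999, 6, 1), date(2001, 3, 1))
```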
On the second point, it was inevitable given the period required to complete data collection procedures in this study that a longer period would be available in which to observe the experience of some drivers than others. This interval could be equalized if analyses were restricted to just the period of time following the last driver screened; but more than a year of prospective data would be disregarded for drivers tested earlier in the study with this approach, and the power of the analyses would decrease because of the reduced number of observations. The critical issue here is whether the variability in driving experience observation intervals inherent in the pilot study design is random with respect to crash-involved versus non-crash-involved populations.
In consideration of this possibility, the relative distributions of observation times for drivers in the License Renewal sample with and without crashes during the planned analysis interval were examined, at the level of "months after test date." These distributions are displayed in figure 8. The mean number of (full) months after test date for which driving experience data were available for drivers involved in crashes was 20.2, with a standard deviation of 2.6; for drivers who were not involved in crashes, the mean number of months after test date for which driving experience data were available was 19.9, with a standard deviation of 2.9. In other words, the observation interval was nine days longer, on average, for drivers in the crash-involved group than for drivers in the non-crash-involved group. A t-test between these means was not significant (p<.27). This outcome, and the closely overlapping distributions shown in figure 8, supported a decision to proceed with the planned analyses knowing that the interval in which the (prospective) crash and conviction data could be compiled would vary as per the earliest versus the latest test dates for the drivers in this sample.
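The comparison of observation intervals can be reproduced from the reported summary statistics. The sketch below computes a Welch-type t statistic from group means and standard deviations; the group sizes shown are hypothetical placeholders (the report does not state them here), and the report may have used a pooled-variance test instead.

```python
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic computed from group summary statistics."""
    return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)

# Reported summaries: crash-involved mean 20.2 months (SD 2.6),
# non-crash-involved mean 19.9 months (SD 2.9).
# The sample sizes below are illustrative assumptions only.
t = welch_t(20.2, 2.6, 150, 19.9, 2.9, 1700)
```

With group sizes anywhere near these magnitudes, the 0.3-month difference yields a small t statistic, consistent with the nonsignificant result reported.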
The descriptive data summaries and analyses relating safety outcomes to functional performance were performed using SPSS SYSTAT (v. 9.01). Access tables were imported using SYSTAT's Open Database Connectivity (ODBC) Database Capture feature. Once the data were successfully imported, SYSTAT was used to exclude outliers, recode variables, and create variables for analysis.
The following data filters and variable recodes were performed first (with actual variable names provided in parentheses):
In order to calculate odds ratios (O.R.s), pass/fail criteria had to be established for the performance measures, and safety outcome measures had to be recoded using a (-1) to indicate an occurrence of an adverse event (i.e., crash or conviction). As discussed in a later section, the highest O.R. for which cell counts permitted valid analyses established candidate criteria for cutoffs for the included measures.
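The odds ratio computation is the standard cross-product ratio from a 2x2 table of screening result (pass/fail at a candidate cutoff) by adverse event (crash or conviction). The cell counts in this sketch are hypothetical.

```python
def odds_ratio(fail_event, fail_no_event, pass_event, pass_no_event):
    """Odds ratio from a 2x2 table: odds of an adverse event among drivers
    failing the screen, relative to the odds among those passing it."""
    return (fail_event * pass_no_event) / (fail_no_event * pass_event)

# Hypothetical counts: 20 of 100 failing drivers had an adverse event,
# versus 50 of 900 passing drivers.
or_value = odds_ratio(20, 80, 50, 850)
```

An O.R. above 1.0 indicates elevated risk among drivers failing the measure; as the text notes, the usable cutoffs were constrained by the requirement that all four cells contain enough counts for a valid estimate.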
The outcome variables (with actual variable names provided in parentheses) were calculated as follows:
In this section, figures and text descriptions summarize performance on the various functional screening measures, for each study sample. Age distributions are also shown below, for review purposes.
The following pages display the percentage of the distribution that scored at each possible level of performance for each measure. This permits the performance of the License Renewal, Residential Community, and Medical Referral samples to be compared directly despite the different numbers of participants in each group. Tables containing descriptive statistics for each screening measure may be reviewed in appendix D.
Age distributions of study participants, separately presented for each sample in an earlier section, are contrasted below. These data provide insight into certain performance differences between groups of drivers that are apparent in the functional screening data summaries that follow. Specifically, the ages on the date of testing for each study participant are shown in figure 9 for 5-year groups beginning at age 55. As indicated, the License Renewal sample most closely approximates a normal distribution, while the Residential Community sample, and especially the Medical Referral sample, are skewed somewhat toward older ages. For means and standard deviations of these age distributions, see the earlier discussion in Test Site and Sample Selection.
The following series of plots summarize the screening data collected for each functional measure, for each study sample: License Renewal, Residential Community, and Medical Referral. Results for the perceptual-cognitive measures are reported first, then the physical measures. To display data for all three samples within the same plot, it was necessary to use a common y-axis. Because of differences in the number of drivers in each sample, the counts within each bar were first normalized to the number of drivers. This permitted results to be displayed as a percent of the distribution scoring at each level represented on the x-axis. It should be noted that performance degrades for all measures moving toward the right along the x-axis.
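The normalization used for these plots--converting raw score counts to a percent of each sample at each score level--can be sketched as follows (the scores are illustrative, not study data).

```python
from collections import Counter

def percent_distribution(scores):
    """Convert raw scores to the percent of the sample at each score level,
    so samples of different sizes share a common y-axis."""
    counts = Counter(scores)
    n = len(scores)
    return {level: 100.0 * c / n for level, c in counts.items()}

dist = percent_distribution([0, 0, 1, 1])
```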
Motor Free Visual Perception Test/Visual Closure Subtest (MVPT/VC). Figure 10 shows the distributions of the respective study samples on the MVPT/VC measure of perceptual-cognitive function. The horizontal axis is the number of incorrect responses, out of 11 trials. Descriptive statistics for this measure are presented in table 14 in appendix D.
There are two noteworthy aspects of this plot. First, the Medical Referral distribution shows a higher proportion of drivers with more than 2 incorrect responses. This occurs despite the fact that the mean ages for the Residential Community and the Medical Referral samples are essentially the same (only 1 year difference in mean age). Second, the drivers from the License Renewal and Residential Community samples performed similarly despite the fact that the mean ages of these samples were very different (68 versus 78 years, respectively). In fact, a slightly higher percentage of drivers in the Residential Community sample had zero errors on this test than drivers in the License Renewal sample. This dissociation of age and functional performance illustrates the problems associated with the use of (chronological) age alone as a predictor of driving impairment.
Delayed Recall. Figure 11 presents the distributions for Delayed Recall, a measure of a driver's working memory obtained approximately 10 minutes after the Cued Recall procedure during which the memory probe set was repeated. Performance is measured as the number of items correctly recalled after the intervening interference period (out of 3).
The most noteworthy aspect of the results for Delayed Recall is that about 20 percent more of the drivers in the Medical Referral sample missed 2 or more items, compared to the other samples. The License Renewal and Residential Community samples performed similarly--over half of those tested in each of these samples did not miss any of the recalled words. Descriptive statistics for this measure are presented in table 15 in appendix D.
Useful Field of View, Subtest 2. Figure 12 is a plot of the results for the Useful Field of View, Subtest 2. As noted earlier, this is a speed-of-processing test, with a divided attention requirement, where the field of view is actually held constant. It is a timed test, where the speed of response is scored in milliseconds, as plotted on the x-axis.
The apparent anomaly showing a high proportion of responses clustered at the 500 millisecond latency is due to an artifact of the scoring algorithm. If a person requires longer than 500 milliseconds to successfully discriminate the stimuli in this procedure, his or her score is entered as 500 milliseconds and the test is discontinued; thus, the actual range of responses for this measure is unknown.
The differences among the samples on this measure are most pronounced at the peaks of the distributions. The peak (or "best" performance) of the License Renewal distribution is 50 msec, whereas the peak for the Medical Referral distribution is 500 msec, i.e., the "worst" level of performance scored. At both extremes, Residential Community sample scores fall between the other two samples. No systematic differences between the samples are apparent at intermediate scores. Descriptive statistics for this measure are presented in table 16 in appendix D.
Trail-making, Part B. The distributions for this perceptual-cognitive measure are shown in figure 13. The x-axis, labeled completion time, indicates the number of seconds drivers required to connect the (25) items in the correct order. The maximum time allowed to complete the test was 6 minutes (360 seconds).
For all samples, performance on Trails B peaked at about 100 seconds. However, the License Renewal and Residential Community distributions are clearly skewed toward briefer completion times (intact functionality), while drivers in the Medical Referral sample displayed the full range of capabilities measured by this procedure. As a result, over half of the scores lie above 100 seconds for the Medical Referral group, whereas over half of the scores for the other two groups lie at or below 100 seconds. Descriptive statistics for this measure are presented in table 17 in appendix D.
Dynamic Trails. Next, the results for the Dynamic Trails test are summarized in figure 14. This procedure was derived from Trails B, including a more distracting background for the letter and number stimuli but fewer items (14 instead of 25); this may explain the faster completion times.
As shown, the shift in the peaks, as well as the overall shape, of these distributions closely match performance using the paper-and-pencil test protocol. Descriptive statistics for this measure are presented in table 18 in appendix D.
Scan Test. The remaining measure of perceptual-cognitive ability screened for visual neglect or other scanning deficits. Scan Test results are presented in figure 15. As indicated, the overwhelming majority of drivers in all three samples passed this test. The highest failure rate, for drivers in the Medical Referral group, was 14 percent. Because of observed inconsistencies in the administration of this procedure, it is unknown whether the measure lacks sensitivity or whether differences were washed out by measurement error. The principal difficulty was that, without actually restraining head movement, the testing requirement that drivers scan the chart with eye movements only could not be met on a consistent basis. Descriptive statistics for this measure are presented in table 19 in appendix D.
Rapid Pace Walk. Turning to the results for measures of physical abilities, performance on the 20-foot Rapid Pace Walk is shown for each sample in figure 16. Descriptive statistics for this measure are presented in table 20 in appendix D.
The License Renewal sample demonstrated the fastest mean time (6.5 sec) to complete this measure. These drivers also evidenced the lowest proportion of individuals showing exaggerated decline in this functional ability. The Residential Community sample was similarly skewed toward "intact" functional status, but consistent with their relatively advanced age, the entire distribution was right-shifted along the x-axis. The Medical Referral sample, by comparison, was the slowest on average (7.8 sec), and also showed a marked increase in the proportion of those tested whose lengthy completion times indicated an exaggerated decline in this ability.
Foot Tap. Figure 17 presents performance distributions for the Foot Tap measure. As for the Rapid Pace Walk, the mean completion times are similar for the License Renewal and Residential Community samples, while drivers in the Medical Referral sample, on average, were a full second slower.
The similarities in the patterns of results for Foot Tap and Rapid Pace Walk are consistent with a presumption that these measures address common functional abilities. In fact, the calculated correlation between these measures in the License Renewal sample was r = .48. Descriptive statistics for the Foot Tap measure are presented in table 21 in appendix D.
Head/Neck Rotation. The results for the Head/Neck rotation measure are presented in figure 18. Most drivers passed this measure but there are some noteworthy differences among the samples. As anticipated, the poorest performing sample is the Medical Referral group, in which 37 percent of drivers failed. The License Renewal group demonstrated slightly less decline in this ability than the Residential Community group; but again, these drivers were nearly 10 years younger, on average. Descriptive statistics can be found in table 22 in appendix D.
Arm Reach. Figure 19 plots the data for the Arm Reach measure. As a reminder, drivers performed the test separately for the left and right arms. These results were then combined to create a single pass/fail measure, i.e., drivers had to pass both left and right arm reach tests to receive a passing score.
Similar to the results for the Scan Test reported above, virtually all drivers screened obtained a passing score on this measure. In this case, however, no serious methodological problems were evident in test administration; this was simply not a sensitive measure. Descriptive statistics for this measure are presented in table 23 in appendix D.
The descriptive data summaries presented in this section have underscored the importance of measuring functional ability without regard to chronological age--inarguably, samples alike in age differ substantially on perceptual-cognitive and physical measures related to safe driving ability, while the performance distributions of samples of older drivers almost a decade apart in mean age are nearly congruent, on multiple measures. Results of the critical analyses relating differences on each functional measure to crash and violation experience for the population-based sample in this study, the License Renewal group, follow a summary of the Mobility Questionnaire responses.
This section summarizes the data obtained using the Mobility Questionnaire, characterizing the study samples in terms of their self-imposed limitations in the number of miles driven, and/or in the situations they choose to drive in. As displayed in appendix E, the following subjective measures were obtained:
Numerical estimates were obtained for questions addressing weekly driving exposure, while a categorical estimate of miles driven was obtained for annual exposure. For questions beginning, "How often do you …," responses were obtained using a rating scale containing the following 5 options: Never, Rarely, Sometimes, Usually, or Always. The figures below reveal differences between the study samples for each qualitative measure. Detailed descriptive statistics can be found in tables 24, 25, and 26 of appendix E for the License Renewal, Residential Community, and Medical Referral sample responses, respectively.
Figure 20 presents the results for self-reports of the typical number of driving days per week for each sample. In every group, the largest percentage of drivers reported making at least one trip via personal automobile every day of the week. The Medical Referral group members were least likely to drive every day, and correspondingly more likely to drive only one, two, or three days per week.
Figure 20 also indicates that the Residential Community group--though considerably older, on average, than the License Renewal sample--chose to drive at comparable levels based on this measure.
Figure 21 presents the results for the estimated number of miles driven per week. The Medical Referral group, which reported driving fewer days per week, also included more people who reported driving the fewest miles per week. For every group, however, the distribution was strongly skewed toward limited driving exposure, as one-half or more of respondents indicated that they drove 100 miles or less per week.
Figure 22 presents the results for the estimated number of miles driven per year. Consistent with the previous measure, Medical Referral drivers tended to report driving fewer miles per year; the peak of their distribution was 1,000 miles/year, and self-reported exposure fell sharply at the 5,000 miles per year level. In contrast, the peak in self-reported annual miles driven by the License Renewal and Residential Community groups was 5,000, and roughly one-third of the distributions for both of these samples fell in the 10,000-15,000 miles per year range.
An internal check on the reliability of the self-reported exposure measures was performed by calculating the correlation between each driver's estimates of miles driven on a weekly versus an annual basis. For all drivers age 55 and over sampled in the pilot study, r = .65. The License Renewal sample was of particular interest, since it provided the data upon which analyses relating functional status to safety outcomes were performed; for this group--which comprised the largest number of study participants by a wide margin--the calculated r value was a nearly identical .64.
This r value implies an overall level of agreement between these measures--which represent two different ways of asking the same question--that is moderate to good. A finer examination of the reliability of drivers' exposure estimates involves direct comparison of the annual-miles-driven figures with an extension of the miles-per-week estimates (i.e., multiplied by 52). This multiplication was performed, then the product was divided by the estimate of annual miles driven for each person in the License Renewal sample. This procedure yielded a "percent error" score that reveals the discrepancy between drivers' estimates of miles driven when asked the same question in two different ways. As shown in figure 23, over 10 percent of the sample provided responses characterized by over 100 percent error, and over 40 percent of the responses showed at least a 50 percent error. The implications of this finding are discussed in the report's Conclusions section.
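The percent-error computation described above can be sketched as follows. This is one plausible form of the calculation--expressing the discrepancy between the extended weekly estimate and the annual estimate as a percentage of the annual estimate--since the text does not give the exact formula.

```python
def exposure_percent_error(miles_per_week, miles_per_year):
    """Discrepancy between the two self-reported exposure estimates,
    as a percent of the self-reported annual mileage."""
    extended = miles_per_week * 52  # extend the weekly estimate to a year
    return 100.0 * abs(extended - miles_per_year) / miles_per_year

# A driver reporting 100 miles/week and 5,200 miles/year is perfectly
# consistent; one reporting 200 miles/week against the same annual
# figure shows a 100 percent discrepancy.
consistent = exposure_percent_error(100, 5200)
inconsistent = exposure_percent_error(200, 5200)
```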
Next, some insight into the extent that older drivers may self-regulate their exposure and the situations they avoid most often is provided by the subjective responses that are summarized in figures 24 through 29.
Nighttime driving. First, figure 24 shows how often the drivers in each sample avoid driving at night. The Medical Referral sample, which also experienced the largest degree of functional decline based on the present screening battery, contained the highest proportion of drivers--almost one-third--who reported that they "Always" limit their nighttime driving. At the opposite extreme, the License Renewal sample contained the highest proportion of drivers reporting that they "Never" limit their nighttime driving.
Left turns. Next, figure 25 shows results for the frequency of left turn avoidance by sample. The majority of drivers in each of the three samples report "Never" avoiding left turns. There are some slight differences in proportions among the three samples; namely, a larger proportion of drivers in the License Renewal sample report "Never" avoiding left turns, followed by drivers in the Residential Community and then the Medical Referral sample. However, there do not appear to be any systematic trends for drivers reporting results other than "Never" on this measure.
Bad weather. Figure 26 presents the results for avoiding driving in bad weather. The differences among the three samples are small and not systematic for those responding "Rarely," "Sometimes," and "Usually." At the endpoints, however, some clear distinctions emerge. As per prior responses, the License Renewal sample had the highest proportion of drivers who responded "Never," followed by the Residential Community and then the Medical Referral sample. At the opposite end of the scale, the largest proportion of drivers responding "Always" were in the Medical Referral group, followed by the Residential Community and then the License Renewal sample.
Unfamiliar roads. Figure 27 shows the frequency of avoiding unfamiliar roads. The most frequent response among all three samples was "Never." The License Renewal sample had the highest proportion of drivers responding "Never" whereas the Medical Referral group had the highest proportion of drivers responding "Always." This pattern of results is very similar to that for bad weather avoidance.
Heavy traffic. Self-reports of the frequency of heavy traffic avoidance are presented in figure 28. It is interesting to note that on this measure the Residential Community most closely matches the responses of the Medical Referral sample. In any case, the License Renewal sample has the highest proportion of drivers who "Never" avoid driving in heavy traffic.
Social opportunities. Finally, figure 29 presents the results for the frequency with which drivers pass up social opportunities because of concerns about their driving. The overwhelming majority of drivers from all three samples reported that they "Never" pass up such opportunities. Also, the proportion of drivers indicating that they "Never" pass up opportunities to drive shows the same pattern seen before where the proportion of drivers is highest for the License Renewal, next highest for Residential Community drivers, and lowest for Medical Referral drivers.
The subjective data summarized in this section have provided a useful contrast between groups of drivers with known differences in their age characteristics and functional status. First, these data showed that the similarity between groups in terms of functional ability is more important than their proximity in age, vis-à-vis the reported frequency of driving and the number of miles driven. In both cases, the sharpest distinctions observed among the self-reports were between those of the Medical Referral group and the other two samples.
A somewhat different picture emerges from inspection of the self-report data regarding avoidance of problem driving situations. In these comparisons, responses from the samples closest in age composition--the Residential Community and the Medical Referral groups--were more alike and were in contrast to the responses of the younger, License Renewal group with regard to how often the identified situations were "Never" avoided. Still, the Medical Referral group was consistently higher than the others with respect to how often these drivers said they "Always" avoid the identified situations.
These data reinforce other research findings and anecdotal reports indicating that self-regulation among older drivers is common. This supports a stance that, while safety-relevant functional deficits may be significant from both a statistical and operational perspective, these deficits may not manifest themselves in predicted increases in crash rates due to mediating effects of self-regulation. At the same time, the qualitative results summarized above provide only the most general insight into the questions of whether the "right" drivers (i.e., most functionally impaired) are self-limiting their exposure, and in what situations, and by how much.
This section quantifies and tests the significance of the statistical relationships between the functional screening measures and the crash data extracted from the Maryland Motor Vehicle Administration files. These associations were calculated according to the conventions for measuring, sorting, and summarizing functional status and safety outcome data described previously in this report.
The strength of relationship between functional status and crash risk was assessed primarily through the use of the "odds ratio" calculation. A brief explanation of this analytic technique, assumptions that must be met for its valid application, and its relationship to another potentially useful approach ("relative risk") follow. Analysis results are then reported.
Odds Ratio Calculations. Odds ratios are calculated by taking the ratio of "experimental event" odds to "control event" odds. The experimental event in the present application occurs when a driver "fails" a particular screening measure, whereas a control event occurs if the driver "passes," based on some criterion or cutpoint. Also included in this calculation are the event classification outcomes--crash versus no crash. Using traditional signal detection terminology, it may be demonstrated that each driver in the sample falls into one of four groups as shown in figure 30. For each of these groups, the odds of being involved in a crash are then calculated according to the formula shown in this figure (cf. http://www.jr2.ox.ac.uk/cebm/docs/oddsrats.html). For future reference, it should be noted that the numbers of drivers in cells labeled b and d in figure 30 will always vastly exceed the number of drivers in cells a and c, because motor vehicle crashes remain (relatively) rare events.
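The calculation just described can be sketched as follows. This is an illustrative sketch, not the study's analysis code; the assignment of cell labels a through d (fail/crash, fail/no-crash, pass/crash, pass/no-crash) is assumed here from the signal detection terminology above and should be checked against figure 30 itself.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 screening/outcome table.

    a: failed the screen and was crash-involved ("hit")
    b: failed the screen but remained crash-free ("false alarm")
    c: passed the screen but was crash-involved ("miss")
    d: passed the screen and remained crash-free ("correct rejection")
    """
    if min(a, b, c, d) == 0:
        raise ValueError("OR is undefined when any cell count is zero")
    # Odds of a crash given a failed screen, divided by the odds
    # of a crash given a passed screen; equivalent to (a*d)/(b*c).
    return (a / b) / (c / d)

# Hypothetical counts: 10 of 100 failers crashed, 5 of 200 passers crashed.
print(odds_ratio(10, 90, 5, 195))  # → 4.333...
```

Note that, as the text observes, the crash-free cells (b and d) dwarf the crash-involved cells (a and c) in any realistic data set.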
In this context, the practical meaning of the odds ratio (OR) is to express how much more likely it is that drivers will be involved in a crash if they fail a test than if they pass the test. For example, an OR value of 3 means that a driver who fails a test is three times more likely to be involved in a crash than a driver who passes it. Also worth mentioning is the relationship between OR and "relative risk." While subtle differences exist in calculating these measures, for rare events (i.e., crashes) they yield equivalent results (http://www.cche.net/usersguides/overview.asp).
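The near-equivalence of OR and relative risk for rare events can be checked numerically. The cell counts below are illustrative, not pilot-study data; they are chosen so that crashes are rare in both the fail and pass groups.

```python
def odds_ratio(a, b, c, d):
    # a*d / b*c form of the OR from figure 30.
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    # Crash risk among test failers divided by crash risk among passers.
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts with crashes rare in both groups (b >> a, d >> c).
a, b, c, d = 12, 2988, 4, 3996
print(round(odds_ratio(a, b, c, d), 3))    # → 4.012
print(round(relative_risk(a, b, c, d), 3)) # → 4.0
```

With crash rates of 0.4 percent and 0.1 percent, the two measures differ only in the third decimal place; the gap widens as the events become more common.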
Although the odds ratio has been used effectively in a number of contexts, it is important to note a few limitations to the validity of this calculation. First, OR cannot be calculated when any of the cell values are zero. Paradoxically, this includes instances where the measure is a perfect predictor--i.e., where there are no "misses" (where a driver passes the test but still has a crash) or "false alarms" (where a driver fails the test but remains crash-free). Second, an OR calculated for data with fewer than 5 counts in any cell in the matrix shown in figure 30 is statistically unreliable and easily susceptible to misinterpretation.
Finally, even when requirements for a valid OR calculation are met, the resulting values can be quite misleading. Since the OR calculation relies on four different cell counts, a high value can result from a relatively high number of hits or correct rejections. Conversely, the calculated OR can be high due to a relatively low number of false alarms or misses. Understanding the predictive value of an OR outcome requires an investigation of actual cell counts, a comparison of raw data distributions, and the investigation of multiple pass/fail cutpoints. Interpreting an OR value without explicit reference to these analysis attributes is problematic. Often, the pattern of change in calculated OR values across different cutpoints is most revealing of the relationship between predictor and criterion measures. The plots presented in this section, the accompanying data tables in appendix F, and the chi-square tables in appendix G are designed to satisfy requirements for a meaningful interpretation of calculated odds ratios.
In the following pages, three plots with bar graphs are presented to express the results of the OR calculations for every continuous measure in the functional screening battery. The three plots correspond to the three crash outcome measures--all crashes, at-fault plus unknown-fault crashes, and at-fault crashes only--as defined earlier in the report. Each plot contains the distribution of crash-involved and non-crash-involved drivers in the License Renewal sample, for a particular measure of functional ability. The heights of the bars allow direct comparison of the crash and non-crash distributions of drivers who "failed" a given test, calculated separately at each of a number of possible cutpoints within the range of performance on that test. The y-axis, labeled Percent of Distribution, is common for all plots. For the x-axis, movement from left to right connotes decreasing performance (or increasing functional impairment) for all measures. The x-axis varies among the plots, however, according to the units in which performance is measured (e.g., time, distance, percent correct) and the overall range of performance, for each continuous measure; for binary (pass/fail) measures, every response alternative is marked on the x-axis. The values labeled on the x-axis in each data plot thus define the range of all possible cutpoints for a given screening measure that were evaluated in these analyses.
For each measure, OR was calculated at every possible cutpoint represented in the plot. The resulting OR calculations were then graphed as a continuous line, using the right vertical axis to indicate OR value. In this context, the term "cutpoint" means that everyone who scored at that level of performance or worse failed the test; to pass the test a driver must perform better than the cutpoint. Therefore, no OR value could be calculated for the best level of performance on each functional measure, because no one passed according to the operational definition above. With no passes, the denominator in the odds ratio calculation formula (see figure 30) is zero. The line representing calculated OR value thus begins at the second-best level of performance, or first possible cutpoint, marked along the x-axis in each plot. Also, in every plot a dashed line, connoting an OR of 1.0, is included for reference. At this level, a driver is as likely to be crash-involved when passing a test as he/she is when failing the test; and the OR effectively has no predictive value. Exact OR values for the data represented in the plots, keyed to each potential cutpoint marked on the x-axis, are presented in appendix F.
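The cutpoint procedure just described can be sketched as a sweep over candidate cutpoints. This is a minimal illustration, not the study's code: the function and variable names are hypothetical, scores are assumed to be scaled so that higher values mean worse performance, and cutpoints producing a zero cell or any cell below 5 (the validity limits noted earlier) are flagged rather than reported.

```python
def or_by_cutpoint(scores, crashed, cutpoints):
    """Compute the odds ratio at each candidate cutpoint.

    A driver "fails" at a cutpoint if his/her score is at or worse
    than it (higher = worse here); passing requires a better score.
    Returns (cutpoint, OR) pairs, with None where the OR is invalid.
    """
    results = []
    for cut in cutpoints:
        a = sum(1 for s, x in zip(scores, crashed) if s >= cut and x)
        b = sum(1 for s, x in zip(scores, crashed) if s >= cut and not x)
        c = sum(1 for s, x in zip(scores, crashed) if s < cut and x)
        d = sum(1 for s, x in zip(scores, crashed) if s < cut and not x)
        if min(a, b, c, d) < 5:
            results.append((cut, None))  # zero or sparse cell: unreliable
        else:
            results.append((cut, (a * d) / (b * c)))
    return results

# Toy data: 10 worse scorers and 10 better scorers, half of each crash-involved.
scores = [2] * 10 + [1] * 10
crashed = ([True] * 5 + [False] * 5) * 2
print(or_by_cutpoint(scores, crashed, [1, 2]))  # → [(1, None), (2, 1.0)]
```

At the worst cutpoint (1) everyone fails, the pass cells are zero, and no OR exists, mirroring the text's point that the OR line must begin at the second-best level of performance.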
Significance Testing. Levels of significance of calculated OR values were assessed using chi-square (χ²) tests. Test statistics were calculated by SPSS/SYSTAT for relationships between functional performance measures and at-fault crashes. However, not all possible cutpoints were evaluated; the significance level reported in the following write-up of analysis results is typically the one attained at the cutpoint where the peak valid OR value was calculated for a given measure. Chi-square tables are presented in appendix G.
As a general finding, it was observed that an OR value of approximately 2, or greater, was associated with a statistically significant (p<.05) chi-square test result. Sample sizes and the respective distributions of crash-involved versus non-crash-involved drivers--gauged in terms of their relative proportions at different degrees of functional impairment--also exert strong influence on χ² test results, as noted below where appropriate.
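For a 2x2 table the Pearson chi-square statistic has a closed form, so the significance check can be sketched without a statistics package. The cell counts below are hypothetical, and 3.841 is the standard critical value for p < .05 at 1 degree of freedom; the study's own tests were run in SPSS/SYSTAT.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRITICAL_05 = 3.841  # chi-square critical value, 1 df, alpha = .05

# Hypothetical counts giving OR = (20*390)/(180*10) ~ 4.33.
stat = chi_square_2x2(20, 180, 10, 390)
print(round(stat, 2), stat > CRITICAL_05)  # → 15.79 True
```

Consistent with the general finding above, an OR comfortably above 2 on a sample of this size clears the p < .05 threshold, though as the text notes the result also depends on how the crash-involved drivers are distributed across impairment levels.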
1 Pers. comm., Mr. Jack Joyce, Driver Safety Research Office, Maryland Motor Vehicle Administration, January 8, 2002, conversation with the Principal Investigator.
2 Pers. comm., Dr. Robert Raleigh, Medical Advisory Board, Maryland Motor Vehicle Administration, November 27, 2001, conversation with the Principal Investigator.