

The approach selected by Veridian Engineering to meet the objectives of the study involved developing and refining a clinical assessment methodology for crash causation, selecting a data source, determining the necessary crash-related data, selecting sites, and choosing data analysis techniques.

Clinical Assessment

A clinical analysis sequence was developed to determine the causes of the crashes investigated and the specific unsafe driving acts (UDAs) or behavioral errors that occurred and contributed to each crash. The clinical analysis sequence consisted of eleven steps:

    1. Assess crash participants' statements.
    2. Examine physical evidence patterns generated during the crash sequence.
    3. Verify accuracy of available data and resolve discrepancies.
    4. Verify crash type.
    5. Assess pre-existing conditions.
    6. Assess critical event.
    7. Evaluate crash cause.
    8. Evaluate driver behavior (safe/unsafe).
    9. Specify UDA.
    10. Determine intentionality of UDA.
    11. Determine behavior source of UDA.

A schematic representation of the clinical analysis sequence is provided in Figure 1.

Previous experience indicated that most of the data required to successfully execute steps 1-7 were available in standard case reports provided by the National Automotive Sampling System (NASS) Crashworthiness Data System (CDS). It was also apparent, however, that additional data collection would be required to provide an adequate basis for executing steps 8-11 of this analysis sequence. This additional information related to what the involved drivers observed as the crash sequence developed, their specific responses to pre-crash and crash events, and their general physiological and psychological states prior to the crash. The project staff developed detailed interview formats to secure the required data.
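The split between the two data sources can be sketched as a simple lookup. This is only an illustration: the step titles come from the list above, while the names `DataSource`, `CLINICAL_STEPS`, and `data_source_for` are hypothetical and are not part of the study's tooling.

```python
from enum import Enum


class DataSource(Enum):
    # Labels are illustrative; the report names the sources but defines no codes.
    NASS_CDS = "standard NASS CDS case report"
    SUPPLEMENTAL = "supplemental driver interview data"


# Step titles as listed in the clinical analysis sequence.
CLINICAL_STEPS = {
    1: "Assess crash participants' statements",
    2: "Examine physical evidence patterns generated during the crash sequence",
    3: "Verify accuracy of available data and resolve discrepancies",
    4: "Verify crash type",
    5: "Assess pre-existing conditions",
    6: "Assess critical event",
    7: "Evaluate crash cause",
    8: "Evaluate driver behavior (safe/unsafe)",
    9: "Specify UDA",
    10: "Determine intentionality of UDA",
    11: "Determine behavior source of UDA",
}


def data_source_for(step: int) -> DataSource:
    """Return the main data source for a step (steps 1-7 vs. steps 8-11)."""
    if step not in CLINICAL_STEPS:
        raise ValueError(f"unknown step: {step}")
    return DataSource.NASS_CDS if step <= 7 else DataSource.SUPPLEMENTAL
```

The lookup simplifies the text slightly: the report says that most, not all, of the data for steps 1-7 came from the standard case reports.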

Data Sources

Since the data necessary for steps 1-7 of the clinical assessment were already available in the NASS, and there was a desire to attain a fairly representative sample of serious crashes in the U.S., a decision was made to integrate the data collection activity into the NASS program as a special study.

Figure 1: Schematic Depiction of Clinical Analysis Sequence
(Panel label: Evaluate Crash Cause (Steps 1-7): What was the primary reason for the crash?)

Field Data Collection

Field data were collected in the following manner:

  • Case Selection: Cases were selected in accordance with the NASS sampling algorithm.

  • Scene Documentation: Scenes were documented in accordance with the NASS scene protocol with a few minor additions. NASS Researchers were requested to measure and photograph aspects of the roadway geometry/configuration and roadside features which may have influenced crash causation.

  • Vehicle Documentation: Vehicles were documented in accordance with the NASS vehicle documentation protocol. A smaller number of exterior vehicle photographs were submitted with the UDA case report and interior vehicle documentation forms were omitted from the package. Obvious vehicle failures were recorded.

  • Occupant Injury Documentation: Occupant injury levels were documented in accordance with the standard NASS protocols.

  • Driver Interviews: The project staff developed a UDA form which summarized UDA data for each driver involved in the crash. While most of the variables contained in the UDA form were also present in a driver interview form, the driver was not intended to be the sole source for the UDA form responses. The intent of this form was to provide the most accurate assessment available for each driver in the crash sequence. Therefore, field investigation personnel were instructed to incorporate findings from other interviews conducted for the crash and from their field investigation of the crash sequence.

Data Processing

A UDA database was designed as a series of sub-files that described individual crashes. The file record for each crash contained the following information:

  • Selected NASS CDS Variables: A total of 95 NASS CDS variables were incorporated into the UDA database directly from the NASS computerized file. Variables incorporated from the NASS Crash Form were general variables that applied to the overall crash sequence. All remaining CDS variables incorporated from the NASS file were either vehicle or occupant specific and were provided for each crash-involved vehicle/occupant.

  • UDA Form Variables: A total of 78 UDA Form variables were incorporated into the database. These variables were coded by the NASS Researchers following certain clinical assessment rules.

  • UDA Variables Coded By Project Staff: A total of 13 UDA variables were coded by the project staff for each crash-involved vehicle using the clinical assessment technique. These variables added the following information to the database:

    • Primary crash cause
    • Nature of crash causation factor
    • Assessment of the effect of the manner of vehicle operation on crash risk
    • Primary and contributory UDAs
    • UDAs which were a necessary condition for crash occurrence
    • Intentionality of primary UDA
    • Behavioral sources of UDAs
    • Temporal sequencing of UDAs
    • Estimated travel and impact speeds
    • Nature of speed estimates
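A minimal sketch of how one crash record and its vehicle-level sub-files might be laid out is shown below. The report does not give the actual file layout or variable names, so every class and field name here (`CrashRecord`, `VehicleRecord`, `case_id`, and so on) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VehicleRecord:
    """Variables kept per crash-involved vehicle (all field names hypothetical)."""
    vehicle_number: int
    # Vehicle- or occupant-specific variables taken from the NASS CDS file.
    cds_variables: Dict[str, object] = field(default_factory=dict)
    # Variables coded by the NASS Researchers on the UDA form.
    uda_form_variables: Dict[str, object] = field(default_factory=dict)
    # Variables coded by the project staff (e.g. primary crash cause).
    staff_coded_variables: Dict[str, object] = field(default_factory=dict)


@dataclass
class CrashRecord:
    """One crash: crash-level CDS variables plus one sub-record per vehicle."""
    case_id: str
    crash_level_cds: Dict[str, object] = field(default_factory=dict)
    vehicles: List[VehicleRecord] = field(default_factory=list)


# Placeholder values only; these are not study data.
record = CrashRecord(case_id="EXAMPLE-001")
record.vehicles.append(
    VehicleRecord(
        vehicle_number=1,
        staff_coded_variables={"primary_crash_cause": "placeholder"},
    )
)
```

Nesting the vehicle records inside the crash record mirrors the distinction drawn above between crash-level variables and variables provided for each crash-involved vehicle or occupant.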

Site Selection

It was considered important to select a limited number of sites to ensure that adequate oversight could be provided to these sites. In addition, it was important to select sites which had historically achieved high scene/vehicle inspection rates and very high interview completion rates in the NASS. A total of four PSU sites meeting the above criteria were selected to participate in this effort. The final sites were:

PSU Location

Allegheny County, Pennsylvania
Knox County, Tennessee
Jefferson and Gilpin Counties, Colorado
Seattle, Washington

Data collection at each of the four NASS sites was initiated on April 8, 1996, for crashes occurring on or after April 1, 1996. Data collection ended on April 30, 1997. A total of 723 crash cases involving 1,284 vehicles were collected during this period.

Data Analysis

All relevant data were computerized and analyzed using the SAS statistical package. Initially, univariate analyses were performed to determine relative frequencies of the various unsafe driving acts (UDAs), driver behavioral errors, and crash types. In addition, multivariate analyses were performed to determine relationships between the UDAs, driver behavioral errors and crash circumstances. Emphasis was placed on identifying the most important driver demographic and behavioral characteristics and crash situation descriptions associated with each of a set of crash types. This analysis produced a series of profiles of the driver's actions, attributes and crash conditions.
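As a rough illustration of the univariate step, relative frequencies can be tallied as shown below. The study used SAS, so this Python sketch is only a stand-in, and the level labels in the example are hypothetical rather than actual coded values.

```python
from collections import Counter


def relative_frequencies(values):
    """Return {level: share of all observations} for a list of coded values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {level: n / total for level, n in counts.items()}


# Hypothetical coded UDA labels for four drivers (not study data).
example_udas = ["speeding", "speeding", "inattention", "following too closely"]
example_freqs = relative_frequencies(example_udas)
```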

For each crash type, the relative involvement for each value of each profile variable was calculated (excluding missing and unknown values). For each level of the profile variable, a relative involvement index, Ir, was computed to assess the over- and under-representation of the level (i.e., row in the table) for the crash configuration relative to all crash configurations combined. Ir was a log-odds-like quantity. If Ir>0, then the row was over-represented in the column relative to the total column for a crash type. If Ir<0, then the row was under-represented in the column, relative to the total column for the crash type. The relative involvement index was defined as follows:

Ir = ln{(TBr/CTBr)/(Tr/CTr)}, where

CTBr = TB - TBr
CTr = T - Tr


                              Crash Type
Levels of Profile Variable    Type A    Type B    Continued Types    Total
1                                                                    T1 = % of T
2                                                                    T2 = % of T
...
r                                                                    Tr = % of T
Total                                                                T = TAll
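The index can be sketched in a few lines of Python, reading the definition as Ir = ln{(TBr/CTBr)/(Tr/CTr)} with CTr = T - Tr. The complement CTBr = TB - TBr is assumed by analogy with CTr, and the function name, argument names, and counts below are all hypothetical.

```python
import math


def relative_involvement_index(tb_r, tb, t_r, t):
    """Log-odds-like index for level r of a profile variable in crash Type B.

    tb_r: count at level r within crash Type B
    tb:   total count for crash Type B
    t_r:  count at level r across all crash types
    t:    total count across all crash types
    """
    ctb_r = tb - tb_r  # assumed complement within Type B
    ct_r = t - t_r     # complement across all crash types
    if min(tb_r, ctb_r, t_r, ct_r) <= 0:
        raise ValueError("all four counts must be positive to take the log of the odds ratio")
    return math.log((tb_r / ctb_r) / (t_r / ct_r))


# Hypothetical counts: level r is 30 of 100 Type B cases but 100 of 500 overall.
over = relative_involvement_index(30, 100, 100, 500)
# Hypothetical counts: level r has the same share (20%) in Type B as overall.
even = relative_involvement_index(20, 100, 100, 500)
```

In the first case the odds within Type B (30/70) exceed the overall odds (100/400), so the index is positive and the level is over-represented. In the second case the odds match and the index is zero.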

Two sets of tables were prepared showing the frequency, percentage and relative involvement index for each response level for each of 59 variables for each of the crash types. These tables were annotated to identify the highest frequency, the most over-represented, and the most under-represented response level for each variable and crash type.

Data Limitations

The interpretation of the findings presented in this report was based on unweighted data rather than on national crash estimates. This approach was implemented due to certain data limitations, as follows:

  • The data were obtained from only four of the twenty-four National Automotive Sampling System (NASS) sites; consequently, the results of the study were not representative of the nation as a whole and may not generalize to the population of all crashes. In addition, an important feature of the NASS sampling plan was that severe crashes were oversampled relative to less severe ones. For example, the NASS sample included fatal crashes with certainty, but property damage crashes with only a very low probability. The NASS sampling weights account for these uneven sampling probabilities, and the sampling weights in our sample varied over a wide range: from a high value of about 3,000 to a low value of about 3. Because the sample was not nationally representative, it was not appropriate to use the available NASS weights to expand the sample to national estimates for each studied crash type configuration and associated combination of crash factors. The approach taken in this study therefore tilted all estimates towards severe crashes: not using weights resulted in a bias relative to national distributions, but accorded more importance to severe crashes than to less severe crashes.

  • A related limitation of the study sample was that it included only a relatively small number of crashes (723) and drivers (1,284). The small sample size further limited analyses that simultaneously examined up to five factors - crash cause, primary behavioral source, necessary UDA, first UDA in the sequence, and travel speed - within each of seven uniquely identifiable crash type configurations that were included in this study. It should be noted that the crash configurations had sample sizes ranging between 121 and 389, enabling either a detailed look at a few events (combinations of one or two crash factors) or a coarse-grained look at many events (combinations of three or more factors).

  • An additional limitation was that the variable "BAC Test Result" was rarely available in the CDS data, limiting the use of that variable to reporting estimates of alcohol involvement.

  • It is also important to note that, although the staff making the clinical assessments was highly experienced (e.g., three analysts/over 75 man-years of experience), causal factor and UDA assessments were subjective in nature and, therefore, were open to question. Veridian Engineering firmly believes that this approach is valid and accurate. In intercoder reliability checks performed during this interval, very high levels of agreement (e.g., Pearson coefficients in the 0.98 to 0.99 range) were noted between individuals making the assessments, and consistent findings have been documented over extended time intervals.
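To illustrate the first limitation above, the sketch below compares the unweighted and weighted share of severe crashes in a tiny made-up sample. The four cases are hypothetical, and the weights are simply the approximate extremes quoted above (about 3 and about 3,000). It is not a reconstruction of the NASS weighting procedure.

```python
def severe_share(cases, use_weights):
    """Share of severe crashes among (is_severe, sampling_weight) pairs."""
    total = sum(weight if use_weights else 1.0 for _, weight in cases)
    severe = sum(
        (weight if use_weights else 1.0) for is_severe, weight in cases if is_severe
    )
    return severe / total


# Hypothetical sample: two severe crashes with small weights and
# two minor crashes with large weights.
cases = [(True, 3.0), (True, 3.0), (False, 3000.0), (False, 3000.0)]

unweighted = severe_share(cases, use_weights=False)
weighted = severe_share(cases, use_weights=True)
```

In this toy sample the severe crashes make up half of the unweighted cases but well under 1% of the weighted total, which is the sense in which unweighted estimates are tilted towards severe crashes.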