The most widely cited study on the effect of cell phone usage on traffic accidents is this one by Redelmeier and Tibshirani in the New England Journal of Medicine.  Their conclusion is that talking on the phone leads to a fourfold increase in accident risk.

Their method is interesting.  It’s called a case crossover design, and it works like this.  We want to know the risk ratio for an accident when you talk on the phone versus when you don’t.  Let’s write it like this, where A is the event of an accident and C is the event of talking on a cell phone while driving.

\frac{P(A \mid C)}{P(A \mid \neg C)}.

But we have no way of estimating either the numerator or the denominator from traffic accident data, because we would need to know how often people drive, with and without talking on the phone, without having an accident, and that never shows up in accident data.  Case crossover studies are based on a little algebraic trick that transforms this ratio into something we can estimate with just a little more data.  Using Bayes’ rule and two lines of algebra, we can rewrite it like this.

\frac{P(A \mid C)}{P(A \mid \neg C)} = \frac{P(C \mid A)}{P(\neg C \mid A)} \cdot \frac{P(\neg C)}{P(C)}.
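(In case the two lines aren’t obvious: apply Bayes’ rule separately to the numerator and the denominator,

\frac{P(A \mid C)}{P(A \mid \neg C)} = \frac{P(C \mid A)\,P(A)/P(C)}{P(\neg C \mid A)\,P(A)/P(\neg C)},

and the P(A) terms cancel.)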

From accident data we can estimate the first term on the right-hand side: it is just the ratio of accidents in which the driver was talking on the phone to accidents in which the driver was not.  The finesse comes in when we estimate the second term.  We don’t want to just plug in the overall frequency of cell phone use, because we estimated the first term from a selected sample of people who had accidents, and they may be different from the population as a whole.  We want the cell phone usage rates for the people in our sample.

Case crossover studies take each person in the data who had an accident and ask them to report whether they were talking on the phone while driving at the same time of day one week before.  Thus each person generates their own control case.  It’s a valid control because it’s the same person, driving at the same time, and therefore on average under the same conditions.  These survey data are used to estimate the second term.
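To make the arithmetic concrete, here is a minimal sketch in Python.  The function name and the counts are my own inventions, not numbers from the study, and the study’s actual matched-pair analysis is more careful; the sketch just computes the two terms of the identity above.

```python
def case_crossover_risk_ratio(phone_at_crash, no_phone_at_crash,
                              phone_in_control, no_phone_in_control):
    """Estimate P(A|C) / P(A|~C) from the two terms of the identity above.

    The first pair of counts comes from the crash data (was the driver on the
    phone when the accident happened?); the second pair comes from the same
    drivers' control window one week earlier.
    """
    # First term: odds of phone use given an accident, P(C|A) / P(~C|A).
    odds_given_accident = phone_at_crash / no_phone_at_crash
    # Second term: inverse odds of phone use in the control window, P(~C) / P(C).
    inverse_odds_control = no_phone_in_control / phone_in_control
    return odds_given_accident * inverse_odds_control


# Hypothetical numbers: 170 of 700 crash drivers were on the phone when they
# crashed, but only 50 of the same 700 were on the phone in the control window.
print(case_crossover_risk_ratio(170, 530, 50, 650))  # roughly 4.2
```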

It’s really clever, and it’s used a lot in epidemiological studies.  (People get sick; some were exposed to some potential hazard, others were not.  The method is used to estimate the increase in the risk of getting sick due to being exposed to the hazard.)

I have never seen it in economics, however.  In fact, this was the first time I had ever heard of it.  So it’s natural to wonder why.  And it doesn’t take long to see that it has a serious weakness when applied to data with a lot of heterogeneity.

To see the problem, suppose that there are two types of people.  The first group, in addition to being generally accident prone, is also easily distracted.  Everyone else is a safe driver, and talking on a cell phone doesn’t make them any less safe.  Then our sample of people who actually had accidents would consist disproportionately of the first group, and we would be estimating the effect of cell phone use on them alone.  If they make up a small fraction of the population, then we are drastically overestimating the increase in risk for the typical driver.
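A small simulation, with every number made up purely for illustration, shows the mechanism.  Suppose 1% of drivers are the distractible, accident-prone type (being on the phone multiplies their per-trip crash probability by six), the other 99% are safe drivers for whom the phone changes nothing, and everyone happens to be on the phone on 10% of trips.  The case crossover calculation on the simulated crash sample lands far above the typical driver’s risk increase, roughly in the 3 to 4 range with these numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

n_drivers = 2_000_000
frac_distractible = 0.01
p_phone = 0.10                      # chance of being on the phone on any given trip

# Per-trip crash probabilities (made-up numbers, for illustration only).
p_crash_safe = 0.0001               # safe drivers, phone or no phone
p_crash_distracted_off = 0.01       # distractible drivers, not on the phone
p_crash_distracted_on = 0.06        # distractible drivers, on the phone (6x their own risk)

distractible = rng.random(n_drivers) < frac_distractible

# One "index" trip per driver, plus a control trip one week earlier
# with independent phone use.
phone_index = rng.random(n_drivers) < p_phone
phone_control = rng.random(n_drivers) < p_phone

crash_prob = np.where(
    distractible,
    np.where(phone_index, p_crash_distracted_on, p_crash_distracted_off),
    p_crash_safe,
)
crashed = rng.random(n_drivers) < crash_prob

# Case crossover estimate, computed only from the drivers who crashed.
term1 = phone_index[crashed].sum() / (~phone_index[crashed]).sum()       # P(C|A) / P(~C|A)
term2 = (~phone_control[crashed]).sum() / phone_control[crashed].sum()   # P(~C) / P(C) in the sample
print("case crossover estimate:", term1 * term2)   # noisy, but roughly 3 to 4 with these numbers

# Yet 99% of drivers face no increase in risk at all, and the simple average
# of individual risk ratios is about 1.05.
avg_individual_ratio = (frac_distractible * (p_crash_distracted_on / p_crash_distracted_off)
                        + (1 - frac_distractible) * 1.0)
print("average individual risk ratio:", avg_individual_ratio)
```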

It’s fair to say that, at best, we can use the estimate of 4 as an upper bound on the risk ratio averaged over the entire population.  The population-average increase in risk could be close to zero and still be consistent with the findings from case crossover studies.  And there is no simple way to remedy this weakness of the method.  So I think there is good reason to approach the question from a different direction.

As I described before, if cell phone distractions increase accident risk, we would see it by comparing the general population of drivers to drivers with hearing impairments, who don’t talk on the phone while driving.  And it turns out that the data exist.  In the NHTSA’s database of traffic accidents, there is this variable:

P18 Person’s Physical Impairment

Definition: Identifies physical impairments for all drivers and non-motorists which may have contributed to the cause of the crash.

And “deaf” is impairment number 9.
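A back-of-the-envelope check using that variable might look like the sketch below.  Everything in it is a placeholder: the file name, the column names, and the benchmark prevalence are invented rather than taken from the actual NHTSA file layout; the point is only the comparison being proposed.

```python
import pandas as pd

# Placeholder file and column names; the real NHTSA person-level records have
# their own layout, and the P18 impairment codes come from its codebook.
persons = pd.read_csv("nhtsa_person_file.csv")

DEAF_CODE = 9  # "deaf" under P18, Person's Physical Impairment

drivers = persons[persons["person_type"] == "driver"]
share_deaf_in_crashes = (drivers["physical_impairment"] == DEAF_CODE).mean()

# Benchmark share of deaf drivers among all licensed drivers.  This is a
# made-up placeholder; it would have to come from licensing or survey data.
share_deaf_on_the_road = 0.005

# If talking on the phone raises accident risk, deaf drivers, who don't talk
# on the phone while driving, should be under-represented in crashes.
print("share of crash-involved drivers coded deaf:", share_deaf_in_crashes)
print("share of deaf drivers on the road (benchmark):", share_deaf_on_the_road)
```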