Which came first, the research question or the natural experiment?

A few years back Moretti and Dahl had a working paper about the “Demand for Sons.” The essential fact was that parents of daughters are more likely to divorce than parents of sons. The paper was recently published in the Review of Economics and Statistics.

The working paper got a lot of press and got a lot of people excited. See Steven Landsburg’s old columns at Slate, for example. One reason folks got so excited was that Landsburg’s title was a bit more provocative: “Oh No! It's a Girl. Why do Daughters Cause Divorce?”

What a crazy idea, yes? Where did the authors come up with it?

Neither Dahl nor Moretti has spent much time studying marriage, divorce, or family dynamics, or at least that is what one would gather by looking at their other publications. Almost surely they came up with this idea after recognizing, or seeing someone else recognize, that the gender of a child is essentially (or at least seemingly) random. Dahl and Moretti (I’m guessing) reasoned that if the gender of a child is randomly assigned, then everything correlated with child gender must be caused by the child’s gender.

And so began the hunt for the question…

Dahl and Moretti is just one example from a large and growing faction of empirical economics that seemingly starts with the answer and then works backwards to find the question. The idea is to actively look for so-called natural experiments: odd and particularly acute events that suddenly, unexpectedly, and seemingly randomly affect one group and not another. These are nature’s experiments. The empirical economist’s role is then to work backwards to find an interesting question to pair with nature’s experiment. Voila, a new discovery!

Andrew Gelman’s reaction to learning of this approach to empirical research was both amusing and interesting.

I speculate (but do not know for sure) that this faction of economics can be traced to the influence of statisticians like David Freedman. Freedman thinks hard about empirical methodology, often writes critically of applications of regression analysis, and advocates the use of natural experiments. He likes to describe the 19th century work of Snow, who discovered that cholera was a waterborne infectious disease using a compelling natural experiment, some 75 years or so before Fisher developed modern test statistics. Freedman seems to think regression analysis has made scientists, and especially social scientists, lazy. Technique, he says, is a poor substitute for shoe leather: carefully developing an appropriate research design and collecting the appropriate data.

I wonder, then, what Freedman would think of this new paradigm that starts with the natural experiment and then works backwards to find the question. Somehow this doesn’t seem like shoe leather. But it sure can be fun and can sometimes be very interesting. I do think some good papers have come out of this paradigm and will continue to do so.

But I have some worries, too.

One worry is that we don’t know all the potential questions that might have been matched to the natural experiment. For example, when we read a paper like Dahl and Moretti we don’t know all the other dependent variables Dahl and Moretti tried to link with child gender. Perhaps they searched for variables in the census with the strongest non-obvious association with child gender. If so, I think this means their statistical significance is much too high. It’s data mining in reverse.
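To get a rough sense of how much a search over outcomes could inflate significance, here is a minimal sketch. It assumes the candidate outcome variables are independent and that none of them is truly related to child gender. The function names and the choice of 1, 5, and 20 candidate outcomes are hypothetical illustrations, not anything taken from Dahl and Moretti.

```python
import random


def familywise_error_rate(alpha: float, n_outcomes: int) -> float:
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_outcomes


def simulated_error_rate(alpha: float, n_outcomes: int,
                         n_trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo check: under the null, each p-value is uniform on [0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Keep only the "best" outcome, as a searching researcher would.
        smallest_p = min(rng.random() for _ in range(n_outcomes))
        if smallest_p < alpha:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for k in (1, 5, 20):  # hypothetical numbers of outcomes searched
        print(k,
              round(familywise_error_rate(0.05, k), 3),
              round(simulated_error_rate(0.05, k), 3))
```

Under these assumptions, searching 20 unrelated outcomes turns up at least one "significant" association at the 5 percent level about 64 percent of the time, which is the sense in which a reported p-value can be much too optimistic.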

Another worry is that not all things seemingly random are truly random. Is it impossible to think that there is some obscure factor that influences the sex of a child and might also influence divorce? Hepatitis can apparently influence the sex of a child. So maybe hepatitis causes divorce. Or maybe child gender is linked to certain hormonal imbalances that cause the egg or uterus to favor one kind of sperm over another, and hormonal imbalances cause divorce.

(Aside: I’m big-time speculating here. And I’m picking on this particular paper because I like it, as you will see below. I’m definitely not saying Dahl and Moretti are wrong. Rather, I’m speculating about what might generally go wrong with a general methodology that espouses or implicitly encourages looking for a question after finding a natural source of seemingly random variation.)

If the link were especially large, these worries would be easy to dismiss. But it’s not. If you look closely (and this is especially non-transparent in Landsburg’s summary of their paper), a first-born daughter lives without a father in 16.7% of households and a first-born son lives without a father in 16.2% of households. Less than half of this difference is explained by divorce. If this were made clear in popular articles, it wouldn’t have caused such an uproar. Actually, it probably wouldn’t have gotten much media coverage at all. It’s too bad (and this is a legitimate criticism) that Dahl and Moretti don’t report the basic statistics more clearly in their abstract and introduction. Instead, the statistics they do report up front are, in my view, rather obscure. And they write several times how "economically large the effect is." Uhm, if it were large, they probably wouldn't have to keep saying so. Landsburg writes divorce rates are 5 percent higher with daughters than with boys. Actually the difference is about 3 percent. But that's a percent change in a percentage (16.7 relative to 16.2), which makes it both confusing and misleading. And it counts a lot of things besides divorce.
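The arithmetic behind those two ways of stating the gap can be checked in a few lines. This is only a sketch using the 16.7% and 16.2% figures quoted above; the function names are my own.

```python
def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Absolute difference between two rates, both given in percent."""
    return rate_a - rate_b


def relative_gap_percent(rate_a: float, rate_b: float) -> float:
    """Difference expressed as a percent of the second rate."""
    return 100.0 * (rate_a - rate_b) / rate_b


daughters = 16.7  # % of first-born daughters living without a father
sons = 16.2       # % of first-born sons living without a father

print(f"{percentage_point_gap(daughters, sons):.1f} percentage points")
# prints: 0.5 percentage points
print(f"{relative_gap_percent(daughters, sons):.1f} percent relative difference")
# prints: 3.1 percent relative difference
```

The same half-point gap sounds six times larger when quoted as a relative change, and both numbers describe father absence in general rather than divorce alone.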

The point is that, especially for small effects, obscure third factors may actually be driving things. Confounding could be more of a concern than it may seem.
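A small calculation shows how this could work in principle. Every number below is made up for illustration: a hypothetical factor present in 5% of families that slightly raises the chance of a girl and, independently of the child's sex, raises the chance of father absence. Child gender has no causal effect anywhere in this sketch.

```python
def father_absent_rate_by_gender(prevalence, p_girl_with, p_girl_without,
                                 p_absent_with, p_absent_without):
    """Return (rate among girls, rate among boys) when gender has NO effect.

    Father absence depends only on whether the hidden factor is present.
    """
    # Bayes' rule: share of girls (and of boys) whose family has the factor.
    girl_with = prevalence * p_girl_with
    girl_without = (1 - prevalence) * p_girl_without
    share_with_given_girl = girl_with / (girl_with + girl_without)

    boy_with = prevalence * (1 - p_girl_with)
    boy_without = (1 - prevalence) * (1 - p_girl_without)
    share_with_given_boy = boy_with / (boy_with + boy_without)

    def absent_rate(share_with):
        # Weighted average of the two father-absence rates.
        return share_with * p_absent_with + (1 - share_with) * p_absent_without

    return absent_rate(share_with_given_girl), absent_rate(share_with_given_boy)


# Hypothetical inputs, not estimates from any study.
girls, boys = father_absent_rate_by_gender(
    prevalence=0.05,
    p_girl_with=0.55, p_girl_without=0.488,
    p_absent_with=0.30, p_absent_without=0.16,
)
print(f"girls: {100 * girls:.2f}%  boys: {100 * boys:.2f}%  "
      f"gap: {100 * (girls - boys):.2f} percentage points")
```

With these invented inputs the hidden factor alone produces a gap of a fraction of a percentage point between daughters and sons. That is the same order of magnitude as the observed difference, which is why small effects deserve extra scrutiny.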

When using this kind of approach to empiricism, I think it is critically important to find alternative or additional corroborating evidence, as Dahl and Moretti do. Indeed, I think this is the most interesting facet of their paper: they show that marriages are more likely to occur in the first place following an out-of-wedlock “shotgun” pregnancy that results in a boy rather than a girl, and, furthermore, that these additional marriages occur only after an ultrasound has revealed the sex of the child. This would seem to rule out other explanations for the correlation besides the “demand for boys”. This part of the paper is shoe leather.

Perhaps another worry is that this approach to research avoids focus on what some might perceive to be more important or interesting questions facing society. I’m not particularly sure of the importance of the paper by Dahl and Moretti, but it sure is interesting. I imagine one day the importance may be clear, too. In any case, it sure drew my attention more than the average paper, even among those from very good journals.

So while there are potential concerns and challenges with this approach to research, there is also a big potential benefit: it may help us to see questions we never thought of. This is interesting because in many ways our research and what we “know” is governed by the questions we choose to ask. If instead we start with a compelling natural source of variation and then look mechanically for an endogenous correlate, any correlation we find suggests (but does not by itself establish) a causal link. While far from foolproof, such a correlation gives rise to an interesting question: why does this association exist? Since one side is seemingly randomly assigned, the correlation could well be more interesting and relevant than most correlations.

A wise man once said academic economists should “question the question.” Very often it seems we as researchers (like everyone else) acquire a certain amount of tunnel vision by following what everyone else is doing. I wonder sometimes, and especially in times like now and over the last few years, how much researchers are like those driving asset and housing-price bubbles: following the herd. How often, I wonder, do we miss the really important finding because we’re too blind to see the really interesting and important question? At least in economics, an amazing share of Nobel prizes have gone to scholars whose findings seemed trivial once the right question was asked.

Maybe starting with the natural source of variation and working backwards toward the question is just an interesting and playful diversion from serious research, not something that should be done on a regular basis. But maybe it’s more than that. Maybe it can help us to remove the blindness imposed by our humanness, our intrinsic non-objectivity, and push us to ask truly unique questions. Maybe it can help us to see the proverbial elephant in the room that no one sees.

In any case, I certainly don’t think this is how all research should be done. It also requires extreme care, because without care this approach can lead to rather serious problems of data mining. But with care, those concerns can be addressed through replication and validation.

