Good Statistics, and Bad
Suppose someone wanted to misrepresent a public policy to you. How could they do so most effectively? And who can help you resist?
It’s certainly a believable hypothetical. With two major parties that seem to disagree on everything, multiple intra-party fault lines, and a plethora of interests who wish to turn laws and regulations in their favor, whipped together by a press in search of partisan scandal and ratings, it is hard to see how it could be otherwise. In fact, for almost every issue, it seems very likely that some groups, if not many, will be tempted to promote their interests using techniques ranging along the spectrum from “putting one’s best foot forward” to bald-faced lies.
There are plenty of common political tricks that fall short of outright lying. For instance, one can bury desired changes in the paper avalanche of an omnibus bill, as in the Minnesota legislature’s recent attempt to sneak in enactment of the National Popular Vote project. Or one can pass vague legislation that passes the buck for what it will mean in practice to executive agencies and the courts. But such forms of subterfuge are not my interest here.
I wish to ask how people would misrepresent things in the open, rather than behind such political camouflage. As I warn my public policy students, the general principle is that people will lie to you in whatever areas you are most vulnerable.
If you are American, one of those weak spots is typically mathematics, and particularly statistics, which is why it earns its place of shame along with lies and damned lies. It is also why the tricks for misrepresenting statistics discussed in Darrell Huff’s How to Lie with Statistics still keep the book selling 65 years after its initial publication.
However, widespread ignorance goes deeper than the science of statistics itself. Very few people have a clear idea of what the data involved actually measure, or under what assumptions and limitations, which can lead to careless and irresponsible usage. For instance, few people can articulate why the employment and unemployment rates can both go up at the same time, even though their names suggest that should be impossible, or which would be the more reliable economic indicator in such a case.
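The employment and unemployment puzzle comes down to different denominators. The short sketch below uses invented numbers and assumes that “employment rate” means employed people as a share of the whole working-age population, while the unemployment rate is unemployed job seekers as a share of the labor force (employed plus unemployed). The function name `rates` and every figure are hypothetical, chosen only to show the arithmetic.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
def rates(employed, unemployed, population):
    """Return (employment rate, unemployment rate) as fractions."""
    labor_force = employed + unemployed
    employment_rate = employed / population        # share of the population with a job
    unemployment_rate = unemployed / labor_force   # share of the labor force without a job
    return employment_rate, unemployment_rate

# Before: 370 people are outside the labor force (not working, not looking).
before = rates(employed=600, unemployed=30, population=1000)

# After: 40 of them start looking for work; 20 find jobs and 20 are still searching.
after = rates(employed=620, unemployed=50, population=1000)

print(f"before: employment {before[0]:.1%}, unemployment {before[1]:.1%}")
print(f"after:  employment {after[0]:.1%}, unemployment {after[1]:.1%}")
```

In this made-up case both rates rise, because people who were not counted at all (those not looking for work) entered the labor force, and only some of them found jobs.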
Thomas Sowell, in his most recent book, Discrimination and Disparities, describes the problem as “overlooking simple but fundamental questions as to whether the numbers on which… analyses are based are in fact measuring what they seem to be measuring, or claim to be measuring,” which, in order to defend ourselves against misrepresentation, requires “much closer scrutiny at a fundamental level.” But far too few apply such careful, fundamental scrutiny.
However, there are a few people who do yeoman work in this area, providing valuable “insurance” against errors others would encourage us to make. They deserve our appreciation for toiling in that underserved area, and I would like to express thanks to several whose efforts I have particularly benefitted from.
Thomas Sowell is one such author who has provided a great deal of clarification over decades of prolific publication. For example, one common theme of his is the need to distinguish between what happens to a particular category of people (e.g., “the rich” or “the poor”), interpreted as a stable group, which lends itself to class-based conclusions, and the very different experiences of real people who move in and out of such categories over time, which upsets such analyses.
Discrimination and Disparities reiterates that theme from his earlier books. But my favorite illustration is his discussion of the famous Card and Krueger minimum wage study, which purported to overturn the conclusion that raising the minimum wage increases unemployment. It surveyed the same employers, asking how many employees they had before and after a minimum wage increase. The problem is that “you can only survey the survivors.” Anyone who went out of business, and the jobs that consequently disappeared, would not be included, so even if surveyed survivors did not reduce employment, many jobs invisible to their approach could still have been lost. To reinforce the image, he notes that a similar before-and-after survey of those who played Russian roulette would show that no one was hurt, and cites a quip by George Stigler that if the same method had been used in a survey of American veterans in both 1940 and 1946, it would “prove” that “no soldier was mortally wounded” during the war.
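The “survey the survivors” problem can be seen with a handful of invented firms. The sketch below is not drawn from the actual study; the firm list, the helper `total_change`, and every number are hypothetical, and the only purpose is to show how a before-and-after count restricted to survivors can report no job loss while total employment falls.

```python
# Hypothetical firms as (employees before, employees after).
# An "after" count of 0 means the firm closed and cannot answer a follow-up survey.
firms = [(20, 20), (15, 16), (30, 29), (10, 0), (12, 0)]

def total_change(records):
    """Sum of (after - before) across the given firms."""
    return sum(after - before for before, after in records)

# Only firms still in business can be surveyed the second time.
survivors = [firm for firm in firms if firm[1] > 0]

survivor_change = total_change(survivors)  # what the survey sees
overall_change = total_change(firms)       # what actually happened

print("change among surveyed survivors:", survivor_change)
print("change across all firms:", overall_change)
```

With these made-up figures the survivors show a net change of zero, while the full set of firms lost 22 jobs, all of them at the two firms that closed.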
Another very prolific watchdog for statistical malfeasance is Mark J. Perry. He points out so many useful “red flags” in multiple outlets that I look forward to what is almost a one-a-day pleasure. A good example is his evisceration of “Equal Pay Day” discussions that attribute differences between median yearly incomes to unjustifiable discrimination against women “doing the same work as men.” He points out that the data fails to adjust for differences in “hours worked, marital status, number of children, education, occupation, number of years of continuous uninterrupted job experience, working conditions, work safety, workplace flexibility, family friendliness of the workplace, job security, and time spent commuting,” each of which would lead men to be paid more, on average.
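To see how a single unadjusted factor can produce a gap in median yearly income, consider a deliberately artificial sketch in which every worker earns the same hourly wage and the two groups differ only in hours worked. All of the figures, including `HOURLY_WAGE`, `WEEKS`, and the hours lists, are invented for illustration and are not estimates of any real labor market.

```python
from statistics import median

HOURLY_WAGE = 25  # hypothetical: identical for every worker
WEEKS = 50        # hypothetical weeks worked per year

# Hypothetical weekly hours for five workers in each group.
hours = {
    "men": [45, 45, 40, 40, 35],
    "women": [40, 35, 35, 30, 30],
}

# Annual pay depends only on hours, since the hourly wage is the same for all.
annual = {g: [h * HOURLY_WAGE * WEEKS for h in hs] for g, hs in hours.items()}

# Unadjusted comparison: ratio of median yearly incomes.
raw_ratio = median(annual["women"]) / median(annual["men"])

# Adjusted comparison: ratio of median pay per hour worked.
hourly = {
    g: [pay / (h * WEEKS) for pay, h in zip(annual[g], hours[g])]
    for g in hours
}
adjusted_ratio = median(hourly["women"]) / median(hourly["men"])

print(f"ratio of median annual pay: {raw_ratio:.3f}")
print(f"ratio of median hourly pay: {adjusted_ratio:.3f}")
```

In this toy case the yearly-income ratio is below one even though pay per hour is identical by construction, which is the kind of difference an unadjusted comparison cannot distinguish from unequal pay for the same work.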
Andrew Biggs is another stickler for statistical responsibility, particularly in areas connected to retirement security and retirement plans. For instance, in Forbes, he showed that a recent GAO report concluding that 48% of U.S. households aged 55 and over in 2016 “had no retirement savings” was far different from reality, as 72% of people had such savings when those with traditional defined benefit pensions are counted, and 83% of married households had such savings when those where only one spouse had a retirement plan are included. Those two adjustments alone massively changed the conclusions. And he pointed out other biases as well.
These three people have each helped me understand measurement issues far better than before, enabling me to avoid errors that would have undermined my analyses of policy issues. I owe them thanks. But readers might also give them more attention, for similar “tutoring.” Many others have also been of use to me, and as I continue to learn, perhaps I can give a shout-out to others in the future, especially as this labor pool is still far too shallow. But mainly I wanted to put out a serious warning about ignorance not only of statistical applications and presentations, but also of the data that is often misused in reaching policy conclusions.