False positives, the Internet, and the grievance media
There is a rule of statistics that is little known except among statisticians[1]: when the rate of false positives exceeds the incident rate in a population, a positive result from that test is more likely to be wrong than right about some incident having occurred. It doesn’t matter how accurate the test is otherwise, if the false positive rate exceeds the incident rate.
For example, say you have a cancer test that is correct 98% of the time. This means[2] that it has a false positive rate of 2%. Since it is wrong 2% of the time, 2% of the time it will say that someone has that cancer when, in fact, they are fine.
Now, suppose that this particular cancer occurs in one out of a hundred thousand people. Some concerned politician in a city of ten million people says: we have this test that is practically always correct, and we have a lot of people with this cancer. We should run this test on everybody.
What happens after the city runs its test on its ten million residents? At one in a hundred thousand, about 100 of them have the cancer. The test will tell 98 of those people that they have it.[3] And it will tell about 200,000 people who don’t have the cancer that they have cancer.
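The arithmetic behind those numbers can be checked with a short sketch. The function name and parameters here are made up for illustration, and the sketch assumes that "correct 98% of the time" applies equally to sick and healthy people, which is the same simplification the example itself makes.

```python
def screening_counts(population, one_in, detection_rate, false_positive_rate):
    """Expected true and false positives when everyone in a population is tested.

    Illustrative helper only: `one_in` means the condition occurs in one out
    of `one_in` people; the two rates are assumed constant across the population.
    """
    sick = population / one_in
    healthy = population - sick
    true_positives = sick * detection_rate
    false_positives = healthy * false_positive_rate
    return true_positives, false_positives


# The city from the example: ten million residents, cancer in 1 of 100,000,
# a test that is right 98% of the time and wrong 2% of the time.
tp, fp = screening_counts(10_000_000, 100_000, 0.98, 0.02)

# Share of all positive results that are actually correct.
share_correct = tp / (tp + fp)

print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"chance a positive result is right: {share_correct:.3%}")
```

Under those assumptions, roughly one positive result in two thousand is a real case.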
There is a further rule of thumb: the bigger your population, the lower your incident rate for any non-trivial occurrence, just because of the way people work. That cancer test might have made sense when used on patients who come in to have something looked at. It might well be that patients who come in for an examination for some problem, and who are referred to this test after they talk to a doctor, are one in ten likely to have this cancer. The population is a population of people who have something wrong, and that something wrong already resembles this cancer. In that population, of, say, a hundred patients referred to the test, ten will have the cancer. The test will tell about nine or ten of them that they have the cancer when in fact they do have it, and will tell one or two of the other ninety that they have cancer when instead they are cancer free.
But expand the test’s population beyond people who in conjunction with their doctors know they are sick, and the test falls apart.
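To see how sharply the same test changes with the population it is used on, here is a minimal sketch that computes the chance that a positive result is correct at different incident rates. The function name is hypothetical, and the 98% and 2% figures are the simplified rates from the example above, assumed to hold in every population.

```python
def chance_positive_is_right(incident_rate, detection_rate=0.98,
                             false_positive_rate=0.02):
    """Probability that a positive result is a true positive.

    Illustrative only: assumes the test behaves the same in every population,
    so that only the incident rate changes.
    """
    true_positives = incident_rate * detection_rate
    false_positives = (1 - incident_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


populations = [
    ("patients referred by a doctor (1 in 10)", 1 / 10),
    ("incident rate equal to false positive rate (1 in 50)", 1 / 50),
    ("everyone in the city (1 in 100,000)", 1 / 100_000),
]

for label, rate in populations:
    print(f"{label}: {chance_positive_is_right(rate):.2%}")
```

With these numbers, a positive result is right most of the time among referred patients, is a coin flip when the incident rate equals the false positive rate, and is almost always wrong for the city as a whole.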
I think we are seeing the same thing in the explosion of false rape reports and false hate crimes in the news media. Most women don’t lie about rape, and most people don’t enjoy being hated. Limit the population that gets reported on to those who call the police and file a police report and whose cases are then prosecuted, and you’re probably going to have mostly true cases reported in the media.
Start trolling the entire population for juicy stories, as Rolling Stone did, and you will find juicy stories—and chances are, many of them will be lies. The traditional test doesn’t work outside of the traditional population. Sometimes it will be right, but a non-trivial number of times it will be wrong.
I agree with those who argue that the University of Virginia rape story failing doesn’t invalidate having a national conversation, but the conversation needs to be about the competence of our news media and the applicability of their reporting methods. The UVA story, for example, makes me wonder if Justice Thomas was an early victim of the paradox: his opponents searched very hard for something to use against him during his confirmation hearings.
How many other scandals have been the result of using a flawed sieve to sort through populations larger than the sieve was meant for?
The same goes for witnesses to events that certainly happened, such as the Michael Brown shooting. Search deep enough, and you can find a witness who will say what you want to hear. But if the test is merely that they’re saying it, then you’re going to have a whole lot of false positives.
Combined with the media’s tendency to stop looking when they find the answer they want, the false positive paradox virtually guarantees large numbers of “hype positives”.
In What Your Children are Doing on the Information Highway I wrote of the Internet as a word processor for social relations, and that however strange or unique your beliefs, someone on the net shares them. If the media chooses to use the Internet as a searchable database of preconceptions, they will be able to find stories that match whatever line they want, and they will be able to find people to give them those stories.
In response to Confirmation journalism and the death penalty: Iterative journalism is like the Red Queen in Alice in Wonderland: “Sentence first, verdict after.” The Elements of Journalism praises David Protess’s project that railroaded a mentally disabled man into prison for fourteen years, because it served their bias.
[1] And seemingly too little known among statisticians, or at least statisticians who talk to the media.
[2] In heavily simplified terms, of course, since rates always vary around the average.
[3] Technically, 98 people on average.
Anita Hill
- Clarence Thomas, Anita Hill and I have some things in common: Michael M. Keohane at RedState
- “First of all, we all attended law school in the 1970’s – in fact, Anita Hill and I graduated in the same year.”
- Clarence Thomas, Part II: Thomas Sowell
- “The first of these hard facts is that, contrary to what has been repeated so often in the media, it was not just a question of what ‘he said’ versus what ‘she said.’”
confirmation journalism
- Are Facts Obsolete?: Thomas Sowell at Real Clear Politics
- “Some of us, who are old enough to remember the old television police series ‘Dragnet,’ may remember Sgt. Joe Friday saying, ‘Just the facts, ma’am.’ But that would be completely out of place today. Facts are becoming obsolete, as recent events have demonstrated.”
- The Elements of Journalism
- Now that the Internet empowers readers to check the veracity of news reports, journalists need to come up with more and better justifications for their bias.
- Frontiers are for Children
- The infobahn is a completely new way of organizing our world. Our friends and acquaintances do not have to be the people who live near us. We find this an odd way of life, but our children will not.
- Lies, Damned Lies, and Social Media (part 5 of ∞): Scott Alexander at Slate Star Codex
- “There is something called the ‘just world fallacy’, that says everyone gets what they deserve and moral questions are always easy and there is never any need to make scary tradeoffs.”
false positives
- False positive paradox at Wikipedia
- “The false positive paradox is a statistical result where false positive tests are more probable than true positive tests, occurring when the overall population has a low incidence of a condition and the incidence rate is lower than the false positive rate.”
- The Paradox of a False Positive: vanryzlg
- “This reminded me of when we looked at the idea of data mining students to search for potential of suicide or mass shootings and the like. At the time, we simply talked about the ethics of looking into a students private data, but the efficiency and accuracy is a huge part of this too.”
More confirmation journalism
- The gullible media and the chocolate factory
- Journalists, because of their background and temperament, are specially unsuited to report on science.