Did you read A Wrinkle in Time as a kid? Charles Wallace, the telepathic baby brother in the book, would have been way less endearing if his psychic skill were guessing when an erotic picture was about to appear on a computer screen. And Matilda would have been a pretty dull book if the heroine's talent were getting bored before something boring happened. These are not the kinds of paranormal abilities anyone aspires to. But a research paper claiming to have found evidence for these abilities has been causing a lot of hubbub.
Daryl Bem, an emeritus professor at Cornell, is going to have his paper published in an upcoming issue of the Journal of Personality and Social Psychology. He's a respected researcher and it's a respected publication. These are not the circumstances under which you usually read about ESP--or "psi," as psychologists call it. Nevertheless, Bem's paper passed through peer review, which may have you feeling angry, confused, excited, or (if you possess precognitive abilities) totally unsurprised.
Bem's paper, "Feeling the Future" (you can see the unpublished version here), consists of 9 experiments that take standard psychological effects and reverse them. For example, say you're given a list of 48 nouns to read. Then you do an exercise--rearranging lists of words--in which you see half of those nouns again. Finally, you're asked to recall as many of the original 48 words as you can. You're expected to do better at remembering the words you "rehearsed" in the list exercise. Bem reversed this experiment by showing subjects 48 words, then asking them to recall as many as possible, and then giving them an exercise that used half the words (randomly selected by a computer). Bem reports that his subjects had better recall of the words they would rehearse later, because they had psychically anticipated practicing those words.
Another set of experiments studied "habituation," which non-psychologists call "getting used to stuff." Scary or gross pictures might prompt a strong reaction the first time we see them, but less of a reaction the second or third time. In a normal habituation experiment, a photo of a dangerous-looking snake might flash on a computer screen too quickly for you to register it consciously. Then you'd be shown the same snake photo next to a photo of, say, a spider, and asked which you like better. You're expected to prefer the snake, because seeing it subliminally has made you habituated--it doesn't bother you as much anymore. Bem's experiment reversed this: First subjects chose which of two pictures they liked better, and then one of them was flashed subliminally on the screen. The strongest results came when, instead of negative-reaction photos, the computer flashed erotic photos. (In that case, subjects supposedly preferred the erotic photo they were not about to see subliminally, because they weren't preemptively habituated to it.)
These effects weren't big: just a shade away from the results you'd get by guessing. But for 8 of his 9 experiments, Bem reports that the results were "statistically significant." If you've taken college science classes, you know what this means: A statistical test put the probability of getting a result at least that strong by chance alone, a number called the p-value, below 5%. Of course 5% is an arbitrary cutoff; unlikely things happen by chance all the time. But scientists generally accept a p-value under 5% as noteworthy.
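If you'd like to see how a result "a shade away from guessing" can still clear the 5% bar, here is a minimal sketch. The numbers are made up for illustration and are not taken from Bem's paper: it assumes 1,000 coin-flip-style guesses with 530 correct, and the function name is my own.

```python
from math import comb

def one_sided_p_value(hits, trials):
    """Chance of getting at least `hits` right out of `trials` 50/50 guesses by luck alone."""
    # Count every outcome with `hits` or more correct guesses...
    favorable = sum(comb(trials, k) for k in range(hits, trials + 1))
    # ...and divide by the total number of equally likely outcomes.
    return favorable / 2**trials

# Hypothetical experiment: 53% correct, barely better than a coin flip.
p_value = one_sided_p_value(hits=530, trials=1000)
print(f"p-value: {p_value:.3f}")
print("statistically significant" if p_value < 0.05 else "not significant")
```

With enough guesses, even a small edge over 50% lands under the cutoff, which is why a tiny effect and a "significant" result can go together.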
For his erotic-picture experiment, Bem reports an even better p-value of .01. In other words, if chance alone were at work, results this strong would turn up only about 1% of the time. But think of the p-value as a medical test. Let's say your doctor tells you you've tested positive for a rare genetic disorder. The test is quite reliable: it has a false positive rate of just 1%. Things are sounding pretty bad for you, no? Now let's say this disorder only affects one in a million people. Out of a million people, 1%, or 10,000 people, would get a false positive on the medical test, while only one would actually have the disorder. That means there's still a 99.99% chance that you're fine.
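The medical-test arithmetic can be checked in a few lines. This is only a sketch: the function name is mine, and it assumes the test catches every real case of the disorder (the post doesn't say either way).

```python
from fractions import Fraction

def chance_sick_given_positive(prevalence, false_positive_rate, sensitivity=Fraction(1)):
    """Probability you really have the disorder, given a positive test."""
    # Share of the population that is sick AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of the population that is healthy but tests positive anyway.
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

prevalence = Fraction(1, 1_000_000)    # one in a million people has the disorder
false_positive_rate = Fraction(1, 100)  # the test wrongly flags 1% of healthy people

p_sick = chance_sick_given_positive(prevalence, false_positive_rate)
p_fine = 1 - p_sick
print(f"Chance you're fine: {float(p_fine):.2%}")
```

The positive result is swamped by the roughly 10,000 healthy people who test positive for every one person who is actually sick.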
This kind of analysis is called Bayesian statistics. Instead of assuming that your experiment takes place in a vacuum, it takes into account how likely your hypothesis seemed beforehand. A low p-value on one experiment might mean ESP is 100 times more likely to exist than it previously seemed. But if the sum of scientific knowledge before this paper was published said that telepathy was astronomically unlikely--well, we're probably still fine.
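Here is a rough sketch of that update. Two things in it are assumptions of mine: the starting belief of one in a billion is a placeholder, not a figure anyone has measured, and "100 times more likely" is read as multiplying the odds by 100.

```python
from fractions import Fraction

def update_belief(prior_probability, evidence_factor):
    """Multiply the prior odds by the strength of the evidence, then convert back to a probability."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * evidence_factor
    return posterior_odds / (1 + posterior_odds)

prior = Fraction(1, 1_000_000_000)  # hypothetical: ESP judged a one-in-a-billion shot beforehand
evidence_factor = 100               # the experiment makes ESP 100 times more likely

posterior = update_belief(prior, evidence_factor)
print(f"Before the experiment: {float(prior):.10f}")
print(f"After the experiment:  {float(posterior):.10f}")
```

A hundredfold boost sounds dramatic, but a hundred times a nearly-zero number is still a nearly-zero number.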
So this paper tells us a lot--but not about ESP. Whatever its author's intentions were, "Feeling the Future" will probably go down in history as an important paper about statistics. JPSP, recognizing this, is publishing a critique in the same issue as Bem's paper. In the critique, a group of scientists will share their own Bayesian analysis of Bem's data. According to Science, this analysis "concludes that, if anything, [the data] support the hypothesis that ESP does not exist."
Or maybe Bem's results are real, and someone out there already knows exactly how this whole drama will play out.
Ten thousand and one thanks to Doug for teaching me about statistics.