One of the researchers, a biostatistician named Georgia Salanti, fired up a laptop and projector and started to take the group through a study she and a few colleagues were completing that asked this question: were drug companies manipulating published research to make their drugs look good? Salanti ticked off data that seemed to indicate they were, but the other team members almost immediately started interrupting. One noted that Salanti’s study didn’t address the fact that drug-company research wasn’t measuring critically important “hard” outcomes for patients, such as survival versus death, and instead tended to measure “softer” outcomes, such as self-reported symptoms (“my chest doesn’t hurt as much today”). Another pointed out that Salanti’s study ignored the fact that when drug-company data seemed to show patients’ health improving, the data often failed to show that the drug was responsible, or that the improvement was more than marginal.
Salanti remained poised, as if the grilling were par for the course, and gamely acknowledged that the suggestions were all good—but a single study can’t prove everything, she said. Just as I was getting the sense that the data in drug studies were endlessly malleable, Ioannidis, who had mostly been listening, delivered what felt like a coup de grâce: wasn’t it possible, he asked, that drug companies were carefully selecting the topics of their studies—for example, comparing their new drugs against those already known to be inferior to others on the market—so that they were ahead of the game even before the data juggling began? “Maybe sometimes it’s the questions that are biased, not the answers,” he said, flashing a friendly smile. Everyone nodded. Though the results of drug studies often make newspaper headlines, you have to wonder whether they prove anything at all. Indeed, given the breadth of the potential problems raised at the meeting, can any medical-research studies be trusted?
THE CITY OF IOANNINA is a big college town a short drive from the ruins of a 20,000-seat amphitheater and a Zeusian sanctuary built at the site of the Dodona oracle. The oracle was said to have issued pronouncements to priests through the rustling of a sacred oak tree. Today, a different oak tree at the site provides visitors with a chance to try their own hands at extracting a prophecy. “I take all the researchers who visit me here, and almost every single one of them asks the tree the same question,” Ioannidis tells me, as we contemplate the tree the day after the team’s meeting. “‘Will my research grant be approved?’” He chuckles, but Ioannidis (pronounced yo-NEE-dees) tends to laugh not so much in mirth as to soften the sting of his attack. And sure enough, he goes on to suggest that an obsession with winning funding has gone a long way toward weakening the reliability of medical research.
It didn’t turn out that way. In poring over medical journals, he was struck by how many findings of all types were refuted by later findings. Of course, medical-science “never minds” are hardly secret. And they sometimes make headlines, as when in recent years large studies or growing consensuses of researchers concluded that mammograms, colonoscopies, and PSA tests are far less useful cancer-detection tools than we had been told; or when widely prescribed antidepressants such as Prozac, Zoloft, and Paxil were revealed to be no more effective than a placebo for most cases of depression; or when we learned that staying out of the sun entirely can actually increase cancer risks; or when we were told that the advice to drink lots of water during intense exercise was potentially fatal; or when, last April, we were informed that taking fish oil, exercising, and doing puzzles doesn’t really help fend off Alzheimer’s disease, as long claimed. Peer-reviewed studies have come to opposite conclusions on whether using cell phones can cause brain cancer, whether sleeping more than eight hours a night is healthful or dangerous, whether taking aspirin every day is more likely to save your life or cut it short, and whether routine angioplasty works better than pills to unclog heart arteries.
This array suggested a bigger, underlying dysfunction, and Ioannidis thought he knew what it was. “The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.” Researchers headed into their studies wanting certain results—and, lo and behold, they were getting them. We think of the scientific process as being objective, rigorous, and even ruthless in separating out what is true from what we merely wish to be true, but in fact it’s easy to manipulate results, even unintentionally or unconsciously. “At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” says Ioannidis. “There is an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.”
Perhaps only a minority of researchers were succumbing to this bias, but their distorted findings were having an outsize effect on published research. To get funding and tenured positions, and often merely to stay afloat, researchers have to get their work published in well-regarded journals, where rejection rates can climb above 90 percent. Not surprisingly, the studies that tend to make the grade are those with eye-catching findings. But while coming up with eye-catching theories is relatively easy, getting reality to bear them out is another matter. The great majority collapse under the weight of contradictory data when studied rigorously. Imagine, though, that five different research teams test an interesting theory that’s making the rounds, and four of the groups correctly prove the idea false, while the one less cautious group incorrectly “proves” it true through some combination of error, fluke, and clever selection of data. Guess whose findings your doctor ends up reading about in the journal, and you end up hearing about on the evening news? Researchers can sometimes win attention by refuting a prominent finding, which can help to at least raise doubts about results, but in general it is far more rewarding to add a new insight or exciting-sounding twist to existing research than to retest its basic premises—after all, simply re-proving someone else’s results is unlikely to get you published, and attempting to undermine the work of respected colleagues can have ugly professional repercussions.
He chose to publish one paper, fittingly, in the online journal PLoS Medicine, which is committed to running any methodologically sound article without regard to how “interesting” the results may be. In the paper, Ioannidis laid out a detailed mathematical proof that, assuming modest levels of researcher bias, typically imperfect research techniques, and the well-known tendency to focus on exciting rather than highly plausible theories, researchers will come up with wrong findings most of the time. Simply put, if you’re attracted to ideas that have a good chance of being wrong, and if you’re motivated to prove them right, and if you have a little wiggle room in how you assemble the evidence, you’ll probably succeed in proving wrong theories right. His model predicted, in different fields of medical research, rates of wrongness roughly corresponding to the observed rates at which findings were later convincingly refuted: 80 percent of non-randomized studies (by far the most common type) turn out to be wrong, as do 25 percent of supposedly gold-standard randomized trials, and as much as 10 percent of the platinum-standard large randomized trials. The article spelled out his belief that researchers were frequently manipulating data analyses, chasing career-advancing findings rather than good science, and even using the peer-review process—in which journals ask researchers to help decide which studies to publish—to suppress opposing views. “You can question some of the details of John’s calculations, but it’s hard to argue that the essential ideas aren’t absolutely correct,” says Doug Altman, an Oxford University researcher who directs the Centre for Statistics in Medicine.
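The logic of that argument can be illustrated with a little arithmetic. What follows is a minimal sketch, not a reproduction of the paper's proof: it uses a standard positive-predictive-value calculation, and the function name `chance_finding_is_true`, its parameters, and every number plugged into it are assumptions chosen purely for illustration. Here `prior_odds` stands for how many true ideas there are for each false one among those being tested, `power` for the chance a study detects a real effect, `alpha` for the chance it flags a nonexistent one, and `bias` for the share of would-be negative results that get reported as positive anyway.

```python
def chance_finding_is_true(prior_odds, alpha=0.05, power=0.8, bias=0.0):
    """Probability that a reported positive finding reflects a real effect.

    prior_odds: ratio of true to false hypotheses among those tested
    alpha:      false-positive rate of an unbiased study
    power:      probability an unbiased study detects a real effect
    bias:       fraction of otherwise-negative results reported as positive
    """
    beta = 1.0 - power  # chance of missing a real effect
    # Real effects reported as positive: detected honestly, or rescued by bias.
    true_positives = (power + bias * beta) * prior_odds
    # Nonexistent effects reported as positive: by chance, or through bias.
    false_positives = alpha + bias * (1.0 - alpha)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical well-run study of a plausible idea (1 true idea per false one).
    print(round(chance_finding_is_true(prior_odds=1.0), 3))
    # Hypothetical weak study of a long-shot idea, with modest bias.
    print(round(chance_finding_is_true(prior_odds=0.1, power=0.2, bias=0.2), 3))
```

Under these made-up inputs, the plausible, well-powered, unbiased case yields a finding that is very likely true, while the long-shot, underpowered, mildly biased case yields one that is more likely false than true. That is the qualitative pattern the paragraph above describes; the specific percentages quoted in the article come from Ioannidis's own work, not from this sketch.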
DRIVING ME BACK to campus in his smallish SUV—after insisting, as he apparently does with all his visitors, on showing me a nearby lake and the six monasteries situated on an islet within it—Ioannidis apologized profusely for running a yellow light, explaining with a laugh that he didn’t trust the truck behind him to stop. Considering his willingness, even eagerness, to slap the face of the medical-research community, Ioannidis comes off as thoughtful, upbeat, and deeply civil. He’s a careful listener, and his frequent grin and semi-apologetic chuckle can make the sharp prodding of his arguments seem almost good-natured. He is as quick, if not quicker, to question his own motives and competence as anyone else’s. A neat and compact 45-year-old with a trim mustache, he presents as a sort of dashing nerd—Giancarlo Giannini with a bit of Mr. Bean.
When a five-year study of 10,000 people finds that those who take more vitamin X are less likely to get cancer Y, you’d think you have pretty good reason to take more vitamin X, and physicians routinely pass these recommendations on to patients. But these studies often sharply conflict with one another. Studies have gone back and forth on the cancer-preventing powers of vitamins A, D, and E; on the heart-health benefits of eating fat and carbs; and even on the question of whether being overweight is more likely to extend or shorten your life. How should we choose among these dueling, high-profile nutritional findings? Ioannidis suggests a simple approach: ignore them all.
On the relatively rare occasions when a study does go on long enough to track mortality, the findings frequently upend those of the shorter studies. (For example, though the vast majority of studies of overweight individuals link excess weight to ill health, the longest of them haven’t convincingly shown that overweight people are likely to die sooner, and a few of them have seemingly demonstrated that moderately overweight people are likely to live longer.) And these problems are aside from ubiquitous measurement errors (for example, people habitually misreport their diets in studies), routine misanalysis (researchers rely on complex software capable of juggling results in ways they don’t always understand), and the less common, but serious, problem of outright fraud (which has been revealed, in confidential surveys, to be much more widespread than scientists like to acknowledge).
And so it goes for all medical studies, he says. Indeed, nutritional studies aren’t the worst. Drug studies have the added corruptive force of financial conflict of interest. The exciting links between genes and various diseases and traits that are relentlessly hyped in the press for heralding miraculous around-the-corner treatments for everything from colon cancer to schizophrenia have in the past proved so vulnerable to error and distortion, Ioannidis has found, that in some cases you’d have done about as well by throwing darts at a chart of the genome. (These studies seem to have improved somewhat in recent years, but whether they will hold up or be useful in treatment are still open questions.) Vioxx, Zelnorm, and Baycol were among the widely prescribed drugs found to be safe and effective in large randomized controlled trials before the drugs were yanked from the market as unsafe or not so effective, or both.
“Often the claims made by studies are so extravagant that you can immediately cross them out without needing to know much about the specific problems with the studies,” Ioannidis says. But of course it’s that very extravagance of claim (one large randomized controlled trial even proved that secret prayer by unknown parties can save the lives of heart-surgery patients, while another proved that secret prayer can harm them) that helps get these findings into journals and then into our treatments and lifestyles, especially when the claim builds on impressive-sounding evidence. “Even when the evidence shows that a particular research idea is wrong, if you have thousands of scientists who have invested their careers in it, they’ll continue to publish papers on it,” he says. “It’s like an epidemic, in the sense that they’re infected with these wrong ideas, and they’re spreading it to other researchers through journals.”
Most journal editors don’t even claim to protect against the problems that plague these studies. University and government research overseers rarely step in to directly enforce research quality, and when they do, the science community goes ballistic over the outside interference. The ultimate protection against research error and bias is supposed to come from the way scientists constantly retest each other’s results—except they don’t. Only the most prominent findings are likely to be put to the test, because there’s likely to be publication payoff in firming up the proof, or contradicting it.
But even for medicine’s most influential studies, the evidence sometimes remains surprisingly narrow. Of those 45 super-cited studies that Ioannidis focused on, 11 had never been retested. Perhaps worse, Ioannidis found that even when a research error is outed, it typically persists for years or even decades. He looked at three prominent health studies from the 1980s and 1990s that were each later soundly refuted, and discovered that researchers continued to cite the original results as correct more often than as flawed—in one case for at least 12 years after the results were discredited.
Medical research is not especially plagued with wrongness. Other meta-research experts have confirmed that similar issues distort research in all fields of science, from physics to economics (where the highly regarded economists J. Bradford DeLong and Kevin Lang once showed how a remarkably consistent paucity of strong evidence in published economics studies made it unlikely that any of them were right). And needless to say, things only get worse when it comes to the pop expertise that endlessly spews at us from diet, relationship, investment, and parenting gurus and pundits. But we expect more of scientists, and especially of medical scientists, given that we believe we are staking our lives on their results. The public hardly recognizes how bad a bet this is. The medical community itself might still be largely oblivious to the scope of the problem, if Ioannidis hadn’t forced a confrontation when he published his studies in 2005.
Ioannidis initially thought the community might come out fighting. Instead, it seemed relieved, as if it had been guiltily waiting for someone to blow the whistle, and eager to hear more. David Gorski, a surgeon and researcher at Detroit’s Barbara Ann Karmanos Cancer Institute, noted in his prominent medical blog that when he presented Ioannidis’s paper on highly cited research at a professional meeting, “not a single one of my surgical colleagues was the least bit surprised or disturbed by its findings.” Ioannidis offers a theory for the relatively calm reception. “I think that people didn’t feel I was only trying to provoke them, because I showed that it was a community problem, instead of pointing fingers at individual examples of bad research,” he says. In a sense, he gave scientists an opportunity to cluck about the wrongness without having to acknowledge that they themselves succumb to it—it was something everyone else did.
The irony of his having achieved this sort of success by accusing the medical-research community of chasing after success is not lost on him, and he notes that it ought to raise the question of whether he himself might be pumping up his findings. “If I did a study and the results showed that in fact there wasn’t really much bias in research, would I be willing to publish it?” he asks. “That would create a real psychological conflict for me.” But his bigger worry, he says, is that while his fellow researchers seem to be getting the message, he hasn’t necessarily forced anyone to do a better job. He fears he won’t in the end have done much to improve anyone’s health. “There may not be fierce objections to what I’m saying,” he explains. “But it’s difficult to change the way that everyday doctors, patients, and healthy people think and behave.”
AS HELTER-SKELTER as the University of Ioannina Medical School campus looks, the hospital abutting it looks reassuringly stolid. Athina Tatsioni has offered to take me on a tour of the facility, but we make it only as far as the entrance when she is greeted—accosted, really—by a worried-looking older woman. Tatsioni, normally a bit reserved, is warm and animated with the woman, and the two have a brief but intense conversation before embracing and saying goodbye. Tatsioni explains to me that the woman and her husband were patients of hers years ago; now the husband has been admitted to the hospital with abdominal pains, and Tatsioni has promised she’ll stop by his room later to say hello. Recalling the appendicitis story, I prod a bit, and she confesses she plans to do her own exam. She needs to be circumspect, though, so she won’t appear to be second-guessing the other doctors.
Later, Ioannidis tells me he makes a point of having several clinicians on his team. “Researchers and physicians often don’t understand each other; they speak different languages,” he says. Knowing that some of his researchers are spending more than half their time seeing patients makes him feel the team is better positioned to bridge that gap; their experience informs the team’s research with firsthand knowledge, and helps the team shape its papers in a way more likely to hit home with physicians. It’s not that he envisions doctors making all their decisions based solely on solid evidence—there’s simply too much complexity in patient treatment to pin down every situation with a great study. “Doctors need to rely on instinct and judgment to make choices,” he says. “But these choices should be as informed as possible by the evidence. And if the evidence isn’t good, doctors should know that, too. And so should patients.”
We could solve much of the wrongness problem, Ioannidis says, if the world simply stopped expecting scientists to be right. That’s because being wrong in science is fine, and even necessary—as long as scientists recognize that they blew it, report their mistake openly instead of disguising it as a success, and then move on to the next thing, until they come up with the very occasional genuine breakthrough. But as long as careers remain contingent on producing a stream of research that’s dressed up to seem more right than it is, scientists will keep delivering exactly that.
“Science is a noble endeavor, but it’s also a low-yield endeavor,” he says. “I’m not sure that more than a very small percentage of medical research is ever likely to lead to major improvements in clinical outcomes and quality of life. We should be very comfortable with that fact.”