January 12, 2001
This article was published in The New Yorker dated January 8, 2001.
By Atul Gawande
In 1901, a professor of criminal law at the University of Berlin was lecturing to his class when a student suddenly shouted an objection to his line of argument. Another student countered angrily, and the two exchanged insults. Fists were clenched, threats made: “If you say another word…” Then the first student drew a gun, the second rushed at him, and the professor recklessly interposed himself between them. A struggle, a blast – then pandemonium.
Whereupon the two putative antagonists disengaged and returned to their seats. The professor swiftly restored order, explaining to his students that the incident had been staged, and for a purpose. He asked the students, as eyewitnesses, to describe exactly what they had seen. Some were to write down their account on the spot, some a day or a week later; a few even had to depose their observations under cross-examination. The results were dismal. The most accurate witness got twenty-six per cent of the significant details wrong; others up to eighty per cent. Words were put in people’s mouths. Actions were described that had never taken place. Events that had taken place disappeared from memory.
In the century since, professors around the world have reenacted the experiment, in one form or another, thousands of times; the findings have been recounted in legal texts, courtrooms, and popular crime books. The trick has even been played on audiences of judges. The implications are not trivial. Each year, in the United States, more than seventy-five thousand people become criminal suspects based on eyewitness identification, with lineups used as a standard control measure. Studies of wrongful convictions – cases where a defendant was later exonerated by DNA testing – have shown the most common cause to be eyewitness error. In medicine, this kind of systematic misdiagnosis would receive intense scientific scrutiny. Yet the legal profession has conducted no further experiments on the reliability of eyewitness evidence, or on much else, for that matter. Science finds its way to the courthouse in the form of “expert testimony” – forensic analysis, ballistics, and so forth. But the law has balked at submitting its methods to scientific inquiry. Meanwhile, researchers working outside the legal establishment have discovered that surprisingly simple changes in legal procedures could substantially reduce misidentification. They suggest how scientific experimentation, which transformed medicine in the last century, could transform the justice system in the next.
For more than two decades now, the leading figure in eyewitness research has been a blond, jeans-and-tweed-wearing Midwesterner named Gary Wells. He got involved in the field by happenstance: one morning in 1974, a packet from a Cincinnati defense attorney arrived at the department of psychology at Ohio State University, in Columbus, where Wells was a twenty-three-year-old graduate student. The attorney had written to see if anyone there could help him analyze a case in which he believed his client had been wrongly identified as an armed robber. Inside the envelope was a large black-and-white photograph of the lineup from which his client had been picked out. Digging around a little in his spare time, Wells was surprised to discover that little was known about how misidentification occurs. He corresponded with the attorney several times during the next year, though he never came up with anything useful. The suspect was tried, convicted, and sent to prison. Wells never did find out whether the client had been falsely identified. But the case got him thinking.
Some months later, he put together his first experiment. He asked people in a waiting room to watch a bag while he left the room. After he went out, a confederate got up and grabbed the bag. Then he dropped it and picked it up again, giving everyone a good look at him, and bolted. (One problem emerged in the initial experiment: some people gave chase. Wells had to provide his shill with a hiding place just outside the room.) Wells knew from all the previous demonstrations that people would often misidentify the perpetrator. Still, he figured, if they did it without great assurance it wouldn’t matter much: under directions that the Supreme Court laid out in 1972, courts placed strong weight on an eyewitness’s level of certainty. Wells found, however, that the witnesses who picked the wrong person out of the lineup were just as confident about their choices as those who identified the right person. In a later experiment, he assembled volunteer juries and had them observe witnesses under cross-examination. The jurors, it turned out, believed inaccurate witnesses just as often as they did accurate ones.
Wells tried variations on these experiments, first at the University of Alberta and later at Iowa State, where he’s now a professor of psychology, but after a time even he found the work discouraging. He did not just want to show how things go wrong; he wanted to figure out how they could be improved. His first clue came after several years, when he noticed an unexpected pattern: having multiple witnesses did not insure accurate identifications. In his studies, a crime might be witnessed by dozens of people, yet they would often finger the same wrong suspect. The errors were clearly not random.
To investigate further, Wells staged another crime, over and over, until he had gathered two hundred witnesses. The subjects were seated in a room, filling out what they thought were applications for a temporary job, when a loud crash came from behind the door to an adjacent room. A stranger (a graying, middle-aged, mustached local whom Wells had hired) then burst through the door, stopped in his tracks in evident surprise at finding people in the room, and retreated through the same door. Apparently finding a dead end that way, the man rushed in again, dropped an expensive-looking camera, picked it up, and ran out through the exit at the opposite end of the room. Everyone got several good looks at him. At this point, another person dashed in and said, “What happened to my camera?” Wells tested each witness, one by one. Half the group was given a photo lineup of six people – a “six-pack,” as the police call it – which included the actual perpetrator. (Police use photo lineups far more frequently than live ones.) In a group of a hundred individuals, fifty-four picked the perpetrator correctly; twenty-one said they didn’t think the guy was there; and the others spread their picks across the people in the lineup.
The second group of witnesses was given the same lineup, minus the perpetrator. This time, thirty-two people picked no one. But most of the rest chose the same wrong person – the one who most resembled the perpetrator. Wells theorizes that witnesses faced with a photo spread tend to make a relative decision, weighing one candidate against the others and against incomplete traces of memory. Studies of actual wrongful convictions lend support to the thesis. For example, in a study of sixty-three DNA exonerations of wrongfully convicted people, fifty-three involved witnesses making a mistaken identification, and almost invariably they had viewed a lineup in which the actual perpetrator was not there. “The dangerous situation is exactly what our experiments said it would be,” Wells says.
Once this was established, he and others set about designing ways to limit such errors. Researchers at the State University of New York at Plattsburgh discovered that witnesses who are not explicitly warned that a lineup may not include the actual perpetrator are substantially more likely to make a false identification, under the misapprehension that they’ve got to pick someone. Wells found that putting more than one suspect in a lineup – something the police do routinely – also dramatically increases errors. Most provocative, however, were the experiments performed by Wells and Rod Lindsay, a colleague from Queen’s University in Ontario, which played with the way lineups were structured. The convention is to show a witness a whole lineup at once. Wells and Lindsay decided to see what would happen if witnesses were shown only one person at a time, and made to decide whether he was the culprit before moving on. Now, after a staged theft, the vast majority of witnesses who were shown a lineup that did not include the culprit went through the set without picking anyone. And when the culprit was present, witnesses who viewed a sequential lineup were no less adept at identifying him than witnesses who saw a standard lineup. The innovation reduced false identifications by well over fifty per cent without sacrificing correct identifications. The results have since been replicated by others. And the technique is beautifully simple. It wouldn’t cost a dime to adopt it.
It has now been fifteen years since Wells and Lindsay published their results. I asked Wells how widely the procedure has been followed. He laughed, because, aside from a scattered handful of police departments, mainly in Canada, it was not picked up at all. “In general,” he told me, “the reaction before criminal-law audiences was ‘Well, that’s very interesting, but…’” A Department of Justice report released in 1999 acknowledged that scientific evidence had established the superiority of sequential-lineup procedures. Yet the report goes on to emphasize that the department still has no preference between the two methods.
Among the inquisitive and scientifically minded, there are a few peculiar souls for whom the justice system looms the way the human body once did for eighteenth-century anatomists. They see infirmities to be understood, remedies to be invented and tested. And eyewitness identification is just one of the practices that invite empirical scrutiny. Unfortunately, only a handful of scientists have had any luck in gaining access to courtrooms and police departments. One of them is Lawrence Sherman, a sociologist at the University of Pennsylvania, who is the first person to carry out a randomized field experiment in criminal enforcement methods. In 1982, with the support of Minneapolis Police Chief Anthony Bouza, Sherman and his team of researchers completed a seventeen-month trial in which they compared three tactics for responding to non-life-threatening domestic-violence calls: arrest, mediation, and ordering the violent husband or boyfriend to leave the home for eight hours. Arrest emerged as the most effective way to prevent repeated acts of violence. The research was tremendously influential. Previously, it had been rare to arrest a violent husband, at least where the assault was considered “non-severe.” Afterward, across the country, arrest became a standard police response.
Such cooperation from law enforcement has proved rare. In Broward County, Florida, researchers started a randomized study to see whether counseling for convicted wife-beaters reduced repeat violence – and prosecutors went to court to stop the study. The state of Florida had granted judges discretion in mandating such counseling, and there was a strong belief that it should be assigned broadly, to stop violence, not randomly, for the sake of study. (“No one is suggesting counseling is a panacea and will solve everyone’s problems,” the lead prosecutor told the local newspaper, “but I think everyone will agree, in a certain percentage of cases it works.”) The researchers managed to get most of the men through the study before it was shut down, though, and they discovered not only that counseling provided no benefit but that it actually increased the likelihood of re-arrest in unemployed men. (Probably that’s because the women misguidedly believed that counseling worked, and were more likely to agree to see the men again.) In the field of law enforcement, people simply do not admit such possibilities, let alone test them.
Consider the jury box. Steven Penrod, a professor of both psychology and law at the University of Nebraska at Lincoln and another lonely pioneer in this area, is happy to rattle off a series of unexplored questions. Are there certain voting arrangements that make false convictions or mistaken acquittals less likely? (Most states require jurors to reach a unanimous verdict for a criminal conviction, but others allow conviction by as few as eight out of twelve jurors.) How would changing the number of jurors seated – say, to three or seventeen or eight – affect decisions? Do jurors understand and follow the instructions that judges give them? What instructions would be most effective in helping juries reach an accurate and just decision? Are there practical ways of getting juries to disregard inadmissible testimony that a lawyer has brought in? These are important questions, but researchers have little hope of making their way into jury rooms.
Lawrence Sherman points out that one of the most fertile areas for work is that of prosecutorial discretion. Most criminal cases are handled outside the courtroom, and no one knows how prosecutors decide whom to prosecute, how effectively they make these decisions, how often they let risky people go, and so on. But he reports that prosecutors he has approached have been “uniformly opposed” to allowing observation, let alone experimental study. “I’ve proposed repeatedly, and I’ve failed,” Sherman told me. He has a difficult enough time getting cooperation from the police, he says, “but the lawyers are by far the worst.” In his view, the process of bringing scientific scrutiny to the methods of the justice system has hardly begun. “We’re holding a tiny little cardboard match in the middle of a huge forest at night,” he told me. “We’re about where surgery was a century ago.”
Researchers like Sherman say that one of their problems is the scarcity of financial support. The largest source of research funding is an obscure government agency called the National Institute of Justice, which was modeled on the National Institutes of Health when it was established, in 1968, but has a budget of less than one per cent of the N.I.H.’s. (The government spends more on meat and poultry research.) The harder problem, though, is the clash of cultures between the legal and the scientific approach, which is compounded by ignorance and suspicion. In medicine, there are hundreds of academic teaching hospitals, where innovation and testing are a routine part of what doctors do. There is no such thing as an academic police department or a teaching courthouse. The legal system takes its methods for granted: it is common sense that lineups are to be trusted, that wife-beaters are to be counseled, and that jurors are not to ask witnesses questions. Law enforcement, finally, is in thrall to a culture of precedent and convention, not of experiment and change. And science remains deeply mistrusted.
“The legal system doesn’t understand science,” Gary Wells told me. “I taught in law school for a year. Believe me, there’s no science in there at all.” When he speaks to people in the justice system about his work, he finds that most of his time is spent educating them about basic scientific methods. “To them, it seems like magic hand-waving and – boom – here’s the result. So then all they want to know is whose side you’re on – the prosecutor’s or the defendant’s.” In an adversarial system, where even facts come in two versions, it’s easy to view science as just another form of spin.
For a scientist, Gary Wells is a man of remarkable faith; he has spent more than twenty-five years doing research at the periphery of his own field for an audience that has barely been listening. When I point this out to him, it makes him chuckle. “It’s true,” he admits, and yet it does not seem to trouble him. “This may be my American optimism talking, but don’t you think, in the long run, the better idea will prevail?”
Lately, he has become fascinated with the alibi. “You know,” he told me in a recent conversation, “one of the strange things that pop up in DNA-exoneration cases is that innocent people often seem to be done in by weak or inconsistent alibis.” And it has got him thinking. Alibis seem so straightforward. The detective asks the suspect, “Where were you last Friday around 11 P.M.?” And if the suspect can’t account for his whereabouts – or, worse, gives one story now and another later – we take that as evidence against him. But should we? Wells wonders. How well do people remember where they were? How often do they misremember and change their minds? What times of the day is a person likely to have a provable alibi and what times not? How much does this vary among people who are married, who live alone, who are unemployed? Are there better ways to establish whether a suspect has a legitimate alibi? “No one knows these things,” he says.