By Benedict Carey
July 16, 2018
The urge to pull down statues extends well beyond the public squares of
nations in turmoil. Lately it has been stirring the air in some corners of
science, particularly psychology.
In recent months, researchers and some journalists have strung cables around
the necks of at least three monuments of the modern psychological canon:
* The Stanford Prison Experiment, which found that people playacting as guards
quickly exhibited uncharacteristic cruelty.
* The landmark marshmallow test, which found that
young children who could delay gratification showed greater educational
achievement years later than those who could not.
* And the lesser known but influential
concept of ego depletion - the idea that willpower is like a muscle that
can be built up but also tires.
The assaults on these studies aren't all new. Each is a story in its own
right, involving debates over methodology and statistical bias that have
surfaced before in some form.
But since 2011, the psychology field has been giving itself an
intensive background check, redoing
more than 100 well-known studies. Often the original results cannot be
reproduced, and the entire contentious process has been colored, inevitably,
by generational change and charges of patriarchy.
"This is a phase of cleaning house and we're finding that many things aren't
as robust as we thought," said Brian Nosek, a professor of psychology at the
University of Virginia, who has led the replication drive. "This is a
reformation moment - to say let's self-correct, and build on knowledge that
we know is solid."
Still, the study of human behavior will never be as clean as physics or
cardiology - how could it be? - and psychology's elaborate simulations are
just that. At the same time, its findings are far more accessible and
personally relevant to the public than those in most other scientific fields.
Psychology has millions of amateur theorists who test the findings against
their own experience. The public's judgments matter to the field, too.
It is one thing to frisk the studies appearing almost daily in journals that
form the current back-and-forth of behavior research. It is somewhat
different to call out experiments that became classics - and world-famous
outside of psychology - because they dramatized something people recognized
in themselves and in others.
They live in the common culture as powerful metaphors, explanations for
aspects of our behavior that we sense are true and that are captured somehow
in a laboratory mini-drama constructed by an inventive researcher, or
The Stanford prison experiment is a case in point.
In the summer of 1971, Philip Zimbardo, a midcareer psychologist, recruited
24 college students through newspaper ads and randomly cast half of them as
"prisoners" and half as "guards," setting them up in a mock prison, complete
with cells and uniforms. He had the simulation filmed.
After six days, Dr. Zimbardo called the experiment off, reporting that the
"guards" began to assume their roles too well. They became abusive, some of
them shockingly so.
Dr. Zimbardo published dispatches about the experiment in a couple of
obscure journals. He provided a more complete report in an article he wrote
in The New York Times, describing how cruel instincts could emerge
spontaneously in ordinary people as a result of situational pressures and
That article and "Quiet Rage," a documentary about the experiment, helped
make Dr. Zimbardo a star in the field and media favorite, most recently in
the wake of the Abu Ghraib prison scandal in the early 2000s.
Perhaps the central challenge to the study's claims is that its author
coached the "guards" to be hard cases.
Is this coaching "not an overt invitation to be abusive in all sorts of
psychological ways?" wrote Peter Gray, a psychologist at Boston College who
excludes any mention of the
simulation from his popular introductory textbook.
"And, when the guards did behave in these ways and escalated that behavior,
with Zimbardo watching and apparently (by his silence) approving, would that
not have confirmed in the subjects' minds that they were behaving as they
Recent challenges have echoed Dr. Gray's, and earlier this month Dr.
Zimbardo was moved to post a response.
"My instructions to the guards, as documented by recordings of guard
orientation, were that they could not hit the prisoners but could create
feelings of boredom, frustration, fear and a sense of powerlessness - that
is, 'we have total power of the situation and they have none,'" he wrote.
"We did not give any formal or detailed instructions about how to be an
In an interview, Dr. Zimbardo said that the simulation was a "demonstration
of what could happen" to some people influenced by powerful social roles and
outside pressures, and that his critics had missed this point.
Which argument is more persuasive depends to some extent on where you sit
and what you may think of Dr. Zimbardo. Is it better to describe his
experiment, questions and all - or to ignore it entirely as not real
One psychologist who doesn't have to choose is David Baker, executive
director of the Center for the History of Psychology at the University of
Akron, which hosts the National Museum of Psychology.
"We put everything in that's an important part of our history, including the
controversy," Dr. Baker said.
"To me, the target question of an experiment should be considered," he
added. "In this case, do social context and expectations significantly
change behavior. And if so, when and how so?"
The issues surrounding the marshmallow studies and the ego depletion work
are different, but land researchers in the same fundamental bind: Is this
something, or is it nothing?
Even younger psychologists who are eloquent partisans on the side of
self-correction can be conflicted.
"With ego depletion especially, it seems like there's some truth there - we
have a subjective feeling of cognitive fatigue" after exercising
self-control, said Katie Corker, an assistant professor of psychology at
Grand Valley State University in Michigan.
A recent replication, rigorously done by one of the original authors, found
evidence of an effect, but it was a small one, Dr. Corker said.
"Maybe we're not studying it right, I don't know. The better question may
be, what does it take to kill off a big finding like this? Or, what should
Given modern ethics restrictions, mounting precise replications of old
experiments is not always possible. The prison experiment would likely have
to be seriously modified to pass institutional review.
The marshmallow test and ego depletion studies are fair game for further
examination, and in those cases modifications may in fact clarify the
picture. Some children do exhibit a streak of self-restraint early that
seems to become central to their developing personality. What is the best
way to measure that ability, or trait? What are its rewards over time, and
A more careful investigation of the "subjective cognitive fatigue" resulting
from exercising self-control might help answer the latter question. It may
also save ego depletion from being discarded prematurely as a useful
When Dr. Nosek published his first major replication paper in 2015, finding
that about 60 percent of prominent studies did not pan out on a second try,
it was a gift to skeptics eager to dismiss the entire field (and maybe all
of social science) as a joke, a congregation of poorly anchored findings
that shift in the wind, like nutrition advice.
It's not. On the contrary.
Housecleaning is a crucial corrective in science, and psychology has led by
example. But in science, as in life, there's reason for care before dragging
the big items to the curb.