Author: David Epstein

   Some of the college students seemed to have unlearned number sense that most children have, such as the idea that adding two numbers gives you a third composed of the first two. A student who was asked to verify that 462 + 253 = 715 subtracted 253 from 715 and got 462. When he was asked for another strategy, he could not come up with subtracting 462 from 715 to see that it equals 253, because the rule he learned was to subtract the number to the right of the plus sign to check the answer.

   When younger students bring home problems that force them to make connections, Richland told me, “parents are like, ‘Lemme show you, there’s a faster, easier way.’” If the teacher didn’t already turn the work into using-procedures practice, well-meaning parents will. They aren’t comfortable with bewildered kids, and they want understanding to come quickly and easily. But for learning that is both durable (it sticks) and flexible (it can be applied broadly), fast and easy is precisely the problem.


   “Some people argue that part of the reason U.S. students don’t do as well on international measures of high school knowledge is that they’re doing too well in class,” Nate Kornell, a cognitive psychologist at Williams College, told me. “What you want is to make it easy to make it hard.”

   Kornell was explaining the concept of “desirable difficulties,” obstacles that make learning more challenging, slower, and more frustrating in the short term, but better in the long term. Excessive hint-giving, as in the eighth-grade math classroom, does the opposite; it bolsters immediate performance, but undermines progress in the long run. Several desirable difficulties that can be used in the classroom are among the most rigorously supported methods of enhancing learning, and the engaging eighth-grade math teacher accidentally subverted all of them in the well-intended interest of before-your-eyes progress.

   One of those desirable difficulties is known as the “generation effect.” Struggling to generate an answer on your own, even a wrong one, enhances subsequent learning. Socrates was apparently on to something when he forced pupils to generate answers rather than bestowing them. Generating answers this way requires the learner to intentionally sacrifice current performance for future benefit.

   Kornell and psychologist Janet Metcalfe tested sixth graders in the South Bronx on vocabulary learning, and varied how they studied in order to explore the generation effect. Students were given some of the words and definitions together. For example, To discuss something in order to come to an agreement: Negotiate. For other words, they were shown only the definition and given a little time to think of the right word, even if they had no clue, before it was revealed. When they were tested later, students did way better on the definition-first words. The experiment was repeated on students at Columbia University, with more obscure words (Characterized by haughty scorn: Supercilious). The results were the same. Being forced to generate answers improves subsequent learning even if the generated answer is wrong. It can even help to be wildly wrong. Metcalfe and colleagues have repeatedly demonstrated a “hypercorrection effect.” The more confident a learner is of their wrong answer, the better the information sticks when they subsequently learn the right answer. Tolerating big mistakes can create the best learning opportunities.

   Kornell helped show that the long-run benefits of facilitated screwups extend to primates only slightly less studious than Columbia students. Specifically, to Oberon and Macduff, two rhesus macaques trained to learn lists by trial and error. In a fascinating experiment, Kornell worked with an animal cognition expert to give Oberon and Macduff lists of random pictures to memorize, in a particular order. (Example: a tulip, a school of fish, a cardinal, Halle Berry, and a raven.) The pictures were all displayed simultaneously on a screen. By pressing them in trial-and-error fashion, the monkeys had to learn the desired order and then practice it repeatedly. But all practice was not designed equal.

   In some practice sessions, Oberon (who was generally brighter) and Macduff were automatically given hints on every trial, showing them the next picture in the list. For other lists, they could voluntarily touch a hint box on the screen whenever they were stuck and wanted to be shown the next item. For still other lists, they could ask for a hint on half of their practice attempts. And for a final group of lists, no hints at all.

   In the practice sessions with hints upon request, the monkeys behaved a lot like humans. They almost always requested hints when they were available, and thus got a lot of the lists right. Overall, they had about 250 trials to learn each list.

   After three days of practice, the scientists took off the training wheels. Starting on day four, the memorizing monkeys had to repeat all the lists from every training condition without any hints whatsoever. It was a performance disaster. Oberon got only about one-third of the lists right. Macduff got fewer than one in five. There was, though, an exception: the lists on which they never had hints at all.

   For those lists, on day one of practice the duo had performed terribly. They were literally monkeys hitting buttons. But they improved steadily each training day. On test day, Oberon nailed almost three-quarters of the lists that he had learned with no hints. Macduff got about half of them.

   The overall experiment results went like this: the more hints that were available during training, the better the monkeys performed during early practice, and the worse they performed on test day. For the lists that Macduff spent three days practicing with automatic hints, he got zero correct. It was as if the pair had suddenly unlearned every list that they practiced with hints. The study conclusion was simple: “training with hints did not produce any lasting learning.”

   Training without hints is slow and error-ridden. It is, essentially, what we normally think of as testing, except for the purpose of learning rather than evaluation—when “test” becomes a dreaded four-letter word. The eighth-grade math teacher was essentially testing her students in class, but she was facilitating or outright giving them the answers.

   Used for learning, testing, including self-testing, is a very desirable difficulty. Even testing prior to studying works, at the point when wrong answers are assured. In one of Kornell’s experiments, participants were made to learn pairs of words and later tested on recall. At test time, they did the best with pairs that they learned via practice quizzes, even if they had gotten the answers on those quizzes wrong. Struggling to retrieve information primes the brain for subsequent learning, even when the retrieval itself is unsuccessful. The struggle is real, and really useful. “Like life,” Kornell and team wrote, “retrieval is all about the journey.”


   If that eighth-grade classroom followed a typical academic plan over the course of the year, it is precisely the opposite of what science recommends for durable learning—one topic was probably confined to one week and another to the next. Like a lot of professional development efforts, each particular concept or skill gets a short period of intense focus, and then on to the next thing, never to return. That structure makes intuitive sense, but it forgoes another important desirable difficulty: “spacing,” or distributed practice.

