What to Do When Algorithms Rule

The first American astronauts were recruited from the ranks of test pilots, largely due to convenience. As Tom Wolfe describes in his incredible book The Right Stuff, radar operators might have been better suited to the passive observation required in the largely automated Mercury space capsules. But the test pilots were readily available, had the required security clearances, and could be ordered to report to duty.

Test pilot Alan Shepard, the first American in space, did little during his first, 15-minute flight beyond being observed by cameras and a rectal thermometer (more on the “little” he did do later). Pilots rejected by Project Mercury dubbed Shepard “spam in a can.”

Other pilots were quick to note that “a monkey’s gonna make the first flight.” Well, not quite a monkey. Before Shepard, the first to fly in the Mercury space capsule was a chimpanzee named Ham, only 18 months removed from his West African home. Ham performed with aplomb.

Ham, the first “astronaut” to fly in the Mercury space capsule. Image: NASA/Wikimedia Commons

But test pilots are not the type to like relinquishing control. The seven Mercury astronauts felt uncomfortable filling a role that could be performed by a chimp (or spam). Thus started the astronauts’ quest to gain more control over the flight and to make their function more akin to that of a pilot. A battle for decision-making authority—man versus automated decision aid—had begun.

They wanted a window to look out of, which they got. They wanted control over the Mercury-Redstone rocket that would carry the capsule into space, which was denied. They wanted control over the thrusters that controlled the orientation of the capsule in space. They also wanted manual control over re-entry, such as using the thrusters to set the angle of attack. They were given a manual override for the thrusters and re-entry procedure, but the automatic systems were left in place. They also asked for an emergency hatch through which to get out of the capsule after splashdown; they otherwise had to wait until the hatch was unbolted from the outside. This request was granted.

For the second sub-orbital flight, “piloted” by Gus Grissom, the emergency hatch was in place. Whether or not Grissom, as Tom Wolfe colorfully phrased it, “screwed the pooch” and blew the hatch early after splashdown has been the subject of some debate. A more likely explanation was that after getting ahead of the post-landing checklist and disabling the emergency hatch safety mechanisms too early, Grissom bumped the plunger to trigger the explosive bolts. Or the bolts may have blown on their own. Regardless, the Mercury capsule “Liberty Bell 7” ended up 4.9 kilometers below the sea surface, and Grissom was pulled from the water near-drowned. A desire for control almost cost Grissom his life.

The Liberty Bell 7, “piloted” by Gus Grissom, filled with water and was unable to be extracted by helicopter after its initial return to Earth. It was finally recovered almost 38 years later. Image: NASA/Wikimedia Commons

Grissom’s experiences were not unique. Early flights were typified by operator errors linked to the requested modifications. After testing the manual attitude control during the first flight, Shepard forgot to turn it off when he reactivated the automatic system. This cost fuel that thankfully was not required on such a short flight. In the second orbital flight, Scott Carpenter was late in starting the re-entry procedure and left both the manual and automatic systems on for 10 minutes. As a result, he ran out of fuel during re-entry. Thankfully he survived, although his overshoot of the target by 250 miles led to an impression for 40 minutes that he was dead. “I’m afraid … we may have … lost an astronaut,” announced a teary Walter Cronkite before Carpenter was found.

From the wish to control a space capsule’s angle of attack on re-entry, to unwillingness to get into a lift without an operator, the reluctance to have our decisions and actions replaced by automated systems extends through a range of human activity and decision-making. It took nearly 50 years for people to accept automated lifts. Today, over three quarters of Americans are afraid to ride in a self-driving vehicle.

Human resistance to relinquishing decision-making to automated decision aids has been the subject of detailed research (for simplicity, I’ll refer to “automated decision aids” as algorithms). Despite the evidence of the superiority of (often simple) algorithms to human decision makers in many contexts, from psychiatric and medical diagnoses to university admissions offices (see here, here and here for some reviews), we humans tend not to listen to the answers (see here, here and here for examples of this reluctance). When humans are given a choice between their own judgment and that of a demonstrably superior algorithm, they will generally choose the former, even when it comes at the expense of themselves or their performance.

I discussed this last point in my previous article, “Don’t Touch the Computer.” Despite stories about the success of human–machine combinations in freestyle chess and dreams of a world where A.I. and humans work together in synergy, a person combined with an algorithm more often than not results in worse outcomes than the algorithm alone. A major factor in these poor outcomes is that decision makers simply fail to use or follow the algorithm. They leave the system on manual control. Even where autonomous vehicle technology is installed, it is often turned off.

Why do humans neglect superior algorithms? Suggestions include overconfidence, belief in their own expertise, and performance incentives that make them more likely to rely on their own judgement. Doctors fear algorithms will take the “art” out of clinical judgment. People tend to prefer their own inferior judgement when they see an algorithm err, possibly because they do not compare the algorithm’s performance to their own. Rather, they compare it to a reference point such as the target level of accuracy. Then there is the demand-side problem: those subject to or receiving the decision often prefer a human decision maker.

One suggestion that resonates with my own professional experience is the difficulty in demonstrating the value of something that is statistically superior. Suppose you have an algorithm that is 75 percent accurate and the human is 65 percent accurate. You now have a single case in front of you. Is this a case where the human is right and the algorithm is wrong? It is hard if not impossible to know—if you could systematically predict these cases, you could improve the algorithm. If a decision maker cannot know which is correct for the case in front of them, they can be reluctant to let go of their own judgement.
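To make the arithmetic concrete, here is a minimal sketch in Python using the 75 and 65 percent figures from the example. It rests on two simplifying assumptions that are mine, not part of the example: the decision is binary (so the two only disagree when exactly one of them is right), and the human's and the algorithm's errors are independent. The function names are illustrative only.

```python
def outcome_probabilities(p_algo: float, p_human: float) -> dict:
    """Probability of each joint outcome for a single case.

    ASSUMPTION: the algorithm's and the human's errors are independent.
    Real errors are often correlated, which would change these numbers.
    """
    return {
        "both_right": p_algo * p_human,
        "algo_only_right": p_algo * (1 - p_human),
        "human_only_right": (1 - p_algo) * p_human,
        "both_wrong": (1 - p_algo) * (1 - p_human),
    }


def algo_right_given_disagreement(p_algo: float, p_human: float) -> float:
    """Chance the algorithm is the one that is right when the two disagree.

    ASSUMPTION: a binary decision, so disagreement means exactly one is right.
    """
    probs = outcome_probabilities(p_algo, p_human)
    disagree = probs["algo_only_right"] + probs["human_only_right"]
    if disagree == 0:
        raise ValueError("The two never disagree under these accuracies.")
    return probs["algo_only_right"] / disagree


if __name__ == "__main__":
    for name, p in outcome_probabilities(0.75, 0.65).items():
        print(f"{name}: {p:.4f}")
    print(f"algorithm right when they disagree: "
          f"{algo_right_given_disagreement(0.75, 0.65):.4f}")
```

Under these assumptions, the human is right and the algorithm wrong in roughly 16 percent of cases, and when the two disagree the algorithm is right about 62 percent of the time. Backing the algorithm is the better bet, yet the human wins often enough that any single disagreement offers little evidence either way.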

So how can we encourage people to accept the use of algorithms when they will provide a superior or safer outcome? Once self-driving cars become safer than human drivers (if they’re not already), the unwillingness to use them will lead to more dangerous roads and deaths. Similarly, the failure to use superior decision-making tools in hospitals, schools, and other high-stakes environments is already costly.

One interesting possibility comes from an experiment by Berkeley Dietvorst, Joseph Simmons, and Cade Massey. The experimental subjects were given an algorithm to assist in predicting the percentile ranking of a student in their class. One group of subjects had to choose between using only the algorithm and using only their own judgement. A second group could choose between their own judgement and an algorithm whose result they could adjust. The adjustment mechanism allowed the subjects to shift the algorithm’s prediction up or down by up to 10, 5, or 2 percentage points, depending on the condition.

Giving subjects a role in the decision through the adjustment mechanism made them more likely to use the algorithm than those who simply had to accept its recommendation. This was the case even where the ability to adjust the algorithm was severely constrained. Although those who adjusted the algorithm performed worse than the algorithm alone, their constrained use of the algorithm gave them superior performance to those who solely used their own judgement.
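A constrained adjustment of this kind is simple to express. The sketch below is my own illustration, not the experimenters' code: the function name, its parameters, and the 0 to 100 percentile bounds are assumptions made for the example.

```python
def constrained_forecast(algo_prediction: float,
                         requested_shift: float,
                         max_shift: float) -> float:
    """Apply a human adjustment to an algorithm's prediction, within limits.

    algo_prediction: the algorithm's predicted percentile (0 to 100).
    requested_shift: how far the person wants to move it (may be negative).
    max_shift: the largest shift allowed in either direction, e.g. 10, 5, or 2.
    All three names are hypothetical, chosen for this illustration.
    """
    # Clamp the requested shift to the permitted band around the prediction.
    shift = max(-max_shift, min(max_shift, requested_shift))
    # Keep the final answer a valid percentile.
    return max(0.0, min(100.0, algo_prediction + shift))


if __name__ == "__main__":
    # The person wants to add 25 points but is only allowed 10.
    print(constrained_forecast(60.0, 25.0, 10.0))
    # A small downward tweak inside the band passes through unchanged.
    print(constrained_forecast(60.0, -3.0, 10.0))
```

However strongly the person disagrees, the final answer stays within `max_shift` of the algorithm's prediction, so the damage a poor adjustment can do is bounded.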

This research tentatively suggests that one option for increasing the use of algorithms is to give people constrained ability to intervene. In the same way that NASA gave the first astronauts limited control, with sometimes poor but thankfully not disastrous outcomes, we can allow people to tweak the output of the algorithm.

Surveys on attitudes about self-driving cars suggest an opportunity of this nature. While over three quarters of Americans are afraid to ride in a self-driving vehicle, the majority stated that they want some automation in their next vehicle. It is a degree of control they want, not complete control. How constrained could this control be? A big red stop button on the dash of the new self-driving car? And does that button have to be connected to anything?

Another study by Dietvorst (working paper) suggests a different strategy—changing what the decision maker perceives to be the default. When experimental subjects were asked what they would require to shift from their own judgement to that of the algorithm, they tended to assess the accuracy of the algorithm against the bonus threshold they were trying to meet, not against the weaker performance of their own judgement. However, when the algorithm was the default and they were asked what they would require to shift to their own judgement, they were substantially more likely to use the algorithm and performed better on the task.

This study is a limited and preliminary result, but it illustrates the central challenge to this approach—the difficulty of manipulating the default across numerous domains. But on the bright side, once an algorithm becomes the default, there will be a barrier to sliding back to the old option. There appears little demand for elevator operators today.

So now to another side of this story.

John Glenn, the first American astronaut to orbit the Earth. Image: NASA/Wikimedia Commons

The third Mercury flight was taken by John Glenn in what would become the first orbit of the earth by an American (although he was also preceded by a chimpanzee). When Glenn was flying over California, the automatic attitude control and the gyros, which controlled the capsule’s orientation, started to go haywire. Glenn’s attempts to reset the gyros did not solve the problem. The automatic firing of the thrusters from the capsule trying to right itself was chewing up the fuel supply.

That Glenn was drifting around in circles did not matter at this stage of the flight—it only mattered at the critical point of re-entry that the blunt end of the capsule be pointed in the right direction. Too steep an angle of attack and the capsule would burn up; too shallow and he would bounce off the earth’s atmosphere and remain in space. He also needed fuel at the time that angle would be set.

To save fuel for the re-entry sequence, Glenn shifted the attitude control to manual. When the time for re-entry came, he did not seek to perform the full operation himself, but tried the automatic controls a final time. He reset the gyros and put the controls on automatic. They worked. The yaw was still slightly off, so he nudged the manual thrusters several times to maintain the right alignment using visual sighting, before making a smooth re-entry.

A degree of manual intervention on Glenn’s part was required to return to earth alive. As reported by Newsweek two weeks after the orbit, “A trained and attentive pilot can be superior to the best-made robot mechanisms in the world. The machine faltered, never Glenn.”

This argument became more stark for the sixth and final Mercury flight, a 22-orbit mission piloted by Gordon Cooper. The automatic system failed, and Cooper was required to line the capsule up for re-entry, control the capsule’s orientation on all three axes using the hand controller, and fire the retrorockets by hand. He became a true pilot. The capsule landed right on target.

Gordon Cooper inside the Mercury space capsule. Image: NASA/Space.com

The flights of Glenn and Cooper clearly required human intervention over the automatic system. How should we think about these events?

As a start, the algorithm itself did not fail, but rather its execution in a complex automated system failed. In any new complicated system, there are bound to be errors, failures, or unanticipated environmental changes. This reality makes a strong case for an operator to be able to intervene without constraint.

It is these types of environments in which our most important decisions are made. Environments that are complex, dynamic, and full of Knightian uncertainty (risk that can’t be measured or calculated). Think of an important strategic decision by a company’s CEO. Or those first space flights. In contrast, most of those domains where algorithms have been found to be superior (and humans mess them up) involve regular decisions in a largely constant environment about which we are able to gather data.

So in these complex, dynamic, and uncertain domains, when should we trust the human decision maker? In what situations should we use an algorithm and when should humans override it? And how should we make these decisions?

Those are questions for another article.

Correction: A previous version of this article incorrectly stated that Scott Crossfield, not Scott Carpenter, took the second orbital flight.

Further Reading & Resources

  • Wolfe, T. (2008). The right stuff. New York: Picador.
  • Perrow, C. (1999). Living with high-risk technologies. New Jersey: Princeton University Press.
  • Dietvorst, B. J., Simmons, J. P., & Massey, C. (2016). Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science.
  • Dietvorst, B. J. (2016). People reject (superior) algorithms because they compare them to counter-normative reference points.
  • Eastwood, J., Snook, B., & Luther, K. (2012). What people want from their professionals: Attitudes toward decision‐making strategies. Journal of Behavioral Decision Making, 25(5), 458-468.