Picture a driverless car cruising down the street. Suddenly, three pedestrians run in front of it. The brakes fail, and the car is about to hit and kill all of them. The only way out is for the car to swerve into the other lane and crash into a barrier. But that would kill the passenger it’s carrying. What should the self-driving car do?
Would you change your answer if you knew that the three pedestrians are a male doctor, a female doctor, and their dog, while the passenger is a female athlete? Does it matter if the three pedestrians are jaywalking?
Millions of similar scenarios were generated by an experimental website my fellow researchers and I created and named “Moral Machine.”
After the website received substantial media attention, more than four million people from 233 countries and territories visited it between June 2016 and December 2017. They rated scenarios like the one described above, which were inspired by the famous philosophical conundrum known as the trolley problem. Though all of the scenarios are unlikely in real life, what we learned from visitors’ appraisals of them could help inform the regulation and programming of autonomous vehicles (AVs) and may also have implications for machine ethics generally. The main question we wanted to answer: How does the public think autonomous vehicles should resolve moral trade-offs? And could we use their responses to build a new kind of moral compass?
But before we dive into that, it’s important to understand how driverless cars make moral decisions in the real world. This might seem like a problem for the future, but cars already are making such decisions. For example, let’s say a car is programmed to drive in the middle of a lane. Sometimes, the car may “decide” to drive closer to the right side or left side of the lane, a response to programming meant to optimize for various objectives. These could include maximizing passenger convenience or minimizing liability.
However, these decisions can also have moral consequences. Consider a car positioning itself in the rightmost lane, between a truck on its left and a cyclist on its right. How much berth the car gives to each determines how much risk it shifts from one road user to the other. Taken in aggregate over millions of miles, those small shifts in risk could lead to real harm and disadvantage a specific group (like cyclists) in the long run.
For instance, suppose that after millions of miles, we examine the event-data recorders—kind of like a black box for planes—of two car models. We see that for car A, 80 percent of the related deaths are cyclists and 20 percent are the car’s own passenger; for car B, the proportions are flipped. Which car is more acceptable? You can expect that the former, privileging the car’s own passengers, would be more in demand. But this would endanger other road users. These are the kinds of moral dilemmas behind the abstract trolley-problem scenarios.
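To see how those small per-mile probabilities accumulate, here is a minimal, purely illustrative simulation. The risk figures and the two positioning policies are invented assumptions, chosen only so that the aggregate fatality split roughly matches the hypothetical 80/20 example above; they are not measured values.

```python
import random

# Invented, purely illustrative per-mile probabilities of a fatal incident.
# "car A" hugs the truck, shifting risk toward the cyclist;
# "car B" hugs the cyclist, shifting risk toward its own passenger.
POLICIES = {
    "car A": {"cyclist": 4e-5, "passenger": 1e-5},
    "car B": {"cyclist": 1e-5, "passenger": 4e-5},
}

def simulate(per_mile_risk, miles=1_000_000, seed=0):
    """Count fatal incidents by victim type over many simulated miles."""
    rng = random.Random(seed)
    deaths = {victim: 0 for victim in per_mile_risk}
    for _ in range(miles):  # loop over individual miles driven
        for victim, p in per_mile_risk.items():
            if rng.random() < p:
                deaths[victim] += 1
    return deaths

for name, policy in POLICIES.items():
    deaths = simulate(policy)
    total = sum(deaths.values()) or 1
    shares = {v: f"{100 * n / total:.0f}%" for v, n in deaths.items()}
    print(name, deaths, shares)
```

With these assumed numbers, car A’s recorded deaths come out roughly 80 percent cyclists and 20 percent passengers, and car B’s roughly the reverse, even though no single trip ever looked like a trolley problem.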
In our experiment, the Moral Machine randomly presented participants with different treatments. Each treatment varied some of the following nine attributes: kind of intervention (stay/swerve), relationship to AV (pedestrians/passengers), legality (lawful/unlawful), gender (male/female), age (younger/older), social status (higher/lower), fitness (fit/large), number of characters (more/fewer), and species (humans/pets).
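For readers who like to see the design concretely, the sketch below shows one way such randomized treatments could be assembled. It is not the actual Moral Machine code; the attribute names and level labels simply mirror the list above, and the choice to vary three attributes per scenario is an arbitrary assumption made for illustration.

```python
import random

# The nine attribute dimensions described above, each with two levels.
ATTRIBUTES = {
    "intervention": ("stay", "swerve"),
    "relationship_to_av": ("pedestrians", "passengers"),
    "legality": ("lawful", "unlawful"),
    "gender": ("male", "female"),
    "age": ("younger", "older"),
    "social_status": ("higher", "lower"),
    "fitness": ("fit", "large"),
    "number_of_characters": ("more", "fewer"),
    "species": ("humans", "pets"),
}

def random_treatment(rng, n_attributes=3):
    """Draw a treatment that varies a random subset of the attributes."""
    chosen = rng.sample(sorted(ATTRIBUTES), k=n_attributes)
    return {attr: rng.choice(ATTRIBUTES[attr]) for attr in chosen}

rng = random.Random(2016)
for _ in range(3):
    print(random_treatment(rng))
```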
We recently published the results in Nature and reported two main findings. First, three of the nine attributes mattered most to respondents: the preferences to spare humans over pets, to spare more lives over fewer, and to spare younger humans over older humans.
Second, while people from most countries agreed on the general direction of these preferences, their magnitude varied considerably. For example, while most countries preferred sparing younger humans over older humans, this preference was less pronounced in Eastern countries.
A word of warning: the preferences we found are not meant to instruct car programmers or regulators as to how they should regulate AVs. Simply because people report certain preferences doesn’t mean that those preferences make for wise or just policy. After all, the public can be ill-informed and biased, and some of the preferences we report are troubling—for instance, the somewhat strong preference to spare a higher status person at the cost of a lower status person.
Nevertheless, the results can still serve as a data point for experts to look at, especially when it comes to our second main finding—that there are strong cross-country variations in preferences.
This implies that programming AVs with any particular set of ethical rules is likely to meet different levels of pushback (e.g., lower adoption, more protests) in different countries. For example, suppose that designers and programmers can increase passenger safety, but only on the condition that pedestrians follow road rules, and that this decision would result in less safety for jaywalkers. Should AV designers implement it? Our results showed that approval of such a decision varies across countries, and that the approval rate can be predicted by how strongly the rule of law is upheld in each country. This means that implementing universal rules in AVs could prove challenging.
This finding is important because developing universal ethical rules for AVs could be on the horizon. We’ve already seen various attempts to create broad ethical codes for intelligent machines, like the Asilomar AI Principles and AI4People. And in 2016, Germany became the first country to draft a code of ethics for AVs. But the German experience also illustrates the challenges embedded in any push for universal guidelines, whether at the national or global level. Even after long, contentious debates about particular parts of the code, German experts still couldn’t resolve certain issues, particularly whether to prioritize the driver’s life over that of other “noninvolved” parties. A detailed explanation of this particular rule, for instance, noted that the driver should not be saved unconditionally, but that their wellbeing should not be put last either. In other words, the debate ended inconclusively.
If experts within one country found it challenging to agree on specifics, how could we expect experts from different countries to agree on universal rules? One possible solution is to develop a high-level universal code of ethics and then leave the thorny details to be worked out within each country, recognizing that this, too, could prove challenging, given the German example. But since articulating such a high-level universal code could take some time, perhaps the focus for developers right now should be on creating a system where AV algorithms are transparent and accessible to the public. That way, the public can see and evaluate their vehicle’s moral compass and determine whether they think it’s pointed in the right direction.