Safe Autonomy: Debunking AV Industry Positions on Standards and Regulations

(This is the expanded version of: Philip Koopman, "Autonomous Vehicle Myths: The Dirty Dozen," EE Times, Oct. 22, 2021.)

Too often, I’ve read documents or listened to panel sessions that rehash misleading or just plain incorrect industry talking points regarding autonomous vehicle standards and regulations. The current industry strategy seems to boil down to “Trust us, we know what’s best,” “Don’t stifle innovation,” and “Humans are bad drivers, so computers will be better.”

As far as I can tell, what’s really going on is that automated-vehicle companies are saying what they say both to avoid being regulated and to avoid having to follow their own industry safety standards. That strategy has not yielded long-term safety in other industries that have tried it, however, and I predict that in the long term it will not serve the automotive industry well either. It certainly does not encourage trust.

In this essay, I address the usual industry talking points and provide summary rebuttals. I’m intentionally simplifying and generalizing each talking point for clarity. It is my hope that other stakeholders, policymakers, and regulators can use this information to encourage AV companies to talk about the things that matter, such as ensuring the safety of all road users. We need more transparency and honest discussion — not a continuation of the current, empty rhetoric.

It is important to be clear that, from everything I’ve seen, the rank-and-file engineers, and especially the safety professionals — if the company has any — are trying to do the right thing. It is the government relations and policy people, not the engineers, who are providing the facile talking points. And it is the high-level managers — the ones who set budgets, priorities, and milestones — who affect whether safety teams have sufficient resources and authority to build an AV that will in fact be acceptably safe. So this essay is directed at them, not at the engineers.

For more detailed guidance to state and municipal DOT and DMV regulators, see this blog post.

(I've numbered these points for easier reference. Making yourself a bingo card and bringing it to the next regulatory/policy panel session you attend to keep score is optional.)

Myth #1: 94% of crashes are due to human driver error, so AVs will be safer.

I’ve seen this myth stretched to suggest that 94% of crashes are due not just to human error but to “bad driver choices” (implying driving drunk and texting, perhaps). Sometimes, the cited percentage is 90%. Regardless of the particular spin, the usual, unstated implication is that AVs will be about 10 to 20 times safer by not making those same mistakes.
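As a rough sketch of where the "10 to 20 times safer" implication comes from: if you assume (naively, and this is the assumption the myth rests on) that an AV simply eliminates the stated share of crashes and adds no new ones, the implied improvement factor is 1 / (1 - share). The function name below is illustrative only.

```python
def implied_safety_multiplier(share_of_crashes_removed: float) -> float:
    """Factor by which crashes would fall if the given share simply vanished.

    Naive assumption: no new, AV-specific crashes are introduced.
    """
    if not 0.0 <= share_of_crashes_removed < 1.0:
        raise ValueError("share must be in the range [0, 1)")
    return 1.0 / (1.0 - share_of_crashes_removed)


# The two percentages commonly cited in the talking point.
for share in (0.90, 0.94):
    factor = implied_safety_multiplier(share)
    print(f"{share:.0%} removed -> about {factor:.1f}x fewer crashes")
```

The 90% figure gives a factor of 10 and the 94% figure a factor of roughly 17, which is where the implied range comes from. The calculation holds only if the underlying assumption does.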

However, the 94% number is a misrepresentation of the original source. A vocal proponent of this myth, notably under the previous administration, has been the U.S. DOT, which ironically was the source of the study being misrepresented.

What the study data actually shows is that 94% of the time, a human driver might have helped avoid a bad outcome. The source explicitly says that this is not 94% “blame” on the human driver. (Not discussed is the astonishing effectiveness of human drivers in mitigating technical failures. The number is more about pointing out that once in a while, they don’t get it right.)

Sometimes, the bad outcome is due to an overt mistake or to driving impaired or distracted. Often, failure to wear a seatbelt turns what would otherwise be an injury crash into a fatality.

But some crashes occur simply because the human driver incorrectly guessed the intent of another road user or misunderstood an unusual situation in the roadway. Those are mistakes an AV can make as well. The 94% number also ignores the possibility that roadway and other infrastructure improvements could increase safety even for human drivers.

Beyond the 94% number being more complex than just “driver error,” AVs will make different mistakes. This should be abundantly clear to anyone who’s watched automation road test videos. Yes, the technology will improve. But there is no evidence I’ve seen that proves an AV will be safer than a human driver anytime soon. The safety benefit is aspirational for now. And Tesla data doesn’t count, because Tesla blames the human driver for crashes, and a deployed fully automated vehicle does not have a human driver to blame. (See: "A Reality Check on the 94 Percent Human Error Statistic for Automated Cars")

To be sure, with something like $100 billion of investment chasing the problem, we could get there. Hardworking engineers at the AV companies are trying to make sure we do get there. But we’re still on the journey, not at the destination.

Myth #2: You can have either innovation or regulation, not both.

This is a false dichotomy. You can easily have both innovation and regulation, if the regulation is designed to permit it.

To consider a simple example, you could regulate road testing safety by requiring conformance to SAE J3018. That standard is all about making sure that the human safety driver is properly qualified and trained. It also helps ensure that testing operations are conducted in a responsible manner consistent with good engineering validation and road safety practices. It places no constraints on the autonomy technology being tested. (Adaptation would be needed for a safety driver in a chase car if it was deemed impractical to have a safety driver sit in a prototype vehicle for road testing; see this blog post.)

For more general approaches, you can switch from the current approach of track and road testing to more goal-based testing. For example, a regulation that tells you what symbol to put on a dashboard to tell the driver sitting in a driver seat that there is low tire pressure (Federal Motor Vehicle Safety Standard [FMVSS] 138) does indeed constrain design by requiring a light, a dashboard, a driver seat, and a driver. But the lighted symbol isn’t the point; getting the low tire pressure addressed is the point, and regulations can focus on that instead (see this blog post). To be sure, this requires a change in regulatory structure. But the choice isn’t between innovation and regulation; it is between old regulation and new regulation, and that is a far different matter — especially if the new elements of the regulatory approach are based on conforming to standards the industry itself wrote.

The primary industry standards for deployed AVs in the U.S. are ISO 26262 (functional safety), ISO 21448 (safety of the intended function, or SOTIF), and ANSI/UL 4600 (system-level safety). Indeed, the current US DOT proposal for regulation as of this writing is to get the AV industry to conform to precisely those standards. None of the standards stifle innovation. Rather, they promote a level playing field so companies can’t skimp on safety to gain a competitive timing advantage while putting other road users at undue risk.

If a company states that safety is their #1 priority, how can that possibly be incompatible with regulatory requirements to follow industry consensus safety standards written and approved by the industry itself?

Myth #3: There are already sufficient regulations in place (for example in California).

Existing regulations (with one exception) do not require conformance to any industry computer-related safety standard, and do not set any level of required safety. At worst, it is the "Wild West." At best, there are requirements for driver licensing, insurance, and reporting. But requirements on assuring safety, if any, are little more than taking the manufacturer's word for it.

The one exception is the New York City DOT’s rule to require the SAE J3018 road testing safety standard, and to attest that road testing will not be more dangerous than a normal human driver. (See: https://rules.cityofnewyork.us/rule/autonomous-self-driving-vehicles/ . For bonus points, see if you can find any of these myths in the comments submitted in response to that standard, or in responses to the DOT proposal referenced under Myth #2.)

Myth #4a: We don't need proactive AV regulation because of existing safety regulations.

The current Federal Motor Vehicle Safety Standard (FMVSS) regulations do not cover the safety of computer-based functionality. They are primarily about whether brakes work, whether headlights work, tire pressure, seat belts, airbags, and other topics that are basic safety building blocks at the vehicle behavior level. As the National Highway Traffic Safety Administration would tell you, merely passing FMVSS mandates is not enough to ensure safety on its own; it is simply a useful and important check to weed out the most egregious safety problems based on experience.

There are no FMVSS or other regulatory requirements for automotive software safety in general, let alone for AV-specific software safety.

Safety regulators should think hard about an approach in which “safety” means requiring insurance to compensate the next of kin after a fatality. With multi-billion dollar development war chests, a few million dollars of payout after a mishap might not be sufficient deterrent to taking safety shortcuts in the race to autonomy.

Myth #4b: We don’t need proactive regulation because of liability concerns and NHTSA recalls.

The National Highway Traffic Safety Administration generally operates reactively to bad events. Sometimes, car companies voluntarily disclose a problem. Other times, a number of people have to die or be seriously injured before NHTSA forces some action (for example, eleven crashes involving Tesla driver assistance "autopilot" with emergency vehicles occurred over a 3.5-year period before action was initiated, with an expectation of many months to resolution). For mature technology, maybe this is OK — if one makes the assumption that the industry is populated with only good faith actors. But even with that assumption, it isn’t enough for immature AV technology and manufacturers new to automotive safety.

Aircraft safety regulation used to wait for crashes, but air travel got a lot safer when FAA and airlines became proactive. Most importantly, a regulatory policy that waits for loss of life and limb before taking action can result in a process that takes years to solve problems, even as the loss events continue.

It would be better if companies voluntarily commit to follow their own industry’s safety standards. If not, we might be only one big news event from a regulatory hammer coming down.

Myth #5: Existing safety standards aren't appropriate because (pick one or more):

they are not a perfect fit;
no single standard applies to the whole vehicle;
they would reduce safety because they prevent the developer from doing more;
they would force the AV to be less safe;
they were not written specifically for AVs.

These statements misrepresent how the real standards work. ISO 26262, ISO 21448, and ANSI/UL 4600 all permit significant flexibility to be used in a way that makes sense. All three work together to fit any safe AV.

ISO 26262 can apply to any light vehicle on the road for the parts that aren’t the machine-learning–based mechanisms. Cars still have motors, brakes, wheels, and other non-autonomous features that have to be safe. The hardware on which the autonomy software runs can still conform to ISO 26262. All of these are covered by ISO 26262, and the standard specifically permits extension to additional scope.

ISO 21448 is explicitly scoped for AVs in addition to ADAS. Its origin story includes being proposed as an addition to ISO 26262, and it is written to be compatible with that standard.

ANSI/UL 4600 is specifically written for AVs. It applies to the whole vehicle as well as support infrastructure. The voting committee includes experts on ISO 26262 and ISO 21448, so it is compatible with those standards, and in fact it leads naturally to using all three of these standards and not just a “single” standard. (Anyone who knows standards knows it is improbable that any safety-critical system would involve use of just a single standard.)

There is no reason not to conform to these standards, and U.S. DOT has already proposed this set for the United States. All of these standards allow developers to do more than required. All of them are flexible enough to accommodate any AV. None of them force a company to be less safe (really, that argument is laughable). None of the standards constrain the technical approach used.

Myth #6: Local and state regulations need to be stopped to avoid a “patchwork” of regulations that inhibits innovation.

A significant reason that local and state regulations are a so-called patchwork is that in each jurisdiction, the AV companies play hardball to minimize regulation. Typically these negotiations involve statements that if regulation is too stringent, companies will take their jobs and spending elsewhere, and the jurisdiction in question will get a reputation for being hostile to innovation and technology. The outcome of each negotiation is different, resulting in somewhat different regulations or voluntary guidance from place to place.

If the industry changed its stance on avoiding regulation at all costs, the patchwork could be resolved via the same uniform state law-making mechanism that standardizes other driving laws. That would make things as uniform as is practical (“right turn on red” rule differences were around before AVs).

Moving to regulation based on industry standards would actually help in this regard, because national and international standards don’t change from city to city.

A federal regulation that requires conformance to standards would help address this issue. A federal regulation that prevents states from acting but does not itself ensure safety would be worse than nothing.

Myth #7: We conform to the “spirit” of ISO 26262, etc.

AV developers typically justify their “in the spirit of” statements by advancing the theory that there might be a need for deviation from the standard (beyond any deviations that the standards already permit). The statements never specify what the possibly required deviations might be, and I’ve never heard a concrete example at any of the many standards meetings I have attended. (I’m on the U.S. voting committees for all the standards listed in this essay.)

I’ve never heard an AV company argue that it conforms to the intent of a standard; the claim is only ever that it conforms to the “spirit” of the standard, whatever that means. Indeed, any “in the spirit” statement is meaningless, because the standards I’ve mentioned are all flexible enough that if you actually conform to the spirit and intent of the standard, you can conform to the standard. The standards explicitly permit not doing things that don’t apply and deviating from inapplicable clauses with appropriate decision processes. Doing that still permits conformance to the actual standard.

I worry that AV companies’ “spirit” claims are really code for cutting corners on safety when they think it is economically attractive to do so, or they’re in a hurry, or both.

A reasonable alternative explanation is simply that lawyers might want to avoid committing to something if nobody is forcing them to do so. That is understandable from their point of view, but it impairs transparency. The dark side of the strategy is that it provides cover for companies that are not the best actors to hide cutting corners. If companies are worried that they’ll be called out for not following a standard after a crash, they should spend the resources to actually follow the standard. Or they should not spend so much effort making public claims about safety being their top priority.

Companies that are truly doing their best on safety should be transparent about conforming to consensus standards to raise the bar for others.

Consider whether you would ride in an airplane whose manufacturer said, “We conform to the spirit of the aviation safety standards, but we’re very smart and our airplane is very special so we skipped some steps. It will be fine. This aircraft type hasn’t killed anyone yet, so trust us.”

Now ask yourself if you’d want to share the road with a test AV whose developer has said it wants the flexibility of not conforming to the industry standards for road testing that the developer itself helped write.

Myth #8: Government regulators aren’t smart enough about the technology to regulate it, so there should be no regulations. Industry is smarter and should just do what it thinks is best.

Following the proposed U.S. DOT plan to invoke the industry standards mentioned earlier makes sense, because it addresses precisely this concern. Even two years ago, the standards weren’t really there, but now they are. Industry decided which standards made sense, and then they created them.

If we could trust industry — any industry — to self-police safety in the face of short-term profit incentive and organizational dysfunction, we wouldn’t need regulators. But that isn’t the real world. Trusting the automotive industry to regulate development with immature, novel technology is unlikely to work. It’s possible, and important, for the industry to achieve a healthy balance between taking responsibility for safety and accepting regulatory oversight. Near-zero regulatory influence until after the crashes start piling up is not the right balance.

It’s difficult to understand why it is a bad idea for government regulators to say to the AV industry: We want you to follow your own safety standards, just as all the other industries do.

Myth #9: Disclosing testing data gives away the secret sauce for autonomy.

Road testing safety is all about whether a human safety driver can effectively keep a test vehicle from creating elevated risk for other road users. That has nothing to do with the secret-sauce autonomy intellectual property. It is about the effectiveness of the safety driver.

Companies sometimes say it would be too difficult or expensive to get or provide data. If companies don’t have data to prove that they are testing safely, they shouldn’t be testing. If they think that providing testing safety data is too expensive, they can’t afford the price of admission for using public roads as a testbed.

Testing data need not include anything about the autonomy design or performance. An example would be revealing how often test drivers fall asleep while testing. A non-zero result might be embarrassing, but how does that divulge secret autonomy technology data?

By the same token, regulators should not be asking for autonomy performance data such as how often the system “disengages” because of an internal fault or software issue. They should be asking how often other road users are placed at risk, which is an entirely different story. So the miles and locations tested, along with the collisions and near-miss situations that occur, make sense as measures of public exposure to risk. But metrics related to the quality of the autonomy itself do not, unless and until that data is being used to justify testing without a safety driver.

Myth #10: Delaying deployment of AVs is tantamount to killing people.

The safety benefits of AVs are aspirational and targeted for sometime in the future. Every year, that “sometime” seems to get further away. Given the track record of promises and delays, nobody really knows how far in the future. Moreover, there is no real proof to show that AVs will ever be safer than human-driven vehicles, especially with human-driven vehicles becoming safer via active safety systems such as automated emergency braking (AEB).

With something like $100 billion being spent on AV technology, it seems likely they will eventually be safer for appropriately restricted operational design domains (ODDs). But the “when” still remains a question mark.

Ignoring industry best practices to put vulnerable road users at risk today should not be permitted in a bid to maybe, perhaps, someday, eventually save potential later victims if the technology proves viable.

Even the famous RAND study urging early deployment was careful to say that AVs should be safer than human drivers initially. The discussion is not about whether AVs should be safer than humans when deployed, but rather how much safer (or, as RAND puts it, deploying while good rather than waiting for nearly perfect). Deploying vehicles that aren’t clearly shown to be safer than an unimpaired human driver in a similar ODD violates this principle. So does adopting test practices that result in risk worse than that presented by an unimpaired human driver operating on the same roads.

If saving lives on the road today is indeed the No. 1 priority, then a small fraction of the tens of billions of dollars being spent on AVs could be spent on reducing the drunk driving rate (embarrassingly high in the U.S. compared with Europe), improving roadway infrastructure, improving speed limit strategies, installing safer pedestrian crossings and bikeways, and so on — not on increasing near-term risk via premature deployment or irresponsible testing.

Don’t forget that bad press from a high-profile mishap can easily set the whole industry back. No company should be rolling the safety-shortcut dice to hit a near-term funding milestone while risking both people’s lives and the reputation of the entire industry.

Myth #11: We haven't killed anyone, so that must mean we are safe.

Often this amounts to arguing, “We’ve gotten lucky so far, so we plan to get lucky in the future.” If there is no evidence of robust, systematic safety engineering and operational safety practices, this amounts to a gambler on a winning streak claiming they will keep winning forever.

Think about the implications of accepting this argument. It means that every AV tester gets to operate however they like until they kill someone. This was effectively the dynamic in at least parts of the AV industry until a pedestrian was killed during a testing mishap in Tempe, Arizona. We should not be giving developers a free pass on safety, looking into the matter only after they have killed someone.

The one possible exception to this argument might be claimed if the company has a statistically significant basis for showing safety. For fatalities, that is perhaps 300 million miles of operation with zero fatalities, measured against an average human-driver rate of roughly one fatal crash per 100 million miles. In practice, however, even this argument doesn’t really work, because it requires that nothing change for a vehicle that is still being tested. Changes to vehicle software, changes to roads, different operational environments, different driver demographics, and so on all reset the test odometer, so to speak, and invalidate any safety claims being made. In reality, an argument based on history alone does not prove safety. And it raises the question of how safety can be ensured while 300 million miles of operational evidence are accumulated to support the claim.
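One way to see where a figure on the order of 300 million miles can come from is a simple Poisson calculation. The sketch below assumes fatalities occur independently at a constant rate of one per 100 million miles (the average quoted in this essay). The function names and the 95% confidence level are illustrative assumptions, not something drawn from any of the standards discussed here.

```python
import math

# Baseline quoted in the essay: roughly one fatal crash per 100 million miles.
BASELINE_FATALITIES_PER_MILE = 1 / 100_000_000


def prob_zero_fatalities(miles: float,
                         rate_per_mile: float = BASELINE_FATALITIES_PER_MILE) -> float:
    """Poisson probability of observing no fatalities over `miles`
    for a vehicle whose true rate equals `rate_per_mile`."""
    return math.exp(-rate_per_mile * miles)


def fatality_free_miles_needed(confidence: float,
                               rate_per_mile: float = BASELINE_FATALITIES_PER_MILE) -> float:
    """Fatality-free miles after which a vehicle that is only as safe as the
    baseline would, with probability `confidence`, already have had a fatality."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -math.log(1.0 - confidence) / rate_per_mile


if __name__ == "__main__":
    p = prob_zero_fatalities(300_000_000)
    print(f"Chance a baseline-level vehicle shows zero fatalities in 300M miles: {p:.3f}")
    print(f"Miles needed at 95% confidence: {fatality_free_miles_needed(0.95):,.0f}")
```

Under these assumptions, a vehicle no safer than the human baseline has only about a 5% chance of completing 300 million miles without a fatality. The model also makes the "reset the odometer" problem concrete: the calculation is only valid if the rate stays constant, which a software or operating-environment change undermines.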

Myth #12: Other states/cities let us test without any restrictions; you should too.

Whether regulators are willing to put their constituents at increased risk in exchange for some economic benefit is a decision they are permitted to make for themselves. But the hard reality is that any tester who is not at least doing as well as SAE J3018 for road testing safety is not following accepted practices and is likely putting the local population at unnecessary risk.

We all know what happened in the 2018 Tempe, Arizona, testing fatality. The NTSB chair pointed out at that investigation hearing that other companies didn’t need to have a similar crash to learn the lessons of this one. One result of learning those lessons was the starting document that became the newest revision of SAE J3018 for road testing safety. If testers won’t follow that consensus industry standard, they haven’t really taken that lesson to heart.

The responsibility of safety regulators is to promote safety. Vulnerable road users should not act as unwitting test subjects for AV road testers who can’t even be bothered to commit to following accepted industry safety practices. Regulators should not feel inhibited from merely asking developers to follow the industry safety standards that, in most cases, they themselves helped write.

Bonus myths beyond the "Dirty Dozen" -- but still problematic:

Myth #13: Testing deaths are a regrettable, but necessary price to pay for improved safety.

Usually this argument is accompanied by an observation that approximately 100 people die per day on US roads from human-driven vehicles. However, the proper risk comparison is not the number of deaths, but rather fatality rate per mile.

In the US, fatal car crashes happen approximately once every 100 million miles. The entire industry has not yet accumulated 100 million miles of AV road testing, but we’ve already seen a testing fatality (Uber in 2018). It’s unlikely that AV test fleets will rack up more than 100 million miles anytime soon, so the industry has already spent its fatality “budget” for AV testing deaths. There is no justifiable reason to road test in a way that is likely to result in further testing-related fatalities. Following industry standards for safe testing is the very least that testers should be doing.
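The per-mile comparison can be sketched as a ratio of the observed testing fatality rate to the human-driver baseline. The only figures taken from the text are the baseline (about one fatal crash per 100 million miles), the single testing fatality, and the 100-million-mile upper bound on industry test mileage. The 50-million-mile value is a purely hypothetical input used to show how the ratio moves, and the function name is illustrative.

```python
# Baseline quoted in the essay: about one fatal crash per 100 million miles.
MILES_PER_FATAL_CRASH = 100_000_000


def rate_vs_baseline(fatalities: int, miles: float,
                     miles_per_fatal_crash: float = MILES_PER_FATAL_CRASH) -> float:
    """Observed fatality rate expressed as a multiple of the baseline rate.

    A value above 1.0 means more fatalities per mile than the baseline.
    """
    if miles <= 0:
        raise ValueError("miles must be positive")
    return fatalities * miles_per_fatal_crash / miles


# Upper bound from the text: fewer than 100 million test miles, one fatality.
print(rate_vs_baseline(fatalities=1, miles=100_000_000))

# Hypothetical lower mileage, for illustration only (not a reported figure).
print(rate_vs_baseline(fatalities=1, miles=50_000_000))
```

With one fatality, the ratio is 1.0 at exactly 100 million miles and rises as the true mileage falls, so any total below 100 million miles puts the observed testing rate above the human-driver baseline.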

Myth #14: Self-certification has served the industry well, so it should not be changed.

Victims and their families involved with numerous wide-scale safety and environmental issues might think that self-certification has not served them well, even if the industry is happy with the situation. Exercise: Pick your favorite automotive industry safety or emissions scandal. Be sure to include class actions and death and injury suits, as well as criminal proceedings, verdicts, and settlements.

It is important to remember that industry “self-certification” is not required to address any functional or software safety standard, despite automotive-specific guidelines and standards describing how to do such safety in detail going back more than 25 years. So companies are not really required to certify anything except conformance to FMVSS, which is not about software and computer-based system safety.

Other industries actually follow their own safety standards (aviation, rail, chemical, power, mining, factory robotics, and HVAC are examples). As far as we can tell, most automotive original equipment manufacturers (OEMs) — the companies that actually sell cars, rather than those companies’ suppliers — do not. (The nuances are significant. Supply chains often follow safety standards if OEMs pay accordingly. And it is difficult to judge the validity of an OEM claim that it does something “similar to” or “better than” an industry standard. I haven’t found a single OEM statement that unambiguously reports conformance to ISO 26262 for their vehicle, which is the bedrock automotive safety standard, but go ahead and look for one. If you find one, tell me the source of that statement, and I’ll happily put a link right here: <none so far> ). (In fairness, it seems some organizations conform to ISO 26262 process chapters, but not (yet) the chapters regarding hardware and software design. The example I'm aware of is GWM, for parts 2, 3, 4, 7, 8, 9 but not parts 5 & 6.)

Automotive is the one life-critical equipment industry that does not even claim to follow its own industry safety standards.

Let that sink in.

Meanwhile, the industry rarely talks about the profound effect that removing the human driver is going to have on safety. For decades, the industry has promoted a driver-error narrative (see this paper for the history). Once there is no driver to blame, that narrative falls apart. It is time for the industry to stop the cycle of safety opacity and embrace safety standards.

A core concept of safety standards and safety is independence. Without independence, it is in practice impossible to get sustainable safety. Just ask Boeing. Yet car makers continually push back on any external oversight, and even on conforming to standards that permit non-external but substantively independent checks and balances.

Myth #15: Safety standards must be based on vehicle testing, via using "performance based safety standards."

Currently, FMVSS is based on vehicle testing, for a variety of reasons. The advantage is that test results can, at least in principle, be reproduced independently. However, the narrow testing parameters (for example, pavement temperature, air temperature, speed, tire pressure) mean FMVSS tests are a narrow check on minimum capability, not a robust characterization of safety across a full range of real-world conditions. That’s OK for what they do — which is to make sure certain features are present, not to ensure that features work across full environmental and usage conditions.

It has been known for decades that for computer-based systems such as AVs, you don’t get safety by testing. You get safety by following best practices and, where available, consensus industry safety standards. Testing is a way to spot check that you got safety right. Tests without required standards conformance won’t ensure safety.

Ironically, the FMVSS test-based regulations that the industry insists are the only ones we should have are probably the most unfriendly to innovation (see Myth #2).

When you hear someone say we should be using "performance based safety standards," that tends to be code for a testing-only approach (FMVSS-style) and avoiding process standards. It also implies rejecting conformance to industry safety standards and any requirement to perform accepted safety engineering practices.

Myth #16: Following standards would not be cost effective, or would force inferior approaches compared to "superior" internal proprietary standards.

I can’t think of anything in these standards that forces companies to be less safe than any internal standard I’ve seen. Remember that the companies themselves participate in writing these standards and would specifically complain if the standards were to break existing industry practices. Industry standards are written to be compatible with industry practices.

Any responsible company is already following internal standards, which should be at least as rigorous as published consensus industry standards. (If not, how is that a good thing?) They say their standards are better, so those standards should be more rigorous and therefore more expensive to accomplish. If companies think that conforming to industry standards is too expensive, what does that say about the resources they spend conforming to their purportedly superior internal standards?

One could speculate that the way their own internal standards are “superior” is that they permit cutting corners on safety procedures to reduce cost and speed up time to market. This would be consistent with an argument that following industry standards is too expensive and “stifles innovation.” But if we can’t see their standards, we can’t know for sure. And doing less than is required by the industry’s consensus safety standards sounds like a bad idea.

Myth #17: Regulations should be "standards neutral" to level the playing field.

That's ridiculous. The standards define the consensus level playing field.

All the standards mentioned go through an open industry-consensus process. Thousands of work-hours (at least) and multiple rounds of comments and balloting are spent making sure that all the stakeholders have their say. I can tell you from personal experience that the meetings are numerous and lengthy, and everyone who wants to have a say gets one. (At the end of the day, this is a good thing, even if those days are long.)

Anyone saying that regulations should be “standards neutral” more likely means they don’t want to have to follow standards at all.

All the standards mentioned are “technology neutral.” None of them require using a LiDAR, or a radar, or a camera, or whatever. What they do require is that whatever you decide to build into your vehicle ends up being acceptably safe.

Myth #18: ANSI/UL 4600 <is broken or says something awful>.

Grossly inaccurate statements about ANSI/UL 4600 are being circulated, apparently as part of a classic FUD (fear, uncertainty, and doubt) campaign. What’s being said often ranges from highly misleading to blatantly false. If you are told by an AV company or industry organization that ANSI/UL 4600 will cause problems, you should contact the author of this essay for more information (koopman@cmu.edu).

As an example, here is a response sent to the Washington State DOT at its specific request.

Philip Koopman is an associate professor at Carnegie Mellon University specializing in autonomous vehicle safety. He is on the voting committees for the industry standards mentioned. Regulators are welcome to contact him for support at koopman@cmu.edu

Friday, October 22, 2021
