Safe Autonomy: deployment

Showing posts with label deployment. Show all posts

Friday, December 29, 2023

My Automated Vehicle Safety Prediction for 2024

My 2024 AV industry prediction starts with a slide show with a sampling of the many fails for automated vehicles in 2023 (primarily Cruise, Waymo, Tesla). Yes, some hopeful progress in many regards. But so very many fails.

Jalopnik slide show link is here: https://jalopnik.com/this-was-the-dire-state-of-self-driving-cars-in-2023-1851115821

At a higher level, the real event was a catastrophic failure of the industry's strategy of relentlessly shouting as loud as they can "Hey, get off our case, we're busy saving lives here!" The industry' lobbyists and spin doctors can only get so much mileage out of that strategy, and it turns out it is far less (by a factor of 10+) than the miles required to get statistical validity on a claim of reducing fatality rates.

My big prediction for 2024 is the industry (if it is to succeed) will get a more enlightened strategy for both deployment criterion and messaging. Sure, on a technical basis, indeed it needs to be safer than comparable human driver outcomes.

But on a public-facing basis it needs to optimize for fewer embarrassments like the 30 photos-with-stories in this slide show. The whole industry needs to pivot into this priority. The Cruise debacle of the last few months proved (once again; remember Uber ATG?) that it only takes one company doing one ill-advised thing to hurt the entire industry.

I guess the previous plan was they would be "done" faster than people could get upset about the growing pains. Fait accompli. That was predictably incorrect. Time for a new plan.

Monday, December 4, 2023

Video: AV Safety Lessons To Be Learned from 2023 experiences

Here is a retrospective video of robotaxi lessons learned in 2023

What happened to robotaxis in 2023 in San Francisco.
The Cruise crash and related events.
Lessons the industry needs to learn to take a more expansive view to safety/acceptability:

Not just statistically better than a human driver
Avoid negligent driving behavior
Avoid risk transfer to vulnerable populations
Fine-grain regulatory risk management
Conform to industry safety standards
Address ethical & equity concerns
Build sustainable trust.

Preprint with more detail about these lessons here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4634179

Talk Slides: https://users.ece.cmu.edu/~koopman/lectures/L141-2023-12-AV-Safety.pdf

YouTube Video: https://youtu.be/eTwJJ3lpQC4

Archive.org alternate video source: https://archive.org/details/l-141-2023-12-av-safety

Saturday, August 26, 2023

Autonomous Vehicle State Policy Issues (Talk Video)

The commercial deployment of robotaxis in San Francisco has made it apparent many issues remain to be resolved regarding the regulation and governance of autonomous vehicle technology at the state and local levels. This talk is directed at state and local stakeholders who are considering how to set policies and regulations governing this technology.

Topics:

Getting past Automated Vehicle (AV) safety rhetoric
AV safety in a nutshell

Safe as a human driver on average
Avoiding risk transfer to vulnerable populations
Avoiding negligent computer driving
Conforming to industry consensus safety standards
Addressing other ethical & equity concerns

Policy points:

Societal benefits
Public road testing
Municipal preemption
SAE Level 2/2+/3 issues
Federal vs. state regulation
Other policy issues

Revisiting common myths

Slides: https://users.ece.cmu.edu/~koopman/lectures/L136-AV_State_Policy_Issues.pdf

Youtube video: https://youtu.be/6YTSjmxu-mI

Youtube play list of individual slide videos: https://www.youtube.com/watch?v=TSW4617EDeQ&list=PL_DQiOR0jhbWdRqQM17be0vylaUwd9WOc&pp=gAQBiAQB

Archive.org video: https://archive.org/details/L136-state-regulatory-advice-for-autonomous-vehicles

Washington State policy meeting in which I give this talk and answer questions: https://avworkgroupwa.org/committee-meeting/executive-committee-meeting-15

Sunday, January 8, 2023

The case for AVs being 10 to 100 times safer than human drivers

There is a case to be made that at-scale AV deployments should be at least ten times safer than human drivers, and perhaps even safer than that. The rationale for this large margin is leaving room for the effects of uncertainty via incorporating a safety factor of some sort.[1]

Consider all the variables and uncertainty discussed in this chapter. We have seen significant variability in fatality and injury rates for baseline human drivers depending on geographic area, road type, vehicle type, road user types, driver experience, and even passenger age. All those statistics can change year by year as well.

Additionally, even if one were to create a precise model for acceptable risk for a particular AV’s operational profile within its ODD, there are additional factors that might require an increase:

· Human biases to both want an AV safer than their own driving and to over-estimate their own driving ability as discussed in a previous section. In short, drivers want an AV driving their vehicle to better than they think they are rather than better than they actually are.

· Risk of brand tarnish from AV crashes which are treated as more newsworthy than human-driven vehicle crashes of comparable severity. Like it or not, AV crashes are going to be covered by news outlets as a consequence of the same media exposure that created interest in and funding for AV developers. Even if AVs are exactly as safe as human drivers in every respect, each highly publicized crash will call AV safety into question and degrade public trust in the technology.

· Risk of liability exposure to the degree that AV crashes are treated as being caused by product defects rather than human driver error. For better or worse (mostly for worse), “driver error” is attributed to a great many traffic fatalities rather than equipment failure or unsafe infrastructure design. Insurance tends to cover the costs. Even if a judicial system is invoked for drunk driving or the like, the consequences tend to be limited to the participants of a single mishap, and the limits of personal insurance coverage limit the practical size of monetary awards in many cases. However, the stakes might be much higher for an AV if it is determined that the AV is systematically prone to crashes in certain conditions or is overall less safe than a human driver. A product defect legal action could affect an entire fleet of AVs and expose a deep-pockets operator or manufacturer to having to pay a large sum. Being seen to be dramatically safer than human drivers could help both mitigate this risk and provide a better argument for responsible AV developer behavior.

· The risk of not knowing how safe the vehicle is. The reality is that it will be challenging to predict how safe an AV is when it is deployed. What if the safety expectation is too optimistic? Human-driven vehicle fatalities in particular are so rare that it is not practicable to get enough road experience to validate fatality rates before deployment. Simulation and other measures can be used to estimate safety but will not provide certainty. The next chapter talks about this in more detail.

Taken together, there is an argument to be made that AVs should be safer than human drivers by about a factor of 10 (being a nice round order of magnitude number) to leave engineering margin for the above considerations. A similar argument could be made for this margin to be an even higher factor of 100, especially due to the likelihood of a high degree of uncertainty regarding safety prediction accuracy while the technology is still maturing.

The factor of 100 is not to say that the AV must be guaranteed to be 100 times safer. Rather, it means that the AV design team should do their best to build an AV that is expected to be 100 times safer plus or minus some significant uncertainty. The cumulative effect of uncertainties in safety prediction, inevitable fluctuations in operational exposure to risky driving conditions, and so on might easily cost a factor of 10 in safety.[2] That will in turn reduce achieved safety to “only” a factor of 10 better than a baseline human driver. That second factor of 10[3] is intended to help deal with the human aspect of expectations being not just a little better than the safety of human drivers, but a lot better, the risk of getting unlucky with an early first crash, and so on.

Waiting to deploy until vehicles are thought to be 100 times safer than humans is not a message investors and design teams are likely to want to hear. But it is, however, a conservative way to think about safety that leaves room for the messiness of real-world engineering to deploy AVs. Any AV deployed will have a safety factor over (or under) Positive Risk Balance (PRB).

The question is whether the design team will manage their PRB safety factor proactively. Or not.

[1] Safety factors and derating are ubiquitous in non-software engineering. It is common to see safety factors of 2 for well understood areas of engineering, but values can vary. A long-term challenge for software safety is understanding how to make software twice as “strong” for some useful meaning of the word “strong.” Over-simplifying, with mechanical structures, doubling the amount of steel should make it support twice the load. But with software, adding twice the number of lines of code just doubles the number of defects, potentially making the system less reliable instead of more reliable unless special techniques are applied very carefully. And even then, predicted improvement can be controversial.
See: https://en.wikipedia.org/wiki/Factor_of_safety
https://en.wikipedia.org/wiki/Derating
and https://en.wikipedia.org/wiki/N-version_programming

[2] For better or worse – but given the optimism ingrained in most engineers, probably not for better.

[3] Some good news here – by the time you have a safety factor of 10 or more, nuances such as driver age and geofence zip codes start being small compared to the safety factor. If someone says they have a safety factor of 10, it is OK not to sweat the small stuff.

The effect of AV company business models on safety

The business model and exit plan for an AV company can powerfully incentivize behavior that is at odds with public safety and transparency. This is probably not news regarding any private company, but it is especially a problem for AV safety.

An AV developer with a plan to develop, deploy, and long-term sustain their technology should be incentivized to reach at least some level of safety subject to all the ethical issues discussed already in this chapter. If they do not, they will probably not have a viable long-term business. Arguments for a light regulatory touch often make this argument that companies will act in their own long-term best interest. But what if the business incentive model is optimized for something shorter than the “long-term” outcomes?

Short-term aspects of the business objectives and the business structure itself can add pressure that might tend to erode any commitment to acceptable safety. Factors include at least the following, several of which can interact with each other:

· Accepting money from traditional venture capital sources can commit a company to a five-year timeline to produce products. Thus far we have seen that five-year timelines are far too aggressive to develop and deploy an AV at scale. Re-planning and raising more funding can lengthen the timeline, but there remains risk that funding incentivizes aggressive milestones to show increased functionality and, in particular, remove safety drivers rather than core efforts on safety. Some companies will likely be better at resisting this pressure than others.

· A business exit plan of an Initial Public Offering (IPO), going public via a Special Purpose Acquisition Company (SPAC), or being bought out by a competitor historically emphasize perceived progress on functionality rather than safety. If the exit plan is to make safety someone else’s problem post-exit, it is more difficult to justify spending resources on safety rather than functionality until the company goes public.[1]

· The AV industry as a whole takes an aggressively non-regulatory posture, with that policy approach historically enabled by US DOT.[2] This situation forces little, if any, accountability for safety until crashes happen on public roads. There is a tendency for at least some companies seem to treat safety more as a public relations and risk management function than a substantive safety engineering function. Short-term incentives can align with a dysfunctional approach.

· Founders of AV companies with a primarily research, consumer software, or other non-automotive background might not appreciate what is involved in safety at scale for such systems. They might earnestly – but incorrectly – believe that when bugs are removed that automatically bestows safety, or otherwise have a limited view of the different factors of safety discussed in chapter 4. They might also earnestly believe some of the incorrect talking points discussed in section 4.10 regarding safety myths promoted by the AV industry.

· The mind-boggling amount of money at stake and potential winnings for participants in this industry would make it difficult for anyone to stay the course in ensuring safety in the face of rich rewards for expediency and ethical compromise. No matter how pure of spirit and well intentioned.

It is impossible to know the motivations, ethical framework, and sincerity of every important player in the AV industry. Many participants, especially rank and file engineers, are sincere in their desire to build AVs and believe they are helping to build a better, safer world. Regardless of that sincerity, it is important to have checks and balances in place to ensure that those good intentions translate into good outcomes for society.

One has to assume that outcomes will align with incentives. Without checks and balances, dangerous incentives can be expected to lead to dangerous outcomes. Checks and balances need to be a combination of internal corporate controls and government regulatory oversight. A profit incentive is insufficient to ensure acceptable safety, especially if it is associated with a relatively short-term business plan.

[1] Safety theater money spent to impart an aura of safety is a different matter, and spending on this area can bring good return on investment. But we are talking about real safety here. In the absence of a commitment to conform to industry safety standards it can be difficult to tell the difference without a deep dive into company practices and culture.

[2] The official US Department of Transportation policy still in effect at the time of this writing states: “In this document, NHTSA offers a nonregulatory approach to automated vehicle technology safety.” See page ii of: https://www.nhtsa.gov/document/automated-driving-systems-20-voluntary-guidance

Book: How Safe is Safe Enough? Measuring and Predicting Autonomous Vehicle Safety

How Safe Is Safe Enough for Autonomous Vehicles?

The Book

The most pressing question regarding autonomous vehicles is: will they be safe enough? The usual metric of "at least as safe as a human driver" is more complex than it might seem. Which human driver, under what conditions? And are fewer total fatalities OK even if it means more pedestrians die? Who gets to decide what safe enough really means when billions of dollars are on the line? And how will anyone really know the outcome will be as safe as it needs to be when the technology initially deploys without a safety driver?

This book is written by an internationally known expert with more than 25 years of experience in self-driving car safety. It covers terminology, autonomous vehicle (AV) safety challenges, risk acceptance frameworks, what people mean by "safe," setting an acceptable safety goal, measuring safety, safety cases, safety performance indicators, deciding when to deploy, and ethical AV deployment. The emphasis is not on how to build machine learning based systems, but rather on how to measure whether the result will be acceptably safe for real-world deployment. Written for engineers, policy stakeholders, and technology enthusiasts, this book tells you how to figure out what "safe enough" really means, and provides a framework for knowing that an autonomous vehicle is ready to deploy safely.

Currently available for purchase from Amazon, with international distribution via their print-on-demand network. (See country-specific distribution list below.)

See bottom of this post for e-book information, from sources other than Amazon, as well as other distributors for the printed book.

Media coverage and bonus content:

Free downloadable book preview here: BOOK PREVIEW
Book Review by Ojo-Yoshida Report
Podcast with Ojo-Yoshida Report
Video lecture summarizing main topics covered in the book: YouTube video | Archive.org video | slides

Chapters:

Introduction
Terminology and challenges
Risk Acceptance Frameworks
What people mean by "safe"
Setting an acceptable safety goal
Measuring safety
Safety cases
Applying SPIs in practice
Deciding when to deploy
Ethical AV deployment
Conclusions

368 pages.

635 footnotes.

On-line clickable link list for the footnotes here: https://users.ece.cmu.edu/~koopman/SafeEnough/

Koopman, P., How Safe Is Safe Enough? Measuring and Predicting Autonomous Vehicle Safety, September 2022.
ISBN: 9798846251243 Trade Paperback
ISBN: 9798848273397 Hardcover (available only in marketplaces supported by Amazon)

Also see my other recent book: The UL 4600 Guidebook

For those asking about distribution -- it is served by the Amazon publishing network. Expanded distribution is selected, so other distributors might pick it up in 6-8 weeks to serve additional countries (e.g., India) or non-Amazon booksellers, especially in US and UK. How that goes is beyond my control, but in principle a bookstore anywhere should be able to order it by about mid-November 2022. Alternately, you can order it direct from Amazon in the closest one of these countries for international delivery: US, UK, DE, FR, ES, IT, NL, PL, SE, JP, CA, AU.

Australia: https://www.amazon.com.au/dp/B0BCSCZ6NC

Canada: https://www.amazon.ca/dp/B0BCSCZ6NC

France: https://www.amazon.fr/dp/B0BCSCZ6NC

Germany: https://www.amazon.de/dp/B0BCSCZ6NC

Italy: https://www.amazon.it/dp/B0BCSCZ6NC

Japan: https://www.amazon.co.jp/dp/B0BCSCZ6NC

Netherlands: https://www.amazon.nl/dp/B0BCSCZ6NC

Poland: https://www.amazon.pl/dp/B0BCSCZ6NC

Spain: https://www.amazon.es/dp/B0BCSCZ6NC

Sweden: https://www.amazon.se/dp/B0BCSCZ6NC

UK: https://www.amazon.co.uk/dp/B0BCSCZ6NC

US: https://www.amazon.com/dp/B0BCSCZ6NC

You can also buy it from some Amazon country web sites via distributors. A notable example is:

India: https://www.amazon.in/How-Safe-Enough-Predicting-Autonomous/dp/B0BCSCZ6NC

Barnes & Noble US: https://www.barnesandnoble.com/w/how-safe-is-safe-enough-philip-koopman/1142576570

Your local bookstore should also be able to order it through their US or UK distributor.

And Target (mail order) of all places: https://www.target.com/p/how-safe-is-safe-enough-by-philip-koopman-paperback/-/A-88112477#lnk=sametab

E-book available from distributors as they pick it up over time:

Smashwords: https://www.smashwords.com/books/view/1347364

Apple Books: https://books.apple.com/us/book/how-safe-is-safe-enough-measuring-and-predicting/id6445826883

Barnes & Noble: https://www.barnesandnoble.com/w/how-safe-is-safe-enough-philip-koopman/1142576570?ean=2940166017215

What Happens After The Industry's JohnnyCab Adventure?

Recent news has people questioning whether autonomous vehicles are viable. Promises that victory is right around the corner are too optimistic. But it's far too early to declare defeat. There is a lot to process. But a pressing technology roadmap question is: if robotaxis aren't really the answer, what happens next for passenger vehicles?

Ahhhnold starts an ill-fated JohnnyCab trip in Total Recall (older version)

A robotoxi ride that did not go as expected. (Total Recall 1990)

We can expect OEMs to double down on their Level 2è2+è3 strategies. But there is a very real risk of a race to the bottom as companies scramble to deploy shiny automation technology while skipping over reasonable, industry-created safety practices. A subtle, yet crucial, point is that asking a driver to supervise a driver assistance system is a completely different situation than putting a civilian car owner in the position of being an untrained autonomous vehicle test driver.

Where are we now?

The recent demise of Argo AI has made it crystal clear that Big Auto is pivoting away from robotaxis. Big Auto has kept plugging away at driver assistance systems while hedging its bets by taking stakes in Level 4 robotaxi companies. Those Level 4 companies were carefully kept at arms length against the day it was time to close out those hedges. And here we are.

Other companies continue to work on robotaxis -- most notably Waymo, Cruise, and some players in China that are carrying passengers on public roads. Waymo seems unconstrained by runway length. Presumably Cruise has a year-end milestone as they did in 2020 and 2021, and we'll see how that goes. Meanwhile Tesla continues to celebrate anniversaries of its promise to put a million robotaxis on the road within a year. Heavy trucks, parcel delivery, and low speed shuttles are also part of the mix, but face their own challenges that are beyond the scope of this discussion.

For now let's entertain the possibility that robotaxis are not going to happen anytime soon. It seems likely to take years of hammering away at the heavy tail of edge cases to get there. Sure, some of us can get a demo ride in a Taxi of Tomorrow in a couple cities. But let's assume that robotaxis are more of a Disney-esque experience than a practical tool for profitable mobility at scale anytime soon.

What happens next?

We can expect to see OEMs double down on evolutionary strategies. Ford has pretty much said as much, promising a Level 3 system is on the way. It will probably be talked about as climbing up the SAE Levels from 2 to 3 to 4, which has been their narrative all along. That narrative has significant issues because it makes the common mistake of using SAE levels as a progressive ladder to climb -- which it is definitely not. But it will be the messaging framework nonetheless.

In the mix are several options:

Level 2 highway cruising systems that are already on the road
Level 2+ add-ons, but with the driver still responsible for safety
Traffic jam pilot as the first step to drivers taking attention off the road
Level 3/4 capabilities beyond traffic jams (harder than it might seem)
Abuse of Level 2/2+ designations to evade regulation

We'll see all of these in play. Each option comes with its own challenges.

Level 2 systems: highway cruising

The general idea is that first you build an SAE Level 2 system that has a speed+steering cruise control capability (automated speed control and automated lane keeping). This is intended for use on highways and other situations involving well behaved, in-lane travel. This functionality is already widely available on new high-end vehicles. The driver is responsible for continuously monitoring the car's performance and the road conditions, and intervening instantly if necessary -- whether the car asks for intervention or not.

A significant challenge for these systems is ensuring that drivers remain attentive in spite of inevitable automation complacency. This requires a highly capable driver monitoring system (DMS) that, for example, tracks where the driver is looking to ensure constant monitoring of the situation on the roadway.

Performance standards for monitoring driver attentiveness are in their infancy, and capabilities of the DMS on different vehicles varies dramatically. It is pretty clear that a camera-based system is much more effective than steering wheel touch sensors. And a camera that can see the driver's eye gaze direction, including having infrared emitters for night operation, is likely to be more effective than one that can't.

Another crucial safety feature that should be implemented is restricting operation of Level 2 features to the conditions they are designed to be safe in. The SAE J3016 taxonomy standard makes this totally optional -- which is one of the many reasons J3016 is not a safety standard, and why no government should write the SAE Levels into their safety regulations. But if the automation is supposed to only be used on divided highways with no cross-streets, then it should refuse to engage unless that is the road type it is being used on. Some companies restrict their vehicle to operation on specific roads, but others do not.

NTSB has made a number of recommendations to improve DMS capability and related Level 2 functionality, but the industry is still working on getting there.

Vehicle automation proponents have been beating the drum of potential safety benefit for years -- to the point that many take safety benefits as an article of faith. However, the reality is that there is no credible evidence that Level 2 capabilities make systems safer. There is plenty of propaganda to be sure, but much of that is based on questionable comparisons of very capable high-end vehicles with AEB and five-star crash test results vs. fleet-average 12 year old cars without all that fancy safety technology. Comparisons also tend to involve apples-meets-oranges different operational scenarios. The results are essentially meaningless, and simply provide grist for the hype machine.

At best, from available information, Level 2 systems are a safety-neutral convenience feature. At worst, AEB is compensating for a moderate overall safety loss from Level 2 features, contributing to an ever-increasing number of fatalities that might be associated with disturbing patterns such as crashing into emergency responders. However, nobody really knows the extent of any problems because of extreme resistance to sharing data by car companies that would reveal the true safety situation. (NHTSA has mandated crash reporting for some crashes, but there is no denominator of miles driven, so crash rates cannot be determined in a reliable way using publicly available data.)

If the safety benefits were really there, one imagines that these companies full of PhDs who know how to write research papers would be falling all over themselves to publish rigorously stated safety analyses. But all we're seeing is marketing puffery, so we should assume the safety benefit is not yet there (or at least potential safety benefits are not yet supported by the actual data).

In the face of these questions about automation safety and pervasive industry safety opacity, the industry's plan is to ... add even more automation.

Level 2+ systems

"Level 2+" is a totally made up marketing term. The SAE J3016 standard says this term is invalid. But it gets used anyway so let's go with it and try to see what it might mean in practice

Once you have a vehicle that stays in its lane and avoids most other vehicles, companies feel competitive pressure to add more capability. Features such as lane changing, automatically merging into traffic, and automatically detecting traffic lights might help drivers with workload. And sure, that sounds nice if it can be done safely.

The question is what happens to driver attention as automation features get added and automation performance improves. There is plenty of reason to believe that as the driver perceives automation quality is improving, they will struggle to remain engaged with vehicle operation. They will succumb to driver complacency. The better the automation, the worse the problem, as illustrated by this conceptual graph:

As autonomous features become more reliable net safety can decrease due to driver complacency

A significant complication is that the driver has to monitor not only the road conditions, but also the car's behavior to determine if the car will do the right thing or not. This can be pretty tricky, especially if it is difficult to know if the car "sees" something happening on the road or not. What if there is a firetruck stopped on the highway in front of you? Will the car detect and stop, or run right into it? How long do you wait to find out? What if you wait just a little too long and suffer a crash? Monitoring automation behavior can be tricky, and is a much different task than simply driving. Assuming that a competent driver is also a competent monitor is a bit of a leap of faith. As you add more complex automation behavior, the driver's ability to track what is and is not supposed to be handled and compare that to what the vehicle is doing can easily be overwhelming.

Monitoring cross-traffic can be especially difficult. How can the supervising driver know their car is about to accelerate out across traffic to make a left turn when there is an oncoming car? Pressing the brake quickly after the car has already started lunging into traffic as a safety supervising driver is an entirely different safety proposition than waiting for a driver to command speed when the road is clear. The human/computer interface implications here are tricky.

Car companies will continue to pile on more automation features. At least some of them will continue to improve DMS capability. There are many human/machine interface issues here to resolve, and the outcome will in large part depend on non-linear effects related to often the human driver needs to intervene quickly and correctly to ensure safety. How that will work out in real world conditions remains to be seen.

The moral hazard of blaming the driver

A significant issue with Level 2 and 2+ systems is that the driver is responsible for safety, but might not be put into a position to succeed. There are natural limits to human performance when supervising automation, and we know that it is difficult for humans to do well at this task. We should not ask human drivers to display superhuman capabilities, then blame them for being mere mortals. Driver monitoring might help, but we should respect the limits to how much it can help.

There is a temptation to blame the driver for any crash involving automation technology. However, doing so is counterproductive. A crash is a crash. Blame does not change the fact that a crash happened. Blame is a fairly ineffective deterrent to other drivers slacking off in general, and is wholly ineffective at converting normal humans into super-humans. If using a Level 2 or 2+ system results in a net increase in crashes, blaming the drivers won't bring the increased fatalities back to life -- even if we bring criminal charges.

If real-world crashes increase with the use of driver automation (compared to a comparably equipped vehicle in the same conditions without driver automation), they should be considered unreasonably dangerous. Something would need to be done about such systems to change the system, human behavior, or both. Changing human behavior via "education" is usually what is attempted, and almost never works. The human/computer interface and the feature set are much more likely what need to change.

As a simple example of how this might play out, let's say a car company runs TV advertisements showing drivers singing and clapping hands while engaging a Level 2/2+ system. (Dialing back the #autonowashing, later ads show only the passengers doing this.) The company should have data showing that even if a worst case driving automation error for their system takes place mid-clap, the driver will be attentive enough to the situation (despite being caught up in the song) and have a quick enough reaction time to get their hands back onto the steering wheel and intervene for safety. Blaming the driver for clapping hands after airing a TV commercial amounts should not be a permissible tactic.

Note that we are not saying hands-off is inherently unsafe, but rather that permitting (even encouraging) hands-off operation significantly increases the challenge of ensuring practical safety. The data needs to be there to justify safety for whatever operational concept is being deployed and encouraged.

ALKS: a baby Step Toward Level 3

The way the term "Level 3" is being used by almost everyone seldom matches what the SAE J3016 standard actually says. (See Myths #6, #7, #8. #14, #15, etc in this J3016 analysis.) But this is not the place to hash out that incredibly messy situation other than to note that SAE J3016 terminology is not how we should be describing these safety critical driving features at all.

For our purposes, let's assume when a car maker says "Level 3" they mean that the driver can take their eyes off the road, at least for a little bit, so long as they are ready to jump back into the role of driving if the car sounds an alarm for them to do so.

The industry's baby step toward this vision of Level 3 is Automated Lane Keeping Systems (ALKS) as described by the European standard UNECE Reg. 157. (To its credit, this standard does not mention SAE Levels at all.) The short version is that drivers can take their eyes off the road and let the car do the driving in slow speed situations. In general, this is envisioned as a traffic jam assistant (slow speed, stop-and-go, traffic jam situations). We can expect this to be the first Level 3 step in an envisioned transition from Level 2 to 2+ to 3 strategy that is already said to be deployed at small scale in Europe and Japan.

ALKS might work out well. Traffic jams on freeways are pretty straightforward compared to a lot of other operational scenarios. Superficially there are a lot of slow moving cars and not much else. Going just a bit deeper, if the jam is due to a crash there will be road debris, emergency responders, and potentially dazed victims walking through traffic. So saying "no pedestrians" or even "pedestrians will be well behaved" is unrealistic near a crash scene. But one can envision this can be managed, so it seems a reasonable first step so long as safety is considered carefully.

Broader Level 3 safety requirements

Going beyond the strict workings of SAE J3016 (which, if followed to the letter and not exceeded, will almost certainly result in an unsafe vehicle), Level 3 driving safety only works if:

The Automated Driving System (ADS) is 100% responsible for safety the entire time it is engaged. Period. Any weasel wording "but the driver should..." will lead to Moral Crumple Zone designs (blaming people for automation failing to operate as advertised). Put another way, if the driver is ever blamed for a crash when a Level 3 automation feature is engaged, it wasn't really Level 3.
The ADS needs to determine with essentially 100% reliability when it is time for the driver to intervene. This is one aspect of the most difficult problem in autonomous vehicle design: having the automation know when it doesn't know what to do. The magnitude of this challenge should not be underestimated. Safe Level 3 is not just a little harder than Level 2. The difference is night and day in any but the most benign operational conditions. Machine learning is terrible at handling unknowns (things it hasn't trained on), but recognizing something is an unknown is required to make sure the driver intervenes when the ADS cannot handle the situation.
The ADS ensures reasonable safety until the driver has time to respond, despite the fact that something has gone wrong (or is about to go wrong) to prompt the takeover request. This means the ADS needs to keep the car safe at least for a while. If the driver takes a long time to respond, the ADS needs to do something reasonable. In some cases perhaps an in-lane stop is good enough; in others not. (In practice this pushes the ADS arguably to be a very low-end Level 4 system, but we're back to J3016 standards gritty details so let's not even go there. The point is that the driver might take a long time to respond, and the ADS can't simply dump blame on the driver if a crash happens when the going gets tough.)

For ALKS, the main safety plan is to go slowly enough that the car can stop before it is likely to hit anything. That "anything" is predominantly other cars, and sometimes people at an emergency scene. High speed animal encounters, cross traffic, and other situations are ruled out by the narrowly limited operational design domain (ODD), which should be enforced by prohibiting activation in inappropriate situations. The ALKS standard presumes drivers will respond to takeover notifications in 10 seconds, but the vehicle has to do something reasonable even if that does not happen.

More generally, it is OK to take credit for most drivers being able to respond relatively quickly in most situations. It is likely that a more advanced Level 3 system will progressively degrade its operation over a period of many seconds, such as first slowing down, then coming to a stop in the safest place it can reach given the situation. If a driver falls asleep from boredom, it might take a while for honking horns from other drivers wake them up. Or it might take longer than 10 seconds to regain situational awareness if overly engrossed in a movie they were told was OK to watch while in this operational mode. Human subject studies can be used to claim credit for most drivers intervening relatively quickly (to the degree that is true) even if all drivers cannot respond that quickly. Credit could be taken for being able to pull over to the side of the road in most situations to reduce the risk of being hit by other vehicles, even if that is not possible in all situations. Safety for human driver takeover is not the result of a single ten second threshold, but rather a stack-up of various levels of degraded operation and probabilities of mishaps in a variety of real-world scenarios.

One might be able to argue net acceptable safety as long as the worst cases are infrequent. If the worst cases happen too often -- including inevitable misuse and abuse -- some sort of redesign will be needed. Again, blaming drivers or "educating" them won't fix a fundamentally unsafe operational concept. Safety must be built into the system, not dodged by blaming drivers for having an unacceptably high crash rate.

Advanced features and test platforms

Once ALKS is in place, the inevitable story is going to be to slowly expand the ODD. For example, if things seem safe, increase the speed at which ALKS can operate. Then let it do on/off ramps. And so on.

The same thinking will be in place for Level 2+ systems. If in-lane cruise control works, add lane changing. Add traffic merging. Let it operate in suburbs. Add unprotected left turns. Next thing you know, we'll be back to saying robotaxis will be here next year -- this time for sure!

While this seductive story sounds promising, it's not going to be easy. Automation complacency combined with limits of drivers to supervise potentially unpredictable autonomy functions will be a first issue to overcome. The second will be confusing a Level 2 driver with an autonomous vehicle testbed driver.

We've already discussed automation complacency. However, it is going to get dramatically worse with more advanced functionality. Humans learn to trust automation far too quickly. If a car correctly makes an unprotected left turn 100 times in a row, any safety driver might well be mentally unprepared to intervene in time when it pulls out in front of oncoming traffic on the 101st turn. Even something as simple as a automation being able to see moving cars but not detecting an overturned truck on the road has led to crashes, presumably due to automation complacency.

While car companies can argue that trained, professional safety drivers might keep testing acceptably safe (e.g., by conforming to the SAE J3018 safety standard -- which mostly is not being done), that does not apply to retail customers who are driving vehicles.

Tesla has led the industry in some good ways by promoting the adoption of electric vehicles. But it has set a terrible precedent by also adopting a policy of having untrained civilian vehicle owners operate as testers for immature automation technology, leading to reckless driving on public roads.

There is already significant market pressure for other companies to follow the Tesla playbook by pushing out immature automation features to their vehicles and telling drivers that they are responsible for ensuring safety. After all, if Tesla can get away with it, only suffering the occasional wrist slap for egregious rule violations, is it not the case that other companies have a duty to their stockholders to do the same so as to compete on a level playing field? As long as we (the public) allow companies to get away with blaming drivers for crashes instead of pinning responsibility on auto makers for deploying poorly executed automation features, reckless road testing (for example, running red lights) will continue to proliferate.

While it is difficult to draw a line in the sand, we can propose one. Any vehicle that can make turns at intersections with no driver control input is not a Level 2 system -- it is an autonomous vehicle prototype. There is an argument supporting this based on the evolution of J3016. (See page 242 of this paper.) But more importantly, supervising turns in traffic is clearly of a different nature than supervising cruise control on a highway. It takes much more attention and awareness of what the vehicle is about to do to prevent a tragic crash. Especially if you are busy clapping hands to a rock song.

Abusing Level 2/2+ to evade regulation

Another trail blazed by Tesla is letting regulators think they are deploying unregulated Level 2/2+ systems when they are really putting Level 3/4 testbeds on the roads in the hands of ordinary vehicle owners.

Companies should not call something Level 2+ when it is really a testbed for Level 3 features. Deployed Level 2+ features must be production ready instead of experimental, and should never put human drivers in the role of being an unqualified "tester" of safety critical functionality. Among other things, the autonowashing label of "beta test" that implies drivers are responsible for crashes due to automation malfunctions should be banned.

The details are subtle here: if a vehicle does something dangerous that is not reasonably expected by an ordinary driver, that should be considered a software malfunction. Any dangerous behavior that cannot be compensated for by an ordinary driver should not happen -- regardless of whether the driver has been told to expect it or not. Products sold to retail customers should not malfunction, nor should they exhibit unreasonably dangerous behaviors. Telling the driver they are responsible for safety should not change the situation.

Software should either work properly or be limited to trained testers operating under a Safety Management System. Rollout of Level 2+ and Level 3 features can happen in parallel, and that's fine. But a prototype Level 3 feature that can malfunction is not in any way the same as a mature Level 2+ feature -- even if they superficially behave the same way. It's not about the intended behavior, it's about having potentially very ill-behaved malfunctions. A Level 2+ system might not "see" a weird object in the road and perhaps the driver should be able to handle such a situation if they have been told that is an expected behavior. But any system that suddenly swerves the vehicle left into oncoming traffic is not fit for use by everyday drivers. A suddenly left-swerving vehicle is not Level 2/2+ -- it is a malfunctioning prototype vehicle that should only be operated by trained test drivers regardless of the claimed automation level.

Over time we'll see car companies aggressively try to get features to market. We can expect repeated iterations of arguments that the next incremental feature is no big deal for a driver to supervise. But if that feature comes with cautions that drivers must pay special attention beyond what a driver monitoring system supports, or incur extra responsibility, that should be a red flag that we're looking at asking them to be testers rather than drivers. Warnings that the vehicle "may do the wrong thing at the worst time" are unambiguously a problem. Such testing should be strictly forbidden for deployment to ordinary drivers.

Telling someone that a product is likely to have defects (that's what "beta" means these days, right?) means that they are being sold a likely-defective product. As such, they should not be distributed as retail products to everyday customers, regardless of click-through disclaimers. After all, any unqualified "tester" is not just placing themselves at risk, but other road users as well.

For those who say that this will impede progress, see my discussion of the myths propagated by the AV industry in support of avoiding safety regulation. Following basic road testing safety principles should be a part of any development process. Anything else is irresponsible and, in the long run, will hurt the industry more than it helps by generating bad press and accompanying ill will.

Civilian drivers should never be held responsible for malfunctioning technology by being used as Moral Crumple Zones. That especially includes mischaracterizing Level 3 (and above) prototype features as Level 2+ systems.

The bottom line

Perhaps NHTSA will wake up and put a regulatory framework in place to help the industry succeed at autonomous vehicle deployment safety. (They already have proposed a plan, but it is in suspended animation.) Even if the new NHTSA framework proceeds, regulations and proposals from both NHTSA and the states address levels 3-5, while leaving a situation ripe for Level 2+ abuse. Until that changes, expect to see companies pushing the envelope on Level 2+ as they compete for dominance in the vehicle automation market.

Video: How Safe Is Safe Enough for Autonomous Vehicles

Video lecture summarizing main topics covered in my book:

YouTube video (55 minutes)
YouTube playlist (in case you want to find a particular slide)
Archive.org video (55 minutes)
Lecture Slides
More information about the book

Abstract:

In this talk I outline some key factors involved in measuring and predicting autonomous vehicle (AV) safety. This includes what people mean by "safe," setting an acceptable safety goal, measuring & predicting safety, deciding when to deploy, and ethical AV deployment. A framework for making a responsible deployment decision needs to include not just risk, but also deal with inevitable uncertainty, stakeholder inclusion, and an ethical governance model. The talk is a high level overview of my recently published book: How Safe Is Safe Enough? Measuring and Predicting Autonomous Vehicle Safety.

Tuesday, May 28, 2019

Ethical Problems That Matter for Self Driving Cars

It's time to get past the irrelevant Trolley Problem and talk about ethical issues that actually matter in the real world of self driving cars. Here's a starter list involving public road testing, human driver responsibilities, safety confidence, and grappling with how safe is safe enough.

Public Road Testing. Public road testing clearly puts non-participants such at pedestrians at risk. Is it OK to test on unconsenting human subjects? If the government hasn't given explicit permission to road test in a particular location, arguably that is what is (or has been) happening. An argument that simply having a "safety driver" mitigates risk is clearly insufficient based on the tragic fatality in Tempe AZ last year.

In general, see: https://grants.nih.gov/policy/humansubjects.htm
A credible, independently reviewed argument that a road testing campaign provides adequate safety might be enough. But how transparent should that process be? Should we just take a company's word for it that they are doing responsible, safe road testing? For an example of a safety argument approach, see: https://safeautonomy.blogspot.com/2019/04/safety-argument-consideration-for.html

Expecting Human Drivers to be Super-Human. High-end driver assistance systems might be asking the impossible of human drivers. Simply warning the driver that (s)he is responsible for vehicle safety doesn't change the well known fact that humans struggle to supervise high-end autonomy effectively, and that humans are prone to abusing highly automated systems. This gives way to questions such as:

At what point is it unethical to hold drivers accountable for tasks that require what amount to super-human abilities and performance?
Are there viable ethical approaches to solving this problem? For example, if a human unconsciously learns how to game a driver monitoring system (e.g., via falling asleep with eyes open -- yes, that is a thing) should that still be the human driver's fault if a crash occurs?
Is it OK to deploy technology that will result in drivers being punished for not being super-human if result is that the total death rate declines?

Confidence in Safety Before Deployment. There is work that advocates even slightly better than a human is acceptable (https://www.rand.org/blog/articles/2017/11/why-waiting-for-perfect-autonomous-vehicles-may-cost-lives.html). But there isn't a lot of discussion about the next level of what that really means. Important ethical sub-topics include:

Who decides when a vehicle is safe enough to deploy? Should that decision be made by a company on its own, or subject to external checks and balances? Is it OK for a company to deploy a vehicle they think is safe based just on subjective criteria alone: "we're smart, we worked hard, and we're convinced this will save lives"
What confidence is required for the actual prediction of casualties from the technology? If you are only statistically 20% confident that your self-driving car will be no more dangerous than a human driver, is that enough?
Should limited government resources that could be used for addressing known road safety issues (drunk driving, driving too fast for conditions, lack of seat belt use, distracted driving) be diverted to support self-driving vehicle initiatives using an argument of potential public safety improvement?

How Safe is Safe Enough? Even if we understand the relationship between an aggregate safety goal and self-driving car technology, where do we set the safety knob? How will the following issues affect this?

Will risk homeostatis apply? There is an argument that there will be pressure to turn up the speed/traffic volume dials on self-driving cars to increase permissiveness and traffic flow until the same risk as manual driving is reached. (Think more capable cars resulting in crazier roads with the same net injury and fatality rates.)
Is it OK to deploy initially with a higher expected death rate than human drivers under an assumption that systems will improve over time, long term reducing the total number of deaths? (And is it OK for this improvement to be assumed rather than proven to be likely?)
What redistribution of demographics for victims is OK? If fewer passengers die but more pedestrians die, is that OK if net death rate is the same? Is is OK if deaths disproportionately occur to specific sub-populations? Did any evaluation of safety before deployment account for these possibilities?

I don't purport to have the definitive answers to any of these problems (except a proposal for road testing safety, cited above). And it might be that some of these problems are more or less answered. The point is that there is so much important, relevant ethical work to be done that people shouldn't be wasting their time on trying to apply the Trolley Problem to AVs. I encourage follow-ups with pointers to relevant work.

If you're still wondering about Trolley-esque situations, see this podcast and the corresponding paper. The short version from the abstract of that paper: Trolley problems are "too contrived to be of practical use, are an inappropriate method for making decisions on issues of safety, and should not be used to inform engineering or policy." In general, it should be incredibly rare for a safely designed self-driving car to get into a no-win situation, and if it does happen they aren't going to have information about the victims and/or aren't going to have control authority to actually behave as suggested in the experiments any time soon, if ever.

Here are some links to more about applying ethics to technical systems in general (@IEEESSIT) and autonomy in particular (https://ethicsinaction.ieee.org/), as well as the IEEE P7000 standard series (https://www.standardsuniversity.org/e-magazine/march-2017/ethically-aligned-standards-a-model-for-the-future/).

Tuesday, June 5, 2018

A Reality Check on the 94 Percent Human Error Statistic for Automated Cars

Automated cars are unlikely to get rid of all the "94% human error" mishaps that are often cited as a safety rationale. But there is certainly room for improvement compared to human drivers. Let's sort out the hype from the data.

You've heard that the reason we desperately need automated cars is that 94% of crashes are due to human error, right? And that humans make poor choices such as driving impaired, right? Surely, then, autonomous vehicles will give us a factor of 10 or more improvement simply by not driving stupid, right?

Not so fast. That's not actually what the research data says. In the words of NTSB Chair Jennifer Homendy: It Ain't 94%!

It's important for us to set realistic expectations for this promising new technology. Probably it's more like 50%. Let's dig deeper.

(NEW: video explainer here 8 minutes https://youtu.be/pYb4X5aJhgU)

(https://bit.ly/2smqwYu /CC0)

The US Department of Transportation publishes an impressive amount of data on traffic safety -- which is a good thing. And, sure enough, you can find the 94% number in DOT HS 812 115 Traffic Safety Facts, Feb. 2015. (https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/812115) which says: "The critical reason was assigned to drivers in an estimated 2,046,000 crashes that comprise 94 percent of the NMVCCS crashes at the national level. However, in none of these cases was the assignment intended to blame the driver for causing the crash" (emphasis added). There’s the 94% number. But wait – they’re not actually blaming the driver for those crashes! We need to dig deeper here.

Before digging, it's worth noting that this isn't really 2015 data, but rather a 2015 summary of data based on a data analysis report published in 2008. (DOT HS 811 059 National Motor Vehicle Crash Causation Survey, July 2008 https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/811059) If you want to draw your own conclusions you should look at the original study to make sure you understand things.

Now that we can look at the primary data, first we need to see what assigning a "crash reason" to the driver really means. Page 22 of the 2008 report sheds some light on this. Basically, if something goes wrong and the driver should have (arguably) been able to avoid the crash, the mishap “reason” is assigned to the driver. That's not at all the same thing as the accident being the driver's fault to due an overt error that directly triggered the mishap. To its credit that original report makes this clear. This means that the idea of crashes or fatalities being “94% due to driver error” differs from the report's findings in a subtle but critical way.

Indeed, many crashes are caused by drunk drivers. But other crashes are caused by something bad happening that the driver doesn't manage to recover from. Still other crashes are caused by the limits of human ability to safely operate a vehicle in difficult real-world situations despite the driver not having violated any rules. We need to dig still deeper to understand what's really going on with this report.

Page 25 of the report sheds some light on this. Of the 94% of mishaps attributed to drivers, there are a number of clear driver misbehaviors listed, including distracted driving, illegal maneuvers, and sleeping. But the #1 problem is "Inadequate surveillance 20.3%." In other words, a fifth of mishaps blamed on drivers are the driver not correctly identifying an obstacle, missing some road condition, or other problem of that nature. While automated cars might have better sensor coverage than a human driver's eyes, misclassifying an object or being fooled by an unusual scenario could happen with an automated car just as it can happen to a human. (In other words, sometimes driver blame could be assigned to an automated driver, even if part of the 94%.) This biggest bin isn’t drunk driving at all, but rather gets to a core reason of why building automated cars is so hard. Highly accurate perception is difficult, whether you're a human or a machine.

Other driver bins in the analysis include "False assumption of other's action 4.5%," "Other/unknown recognition error 2.5%," Other/unknown decision error 6.2%," and "Other/unknown driver error 7.9%". That’s another 21% that might or might not be impaired driving, and might be a mistake that could also be made by an automated driver.

So in the end, the 94% human attribution for mishaps isn't all impaired or misbehaving drivers. Rather, many of the reasons assigned to drivers sound more like imperfect drivers. It's increasingly clear that autonomous vehicles can also be imperfect. For example, they can misclassify objects on the road. So we can't blithely claim that automated cars won't have any of failures that the study attributes to human error. Rather, at least some of these problems will likely change from being assigned to "human driver error" to instead being "robot driver error." Humans aren't perfect. Neither are robots. Robots might be better than humans in the end, but that area is still a work in progress and we do not yet have data to prove that it will turn out in the robot driver's favor any time soon.

A more precise statement of the study's findings is that while it is indeed true that 94% of mishaps might be attributed to humans, significantly less than that number is attributable to poor human choices such as driving drunk. While I certainly appreciate that computers don't drive drunk, they just might be driving buggy. And even if they don’t drive buggy, making self driving cars just as good as an unimpaired driver is unlikely to get near the 94% number so often tossed around. Perception, modeling expected behavior of other actors, and dealing with unexpected situations are all notoriously difficult to get right, and are all cited sources of mishaps. So this should really be no surprise. It is possible automated cars might be a lot better than people eventually, but this data doesn't support that expectation at the 94% better level.

We can get a bit more clarity by looking at another DOT report that might help set more realistic expectations. Let's take a look at 2016 Fatal Motor Vehicle Crashes: Overview (DOT HS 812 456; access via https://www.nhtsa.gov/press-releases/usdot-releases-2016-fatal-traffic-crash-data). The most relevant numbers are below (note that there is overlap, so the categories add up to more than 100%):

Total US roadway fatalities: 37,461
Alcohol-impaired-driving fatalities: 10,497
Unrestrained passenger vehicle fatalities (not wearing seat belts): 10,428
Speeding-related fatalities: 10,111
Distraction-affected fatalities: 3,450
Drowsy driving fatalities: 803
Non-occupant fatalities (sum of pedestrians, cyclists, other): 7,079

I'm not going to attempt a detailed analysis, but certainly we can get the broad brush strokes from this data. I come up with the following three conclusions:

1. Clearly impaired driving and speeding contribute to a large number of fatalities (ballpark twenty thousand, although there are overlaps in the categories). So there is plenty of low hanging fruit to go after if we can create a automated vehicle that is as good as an unimpaired human. But it might be more like a 2x or 3x improvement than a 20x improvement. Consider the typical 100 million mile between fatality number that is quoted for human drivers. If you remove the impaired drivers, based on this data you get more like about 200 million miles between fatalities. It will take a lot to get automated cars that good. Don't get me wrong; a 2x or so improvement is potentially a lot of lives saved. But it's not zero fatalities, and it's nowhere near a 94% reduction.

2. Almost one-fifth of the fatalities are pedestrians, cyclists, and other at-risk road users. Detecting and avoiding that type of crash is notoriously difficult for autonomous vehicle technology. Occupants have all sorts of crash survivability features. Pedestrians -- not so much. Ensuring non-occupant safety has to be a high priority if we're going to deploy this technology and avoid an unintended consequence of a rise in pedestrian fatalities potentially offsetting gains in occupant safety.

3. Well over a quarter of the vehicle occupant fatalities are attributed to not wearing seat belts. Older generations have lived through the various technological attempts to enforce seat belt use. (Remember motorized seat belts?) The main takeaway has been that some people will go to extreme lengths to avoid wearing seat belts. It's difficult to see how automated car technology alone will change that. (Yes, if there are zero crashes the need for seat belts is reduced, but we're not going to get there any time soon. It seems more likely that seat belts will play a key part in reducing fatalities when coupled with hopefully less violent and less frequent crashes.)

So where does that leave us?

There is plenty of potential for automated cars to help with safety. But the low hanging fruit is more likely cutting fatalities perhaps in half if we can achieve parity with an average well behaved human driver. The 94% number so often quoted will take a lot more than that. Over time, hopefully, automated cars can continue to improve further. ADAS features such as automatic emergency braking can likely help too. And for now, even getting rid of half the traffic deaths is well worth doing, especially if we make sure to consider pedestrian safety.

Driving safely is a complex task for man or machine. It would be a shame if a hype roller coaster ride triggers disillusionment with technology that can in the end improve safety. Let's set reasonable expectations so that automated car technology is asked to provide near term benefits that it can actually deliver.

Update: Mitch Turk pointed out another study from 2016 that is interesting.
http://www.pnas.org/content/pnas/113/10/2636.full.pdf

This has data from monitoring drivers who experienced crashes. The ground rules for the experiment were a little different, but it has data explaining what was going on in the vehicle before a crash. A significant finding is that driver distraction is an issue. (Note that this data is several years later than the previous study, so that makes some sense.)

For our purposes an interesting finding is that 12.3% of crashes were NOT distracted/NOT Impaired/NOT Human Error:

Beyond that, it seems likely that some of the other categories contain scenarios that could be difficult for an AV, such as undistracted, unimpaired errors (16.5%).

Update 7/18/2018: Laura Fraade-Blanar from RAND send me a nice paper from 2014 on exactly this topic:
https://web.archive.org/web/20181121213923/https://www.casact.org/pubs/forum/14fforum/CAS%20AVTF_Restated_NMVCCS.pdf

The study looked at the NMVCSS data from 2008 and asked the question of what this data means for autonomy, accounting for issues that autonomy is going to have trouble addressing. They say that "49% of accidents contain at least one limiting factor that could disable the [autonomy] technology or reduce its effectiveness." They also point out that autonomy can create new risks not present in manually driven vehicles.

So this data suggests that eliminating perhaps half of autonomous vehicle crashes is a more realistic goal.

NOTES:

For an update on how many organizations are misusing this statistic, see:
https://usa.streetsblog.org/2020/10/14/the-94-solution-we-need-to-understand-the-causes-of-crashes/

The studies referenced, as with other similar studies, don’t attempt to identify mishaps that might have been caused by computer-based system malfunction. In studies like this, potential computer-based faults and non-reproducible faults in general are commonly attributed to driver error unless there is compelling evidence to the contrary. So the human error numbers must be taken with a grain of salt. However, this doesn’t change the general nature of the argument being made here.

The data supporting the conclusion is more than 10 years old. So it would be no surprise if ADAS technology has been changing things. In any event, looking into current safety expectations should arguably require separating the effects of ADAS systems such as automatic emergency braking (AEB) from functions such as lane keeping and speed control. ADAS can potentially improve safety even for well behaved human drivers.

Some readers will want to argue for a more aggressive safety target than 2x to 3x safer than average humans. I'm not saying that 2x or 3x is an acceptable safety target -- that's a whole different discussion. What I'm saying is that is a much more likely near term success target than the 94% number tossed around.

There will no doubt be different takes on how to interpret data from these and other reports. That’s a vitally important discussion to have. But the point of this essay is not to argue how safe is safe enough. Rather, the point is to have a discussion about realistic expectations based on data and engineering, not hype. So if you want to argue a different outcome than I propose, that's great. But please bring data instead of marketing claims to the discussion. Thanks.

Safe Autonomy