Friday, September 30, 2022

The NIEON Driver Benchmark and the Two-Sided Coin for AV safety

Waymo just published a study showing that they can outperform an unimpaired (NIEON) driver on a set of real-world crashes. That's promising, but it's only half the story. To be a safe driver, an Autonomous Vehicle (AV) must not only be good -- it must also not be bad. Those are two different things. Let me explain...

Man showing younger driver how to hold steering wheel properly
Learning to drive includes learning how to drive well.
It also includes learning to avoid making avoidable mistakes. They're not the same thing.

A Sept. 29, 2022 blog posting by Waymo explains their next step in showing that their automated driver can perform better than people at avoiding crashes. The approach is to recreate crashes that happened in the real world and show that their automated driver would have avoided them. 

For this newest work they compare their driver to not just any human, but a Non-Impaired, with Eyes always ON the conflict (NIEON) driver. They correctly point out that no human is likely to be this good (you gotta blink, right?), but that's OK. The point is to have an ideal upper bound on human performance, which is a good idea.

Setting a goal that, to be acceptably safe, an AV should be at least as good as a NIEON driver makes sense. It cuts out all the debate about which human driver is being used as a reference (e.g., a tired 16-year-old vs. a well-rested 50-year-old professional limo driver will have very different driving risk exposure).

Unsurprisingly, Waymo does better than a NIEON driver on scenarios they know they will be running, because computers can react more quickly than people when they correctly sense, detect, and model what is going on in the real world. Doing so is not trivial, to be sure. As Waymo describes, this involves not just good twitch reflexes, but also realizing when things are getting risky and slowing down to give other road users more space, etc.

That is the "be a good driver" part, and kudos to Waymo and other industry players who are making progress on this. This is the promise of AV safety in action.

But to be safe in the real world, that is not enough. You also have to not be a bad driver. Doing better at things humans mess up is half the picture. The other half is not making "stupid" mistakes that humans would likely avoid. 

AVs will surely have crashes that would be unlikely for a human driver to experience, or will sometimes fail to be cautious when they should, and so on. The infamous Waymo ride video showing construction zone cone confusion shows there is more work to be done in getting AVs to handle unusual situations. To its credit, in that scenario the Waymo vehicle did not experience a crash. But other uncrewed AVs are in fact having crashes due to misjudging traffic situations. And an automated test truck crashed into a safety barrier, despite having a human safety driver, due to an ill-considered initialization strategy that at least some would consider a rookie mistake (e.g., it repeats a type of mistake made in Grand Challenge events -- the developers should have known better).

It is good to see Waymo showing their automated driver can do well when it correctly interprets the situation it is in. That shows it is a potentially capable driver. We still need to see it is additionally not making mistakes in novel situations that aren't part of the human driver crash dataset. If Waymo can show NIEON level safety in both the knowns and the unknowns, that will be an impressive achievement.

Sunday, September 25, 2022

The Autonomous Vehicle Deployment Governance Problem

The #1 ethical issue in autonomous vehicles is not the infamous Trolley Problem. It is the question of who gets to decide when it is OK to deploy a vehicle without a safety driver on public roads.

Woman with umbrella crossing street in rain

Consider a thought experiment which, if you follow AV industry news, you might recognize as not entirely hypothetical. You, the reader, are in charge of a company that needs to do a public road demonstration with no driver in an AV. You know that safety is not where you would like it to be. In fact, you have no safety case at all. You might not even have any real safety engineers on staff. But you have a smart, super-capable team. You have done a lot of test driving and it is going pretty well.

You intuitively figure it is more likely than not that you can pull off a one-time demo without a crash, and even less likely that a crash will kill someone. You figure you have something like 5 chances out of 6 of pulling off the demo with nobody getting hurt, and it is, in your mind, near certainty due to low-speed urban driving that any crash would avoid a fatality. For good measure maybe you plan to station employees near the demo site to shoo away any pedestrians and light mobility users who would be at increased risk of harm, and do the demo very late at night when roads are usually empty of other road users. Regulators are not in a position to influence your decision.

Your investors have told you they will pull the plug on your entire company if you do not demo by December 31st. Right now, it is the first week of December, and it is time to decide what to do. If the investors pull the plug at the end of the month, you lose perhaps $1B in personal equity you hope to net in next year’s public offering. And all your employees will lose their equity as well as their jobs. This will also end a journey you have spent your life on to build and deploy a truly self-driving car.

Further negotiations with the investors are not possible. It is time to decide. That leaves you three main options:

Case 1: The AV company does not do the demo because it cannot assure a Positive Risk Balance (PRB) level of safety. The company runs out of money and folds. This option kills the company.

Case 2: The AV company does the demo and harms a road user. This might or might not result in termination of funding depending on the optics of the crash (perhaps a pedestrian victim can be blamed for jaywalking, being impaired, or having low societal status; maybe all three). You think minor harm is more likely than a fatality, and you will have lots of money available to pay off a potential victim to keep quiet. You will not pre-announce the demo, so you feel able to control the narrative if something goes wrong. The company and the mission go on unless there is a truly unlucky break during that one demo/test session that cannot be cleaned up. Even Uber ATG kept going for a while after a really bad crash, and you know in your heart that your team is better.

Case 3: The AV company does the demo and gets lucky, not harming any other road users. The company meets its milestone and gets more funding. This is the most likely case, and it would be a perfect victory.

Given this setup, doing the demo is clearly the best financial bet for the company. Probably you will get lucky with a positive outcome. No harm will be done, and the demo can be said to be safe under the culturally dominant no harm/no foul principle. But if the demo is skipped due to safety, the company is sure to die, and the decision maker is out a billion dollars.

Even if you get unlucky, the cost of a settlement of a few million dollars pales in comparison with the billions of dollars on the table. Really – you might think – a payout is just the cost of doing business. And even if the crash optics get out of control and the startup company folds, the investors have hedged their bets and the team can simply move to another company and try again. Pretty much everyone will do fine. Except for the victim, if there is one.
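To make the lopsided incentive concrete, here is a rough back-of-the-envelope sketch of the decision maker's expected personal payoff. The 5-in-6 odds and the $1B equity figure come from the thought experiment; the $5M settlement is purely a hypothetical stand-in for "a few million dollars," and the function and parameter names are invented for illustration.

```python
from fractions import Fraction

# Figures taken from the thought experiment.
P_NO_HARM = Fraction(5, 6)        # "5 chances out of 6" of a clean demo
EQUITY_AT_STAKE = 1_000_000_000   # "perhaps $1B in personal equity"

# Hypothetical stand-in for a settlement of "a few million dollars" (assumption).
ASSUMED_SETTLEMENT = 5_000_000


def expected_payoff_skip_demo() -> Fraction:
    """Case 1: no demo, the company folds, and the equity is lost."""
    return Fraction(0)


def expected_payoff_do_demo(p_no_harm: Fraction = P_NO_HARM,
                            settlement: int = ASSUMED_SETTLEMENT,
                            company_survives_harm: bool = True) -> Fraction:
    """Cases 2 and 3 combined, from the decision maker's point of view only."""
    p_harm = 1 - p_no_harm
    if company_survives_harm:
        # Harm occurs, a settlement is paid, and the company carries on.
        payoff_if_harm = EQUITY_AT_STAKE - settlement
    else:
        # Pessimistic variant: harm ends the company and the settlement is still paid.
        payoff_if_harm = -settlement
    return p_no_harm * EQUITY_AT_STAKE + p_harm * payoff_if_harm


if __name__ == "__main__":
    print(f"skip demo:                ${float(expected_payoff_skip_demo()):,.0f}")
    print(f"do demo (survives harm):  ${float(expected_payoff_do_demo()):,.0f}")
    print(f"do demo (harm is fatal to company): "
          f"${float(expected_payoff_do_demo(company_survives_harm=False)):,.0f}")
```

Under these assumed numbers, doing the demo comes out hundreds of millions of dollars ahead even in the pessimistic variant. Note what is missing from the calculation: the harm to the victim does not appear anywhere except as a settlement cost.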

This is how demo milestones incentivize deploying systems when the calendar and funding flow say it is time for a demo rather than when the demo is known to be acceptably safe. After all, it is someone else who is injured or dies, not the decision maker. And a billion dollars is a ton of money. And probably it will be fine. After all, you think, only other companies kill pedestrians.

Saturday, September 24, 2022

FMVSS Exemption Considerations for Fully Autonomous Vehicles

Summary: The human role in FMVSS should not simply be removed in an exemption request on the grounds that there is no driver. Rather, what should be stated is how the driver is being replaced in that safety role as a matter of system design.

These comments were filed regarding the General Motors Petition for Temporary Exemption from FMVSS -- Federal Motor Vehicle Safety Standards (https://www.regulations.gov/document/NHTSA-2022-0067-0002). However, they are likely to apply to any autonomous vehicle that does not have conventional driver controls or operates in a mode which does not have an officially designated driver.

Cruise Origin II vehicle

It is good to see companies working to advance the potential benefits of autonomous vehicle technology. However, it is also important for NHTSA to ensure public safety when such technology is deployed on public roads.

There will almost certainly need to be an auxiliary controller to command motion of vehicles without normally accessible driver controls. To the extent that these are in-vehicle wired controls the question arises as to their suitability for safe vehicle control (e.g., using a "Playstation" type video game controller) and whether they can be improperly activated during autonomous operation. To the extent that these are wirelessly connected controls there are serious security issues that must be considered. In the absence of normal driver controls any petitioner should be required to justify the safety and security of its maintenance and supplemental human-operated vehicle control strategy beyond more general claims of working on cybersecurity.

FMVSS 101: Petitioner should demonstrate that all telltales that are intended to prompt human driver reaction similarly prompt a relevant ADS reaction to an indicated exceptional condition. If no such demonstration is provided, NHTSA would simply be taking Petitioner’s word that the telltale functions are implemented in a way that provides equivalent safety. (This also applies to other telltales such as FMVSS No. 126, 138, etc.)

FMVSS 102: While the ADS is said to control the transmission, there will be times in which humans need to know the state of the transmission for safety, especially whether the vehicle is in park or not. This includes emergency responders and maintenance personnel to ensure that the vehicle will not move unexpectedly when it is operating in a degraded or post-crash state. NHTSA should evaluate the suitability of a passenger video display for reliably displaying vehicle park state to non-passenger stakeholders.

FMVSS 104: From the NHTSA summary: "For GM's petition for exemption from portions of FMVSS No. 104, GM argues that the purpose and intent of the safety standard is obviated by the Origin's sensor system design." GM's argument is ridiculous from a technical point of view. (Their technical disclosure is more substantive, but this type of lawyer argument in their summary serves to undermine their filing's credibility.) In the context of an ADS, the intent of the safety standard is not to permit a human driver's eyes to see out the windshield. Rather, it is to ensure that all visual sensors can see through the windshield (if interior mounted), or through their lenses and weather covers as appropriate. It is completely foreseeable that camera and lidar operation will be impaired by adverse operational conditions. Moreover, the problem is likely to be worse than for human drivers because even a relatively small bug splat on a camera lens will form a visual obstruction covering a much larger fraction of the field of view compared to the same bug impacting the windshield of a human-driven vehicle, because human drivers can and do simply move their heads a bit to see past windshield obstructions. GM states that it keeps its sensors clean, but provides no objective way for NHTSA to validate the claim.

FMVSS 111: From the NHTSA summary: "GM points out that the purpose and intent of FMVSS No. 111 is based on human perception and visibility so there is no operational safety need for these requirements when applied to a vehicle driven by an ADS." This argument amounts to the common fallacy of: "humans have limitations and therefore computers will be perfect." (Anyone paying attention knows that computers fail, often spectacularly – just in different ways than humans.) There are safety requirements behind FMVSS 111 that are not about "human perception and visibility" but rather, for example, not running over children that are not visible to the driver sensors. Again, GM's lawyer-phrased argument undermines the credibility of their filing. As one example, in addressing this request for exemption the petitioner should explain why small children will be protected from being run over at least as well as they would be by a human driver using a rear view camera. Again, the technical substance is better, but leaves a big open question. GM provides pictures indicating that cameras can potentially see the areas required. However, there is no data supporting that those cameras will correctly sense and classify children close to the vehicle with high accuracy, especially at night when the vehicle automation might rely more heavily on roof-top mounted lidar sensors. The safety requirement is not camera visibility, but rather that the driver+camera will avoid hitting children. Given apparent issues with camera-based AEB systems having trouble sensing and avoiding hitting children, this is an issue that needs to be taken seriously.

FMVSS 201: From the NHTSA summary: "GM argues that sun visors are not necessary because the Origin is not operated by a human driver, and the ADS does not use the windshield for visibility." Again, limitations of human drivers are not the real point for ADS safety. It is well known that cameras struggle when looking into the sun. GM should explain how they handle that problem (sun visors are a way human drivers can handle it) and how NHTSA can validate that sun glare is handled safely. GM states it is not a problem for this system, hinting that perhaps they rely on lidar+radar instead of a camera in such situations, but that raises additional concerns. GM should provide a way for NHTSA to validate that sun glare does not present undue risk.

Validation: For all requested exemptions, it is no more acceptable to simply say more or less "trust us, we have figured this out" for an ADS-equipped vehicle than it is for a conventional vehicle. If "trust us" were enough, there would be no need for the carefully designed tests in FMVSS for any vehicle. Rather, petitioners should explain (a) why the broader safety objective of the FMVSS element being waived is still being met (e.g., not hitting children vs. field of view), and (b) a way that NHTSA can validate for itself that the safety objective is in fact being met. See: Koopman, “How to keep self-driving cars safe when no one is watching for dashboard warning lights,” The Hill, June 30, 2018. https://thehill.com/opinion/technology/394945-how-to-keep-self-driving-cars-safe-when-no-one-is-watching-for-dashboard/

 

Regarding Public Interest Considerations:

Safety Benefit: It is essential to keep in mind when evaluating ADS petitions that potential safety benefits are purely aspirational. There is not yet any proof that ADS-equipped vehicles will be even as safe as a comparable active-safety-equipped human-driven vehicle using any currently available or near term deployable technology. While supporting the development of technology that has the potential to reduce harm is laudable, this should not be done at the cost of increased near-term harm. Putting road users at increased risk today because someday, maybe, perhaps, autonomous vehicles might be safer is unacceptable. NHTSA should be in the safety business, not the hopium business. That means exemptions should require a concrete justification of deployed safety that is not weakened by promises of ADS perhaps increasing safety in the future.

Environmental Impact: Requiring environmental impact data reporting is an excellent idea. It is important to be mindful of indirect environmental and safety consequences of ADS-equipped vehicle adoption. One example is that (presumably) decreased cost per mile is prone to increasing demand, resulting in a potential increase in overall emissions if small ADS-equipped vehicles are used for more trips.  Both human harm and environmental damage could be increased if ADS-equipped vehicles draw passenger loads away from mass transit even if the total number of user-miles across all modes remains the same. Moreover, use of uncrewed ADS-equipped vehicles can increase both total pedestrian harm and emissions as well as road congestion if that leads to a large increase in on-demand delivery services.

Equity: As NHTSA notes in their summary, petitioners for exemptions extol the potential to expand transportation options to under-served areas and customers with disabilities. It is important to point out that these populations can be served with traditional human-operated vehicles as well, so this is not a fundamentally new capability, but rather an argument that lower costs will increase vehicle availability (perhaps at the cost of increased road congestion). Note that vehicle sharing does not require autonomy, nor does use of electric vehicles. To the extent that the exemptions are for "robo-taxi" and shuttle type applications, petitioners should not only say that their technology might possibly help disadvantaged populations, but also how they plan to serve such populations directly as a result of the exemption. Promising to serve such populations in an exemption petition but then fielding a single token wheelchair-accessible vehicle or only deploying in a rich urban center would result in such statements of providing equitable transportation being only aspirational rather than directly related to the exemption decision. Credit should not be given for empty promises in evaluating an application.

Safety: While considering economic impacts might be relevant as NHTSA notes in their summary, it is essential to remind all stakeholders that the "S" in NHTSA stands for "Safety." It is imperative that economic motivations not over-ride the NHTSA mission to ensure safety. Safety must come first, and not be overwhelmed by economic pressure.

Standards: Petitioners who want exemptions from FMVSS provisions should have an alternate method of ensuring that a rigorous engineering approach has been used to ensure safety. Such an approach was proposed by NHTSA itself in an ANPRM (Docket No. NHTSA-2020-0106 / RIN 2127-AM15 Framework for Automated Driving System Safety https://www.regulations.gov/docket/NHTSA-2020-0106 ). NHTSA should respond to comments from that ANPRM and work further on setting up such a framework that could provide an alternate basis for ADS-equipped vehicle regulation.

SMS: NHTSA should require all petitioners to implement an effective Safety Management System (SMS) as a condition of receiving an exemption, with SMS data included in periodic reports for the life of any vehicle.

Crash Reporting: NHTSA should continue to require crash reporting as proposed. The definition of a crash should be scoped to include: (a) any contact between the vehicle and another road user regardless of severity; (b) any contact between the vehicle and other obstacles or infrastructure regardless of severity; (c) any "near hit" that required evasive action from another road user (e.g., pedestrian jumping back on sidewalk to avoid being hit while in a crosswalk); (d) any substantive moving violation (e.g., running a red light). Any responsible tester/operator of an ADS will have data already compiled on all these events as part of their SMS to ensure that they are operating safely on public roads, so the reporting burden should be minimal. These reports should be required for the life of any exempt vehicle.
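As a rough illustration of how little machinery the proposed scoping needs, the four categories (a) through (d) could be encoded along the following lines. This is only a sketch: the class, field, and function names are hypothetical and do not come from any NHTSA reporting format or operator SMS.

```python
from dataclasses import dataclass
from enum import Enum


class ReportableEvent(Enum):
    ROAD_USER_CONTACT = "a"   # any contact with another road user
    OBSTACLE_CONTACT = "b"    # any contact with obstacles or infrastructure
    NEAR_HIT = "c"            # another road user had to take evasive action
    MOVING_VIOLATION = "d"    # substantive moving violation (e.g., red light)


@dataclass(frozen=True)
class Observation:
    """Hypothetical per-incident record an operator might already keep."""
    contacted_road_user: bool = False
    contacted_obstacle: bool = False
    other_road_user_evaded: bool = False
    substantive_moving_violation: bool = False
    severity: int = 0  # recorded, but deliberately NOT used as a reporting filter


def reportable_categories(obs: Observation) -> frozenset:
    """Return every category (a)-(d) that the observation falls under."""
    categories = set()
    if obs.contacted_road_user:
        categories.add(ReportableEvent.ROAD_USER_CONTACT)
    if obs.contacted_obstacle:
        categories.add(ReportableEvent.OBSTACLE_CONTACT)
    if obs.other_road_user_evaded:
        categories.add(ReportableEvent.NEAR_HIT)
    if obs.substantive_moving_violation:
        categories.add(ReportableEvent.MOVING_VIOLATION)
    return frozenset(categories)


def is_reportable(obs: Observation) -> bool:
    """An incident is reportable if it matches at least one category."""
    return bool(reportable_categories(obs))
```

The design choice worth noticing is that severity never gates reporting: a zero-severity contact and a near hit with no contact at all are both reportable under this scoping.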

Commenter Qualifications: Prof. Philip Koopman is an internationally recognized expert on Autonomous Vehicle (AV) safety whose work in that area spans over 25 years. He is also actively involved with AV policy and standards as well as more general embedded system design and software quality. His pioneering research work includes software robustness testing and run time monitoring of autonomous systems to identify how they break and how to fix them. He has extensive experience in software safety and software quality across numerous transportation, industrial, and defense application domains including conventional automotive software and hardware systems. He was the principal technical contributor to the UL 4600 standard for autonomous system safety issued in 2020. He is a faculty member of the Carnegie Mellon University ECE department where he teaches software skills for mission-critical systems. In 2018 he was awarded the highly selective IEEE-SSIT Carl Barus Award for outstanding service in the public interest for his work in promoting automotive computer-based system safety. In 2022 he was named to the National Safety Council's Mobility Safety Advisory Group. He is the author of the book How Safe is Safe Enough: measuring and predicting autonomous vehicle safety (2022). https://users.ece.cmu.edu/~koopman/

Monday, September 19, 2022

Continuous Learning Approach to Safety Engineering

Rolf Johansson & Philip Koopman / CARS @EDCC 2022

Abstract:

A phase change moment is upon us as the automotive industry moves from conventional to highly automated vehicle operation, with questions about how to assure safety. Those struggles underscore larger issues with current functional safety standards in terms of a need to strengthen the traceability between required practices and safety outcomes. There are significant open questions regarding both the efficiency and effectiveness of standards-based safety approaches, including whether some engineering practices might be dropped, or whether others must be added to achieve acceptable safety outcomes. We believe that rather than an incremental approach, it is time to rethink how safety standards work. We propose that real-world field feedback for an initially safe deployment should support a DevOps-style continuous learning approach to lifecycle safety. Safety engineering should trace from a safety case to engineering practices to safety outcomes. Such an approach should be incorporated into future safety standards (including ISO 26262) to improve safety engineering efficiency and effectiveness.

Full paper here: link

Think About Things Differently