Tuesday, December 4, 2018

Three Safety Bins: road testing doesn't get you all the way to "safe"

There are three bins for self-driving car safety: 
(1) Obviously dangerous
(2) NOT obviously dangerous, and
(3) Safe.

Road testing and driving scenario coverage can help you get from bin 1 to bin 2.

Getting to Safe (bin 3) requires a whole lot more. That's because it requires handling rare, unexpected, and novel events in both the equipment and the environment.



Monday, November 26, 2018

FiveAI Report on Autonomous Vehicle Safety Certification

FiveAI has published an autonomous vehicle safety approach that includes independent verification, transparency, and data sharing. (I provided inputs to the @_FiveAI  authors.)

Here is a pointer to the summary on Medium:
https://medium.com/@_FiveAI/we-need-an-industry-wide-safety-certification-framework-for-autonomous-vehicles-fiveai-publishes-1139dacd5a8c

It's worth jumping through the registration hoop to read the full version.
https://five.ai/certificationpaper


Monday, November 19, 2018

Webinar on Robustness Testing of Perception

Zachary Pezzementi and Trenton Tabor have done some great work on perception systems in general, and on how image degradation affects perception performance. I'd previously posted information about their paper, but now there is a webinar available here:
    Webinar home page with details & links:  http://ieeeagra.com/events/webinar-november-4-2018/

This includes pointers to the slides, the recorded webinar, and the paper.

My robustness testing team at NREC worked with them on the perception stress testing parts of this work.


Wednesday, November 7, 2018

Potential Autonomous Vehicle Safety Improvement: Less Hype, More Data (OESA 2018)


I enjoyed being on a panel at the Annual OESA Suppliers Conference today. My intro talk covered setting reasonable expectations about how much safety benefit autonomous vehicles can provide in the near-term to mid-term. Spoiler: when you hear that 94% of all road fatalities are caused by bad and impaired drivers, that number doesn't mean what you think it means!


Monday, November 5, 2018

Uber ATG Safety Report

Summary: Uber's reports indicate that they are taking improving their safety culture seriously. Their new approach to public road testing seems reasonable in light of current practices. Whether they can achieve an appropriate level of system safety and software quality for the final production vehicles remains an open question -- just as it does for the self-driving car industry in general.

Uber ATG has released a set of materials regarding their in-development self-driving car technology and testing, including a NHTSA-style safety report, as well as reports of a safety review in light of the tragic death in Tempe AZ earlier this year. (See: https://www.uber.com/info/atg/safety/)

Generally I have refrained from critical analysis of other company safety reports because this is still something everyone is sorting out. Anyone putting out a safety report is automatically in the top 10% for transparency (because about 90% of the companies haven't even released one yet). So by this metric Uber looks good.  In fact, their report has a lot more detail than we've been seeing in general, so kudos to them for improving transparency. The other companies who haven't even published a report at all should get with the program.

But Uber has also had a fatality as a result of their on-road test program. If any company should be put under increased scrutiny for safety, it should be them. I fully acknowledge that many of the critique points apply to other companies as well, so this is not about whether they are ahead or behind, but rather how they stand on their own merits. (And, if anyone at Uber thinks I got something wrong, please let me know.)

Overall

It seems that Uber's development and deployment plan is generally what we're seeing from other companies.  They plan to operate on public roads to build a library of surprises and teach their system how to handle each one they encounter. They plan to have safety drivers (Mission Specialists) intervene when the vehicle encounters something it can't handle.  As a result of the fatal mishap they plan to improve safety culture, improve safety drivers, and do more pre-testing simulation. There is every reason to believe that at least some other companies were already doing those things, so this generally puts Uber on a par with where all companies doing road testing should be.

Clearly the tragic death in Tempe got Uber's attention, as it should have. Let's hope that other companies pay attention to the lessons learned before there is another fatality.

Doing the math, there should be no fatalities in any reasonable pre-deployment road test program. That's because there simply won't be enough miles accumulated in road testing with a small fleet to reach a level at which an average human driver would be likely to have experienced a fatal accident. (It is not zero risk, just as everyday driving is not risk free. But a fatality should be unlikely.)
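To make that math concrete, here is a rough back-of-the-envelope sketch. The fatality rate used below is an approximate U.S. average (on the order of one fatality per 100 million vehicle miles), and the fleet mileage is a hypothetical test-program figure of my choosing; neither number comes from any company's report.

```python
import math

# Approximate U.S. average: ~1 fatality per 100 million vehicle miles.
FATALITIES_PER_MILE = 1.0 / 100_000_000

def prob_at_least_one_fatality(test_miles: float) -> float:
    """Poisson model: P(>=1 fatality) if the fleet matched average human risk."""
    expected_fatalities = test_miles * FATALITIES_PER_MILE
    return 1.0 - math.exp(-expected_fatalities)

# Hypothetical fleet logging 3 million test miles -- more than most
# pre-deployment road test programs accumulate:
p = prob_at_least_one_fatality(3_000_000)
print(f"{p:.1%}")  # ~3% -- a fatality should be unlikely, but risk is not zero
```

Even at human-average risk levels, a small test fleet simply does not drive enough miles for a fatality to be statistically expected, which is why a fatality during road testing warrants scrutiny.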

The Good

  • This is perhaps the most thorough set of safety reports yet. We've been seeing a trend that more recent reports often include areas not touched on by earlier reports. I hope this results in a competitive dynamic in which each company wants to raise the bar for safety transparency. We'll see how this turns out. Uber is certainly doing their part.
  • The materials place significant emphasis on improving safety culture, including excellent recommendations of good practices from the external report. Safety culture is essential. I'm glad to see this.
  • There are detailed discussions about Mission Specialist roles, responsibilities, and training.  This is important. Supervising autonomy is a difficult, demanding role, and gets more difficult as the autonomy gets better. Again, I'm glad to see this.
  • There is quite a bit about hardware quality for both computer hardware and vehicle hardware. It is hard to tell how far down the ISO 26262 hardware safety path they are for life critical functions such as disengaging autonomy for Mission Specialist takeover.  They mention some external technical safety reviews, but none recently. This is a good start, but more work is required here. They say they plan more external reviews, which is good.
  • They state concrete goals for each of their five safety principles. This is also good.

Jury Still Out on Fully Autonomous Operation System Safety:

  • The section on system safety is for the most part aspirational. Even the section that is not forward looking is mostly about plans, not current capabilities. This is consistent with currently using Mission Specialists to ensure testing safety.  In other words, assuming the Mission Specialist can avoid mishaps, and the vehicle always responds to human driver takeover, this isn't a problem yet. So we'll have to wait to see how this turns out.
  • The external review document concentrated on safety culture and road testing supervision. That would be consistent with a conclusion that the fatality root causes were poor safety culture and ineffective road testing supervision. (Certainly it would be no surprise if this hypothetical analysis were true, but we'll see what the final NTSB report says to know for sure.)
  • In general, we have no idea how they plan to prove that their vehicles are safe to deploy other than by driving around until they feel they have sufficiently infrequent road failures. Simulation will improve quality before they do road testing, but they are counting on road testing to find the problems. To be clear, road testing and that type of simulation can help, but I don't believe they are enough. (This is the same as for many other developers, so this is not a particular criticism of Uber.)
  • Uber says they are working on a safety case, perhaps using a GSN-based approach. This is an excellent idea. But I don't see anything that looks like a formal safety case in these documents. Hopefully we'll get to see something like that down the road.

Software Quality and Software Safety:

  • The software development process shown on page 47 of the report emphasizes fixing bugs found in testing. I don't know of any application domain in which that alone will actually get you acceptably safe life-critical software. Again, for now their primary safety strategy for testing is Mission Specialists, so this is an issue for the future. Maybe we'll find out if they are doing more in a later edition of this report. 
  • The information on software quality and software development process description is a bit skimpy in general.  It is difficult to tell if that is a reflection of their process or they just didn't want to talk about it.  For example, there is a box in their process diagrams on page 47 that says "peer review" with no description as to which of the many well-known review techniques they are using, whether they review everything, etc. There are no boxes in their software process for requirements, architecture, and design. There isn't an SQA function described (Software Quality Assurance, which deals with development process quality monitoring).  For Agile fans, there are plenty of boxes missing from whichever methodology you like. The point is that this is an incomplete software process model compared to what I'd expect to see for life-critical software. The question is whether the pieces are there and not drawn. Again, there is no other industry in which the approach shown would be sufficient or acceptable for creating life-critical software. It is possible there is more to their process than they are revealing, or they have some plan to address this before they remove their Mission Specialists from the vehicles. 
  • Perception (having the system recognize objects, obstacles, and so on) is notoriously difficult to get right, and probably the hardest problem out of many difficult problems to make self driving cars safe. They talk about how they use perception, but not how they plan to validate it, other than via road testing and possibly some aspects of simulating scenarios observed during road testing. But then again, other developers don't say much about this either.
  • It's easy to believe that at least some other organizations are following similar software approaches and will face the same challenges.  Again, because they currently have safety drivers these are forward-looking issues that are not the primary concerns for Uber road testing safety in the near term.
  • It is worth noting that they plan to have a secondary fail-over computer that they say will be developed at least taking into account software quality and safety standards such as ISO 26262 and MISRA C. (Safety Report Page 35.) But they don't seem to say if this is what they are doing for their basic control software that controls normal operation. Again, perhaps there is more to this they haven't revealed.

Is It Enough?

Overall the reports seem to put them on a par with other developers in terms of road testing safety. Whether they operate safely on public roads will largely depend upon maintaining their safety culture and Mission Specialist proficiency. I'd suggest an independent monitor within the organization to make sure that happens.


What I'd Like to See

There are a number of things I'd like to see from Uber to help in regaining public trust. (These same recommendations go for all the other companies doing public road testing.)
  • Uber should issue periodic report cards in which they tell us about adopting the recommendations in their various reports and their safety plans in general.  Are they staying on track? Did the safety culture initiative really work? Are they following their Mission Specialist procedures?
  • I'd like to see metrics that track the effectiveness of Mission Specialists. Nobody is perfect, but I'd be happier having data about how often they get distracted, to see whether the break schedule and monitoring are working as intended. This should be something all companies do, since in the end they are putting the public at risk. Mission Specialists have been assigned a difficult job, and their effectiveness is the companies' stated way to mitigate that risk -- but we have no insight into whether that approach is really working until a crash is in the news.
  • They have promised safety metrics that are better than disengagements and miles driven. That's a great idea. We'll have to see how that turns out. (They sponsored a RAND report on this topic that was recently released. That will have to be the topic of another post.)
  • We should track whether they establish their external safety advisory board -- and whether it has appropriate autonomy and software safety technical expertise as well as areas such as safety culture and human/machine interface.
  • They should also have an independent internal monitor making sure their safety-relevant operational and design processes are being followed. This seems in line with their plans.
  • They need a much stronger story about how they plan to ensure software safety and system safety when they remove their Mission Specialists from the vehicle downstream. Hopefully they'll make public a high level version of the safety case and have it externally evaluated.
  • I hope that they work with PennDOT to comply with PA AV testing safety guidelines before resuming operation in Pittsburgh, where I live. From the materials I've seen that should be straightforward, but they should still do it. As of right now, they'd only be the second company to do so.
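As a thought experiment on the Mission Specialist metrics point above, here is a minimal sketch of the kind of effectiveness tracking I have in mind. Everything here is a hypothetical illustration of mine: the field names, the numbers, and the alert threshold are assumptions, not anything Uber (or any company) has published.

```python
from dataclasses import dataclass

@dataclass
class ShiftLog:
    """Hypothetical per-shift log for one Mission Specialist."""
    hours_supervising: float   # time actively supervising autonomy
    distraction_events: int    # e.g., gaze-off-road events flagged by monitoring

def distraction_rate_per_hour(logs: list[ShiftLog]) -> float:
    """Fleet-wide distraction events per supervised hour."""
    total_hours = sum(log.hours_supervising for log in logs)
    total_events = sum(log.distraction_events for log in logs)
    return total_events / total_hours if total_hours else 0.0

# Example weekly fleet report, checked against a made-up alert threshold.
week = [ShiftLog(6.0, 2), ShiftLog(5.5, 0), ShiftLog(4.5, 1)]
rate = distraction_rate_per_hour(week)
ALERT_THRESHOLD = 0.5  # events/hour; purely illustrative
print(f"{rate:.2f} events/hour", "ALERT" if rate > ALERT_THRESHOLD else "ok")
```

The point is not this particular metric, but that a simple, regularly reported number like this would give the public some visibility into whether the safety driver mitigation is actually working, rather than waiting for a crash to find out.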
Dr. Philip Koopman is a faculty member at Carnegie Mellon University. He is an internationally recognized expert in the area of self-driving car safety. He is also Co-Founder of Edge Case Research, which provides products and services relating to autonomy safety. 
koopman@cmu.edu

Tuesday, September 18, 2018

Automotive Safety Practices vs. Accepted Principles (SAFECOMP paper)

I'm presenting this paper at SAFECOMP today.

2018 SAFECOMP Paper Preprint

Abstract. This paper documents the state of automotive computer-based system safety practices based on experiences with unintended acceleration litigation spanning multiple vehicle makers. There is a wide gulf between some observed automotive practices and established principles for safety critical system engineering. While some companies strive to do better, at least some car makers in the 2002-2010 era took a test-centric approach to safety that discounted nonreproducible and “unrealistic” faults, instead blaming driver error for mishaps. Regulators still follow policies from the pre-software safety assurance era. Eight general areas of contrast between accepted safety principles and observed automotive safety practices are identified. While the advent of ISO 26262 promises some progress, deployment of highly autonomous vehicles in a nonregulatory environment threatens to undermine safety engineering rigor.

See the full paper here:
https://users.ece.cmu.edu/~koopman/pubs/koopman18_safecomp.pdf

Note that there is some pretty interesting stuff to be seen by following the links in the paper reference section.
Also see the expanded list of (potentially) deadly automotive defects.

Here are the accompanying slides:  https://users.ece.cmu.edu/~koopman/pubs/koopman18_safecomp_slides.pdf


Wednesday, September 12, 2018

Victoria Australia Is Winning the Race to ADS Testing Safety Regulations

Victoria, Australia has just issued new guidelines regarding Automated Driving System (ADS) testing. These should be required reading for anyone doing on-road testing elsewhere in the world. There is just too much good stuff here to miss. And, the guidelines are accompanied by actual laws that are designed to make autonomy testing safe.

A look through the regulations and guidelines shows that there is a lot to like. The most intriguing points I noticed were:
  • It provides essentially unlimited technical flexibility to the companies building the ADS vehicles while still providing a way to ensure safety. The approach is a simple two-parter:
    1. The testing permit holders have to explain why they will be safe via a safety management plan.
    2. If the vehicle testing doesn't follow the safety management plan or acts unsafely on the roads, the testing permit can be revoked.
  • The permit holder rather than the vehicle supervisor (a.k.a. "safety driver" in the US) is liable when operating in autonomous mode.  In other words, if the safety driver fails to avoid a mishap, liability rests with the company running the tests, not the safety driver. That sounds like an excellent way to avoid a hypothetical strategy of companies using safety drivers as scapegoats (or expendable liability shields) during testing.
  • The permitting process requires a description of ODD/OEDR factors including not just geofencing, but also weather, lighting, infrastructure requirements, and types of other road users that could be encountered.
  • The regulators have broad, sweeping powers to inspect, assess, require tests, and in general do the right thing to ensure that on-road testing is safe. For example, a permit can be denied or revoked if the safety plan is inadequate or not being followed.
There are many other interesting and on-target discussions in the guidelines. They include the need to reduce risk as low as reasonably practicable (ALARP); accounting during testing for the Australian road safety approach of safe speeds, safe roads, safe vehicles, and safe people; transition issues between the ADS and the supervisor; the need to drive in a predictable way to interact safely with human drivers; and a multi-page list of issues to be considered by the safety plan. There is also a list of other laws that come into play.

Here are some pointers for those who want to look further.
There are some legal back stories at work here as well. For example, it seems that under previous law a passenger in an ADS could have been found responsible for errors made by the ADS, and this has been rectified with the new laws.

The regulations were created according to the following criteria from a 2009 Transportation bill:
  • Transportation system objectives:
    • Social and economic inclusion
    • Economic prosperity
    • Environmental sustainability
    • Integration of transport and land use
    • Efficiency, coordination and reliability
    • Safety and health and well being
  •  Decision making principles:
    • Principle of integrated decision making
    • Principle of triple bottom line assessment
    • Principle of equity
    • Principle of the transport system user perspective
    • Precautionary principle
    • Principle of stakeholder engagement and community participation
    • Principle of transparency. 
(The principle of transparency is my personal favorite.)

Here is a list of key features of the Road Safety (Automated Vehicles) Regulations 2018:

  1. The purpose of an ADS permits scheme (see regulation 5):
    • For trials of automated driving systems in automated mode on public roads
    • To enable a road authority to monitor and manage the use and impacts of the automated driving system on a highway
    • To enable VicRoads to perform its functions under the Act and the Transport Integration Act
  2. The permit scheme requires the applicant to prepare and maintain a safety management plan that (see regulation 9 (2)):
    • Identifies the safety risks of the ADS trials
    • Identifies the risks to the reliability, security and operation of the automated driving system to be used in the ADS trial
    • Specifies what the applicant will do to eliminate or reduce those risks so far as is reasonably practicable; and
    • Addresses the safety criteria set out in the ADS guidelines
  3. The regulations will require the ADS permit holder to submit a serious incident report within 24 hours (see regulations 13 and 19). A serious incident means any:
    • accident
    • speeding, traffic light, give way and level crossing offence
    • theft or carjacking
    • tampering with, unauthorised access to, modification of, or impairment of an automated driving system
    • failure of an automated driving system of an automated vehicle that would impair the reliability, security or operation of that automated driving system.
I hope that US states (and the US DOT) have a look at these materials.  Right now I'd say VicRoads is ahead of the US in the race to comprehensive but reasonable autonomous vehicle safety regulations.

(I would not at all be surprised if there are issues with these regulations that emerge over time. My primary point is that it looks to me like responsible regulation can be done in a way that does not pick technology winners and does not unnecessarily hinder innovation. This looks to be excellent source material for other regions to apply in a way suitable to their circumstances.)
