Tuesday, August 14, 2018

ADAS Code of Practice

One of the speakers at AVS last month mentioned that there is a Code of Practice for ADAS design (basically, level 1 and level 2 autonomy), and that there is a proposal to update it over the next few years for higher autonomy levels.

A written set of uniform practices is generally worth looking into, so I took a look here:

The main report sets forth a development process with a significant emphasis on controllability. That makes sense, because for ADAS typically the safety argument ultimately ends up being that the driver will be responsible for safety, and that requires an ability for the driver to assert ultimate control over a potentially malfunctioning system.

The part that I actually found more interesting in many respects was the set of Annexes, which include quite a number of checklists for controllability evaluation, safety analysis, and assessment methods as well as Human-Machine Interface concept selection.

I'd expect that this is a useful starting point for those working on higher levels of autonomy, and most critically anyone trying to take on the very difficult human/machine issues involved with level 2 and level 3 systems.  (Whether it is sufficient on its own is not something I can say at this point, but starting with something like this is usually better than a cold start.)

If you have any thoughts about this document please let me know via a comment.

Monday, August 6, 2018

The Case for Lower Speed Autonomous Vehicle On-Road Testing

Every once in a while I hear about a self-driving car test or deployment program that plans to operate at lower speeds (for example, under 25 mph) to lower risk. Intuitively that sounds good, but I thought it would be interesting to dig deeper and see what turns up.

There have been a few research projects over the years looking into the probability of a fatality when a conventionally driven car impacts a pedestrian. As you might expect, faster impact speeds increase fatalities. But it's not linear -- it's an S-shaped curve. And that matters a lot:

(Source: WHO http://bit.ly/2uzRfSI )

Looking at this data (and other similar data), impacts at less than 20 miles an hour have a flat curve near zero, and are comparatively survivable. Above 30 mph or so is a significantly bigger problem on a per-incident basis.  Hmm, maybe the city planners who set 25 mph speed limits have a valid point!  (And surely they must have known this already.) In conventional vehicles the flat curve at and below 20 has led to campaigns to nudge urban speed limits lower, with slogans such as "20 is plenty."

For on-road autonomous vehicle testing there's a message here. Low speed testing and deployment carries dramatically less risk of a fatality. The risk of a fatality goes up dramatically as speed increases beyond that.
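
To make the "S-shaped, not linear" point concrete, here is a minimal sketch of a logistic curve. The midpoint and steepness values are arbitrary placeholders I picked only to show the shape. They are NOT fitted to the WHO data or any other data set, so don't read real fatality rates off this.

```python
import math

def fatality_probability(impact_mph, midpoint_mph=35.0, steepness=0.2):
    """Illustrative logistic (S-shaped) curve.

    midpoint_mph and steepness are hypothetical placeholders chosen to show
    the shape only; they are not fitted to any real crash data.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (impact_mph - midpoint_mph)))

if __name__ == "__main__":
    # The curve is nearly flat at low speed, then rises steeply.
    for mph in (10, 20, 25, 30, 40, 50):
        print(f"{mph} mph -> {fatality_probability(mph):.2f}")
```

The thing to notice is the shape: with a curve like this, the same 10 mph increase costs far more in the steep middle of the curve than it does down in the flat part.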

For systems with a more complex "above 25 mph" strategy there still ought to be plenty that is either reused from the slow system or able to be validated at low speeds.  Yes, slow is different than fast due to the physics of kinetic energy.  But a strategy that validates as much as possible below 25 mph and then reuses significant amounts of that validation evidence as a foundation for higher speed validation could present less risk to the public.  For example, if you can't tell the difference between a person riding a bike and a person walking next to a bike at 25 mph, you're going to have worse problems at 45 mph.  (You might say "but that's not how we do it."  My point is maybe the AV industry should be optimizing for validation, and this is the way it should get done.)

It's clear that many companies are on a "race" to autonomy. But sometimes slow and steady can win the race. Slow speed runs might be less flashy, but until the technology matures slower speeds could dramatically reduce the risk of pedestrian fatalities due to a test platform or deployed system malfunction. Maybe that's a good idea, and we ought to encourage companies who take that path now and in the future as the technology continues to mature.

The "above 25 mph" paragraph was added in response to social media comments 8/9/2018.  And despite that I still got comments saying that systems below 25 mph are completely different than higher speed systems.  So in case that point isn't clear enough, here is more on that topic:

I'm not assuming that slow and fast systems are designed the same. Nor am I advocating for limiting AV designs only to slow speeds (unless that fits the ODD).

I'm saying when you build a high-speed capable AV, it's a good idea to initially test at below 25 mph to reduce the risk to the public for when something goes wrong.  And something WILL go wrong.  There is a reason there are safety drivers.

If a system is designed to work properly at speeds of 0 mph to 55 mph (say), you'd think it would work properly at 25 mph.  And you could design it so that at 25 it's using most or all of the machinery that is being used at 55 mph (SW, HW, sensors, algorithms, etc.)  Yes, you can get away with something simpler at low speed.  But this is low speed testing, not deployment.  Why go tearing around town at high speed with a system that hasn't even been proven at lower speeds?  Then bump up speed once you've built confidence.

If you design to validate as much as possible at lower speeds, you lower the risk exposure.  Sure, investors probably want to see max. speed operation as soon as possible.  But not at the cost of dead pedestrians because testing was done in a hurry.

Notes for those who like details:

There is certainly room for reasonable safety arguments at speeds above 20 mph. I'm just pointing out that testing time spent at/below 20 mph is inherently less risky if a pedestrian collision does occur. So minimizing the exposure to high speed operation is a way to improve overall safety in the event a pedestrian impact does occur.

The impact speed is potentially different than vehicle speed. If the vehicle has time to shed even 5 or 10 mph of speed at the last second before impact that certainly helps, potentially a lot, even if the vehicle does not come to a complete stop before impact. But a slower vehicle is less dependent upon that last second braking (human or automated) working properly in a crisis.
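
As a rough worked example of impact speed versus travel speed, here is a sketch using the constant-deceleration formula v_impact^2 = v_0^2 - 2ad. The 7 m/s^2 deceleration and the 10 m of braking distance are assumptions for illustration, not measured values for any particular vehicle.

```python
import math

MPH_TO_MPS = 0.44704  # exact mph to m/s conversion factor

def impact_speed_mph(initial_mph, braking_distance_m, decel_mps2=7.0):
    """Speed remaining after braking over a given distance.

    Assumes constant deceleration; decel_mps2 = 7.0 is a hypothetical
    hard-braking value chosen for illustration.
    """
    v0 = initial_mph * MPH_TO_MPS
    v_squared = v0 * v0 - 2.0 * decel_mps2 * braking_distance_m
    if v_squared <= 0.0:
        return 0.0  # vehicle stops before reaching the pedestrian
    return math.sqrt(v_squared) / MPH_TO_MPS

if __name__ == "__main__":
    for mph in (25, 35, 45):
        print(f"{mph} mph with 10 m of braking -> impact at "
              f"{impact_speed_mph(mph, 10.0):.1f} mph")
```

Under these assumptions the 25 mph vehicle stops short, while the 45 mph vehicle still hits at well over 30 mph. And if the braking never happens, the slower vehicle is still the better place to be.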

The actual risk will depend upon circumstances. For example, since the 1991 data shown it seems likely that emergency medical services have improved, reducing fatality rates. On the other hand, increasing prevalence of SUVs might increase fatality rates depending upon impact geometries. And so on.   A study that compares multiple data sets is here:
But, all that aside, all the data I've seen shows that traditional city speed limits (25 mph or less) help with reducing pedestrian fatalities.

Friday, July 27, 2018

Putting image manipulations in context: robustness testing for safe perception

I'm very pleased to share a publication from our NREC autonomy validation team that explains how computationally cheap image perturbations and degradations can expose catastrophic perception brittleness issues.  You don't need adversarial attacks to foil machine learning-based perception -- straightforward image degradations such as blur or haze can cause problems too.
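
For a feel of what a cheap image degradation test looks like, here is a toy sketch. It is not the method from our paper: the haze here is a uniform blend with no depth information, and `toy_detector` is a hypothetical stand-in that just measures contrast. A real test would plug in an actual perception system.

```python
def add_uniform_haze(image, transmission=0.7, airlight=255.0):
    """Blend every pixel toward a bright 'airlight' value.

    image is a grayscale image as a list of rows of pixel values (0..255).
    This uniform version ignores scene depth (an assumption for simplicity).
    """
    return [[p * transmission + airlight * (1.0 - transmission) for p in row]
            for row in image]

def toy_detector(image):
    """Hypothetical stand-in detector: image contrast scaled to 0..1."""
    pixels = [p for row in image for p in row]
    return (max(pixels) - min(pixels)) / 255.0

def score_drop(detector, image, mutator):
    """Detection score before minus score after mutation.

    A large drop from a mild mutation suggests brittleness.
    """
    return detector(image) - detector(mutator(image))

if __name__ == "__main__":
    image = [[0.0, 255.0], [0.0, 255.0]]
    print(f"score drop under haze: {score_drop(toy_detector, image, add_uniform_haze):.2f}")
```

The harness shape is the useful part: same input, one controlled mutation, compare the detection strength before and after.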

Our paper "Putting image manipulations in context: robustness testing for safe perception" will be presented at IEEE SSRR August 6-8.  Here's a submission preprint:


Abstract: We introduce a method to evaluate the robustness of perception systems to the wide variety of conditions that a deployed system will encounter. Using person detection as a sample safety-critical application, we evaluate the robustness of several state-of-the-art perception systems to a variety of common image perturbations and degradations. We introduce two novel image perturbations that use “contextual information” (in the form of stereo image data) to perform more physically-realistic simulation of haze and defocus effects. For both standard and contextual mutations, we show cases where performance drops catastrophically in response to barely perceptible changes. We also show how robustness to contextual mutators can be predicted without the associated contextual information in some cases.

Fig. 6: Examples of images that show the largest change in detection performance for MS-CNN under moderate blur and haze. For all of them, the rate of FPs per image required to detect the person increases by three to five orders of magnitude. In each image, the green box shows the labeled location of the person. The blue and red boxes are the detection produced by the SUT before and after mutation respectively, and the white-on-blue text is the strength of that detection (ranged 0 to 1). Finally, the value in white-on-yellow text shows the average FP rate per image that a sensitivity threshold set at that value would yield, i.e., the FP rate required to still detect the person.

Pezzementi, Z., Tabor, T., Yim, S., Chang, J., Drozd, B., Guttendorf, D., Wagner, M., & Koopman, P., "Putting image manipulations in context: robustness testing for safe perception," IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Aug. 2018.

Tuesday, July 24, 2018

Pennsylvania's Autonomous Vehicle Testing Guidelines

PennDOT has just issued new Automated Vehicle Testing Guidance:
       July 2018 PennDOT AV Testing Guidance (link to acrobat document)
(also, there is a press release.)

It's only been a short three months since the PA AV Summit in which PennDOT took up a challenge to improve AV testing policy. Today PennDOT released a significantly revised policy as promised. And it looks like they've been listening to safety advocates as well as AV companies.

At a high level, there is a lot to like about this policy. It makes it clear that a written safety plan is required, and suggests addressing one way or another the big three items I've proposed for AV testing safety
  • Make sure that the driver is paying attention
  • Make sure that the driver is capable of safing the vehicle in time when something goes wrong
  • Make sure that the Big Red Button (disengagement mechanism) is actually safe

There are a number of items in the guidance that look like a good idea. Here is a partial list of ones that catch my eye as being on the right track (many other ideas in the document are also good):

Good Ideas:
  • Submission of a written safety plan
  • Must have a safety driver in the driver seat who is able to take immediate physical control as required
  • Two safety drivers above 25 mph to ensure that the safety drivers are able to tend to both the safety driving and the testing
  • Validation "under controlled conditions" before on-road testing
  • Disengagement technology complies with industry standards
  • Safety driver training is mandatory, and has a nice list of required topics
  • Data recording for post-mishap analysis
  • Mitigate cybersecurity risk
  • Quality controls to ensure that major items are "adhered to and measured to ensure safe operation"
There are also some ideas that might or might not work out well in practice, and that in some cases seem to be compromises:

Not Sure About These:
  • Only one safety driver required below 25 mph. It's true that low speed pedestrian collisions are less lethal, and there can be more time to react, so the risk is somewhat lower. But time will tell if drivers are able to stay sufficiently alert to avoid mishaps even if they are lower speed.
  • It's not explicit about the issue of ensuring that there is enough time for a safety driver to intervene when something goes wrong. It's implicit in the parts about a driver being able to safe the vehicle. It's possible that this was considered a technical issue for developers rather than regulators, but in my mind it is a primary concern that can easily be overlooked in a safety plan. This topic should be more explicitly called out in the safety plan.
  • The data reporting beyond crashes is mostly just tracking drivers, vehicles, and how much testing they are doing.  I'd like to see more reporting regarding how well they are adhering to their own safety plan. It's one thing to say things look good via hand waving and "trust us, we're smart." It's another to report metrics such as how often drivers drop out during testing and what corrective actions are taken in response to such data. (The rate won't be a perfect zero; continual improvement should be the goal, as well as mishap rates no worse than conventional vehicles during testing.) I realize picking metrics can be a problem -- so just let each company decide for themselves what they want to report. The requirement should be to show evidence that safety is actually being achieved during testing. To be fair, there is a bullet in the document requiring quality controls. I'd like that bullet to have more explicit teeth to get the job done.
  • The nicely outlined PennDOT safety plan can be avoided by instead submitting something following the 2017 NHTSA AV Guidance. That guidance is a lot weaker than the 2016 NHTSA AV Guidance was. Waymo and GM have already created such public safety disclosures, and others are likely coming. However, it is difficult for a reader to know if AV vendors are just saying a lot of buzzwords or are actually doing the right things to be safe. Ultimately I'm not comfortable with "trust us, we're safe" with no supporting evidence. While some disclosure is better than no disclosure, the public deserves better than NHTSA's rather low bar in safety plan transparency, which was not intended to deal specifically with on-road testing. We'll have to see how this alternative option plays out, and what transparency the AV testers voluntarily provide. Maybe the new 2018 NHTSA AV Guidance due later this summer will raise the bar again.
Having said nice things for the most part, there are a few areas which really need improvement in a future revision. I realize they didn't have time to solve everything in three months, and it's good to see the progress they made. But I hope these areas are on the list for the next iteration:

Not A Fan:
  • Only one safety driver above 25 mph after undergoing "enhanced driver safety training." It's unclear what this training might really be, or if more training will really result in drivers that can do solo testing safely. I'd like to see something more substantive demonstrating that solo drivers will actually be safe in practice. Training only goes so far, and no amount of hiring only experienced drivers will eliminate the fact that humans have trouble staying engaged when supervising autonomy for long stretches of time. I'm concerned this will end up being a loophole that puts solo drivers in an untenable safety role.
  • No independent auditing. This is a big one, and worth discussing at length.
The biggest issue I see is no requirement for independent auditing of safety. I can understand why it might be difficult to get testers on board with such a requirement, especially a requirement for third party auditing. The AV business is shrouded in secrecy. Nobody wants PennDOT or anyone else poking around in their business. But every other safety-critical domain is based on an approach of transparent, independent safety assessment.

A key here is that independent auditing does NOT have to include public release of information.  The "secret sauce" doesn't even have to be revealed to auditors, so long as the system is safe regardless of what's in the fancy autonomy parts of the system. There are established models to keep trade secrets a secret used in other industries while still providing independent oversight of safety. There's no reason AVs should be any different. After all, we're all being put at risk by AV testing when we share public roads with them, even as pedestrians. AV testing ought to have transparent, independent safety oversight.

Overall, I think this guidance is excellent progress from PennDOT that puts us ahead of most, if not all, locations in the US regarding AV safety testing. I hope that AV testers take this and my points above to heart, and get ahead of the safety testing problem.

Monday, July 23, 2018

Road Sign Databases and Safety Critical Data Integrity

It's common for autonomous vehicles to use road map data, sign data, and so on for their operation. But what if that data has a problem?

Consider that while some data is being mapped by the vehicle manufacturers, they might be relying upon other data as well.  For example, some companies are encouraging cities to build a database of local road signs  (https://www.wired.com/story/inrix-road-rules-self-driving-cars?mbid=nl_071718_daily_list3_p4&CNDID=23351989)

It's important to understand the integrity of the data. What if there is a stop sign missing from the database and the vehicle decides to believe the database if it's not sure whether a stop sign in the real world is valid?  (Perhaps it's hard to see the real world stop sign due to sun glare and the vehicle just goes with the database.) If the vehicle blows through a stop sign because it's missing from the database, whose fault is that?  And what happens next?

Hopefully such databases will be highly accurate, but anyone who has worked with any non-trivial database knows there is always some problem somewhere. In fact, there have been numerous accidents and even deaths due to incorrect or corrupted data over the years.

Avoiding "death by road sign database" requires managing the safety critical integrity of the road sign data (and map data in general).  If your system uses it for guidance but assumes it is defective with comparatively high probability, then maybe you're fine. But as soon as you trust it to make a safety-relevant decision, you need to think about how much you can trust it and what measures are in place to ensure it is not only accurately captured, but also dependably maintained, updated, and delivered to consumers.
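
Here is a minimal sketch of what "use it for guidance but don't blindly trust it" might look like for the stop sign case. The names (`SignBelief`, `should_stop`, `database_trusted`) are hypothetical, and the policy is just one conservative option, not a recommendation for any particular vehicle design.

```python
from enum import Enum

class SignBelief(Enum):
    """What onboard perception concluded about a stop sign (hypothetical)."""
    PRESENT = "present"
    ABSENT = "absent"
    UNSURE = "unsure"

def should_stop(perception, database_has_sign, database_trusted):
    """Conservative arbitration between perception and a road sign database.

    database_trusted stands in for a real data integrity argument; it should
    only be True if the database has been shown to be dependable enough for
    a safety-relevant decision.
    """
    if perception is SignBelief.PRESENT:
        return True   # believe a sign we can actually see
    if database_has_sign:
        return True   # the database can add a stop, never remove one we saw
    if perception is SignBelief.UNSURE and not database_trusted:
        return True   # sun glare plus an unproven database: stop anyway
    return False

if __name__ == "__main__":
    # Glare case: perception unsure, sign missing from an unproven database.
    print(should_stop(SignBelief.UNSURE, database_has_sign=False, database_trusted=False))
```

Notice that the only path where a missing database entry lets the vehicle proceed past an uncertain sign is the one where `database_trusted` is True. That flag is where all the hard data integrity work hides.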

Fortunately you don't need to start from scratch.  The Safety-Critical Systems Club has been working on this problem for a while, and recently issued version 3 of their guidelines for safety critical data. You can get it for free as a download here: https://scsc.uk/scsc-127c

The document includes a broad range of information, guidance, and a worked example.  Appendix H also has quite a number of accounts of data integrity issues that are worth looking at if you need some war stories about what happens if you get data integrity wrong.  Highly recommended.


Monday, July 16, 2018

A Safe Way to Apply FMVSS Principles to Self-Driving Cars

As the self-driving car industry works to create safer vehicles, it is facing a significant regulatory challenge.  Complying with existing Federal Motor Vehicle Safety Standards (FMVSS) can be difficult or impossible for advanced designs. For conventional vehicles the FMVSS structure helps ensure a basic level of safety by testing some key safety capabilities. However, it might be impossible to run these tests on advanced self-driving cars that lack a brake pedal, steering wheel, or other components required by test procedures.

While there is industry pressure to waive some FMVSS requirements in the name of hastening progress, doing so is likely to result in safety problems. I’ll explain a way out of this dilemma based on the established technique of using safety cases. In brief, auto makers should create an evidence-based explanation as to why they achieve the intended safety goals of current FMVSS regulations even if they can’t perform the tests as written. This does not require disclosure of proprietary autonomous vehicle technology, and does not require waiting for the government to design new safety test procedures.

Why the Current FMVSS Structure Must Change

Consider an example of FMVSS 138, which relates to tire pressure monitoring. At some point many readers have seen a tire pressure telltale light, warning of low tire pressure:

FMVSS 138 Low Tire Pressure Telltale

This light exists because of FMVSS, which specifies tests to make sure that a driver-visible telltale light turns on for under-inflation and blow-out conditions with specified road surface conditions, vehicle speed, and so on.

But what if an unmanned vehicle doesn’t have a driver seat?  Or even a dashboard for mounting the telltale? Should we wait years for the government to develop an alternate self-driving car FMVSS series? Or should we simply waive FMVSS compliance when the tests don’t make sense as written?

Simplistic, blanket waivers are a bad idea. It is said that safety standards such as FMVSS are written in the blood of past victims. Self-driving cars are supposed to improve safety. We shouldn’t grant FMVSS waivers that will result in having more blood spilled to re-learn well understood lessons for self-driving cars.

The weakness of the FMVSS approach is that the tests don’t explicitly capture the “why” of the safety standard. Rather, there is a very prescriptive set of rules, operating in a manner similar to building codes for houses. Like building codes, they can take time to update when new technology appears. But just as it is a bad idea to skip a building inspection on your new house, you shouldn’t let vehicle makers skip FMVSS tests for your new car – self-driving or otherwise. Despite the fear of hindering progress, something must be done to adapt the FMVSS framework to self-driving cars.

A Safety Case Approach to FMVSS

A way to permit rapid progress while still ensuring that we don’t lose ground on basic vehicle safety is to adopt a safety case approach. A safety case is a written explanation of why a system is appropriately safe. Safety cases include: a safety goal, a strategy for meeting the goal, and evidence that the strategy actually works.

To create an FMVSS 138 safety case, a self-driving car maker would first need to identify the safety goals behind that standard. A number of public documents that precede FMVSS 138 state safety goals of detecting low tire pressure and avoiding blowouts. Those goals were, in turn, motivated by dozens of deaths resulting from tire blowouts that provoked the 2000 TREAD Act.

The next step is for the vehicle maker to propose a safety strategy compatible with its product. For example, vehicle software might set internal speed and distance limits in response to a tire failure, or simply pull off the road to await service. The safety case would also propose tests to provide concrete evidence that the safety strategy is effective. For example, instead of demonstrating that a telltale light illuminates, the test might instead show that the vehicle pulls to the side of the road within a certain timeframe when low tire pressure is detected. There is considerable flexibility in safety strategy and evidence so long as the safety goal is adequately met.
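
The goal/strategy/evidence structure can be sketched as a simple record. This is only an illustration of the three parts described above. The `SafetyCase` class and the wording of the FMVSS 138 entries are my own hypothetical example, not an official format or a complete safety case.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SafetyCase:
    """Hypothetical minimal record of a safety case: goal, strategy, evidence."""
    goal: str
    strategy: str
    evidence: List[str] = field(default_factory=list)

    def is_complete(self):
        """A case needs all three parts; a strategy without evidence is just a claim."""
        return bool(self.goal and self.strategy and self.evidence)

# Illustrative entry for a vehicle with no driver seat or dashboard.
tire_pressure_case = SafetyCase(
    goal="Detect low tire pressure and avoid blowouts",
    strategy="On detecting low pressure, pull off the road and await service",
    evidence=[
        "Test: vehicle pulls to the side of the road within a set time "
        "after low tire pressure is detected",
    ],
)

if __name__ == "__main__":
    print(tire_pressure_case.is_complete())
```
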

Regulators will need a process for documenting the safety case for each requested FMVSS deviation. They must decide whether they should evaluate safety cases up front or employ less direct feedback approaches such as post-mishap litigation. Regardless of approach, the safety cases can be made public, because they will describe a way to test vehicles for basic safety, and not the inner workings of highly proprietary autonomy algorithms.

Implementing this approach only requires vehicle makers to do extra work for FMVSS deviations that provide their products with a competitive advantage. Over time, it is likely that a set of standardized industry approaches for typical vehicle designs will emerge, reducing the effort involved. And if an FMVSS requirement is truly irrelevant, a safety case can explain why.

While there is much more to self-driving car safety than FMVSS compliance, we should not be moving backward by abandoning accepted vehicle safety requirements. Instead, a safety case approach will enable self-driving car makers to innovate as rapidly as they like, with a pay-as-you-go burden to justify why their alternative approaches to providing existing safety capabilities are adequate.

Author info: Prof. Koopman has been helping government, commercial, and academic self-driving developers improve safety for 20 years.
Contact: koopman@cmu.edu

Originally published in The Hill 6/30/2018:

Saturday, July 14, 2018

AVS 2018 Panel Session

It was great to have the opportunity to participate in a panel on autonomous vehicle validation and safety at AVS in San Francisco this past week.  Thanks especially to Steve Shladover for organizing such an excellent forum for discussion.

The discussion was the super-brief version. If you want to dig deeper, you can find much more complete slide decks attached to other blog posts:
The first question was to spend 5 minutes talking about the types of things we do for validation and safety.  Here are my slides from that very brief opening statement.
