Monday, July 22, 2019

Autonomous Vehicle Testing Safety Needs More Transparency

Last week there were two injuries involving human-supervised autonomous test shuttles on different continents, with no apparent connection other than random chance. As deployment of this work-in-progress technology scales up in public, we know that we can expect more high-profile accidents. Fortunately this time nobody was killed or suffered life-altering injuries. (But we still need to find out what actually happened.) And to be sure, human-driven vehicles are far from accident-free. But what about next time?

The bigger issue for the industry is: will the next autonomous vehicle testing mishap be due to rare and random problems that are within the bounds of reasonable risk? Or will it be due to safety issues that could and should have been addressed beforehand?

Public trust in autonomous vehicle technology has already eroded in the past year. Each new mishap has the unfortunate potential to make that situation worse, regardless of the technical root cause. While no mode of transportation is perfectly safe, it's important that the testing of experimental self-driving car technology not expose the public to reasonably avoidable risk. And it's equally important that the public's perception matches the actual risk.

Historically the autonomous vehicle industry has operated under a cloak of secrecy. As we've seen, that can lead to boom and bust cycles of public perception, with booms of optimism followed by backlash after each publicized accident. But in fairness, if there is no information about public testing risk other than hype about an accident-free far-flung future, what is the public supposed to think? Self-driving cars won't be perfect. The goal is to make them better than the current situation. One hopes that along the way things won't actually get worse.

Some progress in public safety disclosure has been made, albeit with low participation rates. One of the two vehicles involved in injuries this past week has a public safety report available. The other does not. In fact, a significant majority of testing organizations have not taken the basic step of making a Voluntary Safety Self-Assessment report available to NHTSA. And to be clear, that disclosure process is more about explaining progress toward production maturity than about the specific topic of public testing safety.

The industry needs to do better at providing transparent, credible safety information while testing this still-experimental technology. Long term public education and explanation are important. But the more pressing need revolves around what's happening on our roads right now during testing operations. That is what is making news headlines, and is the source of any current risk.

At some point either autonomous vehicle testers are actually doing safe, responsible public operations or they aren't. If they aren't, that is bound to catch up with them as operations scale up. From the point of view of a tester:

- It's a problem if you can't explain to yourself why you are acceptably safe in a methodical, rigorous way. (In that case, probably you're unsafe.) 
- It's a problem if you expect human safety drivers to perform with superhuman ability. They can't.
- It's a problem if you aren't ready to explain to authorities why you are still acceptably safe after a mishap.
- It's a problem if you can't explain to a jury that you used reasonable care to ensure safety. 
- It's a problem if your company's testing operations get sidelined by an accident investigation.
- It's a problem for mishap victims if accidents occur that were reasonably avoidable, especially involving vulnerable road users. 
- It's a problem for the whole industry if people lose trust in the technology's ability to operate safely in public areas. 
- It's a problem if you can't explain to the public -- with technical credibility -- why they should believe you are safe. Preferably before you begin testing operations on public roads. 

Some companies are working on transparent safety more aggressively than others. Some are working on safety cases that contain detailed chains of reasoning and evidence to ensure that they have all the bases covered for public road testing. Others might not be.  But really, we don't know. 

And that is an essential part of the problem -- we really have no idea who is being diligent about safety. Eventually the truth will out, but bad news is all too likely to come in the form of deaths and injuries. We as an industry need to keep that from happening.

It takes only one company having a severe mishap to potentially do dramatic harm to the entire industry. While bad luck can happen to anyone, it's more likely to happen to a company that might be cutting corners on safety to get to market faster. 

The days of "trust us, we're smart" are over for the autonomous vehicle industry. Trust has to be earned. Transparency backed with technical credibility is a crucial first step to earning trust. The industry has been given significant latitude to operate on public roads, but that comes with great responsibility and a need for transparency regarding public safety.

Safety should truly come first. To that end, every company testing on public roads should immediately make a transparent, technically credible statement about their road testing safety practices. A simple "we have safety drivers" isn't enough. These disclosures can form a basis for uniform practices across the industry to make sure that this technology survives its adolescence and has a chance to reach maturity and deliver the benefits it promises.

Dr. Philip Koopman is co-founder and CTO of Edge Case Research, which helps companies make their autonomous systems safer. He is also a professor at Carnegie Mellon University. He is a principal technical author of the UL 4600 draft standard for autonomous system safety, and has been working on self-driving car safety for more than 20 years. Contact:

Monday, July 1, 2019

Autonomous Vehicle Fault and Failure Management

When you build an autonomous vehicle you can't count on a human driver to notice when something's wrong and "do the right thing." Here is a list of faults, system limitations, and fault responses AVs will need to get right. Did you think of these?

System Limitations:

Sometimes the issue isn't that something is broken, but rather simply that all vehicles have limitations. You have to know your system's limitations.
  • Current capabilities of sensors and actuators, which can depend upon the operational state space.
  • Detecting and handling a vehicle excursion outside the operational state space for which it was validated, including all aspects of {ODD, OEDR, Maneuver, Fault} tuples.
  • Desired availability despite fault states, including any graceful degradation plan, and any limits placed upon the degraded operational state space.
  • Capability variation based on payload characteristics (e.g. passenger vehicle overloaded with cargo, uneven weight distribution, truck loaded with gravel, tanker half filled with liquid) and autonomous payload modification (e.g. trailer connect/disconnect).
  • Capability variation based on functional modes (e.g. pivot vs. Ackermann vs. crab steering, rear wheel steering, ABS or 4WD engaged/disengaged).
  • Capability variation based on ad-hoc teaming (e.g. V2V, V2I) and planned teaming (e.g. leader-follower or platooning vehicle pairing).
  • Incompleteness, incorrectness, corruption or unavailability of external information (V2V, V2I).
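
One way to picture the {ODD, OEDR, Maneuver, Fault} tuple idea from the list above is as a membership test against the set of combinations that were actually validated. The sketch below is only an illustration under that assumption: the class name, the string labels, and the VALIDATED set are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One {ODD, OEDR, Maneuver, Fault} tuple. All labels are hypothetical."""
    odd: str
    oedr: str
    maneuver: str
    fault: str


# Hypothetical set of tuples the system was validated for.
VALIDATED = {
    OperatingPoint("urban_dry_day", "pedestrian", "lane_keep", "none"),
    OperatingPoint("urban_dry_day", "pedestrian", "lane_keep", "one_lidar_degraded"),
    OperatingPoint("urban_rain_day", "pedestrian", "lane_keep", "none"),
}


def is_excursion(current: OperatingPoint) -> bool:
    """True when the vehicle is outside the validated operational state space."""
    return current not in VALIDATED


inside = OperatingPoint("urban_dry_day", "pedestrian", "lane_keep", "none")
# Rain was validated and a degraded lidar was validated, but never together.
outside = OperatingPoint("urban_rain_day", "pedestrian", "lane_keep", "one_lidar_degraded")

print(is_excursion(inside))
print(is_excursion(outside))
```

The second case is the interesting one: each element appears somewhere in the validated set, yet the combination does not, so it still counts as an excursion.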

System Faults:
  • Perception failure, including transient and permanent faults in classification and pose of objects.
  • Planning failures, including those leading to collision, unsafe trajectories (e.g., rollover risk), and dangerous paths (e.g., roadway departure).
  • Vehicle equipment operational faults (e.g., blown tire, engine stall, brake failure, steering failure, lighting system failure, transmission failure, uncommanded engine power, autonomy equipment failure, electrical system failure, vehicle diagnostic trouble codes).
  • Vehicle equipment maintenance faults (e.g., improper tire pressure, bald tires, misaligned wheels, empty sensor cleaning fluid reservoir, depleted fuel/battery).
  • Operational degradation of sensors and actuators including temporary (e.g., accumulation of mud, dirt, dust, heat, water, ice, salt spray, smashed insects) and permanent (e.g., manufacturing imperfections, scratches, scouring, aging, wear-out, blockage, impact damage).
  • Equipment damage including detecting and mitigating catastrophic loss (e.g., vehicle collisions, lightning strikes, roadway departure), minor losses (e.g., sensor knocked off, actuator failures), and temporary losses (e.g., misalignment due to bent support bracket, loss of calibration).
  • Incorrect, missing, stale, and inaccurate map data.
  • Training data incompleteness, incorrectness, known bias, or unknown bias.

Fault Responses:

Some of the faults and limitations fall within the purview of safety standards that apply to non-autonomous functions. However, a unified system-level view of fault detection and mitigation can be useful to ensure that no faults are left unaddressed. More importantly, to the degree that credit has been taken for a human driver participating in fault mitigation by safety standards, that places fault mitigation obligations upon the autonomy.
  • How the system behaves when encountering an exceptional operational state space, experiencing a fault, or reaching a system limitation.
  • Diagnostic gaps (e.g., latent faults, undetected faults, undetected faulty redundancy).
  • How the system re-integrates failed components, including recovery from transient faults and recovery from repaired permanent faults during operation and/or after maintenance.
  • Response and policies for prioritizing or otherwise determining actions in inherently risky or certain-loss situations.
  • Withstanding an attack (system security, compromised infrastructure, compromised other vehicles), and deterring inappropriate use (e.g., malicious commands, inappropriately dangerous cargo, dangerous passenger behavior).
  • How the system is updated to correct functional defects, security defects, safety defects, and addition of new or improved capabilities.
Is there anything we missed?
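
As a rough illustration of the "unified system-level view" mentioned above (not something from the paper), a fault response policy can be sketched as a single table in which any fault nobody thought to list falls through to the most conservative response. The fault names, the three response levels, and the policy entries below are all hypothetical.

```python
from enum import Enum


class Response(Enum):
    """Hypothetical response levels, ordered from least to most conservative."""
    CONTINUE = 1
    DEGRADED = 2
    SAFE_STOP = 3


# Hypothetical policy table; a real one would be derived from a safety analysis.
FAULT_POLICY = {
    "sensor_cleaning_fluid_empty": Response.DEGRADED,
    "one_camera_blocked": Response.DEGRADED,
    "brake_failure": Response.SAFE_STOP,
}


def respond(active_faults):
    """Return the most conservative response demanded by any active fault.

    A fault that is missing from the table is treated as SAFE_STOP, so an
    unanticipated fault is never silently ignored.
    """
    worst = Response.CONTINUE
    for fault in active_faults:
        response = FAULT_POLICY.get(fault, Response.SAFE_STOP)
        if response.value > worst.value:
            worst = response
    return worst


print(respond([]))
print(respond(["one_camera_blocked"]))
print(respond(["one_camera_blocked", "fault_nobody_listed"]))
```

This only covers faults that are detected at all. The diagnostic gaps in the list above (latent and undetected faults) need separate treatment.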

(This is an excerpt of Koopman, P. & Fratrik, F., "How many operational design domains, objects, and events?" SafeAI 2019, AAAI, Jan 27, 2019.)

Wednesday, June 26, 2019

Event Detection Considerations for Autonomous Vehicles (OEDR -- part 2)

Object and Event Detection and Recognition (OEDR) also involves making predictions about what might happen next. Is that pedestrian waiting for a bus, or about to walk out into the crosswalk right in front of my car? Did you think of all of these aspects?
The infamous Pittsburgh Left. The first vehicle at a red light will (sometimes) turn left when the light turns to green.

Some factors to consider when deciding what events and behaviors your system needs to predict include:

  • Determining expected behaviors of other objects, which might involve a probability distribution and is likely to be based on object classification.
  • Normal or reasonably expected movements by objects in the environment.
  • Unexpected, incorrect, or exceptional movement of other vehicles, obstacles, people, or other objects in the environment.
  • Failure to move by other objects which are reasonably expected to move.
  • Operator interactions prior to, during, and post autonomy engagement including: supervising driver alertness monitoring, informing occupants, interaction with local or remote operator locations, mode selection and enablement, operator takeover, operator cancellation or redirect, operator status feedback, operator intervention latency, single operator supervision of multiple systems (multi-tasking), operator handoff, loss of operator ability to interact with vehicle.
  • Human interactions including: human commands (civilians performing traffic direction, police pull-over, passenger distress), normal human interactions (pedestrian crossing, passenger entry/egress), common human rule-breaking (crossing mid-block when far from an intersection, speeding, rubbernecking, use of parking chairs, distracted walking), abnormal human interactions (defiant jaywalking, attacks on vehicle, attempted carjacking), and humans who are not able to follow rules (children, impaired adults).
  • Non-human interactions including: animal interaction (flocks/herds, pets, dangerous wildlife, protected wildlife) and delivery robots.
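
The first bullet above says that expected behavior might be a probability distribution and is likely to depend on object classification. A minimal sketch of that idea follows, assuming a classifier that reports a probability per class. Every class name, behavior label, and number in it is made up for illustration.

```python
# Hypothetical per-class behavior priors (illustrative numbers only).
BEHAVIOR_PRIORS = {
    "pedestrian": {"stays_on_sidewalk": 0.90, "enters_roadway": 0.10},
    "bush": {"stays_put": 1.0},
}

# Hypothetical fallback for classes with no behavior model.
UNKNOWN_PRIOR = {"stays_put": 0.5, "enters_roadway": 0.5}


def predict(class_probs):
    """Mix per-class behavior priors, weighted by classification probability."""
    mixed = {}
    for cls, p_cls in class_probs.items():
        prior = BEHAVIOR_PRIORS.get(cls, UNKNOWN_PRIOR)
        for behavior, p_behavior in prior.items():
            mixed[behavior] = mixed.get(behavior, 0.0) + p_cls * p_behavior
    return mixed


# The classifier is 70% sure this is a pedestrian and 30% sure it is a bush.
dist = predict({"pedestrian": 0.7, "bush": 0.3})
for behavior, p in sorted(dist.items()):
    print(f"{behavior}: {p:.2f}")
```

Even in this toy version, classification uncertainty carries straight through to the behavior prediction, which is one reason misclassification matters for more than object labels.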

Is there anything we missed?   (Previous post had the "objects" part of OEDR.)

(This is an excerpt of Koopman, P. & Fratrik, F., "How many operational design domains, objects, and events?" SafeAI 2019, AAAI, Jan 27, 2019.)

Wednesday, June 19, 2019

Object Detection Considerations for Autonomous Vehicles (OEDR -- part 1)

Object and Event Detection and Recognition (OEDR) involves having an autonomous vehicle detect and classify various types of objects so that it can plan a response. Detection is only the first step; you need to also be able to classify the obstacle to predict what might happen next. Pedestrians tend to walk into the roadway. Bushes, not so much. Did you think of all of these aspects?
Q: Why did the Mr. Rogers-saurus cross the street?
A: Trick question; he doesn't actually move because he is part of the Pittsburgh Dinosaur Parade.
Some factors to consider when deciding what objects your system needs to detect and recognize include:
  • Ability to detect and identify (e.g. classify) all relevant objects in the environment.
  • Processing and thresholding of sensor data to avoid both false positives (e.g., bouncing drink can, steel bridge joint, steel road construction cover plate, roadside sign, dust cloud, falling leaves) and false negatives (e.g., highly publicized partially automated vehicle collisions with stationary vehicles).
  • Characterizing the likely operational parameters of other road users (e.g., braking capability of leading and following vehicle, or whether another vehicle is behaving erratically enough that there is a likely control fault).
  • Permanent obstacles such as structures, curbs, median dividers, guard rails, trees, bridges, tunnels, berms, ditches, roadside and overhanging signage.
  • Temporary obstacles such as transient keep-out zones, spills, floods, water-filled potholes, landslides, washed out bridges, overhanging vegetation, and downed power lines. (For practical purposes, “temporary” might mean obstacles not included on maps, with some vehicle having to be the first vehicle to detect an obstacle for placement even on a dynamic map.)
  • People, including cooperative people, uncooperative people, malicious behaviors, and people who are unaware of the operation of the autonomous system.
  • At-risk populations which might be unable to follow, incapable of following, or exempt from following established rules and norms, such as children as well as injured, ability-impaired, or under-the-influence people.
  • Other cooperative and uncooperative human-driven and autonomous vehicles.
  • Other road users including special purpose vehicles, temporary structures, street dining, street festivals, parades, motorcades, funeral processions, farm equipment, construction crews, draft animals, farm animals, and endangered species.
  • Other non-stationary objects including uncontrolled moving objects, falling objects, wind-blown objects, in-traffic cargo spills, and low-flying aircraft.
Is there anything we missed?   (Next post will have the "events" part of OEDR.)

(This is an excerpt of Koopman, P. & Fratrik, F., "How many operational design domains, objects, and events?" SafeAI 2019, AAAI, Jan 27, 2019.)

Wednesday, June 12, 2019

Operational Design Domain (ODD) for Autonomous Systems

The Operational Design Domain (ODD) is the set of environmental conditions that an autonomous system is designed to work in. Typically an ODD is thought of as some sort of geo-fencing plus some obvious weather conditions (rain, snow, sun). But it's a lot more than that. Did you think of all of these?

Canton Avenue, the unofficial steepest street in the world, is less than 4 miles from downtown Pittsburgh.
Note cobblestone on the top half and the sidewalk stairs.  Cars slide (sometimes backwards) down the street in winter.
Geo-fencing is more complicated than drawing a circle around a city center.
Characterizing the system operational environment should include at least the following:

  • Operational terrain, and associated location-dependent characteristics (e.g., slope, camber, curvature, banking, coefficient of friction, road roughness, air density) including immediate vehicle surroundings and projected vehicle path. It is important to note that dramatic changes can occur in relatively short distances.
  • Environmental and weather conditions such as surface temperature, air temperature, wind, visibility, precipitation, icing, lighting, glare, electromagnetic interference, clutter, vibration, and other types of sensor noise.
  • Operational infrastructure, such as availability and placement of operational surfacing, navigation aids (e.g., beacons, lane markings, augmented signage), traffic management devices (e.g., traffic lights, right of way signage, vehicle running lights), keep-out zones, special road use rules (e.g., time-dependent lane direction changes) and vehicle-to-infrastructure availability.
  • Rules of engagement and expectations for interaction with the environment and other aspects of the operational state space, including traffic laws, social norms, and customary signaling and negotiation procedures with other agents (both autonomous and human, including explicit signaling as well as implicit signaling via vehicle motion control).
  • Considerations for deployment to multiple regions/countries (e.g., blue stop signs, “right turn keep moving” stop sign modifiers, horizontal vs. vertical traffic signal orientation, side-of-road changes).
  • Communication modes, bandwidth, latency, stability, availability, reliability, including both machine-to-machine communications and human interaction.
  • Availability and freshness of infrastructure characterization data such as level of mapping detail and identification of temporary deviations from baseline data (e.g., construction zones, traffic jams, temporary traffic rules such as for hurricane evacuation).
  • Expected distributions of operational state space elements, including which elements are considered rare but in-scope (e.g. toll booths, police traffic stops), and which are considered outside the region of the state space in which the system is intended to operate.

Special attention should be paid to ODD aspects that are relevant to inherent equipment limitations, such as the minimum illumination required by cameras.
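
To make the equipment-limitation point concrete, here is a minimal sketch of an ODD check that covers more than a geo-fence. It assumes only three measurable conditions (illumination, slope, friction); a real ODD would need far more, as the list above shows. The field names and threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OddLimits:
    """Hypothetical limits tied to equipment capability."""
    min_illumination_lux: float      # e.g. what the cameras need
    max_slope_percent: float
    min_friction_coefficient: float


@dataclass
class Conditions:
    """Current conditions as estimated by the vehicle (hypothetical fields)."""
    illumination_lux: float
    slope_percent: float
    friction_coefficient: float


def odd_violations(limits: OddLimits, now: Conditions) -> list:
    """Return the names of every ODD limit that current conditions violate."""
    violations = []
    if now.illumination_lux < limits.min_illumination_lux:
        violations.append("illumination")
    if now.slope_percent > limits.max_slope_percent:
        violations.append("slope")
    if now.friction_coefficient < limits.min_friction_coefficient:
        violations.append("friction")
    return violations


limits = OddLimits(min_illumination_lux=10.0, max_slope_percent=15.0,
                   min_friction_coefficient=0.4)
dark_steep_street = Conditions(illumination_lux=2.0, slope_percent=20.0,
                               friction_coefficient=0.7)
print(odd_violations(limits, dark_steep_street))
```

A vehicle can sit squarely inside its geo-fence and still be outside its ODD on several axes at once.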

Are there any other aspects of ODD we missed?

(This is an excerpt of Koopman, P. & Fratrik, F., "How many operational design domains, objects, and events?" SafeAI 2019, AAAI, Jan 27, 2019.)

Tuesday, May 28, 2019

Ethical Problems That Matter for Self Driving Cars

It's time to get past the irrelevant Trolley Problem and talk about ethical issues that actually matter in the real world of self driving cars.  Here's a starter list involving public road testing, human driver responsibilities, safety confidence, and grappling with how safe is safe enough.

  • Public Road Testing. Public road testing clearly puts non-participants such as pedestrians at risk. Is it OK to test on unconsenting human subjects? If the government hasn't given explicit permission to road test in a particular location, arguably that is what is (or has been) happening. An argument that simply having a "safety driver" mitigates risk is clearly insufficient based on the tragic fatality in Tempe AZ last year. 
  • Expecting Human Drivers to be Super-Human. High-end driver assistance systems might be asking the impossible of human drivers. Simply warning the driver that (s)he is responsible for vehicle safety doesn't change the well known fact that humans struggle to supervise high-end autonomy effectively, and that humans are prone to abusing highly automated systems. This gives rise to questions such as:
    • At what point is it unethical to hold drivers accountable for tasks that require what amount to super-human abilities and performance?
    • Are there viable ethical approaches to solving this problem? For example, if a human unconsciously learns how to game a driver monitoring system (e.g., via falling asleep with eyes open -- yes, that is a thing) should that still be the human driver's fault if a crash occurs?
    • Is it OK to deploy technology that will result in drivers being punished for not being super-human if the result is that the total death rate declines?
  • Confidence in Safety Before Deployment.  There is work that advocates that even slightly better than a human is acceptable. But there isn't a lot of discussion about the next level of what that really means. Important ethical sub-topics include:
    • Who decides when a vehicle is safe enough to deploy? Should that decision be made by a company on its own, or subject to external checks and balances? Is it OK for a company to deploy a vehicle they think is safe based on subjective criteria alone: "we're smart, we worked hard, and we're convinced this will save lives"?
    • What confidence is required for the actual prediction of casualties from the technology? If you are only statistically 20% confident that your self-driving car will be no more dangerous than a human driver, is that enough?
    • Should limited government resources that could be used for addressing known road safety issues (drunk driving, driving too fast for conditions, lack of seat belt use, distracted driving) be diverted to support self-driving vehicle initiatives using an argument of potential public safety improvement?
  • How Safe is Safe Enough? Even if we understand the relationship between an aggregate safety goal and self-driving car technology, where do we set the safety knob?  How will the following issues affect this?
    • Will risk homeostasis apply? There is an argument that there will be pressure to turn up the speed/traffic volume dials on self-driving cars to increase permissiveness and traffic flow until the same risk as manual driving is reached. (Think more capable cars resulting in crazier roads with the same net injury and fatality rates.)
    • Is it OK to deploy initially with a higher expected death rate than human drivers under an assumption that systems will improve over time, long term reducing the total number of deaths?  (And is it OK for this improvement to be assumed rather than proven to be likely?)
    • What redistribution of demographics for victims is OK? If fewer passengers die but more pedestrians die, is that OK if net death rate is the same? Is it OK if deaths disproportionately occur to specific sub-populations? Did any evaluation of safety before deployment account for these possibilities?
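
To put rough numbers on the "20% confident" question above, here is a back-of-the-envelope sketch. It assumes fatalities follow a Poisson process, assumes a human-driver baseline of about one fatality per 100 million miles (a round number chosen for illustration, not a figure from this post), and assumes testing has seen zero fatalities so far. Under those assumptions, it computes how many fatality-free miles would be needed to claim, at a given confidence, that the vehicle is no worse than the baseline.

```python
import math

# Assumed round-number baseline: one fatality per 100 million miles.
HUMAN_FATALITY_RATE_PER_MILE = 1.0 / 100_000_000


def confidence_after(miles, rate=HUMAN_FATALITY_RATE_PER_MILE):
    """Confidence that the true fatality rate is below `rate`, given zero
    fatalities in `miles` of driving (Poisson model: P(0 events) = exp(-rate * miles))."""
    return 1.0 - math.exp(-rate * miles)


def miles_needed(confidence, rate=HUMAN_FATALITY_RATE_PER_MILE):
    """Fatality-free miles needed to reach the given confidence level."""
    return -math.log(1.0 - confidence) / rate


for level in (0.20, 0.50, 0.95):
    print(f"{level:.0%} confidence: about {miles_needed(level) / 1e6:.0f} million miles")
```

Under these assumptions, 20% confidence takes on the order of tens of millions of fatality-free miles and 95% takes hundreds of millions, which is why the choice of confidence level is an ethical question and not just a statistical detail.
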
I don't purport to have the definitive answers to any of these problems (except a proposal for road testing safety, cited above). And it might be that some of these problems are more or less answered. The point is that there is so much important, relevant ethical work to be done that people shouldn't be wasting their time on trying to apply the Trolley Problem to AVs. I encourage follow-ups with pointers to relevant work.

If you're still wondering about Trolley-esque situations, see this podcast and the corresponding paper. The short version from the abstract of that paper: Trolley problems are "too contrived to be of practical use, are an inappropriate method for making decisions on issues of safety, and should not be used to inform engineering or policy." In general, it should be incredibly rare for a safely designed self-driving car to get into a no-win situation, and if it does happen the car isn't going to have information about the victims and/or isn't going to have the control authority to actually behave as suggested in the experiments any time soon, if ever.

Here are some links to more about applying ethics to technical systems in general (@IEEESSIT) and autonomy in particular, as well as the IEEE P7000 standard series.
