Showing posts with label ODD.

Saturday, December 12, 2020

Operational Design Domain Metrics (Metrics Episode 11)

Operational Design Domain metrics (ODD metrics) deal with both how thoroughly the ODD has been validated as well as the completeness of the ODD description. How often the vehicle is forcibly ejected from its ODD also matters.

An ODD is the designer’s model of the types of things that the self-driving car is intended to deal with. The actual world, in general, is going to have things that are outside the ODD. As a simple example, the ODD might include fair weather and rain, but snow and ice might be outside the ODD because the vehicle is intended to be deployed in a place where snow is very infrequent.

Despite designers’ best efforts, it’s always possible for the ODD to be violated. For example, if the ODD is Las Vegas in the desert, the system might be designed for mostly dry weather or possibly light rain. But in fact, in Vegas, it rains once in a while, and sometimes it even snows. The day that it snows, the vehicle will be outside its ODD, even though it’s deployed in Las Vegas.

There are several types of ODD safety metrics that can be helpful. One is how well validation covers the ODD. What that means is whether the testing, analysis, simulation and other validation actually cover everything in the ODD, or have gaps in coverage.

When considering ODD coverage it’s important to realize that ODDs have many, many dimensions. There is much more to an ODD than just geo-fencing boundaries. Sure, there’s day and night, wet versus dry, and freeze versus thaw. But you also have traffic rules, condition of road markings, the types of vehicles present, the types of pedestrians present, whether there are leaves on the trees that affect LIDAR localization, and so on. All these things and more can affect perception, planning, and motion constraints.

While it’s true that a geo-fenced area can help limit some of the diversity in the ODD, simply specifying a geo-fence doesn’t tell you everything you need to know, nor does it ensure you’ve covered all the things that are inside that geo-fenced area. Metrics for ODD validation can be based on a detailed model of what’s actually in the ODD -- basically an ODD taxonomy of all the different factors that have to be handled -- and how well testing, simulation, and other validation cover that taxonomy.
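As an illustration of such a taxonomy-based coverage metric, here is a minimal sketch. The taxonomy dimensions and scenario fields are hypothetical examples, not from any actual ODD specification:

```python
# Sketch of an ODD coverage metric: for each taxonomy dimension,
# the fraction of its values exercised by at least one validation
# scenario, plus the values that remain uncovered (coverage gaps).

odd_taxonomy = {
    "lighting": {"day", "night", "dusk"},
    "road_surface": {"dry", "wet"},
    "pedestrian_type": {"adult", "child", "wheelchair"},
}

validation_scenarios = [
    {"lighting": "day", "road_surface": "dry", "pedestrian_type": "adult"},
    {"lighting": "night", "road_surface": "wet", "pedestrian_type": "adult"},
    {"lighting": "day", "road_surface": "dry", "pedestrian_type": "child"},
]

def odd_coverage(taxonomy, scenarios):
    """Return {dimension: (coverage_fraction, uncovered_values)}."""
    report = {}
    for dim, values in taxonomy.items():
        seen = {s[dim] for s in scenarios if dim in s}
        covered = values & seen
        report[dim] = (len(covered) / len(values), values - covered)
    return report

for dim, (frac, gaps) in odd_coverage(odd_taxonomy, validation_scenarios).items():
    print(f"{dim}: {frac:.0%} covered, gaps: {sorted(gaps) or 'none'}")
```

A real taxonomy would have far more dimensions (and combinations of dimensions matter too), but even this simple per-dimension tally immediately surfaces untested conditions such as dusk lighting.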

Another type of metric is how well the system detects ODD violations. At some point, a vehicle will be forcibly ejected from its ODD even though it didn’t do anything wrong, simply due to external events. For example, a freak snowstorm in the desert, a tornado, or the appearance of a completely unexpected new type of vehicle can force a vehicle out of its ODD with essentially no warning. The system has to recognize when it has exited its ODD and remain safe. A related metric is how often ODD violations happen during testing and on the road after deployment.

Another metric is what fraction of ODD violations are actually detected by the vehicle. This can be a crucial safety metric, because if an ODD violation occurs and the vehicle doesn’t know it, it might be operating unsafely. By definition it’s hard to build a detector for the ODD violations the vehicle can’t detect (and such detection failures should be corrected when found). But this metric can be gathered via root cause analysis whenever there’s been some sort of system failure or incident: one of the root causes might simply be a failure to detect an ODD violation.
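The detection-rate metric above can be computed after the fact from root cause analysis records. The record fields below are illustrative assumptions, not a real incident schema:

```python
# Sketch of an ODD-violation detection metric, computed from
# root cause analysis records. Each record notes whether the
# incident involved an ODD violation and whether the vehicle
# itself detected that violation at the time.

incident_records = [
    {"odd_violation": True,  "detected_by_vehicle": True},
    {"odd_violation": True,  "detected_by_vehicle": False},  # missed: safety concern
    {"odd_violation": True,  "detected_by_vehicle": True},
    {"odd_violation": False, "detected_by_vehicle": False},  # not an ODD issue
]

def odd_detection_rate(records):
    """Fraction of ODD violations the vehicle itself detected."""
    violations = [r for r in records if r["odd_violation"]]
    if not violations:
        return None  # no violations observed; rate undefined
    detected = sum(r["detected_by_vehicle"] for r in violations)
    return detected / len(violations)

rate = odd_detection_rate(incident_records)
print(f"ODD violation detection rate: {rate:.0%}")  # 2 of 3 detected here
```

Tracking this rate over time indicates whether ODD exit detection is improving, and each missed detection is a candidate for a new monitor.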

Coverage of the ODD is important, but an equally important question is how good the ODD description itself is. If your ODD description is missing many things that happen every day in your actual operational domain (the real world), then you’re going to have problems.

A higher-level metric is ODD description quality. That is likely to be tied to other metrics already mentioned in this and other segments. Here are some examples. The frequency of ODD violations can help indicate how well the ODD covers the actual operational domain. The frequency of motion failures could be due to motion system problems, but could also be due to environmental characteristics missing from your ODD. For example, cobblestone pavers have significantly different surface dynamics than a smooth concrete surface and might come as a surprise when they are encountered.

The frequency of perception failures could be due to training issues, but could also point to something missing from the ODD object taxonomy, such as a new aggressive clothing style or new types of vehicles. The frequency of planning failures could be due to planning bugs, but could also be due to the ODD missing descriptions of informal local traffic conventions.

The frequency of prediction failures could be due to prediction issues, but could also be due to a missing class of actors. For example, groups of 10 to 20 runners in formation near a military base might present a challenge if formation runners aren’t in the training data. It might be okay to have an incomplete ODD so long as you can always tell when something has forced you out of the ODD. But it’s important to consider that metric issues in various areas might be due to an unintentionally restricted ODD rather than an actual failure of the system design itself.
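One way to make this distinction actionable is to tag each failure’s root cause as either a system defect or an ODD description gap and watch the split. The categories and sample entries here are illustrative assumptions:

```python
from collections import Counter

# Sketch: tally failure root causes as system defects vs. ODD
# description gaps. A persistently high share of "odd_gap" causes
# suggests the ODD description, not the system, needs work.
failure_log = [
    ("perception", "odd_gap"),       # e.g., formation runners absent from taxonomy
    ("perception", "system_defect"),
    ("motion", "odd_gap"),           # e.g., cobblestone surface not in ODD
    ("planning", "system_defect"),
    ("prediction", "odd_gap"),
]

by_cause = Counter(cause for _, cause in failure_log)
print(dict(by_cause))
```

The same tally broken out by subsystem (perception, planning, prediction, motion) shows where the ODD taxonomy is thinnest.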

Summing up, ODD metrics should address how well validation covers the whole ODD and how well the system detects ODD violations. It’s also useful to consider that a cause of poor metrics and other design problems might in fact be an ODD description that is missing something important compared to what happens in the real world.

Wednesday, June 12, 2019

Operational Design Domain (ODD) for Autonomous Systems

The Operational Design Domain (ODD) is the set of environmental conditions that an autonomous system is designed to work in. Typically an ODD is thought of as some sort of geo-fencing plus obvious weather conditions (rain, snow, sun). But it's a lot more than that. Did you think of all of these?

Canton Avenue, the unofficial steepest street in the world, is less than 4 miles from downtown Pittsburgh.
Note cobblestone on the top half and the sidewalk stairs.  Cars slide (sometimes backwards) down the street in winter.
Geo-fencing is more complicated than drawing a circle around a city center.
[Wikipedia]
Characterizing the system operational environment should include at least the following:

  • Operational terrain, and associated location-dependent characteristics (e.g., slope, camber, curvature, banking, coefficient of friction, road roughness, air density) including immediate vehicle surroundings and projected vehicle path. It is important to note that dramatic changes can occur in relatively short distances.
  • Environmental and weather conditions such as surface temperature, air temperature, wind, visibility, precipitation, icing, lighting, glare, electromagnetic interference, clutter, vibration, and other types of sensor noise.
  • Operational infrastructure, such as availability and placement of operational surfacing, navigation aids (e.g., beacons, lane markings, augmented signage), traffic management devices (e.g., traffic lights, right of way signage, vehicle running lights), keep-out zones, special road use rules (e.g., time-dependent lane direction changes) and vehicle-to-infrastructure availability.
  • Rules of engagement and expectations for interaction with the environment and other aspects of the operational state space, including traffic laws, social norms, and customary signaling and negotiation procedures with other agents (both autonomous and human, including explicit signaling as well as implicit signaling via vehicle motion control).
  • Considerations for deployment to multiple regions/countries (e.g., blue stop signs, “right turn keep moving” stop sign modifiers, horizontal vs. vertical traffic signal orientation, side-of-road changes).
  • Communication modes, bandwidth, latency, stability, availability, reliability, including both machine-to-machine communications and human interaction.
  • Availability and freshness of infrastructure characterization data such as level of mapping detail and identification of temporary deviations from baseline data (e.g., construction zones, traffic jams, temporary traffic rules such as for hurricane evacuation).
  • Expected distributions of operational state space elements, including which elements are considered rare but in-scope (e.g. toll booths, police traffic stops), and which are considered outside the region of the state space in which the system is intended to operate.
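To make the idea concrete, a machine-readable ODD description could capture a few of these dimensions explicitly. The fields and thresholds below are a hypothetical, heavily simplified sketch, not a standard schema; a real ODD spans many more dimensions:

```python
from dataclasses import dataclass, field

# Hypothetical, heavily simplified ODD description covering a handful
# of the dimensions listed above (terrain, weather, equipment limits,
# infrastructure, deployment region).

@dataclass
class OddDescription:
    max_grade_percent: float      # operational terrain (slope)
    min_temperature_c: float      # environmental conditions
    max_temperature_c: float
    min_illumination_lux: float   # camera equipment limitation
    lane_markings_required: bool  # operational infrastructure
    regions: set = field(default_factory=set)  # deployment regions

    def within(self, grade, temp_c, lux, has_markings, region):
        """Check whether observed conditions fall inside this ODD."""
        return (grade <= self.max_grade_percent
                and self.min_temperature_c <= temp_c <= self.max_temperature_c
                and lux >= self.min_illumination_lux
                and (has_markings or not self.lane_markings_required)
                and region in self.regions)

odd = OddDescription(12.0, -5.0, 45.0, 10.0, True, {"las_vegas"})
print(odd.within(grade=8, temp_c=30, lux=500, has_markings=True, region="las_vegas"))
```

Even this toy check shows why a runtime ODD monitor needs more than a geo-fence test: a street like Canton Avenue would violate the grade limit while sitting comfortably inside a city-sized fence.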

Special attention should be paid to ODD aspects that are relevant to inherent equipment limitations, such as the minimum illumination required by cameras.

Are there any other aspects of ODD we missed?

(This is an excerpt of Koopman, P. & Fratrik, F., "How many operational design domains, objects, and events?" SafeAI 2019, AAAI, Jan 27, 2019.)



Wednesday, May 22, 2019

Car Drivers Do More than Drive

How will self-driving cars handle all the non-driving tasks that drivers also perform?  How will they make unaccompanied kids stop sticking their head out the window?

https://pixabay.com/photos/stuffed-animals-classic-car-driving-2798333/
Hey Kids -- Don't stick your heads out the window!

The conversation about self-driving cars is almost all about whether a computer can safely perform the "dynamic driving task." As well it should be -- at first.  If that part isn't safe, then there isn't much to talk about.

But, looking forward, human drivers do more than drive. They also provide adult supervision (and, on a good day, mature judgement) about the operation of the vehicle in other respects. If you've never heard the phrase "stop doing that right now or I swear I'm going to stop the car!" then probably you've never ridden in a car with multiple children.  And yet, we're already talking about sending kids to school in an automated school bus. Presumably the point is to avoid the cost of the human supervision.

But is putting a bunch of kids in a school bus without an adult a good idea?  Will the red-faced person on the TV monitor yelling at the kids really be effective?  Or just provide entertainment for already screaming kids?

But there's more than that to consider. Here's my start at a list of things human drivers (including vehicle owners, taxi drivers, and so on) do that isn't really driving.

Some tasks will arguably be done by a fleet maintenance function:
  • Preflight inspection of vehicle. (Flat tires, structural damage.)
  • Preflight correction of issues. (Cleaning off snow and ice. Cleaning windshield.)
  • Ensure routine maintenance has been performed. (Vehicle inspections, good tires, fueling/charging, fluid top-off if needed.)
  • Maintain vehicle interior cleanliness.  And we're not just talking about empty water bottles here. (Might require taking vehicle out of service for cleaning up motion sickness results. But somehow the maintenance crew needs to know there has been a problem.)
But some things have to happen on the road when no human driver is present. Examples include:
  • Ensure vehicle occupants stay properly seated and secured.
  • Keep vehicle occupants from doing unsafe things. (Hand out window, head out sunroof, fighting, who knows what. Generally providing adult supervision. Especially if strangers or kids are sharing a vehicle.)
  • Responding to cargo that comes loose.
  • Emergency egress coordination (e.g., getting sleeping children, injured, and mobility impaired passengers out of vehicle when a dangerous situation occurs such as a vehicle fire)
Anyone who seriously wants to build vehicles that don't have a designated "person in charge" (which is the driver in conventional vehicles) will need to think through all these issues, and likely more. Any argument that a self-driving vehicle is safe for unattended service will need to address them as well. (UL 4600 is intended to cover all this ground.)

Can you think of any other non-driving tasks that need to be handled?

Friday, January 25, 2019

How Many Operational Design Domains, Objects, and Events? Safe AI 2019 talk

Validating self-driving cars requires so, so much more than just "geo-fencing" if you want to make the problem tractable. My Safe AI 2019 paper and presentation explain and illustrate why this is the case.

Paper: https://users.ece.cmu.edu/~koopman/pubs/Koopman19_SAFE_AI_ODD_OEDR.pdf
Download slides: https://users.ece.cmu.edu/~koopman/pubs/koopman19_SAFE_AI_slides.pdf
(For slideshare version see below)

How Many Operational Design Domains, Objects, and Events?
Phil Koopman & Frank Fratrik 

Abstract:
A first step toward validating an autonomous vehicle is deciding what aspects of the system need to be validated. This paper lists factors we have found to be relevant in the areas of operational design domain, object and event detection and response, vehicle maneuvers, and fault management. While any such list is unlikely to be complete, our contribution can form a starting point for a publicly available master list of considerations to ensure that autonomous vehicle validation efforts do not contain crucial gaps due to missing known issues.