Monday, March 11, 2024

When Is a Recall Not a Recall?

This opinion piece claims that a Tesla software update should not be a recall. I'm calling bullsh*t.


Article Link: https://www.wsj.com/articles/dont-believe-the-tesla-headlines-safety-recall-software-update-c2e95f3a?st=x6p4mcr79zaee10&reflink=desktopwebshare_permalink

This WSJ opinion piece has so many issues... in essence it argues the common Tesla talking point that any safety defect (a design defect that might kill someone) doesn't count if an Over-The-Air software update can fix it. Changing the method of fix delivery does not bring potential crash victims back to life. Arguing the dictionary meaning of "recall" is a rhetorical device, and nothing more. Whining about the cost of postcards misses the point that this is about people's lives. If companies care about postage costs, perhaps they should try harder not to deploy unsafe software.

If one really needs to respond to the word game arguments from Tesla supporters, it's pretty straightforward to do so:

  • "Recall" => "Safety Defect" (simple terminology change) 
  • "Recall[ed][ing]" => "Publish[ed][ing] a Notice of Safety Defect" (simple terminology change)
  • "Issued a Recall" => "Started the Safety Defect Response Process" (simple terminology change)
  • "OTA software Remedy" => An over the air software update that corrects a safety defect. (No change in meaning.)
  • "OTA software update" => Any over-the-air software update. Some updates are remedies, but hopefully most are not. If not to fix a safety defect, then it's not a remedy. (No change in meaning.)

The above does not change any process; just terminology. Journalists can adopt those terms right now and still be accurate by informally using the phrase "Notice of Safety Defect" to refer to the "Part 573 Safety Recall Report" that is published for every recall. Here's an example of one for Tesla FSD failing to stop at stop signs, a behavior Tesla admits in that document "may increase the risk of collision": https://static.nhtsa.gov/odi/rcl/2022/RCLRPT-22V037-4462.PDF

Or we could just continue to use the term "recall" for its compactness and well-established meaning.

It is already the case that only SOME updates are associated with recalls. Any OTA update that does not remedy a safety defect is NOT a recall. Never has been. (Also, an OTA update that does not address a safety defect never has been a Remedy.) Anyone saying that the current rules require all OTA updates to be called recalls is just intentionally confusing things, potentially in an effort to get people to stop paying attention to the many safety defects that have indeed been corrected by OTA software Remedies.

BTW, when someone says the OTA happened in a few days, that overlooks the fact that the safety problem might have been on the road for months (or years) before being corrected. For the rolling-stop remedy, the firmware with the safety defect was released on Oct. 20, 2020, and the remedy was applied Feb. 1, 2022: a duration of about 469 days during which road users were exposed to undue risk by software with a safety defect. (A quick way to check that arithmetic is sketched below.)
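
For anyone who wants to double-check that exposure window, here is a minimal sketch in Python (the two dates are the ones given above; the rest is just calendar arithmetic):

    from datetime import date

    # Dates given above for the rolling-stop safety defect and its OTA remedy
    defect_fielded = date(2020, 10, 20)   # firmware with the safety defect released
    remedy_applied = date(2022, 2, 1)     # OTA remedy applied

    print((remedy_applied - defect_fielded).days)   # 469 days of on-road exposure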

It is reasonable to say that Software Defined Vehicles should cause NHTSA to revisit and streamline the recall process, if for no other reason than that they'll likely be buried under an avalanche of safety-relevant software updates. But arguing that NHTSA should be out of the loop for software updates that correct safety defects works against safety rather than toward it.

Monday, January 29, 2024

The Exponent Report on the Cruise Pedestrian Dragging Mishap

On October 2, 2023, a Cruise robotaxi Autonomous Vehicle (AV) in San Francisco struck and then later -- as part of a different maneuver -- dragged a pedestrian under the vehicle as part of a more complex road mishap. The circumstances included a different human-driven vehicle striking the pedestrian first, and what amounts to an attempted cover-up by Cruise senior management of the final portion (the dragging part) of the mishap. High-level Cruise leadership was largely sacked. A significant fraction of staff were let go as well. A third-party external review was commissioned. Now we have the report.

Quinn Emanuel Investigation report:  Original Link / Archive.org link

High level takeaways based on the Exponent analysis (just the technical portion of the report):

  • A computer-speed response could have avoided the initial crash between the Cruise AV and the pedestrian entirely. A slightly less quick response could have significantly reduced harm. This would have required the AV to recognize a person suddenly appearing in the travel lane in front of it, which it was not able to do. "Unavoidable" is not strictly true.
  • The Cruise AV accelerated toward a pedestrian it knew was directly in front of the AV in a crosswalk, apparently because its end-to-end neural network predicted the pedestrian would be out of the way by the time it got there. That additional speed contributed to the tight reaction time.
  • The Cruise AV tracked the pedestrian through the collision with the adjacent Nissan. It continued accelerating even though a pedestrian had just been hit in an adjacent lane, additionally reducing available reaction time.
  • The Cruise vehicle had the legs of the trapped pedestrian in camera view during the entirety of the dragging event. In other words it had a sensor that saw the entrapment well enough that post-crash analysis confirmed that view. But that available sensor data did not result in a detection. Rather, the vehicle eventually stopped because of the resistance and excessive wheel spin caused by a rear wheel running up and over the pedestrian's legs and potentially other impediments to movement that might have been caused by the pedestrian trapped under the vehicle.
  • The Cruise vehicle definitely knew it had hit something, but decided to initiate a pull-over maneuver and restart motion for <reasons>. Exponent states: "an alert and attentive human driver would be aware that an impact of some sort had occurred and would not have continued driving without further investigating the situation"

Details follow:

[Graph excerpt from page 83 of the Exponent report]

I'll start by saying it is good to see that Cruise released the 3rd party reports instead of keeping them secret. Sadly, the technical report is heavily redacted, but here is what I can see from what I have to work with.

Both the main report and the attachment are masterpieces of saying the most favorable thing possible given a situation that reflects poorly on the client sponsoring the project. I leave analysis of the lawyer-written portion to others. Here I focus on the technical portion written by Exponent. (Page numbers are from the Exponent report, which starts at page 1 of the appendix.) 

This is not an exercise in blaming Cruise technical workers for what happened. They were clearly pressured by those at the top to put technology into service before it was fully ripe. Rather, this is an exercise in showing how a superficially objective technical report can severely skew the narrative. Perhaps some will say this analysis pushes the other way. But what the reader should not do is swallow the Cruise-sponsored narrative whole. Doing that will not serve the much-needed imperative to get Cruise on a track to safe deployment of this technology.

The Exponent report is written in what looks like an expert witness style for use in any potential court case. Exponent does a lot of that kind of work for car companies, and the "tell" is the phrase: "The findings presented herein are made to a reasonable degree of engineering and scientific certainty." (pg. 13) This blog post is an informal look based on limited time (nobody paid me to do this post -- but I can assure you Exponent gets paid plenty for their work). Since the report is heavily redacted, I might well change my thoughts based on further information or further consideration of the information already available. Nonetheless, here is what I got out of what I could see. I welcome any factual corrections.

  • Despite a previous statement by Cruise that Exponent's scope would be expanded to recommending safety engineering and process improvements, Exponent explicitly stated that these topics were out of scope. Perhaps that is a separate effort, but the topic is out of scope for both this report and the Quinn Emanuel main report.
    • Page 13: "A review of Cruise's overall safety systems, culture, and technology is beyond the scope of this investigation."
  • As documented by Exponent it seems the pedestrian was hurrying across an intersection right after the light had changed.
    • Page 44, table 6: 
      • Light changes at -10.0 seconds
      • Pedestrian visually appears to enter crosswalk at -7.9 seconds
      • 2.1 seconds elapsed between those events, plus any dwell time between opposing light changes
      • Both the Cruise AV and the adjacent vehicle had started moving at this point from the far side of the intersection.
  • The Cruise AV accelerated directly toward the pedestrian while that pedestrian was in a crosswalk in its own travel lane.
    • Page 83 figure 62: AV acceleration is positive between the times the pedestrian enters and exits the Cruise AV's lane. Speed increases from about 5 mph to about 13 mph during that time (visual approximate estimate from graph).
    • California Rules of the Road have an explicit requirement that a vehicle in this situation reduce speed:
      • "The driver of a vehicle approaching a pedestrian within any marked or unmarked crosswalk shall exercise all due care and shall reduce the speed of the vehicle or take any other action relating to the operation of the vehicle as necessary to safeguard the safety of the pedestrian." (21950(c))
  • Exponent both says that the pedestrian crash "may not have been avoidable" and that the vehicle could have avoided the crash. The word "may" is doing some really heavy lifting here. In fact, Exponent admits that the vehicle could have avoided the crash with a computer-speed response to a pedestrian in its path (the kind we have been promised every time an AV advocate says that computers can react faster than people to problems; see the stopping-distance sketch after this list):
    • Page 16: "Calculations of potential AV stopping distance indicate that a collision of the AV with the pedestrian may not have been avoidable, even if the ADS had reacted to the collision between the Nissan and the pedestrian."
    • Page 66: "Accounting for brake system latency, the system would have needed to initiate a brake request no later than 0.78 seconds prior to AV-pedestrian contact in order to completely avoid contact. At this relative time, the pedestrian had just fallen into the AV’s travel lane, the AV was traveling at approximately 18.4 mph and the AV was approximately 6.55 m (21.5 ft) from the point at which AV-pedestrian contact occurred. It is noteworthy that a hypothetical brake activation occurring after this time (and prior to when the AV initiated braking at 0.25 s) would have potentially mitigated the severity of the initial collision between the AV and the pedestrian."
  • The Cruise vehicle maintained tracking of the pedestrian until almost a second after the impact with the adjacent vehicle. So it had the opportunity to detect that something very bad was happening -- a pedestrian hit by a vehicle a few feet from its lane -- but it did not react to that available information.
    • Page 15: "As evidenced by the video and sensor data, the classification and tracking of the pedestrian became intermittent within 1.0 s after the initial contact between the pedestrian and the Nissan until the last correct object classification occurred at approximately 0.3 s prior to the collision between the AV and the pedestrian. This intermittent classification and tracking of the pedestrian led to an unknown object being detected but not accurately tracked by the automated driving system (ADS) of the AV and the AV detected occupied space in front of the AV."
    • Also see Table 6 on page 44:  0.9 seconds between Nissan contact and pedestrian track ID being dropped. Potentially much of that 0.9 seconds was waiting for tracking to recapture the pedestrian during an interval of missing detections.
  • The AV was accelerating during the entire event, including speeding up from 17.9 mph to 19.1 mph during the time the pedestrian was being struck by the adjacent Nissan.
    • Page 15: "At this separation time, the AV was traveling at a speed of approximately 17.9 mph and was approximately one car length behind the Nissan in the adjacent right lane."
    • Page 15: "This deceleration resulted in a vehicle speed reduction from approximately 19.1 mph, prior to the onset of braking, to approximately 18.6 mph at the time of impact with the pedestrian."
  • Exponent admits that lack of anticipation of a problem contributed to the mishap. It then attempts to excuse this by saying a human driver could not react to the pedestrian strike -- but we were promised robots would be better than humans.
    • Page 17: "The AV’s lack of anticipation of a potential future incursion of the pedestrian into its travel lane was a contributing factor to this incident. Reasonable human drivers would face challenges reacting to the pedestrian being projected into their lane of travel and would likely not have been able to avoid the collision under similar circumstances. This difficulty could be due to violations of expectancy, glare, or A-pillar obstruction, or a combination of these, as well as to a failure to predict the collision of the Nissan with the pedestrian in the adjacent lane and/or the resulting redirection of the pedestrian into their lane of travel. Moreover, reasonable human drivers would not likely have had adequate time to avoid the collision once the pedestrian was struck by the Nissan."
  • In fact, the AV did not realize it was hitting the pedestrian. Rather, it noticed something ("occupied space") was in front of/beside it, which apparently it does treat as a Vulnerable Road User. It did not brake until 1/4 second before impact. The vehicle decelerated from 19.1 mph to 18.6 mph before the strike. It did in fact have a lidar observation of the pedestrian's leg, but did not classify this as a pedestrian. (A pedestrian prone in the travel lane after having fallen, tripped, been shoved, etc. is an eminently foreseeable road hazard.) Any statement about initiating aggressive braking before impact is a bit overstated. To be sure, it takes a while to apply the brakes due to physical constraints. It is more a case of better late than never, with aggressive braking initiated before impact but actually taking place after impact.
    • Page 15: "The ADS started sending steering and braking commands to the vehicle at approximately 0.25 s prior to the collision between the AV and the pedestrian due to the detection of occupied space in front of the AV. Consequently, just prior to the collision with the pedestrian, the AV’s heading momentarily changed rightward, and the vehicle began decelerating. This deceleration resulted in a vehicle speed reduction from approximately 19.1 mph, prior to the onset of braking, to approximately 18.6 mph at the time of impact with the pedestrian."
    • Page 16: "Only the pedestrian’s raised leg, which was bent up and out toward the adjacent lane, was in view of these lidar sensors immediately prior to collision."
    • Page 83 figure 62: braking force at best -0.4g or -0.5g at time of impact, spiking down to perhaps -1.2g right after impact.
  • Exponent gives <reasons> why the dragging occurred, but they admit that a human driver would not have made the mistake of dragging a pedestrian under the vehicle:
    • page 17: "After the AV contacted the pedestrian, an alert and attentive human driver would be aware that an impact of some sort had occurred and would not have continued driving without further investigating the situation."
  • Exponent confirms that the pedestrian was dragged approximately 20 feet at speeds of up to 7.7 mph. This dragging occurred entirely after the vehicle initially stopped post-impact. This does not include whatever dragging might have occurred during the initial impact and initial stop.
    • Page 16: "During this maneuver, the AV reached a speed of 7.7 mph and traveled approximately 20 feet while dragging the pedestrian before reaching its final rest position."
  • The AV had camera data that it had a pedestrian trapped under it the whole time, but failed to recognize the situation. So any claim that this was all due to lack of a "trapped pedestrian sensor" would not be the whole story.   
    • Page 16: "The pedestrian's feet and lower legs were visible in the wide-angle left side camera view from the time of the collision between the pedestrian and the AV through to the final rest position of the AV."
  • The vehicle did a degraded mode shutdown after the dragging started not because it explicitly recognized it had hit a pedestrian, but rather because it noticed something was wrong with the spinning of the wheel that was over the pedestrian's legs (see the wheel-offset sketch after this list). Note that they say the degraded state initiated an "immediate stop" which took 3 seconds to complete even though the speed was slow.
    • Page 16: " A traction control system event was recorded at approximately 3.8 s after the initial contact between the pedestrian and the AV due to the pedestrian physically resisting the motion of the vehicle. An accumulated offset between the wheel rotation of the left-rear wheel relative to the others from the wheel speed sensors led to the AV entering a degraded state approximately 5.8 s after the initial contact between the pedestrian and the AV. This degraded state caused the vehicle to initiate an immediate stop, and the vehicle reached its final point of rest approximately 8.8 s after the initial contact between the pedestrian and the AV."
  • Exponent claims they know of no failures or faults that contributed to the incident. This is an odd statement -- does that mean dragging a pedestrian is intended functionality? More likely if pressed their expert would say it was due to a "functional insufficiency." But if that is the case, it means they had deployed a vehicle into public use as a commercial service that was not fully capable of operating safely within its intended Operational Design Domain. In other words they seem to be saying it was not "broken" -- just unsafe.
    • Page 14: "Exponent did not identify any evidence of reported vehicle, sensor, actuator, or computer hardware failures or software faults that could have contributed to the incident."
    • Note that they did not opine there were no failures/faults, but rather that they "did not identify any evidence" -- which is a somewhat different statement since there is no sign they did a source code review, hardware diagnostics themselves etc. This limits the scope of their statement more than might be obvious at a quick read.
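
Stopping-distance sketch (referenced in the avoidability bullet above): the page 66 numbers can be sanity-checked with basic kinematics. This is my back-of-envelope Python sketch, not Exponent's calculation; in particular, the 0.2 second brake system latency is an assumed value, not a number from the report.

    MPH_TO_MPS = 0.44704

    # Numbers from page 66 of the Exponent report
    v = 18.4 * MPH_TO_MPS        # ~8.2 m/s, AV speed when avoidance was still possible
    gap = 6.55                   # meters from the AV to the eventual contact point

    latency = 0.2                # seconds of brake system latency (my assumption)

    d_brake = gap - v * latency  # distance remaining once the brakes actually bite
    a = v ** 2 / (2 * d_brake)   # constant deceleration needed to stop in time
    print(round(a / 9.81, 2))    # ~0.7 g -- within dry-pavement braking capability

In other words, roughly 0.7 g of braking starting with 6.55 m to go stops the vehicle just short of the contact point, consistent with Exponent's own statement that a timely brake request would have avoided the collision entirely.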
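
Wheel-offset sketch (referenced in the degraded-state bullet above): the report describes the degraded state as triggered by an accumulated rotation offset of the left-rear wheel relative to the others. Purely as an illustration of that kind of plausibility check -- this is an invented sketch, not Cruise's code, and the data format and threshold are made up:

    def wheel_offset_monitor(samples, threshold=50.0):
        """samples: per-interval wheel rotation increments (lf, rf, lr, rr).
        Returns 'DEGRADED' once the left-rear wheel's accumulated rotation
        deviates too far from the average of the other three wheels."""
        offset = 0.0
        for lf, rf, lr, rr in samples:
            offset += lr - (lf + rf + rr) / 3.0   # accumulate left-rear discrepancy
            if abs(offset) > threshold:
                return "DEGRADED"                 # trigger an immediate stop
        return "OK"

Note that a check like this detects only that something is interfering with one wheel; it says nothing about what that something is. That is consistent with the vehicle stopping because of wheel-spin resistance rather than because it recognized a trapped pedestrian.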

We also learned (page 29): "The Cruise ADS employs an end-to-end deep learning-based prediction model in order to interpret the motion of tracked objects and contextual scene information in order to generate predicted object trajectories."   I have no idea how they might validate an end-to-end model for life critical applications used in a system which, apparently, is happy to accelerate toward a pedestrian in a crosswalk because the black box model says it will work out fine.  (Maybe there is a deterministic safety checker, but it obviously did not work well enough in this mishap.)

Thursday, January 25, 2024

Cruise Pedestrian Dragging Incident Report

Cruise has released materials from an investigation into the pedestrian dragging incident and the related external communication transparency failures of last October. There is a lot to unpack, but remarkably little of it truly has to do with safety. Most of it seems to be an exercise in blaming senior leadership (who have largely been sacked) and explaining why the problems were due more to misguided individuals and poor leadership than to malfeasance.

Cruise Blog post: Original Link / Archive.org link
Quinn Emanuel Investigation report:  Original Link / Archive.org link

The explanations in the report include some pretty remarkable events. Repeated internet disruptions across multiple meetings that, only for the regulatory audience, cut off exactly the pedestrian dragging portion of the videos being shown. Combined with intentionally not showing the disrupted portions to others, including especially the press. And failing to correct mistaken impressions that were favorable to Cruise (while aggressively trying to fix ones that were unfavorable). And leaving the bad part out of public statements. And a paralegal who just didn't realize they should be putting the pedestrian dragging part into a report to NHTSA. Twice. And not talking to external parties who attended meetings to hear their side of the story in preparing this very investigative report when some key points had mixed evidentiary support.

Regardless of what you might think of the report veracity, my safety takeaways are:

  1. These reports do not actually address safety and safety processes. Initially Cruise said those topics would be addressed. However, the Quinn Emanuel report specifically states they are out of scope, and instead limits inquiry to regulatory compliance and media relations issues. The Exponent report is clearly limited to root cause for that specific crash. Perhaps someone is working on an independent look at safety and safety processes, but it is not mentioned anywhere I could find.
  2. A pivotal moment (perhaps this will be the pivotal moment in retrospect) appears in the timeline (page 12) at Oct. 3, 3:21 AM, where the Director of Systems Integrity for Cruise created a video that, at the request of Cruise government affairs, left out the strike of the pedestrian and subsequent dragging. My recollection from listening to at least one journalist is that this video was later shown to journalists. An intentional decision not to tell the whole truth to the public gave regulators ammunition to blame Cruise for holding back, regardless of the ultimate facts of those discussions.
  3. The significantly redacted Exponent report has some new information about the crash dynamics, including the vehicle speed at impact. Those are of interest to understanding the root cause, but have little to do with much bigger safety concerns. The very final sentence of the report is the most telling, and accurately summarizes the big safety concern for this particular mishap: "After the AV contacted the pedestrian, an alert and attentive human driver would be aware that an impact of some sort had occurred and would not have continued driving without further investigating the situation."
Cruise's narrow technical problem boils down to their vehicle continuing to drive even though it knew that some sort of impact had occurred. Their regulatory/governance problem of the moment is the scope and intent of the cover-up.

Cruise's bigger safety problems are out of scope for this report, and from everything I've seen remain unaddressed. That topic will have to be resolved for Cruise to remain viable.

Saturday, January 20, 2024

Vay deployment of tele-driven cars

Vay is deploying tele-driven cars in Las Vegas. At a first pass they have done some safety homework, but open questions remain that they should address to instill public confidence. I spent some time doing background research before posting (especially listening to this All About Autonomy podcast with their founder https://lnkd.in/eYQH72EE ).



Initial summary of their system:
- The operational concept is a rental car ferry service with a tele-driving kit on top of an ordinary vehicle. 100% remotely driven between rentals, and a regular human driver during a rental period.
- Telecom disruptions result in the vehicle doing a safing response of some sort, details unspecified.
- They claim conformance to ISO 26262 (functional safety), ISO 21448 (SOTIF), and ISO 21434 (security) with TUV certification. This is some good news for teleoperation. I would like to know whether the ISO 26262 conformance includes software (companies often play a game and just do hardware; I don't know the case for Vay). It also does not address the full scope of an autonomous safing response when one is needed.
- They mention, but do not claim conformance to, UL 4600. That standard applies both when driving and during an autonomous safing response.
- This is an SAE Level 4 vehicle because it must completely control itself during a safing response. (This is another reason SAE J3016 levels are unsuitable for regulation, but that is what states are using nonetheless.)
- Limited to slow speeds. They claim they are characterizing remote connectivity lag and modulating maximum speed accordingly, etc. (see the sketch of that tradeoff below).
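
To make the lag-versus-speed tradeoff concrete, here is a minimal sketch of the kind of calculation involved. This is my illustration of the general idea, not Vay's actual method; the latency, deceleration, and clear-distance numbers are invented for the example. The idea is that the car travels blind during the round-trip latency window, so that distance plus braking distance must fit within the assured clear distance ahead.

    import math

    def max_safe_speed(clear_dist_m, latency_s, decel_mps2=6.0):
        # Solve v*latency + v^2/(2a) = clear_dist for v (take the positive root)
        a = decel_mps2
        return -a * latency_s + math.sqrt((a * latency_s) ** 2 + 2 * a * clear_dist_m)

    MPS_TO_MPH = 2.23694
    print(max_safe_speed(20.0, 0.3) * MPS_TO_MPH)   # ~31 mph with 0.3 s round-trip lag
    print(max_safe_speed(20.0, 1.5) * MPS_TO_MPH)   # ~20 mph if lag degrades to 1.5 s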

Initial Thoughts:
- There is no safety report and no VSSA (Voluntary Safety Self-Assessment) evident on their company web page. They would be well advised to be more public about safety before the first crash.
- The timing, dynamics, and human/machine interface issues of remote driving are potentially very problematic. They say they have a handle on it. Because of opacity on safety (typical of the industry) we either have to take their word for it -- or not. But at least they acknowledge it is a concern, and that is better than the obvious cluelessness of some previous companies on this topic. Driver training will be huge.
- I'll bet at this point the failure response to communications loss is an in-lanes stop. They say they have redundancy but we'll just have to see how this works out for both safety and stranded cars. We won't really know until they try to scale up.
- I'd like to know what will happen if there is a catastrophic large-scale communication disruption -- are all the cars stranded in travel lanes impeding emergency response vehicles? This is an important question for deployment in cities prone to earthquakes, for example.

Takeaways:
- My take is that safe remote driving is a strong claim, requiring strong evidence.
- There is a risk any crash will be blamed on the remote driver regardless of any contributing technology failure. We'll have to see how that turns out.

Friday, December 29, 2023

My Automated Vehicle Safety Prediction for 2024

My 2024 AV industry prediction starts with a slide show with a sampling of the many fails for automated vehicles in 2023 (primarily Cruise, Waymo, Tesla). Yes, some hopeful progress in many regards. But so very many fails.



At a higher level, the real event was a catastrophic failure of the industry's strategy of relentlessly shouting as loud as it can "Hey, get off our case, we're busy saving lives here!" The industry's lobbyists and spin doctors can only get so much mileage out of that strategy, and it turns out it is far less (by a factor of 10+) than the miles required to get statistical validity on a claim of reducing fatality rates.

My big prediction for 2024 is that the industry (if it is to succeed) will adopt a more enlightened strategy for both deployment criteria and messaging. Sure, on a technical basis, it indeed needs to be safer than comparable human driver outcomes.

But on a public-facing basis it needs to optimize for fewer embarrassments like the 30 photos-with-stories in this slide show. The whole industry needs to pivot into this priority. The Cruise debacle of the last few months proved (once again; remember Uber ATG?) that it only takes one company doing one ill-advised thing to hurt the entire industry.

I guess the previous plan was they would be "done" faster than people could get upset about the growing pains. Fait accompli. That was predictably incorrect. Time for a new plan.

Saturday, December 23, 2023

2023: Year In Review

2023 was a busy year! Here is a list of my presentations, podcasts, and formal publications from 2023 in case you missed any and want to catch up.


Presentations:

Podcasts:
Publications:

Tuesday, December 19, 2023

Take Tesla safety claims with about a pound of salt

This morning when giving a talk to a group of automotive safety engineers I was once again asked what I thought of Tesla claims that they are safer than all the other vehicles. Since I have not heard that discussed in a while, it bears repeating why that claim should be taken with many pounds of salt (i.e., it seems to be marketing puffery).

(1) Crash testing is not the only predictor of real-world outcomes. And I'd prefer my automated vehicle to not crash in the first place, thank you!  (Crash tests have been historically helpful to mature the industry, but have become outdated in the US: https://www.consumerreports.org/car-safety/federal-car-crash-testing-needs-major-overhaul-safety-advocates-say/ )

(2) All data I've seen to date, when normalized (see Noah Goodall's paper: https://engrxiv.org/preprint/view/1973 ), suggests that any steering automation safety gains (e.g., Level 2 Autopilot) are approximately negated by driver complacency, with Autopilot on being slightly worse than Autopilot off: "analysis showed that controlling for driver age would increase reported crash rates by 11%" with Autopilot turned on vs. lower crash rates with Autopilot turned off on the same vehicle fleet.

(3) Any true safety improvement for Tesla is good to have, but is much more likely due to comparison against an "average" vehicle (12 years old in the US), which is much less safe than any recent high-end vehicle regardless of manufacturer, and probably not driven on roads as safe, on average, as the roads where Teslas are more popular. (Also, see Noah Goodall's point that Tesla omits slow-speed crashes under 20 kph whereas the comparison data includes those. If you're not counting all the crashes, it should not be a huge surprise that your number is lower than average -- especially if AEB-type features are helping mitigate crash speeds to below 20 kph.) If there is a hero here it is AEB, not Autopilot.

(4) If you look at IIHS insurance data, Tesla does not rate in the top 10 in any category. So in practical outcomes they are not anywhere near number one. When I did the comparison last year I found that a new Tesla was about the same as my 10+ year old Volvo based on insurance outcomes. (I have since sold that Volvo to get a vehicle with newer safety features.) That suggests their safety outcomes are years behind the market leaders in safety. However, it is important to realize that insurance outcomes are limited because they incorporate "blame" into the equation, so they provide a partial picture. IIHS link: https://www.iihs.org/ratings/insurance-losses-by-make-and-model

(5) The NHTSA report claiming autopilot was safer was thoroughly debunked: https://www.thedrive.com/tech/26455/nhtsas-flawed-autopilot-safety-study-unmasked 

(6) Studies that show ADAS features improve safety are valid -- but the definition of ADAS they use does not include sustained steering of the type involved with Autopilot. Autopilot and FSD are not actually ADAS, so ADAS studies do not prove they have a safety benefit. Yep, AEB is great stuff. And Teslas have AEB, which likely provides a safety benefit (same as all the other new high-end cars). But Autopilot is not ADAS, and is not AEB.
For example:  https://www.iihs.org/topics/advanced-driver-assistance  lists ADAS features but makes it clear that partial driving automation (e.g., Autopilot) is not a safety feature: "While partial driving automation may be convenient, we don’t know if it improves safety." Followed by a lot of detail about the issues, with citations.

(7) A 2023 study by LendingTree showed that Tesla was the absolute worst for crashes per 1,000 drivers. There are confounders to be sure, but no reason to believe this does not reflect a higher crash rate for Teslas than other vehicles even on a level playing field: https://www.lendingtree.com/insurance/brand-incidents-study/

(8) In December 2023, more than two million Teslas were recalled due to safety issues with Autopilot (NHTSA Recall 23V-838). The recall document specifically noted a need for improved driver monitoring and enforcement of operation only on intended roads (for practical purposes, only limited-access freeways with no cross-traffic). The fact that a Part 573 Safety Recall Report was issued means by definition that the vehicles had been operating with a safety defect for many years (the first date in the chronology is August 13, 2021, at which time there had been eleven incidents of concern). Initial follow-up investigations by the press indicate the problems were not substantively resolved. (Note that per NHTSA terminology, a "recall" is the administrative process of documenting a safety defect, not the actual fix. The over-the-air update is the "remedy". OTA updates do not in any way make such a remedy not a "recall", even if people find the terminology less than obvious.)

(updated 1/1/2024)



Sunday, December 17, 2023

Social vs. Interpersonal Trust and AV Safety

Bruce Schneier has written a thought-provoking piece covering the social fabric vs. human behaviors aspects of trust. Just like "safety," the word "trust" means different things in different contexts, and those differences matter in a very pressing way right now to the larger topic of AI.


Exactly per Bruce's article, the car companies have guided the public discourse to be about interpersonal trust. They want us to trust their computer drivers as if they were super-human people driving cars, when the computer drivers are in fact not people, do not have a moral code, and do not fear jail as a consequence of reckless behavior. (And as news stories constantly remind us, they have a long way to go on the super-human driving skills part too.)

While not specifically about self-driving cars, his theme is about how companies exploit our tendency to make category errors between interpersonal trust and social trust. Interpersonal trust is, for example, the other car will try as hard as it can to avoid hitting me because the other driver is behaving competently or perhaps because that driver has some sort of personal connection to me as a member of my community. Social trust is, for example, the company who designed that car has strict regulatory requirements and a duty of care for safety, both of which incentivize them to be completely sure about acceptable safety before they start to scale up their fleet. Sadly, that social trust framework for computer drivers is weak to the point of being more apparition than reality. (For human drivers the social trust framework involves jail time and license points, neither of which currently apply to computer drivers.)

The Cruise debacle highlights, once again (see also Tesla and Uber ATG, not to mention conventional automotive scandals), that the real issue is the weak framework for creating social trust of the corporations that build the cars. That lack of a framework is a direct result of the corporations' lobbying, messaging, regulatory capture efforts, and other actions.

Interpersonal trust doesn't scale. Social trust is the tool our society uses to permit scaling goods, services, and benefits. Despite the compelling localized incentives for corporations to game social trust for their own benefit, having the entire industry succeed spectacularly at doing so invites long-term harm to the industry itself, as well as all those who do not actually get the promised benefits. We're seeing that process play out now for the vehicle automation industry.

There is no perfect solution here -- it is a balance. But at least right now, the trust situation is way off balance for vehicle automation technology. Historically it has taken a horrific front-page news mass casualty event to restore balance for safety regulations. Even then, to really foster change it needs to involve someone "important" or an especially vulnerable and protection-worthy group.

Industry can still change if it wants to. We'll have to see how it plays out for this technology.

The piece you should read is here:  https://www.belfercenter.org/publication/ai-and-trust

Monday, December 4, 2023

Video: AV Safety Lessons To Be Learned from 2023 experiences

Here is a retrospective video of robotaxi lessons learned in 2023:

  • What happened to robotaxis in 2023 in San Francisco.
  • The Cruise crash and related events.
  • Lessons the industry needs to learn to take a more expansive view of safety/acceptability:
    • Not just statistically better than a human driver
    • Avoid negligent driving behavior
    • Avoid risk transfer to vulnerable populations
    • Fine-grain regulatory risk management
    • Conform to industry safety standards
    • Address ethical & equity concerns
    • Build sustainable trust
Preprint with more detail about these lessons here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4634179

Archive.org alternate video source: https://archive.org/details/l-141-2023-12-av-safety



Saturday, October 28, 2023

A Snapshot of Cruise Crash Reporting Transparency: July & August 2023

A comparison of California Cruise robotaxi crash reports between the California DMV database and the NHTSA SGO database reveals significant discrepancies in reporting. 31 crashes reported to NHTSA do not appear in the California DMV database, including seven unreported injury crashes. Of special note, the Cruise crash with a fire truck that caused serious injury to an occupant of the Cruise robotaxi does not appear as a California DMV crash report. To be sure, Cruise might not be legally required to file these reports, but the situation reveals an apparent lack of transparency.

Comparison Results:

39 crashes were identified across both databases for the date-of-crash months of July 2023 through August 2023. The comparison was performed on October 28, 2023, so there was adequate time for all such crashes to have been reported.

Each database was missing one or more crashes found in the other database:

  • 39 crashes in the NHTSA database, including 8 also found in the CA DMV database.
  • 31 crashes reported to NHTSA were not in the California DMV database
  • The California DMV database was in particular missing SEVEN (7) crash reports which indicated an injury had occurred or might have occurred.
  1. NHTSA 30412-5968: Other car ran a red light striking Cruise; passenger of other vehicle treated on scene for minor injury.
  2. NHTSA 30412-5982: Other car ran into Cruise; passenger of other vehicle transported by EMS for further evaluation. Possible injury ("unknown" injury status).
  3. NHTSA 30412-6144: Cruise crash with fire truck; serious injury reported to passenger
  4. NHTSA 30412-6145: Cruise reversing contacted cyclist; minor injury reported to cyclist
  5. NHTSA 30412-6167: Cruise rear-ended after braking; minor injury reported to other vehicle driver
  6. NHTSA 30412-6175: Cruise hit pedestrian crossing in front of it (said to be crossing against light); moderate injury to pedestrian
  7. NHTSA 30412-6270: Cruise hit from behind after stopping to yield to pedestrian in crosswalk; minor injury to passengers inside AV
Two crashes involved non-motorists:
  • 30412-6145 with a cyclist
  • 30412-6175 with a pedestrian

Why the Disparity?

The thing that makes this complicated is that CA DMV does not require reporting crashes for "deployment" operation -- just for "testing" operation. Apparently when the regulations were written they did not anticipate that companies would "deploy" immature technology, but that is exactly what has happened.

It is difficult from available information to check how Cruise is determining which crashes must be reported to California DMV (testing) and which do not have to be reported (deployment). In practice it might boil down to a management decision about which ones they want to report, although there might be some less arbitrary internal decision criterion in use.

CA DMV should require all companies to provide it with unredacted copies of all NHTSA SGO reports to improve transparency. For the foreseeable future, making a distinction between "testing" and "deployment" with no driver in the vehicle serves no useful purpose, and impairs transparency. If there is no driver it is a deployment, and it should be held to the standards of a production vehicle, including reporting both crashes and driving behavior that puts other road users at undue risk. This is true for all companies, not just Cruise.

Other notes:

  • CA DMV reports have the street names, yet Cruise redacts this same information from reports filed with NHTSA claiming it is "confidential business information." It is difficult to understand how information publicly reported by California can be classified as "confidential."
  • The NHTSA database does not have the date of the crash, although the California database has that information.
  • Crashes considered were for reported incident dates of July & August 2023, considering only uncrewed (no safety driver) operation.
  • It is our understanding that Cruise is not required to report all crashes that occur during deployment to California DMV. So it is possible that these reporting inconsistencies are still in accordance with applicable regulations.
  • All crashes on this spreadsheet in the NHTSA database list the "driver/operator type" as "Remote (Commercial / Test)", so it is not possible to distinguish whether the vehicle was considered in commercial service at the time of the crash.
  • At the time of this posting, the tragic Oct. 2nd severe injury crash in which a Cruise robotaxi dragged a pedestrian trapped under the vehicle has also not been reported, while another crash on Oct. 6th has. There is nothing on the Oct. 6th CA DMV form to indicate whether the reported crash was specific to a testing permit vs. a deployment permit.

Review status:

This data has not been peer reviewed. Corrections/additions/clarifications are welcome to improve accuracy. The data analysis results are included below.

Google Spreadsheet link: https://docs.google.com/spreadsheets/d/1o9WWzMpiuum-QHZk9goY68gnBZuRC1InxUSMa7-h4DU/edit?usp=sharing

Data sources:

Updated 10/30/2023 to incorporate three more crash reports found in a wider search of the SGO database. All CA DMV crash reports have now been identified in the SGO database.






Friday, October 27, 2023

The Cruise Safety Stand-Down -- What Happens Next?

Cruise has announced a fleet-wide safety stand-down. This involves suspending driverless operations in all cities, reverting to operation only with in-vehicle safety drivers.

I'm glad to see this step taken. But it is crucial to realize that this is the first step in what is likely to prove a long journey. The question is, what should happen next?

Loss of public trust is an issue, as they say. And perhaps there was an imminent ban in another state forcing their hand to be proactive. But the core issues almost certainly run deeper than mismanaging disclosure of the details of a tragic mishap and doing damage control with regulatory trust.

The real issues will be found to have their roots in the culture of the company. Earnest, smart employees with the best of intentions can be ineffective at achieving acceptable safety if the corporate culture undermines their official company slogan of "safety first, always."

This is the time to ask the hard questions. The answers might be even harder, but they need to be understood for Cruise to survive long term. It is questionable whether they could survive a ban in another state. But escaping that via a stand-down only to implement a quick fix won't be enough. If we see business as usual restored in the next few weeks, that is almost certainly a shallow fix. It will simply be a matter of time before a future adverse situation happens from which there will be no recovery.

This is the moment for Cruise to decide to lean into safety.

The details are too much for a post like this, but the topics alone indicate the scope of what has to be considered:
  • Safety engineering -- Have they effectively identified and mitigated risks?
  • Operational safety -- Safety procedures, inspections, maintenance, management of Operational Design Domain limits responsive to known issues, field data feedback, etc. This includes ensuring their Safety Management System (SMS) is effective.
  • System engineering -- Do the different pieces work together effectively? This includes everything from software in 3rd-party components, to vehicle integration, to the ability of remote operators to effectively manage gaps in capabilities ... and more.
  • Public messaging and regulatory interface -- Building genuine trust, starting with more transparency. Stop the blame game; accept accountability. Own it.
  • Investor expectations -- Determine a scaling plan that is sustainable, and figure out how to fund it in the likely case it takes longer than was previously promised.
  • Definition of acceptable safety -- More concrete than seeing how it turns out based on crash data, with continual measurement of predictive metrics.
  • Safety culture -- Which underlies all of the above, and needs to start at the top.
And I'm sure there are more; this is just a start.

Near-term, the point of a safety stand-down is to stabilize a situation during uncertainty. The even more important part comes next: the plan to move forward. It will take weeks to take stock and create a plan, with the first days simply used to organize how that is going to happen. And months to execute that plan. Fortunately for Cruise there is an existing playbook that can be adapted from Uber ATG's experience with their testing fatality in 2018. Cruise should already have someone digging into that for initial ideas.

An NTSB-style investigation into this mishap could be productive. I think such an investigation would be likely to bring to light new issues that will be a challenge to the whole industry, involving expectations for defensive driving behaviors and post-crash safety. If NTSB is unable to take that on, Cruise should find an independent organization that can do something close. But such an investigation is not the fix, and cultural improvements at Cruise should not wait for one to conclude. However, an independent investigation can be the focal point for deeper understanding of the problems that need to be addressed.

------------------------------------------------

Philip Koopman is a professor at Carnegie Mellon University in Pittsburgh, Pennsylvania, USA, who has been working on self-driving car safety for more than 25 years. https://users.ece.cmu.edu/~koopman/