Computer-Based System Safety Essential Reading List

Here is a quick-start resource guide for computer-based system safety literacy. If you work on computer-based system safety and you aren't familiar with the case studies below, you really need to read them. (Not just safety engineers -- everyone!)


Ariane 5 flight 501 launch failure

Essential Case Studies: Because those who have not read history are doomed to repeat it.
Additional Case Studies:
Recommended Supplemental Materials
Other Case Studies: (Still important, and should be read by anyone digging deep into safety. But less specifically related to computer-based system risks.)
Other Related Topics:
Resources for deeper engagement:
  • Systems Engineering Body of Knowledge on Safety Engineering (SEBoK)
  • NASA Safety library (index | Safety Guidebook)
  • NASA Fault Management Handbook (NASA)
  • NASA Real System Failure story collection by Kevin Driscoll  (Home | slides)
  • FAA System Safety Handbook (FAA)
    • FAA AC 25.1309-1A - System Design and Analysis (FAA)
  • USAF System Safety Handbook (USAF)
  • List of NHTSA software-related automotive recalls (Blog)
  • List of accidents and incidents involving commercial aircraft (Wikipedia)
  • Safety of Work podcast (Rae & Provan) (Podcast)
  • Cautionary Tales podcast (Harford) (Podcast)

Advanced Specialty Topics/Research & Papers I Personally Recommend:

    NOTES:
    • While Wikipedia is not always an authoritative source, for these sorts of events it tends to present useful summary descriptions, so I have included those links. Usually the reference pointers in the Wikipedia article form a useful starting point for diving deeper.  
    • Links have a way of going bad. If a link no longer works, please let me know, but also don't forget to try it at the Wayback Machine for instant gratification.
    • If you think something important is missing, let me know!
    Last update 13 January 2026

    6 comments:

    1. Leveson, "Engineering A Safer World", PDF download from https://mitpress.mit.edu/books/engineering-safer-world.

      Neumann, "Computer-Related Risks" (based on the Risks Digest archives as of 1994).

    2. Excellent list. While I've read most of these, seeing this list there's a few I want to add to my reading list.

    3. The Eschede train accident should be included under Other Mishap Case Studies, not only because it highlights the importance of proper maintenance procedures, but also because of the legal aftermath, in which officials and engineers were charged with manslaughter.
      https://en.wikipedia.org/wiki/Eschede_derailment

    4. Thanks for the list - just found it today. I'm both glad I already knew quite a few and that there are more to read about!

      The 2004 Spirit memory issue is not in the Wikipedia article anymore. Perhaps replace the reference link with https://llis.nasa.gov/lesson/1483?

      Replies
      1. Thanks for the feedback. I'm currently working through the list to identify links like that. Sadly, NASA links are decaying quickly these days, so I'm also updating those as needed.

