ComputingCases.org

System Safety

Machine-Based Safety Mechanisms

[Figure: A cross-section drawing of a Therac-25 facility, including technological devices and electronic switches.]

As the diagram indicates, the Therac-25 linear accelerator was isolated in a heavily shielded room. This shielding protected the operator (who might perform as many as 30 treatments in one day) from the low-level radiation that might scatter from the machine. In addition, the machine itself was shielded in many ways to reduce the amount of scattered radiation it would emit. AECL was particularly proud of this innovation in machine shielding, and even published a paper in a technical journal on its design.

Software-Based Safety Mechanisms

Previous versions of Therac (Therac-6 and Therac-20) used software to make the hand operation of the machine more convenient. But Therac-25 was completely software controlled. In addition, as safety checking was made the job of the software, many of the hardware safety interlocks were removed. Thus, the safe operation of the machine became almost completely the responsibility of the software.

For example, the intensity of the beam was monitored by ion chambers placed on the turntable. There were two different ion chambers, one located beneath the scanning magnets that spread the electron beam and one located beneath the foil that turned a high-intensity electron beam into X-rays. These chambers monitored the amount of radiation that was being delivered to the patient in each mode (electron beam or X-ray), and each could measure the beam intensity only within the range expected from the beam with which it was paired. If a chamber detected a dose that was different from that assigned to the patient, the software immediately suspended treatment.

If the difference was minor, or if the measured beam intensity was barely detectable, the software might allow the operator to retry the treatment up to 5 times before shutting down completely. This retry facility was added to the software because it was a regular occurrence for the beam to be slightly "out of tune" and for the software to suspend treatment.

If the beam intensity was detected to be quite different from the assigned intensity, the software shut the machine down completely and required all the treatment parameters to be entered again.
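The suspend, retry, and shutdown behavior described above can be sketched roughly as follows. This is only an illustration of the decision rule as this section describes it, not AECL's code. The function name, the `Action` values, and both thresholds (`MINOR_TOLERANCE`, `LOW_READING`) are assumptions made for the example; only the limit of 5 retries comes from the text.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"   # measured dose matches the assigned dose
    SUSPEND = "suspend"     # treatment paused; operator may retry
    SHUTDOWN = "shutdown"   # all treatment parameters must be entered again


MAX_RETRIES = 5          # from the text: up to 5 retries
MINOR_TOLERANCE = 0.10   # ASSUMPTION: relative deviation still counted as "minor"
LOW_READING = 0.05       # ASSUMPTION: fraction of assigned dose counted as "barely detectable"


def check_dose(assigned, measured, retries_used):
    """Compare the ion chamber reading with the assigned dose and pick an action."""
    deviation = abs(measured - assigned) / assigned
    if deviation == 0:
        return Action.CONTINUE

    minor = deviation <= MINOR_TOLERANCE
    barely_detectable = measured <= LOW_READING * assigned

    # Minor or very low readings suspend treatment and allow a retry,
    # until the retry budget is used up.
    if (minor or barely_detectable) and retries_used < MAX_RETRIES:
        return Action.SUSPEND

    # Large deviations, or too many retries, shut the machine down completely.
    return Action.SHUTDOWN
```

Under these assumed thresholds, `check_dose(100, 105, 0)` suspends treatment and allows a retry, while `check_dose(100, 200, 0)` shuts the machine down.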

Safety Analysis of the System

In 1983, just after AECL made the Therac-25 commercially available, it performed a safety analysis of the machine using Fault Tree Analysis. This involves calculating the probabilities of the occurrence of various hazards (e.g. an overdose) by specifying which causes of the hazard must jointly occur in order to produce the hazard.

In order for this analysis to work as a safety analysis, one must first specify the hazards (not always easy), and then be able to specify all the possible causal sequences in the system that could produce them. It is certainly a useful exercise, since it allows easy identification of single-point-of-failure items and the identification of items whose failure can produce the hazard in multiple ways. Concentrating on items like these is a good way to begin reducing the probabilities of a hazard occurring.
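To make the idea of single-point-of-failure items concrete, here is a minimal sketch that lists the minimal cut sets of a small fault tree, that is, the smallest combinations of basic events that together produce the hazard. The tree below is hypothetical and is not AECL's actual fault tree. The tuple representation, the `cut_sets` function, and every event name other than "computer selects wrong mode" are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def cut_sets(node):
    """Return the minimal cut sets of a fault tree as a set of frozensets."""
    if isinstance(node, str):          # a basic event
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any one child's cut set is enough to produce the hazard.
        result = set().union(*child_sets)
    elif gate == "AND":
        # One cut set from every child must occur together.
        result = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unknown gate: {gate}")
    # Drop any set that strictly contains a smaller cut set.
    return {s for s in result if not any(t < s for t in result)}


# HYPOTHETICAL tree for the hazard "overdose".
tree = ("OR", [
    "computer selects wrong mode",
    ("AND", ["operator entry error", "hardware interlock fails"]),
    ("AND", ["hardware interlock fails", "beam out of tune"]),
])

minimal = cut_sets(tree)

# A cut set of size one is a single point of failure.
single_points = sorted(next(iter(s)) for s in minimal if len(s) == 1)

# Events that appear in several cut sets can produce the hazard in multiple ways.
counts = Counter(event for s in minimal for event in s)
repeated = sorted(event for event, n in counts.items() if n > 1)
```

In this made-up tree, `single_points` contains only "computer selects wrong mode", and `repeated` contains only "hardware interlock fails". These are the two kinds of items the analysis is meant to bring to attention.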

In order to be useful, a Fault Tree Analysis needs to specify all the likely events that could contribute to producing a hazard. In addition, if one knows the specific probabilities of all the contributing events, one can produce a reasonable estimate of the probability of the hazard occurring.

Since much of the software had been taken from the Therac-6 and Therac-20 systems, and since these software systems had been running for many years without detectable errors, the analysts assumed there were no design problems in the software. The analysts did consider software failures like "computer selects wrong mode" but assigned them probabilities like 4 × 10⁻⁹. Probabilities of this sort were probably assigned based on the remote possibility of random errors produced by things like electromagnetic noise. They do not take into account the possibility of design flaws in the software.
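A small sketch can show how strongly a fault tree estimate depends on the probability assigned to the software. It uses the standard gate formulas, which hold only if the basic events are independent and each appears once in the tree. The tree, the function name, and every probability except the software figure quoted above are hypothetical and chosen only for illustration.

```python
import math


def hazard_probability(node, p):
    """Probability of the top event, assuming independent basic events."""
    if isinstance(node, str):                      # a basic event
        return p[node]
    gate, children = node
    probs = [hazard_probability(child, p) for child in children]
    if gate == "AND":                              # all causes must occur jointly
        return math.prod(probs)
    if gate == "OR":                               # at least one cause occurs
        return 1.0 - math.prod(1.0 - q for q in probs)
    raise ValueError(f"unknown gate: {gate}")


# HYPOTHETICAL tree for the hazard "overdose".
tree = ("OR", [
    "computer selects wrong mode",
    ("AND", ["operator entry error", "hardware interlock fails"]),
])

probabilities = {
    "computer selects wrong mode": 4e-9,   # figure quoted in the text
    "operator entry error": 1e-3,          # ASSUMPTION, illustrative only
    "hardware interlock fails": 1e-4,      # ASSUMPTION, illustrative only
}

baseline = hazard_probability(tree, probabilities)

# Same tree, but with the software failure treated as a design flaw that
# shows up far more often than random noise would (ASSUMED value).
with_flaw = hazard_probability(
    tree, {**probabilities, "computer selects wrong mode": 1e-4}
)
```

With these made-up numbers, the estimate moves from about 1 in 10 million to about 1 in 10 thousand. The whole difference comes from a single input that the analysts had assumed to be negligible.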