This long legal transcript comes from the Bookout v Toyota case.
Recursion has long been known to be a potentially risky programming construct: not only can it be hard for programmers to understand, it also carries a number of risks, notably stack overflow. While less of a problem on modern desktop machines, stack space can be severely constrained in embedded CPU systems. This is why recursion has been banned for decades in automotive ECUs and similar safety-critical systems.
In this case, an expert on embedded system design talks about how a lack of understanding of recursion may have allowed the Toyota Camry ECU to malfunction, causing the electronically controlled throttle to stay open.
Bookout vs Toyota
Cliffs
1. Toyota ECU runs a special real-time OS, with a number (exact number redacted) of processes that control various engine parameters.
2. There are specific processes for controlling parameters such as combustion and spark, and a general-purpose "everything including the kitchen-sink" process for a variety of miscellaneous tasks.
3. The "kitchen-sink" process has control of the electronic throttle angle computation, as well as managing cruise-control, various fail-safes, brake-pedal accelerator override, ECU DTC/error logging, failsafe mode enable, etc.
4. The CalculateRequiredThrottlePosition function is incredibly complex, about 1500 lines, with over 146 branches.
5. A large number of functions in the "kitchen-sink" process made heavy use of recursion, and Toyota had not taken account of this when calculating the amount of stack space required.
6. Stack usage by the "kitchen-sink" process averaged 94% when analysed. Toyota had designed for worst-case usage to be under 41%.
7. Stack overflow in the "kitchen-sink" process would cause the OS to kill the process, and as this process had the watchdog code and error logging code, it would not be restarted, nor would any error codes be logged. When the process was killed, the throttle motor would remain at the position it was in prior to the process crash. As the combustion/fuel injection and ignition processes (and throttle plate motor control process) were separate, they would continue to run as normal.
Edit:
8. There is a second "monitor" CPU that is intended to monitor the main ECU CPU. One thing it does is query the main ECU's RAM for brake pedal status and compare it with its own hardware connection to the brake pedal switch. If there is a mismatch (indicating that the main ECU's "kitchen-sink" process has crashed and is not monitoring the brake pedal), it will override the main ECU's connection to the throttle motor and drive it closed, causing the engine to stall. However, this failsafe would only work if the brake pedal was released for several seconds, prior to being held for several seconds. If the brake pedal was pressed at the time the "kitchen-sink" process crashed, the ping (which was a read of memory) would reveal "brake pressed", which would match the hardware "brake pressed" signal, and the failsafe would not be enabled. If the driver then released the pedal for several seconds, a mismatch would be detected and the engine stalled. If the driver released it for only a fraction of a second, the chance of the monitor CPU catching the mismatch was low. Dyno testing with a Camry connected to a debugger showed that the engine would continue to accelerate for as long as the brake pedal was depressed following a debugger-induced "kitchen-sink" process crash. Once the brake pedal was released for 5 seconds, the engine would stall.