The automotive industry is going through many changes - demanding challenges to overcome and exciting goals to achieve. Automotive companies are investing tremendous effort, money and talent to successfully manage the growing complexity of new electrical/electronic (E/E) architectures and achieve higher levels of automation. They need to design and develop advanced vehicle systems with safety in mind every step of the way.
Software-defined vehicles are highly complex devices, but devices nonetheless. Trust, by definition, is a firm belief in the reliability, truth, or ability of someone or something, so vehicles must demonstrate a high degree of accuracy and trustworthiness to be considered safe.
Achieving system safety
Indeed, Functional Safety (FuSa) is the most important system property and crucial for public acceptance of highly automated vehicles. It mainly relates to software and electronics and is realized through safety mechanisms that reduce the risk of injury or damage.
Karl Gruber, Functional Safety Expert at TTTech Auto, recently participated in Arm's summit on functional safety and its implications. He spoke about the challenges of new E/E architectures and explained what the industry needs today to pave the way to highly automated vehicles without compromising safety. "The transition to more centralized E/E vehicle architectures brings new challenges for software functional safety. Applications have to share the same hardware system resources (e.g., memory, network, and processor time)," says Karl.
To meet these challenges, modern software platform solutions must:
- Ensure real-time safety in the event of a failure
- Handle the complexity of a heterogeneous environment with mixed-critical applications
- Optimize the resource utilization of embedded applications
New architectures require sharing resources and can compromise safe performance if not handled properly.
The time-aware architecture approach offers many advantages for achieving the required level of accuracy in automated driving (AD) systems. It ensures deterministic system behavior, helps reduce system complexity, and enables synchronous execution of applications without message loss or extreme latencies.
Fail-silent vs fail-operational performance
What does fail-silent mean? By definition, a fail-silent system is a type of system that either provides the correct service or does not provide any service at all (goes silent). There may still be systematic residual failures of the hardware or software design or random residual failures of the hardware, but these are considered acceptable in fail-silent systems because these systems are used only for vehicle functions where the driver can mitigate any failure. The precondition is that the ISO 26262 requirements for the desired ASIL of the hardware and software must be met.
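The fail-silent definition above can be illustrated with a minimal sketch. It assumes a hypothetical `torque_request` function and a simple plausibility check standing in for a real safety mechanism; neither comes from ISO 26262 or from any product. The wrapper either returns a checked result or returns `None`, which represents going silent.

```python
from typing import Callable, Optional


def fail_silent(
    compute: Callable[[float], float],
    check: Callable[[float, float], bool],
) -> Callable[[float], Optional[float]]:
    """Wrap `compute` so it yields a checked result or None (silence)."""

    def guarded(x: float) -> Optional[float]:
        try:
            result = compute(x)
        except Exception:
            return None  # internal error: provide no service at all
        # The check is a stand-in for a real safety mechanism.
        return result if check(x, result) else None

    return guarded


# Hypothetical function: maps pedal position (0..1) to a torque in Nm.
def torque_request(pedal: float) -> float:
    return 300.0 * pedal  # illustrative scaling only


# Hypothetical plausibility check on input and output ranges.
def plausible(pedal: float, torque: float) -> bool:
    return 0.0 <= pedal <= 1.0 and 0.0 <= torque <= 300.0


safe_torque = fail_silent(torque_request, plausible)
```

Calling `safe_torque(0.5)` returns a torque value, while an implausible input such as `safe_torque(1.5)` returns `None`. The caller, here standing in for the driver, is then responsible for mitigating the missing output.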
For upcoming SAE L3/SAE L4 vehicle functions, such as highway pilot, the driver can be out of the loop while the vehicle function is in operational mode and is then unable to mitigate the risk caused by a malfunction. Availability requirements must be considered for this type of system, i.e., safe operation in the event of a failure must be ensured until either the vehicle can independently transition to a safe state or a handover to the driver is possible.
Therefore, such a system must be designed with redundancy across sensors, external input sources, processing, and actuation. Additional failure modes must also be considered, such as dependent failures, which can result in a worst-case scenario where all redundancies required for safe operation in nominal or degraded operation fail at the same time or during the transition to a safe state or the handover to the driver. Additional measures are required to avoid or mitigate dependent failures, e.g., sufficient independence between redundancies or avoidance of common cause faults in hardware and software.
Dependent failures caused by systematic errors in hardware and software design can be avoided by design measures (e.g., diversity) or by ensuring the highest quality in the development process for critical functions or components. Further measures may be required to handle all relevant failure modes in a safe manner.
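To see why dependent failures matter so much for redundant designs, consider a rough sketch based on the widely used beta-factor model for common cause failures. The model choice, the function name, and the numbers are assumptions for illustration only; the article itself does not prescribe any quantitative method.

```python
def dual_channel_failure_probability(p: float, beta: float) -> float:
    """Approximate probability that both redundant channels fail.

    p    -- failure probability of a single channel (assumed equal for both)
    beta -- assumed fraction of channel failures that share a common cause
    """
    independent = ((1.0 - beta) * p) ** 2  # both channels fail independently
    common_cause = beta * p                # one cause takes out both channels
    return independent + common_cause


# Illustrative numbers only.
fully_independent = dual_channel_failure_probability(1e-4, 0.0)
with_common_cause = dual_channel_failure_probability(1e-4, 0.1)
```

With these illustrative values, even a modest common cause fraction dominates the result by several orders of magnitude, which is why independence between redundancies is treated as a requirement of its own.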
How does MotionWise ensure the highest levels of safety?
MotionWise Safety Supervision and Platform Health Management provide safety and robustness during operation by:
- Ensuring temporal and spatial freedom from interference (FFI)
- Providing safe communication (which requires safe execution) or failing silently
- Being configurable to meet the required fault tolerant time interval
- Enabling a global health status with support from a safety-capable operating system
- Providing the foundation for fail-operational extensions (redundancy, diversity)
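The fault tolerant time interval mentioned in the list above can be illustrated with a small timing check. This is a generic sketch and not MotionWise configuration: the class, its field names, and the example values are hypothetical. It only encodes the common rule that fault detection time plus fault reaction time must not exceed the fault tolerant time interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyTiming:
    """Hypothetical timing budget for one safety mechanism, in milliseconds."""

    detection_ms: float  # time needed to detect the fault
    reaction_ms: float   # time needed to reach a safe state after detection
    ftti_ms: float       # fault tolerant time interval from the safety concept

    def handling_ms(self) -> float:
        """Total fault handling time: detection plus reaction."""
        return self.detection_ms + self.reaction_ms

    def meets_ftti(self) -> bool:
        """True if the fault is handled within the tolerated interval."""
        return self.handling_ms() <= self.ftti_ms


# Illustrative values only.
ok_budget = SafetyTiming(detection_ms=20.0, reaction_ms=30.0, ftti_ms=100.0)
late_budget = SafetyTiming(detection_ms=80.0, reaction_ms=30.0, ftti_ms=100.0)
```

Here `ok_budget.meets_ftti()` holds, while `late_budget.meets_ftti()` does not, so the second configuration would need a faster detection cycle or a faster reaction.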
The MotionWise safe vehicle software platform contributes to safe vehicle operation under all conditions and provides functionality and software components for realizing highly reliable systems up to fail-operational systems. MotionWise capabilities support isolation and separation of software components, which enables the realization of degradation concepts in case of detected isolatable failures. This facilitates fail-operational vehicle performance so that safety-related software applications, either fully or specified subsets for degraded operation, remain operational even in the event of a failure.
Karl Gruber
Simplified system safety concept for fail-silent ECUs
Hardware must be safe too
To achieve the required efficiency through harmonized interaction between software and hardware platforms, the hardware architecture concept must follow certain rules. Although not mandatory, the use of safety libraries simplifies the implementation. Software platforms should also interact with hardware safety features of different SoCs.
Simplified hardware safety concept for fail-silent ECUs
Arm's A-class processing cores, with their class-leading high-performance application processors, and R-class processing cores, with their real-time capabilities, support the realization of state-of-the-art functional safety for the automotive industry.
Arm's Split-Lock: Flexible operation modes, post-Silicon
As Madhusudan Rao, Arm's Functional Safety Product Lead, explained at the Arm FuSa Summit, the vision is to enable functional safety from the base of the computing stack so that safe computing is available to the industry, enabling greater flexibility in the development and deployment of SoCs. In addition, this approach should provide high-performance, safe hardware in highly automated systems such as MotionWise. Arm's safety enablement allows partitioning of cores at the cluster level to enable a split mode for high-performance computing and a hybrid or lock mode for safety processing cores for ASIL B/ASIL D applications.
Arm's comprehensive range of technologies and tools
The size of the challenge in transition to fail-operational systems
System-wide safety is the main challenge to solve on the way to new E/E architectures. Modern systems must cope with the growing complexity of new-generation vehicles.
To better understand the size of this challenge, let us take a look at the safety goals:
- The outputs of the AD system must be correct (ASIL D)
- At least one set of outputs of the AD system must be available (ASIL D); note: executing no function or generating no output (= any silent safe state) is now considered unsafe and therefore not allowed!
To achieve these goals, system designers must consider several success factors in parallel: correctness can be achieved by developing the system to ASIL D, and availability should be achieved through various redundancies with sufficient independence between them. At the same time, the system must be hard real-time, i.e., it must demonstrate deterministic behavior, be designed according to the standards for time-aware architectures, and have pre-planned schedules for the execution of functions.
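A pre-planned schedule of the kind described above can be sketched as a static table that is validated offline and then only looked up at runtime. The task names, durations, and cycle length below are hypothetical, and the sketch ignores multi-core and network scheduling entirely.

```python
from typing import List, NamedTuple, Optional


class Slot(NamedTuple):
    task: str
    start_us: int  # start offset within the cycle, in microseconds
    wcet_us: int   # assumed worst-case execution time, in microseconds


def is_valid_schedule(slots: List[Slot], cycle_us: int) -> bool:
    """Return True if all slots fit in one cycle and none overlap."""
    end_of_previous = 0
    for slot in sorted(slots, key=lambda s: s.start_us):
        if slot.wcet_us <= 0 or slot.start_us < end_of_previous:
            return False  # invalid duration, negative offset, or overlap
        end_of_previous = slot.start_us + slot.wcet_us
    return end_of_previous <= cycle_us


def task_at(slots: List[Slot], cycle_us: int, t_us: int) -> Optional[str]:
    """Return the task planned at absolute time t_us, or None if idle."""
    offset = t_us % cycle_us  # the same table repeats every cycle
    for slot in slots:
        if slot.start_us <= offset < slot.start_us + slot.wcet_us:
            return slot.task
    return None


# Hypothetical 10 ms cycle with three functions.
CYCLE_US = 10_000
SCHEDULE = [
    Slot("perception", 0, 4_000),
    Slot("planning", 4_000, 3_000),
    Slot("actuation", 8_000, 1_000),
]
```

Because the table is fixed before runtime, `task_at` gives the same answer for the same offset in every cycle, which is the deterministic behavior the paragraph refers to. A table with overlapping slots is rejected by `is_valid_schedule` before it is ever deployed.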
So, is the automotive industry there yet?
Modern vehicles are extremely complex and must be handled by the right technology to be safe and trusted.
This change requires a new mindset that focuses on ASIL D compliance as well as fail-operational performance characteristics, rather than achieving ASIL D via strict monitoring functions that only lead to a shutdown. Fail-operational architecture solutions are required for SAE L3+ levels and must be cost-effective. Also, a vehicle's E/E architecture must account for common-cause faults (e.g., power failure) that were previously acceptable when relying on the driver and mechanical fallback. These architectures require redundant elements that meet sufficient independence requirements.
When it comes to automotive trends, we should not forget the presence of AI and the aspect of connectivity. Currently, there is no recognized approach to ensure the safety of AI based on deep neural networks, which is one of the gaps between the concept and the realization of true safety in a modern vehicle. In addition, increasing connectivity and autonomy also make security risks potential safety hazards.
One could say that the size of the challenge is as great as the complexity of the new age vehicle. Considering the enormous complexity, we may not be there yet, but we are certainly on the right track. Therefore, we can ask instead: How close are we to realizing the envisioned vehicle of the future?
How safe is safe enough?
"In an SAE system for automated driving of Level 4, the driver would trust that the vehicle is driving safely and, consequently, may not pay attention to the driving situation. The driver could be texting, working, or even sleeping. Since critical failures may not occur in these systems, a suitable safety architecture is required." This is how Wilfried Steiner, Director of TTTech Labs, explains the need for the safety architecture in highly automated/autonomous use cases in his article E/E Architectures on the Way to Level 4.
The Kopetz architecture for a driving automation system defines so-called fault-containment units (FCUs), i.e., parts of the system that may fail as a whole, and also defines the interactions between these FCUs. The Primary, the Monitor, and the Fallback form one FCU each. The Decision System itself is composed of two FCUs with limited failure behavior, which can be established by common fail-safe technologies like lock-step mechanisms. Already on this level of abstraction we can prove that the failure of any FCU will not cause a complete system failure and that the system will remain operational. (Source: E/E Architectures on the Way to Level 4)
Wilfried explains that automated driving systems that follow the Kopetz architecture can safely replace the human driver. These systems generate output that otherwise would be produced by the human driver. Even better, these systems will continue to operate even in the presence of failures.
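The interaction between the Primary, the Monitor, and the Fallback can be pictured with a heavily simplified sketch. It reflects one plausible reading of the description above, not the published architecture in detail: the `decide` function, the `Trajectory` type, and the use of `None` to model a failed (silent) FCU are all assumptions, and the two-FCU structure of the Decision System is collapsed into a single function.

```python
from typing import NamedTuple, Optional


class Trajectory(NamedTuple):
    source: str
    target_speed_mps: float


def decide(
    primary: Optional[Trajectory],
    monitor_ok: Optional[bool],
    fallback: Optional[Trajectory],
) -> Optional[Trajectory]:
    """Select the output to forward to the actuators.

    None models an FCU that has failed and delivers nothing.
    """
    # Forward the Primary only if it delivered output and the
    # Monitor explicitly approved it.
    if primary is not None and monitor_ok is True:
        return primary
    # Otherwise use the Fallback, e.g. a minimal-risk maneuver.
    return fallback


# Hypothetical outputs for one decision cycle.
nominal = Trajectory("primary", 25.0)
minimal_risk = Trajectory("fallback", 0.0)
```

In this sketch, a failure of any single input (Primary, Monitor, or Fallback) still leaves `decide` with a usable trajectory, which mirrors the claim that no single FCU failure causes a complete system failure.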
The human trust of a machine must be met by the design of the L4 automated driving system as an ultra-high reliable system, e.g., this system may not fail more frequently than, say, 10^-8 failures per hour, which is about once every ten-thousand years.
Wilfried Steiner
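The figure in the quote above can be checked with a short calculation. The sketch assumes a constant failure rate, so that the mean time between failures is simply the inverse of the rate; the function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365.25


def mean_years_between_failures(failures_per_hour: float) -> float:
    """Convert a constant failure rate into a mean interval in years."""
    mean_hours = 1.0 / failures_per_hour
    return mean_hours / HOURS_PER_YEAR


years = mean_years_between_failures(1e-8)
```

A rate of 10^-8 failures per hour corresponds to 10^8 hours between failures on average, which is somewhat more than eleven thousand years and consistent with the "about once every ten-thousand years" in the quote.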
TTTech Auto's contribution to SAE L4 architecture
To achieve the SAE L4 reference architecture for autonomous driving, The Autonomous established the Safety & Architecture Working Group, launched in June 2021. It brings together a wide range of experts, from academia to automotive companies, with a common goal - to define the safety architecture for safe autonomous vehicles.
The Working Group delivered its first digest report. Stay tuned for the full report by the end of 2023.
Photo by Philipp Lipiarski, The Autonomous Main Event 2022
"What lies ahead is very exciting. We have worked hard on the proposals, thoroughly analyzing and validating the input from the various stakeholders and external contributors to finally decide on a unified solution for the second report increment. We are one step closer to a reference SAE L4 architecture and the realization of the vehicle of the future," said Christian Mangold, Functional Safety Manager at TTTech Auto and member of the working group. Such a result can only be achieved through joint efforts, not competition. Christoph Schulze, Chairman of The Autonomous Working Group Safety & Architecture, agrees: "On the road to safe L4 architectures, we must never forget the importance of collaboration for achieving the best result possible based on diverse knowledge and experience, and the will to succeed."
Learn more about how the Working Group Safety & Architecture reached a significant milestone with the successful completion of the second report increment in this interview with various working group members.
We look forward to this giant leap forward for the automotive world. Stay tuned!
To support our customers in finding technical safety concepts and solutions for their system requirements, TTTech Auto also provides expertise and practical knowledge through Safety Consulting Services. Learn more and get in touch with the TTTech Auto Safety Consulting team here.
Stay informed about the most recent technological findings in the automotive industry – visit us at www.tttech-auto.com and follow us on LinkedIn for more on this exciting topic.