How do you implement networked, safety-critical systems?

A systematic approach using the example of a drive control system

Author: Markus Maier, Assystem Germany

Contribution – Embedded Software Engineering Congress 2018

Whether analyzing process data or simply implementing software updates efficiently in the field: New business models increasingly require opening up once-isolated, safety-critical control systems. Assystem demonstrates a systematic approach to developing safety-related networked systems using the example of a drive control system, while adhering to the relevant security and safety standards for the application.

Automation technology has been undergoing a transformation towards modular, networked systems for several years. As a result, cybersecurity is becoming an indispensable prerequisite for functionally safety-critical systems.

The Triton malware attack on an industrial plant in the Middle East, which occurred at the beginning of 2018, demonstrated how vulnerable current safety-critical industrial control systems are.

Figure 1 (see PDF) shows our application example: a networked electric drive controller of the kind used in hydroelectric power plants and high-performance machines, among other applications.

The system under consideration (SuC) in our application example controls the operation of an electric motor and is networked with a backend/cloud infrastructure. This enables monitoring of the control process on the one hand and, on the other, updates of software components that are not relevant to functional safety.

The central safety function for the drive is the so-called Safe Torque Off (STO) function, which requires special protection with regard to cybersecurity. Other functions requiring protection include, for example, the electric motor control, machine status, diagnostics, software updates, and process data analysis. Relevant standards in this area are primarily IEC 62443, IEC 61508, ISO 13849, EN 62061, and IEC 61800-5.

Safety & Security Process

Figure 2 (see PDF) shows the lifecycle process for safety-critical systems applied by Assystem for the application example.

The mapping of the process for top-down design according to IEC 62443-3-2 is shown in Figure 3 (see PDF), and the mapping of the "Security Lifecycle" process according to the NIST standard is shown in Figure 4 (see PDF). This results in a generic development and maintenance process that is compatible with both the relevant safety standards and the relevant security standards.

The development phase of the lifecycle process (Figure 2, see PDF) is divided into the areas of "Design," "Implementation," and "Admin." The "Operations" area represents the operational phase. For each block, the corresponding input/output artifacts, responsibilities or roles, and activities are described.

Crucially, safety-related system development can be carried out independently of security-related system development from phase P4.1 onwards through suitable system partitioning.

Security risk analysis – methodology & standards using an example

Following the definition of the "System under Consideration" (SuC), a high-level risk analysis is carried out for the essential assets of the SuC, taking into account the physical interfaces, stakeholders, and use cases in the planned system environment (see Table 1, PDF).

Potential threats, vulnerabilities, and exploitation impacts are analyzed for each asset group. Threats and vulnerabilities are first assigned a qualitative probability. The potential damage (impact) is also estimated qualitatively. The qualitative values for probability and damage must be defined before the risk analysis is prepared (see the rationale in Figure 5, PDF). For example, manipulation of the safety function is classified as catastrophic, and the probability is determined in relation to the number of controllers in the field and a time period.

In the first step, this results in a qualitative risk assessment for each asset group. Foundational Requirements (FRs) are then defined for each asset group to reduce this risk. A Target Security Level (SL-T) is assigned to each Foundational Requirement, as defined in Table 2 (see PDF).
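The qualitative assessment described above can be sketched as a small risk matrix. The scales, the multiplication rule, the thresholds, and the asset ratings below are assumptions chosen for illustration; the article's Figure 5 defines the actual classification.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    # Hypothetical qualitative scale, not taken from the article
    REMOTE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4


class Impact(IntEnum):
    # Hypothetical qualitative scale, not taken from the article
    NEGLIGIBLE = 1
    MODERATE = 2
    CRITICAL = 3
    CATASTROPHIC = 4


def risk_level(likelihood, impact):
    """Combine two qualitative ratings into a qualitative risk class."""
    score = int(likelihood) * int(impact)  # assumed scoring rule
    if score >= 12:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


# Hypothetical asset groups and ratings
assessment = {
    "safety function (STO)": (Likelihood.POSSIBLE, Impact.CATASTROPHIC),
    "software update": (Likelihood.UNLIKELY, Impact.CRITICAL),
    "process data": (Likelihood.LIKELY, Impact.NEGLIGIBLE),
}

for asset, (likelihood, impact) in assessment.items():
    print(f"{asset}: {risk_level(likelihood, impact)}")
```

Fixing the scales and thresholds in code before any asset is rated mirrors the requirement that the qualitative values be defined before the risk analysis is prepared.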

Safety concept and architecture

The central safety function of the drive controller is the so-called Safe Torque Off (STO) function, which ensures a safe shutdown of the torque.

Figure 6 (see PDF) shows the dual-channel architecture of the STO function from input to output. This allows the STO to meet the requirements of ISO 13849 for PL e and of IEC 61508 and IEC 61800-5 for SIL 3. For safety verification, the diagnostic path for monitoring the STO hardware paths is considered separately. The diagnostics of the STO hardware paths were classified one SIL lower than the actual STO function and meet the requirements of IEC 61508 for SIL 2.
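The logic of a dual-channel shutdown can be illustrated with a minimal sketch. It assumes that either channel alone can force the safe state (1oo2 behaviour) and that a diagnostic path cross-compares the channels. The class and function names are hypothetical, and in the real controller this logic resides in hardware, not in Python.

```python
from dataclasses import dataclass


@dataclass
class ChannelState:
    """State of one STO channel (hypothetical model for illustration)."""
    sto_requested: bool   # this channel sees a demand to remove torque
    fault_detected: bool  # diagnostics report a fault in this channel


def torque_enabled(ch_a, ch_b):
    """Torque is permitted only if both channels independently permit it.

    A request or a diagnosed fault in a single channel is enough to
    remove torque (assumed 1oo2 behaviour).
    """
    a_permits = not ch_a.sto_requested and not ch_a.fault_detected
    b_permits = not ch_b.sto_requested and not ch_b.fault_detected
    return a_permits and b_permits


def channels_disagree(ch_a, ch_b):
    """Cross-comparison that a diagnostic path might perform."""
    return ch_a.sto_requested != ch_b.sto_requested


healthy = ChannelState(sto_requested=False, fault_detected=False)
tripped = ChannelState(sto_requested=True, fault_detected=False)

print(torque_enabled(healthy, healthy))     # both channels permit torque
print(torque_enabled(healthy, tripped))     # one channel demands STO
print(channels_disagree(healthy, tripped))  # discrepancy for diagnostics
```

The point of the sketch is the asymmetry: enabling torque needs agreement from both channels, while removing it needs only one.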

Since the FPGA manufacturer does not provide quantitative fault analysis, the diagnostic function was implemented using two independent paths within the FPGA. Additional measures for detecting and preventing common cause faults, such as excessively high or low ambient temperature, supply voltage, clocking, and EMC, ensure that individual faults within the FPGA cannot lead to the failure of the diagnostic function. Furthermore, a quantitatively verifiable high Safe Failure Fraction (SFF according to IEC 61508) is achieved.
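For orientation, the Safe Failure Fraction in IEC 61508 is the share of safe and dangerous-detected failures in the total failure rate, SFF = (λS + λDD) / (λS + λDD + λDU). The following sketch computes it. The failure rates are invented example values, not figures from the drive controller.

```python
def safe_failure_fraction(lambda_s, lambda_dd, lambda_du):
    """Return the SFF for the given failure rates (same unit, e.g. FIT).

    lambda_s:  rate of safe failures
    lambda_dd: rate of dangerous failures detected by diagnostics
    lambda_du: rate of dangerous failures that remain undetected
    """
    if min(lambda_s, lambda_dd, lambda_du) < 0:
        raise ValueError("failure rates must not be negative")
    total = lambda_s + lambda_dd + lambda_du
    if total == 0:
        raise ValueError("total failure rate must be greater than zero")
    return (lambda_s + lambda_dd) / total


# Invented example values in FIT (failures per 10^9 hours)
sff = safe_failure_fraction(lambda_s=50, lambda_dd=45, lambda_du=5)
print(f"SFF = {sff:.1%}")
```

Better diagnostic coverage moves failures from the undetected to the detected category, which is why the diagnostic paths contribute directly to a high SFF.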

Security concept, requirements and architecture

The result of the high-level risk analysis at the system level is the derivation of Foundational Requirements (FRs) for each asset based on the threat scenarios. For each Foundational Requirement at the system level, a Target Security Level (SL-T) is defined according to the required level of security (see Table 1, PDF).
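One way to work with such targets is to hold the SL-T as a vector over the seven Foundational Requirements of IEC 62443 and compare it with the level a design actually achieves. The sketch below does this. The numeric levels are invented, and the function name is hypothetical.

```python
# The seven Foundational Requirements of IEC 62443, by their abbreviations:
# identification and authentication control, use control, system integrity,
# data confidentiality, restricted data flow, timely response to events,
# resource availability.
FOUNDATIONAL_REQUIREMENTS = ("IAC", "UC", "SI", "DC", "RDF", "TRE", "RA")


def security_gaps(sl_target, sl_achieved):
    """Return the FRs whose achieved level is below the target level.

    Both arguments map FR abbreviations to security levels 0..4.
    A missing entry is treated as level 0.
    """
    gaps = {}
    for fr in FOUNDATIONAL_REQUIREMENTS:
        target = sl_target.get(fr, 0)
        achieved = sl_achieved.get(fr, 0)
        if achieved < target:
            gaps[fr] = target - achieved
    return gaps


# Invented example values for one asset group
sl_t = {"IAC": 3, "UC": 3, "SI": 3, "DC": 2, "RDF": 2, "TRE": 1, "RA": 2}
sl_a = {"IAC": 3, "UC": 2, "SI": 3, "DC": 2, "RDF": 1, "TRE": 1, "RA": 2}

print(security_gaps(sl_t, sl_a))
```

Each remaining gap points to a requirement where additional countermeasures would be needed before the target is met.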

Subsequently, a system architecture for the system under consideration (SuC) is designed using the "Defense in Depth" principle. The system is divided into so-called security zones and conduits. This division into zones and conduits can be either physical or logical (in software), and the grouping is based, for example, on the criticality of the assets, their function, their physical/logical storage location, or access authorization (see IEC 62443-3-2).

Through structured risk analysis and top-down design (see Figure 3, PDF) and by applying the Defense in Depth principle, the overall system (SuC) is divided into physical and logical security zones (see Figures 7 and 8, PDF).

Each zone contains one or more systems, which in turn consist of basic components. Zones are assigned a specific security or trust level, including foundational requirements, and each zone only provides the truly relevant interfaces to the outside world, i.e., to other zones. Authentication, encryption, and data flow limitation typically occur between zones. Incoming data should always be validated before internal use, and outgoing data should be sanitized before output whenever possible to prevent the disclosure of critical information.
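The zone and conduit idea can be sketched as a deny-by-default model in which traffic passes only over explicitly declared conduits and incoming values are validated before use. The zone names, message types, and limits are assumptions made for illustration and do not describe the actual partitioning shown in Figures 7 and 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    """A declared communication path from one zone to another."""
    source: str
    target: str
    allowed_messages: frozenset


class ZoneModel:
    """Deny-by-default check of traffic between zones."""

    def __init__(self, conduits):
        self._conduits = {(c.source, c.target): c for c in conduits}

    def is_allowed(self, source, target, message_type):
        conduit = self._conduits.get((source, target))
        return conduit is not None and message_type in conduit.allowed_messages


def validate_speed_setpoint(raw, max_rpm=3000):
    """Validate an incoming value before internal use (hypothetical limit)."""
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        raise ValueError("setpoint must be a number")
    if not 0 <= raw <= max_rpm:
        raise ValueError("setpoint out of range")
    return float(raw)


# Hypothetical zones: cloud, communication, drive_control, safety
model = ZoneModel([
    Conduit("cloud", "communication", frozenset({"software_update"})),
    Conduit("communication", "cloud", frozenset({"process_data"})),
    Conduit("communication", "drive_control", frozenset({"speed_setpoint"})),
])

print(model.is_allowed("cloud", "communication", "software_update"))
print(model.is_allowed("cloud", "safety", "software_update"))
print(validate_speed_setpoint(1500))
```

Because no conduit into the hypothetical safety zone is declared, every attempt to reach it is rejected without any zone-specific rule.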

Summary and Outlook

In summary, the methodological approach presented in our application example offers numerous advantages for system integrators and plant operators. Thanks to the resulting high level of security and certification, the controller can be integrated into a wide variety of applications for controlling high-performance electric motors. The secure cloud/backend connection enables networking of the controller for process data analysis and allows for simple and secure updates in the field for non-safety functions.

Additionally, the controller can be scaled to meet specific needs thanks to its modular design.

Author

Markus Maier is a team and project manager at Assystem Germany GmbH. He has many years of experience in the development of functionally safety-critical systems in the automotive and industrial sectors and has been intensively involved with cybersecurity for several years, particularly with the hardening of embedded systems and industrial controls.

Figure 1 – Application example: networked electric drive controller

Figure 2 – Safety & Security Lifecycle Process

Figure 3 – Process Risk Assessment & Top Down Design according to IEC 62443-3-2

Figure 4 – Security Lifecycle according to NIST Standard

Table 1 – Example of a High-Level Risk Analysis

Figure 5 – Risk classification

Published by

weissblau media
