Management of Safety and Security in Programmable SoCs
Authors: Dr. James Hunt, aicas, Dr. Giulio Corradi, Xilinx
Contribution – Embedded Software Engineering Congress 2017
Machine learning and artificial intelligence are key trends that play a central role in supporting autonomous decision-making. They require secure, dynamic updates and monitoring of new configurations, as well as a well-managed, optimized language platform. Such a platform provides the necessary security for these systems and enables easy, secure access to and updates of functions implemented on GPUs and FPGAs.
Industrial trend towards machine learning
Industrial applications are expected to become increasingly efficient and intelligent, and machine learning (ML) and artificial intelligence (AI) are key technologies for achieving these goals. In the industrial sector, real-time performance and low latency are of paramount importance. These characteristics of the Industrial Internet of Things (IIoT) become critical when ML and AI control physical quantities.
The edge platform for IIoT
Creating an architecture for an IIoT platform is not trivial; much depends on the intended purpose. For example, the entire chain – from the physical world to the cloud – requires managing multiple technologies. Experts define such a platform by three main characteristics:
- a collection of assets, here meant as a combination of components, processes, knowledge, people and relationships;
- a collection of technical elements, in particular the underlying core technology, which are implemented across the entire product range; and
- a series of subsystems and interfaces that form a well-functioning, homogeneous structure.
Since the need for updates is constantly growing, a modifiable platform is recommended: one whose fixed structure still offers the flexibility to adapt to different applications as needed. Programmable logic achieves this through user-defined elements that extend or complement the system's standard resources. A device that offers this capability is called an All Programmable System on Chip (APSoC).
Machine Learning (ML)
Machine learning models can draw conclusions from a range of input data based on experience. This experience is gained by analyzing a variety of data extracted from different systems; that data constitutes the evidence. Drawing conclusions from evidence is called inference.
In these models, extracting evidence from a dataset is achieved through training and learning processes. Once training is complete, either supervised by human criteria or unsupervised by an evaluation function, the machine learning model is ready to perform the inference process for which it was trained. As a result, the system can derive new and previously unknown insights from the inputs.
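The split between training and inference can be illustrated with a deliberately tiny example. The sketch below is not taken from the article: it uses a nearest-centroid classifier, and the sensor readings and the labels "normal" and "worn" are invented for illustration.

```python
# Minimal nearest-centroid classifier: "training" averages labelled samples,
# "inference" assigns a new sample to the closest class centroid.
from collections import defaultdict
from math import dist


def train(samples):
    """Supervised training: compute one centroid per label.

    samples: iterable of (feature_vector, label) pairs.
    """
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def infer(model, features):
    """Inference: return the label whose centroid is nearest to the input."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical readings: (vibration, temperature) -> machine state.
TRAINING_DATA = [
    ((0.1, 40.0), "normal"),
    ((0.2, 42.0), "normal"),
    ((0.9, 70.0), "worn"),
    ((1.1, 74.0), "worn"),
]

model = train(TRAINING_DATA)
# A reading that does not appear in the training set.
prediction = infer(model, (1.0, 69.0))
```

Training runs once and produces the model; inference can then run repeatedly on new inputs, which is the part that ends up on the embedded device.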
Many models are used today to draw conclusions with varying degrees of complexity, accuracy, and precision in their results. All of them require a specialized environment and resources for machine learning.
The ML environment
The data scientist plays the central role in this machine learning environment, seeking the best representation and format for the given problem. For unsolved problems, an indeterminate period of time is generally dedicated to model exploration, an activity that includes data acquisition, normalization, and information reduction, as well as potential adjustments to the machine learning algorithm, transformations, extensions, and other data manipulations. In short, model exploration is an experimental phase in which the essential information hidden in the data is distilled. If the model fulfills its purpose, it is ready for the next step: optimization for deployment on the final computing platform.
The process of discovering salient features and fitting them into the embedded computing scenario consists of the following steps:
- Problem identification (includes understanding the information needed or available)
- Model exploration (includes the selection of reasonable algorithms for machine learning)
- Model verification (includes testing the suitability of the selected algorithm)
- Model embedding (includes possible optimizations for execution, footprint, or other parameters)
- Model execution in the embedded environment (includes how machine learning interacts with the environment)
Problem identification, model exploration, verification, and embedding are the topics that most affect the designer. Unfortunately, even with a complete test set for the given task that covers all uncertainties, real-world applications still require adjustments after the machine learning algorithm has been implemented. Figure 1 (see PDF) shows this relearning as feedback from embedded execution.
Models can be implemented in high-performance FPGAs: each model is generated as an FPGA configuration suited to the problem and is then executed on the device.
This results in an iterative process, as new configurations must be created, downloaded, and deployed each time. Each of these updates should be fast, secure, and automatic, without interrupting the running system, let alone replacing the hardware.
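One way to picture an update that does not interrupt the running system is a staged swap: the device keeps using its active configuration until a candidate has passed its checks. The following is a minimal sketch under that assumption. The names Configuration, Device, and build are hypothetical, and the SHA-256 digest only detects damage in transit; it does not prove who produced the payload.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    version: int
    payload: bytes   # stands in for an FPGA configuration or model file
    sha256: str      # digest announced by the (hypothetical) build system


def build(version: int, payload: bytes) -> Configuration:
    """Package a payload together with its digest."""
    return Configuration(version, payload, hashlib.sha256(payload).hexdigest())


class Device:
    """Keeps the active configuration in use while a candidate is checked."""

    def __init__(self, initial: Configuration):
        self.active = initial

    def apply_update(self, candidate: Configuration) -> bool:
        # Reject stale versions so an old configuration cannot be replayed.
        if candidate.version <= self.active.version:
            return False
        # Reject payloads that do not match the announced digest.
        if hashlib.sha256(candidate.payload).hexdigest() != candidate.sha256:
            return False
        # Single reference swap: the old configuration stays active until here.
        self.active = candidate
        return True


device = Device(build(1, b"model-v1"))
accepted = device.apply_update(build(2, b"model-v2"))
damaged = Configuration(3, b"garbled", build(3, b"model-v3").sha256)
rejected = not device.apply_update(damaged)
```

After the damaged candidate is rejected, the device still runs version 2, which is the property the iterative process relies on.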
Once the ML model has been built, the key issues are robustness, performance, and transmission security.
Security needs
A model like the one described here requires initial commissioning and subsequent, ideally continuous, revision based on new information. The generated result (i.e., the FPGA configuration) must be uploaded to the target system and put into operation there. It must be possible to deploy updates immediately, directly, and securely, e.g., as Software Over the Air (SOTA).
For this, the overall ML system needs a framework that supports the execution of actions initiated by the ML model, communication with remote systems (e.g., in the cloud), and security during transmission.
This framework must operate bidirectionally: it must read performance information from the device and transmit it securely and without manipulation, and it must allow secure updates (downloads) of the machine learning model. Applications such as the machine learning model, or FPGA configurations for fast signal processing and flexible circuit modification, are made available for remote download in a portal within the framework, together with their updates. FPGA configurations make it possible, for example, to improve implemented functions later without changing the hardware itself. From the portal, applications can be commissioned initially, and necessary updates can be sent to the target system automatically. (See Figure 2 in the PDF.)
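Transmitting performance information "without manipulation" can be approximated with a message authentication code. The sketch below is an illustration, not the framework's actual mechanism: it assumes a secret key shared between device and portal, and the function names seal and open_sealed as well as the report fields are hypothetical. It protects integrity only; it does not encrypt the report.

```python
import hashlib
import hmac
import json


def seal(report: dict, key: bytes) -> dict:
    """Device side: serialize the report and attach an HMAC-SHA256 tag."""
    body = json.dumps(report, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def open_sealed(message: dict, key: bytes) -> dict:
    """Portal side: recompute the tag and refuse reports that were altered."""
    expected = hmac.new(
        key, message["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, message["tag"]):
        raise ValueError("telemetry report failed the integrity check")
    return json.loads(message["body"])


# Hypothetical key and report fields, for illustration only.
SHARED_KEY = b"example-key-not-for-production"
sealed = seal({"latency_ms": 3, "model_version": 7}, SHARED_KEY)
report = open_sealed(sealed, SHARED_KEY)
```

Any change to the body after sealing makes open_sealed raise instead of returning data, so a manipulated report never reaches the evaluation step.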
The security chain of trust
A seamless chain of trust is required from the creation of an ML model or FPGA configuration and its updates to its secure execution on the device; a selection of the security mechanisms involved is described below.
A Trusted Platform Module (TPM) provides cryptographic functions in a computer and is an effective addition to the security functions of the device (e.g., the Zynq®-7000 APSoC).
The secure boot function offers the ability to authenticate all partitions loaded during boot. It also supports AES (Advanced Encryption Standard) encryption of partitions that require confidentiality.
The Secure Framework Loader (SFL) is a binary file included in the boot image and verified by secure boot. Its purpose is to authenticate and start the Jamaica IoT framework. Because the SFL is authenticated through secure boot, it is trusted. When the SFL authenticates and loads the framework, it therefore transfers this trust to the framework.
The SFL transfers trust to the Jamaica IoT framework, which in turn contains the corresponding root certificate. This certificate authenticates all configuration resources, thereby transferring trust to all configuration data. If authentication of the configuration resources fails, the framework is stopped and an error is logged.
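The way trust passes from stage to stage can be sketched in a few lines. This is a simplified stand-in, not the mechanism described above: the real chain uses signatures and certificates, whereas the sketch uses plain SHA-256 digests because they are available in the Python standard library. Each stage embeds the digest of the next one, and only the digest of the first stage has to be anchored in the trusted boot process. The names Stage, build_chain, and verify_chain, and the stage contents, are hypothetical.

```python
import hashlib
from typing import List, NamedTuple, Optional, Tuple


class ChainError(Exception):
    """Raised when a stage does not match what its predecessor expects."""


class Stage(NamedTuple):
    name: str
    code: bytes
    next_digest: Optional[str]  # digest of the following stage, None for the last

    def measured(self) -> str:
        # The digest covers the code and the embedded reference to the next stage.
        trailer = (self.next_digest or "").encode("ascii")
        return hashlib.sha256(self.code + b"|" + trailer).hexdigest()


def build_chain(parts: List[Tuple[str, bytes]]) -> Tuple[str, List[Stage]]:
    """Link stages back to front; returns (root digest, stages in boot order)."""
    stages: List[Stage] = []
    next_digest: Optional[str] = None
    for name, code in reversed(parts):
        stage = Stage(name, code, next_digest)
        stages.append(stage)
        next_digest = stage.measured()
    stages.reverse()
    return next_digest, stages


def verify_chain(root_digest: str, stages: List[Stage]) -> List[str]:
    """Walk the chain in boot order and stop at the first failing stage."""
    expected = root_digest
    trusted = []
    for stage in stages:
        if stage.measured() != expected:
            raise ChainError(f"{stage.name} failed authentication")
        trusted.append(stage.name)
        expected = stage.next_digest
    return trusted


root, chain = build_chain([
    ("SFL", b"loader image"),
    ("framework", b"framework image"),
    ("configuration", b"configuration data"),
])
trusted_stages = verify_chain(root, chain)
```

If any stage is modified, verification raises at that stage and nothing after it is trusted, which mirrors the behavior of stopping the framework and logging an error when authentication fails.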
Once the signature has been verified, trust is transferred to the configuration data. The configuration data contains the OEM Certification Authority (CA) root certificate. This allows the OEM to control which resources and applications may be installed and run on the device.
All applications, components, and resources must be signature-authenticated at installation time using a certificate that is directly or indirectly linked to the OEM CA root certificate.
Installation can be performed via an over-the-air download or from a local storage medium. No application, component, or resource is installed if authentication of its signature or of its associated certificates fails.
Resource management
The operator of the Jamaica IoT Framework, or an authorized body, defines limits for resource usage, such as RAM size or the number of allowed threads. These limits are enforced during the execution of the framework and of any application running within it. This prevents downloaded programs, or even individual program components delivered as updates, from taking over the device after installation by consuming excessive resources. (See Figure 3 in the PDF.)
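The effect of such limits can be shown with a small accounting model. This sketch is not the Jamaica IoT Framework API: ResourceLimits, AppSandbox, and QuotaExceeded are hypothetical names, and a real runtime would enforce the limits itself rather than rely on applications calling these methods.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when an application asks for more than the operator allows."""


@dataclass(frozen=True)
class ResourceLimits:
    max_ram_bytes: int
    max_threads: int


class AppSandbox:
    """Tracks the resources used by one installed application."""

    def __init__(self, name: str, limits: ResourceLimits):
        self.name = name
        self.limits = limits
        self.ram_bytes = 0
        self.threads = 0

    def allocate(self, nbytes: int) -> None:
        # Check before granting, so usage never exceeds the limit.
        if self.ram_bytes + nbytes > self.limits.max_ram_bytes:
            raise QuotaExceeded(f"{self.name}: RAM limit reached")
        self.ram_bytes += nbytes

    def start_thread(self) -> None:
        if self.threads + 1 > self.limits.max_threads:
            raise QuotaExceeded(f"{self.name}: thread limit reached")
        self.threads += 1


# Limits chosen by the operator; the values are illustrative.
sandbox = AppSandbox("ml-update", ResourceLimits(max_ram_bytes=1024, max_threads=2))
sandbox.allocate(512)
sandbox.start_thread()
sandbox.start_thread()
```

A third call to start_thread, or an allocation of more than the remaining 512 bytes, raises QuotaExceeded and leaves the recorded usage unchanged.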
Conclusion
Machine learning is a suitable technology for embedded systems, but to fully exploit its potential, more than just a neural network is needed. Only by applying additional technologies, such as FPGAs for efficient learning execution and frameworks for the secure downloading, installation, and execution of dynamic code, can the full capabilities of machine learning be effectively utilized in embedded systems.
Authors
Dr. James J. Hunt is co-founder and CEO of aicas. He holds a B.Sc. from Yale, a Master's degree from Boston University, and a Ph.D. from the University of Karlsruhe. He has extensive experience with wafer-scale integration, parallel signal processing systems, and formal methods. He also contributed to the development of the avionics standard ED-217 (DO-332). He currently leads the expert group for real-time Java (JSR-282).
Co-presenter Dr. Giulio Corradi is a Senior System Architect with 25 years of experience in management, software engineering of embedded systems, and the development of ASICs and FPGAs. His focus areas include machine learning, real-time communication, and functional safety. He has worked at Xilinx in Munich since 2006 and has made a significant contribution to the Xilinx Functional Safety certification of tools and compilers.
