On the use of hypervisors in embedded systems
Author: Jens Braunes, PLS Programmable Logic & Systems GmbH
Contribution – Embedded Software Engineering Congress 2018
For the implementation of safety-critical applications, a strict separation of applications or operating systems that share a common computing platform is essential. Therefore, virtualization and hypervisors are becoming increasingly important in the embedded systems sector. This poses a significant challenge, especially for developers working in a very hardware-centric environment.
Virtualization itself is nothing new. It has been used successfully in PC and server environments for many years. However, the topic has received relatively little attention in the field of embedded systems. Now that increasingly powerful multicore SoCs offer sufficient performance to run various applications and operating systems in parallel, a trend reversal is becoming apparent.
There are broadly two fundamental motivations for virtualization in the embedded sector: the separation of real-time critical from less time-critical applications on a shared computing platform, and ensuring security. The main focus here is on separating less security-critical applications from security-critical ones. Understandably, the latter issue requires the utmost care. Incidents such as the 2015 Chrysler hack [1], in which security researchers gained access to safety-critical vehicle systems, such as the brakes, via the internet and a vulnerability in the infotainment system, vividly illustrate that strict encapsulation of functionality is essential for security reasons. This applies not only to automotive systems but also to other highly networked applications, for example, in the IoT or smart home sectors.
Separation for greater safety
If multiple operating systems (referred to below as guest operating systems or guests) or individual bare-metal applications are to be virtualized, a hypervisor is required. This hypervisor acts as an abstraction layer over the real hardware, i.e., the processor cores, the memory, and, not least, the peripherals. A fundamental distinction must be made between Type 1 and Type 2 hypervisors (Figure 1, see PDF).
The Type 1 hypervisor runs directly on the hardware and is therefore also called a bare-metal hypervisor. It does not require a separate operating system, but consequently must provide all the hardware drivers itself.
Type 2 hypervisors require a host operating system to provide the device drivers. While applications can run alongside the hypervisor in this variant, it is less suitable for embedded systems. The goal is to keep the hypervisor as lean as possible and, above all, to achieve a strong separation between the individual guest operating systems or applications. Therefore, we will focus on the function and application of Type 1 hypervisors.
Besides basic operating system tasks such as managing hardware resources, the Type 1 hypervisor primarily decides which guest operating system, running inside a virtual machine (VM), executes on which physical cores and how its processing is scheduled. The guest operating systems do not see the physical cores; instead, the respective VM provides them with virtual cores. The total number of virtual cores does not have to match the number of physical cores: multiple guest operating systems can share one or more physical cores. In this case, however, the hypervisor must perform scheduling, and each guest is allocated a time slot for processing on the cores. When the active guest operating system is switched, the processing context (core registers, system registers, interrupt handling status, etc.) must be saved and restored. This is comparable to a task switch in an operating system.
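The time-slot scheduling and context switch described above can be sketched in a few lines. This is a simplified model, not the scheduler of any particular hypervisor: the names `Guest`, `PhysicalCore`, and `schedule` are made up for illustration, the context is reduced to a single `pc` register, and "running" a guest simply advances that register.

```python
from dataclasses import dataclass, field


@dataclass
class VCpuContext:
    """Saved state of one virtual core (greatly simplified, hypothetical)."""
    core_regs: dict = field(default_factory=dict)


@dataclass
class Guest:
    name: str
    slot_ticks: int  # length of this guest's time slot
    context: VCpuContext = field(default_factory=VCpuContext)


class PhysicalCore:
    def __init__(self):
        self.regs = {"pc": 0}
        self.current = None

    def switch_to(self, guest):
        # Save the context of the guest that is being switched out ...
        if self.current is not None:
            self.current.context.core_regs = dict(self.regs)
        # ... and restore the context of the guest that is switched in.
        # A guest that has never run starts with pc = 0.
        self.regs = dict(guest.context.core_regs) or {"pc": 0}
        self.current = guest

    def run(self, ticks):
        # Stand-in for executing guest code during the time slot.
        self.regs["pc"] += ticks


def schedule(core, guests, rounds):
    """Round-robin over the guests; returns (guest name, pc) after each slot."""
    timeline = []
    for _ in range(rounds):
        for guest in guests:
            core.switch_to(guest)
            core.run(guest.slot_ticks)
            timeline.append((guest.name, core.regs["pc"]))
    return timeline
```

Calling `schedule(PhysicalCore(), [Guest("rtos", 3), Guest("linux", 7)], rounds=2)` shows that each guest continues exactly where it left off, even though both share one physical core.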
In the virtualization of real-time operating systems, the virtual cores are typically assigned statically to the physical cores. The developer thus decides how the processing loads of the individual guest operating systems are distributed across the processor resources, i.e., the cores. This guarantees deterministic behavior of the overall system. Whether the guaranteed timing behavior, and thus real-time capability, is actually achieved depends largely on the chosen allocation and distribution and, naturally, on the performance of the processor platform used.
The distribution of guest operating systems across different cores, along with other configuration information such as the allocation of physical memory areas into which the guest operating systems should be loaded and which peripherals are available to them, is often compiled or linked directly into the hypervisor. The same applies to the binaries of the virtual machines and the guest operating systems. While it would be possible to use external files located, for example, on a mass storage device or a network drive, this approach naturally carries a certain risk regarding the integrity and trustworthiness of the virtual machines being loaded. If everything has worked correctly, the end result is a monolithic binary consisting of all guest operating systems and the hypervisor, ready to be loaded into the target system's memory by the bootloader and ultimately started.
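To make this kind of static configuration more concrete, here is a minimal sketch of a configuration table and a consistency check that could run at build time. The structure `VmConfig`, its fields, and the function `validate` are assumptions for illustration; real hypervisors use their own formats (for example device trees or generated C tables).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmConfig:
    """Hypothetical static description of one virtual machine."""
    name: str
    physical_cores: tuple         # cores the VM's virtual cores are pinned to
    mem_base: int                 # start of the physical memory area
    mem_size: int                 # size of the physical memory area in bytes
    exclusive_devices: tuple = () # devices assigned to this VM only


def regions_overlap(a, b):
    return (a.mem_base < b.mem_base + b.mem_size
            and b.mem_base < a.mem_base + a.mem_size)


def validate(configs, num_physical_cores):
    """Return a list of human-readable configuration errors (empty if none)."""
    errors = []
    for i, a in enumerate(configs):
        for core in a.physical_cores:
            if not 0 <= core < num_physical_cores:
                errors.append(f"{a.name}: core {core} does not exist")
        for b in configs[i + 1:]:
            if regions_overlap(a, b):
                errors.append(f"{a.name}/{b.name}: memory regions overlap")
            shared = set(a.exclusive_devices) & set(b.exclusive_devices)
            for dev in sorted(shared):
                errors.append(f"{a.name}/{b.name}: device {dev} assigned twice")
    return errors
```

Sharing a physical core between two VMs is deliberately not reported as an error here, because the hypervisor can schedule several guests on the same core.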
Not without suitable hardware
For truly secure isolation of guest operating systems, the hypervisor must, of course, be able to rely on several important hardware functions of the target processor. Let's take a closer look at these essential features using the specific example of an ARM Cortex-A53. With up to four Armv8-A cores, this processor not only offers the necessary computing power; its integrated hardware virtualization support also provides all the hardware functions the hypervisor needs:
- four exception levels (EL0 to EL3), where EL2 is explicitly reserved for the hypervisor
- a Memory Management Unit (MMU) with two-stage address translation
- support for device emulation
- exclusive assignment of physical devices to a specific virtual machine
- forwarding of exceptions and virtual interrupts
The exception levels of the Armv8-A architecture define the privileges of the current software execution. The higher the exception level, the more privileged the execution. Consequently, the Armv8-A architecture provides a separate exception level (EL2) for the hypervisor to safely separate it from the guest operating systems (EL1 and EL0) [2].
To allow guest operating systems to use the physical memory of the Cortex-A53 without being aware of the hypervisor's existence and without gaining unauthorized access to its memory, the MMU implements a two-stage address translation. The physical memory that the guest operating system believes it sees is actually the so-called Intermediate Physical Address (IPA) map, which is the responsibility of the hypervisor. After the guest operating system has translated a virtual address into an intermediate physical address, a further translation into the actual physical address is therefore necessary (Figure 2, see PDF).
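The two translation stages can be illustrated with a strongly simplified model. Real Armv8-A page tables are multi-level structures walked by the MMU; the flat dictionaries, the 4 KiB page size, and the function names below are assumptions chosen only to show the principle.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class TranslationFault(Exception):
    """Raised when a page has no mapping in the given stage."""


def translate(page_table, address, stage):
    """Look up one stage: page_table maps page numbers to frame numbers."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in page_table:
        raise TranslationFault(f"stage {stage} fault at {address:#x}")
    return page_table[page] * PAGE_SIZE + offset


def guest_va_to_pa(stage1, stage2, va):
    # Stage 1 is controlled by the guest operating system (VA -> IPA).
    ipa = translate(stage1, va, stage=1)
    # Stage 2 is controlled by the hypervisor (IPA -> PA).
    return translate(stage2, ipa, stage=2)
```

With `stage1 = {0x10: 0x2}` and `stage2 = {0x2: 0x80005}`, the virtual address `0x10abc` first becomes the intermediate physical address `0x2abc` and then the physical address `0x80005abc`. A guest can change its own stage 1 tables at will, but it can only ever reach physical memory that the hypervisor has mapped in stage 2.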
Through device emulation or device assignment, guest operating systems gain access to hardware devices or peripherals. Device emulation is necessary when a device needs to be available to more than one guest. Direct access to the hardware device is not possible in this case, as conflicts would be inevitable. Therefore, the hypervisor emulates the devices in software so that access by multiple guest operating systems can be handled. When a guest accesses such an emulated device mapped into memory, this results in a trap being sent to the hypervisor (EL2). The hypervisor must then react accordingly and, if necessary, address the real hardware device. Conversely, when the guest needs to receive data from the device, virtual interrupts are triggered by the hypervisor. From the guest operating systems' perspective, these appear as normal hardware interrupts. The interrupt controller is also available as a virtual device. If the guest then responds with a read access to the device's memory or registers, a trap is triggered in the hypervisor, as previously described. The hypervisor then provides the requested data.
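The trap-and-emulate flow just described might look roughly like the following sketch. The device `EmulatedUart`, its single data register, the base address, and the method names are hypothetical; the sketch only shows how a trapped access is dispatched to a software model and how incoming data is announced by a virtual interrupt.

```python
class EmulatedUart:
    """Hypothetical shared device with one byte-wide data register."""
    DATA = 0x0

    def __init__(self):
        self.tx_log = []     # (guest, byte) pairs forwarded to the real device
        self.rx_queues = {}  # guest -> bytes waiting to be read by that guest

    def write(self, guest, offset, value):
        if offset == self.DATA:
            self.tx_log.append((guest, value & 0xFF))

    def read(self, guest, offset):
        queue = self.rx_queues.get(guest, [])
        if offset == self.DATA and queue:
            return queue.pop(0)
        return 0  # nothing to read


class Hypervisor:
    def __init__(self):
        self.devices = {}        # base address -> (size, device model)
        self.pending_virqs = []  # (guest, virtual interrupt ID)

    def map_device(self, base, size, device):
        self.devices[base] = (size, device)

    def handle_trap(self, guest, address, is_write, value=None):
        """Called when a guest access to an emulated device traps to EL2."""
        for base, (size, device) in self.devices.items():
            if base <= address < base + size:
                if is_write:
                    device.write(guest, address - base, value)
                    return None
                return device.read(guest, address - base)
        raise LookupError(f"no emulated device at {address:#x}")

    def deliver_rx(self, device, guest, byte, irq_id):
        """Data arrived for a guest: queue it and raise a virtual interrupt."""
        device.rx_queues.setdefault(guest, []).append(byte)
        self.pending_virqs.append((guest, irq_id))
```

Because every access passes through `handle_trap`, the hypervisor can serialize requests from several guests, which is exactly what direct hardware access could not guarantee.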
The hypervisor can, of course, completely hide devices from the guest operating systems or make them exclusively available to a single guest. The latter allows direct use of the hardware device; only address translation and adjustment of interrupt IDs by the hypervisor are necessary. The major advantage is that the overhead of device emulation is eliminated.
From the perspective of the guest operating systems, interrupts and exceptions also appear to originate directly from the hardware. However, this is not the case. Instead, hardware exceptions and interrupts are handled by the hypervisor and, if necessary, passed on to the respective guest as virtual exceptions or virtual interrupts.
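As a rough illustration of this forwarding, the following sketch routes physical interrupts to guests and adjusts the interrupt ID on the way. The class `InterruptRouter` and the interrupt numbers are invented for this example; on real Armv8-A hardware this job is shared between the hypervisor and the virtualization support of the interrupt controller.

```python
class InterruptRouter:
    """Hypothetical routing table kept by the hypervisor."""

    def __init__(self):
        self.routes = {}   # physical IRQ -> (guest, virtual IRQ)
        self.pending = {}  # guest -> list of pending virtual IRQs

    def assign(self, physical_irq, guest, virtual_irq):
        self.routes[physical_irq] = (guest, virtual_irq)

    def on_physical_irq(self, physical_irq):
        """Handle a hardware interrupt that arrived at the hypervisor."""
        route = self.routes.get(physical_irq)
        if route is None:
            return None  # not forwarded; handled inside the hypervisor
        guest, virtual_irq = route
        # The guest later sees this as an ordinary hardware interrupt.
        self.pending.setdefault(guest, []).append(virtual_irq)
        return route
```

For example, after `router.assign(72, "rtos", 40)`, a physical interrupt 72 is delivered to the guest `rtos` as virtual interrupt 40, while interrupts without a route never reach any guest.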
Hypervisor awareness in development: debugging challenges
The presence of a hypervisor also impacts debugging, requiring consideration of several perspectives and tasks. Hardware-level debugging presents a particular challenge, as the developer naturally interacts with the hypervisor and/or the virtual machines (VMs). Three use cases can be distinguished:
- Development of a virtualized bare-metal application
Access to memory and peripherals is not encapsulated by an operating system; instead, the virtualized hardware is accessed directly. Therefore, memory contents and access to device registers are of interest for debugging.
- Driver development for a guest operating system
This use case is largely identical to the first one.
- Runtime analysis of the overall system
This primarily involves measuring runtimes and load distribution. The goal of these measurements is to optimize the hypervisor configuration so that, for example, the promised timing behavior of a real-time operating system running as a guest is actually maintained.
I will deliberately not go into detail here about debugging applications that run under a guest operating system, because direct hardware debugging is rarely used there. Instead, let's take a closer look at the three use cases mentioned above.
While the developer explicitly engages with the hypervisor in the last use case, the fact that virtualization is involved should ideally remain hidden from them in the first two. In practice, however, this is usually not the case. If a JTAG debugger such as PLS' Universal Debug Engine (UDE) is used for debugging, it has direct access to the memory, registers, and processor cores of the target processor, in other words, to everything located below the hypervisor. The view of the bare-metal application or driver being developed, however, ends above the hypervisor. For these applications, only the virtual hardware components (virtualized memory, emulated devices, and virtual processor cores) are visible. This, of course, has practical consequences.
While the debugger's hypervisor awareness theoretically allows virtual and intermediate physical addresses to be converted to physical addresses, this only works for real memory (e.g., RAM or flash) and not for emulated devices. These devices are mapped into the guest's address space but have no assigned physical memory area. Access to this device memory by a guest operating system is unproblematic, because it traps into the hypervisor and is processed by the device emulation, as described above. However, the address translation performed by the debugger returns an invalid memory area because the debugger is unaware of the device's emulation. Therefore, the use of emulated devices within the virtual machine is not directly debuggable. This does not apply to devices directly and exclusively assigned to a VM, where real hardware exists to which the debugger has access.
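The difference between the three kinds of addresses can be shown from the debugger's point of view. This is only a sketch under simplifying assumptions: the stage 2 mapping is a flat dictionary, and all addresses as well as the function name `debugger_ipa_to_pa` are invented for illustration and do not describe how any particular debugger works internally.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def debugger_ipa_to_pa(stage2_pages, ipa):
    """Translate a guest's intermediate physical address as a debugger might.

    Returns the physical address, or None if the page has no physical
    backing, which is the case for an emulated device.
    """
    page, offset = divmod(ipa, PAGE_SIZE)
    frame = stage2_pages.get(page)
    if frame is None:
        return None
    return frame * PAGE_SIZE + offset


# Hypothetical stage 2 mapping of one VM.
stage2_pages = {
    0x40000: 0x80000,  # guest RAM, backed by real memory
    0x09000: 0xFE200,  # device assigned exclusively to this VM
    # 0x09001 (an emulated device) is intentionally left unmapped
}
```

Here `debugger_ipa_to_pa(stage2_pages, 0x40000123)` and `debugger_ipa_to_pa(stage2_pages, 0x09000010)` both yield usable physical addresses, whereas the lookup for `0x09001000` returns `None`: there is simply nothing below the hypervisor that the debugger could read.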
If the system is halted via the debugger, either at a breakpoint or directly by the user, not only are all physical cores stopped, but also the entire hypervisor and thus all virtual machines. This is perfectly logical and consistent, because otherwise the hypervisor would suddenly have to deal with a virtual machine that is no longer responding. The hypervisor would inevitably become unresponsive and could potentially crash.
The situation is different if the hypervisor offers a debug monitor. In that case, it is capable of and responsible for pausing individual virtual machines and their guest operating systems while the others continue running. However, when the debugger uses this debug monitor, it must, of course, no longer have any direct control over the hardware.
Setting a breakpoint within the context of a virtual machine while the system is running requires considerable effort from the debugger and, unfortunately, cannot be implemented completely seamlessly. The debugger must first determine the current state of the entire system, i.e., which virtual machines are currently active and which are not. This only works reliably when the system is paused, which is why the debugger briefly halts the entire system. While this brief pause remains largely invisible to the user, it does, of course, affect the runtime behavior of the individual virtual machines.
To optimize hypervisor scheduling and determine whether guest operating systems are meeting their real-time processing promises, runtime measurements are necessary. In the simplest case, the hypervisor provides this information itself, and the debugger can query the debug monitor. If not, tracing is practically the only alternative. However, the latter requires both the debugger and the user to have a thorough understanding of the hypervisor's management structures in order to effectively analyze the recorded trace data.
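Once the trace data has been reduced to VM switch events, the evaluation itself is straightforward. The following sketch assumes a hypothetical, already decoded event list for a single physical core, in which each entry marks the moment a guest became active; extracting such events from a raw trace is precisely the part that requires knowledge of the hypervisor's management structures.

```python
def runtime_per_guest(switch_events, end_time):
    """Sum up how long each guest was active on one physical core.

    switch_events: time-ordered list of (timestamp, guest) pairs, each
                   marking the moment the guest was switched in.
    end_time:      timestamp at which the recording ended.
    """
    totals = {}
    following = switch_events[1:] + [(end_time, None)]
    for (start, guest), (stop, _) in zip(switch_events, following):
        totals[guest] = totals.get(guest, 0) + (stop - start)
    return totals


def core_share(totals):
    """Convert absolute runtimes into fractions of the measured interval."""
    total = sum(totals.values())
    return {guest: time / total for guest, time in totals.items()}
```

Comparing the resulting shares with the time slots configured in the hypervisor shows whether a real-time guest actually receives the processing time it was promised.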
Conclusion
Without virtualization, a strict separation of safety-critical or real-time critical software on a shared embedded processor platform will be virtually impossible in the future. However, it's important to consider that hardware-level debugging of drivers or bare-metal applications in a virtualized environment can also have undesirable effects on the overall system. While debugging tool vendors strive to minimize such inherent side effects, they unfortunately cannot be completely eliminated. Only those who are aware of this beforehand will be able to correctly interpret unexpected effects during system development.
References
[1] https://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/
[2] ARM Ltd: ARM® Architecture Reference Manual, ARMv8, for ARMv8-A architecture profile
Author
Jens Braunes is Product Marketing Manager at PLS Programmable Logic & Systems GmbH. He studied computer science at the Technical University of Dresden and worked there as a research assistant. In 2005, he joined the PLS software team and played a key role in the development of the Universal Debug Engine. In 2016, he expanded his responsibilities to include product management and technical marketing. Jens Braunes regularly writes technical articles and speaks at conferences.
