TPM 2.0 Policies in Practice

Easy and secure rights management for embedded systems

Author: Markus Wamser, Mixed Mode GmbH

Contribution – Embedded Software Engineering Congress 2018

Trusted Platform Modules (TPMs) have been firmly established in the market for many years. Modules based on the current version 2.0 of the standard have largely replaced older modules. Nevertheless, many of the new features and functions of these modules remain unused. A prominent example is the concept of Extended Authorization Policies. These not only enable secure and trustworthy boot and update concepts, but also allow for the implementation of a rights and license management concept, for example, in a vehicle, with minimal effort.

In everyday life, many people associate the term Digital Rights Management (DRM) with the purchase and consumption of media files. Indeed, video streaming accounts for more than half of all internet traffic [1], a large portion of which is protected by DRM. In embedded environments, however, the potential applications for rights and license management are far more diverse. For example, usage rights for individual software functions can be bound to authorizations in a cryptographically secure way. These authorizations, in turn, can be designed flexibly and with almost arbitrary complexity using a TPM 2.0 module. At the same time, implementation in the embedded system is cost-effective, as only a TPM 2.0 module is required as a hardware-based trust anchor.

This increase in flexibility and security stems largely from the concept of Extended Authorization and the policies based on it, which were incorporated into the TPM standard during the transition to version 2.0.

TPM 2.0: Motivation and changes compared to TPM 1.2

Trusted Platform Modules (TPMs) were born out of the need for a cost-effective yet secure anchor of trust in devices. Simple chips, soldered onto a motherboard and offering basic functions for generating and securely storing cryptographic keys, first gained widespread use shortly after the turn of the millennium. However, minor incompatibilities prevented broader adoption, except in a few large companies with homogeneous environments. A turning point came with the adoption of version 1.2 of the TPM standard by the Trusted Computing Group (TCG) in 2009. Besides improvements to the security concept, this version primarily brought about a significant harmonization of interfaces and functions. Despite the small version jump, this was a milestone for the acceptance and use of TPMs. One of the most prominent application scenarios, TPM-based disk encryption, became an integral part of operating systems and can now be used without additional software or complex configuration.

The increased use of TPMs also revealed weaknesses in this version: The standard had evolved over several versions according to user requests, but always maintained backward compatibility. This made it complex and difficult to read, which hampered the development of further application scenarios. The lack of additional features also had a limiting effect. Ultimately, the biggest flaw was the strict adherence to a few specific cryptographic methods. Should even one of these methods prove weak or insecure, there was no way to switch to alternatives. That this fear was not merely theoretical became apparent during the development of version 2.0. The first practical attacks on SHA-1, the hashing algorithm defined in version 1.2, were looming.

The development of version 2.0 of the standard was therefore characterized by three fundamental ideas:

The standard should be completely redesigned and rewritten to improve readability and avoid ambiguities or contradictions.
The standard should define only classes of primitives, rather than specific cryptographic primitives. This agility enables a fast and efficient response should new attacks on the algorithms in use become known. It also increases the applicability of TPMs, for example in government environments where different encryption algorithms are used regionally. Agility in the selection of cryptographic primitives was the primary reason for revising the standard.
The concept of administrator and user roles should be significantly revised to enable more deployment scenarios. At the same time, all use cases achievable with a TPM version 1.2 should remain representable.

The most significant change was the introduction of symmetric cryptography. This had been deliberately omitted in the 1.x versions to avoid potential conflicts with export restrictions. However, the increased flexibility in the selection of cryptographic methods and the use of modern primitives (such as asymmetric methods based on elliptic curves) necessitated the introduction of hybrid methods and, consequently, symmetric primitives. The subsequent relaxation of export restrictions further simplified this decision.

The second significant change was the unification of the various authorization procedures into so-called Enhanced Authorization. This unification expanded the range of possible applications, for example in the form of Extended Authorization Policies, which will be examined in more detail in this article. At the same time, the implementation effort for manufacturers and users of TPMs was reduced.

The goal of easy readability was ultimately subordinated to the need for unambiguousness. In return, it is possible to generate a TPM 2.0 simulator from the specification, the behavior of which can ultimately be used as an authoritative reference.

In addition to the specification [2], A Practical Guide to TPM 2.0 [3] was eventually published as a freely available book to ease access to the technology and the specification. Chapter 3 of this book mentions further innovations in TPM 2.0 that are not relevant to this article and essentially serve to improve user-friendliness.

Extended Authorization Policies

Basics

The core of Extended Authorization Policies is the hash-extend method, which is used to generate a kind of tamper-proof log. However, the logged data itself is not stored, only a current status value. The hashing method used ensures that it is practically impossible to generate a second log that results in the same status value. Likewise, it is practically impossible to continue the log from a given (arbitrary) current status value in such a way that a specific status value is reached. In simple terms, this corresponds to a blockchain integrated into the TPM, where only the current block is stored. More precisely, only its hash value is stored.

The hash chain is updated as follows: if a status value S is stored in the TPM and data D is to be recorded there (the specification speaks of D being extended into S), D is first appended to S. A hash function hash is then used to calculate a new status value S':

S' ← hash(S || D)
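The update rule above can be sketched in a few lines of Python. This is a minimal illustration of the simplified formula used in this article, not of a TPM interface: the function name extend, the choice of SHA-256, the all-zero initial value and the event strings are assumptions made for the example.

```python
import hashlib


def extend(state: bytes, data: bytes) -> bytes:
    """Return the new status value S' = hash(S || D)."""
    return hashlib.sha256(state + data).digest()


# Assumed initial value for this sketch: one hash length of zero bytes.
initial = bytes(hashlib.sha256().digest_size)

# Hypothetical log entries; only the running status value is kept.
state = initial
for event in (b"bootloader", b"kernel", b"rootfs"):
    state = extend(state, event)

print(state.hex())
```

Because each new value depends on the previous one, swapping two log entries or dropping one leads to a different final status value.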

These status values are usually stored in so-called Platform Configuration Registers (PCRs). The name derives from the original and still most common purpose: securing and logging the boot process (Secure Boot or Trusted Boot). The PCRs of a TPM cannot be written directly. Starting from a well-defined initial value, they can only be changed through the extend operation. This makes reverting to a previous value or selectively writing a status value practically impossible.

The extend operation itself is also a central component of Extended Authorization Policies.

Concept

To ensure smooth and secure operation, the TPM checks with every received command whether the sender is authorized to execute it, specifically to use the resources (entities) referenced in the command. In the simplest case, this verification is done by entering a password or using a session key agreed upon with the TPM.

Extended Authorization Policies, sometimes also called Enhanced Authorization Policies and often simply shortened to policies, represent another and significantly more powerful way to grant and verify rights. The relevant literature even ventures the claim: "Clever policy designs can allow virtually any restriction on key use that you can envision, [...]" [3, p. 33]. In practice, of course, the required implementation effort sets a natural limit.

A simple example is the following scenario: In a company, the keys for email communication are managed by TPMs. When a user receives an encrypted email, they can have it decrypted by the TPM after entering their passphrase. However, in the company's interest, access to emails without knowing this passphrase should also be possible, for example, after an employee leaves the company or in case of sudden illness. For this purpose, passwords are stored for the works council, the managing director, and the administrator. A policy can then be used to stipulate that access to the key material, and thus to the emails, is also possible if at least two of these three passwords have been entered correctly. This protects both the company's interests and the employee's privacy.

How it works

Policies attest to a specific system state, property, or event. This could be, for example, the presence of a specific value or range of values in a PCR or in the non-volatile memory of the TPM. More complex conditions can also be addressed, such as a specific value in a second TPM integrated into a fingerprint reader.

The purpose of a policy is always to grant access to an entity of the TPM, usually a secret key or an area of non-volatile memory. A policy is therefore a set of restrictions that limit access.

A special policyDigest register is used to check compliance with the restrictions. If a policy is met, the policyDigest register is updated using the extend function and a specific fingerprint of the policy (and the condition). The release of the entity is then bound to a specific value of the policyDigest register. This value must be known when the entity is created.

Two types of policies must be distinguished: policies whose conditions are checked immediately (immediate assertion) and policies whose conditions are checked later (deferred assertion). While the former work as already described, the latter have their conditions checked only when the policy-protected entity is accessed. The policyDigest is nevertheless updated with a specific value as soon as the policy is asserted. The TPM then ensures that the conditions are actually checked when the entity is accessed.
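The role of the policyDigest register can be illustrated with a toy model. The sketch below is purely hypothetical: the class PolicySession, the function reference_digest and the fingerprint strings are invented for illustration and do not correspond to a real TPM interface. It only shows the principle that each satisfied policy extends the register and that access is bound to the resulting value.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size


class PolicySession:
    """Toy model of a policy session (hypothetical, not a TPM interface)."""

    def __init__(self) -> None:
        # Assumption: a fresh session starts with an all-zero policyDigest.
        self.policy_digest = bytes(DIGEST_SIZE)

    def assert_condition(self, fingerprint: bytes, holds: bool) -> None:
        """Extend policyDigest with the policy's fingerprint if the condition holds."""
        if not holds:
            raise PermissionError("policy condition not met")
        self.policy_digest = hashlib.sha256(
            self.policy_digest + fingerprint
        ).digest()


def reference_digest(fingerprints) -> bytes:
    """Compute the value that must be known when the entity is created."""
    session = PolicySession()
    for fingerprint in fingerprints:
        session.assert_condition(fingerprint, holds=True)
    return session.policy_digest


# Hypothetical fingerprints of two conditions.
policy = [b"nv-area-holds-expected-value", b"pcr-has-expected-value"]
auth_policy = reference_digest(policy)  # bound to the entity at creation

# Later: a session that actually satisfies both conditions, in the same order.
session = PolicySession()
for fingerprint in policy:
    session.assert_condition(fingerprint, holds=True)

print(session.policy_digest == auth_policy)
```

In this model, a session that skips a condition, or fails one, never reaches the value bound to the entity.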

Policies with deferred validation allow for the preprocessing of complex policy combinations that link static conditions (e.g., specific hardware configurations) and external or dynamic events (e.g., specific time periods). The most typical example, however, is simple password authorization. Here, only the fact that such an authorization must be verified when accessing the corresponding entity flows into the policyDigest, not the password itself. (The TPM ensures the verification during actual access.)

In general, the reference policyDigest can be calculated offline, i.e., without a TPM and purely in software, since no secrets ever enter this value. In practice, however, a so-called trial policy session is usually used. The policies are applied on the TPM in the same way as during an actual check, except that policy compliance is always assumed. Finally, the required policyDigest value can be read out or used directly to create an entity.
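As an illustration of such an offline calculation, the following sketch computes the reference value for a policy consisting only of TPM2_PolicyAuthValue. It assumes SHA-256 as the session hash and assumes that the update rule is hash(policyDigest || commandCode) with the command code 0x0000016B. Both the rule and the constant are quoted from memory of the TPM 2.0 library specification and should be checked against it before use.

```python
import hashlib
import struct

DIGEST_SIZE = hashlib.sha256().digest_size

# Assumption: command code of TPM2_PolicyAuthValue as listed in the
# TPM 2.0 library specification; verify before relying on it.
TPM_CC_POLICY_AUTH_VALUE = 0x0000016B


def policy_auth_value(policy_digest: bytes) -> bytes:
    """Offline digest update: hash(policyDigest_old || commandCode)."""
    command_code = struct.pack(">I", TPM_CC_POLICY_AUTH_VALUE)
    return hashlib.sha256(policy_digest + command_code).digest()


# A policy session is assumed to start with an all-zero policyDigest.
reference = policy_auth_value(bytes(DIGEST_SIZE))
print(reference.hex())
```

Note that the function takes no password argument: only the fact that an authorization value will be required enters the digest, which is why the calculation needs no secrets.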

Hierarchies

The logical AND operation of multiple policies is trivial: the policies simply need to be checked sequentially. Maintaining the correct order is crucial. However, the TPM 2.0 standard also allows combining policies with a (non-exclusive) OR operation.

This allows for the creation of complex hierarchies in the form of policy trees. Even greater flexibility is achieved through the use of digital signatures. Within a policy hierarchy, the condition "Policy A is fulfilled" can be replaced by "Policy X is fulfilled and a valid signature exists for Policy X." This allows policies to be dynamically exchanged or added, especially when the specific values required to fulfill Policy X are not yet known when the hierarchy is created.
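The effect of ordering in an AND combination, and the branches of a small policy tree, can be sketched with a toy model. The helper names and labels below are hypothetical and the digest update is deliberately simplified. The example reuses the two-of-three password scenario from the Concept section and only computes the branch values that an OR node would accept.

```python
import hashlib
from itertools import combinations

DIGEST_SIZE = hashlib.sha256().digest_size


def assert_policy(policy_digest: bytes, label: bytes) -> bytes:
    """Toy stand-in for one satisfied policy: extend the digest with a label."""
    return hashlib.sha256(policy_digest + label).digest()


def and_chain(labels) -> bytes:
    """AND combination: assert the policies one after another."""
    digest = bytes(DIGEST_SIZE)
    for label in labels:
        digest = assert_policy(digest, label)
    return digest


# The order of the checks is part of the result.
ab = and_chain([b"works-council", b"managing-director"])
ba = and_chain([b"managing-director", b"works-council"])
print(ab != ba)

# Two of three: one AND chain per pair, each in a fixed order.
holders = [b"works-council", b"managing-director", b"administrator"]
branches = {pair: and_chain(pair) for pair in combinations(holders, 2)}
for pair, digest in branches.items():
    print(pair, digest.hex()[:16])
```

An OR node over these three branch values would then express "at least two of the three passwords".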

What sounds complicated in the theoretical description quickly becomes clear when an actually implemented example is considered.

Example: Rights Management

To illustrate these concepts, consider the following scenario: For a vehicle (such as a rental car), various user roles (groups of drivers) are to be defined. Besides the standard driver, who possesses the equivalent of a key, there should be an alternative authorization method for a hotel porter. This porter is only permitted to move the vehicle at a reduced speed for parking purposes. The third role represents a service technician who has access to extended functionality but must authenticate themselves using two-factor authentication. Regardless of the role, the entire policy should be tied to a vehicle attribute. For this purpose, the license plate number or vehicle identification number (VIN) is stored in the non-volatile memory of the TPM module.

Policy hierarchy

These requirements initially result in the policy hierarchy shown in Figure 1 (see PDF). The vehicle ID is stored in the non-volatile memory of the TPM during system provisioning. Assuming for the moment that all underlying policies are satisfied, the SAPI command TPM2_PolicyNV is used to check whether the specified area in the NV-RAM of the TPM contains the desired value.

The underlying OR operation can be implemented with the command TPM2_PolicyOR. The TPM checks whether the current policyDigest corresponds to one of the values defined in the policy. (By default, the number of possible values is limited to eight. However, any number of conditions can be combined by chaining TPM2_PolicyOR commands.) If the policy is fulfilled, i.e., the current policyDigest value is valid, the policy is converted: a hash value is calculated from the policy and extended into a fresh policyDigest. This always results in the same subsequent value for the policyDigest, regardless of which sub-condition is met.

The lowest level of the hierarchy can be realized, for example, by sequentially executing TPM2_PolicyNV and TPM2_PolicyAuthValue.
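The behavior described for TPM2_PolicyOR can be modeled in software as follows. This is a sketch under assumptions: SHA-256 as the session hash, the command code 0x00000171, and the update rule hash(0...0 || commandCode || digest1 || ... || digestN) are quoted from memory of the TPM 2.0 library specification and should be verified there. The role digests are hypothetical placeholders.

```python
import hashlib
import struct
from typing import Sequence

DIGEST_SIZE = hashlib.sha256().digest_size
MAX_BRANCHES = 8

# Assumption: command code of TPM2_PolicyOR; verify against the specification.
TPM_CC_POLICY_OR = 0x00000171


def policy_or(current: bytes, branches: Sequence[bytes]) -> bytes:
    """Model of the OR step: check membership, then restart from a fresh digest."""
    if not 2 <= len(branches) <= MAX_BRANCHES:
        raise ValueError("between two and eight branch digests are expected")
    if current not in branches:
        raise PermissionError("current policyDigest matches no branch")
    h = hashlib.sha256()
    h.update(bytes(DIGEST_SIZE))  # fresh, all-zero policyDigest
    h.update(struct.pack(">I", TPM_CC_POLICY_OR))
    for branch in branches:
        h.update(branch)
    return h.digest()


# Hypothetical branch digests for the three roles of the example.
driver = hashlib.sha256(b"driver policy").digest()
porter = hashlib.sha256(b"porter policy").digest()
technician = hashlib.sha256(b"technician policy").digest()
roles = [driver, porter, technician]

print(policy_or(driver, roles) == policy_or(porter, roles))
```

Because the result depends only on the list of branch values, the policies above the OR node do not need to know which role was actually satisfied.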

However, this hierarchy proves to be insufficiently flexible for practical use, as each role is rigidly linked to a single ID. Furthermore, alternative authentication methods are not yet explicitly represented.

The solution lies in the concept of wildcard policies. Specifically, each user ID in the initial hierarchy is replaced by such a wildcard policy. Using TPM2_PolicyAuthorize, a fingerprint of the public part of an asymmetric key pair is extended into the policyDigest. At this point, any policy can be used for whose final value a valid signature with this key pair exists. The policy hierarchy is then verified in two steps: first, the signature is verified by the TPM. The TPM then issues itself a ticket confirming the successful signature verification. This effectively inserts the specific policy into the policy hierarchy in place of the wildcard policy. The resulting hierarchy is shown in Figure 2 (see PDF). Each of the sub-hierarchies marked with a dashed line can exist multiple times. This allows user IDs to be dynamically activated and deactivated for the different roles.
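Why a wildcard policy leaves the rest of the hierarchy unchanged can be seen in a simplified model of the digest update. The sketch makes several assumptions: SHA-256 as the session hash, the command code 0x0000016A for TPM2_PolicyAuthorize, and a two-stage update over the key name and an optional policy reference, all quoted from memory of the TPM 2.0 library specification. The signature check and the ticket are reduced to a boolean parameter, and key_name is a stand-in for the real name of the verification key.

```python
import hashlib
import struct

DIGEST_SIZE = hashlib.sha256().digest_size

# Assumption: command code of TPM2_PolicyAuthorize; verify against the specification.
TPM_CC_POLICY_AUTHORIZE = 0x0000016A


def policy_authorize(current: bytes, approved: bytes, key_name: bytes,
                     policy_ref: bytes = b"", *, signature_ok: bool) -> bytes:
    """Model of the wildcard step; signature_ok stands in for the TPM's ticket."""
    if current != approved:
        raise PermissionError("session digest is not the approved policy")
    if not signature_ok:
        raise PermissionError("no valid signature for the approved policy")
    command_code = struct.pack(">I", TPM_CC_POLICY_AUTHORIZE)
    step = hashlib.sha256(bytes(DIGEST_SIZE) + command_code + key_name).digest()
    return hashlib.sha256(step + policy_ref).digest()


# Hypothetical stand-ins: a key name and the final digests of two user policies.
key_name = hashlib.sha256(b"public part of the signing key").digest()
user_a = hashlib.sha256(b"policy for user A").digest()
user_b = hashlib.sha256(b"policy for user B").digest()

after_a = policy_authorize(user_a, user_a, key_name, signature_ok=True)
after_b = policy_authorize(user_b, user_b, key_name, signature_ok=True)
print(after_a == after_b)
```

The resulting value depends only on the key, not on the concrete user policy, so signed policies can be added or withdrawn later without changing the value bound to the entity.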

Implementation

The generation and verification of the presented policy hierarchy was implemented at Mixed Mode both as scripts on a Raspberry Pi and as a Qt application for the company's own M2Control demonstrator platform. In both cases, an Infineon SLB 9670 chip was used as the TPM module, connected via SPI.

In the demo, the policies are generated directly on the TPM via a trial session (Figure 3, see PDF). The key pair for the wildcard policies is generated using OpenSSL. User IDs can then be registered and authentications performed via the Qt interface.

Outlook: Further deployment scenarios

The presented example can easily be transferred to a more general rights management system. This can be expanded from pure software license management (release of functionalities on-demand and in the pay-per-use model) to integrated production management (licensing of production data, limitation of produced units, authentication, etc.), since policies can also include simple numerical comparison operations and irreversible counters implemented via the TPM.

Extended Authorizations also enable the certification of policies. This can, for example, increase trust in signature keys, since the TPM can provide cryptographic proof of the form of access control implemented for the signature key, in addition to the signature itself.

Finally, wildcard policies enable secure system updates. They provide the necessary indirection to update a TPM2_PolicyPCR without temporarily breaking the chain of trust. A Secure Boot or Trusted Boot concept implemented in this way can then even be combined with the presented solution.

Summary

The introduction of policies has created a powerful tool for implementing complex yet flexible rights management architectures based on TPM 2.0 modules.

The widespread use of the modules and standardized programming interfaces allow for a fast, secure and sustainable implementation of rights management in a wide variety of application scenarios.

The combination of different types of sub-policies into a complex policy was demonstrated using the Mixed Mode M2Control Embedded Demonstrator. Here, functionality was enabled on a (fictitious) vehicle based on user roles. The authentication of each user was individually configured for each role.

Bibliography and list of sources

[1]	O. Bünte, "heise online," 05 10 2018. [Online]. Available: https://heise.de/-4181988. [Accessed on 12 10 2018].
[2]	Trusted Computing Group, TPM Library Specification 2.0, October 1, 2014.
[3]	W. Arthur, D. Challener and K. Goldman, A Practical Guide to TPM 2.0, Apress, 2015.

Acknowledgments

A big thank you goes to Benedikt Petschkuhn, who designed the practical example and initially implemented it for the Raspberry Pi.

Author

Markus Wamser works as a systems developer and consultant specializing in embedded security at Mixed Mode in Gräfelfing near Munich. Previously, he was a research associate at the Chair of Information Technology Security at the Technical University of Munich from its inception. Markus Wamser has been a speaker at international conferences from Abu Dhabi to Verona, has more than ten years of university teaching experience, and holds degrees in both mathematics and computer science.
