C++11/14 Multithreading

Overview, highlights and pitfalls

Author: Karl Nieratschker, SKT Nieratschker

Contribution – Embedded Software Engineering Congress 2015

Since the introduction of C++11, the C++ standard library has also offered support for developing multithreaded applications. This functionality has been further expanded in the latest standard, C++14. While using the C++ Multithread API simplifies the porting of such applications, it also means that developers are limited to the capabilities of the standard library if they want to benefit from its features. Therefore, not only when developing new applications, but also for existing applications that still rely on platform-specific multithreading solutions, the question arises whether it makes sense to use or migrate to this API. This presentation provides an overview of the capabilities of the C++ Multithread API and shows what needs to be considered when porting applications.

When designing the C++11 multithreaded standard library, many elements were adopted from the widely used C++ Boost library, and additional functionalities were added. However, those familiar with Boost should note that not everything was adopted, and the adopted features were not always implemented exactly the same way. The library's implementation made extensive use of new C++11 language features, such as variadic templates, rvalue references, and lambda functions.

The thread class and its possibilities

A thread is represented by an instance of the `thread` class. When creating this instance, the code to be executed by the thread can be specified as a global function, an instance or class method, a functor, or a lambda function. The constructor creates a runtime thread that can begin executing the code immediately. The destructor of the `thread` class checks whether the object is still associated with a runtime thread (that is, whether it is still joinable) and, if so, calls `std::terminate()`, which aborts the program. To prevent this, you must either wait for the thread to finish using the `join()` method or detach the runtime thread from the C++ thread object using the `detach()` method before the object is destroyed. The thread function can have any number of parameters of any type. The arguments are always copied (or moved) into the thread to ensure a sufficiently long lifetime, so an argument that is meant to be shared by reference has to be wrapped explicitly with `std::ref()` (Figure 1, see PDF).
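The four ways of specifying the thread code can be sketched as follows. This is a minimal illustration, not the article's Figure 1; the names `add_to`, `Doubler`, `Counter`, and `run_thread_variants` are hypothetical. Each thread works on its own variable, so no further synchronization is needed beyond `join()`.

```cpp
#include <functional>
#include <thread>

// Hypothetical helpers for illustration only.
void add_to(int& target, int amount) { target += amount; }

struct Doubler {
    void operator()(int& value) const { value *= 2; }
};

struct Counter {
    int total = 0;
    void add(int amount) { total += amount; }
};

int run_thread_variants() {
    int a = 1, b = 2, c = 3;
    Counter counter;

    // Arguments are copied by default; std::ref passes a reference instead.
    std::thread t1(add_to, std::ref(a), 10);      // global function
    std::thread t2(Doubler{}, std::ref(b));       // functor
    std::thread t3(&Counter::add, &counter, 5);   // instance method on `counter`
    std::thread t4([&c] { c += 100; });           // lambda

    // Every thread object must be joined (or detached) before it is destroyed.
    t1.join();
    t2.join();
    t3.join();
    t4.join();

    return a + b + counter.total + c;   // 11 + 4 + 5 + 103
}
```

Calling `run_thread_variants()` returns 123 once all four threads have been joined.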

Each thread has an ID of type `thread::id` for identification. Since the ID is platform-specific, this type supports only a few operations, such as `get_id()`, which exists both as a method of the `thread` class and as a function in the `this_thread` namespace, as well as the comparison operators. Other functions of the `this_thread` namespace include `sleep_for()` and `sleep_until()`, which allow a thread to sleep for a relative time or until an absolute time, respectively, and the `yield()` function, which gives up the remainder of a thread's time slice. Thread-local storage is also supported in C++11 through the `thread_local` keyword, which is much more intuitive than the Boost approach. Finally, the static method `hardware_concurrency()` returns the number of available hardware execution units (cores). However, this function may also return zero if the value cannot be determined, so the result should only be treated as a hint.
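A short sketch of these facilities, assuming nothing beyond the standard library. The names `per_thread_counter`, `ThreadInfoResult`, and `inspect_threads` are made up for this example.

```cpp
#include <chrono>
#include <thread>

// Every thread gets its own, independently initialized copy of this variable.
thread_local int per_thread_counter = 0;

struct ThreadInfoResult {
    bool ids_consistent;    // worker ID differs from caller ID, both lookups agree
    int caller_counter;     // caller's copy of per_thread_counter
    int worker_counter;     // worker's copy of per_thread_counter
    unsigned cores;         // hint only, may be 0
};

ThreadInfoResult inspect_threads() {
    ThreadInfoResult r{};
    per_thread_counter = 1;   // changes only the calling thread's copy

    std::thread::id id_seen_by_worker;
    std::thread worker([&] {
        id_seen_by_worker = std::this_thread::get_id();
        std::this_thread::sleep_for(std::chrono::milliseconds(10));  // relative time
        std::this_thread::yield();                                   // give up the time slice
        per_thread_counter += 5;   // the worker's copy started at 0
        r.worker_counter = per_thread_counter;
    });
    std::thread::id id_via_object = worker.get_id();   // must be read before join()
    worker.join();

    r.ids_consistent = (id_seen_by_worker == id_via_object) &&
                       (id_seen_by_worker != std::this_thread::get_id());
    r.caller_counter = per_thread_counter;
    r.cores = std::thread::hardware_concurrency();
    return r;
}
```

The caller's counter stays at 1 while the worker's ends at 5, which shows that the two threads never shared the variable.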

Synchronization

To protect resources, C++11 supports not only "normal" mutexes of type `mutex`, but also mutexes that the owning thread can lock multiple times without blocking itself (`recursive_mutex`) and mutexes whose lock attempt fails with a return value of `false` if the mutex cannot be acquired within a certain time (`timed_mutex`, `recursive_timed_mutex`). With C++14, Boost's `shared_mutex` was also incorporated into the standard as `shared_timed_mutex` for multiple-reader/single-writer applications. The classes `lock_guard`, `unique_lock`, and `shared_lock` (C++14) release the mutex automatically in their destructors and thus reliably prevent deadlocks caused by missing mutex releases (Figure 2, see PDF).
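The following sketch shows the lock classes in use. It is not the article's Figure 2; `count_with_lock_guard` and `ReadMostlyValue` are hypothetical names, and the second part requires a C++14 compiler because of `shared_timed_mutex` and `shared_lock`.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Several threads increment one counter; lock_guard releases the mutex
// at the end of each scope, even if an exception is thrown.
int count_with_lock_guard(int threads, int increments) {
    std::mutex m;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &counter, increments] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<std::mutex> lock(m);
                ++counter;
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}

// Multiple-reader/single-writer access to one value (C++14).
class ReadMostlyValue {
public:
    void set(int v) {
        std::unique_lock<std::shared_timed_mutex> lock(mutex_);  // exclusive
        value_ = v;
    }
    int get() const {
        std::shared_lock<std::shared_timed_mutex> lock(mutex_);  // shared
        return value_;
    }
private:
    mutable std::shared_timed_mutex mutex_;
    int value_ = 0;
};
```

With four threads and 1000 increments each, `count_with_lock_guard(4, 1000)` returns 4000. Without the lock the result would be unpredictable.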

One of the biggest highlights of the C++11 multithreading library is atomics. With their help, standard data types such as `int` or `float` can be declared as atomic. The operations on a variable of type `atomic<T>` (or `atomic_int` for C programs) are indivisible and therefore no longer need to be explicitly protected, e.g., by a mutex (Figure 3, see PDF).
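As a minimal sketch (not the article's Figure 3), the counter below is incremented by several threads without any mutex. The function name `count_with_atomic` is hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

int count_with_atomic(int threads, int increments) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, increments] {
            for (int i = 0; i < increments; ++i) {
                ++counter;   // indivisible read-modify-write, no mutex required
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

`count_with_atomic(4, 10000)` returns 40000, because no increment can be lost.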

Furthermore, a new memory model was introduced, which allows classic multithreading problems caused by compiler or processor optimizations to be reliably solved using memory barriers and atomic variables. In addition, atomics, with their `compare_exchange_weak()` and `compare_exchange_strong()` methods, create the prerequisites for lock-free programming. With the exception of the `atomic_flag` data type, the standard does not guarantee that the operations of an atomic data type are lock-free. However, this can be checked using the `is_lock_free()` method. In principle, it is even possible to define custom atomic data types by instantiating `atomic<T>` with a trivially copyable user-defined type.
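A typical compare-exchange loop can be sketched as follows. The example keeps track of a maximum without a lock; `update_max`, `find_max_concurrently`, and `int_atomic_is_lock_free` are hypothetical names chosen for this illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Store `candidate` if it is larger than the current maximum.
void update_max(std::atomic<int>& maximum, int candidate) {
    int seen = maximum.load();
    // On failure compare_exchange_weak writes the current value into `seen`,
    // so the loop condition is re-evaluated against fresh data.
    while (candidate > seen &&
           !maximum.compare_exchange_weak(seen, candidate)) {
    }
}

int find_max_concurrently(const std::vector<int>& values) {
    std::atomic<int> maximum{values.empty() ? 0 : values.front()};
    std::vector<std::thread> pool;
    for (int v : values) {
        pool.emplace_back([&maximum, v] { update_max(maximum, v); });
    }
    for (auto& th : pool) th.join();
    return maximum.load();
}

// Whether atomic<int> is lock-free depends on the platform.
bool int_atomic_is_lock_free() {
    std::atomic<int> probe{0};
    return probe.is_lock_free();
}
```

The weak variant may fail spuriously, which is harmless here because the loop simply retries.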

For event synchronization, C++11 provides condition variables. Using the `wait` method of the condition variable, a thread can wait for a condition to be met. If another thread executes code that meets the condition, the `notify_one` method of the condition variable must be called after the execution of that code to wake the waiting thread. `notify_all` can even wake multiple waiting threads simultaneously. In any case, after returning from the wait state, a thread must re-examine the condition, as the runtime system might wake the thread for other reasons as well (so-called spurious wakeups).
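The pattern described above might look like this. The sketch uses the `wait` overload that takes a predicate, which re-examines the condition after every wakeup; `wait_for_value` and the variable names are hypothetical.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

int wait_for_value() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;   // the condition, protected by `m`
    int value = 0;

    std::thread producer([&] {
        {
            std::lock_guard<std::mutex> lock(m);
            value = 42;
            ready = true;   // fulfil the condition while holding the mutex
        }
        cv.notify_one();    // then wake the waiting thread
    });

    int received = 0;
    {
        std::unique_lock<std::mutex> lock(m);
        // Equivalent to: while (!ready) cv.wait(lock);
        cv.wait(lock, [&] { return ready; });
        received = value;
    }
    producer.join();
    return received;
}
```

Because the condition is stored in `ready`, the code also works when the producer finishes before the consumer starts waiting.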

Finally, the `call_once` function ensures that a function is only executed once, regardless of how often or by how many threads it is called. This is typically needed in the context of initializations.
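A minimal sketch of such a one-time initialization, with the hypothetical function name `count_initializations`:

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Returns how often the initialization code actually ran.
int count_initializations(int threads) {
    std::once_flag flag;
    std::atomic<int> init_calls{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&flag, &init_calls] {
            // All threads call call_once, but only one executes the lambda.
            std::call_once(flag, [&init_calls] { ++init_calls; });
        });
    }
    for (auto& th : pool) th.join();
    return init_calls.load();
}
```

Regardless of the number of threads, the function returns 1.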

Futures

Unlike on many other multithreading platforms, the return value of a C++11 thread function cannot be used to provide a result. Instead, an object of type `future` must be used. It temporarily stores the result if the receiving thread is not yet ready, and it blocks the receiving thread if the result is not yet available. There are three ways to use it, at increasing levels of abstraction.

The variant with the lowest level of abstraction is based on a promise object. The promise object is created and made available to the thread that determines the result, which then passes the result to the promise object using the `set_value()` method. The receiver obtains a future object using the promise object's `get_future()` method. After completing any work it can do in parallel, it accesses the result using the future object's `get()` method, blocking if necessary until the result is available.
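This first variant can be sketched as follows. The name `compute_with_promise` is hypothetical, and handing the promise to the worker by move is just one possible way of making it available.

```cpp
#include <future>
#include <thread>
#include <utility>

int compute_with_promise() {
    std::promise<int> promise;
    std::future<int> result = promise.get_future();   // receiver's end

    // The promise is moved into the worker, which delivers the result.
    std::thread worker(
        [](std::promise<int> p) { p.set_value(6 * 7); },
        std::move(promise));

    // ... the receiver could do other work here ...

    int value = result.get();   // blocks until set_value() has been called
    worker.join();
    return value;
}
```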

At the next higher level of abstraction, a `packaged_task` object is created and parameterized with a function that returns the result as a normal return value. The `get_future()` method of the task object then returns the corresponding future object. To achieve concurrency, a thread must again be created and parameterized with the task object. The result is retrieved as in the first variant.
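A sketch of the second variant, using the hypothetical functions `square` and `compute_with_packaged_task`:

```cpp
#include <future>
#include <thread>
#include <utility>

int square(int x) { return x * x; }   // result is an ordinary return value

int compute_with_packaged_task() {
    std::packaged_task<int(int)> task(square);
    std::future<int> result = task.get_future();

    // packaged_task is move-only, so it is moved into the thread.
    std::thread worker(std::move(task), 9);

    int value = result.get();   // blocks until square(9) has returned
    worker.join();
    return value;
}
```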

At the highest level of abstraction, the function that returns the calculated result as a normal return value is simply passed to the `async` function. Whether the function is executed asynchronously in its own thread or synchronously in the context of the thread requesting the result can be left to the runtime system or determined using a parameter. `async` returns the future object directly as its return value (Figure 4, see PDF).
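The three ways of calling `async` can be sketched like this. It is not the article's Figure 4, and `run_async_variants` is a hypothetical name.

```cpp
#include <future>

int run_async_variants() {
    // Forced to run asynchronously in its own thread.
    std::future<int> in_new_thread =
        std::async(std::launch::async, [] { return 1; });

    // Runs synchronously in the calling thread when get() is called.
    std::future<int> on_request =
        std::async(std::launch::deferred, [] { return 20; });

    // No policy given: the runtime system decides.
    std::future<int> runtime_decides = std::async([] { return 300; });

    return in_new_thread.get() + on_request.get() + runtime_decides.get();
}
```

The sum is 321 in every case, because only the execution context differs between the three calls.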

The result of a future object can only be read once. If multiple threads require the same result, `shared_future` objects must be used.
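A minimal sketch with several readers, assuming that each thread receives its own copy of the `shared_future`. The name `read_result_from_many_threads` is hypothetical.

```cpp
#include <atomic>
#include <future>
#include <thread>
#include <vector>

int read_result_from_many_threads(int readers) {
    std::promise<int> promise;
    // share() converts the one-shot future into a copyable shared_future.
    std::shared_future<int> result = promise.get_future().share();

    std::atomic<int> sum{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < readers; ++i) {
        // Each thread captures its own copy and may call get() on it.
        pool.emplace_back([result, &sum] { sum += result.get(); });
    }

    promise.set_value(5);   // releases all waiting readers
    for (auto& th : pool) th.join();
    return sum.load();
}
```

With three readers the function returns 15, since every reader sees the same value 5.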

Things that the standard does not support

Unlike, for example, the POSIX/Pthread standard, the C++11/14 multithreading API does not offer a direct way to assign different priorities to threads or to create mutexes with priority inheritance or ceiling priority. This is a significant limitation, especially for use in real-time systems. While it is possible to obtain the handle of the corresponding runtime thread using the `native_handle()` method of the `thread` class and use it to set the thread's priority at the operating system level, this approach compromises portability and does not solve the problem of priority inversion.
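The `native_handle()` route might look like the sketch below. It assumes a POSIX platform on which `native_handle()` yields a `pthread_t` (for example GCC on Linux) and is therefore not portable. The function name `query_and_reapply_scheduling` is hypothetical, and the sketch only re-applies the current settings, since raising the priority usually requires special privileges.

```cpp
#include <atomic>
#include <pthread.h>
#include <sched.h>
#include <thread>

// Returns the result of pthread_getschedparam (0 on success).
// Assumption: std::thread::native_handle() is a pthread_t on this platform.
int query_and_reapply_scheduling(int* set_result = nullptr) {
    std::atomic<bool> stop{false};
    std::thread worker([&stop] {
        while (!stop.load()) std::this_thread::yield();   // keep the thread alive
    });

    int policy = 0;
    sched_param param{};
    int get_rc = pthread_getschedparam(worker.native_handle(), &policy, &param);

    if (get_rc == 0) {
        // A real-time application would select e.g. SCHED_FIFO and a
        // priority here; this sketch writes back the values it just read.
        int set_rc = pthread_setschedparam(worker.native_handle(), policy, &param);
        if (set_result != nullptr) *set_result = set_rc;
    }

    stop.store(true);
    worker.join();
    return get_rc;
}
```

Even where this works, mutexes created through the C++ API remain without priority inheritance, so priority inversion is still possible.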

Furthermore, C++11/14 lacks support for asynchronously interrupting or terminating a thread. Although such features are best avoided because of the complexity they entail, doing without them is often difficult in practice. The POSIX/Pthread standard offers good support in this area as well.

Other things that those familiar with other multithreading solutions might miss in C++11/14 include, for example, general (counting) semaphores, as well as efficient event flag mechanisms for signaling pure events where no data access needs to be synchronized.
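A counting semaphore can be approximated with the primitives that are available. The following is one possible workaround built from a mutex and a condition variable, not something the article or the C++11/14 library provides; `CountingSemaphore` and `max_parallel_holders` are hypothetical names.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class CountingSemaphore {
public:
    explicit CountingSemaphore(int initial) : count_(initial) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });   // wait for a permit
        --count_;
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Measures how many threads were inside the guarded region at once.
int max_parallel_holders(int threads, int permits) {
    CountingSemaphore sem(permits);
    std::atomic<int> inside{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&sem, &inside, &peak] {
            sem.acquire();
            int now = ++inside;
            int seen = peak.load();
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {
            }
            std::this_thread::yield();
            --inside;
            sem.release();
        });
    }
    for (auto& th : pool) th.join();
    return peak.load();
}
```

The measured peak never exceeds the number of permits. Such a wrapper is less efficient than a native semaphore, because every operation takes the mutex.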

Things to consider during the porting process

When porting multithreaded applications, it is crucial to be aware that the scheduling mechanisms of different platforms can vary significantly in detail. This applies not only to how the scheduler allocates processor time to threads but also, for example, to the fairness of synchronization objects. A ported application may run too slowly in some respects, which is particularly problematic for real-time systems, but it should in principle function on any supported platform without modification. If this is not the case, it is often because the design implicitly assumed that certain characteristics of the original platform were always present.

Sometimes, porting problems stem from the fact that the standard has been implemented differently by compiler vendors. Aside from actual implementation errors, this can also be due to the standard not specifying everything down to the last detail. For example, if a mutex is released by a thread that does not actually own it, the standard leaves the behavior undefined, so the response depends on the implementation. GCC 4.9.2 on Linux executes the operation, while Visual Studio 2015 on Windows throws an exception (Figure 5, see PDF).

Reference

Anthony Williams, "C++ Concurrency in Action"

Published by

weissblau media