Select Page

Compile-time programming

Author: Rainer Grimm

Contribution – Embedded Software Engineering Congress 2015

What do classic template metaprogramming, the functions of the type traits library, and constant expressions have in common? They are all executed at compile time. This combines higher performance with extended functionality. Higher performance because runtime calculations are shifted to compile time. Extended functionality because programming at compile time can modify the resulting C++ source code. But how does all this magic work?

Template metaprogramming

In 1994, Erwin Unruh discovered template metaprogramming by chance. His program calculated the first 30 prime numbers at compile time. The output of these prime numbers was part of the error message.

01 | Type `enum{}' can't be converted to txpe `D<2>' (‘primes.cpp‘,L2/C25).

02 | Type `enum{}' can't be converted to txpe `D<3>' (‘primes.cpp‘,L2/C25).

03 | Type `enum{}' can't be converted to txpe `D<5>' (‘primes.cpp‘,L2/C25).

04 | Type `enum{}' can't be converted to txpe `D<7>' (‘primes.cpp‘,L2/C25).

05 | Type `enum{}' can't be converted to txpe `D<11>' (‘primes.cpp‘,L2/C25).

06 | Type `enum{}' can't be converted to txpe `D<13>' (‘primes.cpp‘,L2/C25).

07 | Type `enum{}' can't be converted to txpe `D<17>' (‘primes.cpp‘,L2/C25).

08 | Type `enum{}' can't be converted to txpe `D<19>' (‘primes.cpp‘,L2/C25).

09 | Type `enum{}' can't be converted to txpe `D<23>' (‘primes.cpp‘,L2/C25).

10 | Type `enum{}' can't be converted to txpe `D<29>' (‘primes.cpp‘,L2/C25).

How does all this magic work? The compiler instantiates the templates and generates the temporary C++ source code, which is then compiled along with the rest of the source code. At runtime, only the executable code is available.

Thus, calling the faculty function leads to Factorial<5>::value The small code example shows that the value is calculated at compile time.

template

struct Factorial{

    static int const value= N * Factorial ::value;

};

template <>

struct Factorial<1>{

    static int const value = 1;

};

std::cout << Factorial<5>::value << std::endl;

std::cout << 120 << std::endl;

The following illustration nicely shows (see PDF), that the value of the factorial of 5 is already available as a constant at runtime. The compiler instantiates the expression in this case. Factorial<5>::value. To calculate this value, he needs the expression Factorial<4>::value. This recursion ends when the value Factorial<1>::value is needed, because its value is 1.

Functions such as the Factorial Functions that are executed at compile time are called metafunctions. They possess some interesting properties. In addition to integers, they can also use types and class types as template parameters. Metafunctions are essentially class templates. They cannot modify their data, but rather generate new data as needed.

Template metaprogramming, which is based on metafunctions that act on their metadata at compile time, is a purely functional sublanguage in the imperative language C++. This functional sublanguage is Turing-complete [1] and is comparable in power to the programming languages C++, C, or Java.

Template metaprogramming is the basis for many Boost libraries [2]. This also applies to the Type Traits library, which has been part of the C++ standard since C++11.

The Type-Traits library

The Type Traits library allows type queries, comparisons, and transformations to be performed at compile time. It is an application of template metaprogramming and primarily serves two purposes: correctness and optimization.

correctness

The Type-Traits library, in combination with the static_assert Expression: to impose binding conditions on the source code that are applied at compile time.

The qcdThe algorithm for calculating the greatest common divisor of two integers is far too generic. While it can be used with floating-point numbers, strings, and various data types such as... int (100) and long int (10L) is used. However, the compiler responds with a very verbose error message.

#include

#include

template

T gcd(T a, T b){

  if( b == 0 ) return a;

  else return gcd(b, a % b);

}

int main(){

  std::cout << gcd(100,10) << std::endl; // 10

  std::cout << gcd(100,33) << std::endl; // 1

  std::cout << gcd(100,0) << std::endl; // 100

  std::cout << gcd(3.5,4.0) << std::endl; // ERROR

  std::cout << gcd(„100″,“10“) << std::endl; // ERROR

  std::cout << gcd(100,10L) << std::endl; // ERROR

}

The Type-Traits library allows you to explicitly define the conditions for the algorithm. And all this without any additional cost to the program's runtime.

template

typename std::conditional<(sizeof(T1) ::type gcd(T1 a, T2 b){

  static_assert(std::is_integral ::value, „T1 should be integral!“);

  static_assert(std::is_integral ::value, „T2 should be integral!“);

  if( b == 0 )return a;

  else return gcd(b, a % b);

}

So the new qcd-Algorithm to determine if the two template arguments T1 and T2 are integers: std::is_integral ::value. The return type for the greatest common divisor is given by the qcd-Algorithm returns the smaller of the two types: std::conditional<(sizeof(T1) ::type.

optimization

The Type Traits library allows you to write code that optimizes itself during compilation. For example, the Standard Template Library (STL) contains optimized versions of well-known algorithms. std::copy, std::fill or std::equal. 

The idea is quite simple. Whenever the algorithms operate on containers whose elements are sufficiently simple, an optimized algorithm is used. This makes it possible to avoid copying the elements element by element (std::copy), to fill (std::fill) or to compare (std::equal), but to process entire container areas. This is where C functions come in. memmovememset and memcpy for use.

Whether the container elements are sufficiently simple is determined at compile time by the functions of the Type-Traits library.

template

void fill_impl(I first,I last,const T& val,

               const std::integral_constant &){

   while(first != last){

    *first = val;

    ++first;

  }

}

template

void fill_impl(T* first, T* last, const T& val, const std::true_type&){

  std::memset(first, val, last-first);

}

template

inline void fill(I first, I last, const T& val){

  typedef integral_constant <bool,is_trivially_copy_assignable ::value

                                 && (sizeof(T) == 1)> boolType;

  fill_impl(first, last, val, boolType());

}

 

Becomes std::fill When called, the compiler uses the implementation function fill_impl with std::memset if and only if the elements of the container have a copy constructor automatically generated by the compiler and are one byte in size: is_trivially_copy_assignable ::value && (sizeof(T) == 1.

The performance difference is significant. For example, the program fills an array with 100,000,000 elements.

 

const int arraySize = 100'000'000;

char charArray1[arraySize]= {0,};

char charArray2[arraySize]= {0,};

int main(){

  auto begin= std::chrono::system_clock::now();

  fill(charArray1, charArray1 + arraySize,1);

  auto last= std::chrono::system_clock::now() – begin;

  cout << „charArray1: “ << duration (last).count() << “ seconds“;

  begin= std::chrono::system_clock::now();

  fill(charArray2, charArray2 + arraySize, static_cast (1));

  last= std::chrono::system_clock::now() – begin;

  cout << „charArray2: “ << duration (last).count() << “ seconds“;

}

 

While the elements of the charArray1 While `charArray2` is initialized with the number 1, which is usually 4 or 8 bytes in size, `charArray2` is initialized with a char: static_cast (1).  

The high-performance version of the algorithm is 20 times faster (see figure)., PDF file).

Constant expressions

Constant expressions can be evaluated at compile time and stored in ROM. They give the compiler deep insight into the source code and are implicitly thread-safe.

Constant expressions exist as variables, user-defined types, and functions. Unlike template metaprogramming, they allow programming at compile time in an imperative style. To ensure that constant expressions are evaluated at compile time, a few rules must be followed.

  • Variables are implicit const and must be initialized by constant expressions: constexpr double pi= 3.14;
  • User-defined types cannot have a virtual base class. Their constructor must be compiler-generated and itself a constant expression. Methods that use objects of user-defined types must be constant expressions and cannot be virtual.
  • Functions cannot be static and thread_local Variables contain

The gcdThe `-function` is a constant expression. If its arguments are also constant expressions, it can be evaluated at compile time. The following program demonstrates exactly that:

#include

constexpr auto gcd(int a, int b){

  while (b != 0){

    auto t= b;

    b= a % b;

    a= t;

  }

  return a;

}

int main(){

  consexpr int i = gcd(11, 121);

  std::cout << i << std::endl; // 11

  constexpr int j= gcd(100,1000);

  std::cout << j << std::endl; // 100

}

A deeper look into the object code reveals that the results of gcd-Calls already available at compile time:

mov $0xb,%esi

mov $0x601080,%edi

mov $0x64,%esi

mov $0x601080,%edi

Constant expressions versus template metaprogramming
constexpr functions differ significantly in detail from metafunctions used in template metaprogramming. For example, they can execute constant expressions at compile time and runtime, and contain jump and iteration statements as well as conditional statements. Furthermore, constexpr functions support data manipulation and working with floating-point numbers.
Compile-time programming as a powerful tool
Template metaprogramming, the type traits library, and constant expressions all have in common that they are executed at compile time. However, while template metaprogramming requires a deep understanding of purely functional programming, the use of the new C++11 type traits library is significantly more intuitive for programmers trained in imperative programming. Only the naming conventions are reminiscent of template metaprogramming. Constant expressions, as defined by the C++14 standard, allow for direct programming in the imperative style.
Further information

[1] Turing completenesst

[2] Boost library

Rainer Grimm, „C++11 for Programmers“, O'Reilly 2013

Rainer Grimm, „Modern C++ in Practice“, series in Linux Magazine


Real-time – MicroConsult Training & Coaching

Do you want to bring yourself up to date with the latest technology?

Then find out more here MircoConsult offers training courses/seminars/workshops and individual coaching on the topic of embedded and real-time software development.

Training & coaching on the other topics in our portfolio can be found here. here.


Real-time expertise

Valuable expertise in architecture/embedded and real-time software development is available. here Available for you to download free of charge.

To the specialist information

You can find expertise on other topics in our portfolio here. here.

MicroConsult Newsletter

With the MicroConsult newsletter, you'll stay on the pulse of the embedded world. Look forward to proven practical knowledge, real professional tips, and current events – directly from our experts for your project success.

Subscribe now!

Published by

weissblau media

weissblau media