Create formally correct C code faster by using SPARK.

Finally, an easy way to write secure software.

Authors: Ingo Houben, Rob Tice, AdaCore

Contribution – Embedded Software Engineering Congress 2017

This work demonstrates how formally correct code can be generated in the C programming language through an approach that supports the necessary information for formal verification within the SPARK programming language itself. From our perspective, this is a novel approach to software development, offering numerous advantages by reducing development effort and thus saving costs. This is particularly relevant to the costs incurred when searching for errors in late development stages. The main focus of this work is to describe a possible workflow that minimizes costs associated with static analysis and formal methods for finding potential runtime errors in early development phases. We will limit ourselves to writing program code and will not consider topics such as safety studies and requirements engineering. We will also not provide a detailed description of how the formal methods function, as this would involve too many topics and details, which are already available in published work [4] [3].

The writing of software for applications is still largely done by humans. However, humans make mistakes, mostly unintentional and unconscious ones. Statistics show that approximately 10-15 errors occur in every 1,000 lines of code. If we now consider a safety-critical application with around 800,000 lines of code, we can assume that it contains 8,000 to 15,000 errors. Therefore, the development of safety-critical applications needs ways to drastically reduce the number of errors. Formal methods can help here. But this isn't just true for safety-critical applications; it also applies to software in our everyday lives. How often are users confronted with poorly written software? Crashes or incorrectly implemented functionality are commonplace. The user becomes a beta tester. The use of formal methods should therefore also be applied to everyday applications to improve software quality.

But first, let's look at what options are currently available to reduce the number of errors in the software.

A widely used approach is the application of coding standards, which every developer must adhere to. MISRA-C is a popular example. The C and C++ languages have many ways of being ambiguous, and guidelines help reduce the potential number of errors. They also make the code more readable for others. Tools are available to help programmers apply these standards. However, MISRA-C, and other standards as well, do not help find errors in the software. They only restrict the languages and help avoid syntactic errors and ambiguous use of constructs. Even if the rules of a standard are applied and syntactically correct code is present, it does not necessarily mean that it will function correctly [7].

Another approach is to increase the number of tests. More executed tests should, in theory, uncover more errors. The problem with this approach is that tests are usually designed to test the correct behavior. Modern development processes even stipulate that tests must be written before the application is developed. This highlights the problem, because the tests are written to reflect the correct behavior of the application and are not designed to find hidden errors in the program. Therefore, more testing does not necessarily mean that all errors will be found. Tests help to find more errors if they occur during the testing phase. Various tools are available that help identify the necessary tests by analyzing the software code. However, using these tools and executing the tests consumes more time in the development process [6].

Another approach could be to increase the number of reviews. This increases the likelihood of finding bugs, but it's still no guarantee. Since review work is done by people, we're back to the first problem: people make mistakes. So reviews aren't the perfect solution. They also increase the workload and consume more development time. Furthermore, reviews only make sense if the amount of code being reviewed is small. As the amount of code increases, or if the code becomes complex, the probability of overlooking problems also rises.

The next possible approach is the use of static analysis tools. This is a step in the right direction and is prescribed as essential by Mr. Holzmann in his ten rules [1] for the development of safety-critical systems. However, static analysis tools will not find functional errors in the code. As an example, we can consider a function that is supposed to sort data in ascending order but does so in descending order [8]. Static analysis tools will not find this type of error. In recent years, considerable effort has been invested in improving the quality of the algorithms in the analysis tools and reducing the number of "false positives." "False positives" are errors that the tools indicate but are not actually errors [8]. This has made static code analysis tools very useful and helpful—but it still leaves the gap of finding functional errors.

Certification is a combination of all the aforementioned methods. Different industries have different procedures that must be followed and then approved by a certification body. DO178 and EN 50128 are examples of this. While there is no official certification body for ISO 26262, it is still a step in the right direction. Certification guarantees a high standard, ensuring that all steps have been carried out correctly and that the number of errors has been kept to a minimum. However, it is still not a guarantee of the complete absence of errors.

Another way to find errors is through the use of formal methods. These are an extension of static analysis methods, but with a different purpose. Static analysis methods are used to find errors in software, while formal methods are used to develop error-free software from the outset.

The difference in effort invested between a test-based approach and an approach that uses formal methods can be seen in Figure 1 (see PDF) can be seen.

The red dashed line shows the cost of finding software bugs over the course of a development project; the solid lines indicate when and how many bugs are found. In later stages of the development process, the cost of fixing bugs increases significantly compared to finding them in earlier stages. This is precisely the case in traditional development processes with a test-based approach. Problems are found during the testing and integration phases, thus generating higher costs the later they are discovered. This has led to cost overruns in projects, reductions in planned functionality, or even total project failure. By using formal methods, bug detection can be postponed to earlier development stages, thereby significantly reducing costs. This is known as frontloading. This is illustrated on the right side of Figure 1 (see PDF) represented by the green, non-dashed line.

If we now come to the conclusion that static methods are helpful – what options are available?

There are two ways to apply formal methods. On the one hand, there is the FRAMA-C approach [2]. As the name suggests, this approach uses the C programming language. FRAMA-C provides a platform that can formally check code using several different tools. These tools are called provers. However, to enable these tools to do this, the C code must be annotated with additional comments that explain to the provers how the code should function. This represents an additional effort for the developers. It also means that an additional language must be learned, as the comments must be written in ACSL (ANSI/ISO C Specification Language) [9]. This approach has shown success. However, it also increases development time because additional effort is required, and more time is spent writing the program due to the insertion of the necessary comments.

Instead of combining two languages, there is a more efficient approach that contains all the necessary constructs in one language. This approach is pursued with SPARK [11]. The differences between FRAMA-C and SPARK can be seen in Figure 2 (see PDFSPARK is not a way to find errors, but rather a way to guarantee complete error-free software. The underlying mathematics is comparable to that of FRAMA-C, and the same provers can be used. Only the approach differs. FRAMA-C uses annotations in the form of comments, while SPARK is a subset of the Ada programming language and contains all the necessary annotations within the language itself. As demonstrated by the experiment of Christophe Garion and Jérôme Hugues, the weaker semantics of C require twice as many annotations and therefore twice the work [2]. They verified the complete error-free status of C and SPARK runtime libraries for AADL, using FRAMA-C for the C version of the library and SPARK for the Ada version, with the GNATprove tool. SPARK is thus a less labor-intensive approach than FRAMA-C because the programmer can work in the same language in which they write the code and the conditions for verifying correct functionality.

One obstacle to using SPARK is the requirement for an Ada 2012 compiler for the hardware in use. While there is very good support for many platforms, it is less likely that an Ada 2012 compiler will be available for less common platforms. It is more probable that an ISO-compatible C compiler will be available. However, even if no Ada compiler exists for a particular platform, it is still possible to utilize the advantages of SPARK.

The solution we present here is the use of the C language as an intermediary, allowing the hardware's C compiler to generate machine code. This enables us to write SPARK code and perform formal analyses on almost any platform. For simplicity, we limit our analyses to data and program flow analysis. This is comparable to the Bronze Level as described in the "Implementation Guidance for Adoption of SPARK" [12]. However, this concept can be elevated to the Silver or Gold Level to formally verify fundamental code properties.

In this example, we also used a hybrid approach, employing SPARK only for a few functions and leaving the rest in Ada or C. The advantage of this approach lies in limiting the effort to just a few functions that absolutely must be error-free, while the rest consists of, for example, legacy code. This offers the possibility of integrating SPARK and formal methods into software development step by step.

Our example demonstrates SPARK 2014 for an AVR platform, specifically an Arduino Uno. This is a 16-bit platform for which there is no Ada 2012 compiler. One disadvantage of using C as an intermediary is that you cannot utilize the full power of Ada as a programming language. The Common Code Generator (CCG) we use to generate C code from the Ada/SPARK code has only a minimal set of runtime components included with the toolchain. However, because we are generating C code, which we then convert into an executable in the Arduino build environment, we can use the Board Support Package (BSP), the runtime environment, and the flashing tools that Arduino provides with its platform.

Workflow overview

The development process is in Figure 3 (see PDFAs demonstrated, we wrote SPARK code and verified it using SPARK proofs. We then compiled the verified SPARK code to C using CCG. Next, we imported the C code into an Arduino-compatible library format and generated the necessary metadata. From the Arduino sketch, we called our application entry point, which was created by the CCG binding step. We then executed all the necessary setup sequences required by our application. Afterward, we called our workloop in the application from the `workloop()` function of the Arduino sketch. We then instructed the Arduino build system to search for the Arduino library that was output by CCG. Once this was done, we could build the software in the Arduino environment and deploy it to the hardware. During the build process, the Arduino environment used a set of configuration files to generate Makefiles and link the Arduino runtime environment, the BSP, our library, and the main loop of the Arduino sketch into the final binary, which could then be deployed to the target.

As mentioned earlier, we rely on the Arduino runtime environment in our SPARK application. The best way to do this is by using Ada's Pragma Import feature to tell CCG that we are calling external C functions. This is comparable to the `external` keyword in C. In our application, we use many files from the runtime environment. Some examples are shown in Figure 4 (see...). PDFAs demonstrated, in many cases it makes sense to wrap calls to C functions in a small subroutine to create a stronger and more robust interface. Compared to the loose binding of a direct call to the C routine, using a subroutine helps us to separate potentially unsafe C operations from the SPARK code and consider them separately from our SPARK application.

The application in our example is a simple robot that uses a reflective sensor cluster to follow a white line on a black background. The software architecture is shown in Figure 5 (see PDFThe top-level package, sparkzumo, creates the entry point for the Arduino application. The setup function allows the Arduino sketch to initialize the sensors and motors, as well as calibrate the sensor cluster. The workloop then calls the LineFinder function.

LineFinder is the most interesting part of the application because it does more than just a simple hardware and driver abstraction. In this part, we take the information from the sensor cluster, calculate and decide how the robot should stay on the line or even find the line again if it has lost it.

The data flow of this function can be seen in Figure 5 (see PDF). The first step is to determine the line's position relative to the robot. This is done using the ReadLine function. It calibrates the sensor cluster's values, filters out noise, and scales the values based on the robot's position. The sensor values are then summed and averaged to obtain an estimated value indicating the robot's position relative to the line. The underlying mathematics can be seen in Figure 6 (see PDF) can be seen.

A value of -2500 indicates that the line is at the far left of the robot. A value of 2500 indicates that the line is at the far right of the robot. A value of zero indicates that the line is centered under the robot. If we don't find the line under the robot, we will begin searching for it in increasingly larger circles until we find it. If we still can't find the line, we enter an offline error correction mode that counts how many times we have failed to find the line and changes the error values based on the number of iterations. This means that each time this function is called, a slightly larger radius is used. The function uses the previous value and employs a simple first-order correction loop. Ideally, when the value is zero, meaning the line is centered under the robot, both motors for left and right movement run at the same speed. If the error value deviates, the motors are controlled to compensate for the error and bring the robot back onto the line.

Based on this description, we can formulate the requirements for our code in SPARK to determine if the code does what it's supposed to do. Let's start with the ReadLine function and define requirements based on its functionality. We know that the function reads the values from the sensor cluster and should return a value between -2500 and 2500. We also know that if the line isn't found, the algorithm remembers where it last saw the line and returns a corresponding value. This helps us find the line more quickly if it's lost. We therefore have at least one variable, LastValue, which is updated on each iteration. We also know that if no line is found, the return value is either -2500 or 2500. From this, we can see the requirements shown in Figure 7 (see...). PDFCreate the specification shown.

We use SPARK's GLOBAL aspect to define the function's data dependencies on global variables. We expect LastValue to be read, modified, and written. We also use POST conditions to define the values after the function's execution, thus establishing a contract between the caller and the called function. Contracts with subroutines guarantee that the requirements are met. For example, we specify that if no line is found, the function's return value must be a maximum value, namely -2500 or 2500 of type Robot_Position. If the subroutine fails to meet this requirement during runtime, an exception is thrown, and error handling routines can be executed. Based on this information, SPARK can statically verify the requirements before the program is even executed. This allows us to remove the contracts later, making the binary leaner and faster, as we have already verified its correctness.

In the next function, Offline_Correction (Figure 8, see PDF) we modify the value of the variable Error with each iteration.

This causes the robot to move in increasingly larger circles. We know that we change the global variable Offline_Offset in the function and change the Error variable depending on the Offline_Offset variable. From this, we can see the behavior shown in Figure 8 (see...). PDFWe create the SPARK specification shown. Here, we define a global "in/out" dependency on the variable Offline_Offset. This tells SPARK that we will read and write this variable. We also use the DEPENDS aspect to represent the data flow dependency. In other words, we specify that the output variable Error depends on the input values of the variables Error and Offline_Offset. This tells SPARK that we modify Error by the value of the variable Offline_Offset and then increment Offline_Offset itself by a constant. We have also defined a POST condition that states that the Offline_Offset variable must be incremented by a constant. With this data flow dependency and the agreement between the programs, we have almost completely specified the behavior.

The next function, Error_Correct, is just as interesting as the previous one, since we take the error value as input and return the speeds for both motors. This function contains our correction algorithm. The specification for this function is shown in Figure 9 (see PDF) shown.

In this function, we define two data dependencies. We have Default_Speed as the input value and LastError as the "in/out" value, meaning we read, modify, and write the value. SPARK will ensure that we do not modify the variable Default_Speed in this function because we have defined it only as an input value. We have also defined data flow dependencies stating that RightSpeed and LeftSpeed depend on the values Error, LastError, and DefaultSpeed, respectively. We also define post-conditions that the variable LastError is modified by the value Error after the subprogram is executed. The SPARK tools can now verify that our state is modified with each execution of the loop and that our data and data flow dependencies are mathematically correct. The execution of the SPARK tools on our functions shows no errors or warnings in the static analysis report. In Figure 10 (see PDFThe possible results of a GNATprove run are shown, marked by the dashed line. Various analyses of data dependencies for LSP (Liskov substitution principle) verification are possible. If checks run successfully, no results are displayed for those runs. The results are shown in Figure 10 (see PDFAs you can see, the results are circled with a solid line, indicating that no data or data flow dependencies are violated in our software. However, there are still initialization errors, runtime problems, and an error in a contract definition between the caller and the called party. But these errors do not relate to the function we are currently examining. Therefore, we can assume that our code is formally correct.

Now we can pass the SPARK code to the CCG, compile it using the Arduino environment, and then upload it to the robot. Since we've already verified the correct functionality with SPARK, we can skip testing our functions.

Summary

The workflow shown leveraged the advantages of SPARK to examine our functions for data and data flow errors, ensuring error-free code even during the writing process. We were then able to transfer the SPARK application to hardware, even though no Ada 2012 compiler was available for it, by using C code as an intermediate step. Thus, we obtained formally correct C code through the use of SPARK and transferred it to the hardware.

outlook

In the next steps, we will try to eliminate the remaining errors from the SPARK tools and improve the control algorithm. The goal is two robots fighting each other, pushing each other out of a limited playing field. May the better algorithm win.

References

[1] Gerard J. Holzmann: The Power of Ten -- https://spinroot.com/gerard/pdf/P10.pdf (2006)

[2] Christophe Garion and Jérôme Hugues: From learning examples to High-Integrity Middleware https://frama-c.com/download/framaCDay/FCSD17/talk/02_Garion_Hugues.pdf (2017)

[3] John W. McComick, Peter C. Chapin: Building High Integrity Applications with SPARK
ISBN: 978-1-107-65684-0

[4] Eduardo Brito: A very short introduction to SPARK: Language, Toolset, Projects, Formal Methods & Certification (2010)

[5] Barry Fagin and Martin Carlisle: Provable Secure DNS: A case study in formal methods
https://ironsides.martincarlisle.com/ICRST2013.pdf (2011)

[6] Barry S. Fargin, Bradley Klanderman, Martin C. Carlisle: Making DNS Server Resistant to Cyber Attacks: An Empirical Study on Formal Methods and Performance
https://ieeexplore.ieee.org/document/8029991/?reload=true (2017)

[7] www.misra.org.ukMISRA Compliance: 2016 , https://www.misra.org.uk/LinkClick.aspx?fileticket=w_Syhpkf7xA%3D&tabid=57 (2016)

[8] Dr.Paul Anderson: The Use and Limitations of Static-Analysis Tools to Improve Software Quality
https://pdfs.semanticscholar.org/3b8e/a53e62bfc753dabf1deb83be7572d069f6b1.pdf [2008]

[9] Patrick Baudin, Pascal Cuoq, Jean-Christophe Filliâtre, Claude Marché, Benjamin Monat, Yannick Moy, Virgile Prevosto: https://frama-c.com/download/acsl_1.4.pdf (2010)

[10] ISO/IEC JTC 1/SC 22/WG 9 Ada Rapporteur Group (ARG): Ada 2012 Language Reference Manual
https://www.ada-auth.org/standards/ada12.html (2012)

[11] AdaCore and Altran UK Ltd: SPARK 2014 Reference Manual
https://docs.adacore.com/spark2014-docs/html/lrm/ (2013-2017)

[12] AdaCore and Thales: Implementation Guidance for the Adoption of SPARK
https://www.adacore.com/uploads/technical-papers/ePDF-ImplementationGuidanceSPARK.pdf (2017)

Download the article as a PDF

Our training courses & coaching sessions

Do you want to bring yourself up to date with the latest technology?

Then find out more here Regarding training courses/seminars/workshops and individual coaching sessions offered by MircoConsult on the topic Quality, Safety & Security.

Training & coaching on the other topics in our portfolio can be found here.

Quality, Safety & Security – Expertise

Valuable expertise on the topics of quality, safety & security is available. here Available for you to download free of charge.

To the specialist information

You can find expertise on other topics in our portfolio here. here.

MicroConsult Newsletter

With the MicroConsult newsletter, you'll stay on the pulse of the embedded world. Look forward to proven practical knowledge, real professional tips, and current events – directly from our experts for your project success.

Subscribe now!

Published by

weissblau media

← Fuzzing of embedded software Traceability – often demanded, rarely efficient →