42 years of complexity metrics – what's stopping us?

Effectively using software complexity metrics

Authors: Thomas Grundler, Hendrik Post, Jochen Quante, Sadi Yigit, Robert Bosch GmbH

Contribution – Embedded Software Engineering Congress 2018

The most well-known software complexity metric was introduced by Thomas J. McCabe in 1976 and has sparked discussions among generations of software developers about the significance of such metrics. The following article describes how software complexity metrics are used in the automotive division of Robert Bosch GmbH.

Initial situation

Our goal is to identify software product metrics that reveal which software components developers struggle to understand and further develop. The overarching question, therefore, is: How can software maintainability be measured, and how can this measurement be used to guide the software development process?

There are many metrics: "Lines of Code" emerged as early as the 1960s, McCabe introduced cyclomatic complexity in 1976, and many other metrics followed in the 1980s and 1990s, each considering one aspect or another of software.

In the automotive sector, the HIS metrics were introduced in 2005. HIS stands for "Herstellerinitiative Software" (manufacturer software initiative), and the HIS comprised the automotive manufacturers Audi, BMW Group, DaimlerChrysler, Porsche, and Volkswagen. The goal of the HIS was to establish a basic set of metrics for evaluating testability and software quality in general. The HIS metrics consist of 15 software metrics with corresponding threshold values, which are to be collected for each "compilable unit."

Our experience today is that a great deal of measurement is being done. There is an overwhelming number of metrics and tools on the market, all claiming to be able to measure and assess the quality of a software product. However, to our knowledge, few consequences are currently drawn from these measurements, and it is often unclear what the purpose of measuring certain metrics is even supposed to be. Furthermore, measuring a large number of metrics (possibly with their individual violations of a predefined threshold) is often not helpful in drawing the necessary conclusions.

Preliminary work

To get closer to the goal of being able to measure the maintainability of software using product metrics, the "MI method" (Maintainability Index) according to Oman and Hagemeister was applied (P. Oman, J. Hagemeister: "Construction and Testing of Polynomials Predicting Software Maintainability", in Journal of Systems and Software, Vol. 24, No. 3, 1994). First, a representative selection of so-called program versions of the Bosch engine control unit was measured. Measurements were taken on the C code, regardless of whether it was handwritten or generated. The measured metrics consisted of the HIS metrics and many other complexity metrics known from the literature, such as Halstead, cognitive complexity, outliers, and variable lifespan.

The next step involved calculating correlations between the metrics to obtain metrics with the most orthogonal information possible. For this, a principal component analysis was used. Principal component analysis provides two pieces of information: firstly, which metrics form a so-called cluster (i.e., are highly correlated), and secondly, what proportion of the total information this cluster contributes. Subsequently, a representative of each cluster was selected, and using the information contribution of the cluster, it was determined how many clusters could be included in the subsequent steps without losing too much information. Table 1 shows a typical result of the principal component analysis.

Cluster	Information content	Metrics (excerpt)
1	70%	Lines Of Code, Number of Statements in Function, Cyclomatic Complexity, Halstead Difficulty, …
2	9%	Estimated Static Program Paths, Number of Local Variables Declared, …
3	7%	Deepest Level of Control Flow Nesting, …
4	6%	Number of function parameters, …
5 .. 12	< 2%

Table 1: Results of the principal component analysis

Especially in larger software functions, many metrics correlate with "Lines Of Code", and this cluster also contains the largest share of the total information.
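
The idea behind this analysis can be illustrated with a minimal sketch. It is not the tooling used in the study: the measurement values are invented, only three metrics are considered, and only the first principal component is extracted (by power iteration on the correlation matrix). Metrics with large loadings of the same sign on that component are read here as belonging to one cluster, which is a simplification of the procedure described above. All names (MEASUREMENTS, analyse, and so on) are hypothetical.

```python
import math

# Hypothetical per-function measurements (illustrative values only).
MEASUREMENTS = {
    "lines_of_code":         [10, 25, 40, 80, 120, 200],
    "cyclomatic_complexity": [2, 5, 7, 15, 22, 41],
    "function_parameters":   [3, 1, 4, 2, 5, 1],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def correlation_matrix(columns):
    """Pairwise correlations between all metric columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def first_principal_component(matrix, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        vector[i] * sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, vector


def analyse(measurements):
    """Return the variance share of the first component and the metric loadings."""
    names = list(measurements)
    corr = correlation_matrix([measurements[name] for name in names])
    eigenvalue, vector = first_principal_component(corr)
    # The trace of a correlation matrix equals the number of metrics,
    # so eigenvalue / count is the share of the total variance.
    share = eigenvalue / len(names)
    return share, dict(zip(names, vector))


share, loadings = analyse(MEASUREMENTS)
print(f"first component explains {share:.0%} of the variance")
for name, loading in loadings.items():
    print(f"  {name}: {loading:+.2f}")
```

With data of this shape, the size-related metrics load heavily on the first component while the parameter count does not, which mirrors the observation that many metrics carry largely the same information as "Lines Of Code".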

Design Guidelines (HIS metrics re-evaluated)

Based on knowledge of the correlations among the metrics, various metrics were selected that both cover all clusters and have a certain steering effect, meaning that an improvement in the metric also leads to an improvement in software quality and is consistent with modern development paradigms. Furthermore, emphasis was placed on ensuring that the metrics are relatively easy to determine and understand. This allows developers to estimate in advance the impact of a specific code change on the new value of that metric. To be used as a design guideline, metrics must also exhibit a certain degree of selectivity; that is, the chosen threshold for the metric must not flag virtually every function in the software.

With this background and experience gained from numerous developer reviews and interviews, six HIS metrics, including their thresholds, were selected to serve as design guidelines for new software modules programmed in C. Table 2 shows these metrics.

Metric	QA-C® designation	Recommended range
Cyclomatic complexity	STCYC	1 .. 20
Number of statements per function	STST3	1 .. 100
Static estimation of possible paths	STPTH	1 .. 1000
Maximum nesting depth of control structures	STMIF	0 .. 6
Number of jump instructions	STGTO	0
Number of function parameters	STPAR	0 .. 12

Table 2: Design specifications for software modules written in C
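
As a minimal sketch of how such guidelines might be checked automatically, the following compares the measured values of one function against the recommended ranges from Table 2. The dictionary layout and the function name find_violations are assumptions made for illustration; they are not part of QA-C® or of the process described here.

```python
# Recommended ranges from Table 2, keyed by QA-C metric designation.
RECOMMENDED_RANGES = {
    "STCYC": (1, 20),    # cyclomatic complexity
    "STST3": (1, 100),   # statements per function
    "STPTH": (1, 1000),  # static estimation of possible paths
    "STMIF": (0, 6),     # maximum nesting depth of control structures
    "STGTO": (0, 0),     # jump instructions
    "STPAR": (0, 12),    # function parameters
}


def find_violations(measured):
    """Return the metrics of one function that lie outside the recommended range.

    `measured` maps a metric designation to its measured value
    (hypothetical input format).
    """
    violations = {}
    for metric, value in measured.items():
        low, high = RECOMMENDED_RANGES[metric]
        if not low <= value <= high:
            violations[metric] = value
    return violations


# Hypothetical measurement of a single function.
example = {"STCYC": 25, "STST3": 80, "STPTH": 400, "STMIF": 3, "STGTO": 1, "STPAR": 4}
print(find_violations(example))
```

A result of this kind is meant as a starting point for discussion, not as an automatic rejection.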

BMI (Bosch Maintainability Index)

To control and prioritize the software development process, a further simplification and reduction of key performance indicators was both sensible and necessary. Therefore, the second part of the "MI" method according to Oman and Hagemeister was applied.

The opinions of functional experts regarding the maintainability of the respective function were solicited. Care was taken to relate the complexity/maintainability to the problem to be solved in order to ensure comparability. A formula was developed using linear regression and metrics/clusters significant for maintainability, providing a good indicator of the maintainability of the software module.

This so-called BMI (Bosch Maintainability Index) is composed of several metrics and is normalized from +100 (very good maintainability) to -100 (very critical maintainability). All metrics used work in the same direction: the higher the metric values, the more difficult the functions are to maintain. The BMI is applied to hand-written C code and C code generated with the "ETAS ASCET" tool. For C code generated from Simulink® models, these code metrics provide little insight into the model itself. Therefore, a "ModelBMI" was developed that measures metrics directly on the Simulink® model and is currently in the pilot phase.
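
The actual BMI formula is not given in this article. The following is therefore only a sketch of the general approach: expert ratings are regressed on metric values by ordinary least squares, and the result is limited to the range from -100 to +100. The choice of two metrics, the training data, the clamping step, and all function names are assumptions made for illustration.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_index(metric_rows, expert_scores):
    """Least-squares fit: returns [intercept, weight_1, weight_2, ...]."""
    rows = [[1.0] + list(row) for row in metric_rows]
    k = len(rows[0])
    # Normal equations (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expert_scores)) for i in range(k)]
    return solve(xtx, xty)


def maintainability_index(weights, metrics):
    """Evaluate the fitted formula and clamp it to the range -100 .. +100."""
    raw = weights[0] + sum(w * m for w, m in zip(weights[1:], metrics))
    return max(-100.0, min(100.0, raw))


# Hypothetical training data: (cyclomatic complexity, statements) per function
# and an expert rating on the same -100 .. +100 scale.
METRIC_ROWS = [(5, 20), (10, 50), (20, 100), (40, 150), (15, 200), (30, 60)]
EXPERT_SCORES = [80, 55, 10, -55, -30, 10]

weights = fit_index(METRIC_ROWS, EXPERT_SCORES)
print(maintainability_index(weights, (8, 30)))
```

In this sketch the fitted weights come out negative, which matches the statement that higher metric values indicate poorer maintainability.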

The BMI's hit rate is approximately 80%, meaning that 80% of the software modules classified as complex/difficult to maintain by the BMI procedure are also classified as such by the experts.
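
Read this way, the hit rate is the share of BMI-flagged modules that the experts also flag. A minimal sketch of that calculation follows; the module names and the function name hit_rate are hypothetical.

```python
def hit_rate(flagged_by_index, flagged_by_experts):
    """Share of index-flagged modules that the experts flag as well."""
    flagged_by_index = set(flagged_by_index)
    if not flagged_by_index:
        raise ValueError("no modules were flagged by the index")
    confirmed = flagged_by_index & set(flagged_by_experts)
    return len(confirmed) / len(flagged_by_index)


# Hypothetical module names.
index_flags = {"mod_a", "mod_b", "mod_c", "mod_d", "mod_e"}
expert_flags = {"mod_a", "mod_b", "mod_c", "mod_d", "mod_x"}
print(hit_rate(index_flags, expert_flags))
```

Note that this figure says nothing about modules the experts consider critical but the index does not flag (mod_x in the example).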

Not all aspects of "maintainability" can be packed into a single index, even by using significantly more metrics. We address this by embedding the index appropriately in the software development process.

Process

At the start of a project, customer projects assess their software and initiate a discussion with experts about the findings. Only those modules that experts deem difficult to maintain and where frequent changes are expected in the future are restructured. This is achieved through assessments, in which a panel of experts makes a well-founded and documented decision. Software that has been running smoothly in the field for many years and is no longer being modified is left untouched.

Mindset

Our goal is to identify software product metrics that reveal which software components developers struggle to understand and then further develop. The subsequent restructuring step should then focus on improving the software itself, not solely on the metrics. We don't view metrics as absolute constraints that must be followed by action. We trust our developers more than the metrics and allow for justified exceptions.

However, if a certain BMI threshold is not met, a discussion with documentation of the results in the SCM system is mandatory. This ensures that the metrics are perceived more as support than as a constraint. The ease of use (metrics including BMI are provided to developers by default during static code checks) and the focus on a limited number of parameters also contribute to the acceptance of the measurement system. The establishment of a support team for software developers has also been very well received: This team provides assistance both in evaluating the findings and in the concrete restructuring of the software.

Conclusion

With the right mindset (trusting developers more than metrics) and a focus on a few key performance indicators (KPIs), metrics are a meaningful and important addition to daily software development. Developers need support from metrics that enable open and objective discussions about the maintainability of "their" modules. This openness has resulted in, and continues to result in, the desired steering effect of complexity metrics, leading to more maintainable and less complex modules.

Metrics can also make the internal state of the software significantly more transparent to management. For example, they have demonstrated that a certain degree of aging is normal for long-lived software, meaning that software must be adapted to new conditions from time to time, and that the restructured version then provides a good basis for further development. In our view, complexity metrics, even at the advanced age of 42, have lost none of their usefulness, nor any of their potential for discussion.

Author

Thomas Grundler has 20 years of experience in software development for embedded systems. His focus areas are software analysis and software maintenance. Since 2004, Thomas Grundler has worked at Robert Bosch GmbH and is responsible for the topic of "complexity measurement and complexity reduction" of the classic engine control software in the Powertrain Solutions business unit.
