Techniques for measuring software activity in real time
Author: Ulrich Dreher, iss innovative software services GmbH
Contribution – Embedded Software Engineering Congress 2016
Although debuggers, emulators and other development tools have made tremendous progress in recent decades, they lack one feature that would greatly facilitate the development, debugging and validation of real-time systems: the ability to link software events and "real world" events in "hard real time".
This paper addresses that gap. It presents techniques for measuring selected software properties in real time. The focus is on the speed and scope of the measurement data that can be provided, which give insight into software processes. Using this data, software processes can then be correlated with real-world events.
A small glossary
The following terms may not be in common usage. Therefore, a small glossary is provided below:
| Term(s) | Meaning |
| PCB | Printed Circuit Board – circuit board |
| Target, control unit | This refers to a unit whose function is dominated by a processor or controller and its software. However, a control system may also be implemented on a "control" device, so the term may equally refer to such a device. |
| physical event, physical quantity, real world | In different contexts, these terms refer to "things" that can be "touched" or measured. Software, by contrast, is not unreal, but it is generally difficult to measure. |
| (finite) state machine | also known by the abbreviation FSM |
| Real-time | will be examined in more detail in the following chapter |
What is real time?
Real-time requirements are ubiquitous in control systems. However, the definition of "real-time" depends significantly on the specific application: "real-time" can refer to response times (cycle times) that differ by orders of magnitude (see Table 1).
| cycle time | Examples of applications |
| 100 ms | User interfaces, PLC (programmable logic controller) |
| 1 – 10 ms | „classic“ control systems |
| 10 µs – 200 µs | Frequency converters („inverters“) for synchronous machines |
| < 10 µs | „Digital Power Supply“ (digitally controlled power supplies), for example a 1 kW AC/DC power supply with a 200 kHz control loop |
Table 1 – Examples of different interpretations of „real-time“
Table 1 lists only a few examples, and there are gaps between the specified time ranges. In reality, each application tends to redefine the attribute "real" in terms of timing requirements, often distinguishing between "hard" real-time (response times up to a maximum of 5 ms) and "soft" real-time (response times > 5 ms). Even the 5 ms threshold is arbitrary – depending on the organization and the task at hand, a different value may be used for this purpose.
As mentioned earlier, emulators and debuggers have made tremendous progress in recent decades in terms of features and overall processing power. However, a function for linking "real-world" events with "software events" is still lacking. While some debuggers today may offer a (small) number of digital inputs, this is rarely sufficient when it comes to correlating real-world events with software processes.
And software tools that can record measurements at high speed are not particularly helpful if the recorded data is only visualized after the triggering event is long gone.
However, there are some techniques – some very simple and yet largely unknown – that are suitable for filling the gaps described above.
The following describes some of these methods, which allow relationships between physical events and software processes to be identified. They make it possible to measure, and thus evaluate, software properties (such as interrupt latencies, task order, or control software variables) in real time. This is a capability that existing tools lack.
Let us therefore consider some regularly recurring questions and possible methods for making the "working" of software measurable, and assess how well they are suited to answering the chosen questions.
The questions
The following are abstractions of some questions that have repeatedly arisen in my work. Examples from past projects illustrate each one.
| "How much computing power is actually still available?"
A question that comes up with remarkable regularity. An episode from the development of the inverter of a PMSM with 20 kHz current control: The initial answer from the responsible developer was "still quite a lot". Applying one of the methods described below revealed that the current CPU utilization was 95 %. From that point on, the processor load was under constant control. |
| "Does my control system behave as required and expected?"
An example from the further development of an injection control unit: a state machine was implemented to control the processes. State machines are typically "located" within a single function that is called cyclically or event-driven. They are extremely popular because they are easy to implement and very robust. In this case, the standard solution was not an option: the states were partly time-controlled and partly event-driven, and due to technical constraints the individual parts of the state machine had to be implemented "scattered throughout the software". The development was carried out using a tool that allowed the state variable to be measured and visualized. The expected sequence was very clear: 0 → 1 → 2 → 3 → … → 7 → 0. Apart from a premature termination (… → 0), the sequence had to be executed in the same order every time. The timing, however, depended on the occurrence of physical events. Both the correct sequence and its correlation with the physical events were very easy to verify visually on the oscilloscope. |
| "Why isn't the software working as it should?"
A question that comes up again and again. When developing a fast measurement system that monitored the individual cells of a fuel cell system, the customer wanted a few special "features" (subsampling, different sequencing of the cells). Implementing the standard measurement procedure was straightforward. However, implementing the additional features was very complex due to more intricate sequences and algorithms. Debugging was also difficult, as the raw numerical values were not particularly meaningful. The availability of a special development tool (the "historical" OLDA) was enormously helpful: the crucial variables (two values) could be visualized in real time. It was clear what the "image" of these two variables should look like on the oscilloscope. A single run of the software was sufficient to determine whether the result was good or bad. |
| "Is my system behaving as planned?"
This question is also asked with remarkable regularity (see first point). System design typically involves planning which subtasks are to be implemented in which tasks and processes. Since not everything goes according to plan, regular monitoring of the status quo is essential. Numerous software solutions are now available for this purpose. However, as soon as "events" come into play, these software solutions quickly reach their limits. An independent tool that requires only a minimal amount of instrumentation is advantageous in this situation. |
| The "supreme discipline": evaluating sin/cos angle sensors.
Currently, sine/cosine angle sensors are preferred for measuring the angle of synchronous machines because they offer high resolution with comparatively little effort. To derive the angle from these values, the arctangent function is needed. As is often the case, the arctangent function has a critical phase every 90° where small errors in the sine and cosine measurements can lead to significant angular errors. Wouldn't it be ideal to be able to continuously monitor the measured values and perhaps compare them with the physical measurements – or even with the calculated angle value? |
| Finally, an increasingly important question: „What effect does a change in hardware configuration have?“ Factors to be assessed include, for example, the influence of cache size, cache configuration, the use of inline code, etc. Essentially, this question can be answered by using the aforementioned software solutions for runtime measurement. However, these require extensive code instrumentation, which is likely to distort the measurements. A less invasive measurement method would be desirable. |
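As an illustration of the first question, a utilization figure such as the 95 % mentioned above can be derived from two oscilloscope readings. The sketch below assumes that a probe pin is high while the control code runs, so that the pulse width is the busy time and the pulse spacing is the cycle period. The helper name and the per-mille scaling are hypothetical and not taken from the original project.

```c
#include <stdint.h>

/* Hypothetical helper: CPU load in per mille (tenths of a percent), computed
 * from the high time of a probe pin (busy time) and the cycle period, both
 * in nanoseconds, as read off an oscilloscope. */
static uint32_t cpu_load_permille(uint32_t busy_ns, uint32_t period_ns)
{
    if (period_ns == 0u) {
        return 0u;  /* no valid measurement */
    }
    /* 64-bit intermediate avoids overflow; adding period/2 rounds to nearest. */
    return (uint32_t)(((uint64_t)busy_ns * 1000u + period_ns / 2u) / period_ns);
}
```

For a 20 kHz current controller the period is 50 µs, so a measured busy time of 47.5 µs gives `cpu_load_permille(47500, 50000)`, i.e. 950 per mille or 95 %.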
Some preliminary remarks
When it comes to real-time measurements on "real" objects, the question immediately arises as to which resources are actually available to perform these measurements. To support most (though perhaps not all) targets with a single tool, expectations regarding resource availability must be minimized. At the same time, these resources should offer maximum speed so as not to limit the measurement from this perspective.
Taking these limitations into account, two alternatives present themselves:
- The use of one (or more) unused pins
- The use of an interface that is not otherwise occupied (which of course also requires at least one available pin)
The measurement methods listed below consider examples of both cases. Furthermore, all of the measurement methods considered require additional equipment for carrying out the measurements: an oscilloscope, a logic analyzer, or a data logger is needed as a measuring instrument, for data recording, visualization, and, if necessary, storage.
The measurement methods
The aforementioned limitations regarding the availability of resources result in a number of possible variations for making measurement data available:
| A: „Bit Banging“ (abbreviated: „BB“)
If an unused pin is available that can be contacted, the simplest of all interfaces is available. It can provide 1 bit of information. (Even slightly more than 1 bit if we consider outputting pulse sequences to distinguish between different events.) Anyone who remembers the very first PIC microcontrollers will know that they did not have a hardware UART; instead, serial interfaces with impressive baud rates were implemented using bit-banging. |
| B: „Multi-Bit Banging“ (abbreviated: „MB“)
If you have more than one unused pin available, more complex outputs are possible: you can output more than one bit at a time. Or you can make measurable something that contains more than one bit of information: for example, the state of a state machine. Since pins in hardware are usually organized as byte-wide (or even wider) "ports", MB can become quite complicated if the unused bits belong to different ports. |
| C: Use of a digital-to-analog converter[1] (short: „DAC“)
In rare cases, your processor has an unused DAC. The difficulty in using it lies partly in its availability and partly in connecting it to the measuring instrument: analog signals may require an amplifier on your board, which limits its use to development control units. |
| D: Use of an unused UART[2]
Modern processors are equipped with plenty of digital interfaces. The likelihood that you have an unused UART is not insignificant. In this form, however, establishing a real-time connection to physical events is difficult, as the serial bitstream is difficult to evaluate. |
| E: The OLDA[3] („OnLine Data Analyzer“)
The historical OLDA is a tool that was only available to a small group of users and therefore remained largely unknown. |
In the following, we will examine cases A/B (an investigation is only meaningful here with regard to the "breadth" of information) and C/E in more detail. (Insofar as the necessary hardware is available, C is a (faster) special case of E.)
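Methods A and B might be instrumented roughly as follows. This is a minimal sketch under stated assumptions: `probe_port` is a plain variable standing in for a memory-mapped port register, and the pin assignment (bit 0 for BB, bits 4 to 6 for MB) is hypothetical. On a real target both would come from the vendor's device header and the board layout.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped output port register (hypothetical; on a real
 * target this would be the vendor-specific port data register). */
static volatile uint8_t probe_port;

#define PROBE_PIN_MASK 0x01u                   /* BB: one spare pin, bit 0 (assumed)        */
#define STATE_SHIFT    4u                      /* MB: three spare pins, bits 4..6 (assumed) */
#define STATE_MASK     (0x07u << STATE_SHIFT)

/* BB: raise the pin on entry to the code under test ... */
static inline void probe_high(void) { probe_port |= PROBE_PIN_MASK; }

/* ... and lower it on exit; the pulse width on the scope is the runtime. */
static inline void probe_low(void) { probe_port &= (uint8_t)~PROBE_PIN_MASK; }

/* MB: publish a state number 0..7 on three pins without disturbing the
 * other bits of the same port. */
static inline void probe_state(uint8_t state)
{
    uint8_t other = probe_port & (uint8_t)~STATE_MASK;
    probe_port = other | (uint8_t)((state << STATE_SHIFT) & STATE_MASK);
}
```

Note that these are read-modify-write accesses. If an interrupt can modify the same port, the update would need to be protected, or the port's atomic set/clear registers used where the hardware offers them.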
Evaluation of the performance of the individual procedures
At this point, we will limit ourselves to summarizing the results of the evaluation of the procedures – a more detailed examination and some sample recordings can be found in the presentation slides.
Considering the questions and the possible measurement methods, the following Table 2 results with regard to suitability and performance:
| Task | BB | MB | DAC | OLDA |
| Runtime measurement of a single task | +++ | +++ | +++ | ++ |
| Development and validation of a state machine | o | ++ | +++ | ++ |
| General algorithm development | – | + | +++ | ++ |
| Monitoring the processing of tasks and processes | – | ++ | +++ | ++ |
| sin/cos analysis | – | o | +++ | +++ |
| Changing the hardware configuration | o | + | +++ | ++ |
Table 2 – Suitability of the measurement methods for the different tasks
Without anticipating the content of the presentation too much: a locally available DAC is naturally always superior to a serially connected DAC in terms of speed – assuming identical quality of execution. And BB/MB are unsurpassed in terms of speed, unless the signal output speed is artificially reduced to minimize emissions.
What exactly is an OLDA?
As mentioned above, OLDA stands for "OnLine Data Analyzer". The term "analyzer" is somewhat misleading, as the OLDA doesn't actually incorporate any analysis functions. Its function is that of a DAC – that is, the conversion of digital values into an analog voltage. The actual analysis is then performed using the connected measuring instrument (oscilloscope, data logger).
The target connection of the "historical" OLDA was established via the address and data bus – an interface that proved increasingly unreliable due to the increased clock frequencies. Furthermore, the wide connection cable repeatedly proved to be a hindrance. Ultimately, despite its usefulness, the historical OLDA was no longer usable.
The successor: the µOLDA
For several years we had to do without an OLDA and make do with BB/MB and similarly limited methods: no adequate tool existed, and there was no practical way to connect one to the target.
With increasing integration density and performance of microcontrollers, a new way has now opened up to provide such a function: the serial connection of the DAC – the µOLDA.
Serial interfaces are now fast enough to achieve bit rates greater than 1 Mbit/s, and in many cases even greater than 10 Mbit/s. This makes data rates from 100 kS/s up to the MS/s range achievable. Furthermore, there are usually more serial interfaces (UART or SPI) available on a target device than are actually used in the application.
This realization led to the development of the µOLDA, whose components and „ecosystem“ are shown in Figure 1 (PDF).
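The relationship between bit rate and sample rate quoted above follows from the UART frame length. The sketch below assumes one sample per frame, one start bit, one stop bit, no parity, and frames sent back to back. The function names are illustrative only.

```c
#include <stdint.h>

/* Bits on the wire per UART frame: 1 start + data bits + 1 stop
 * (no parity assumed). */
static uint32_t frame_bits(uint32_t data_bits)
{
    return 1u + data_bits + 1u;
}

/* Frames (and thus samples) per second if every frame carries one sample
 * and the transmitter is kept busy without gaps. */
static uint32_t samples_per_second(uint32_t baud, uint32_t data_bits)
{
    return baud / frame_bits(data_bits);
}
```

With 8 data bits a frame occupies 10 bit times, so 1 MBd corresponds to 100 kS/s and 16 MBd to 1.6 MS/s.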
The µOLDA system consists of 3 components:
- The so-called "target adapter" – basically a simple electronic converter.
- A fiber optic connection. For reasons of simplicity and robustness, the Versatile Link system was chosen, which uses polymer optical fiber (POF) to allow cable lengths greater than 30 m and also offers advantages over copper cables in terms of flexibility. Furthermore, the interference immunity of optical connections is unsurpassed.
- The actual µOLDA (in which the optoelectric converter, the DACs and the necessary logic are installed).
To complete the system, you will also need power supplies, the aforementioned oscilloscope, and a cable to connect the target adapter to your target.
Hardware and software requirements
The µOLDA is therefore a (dual-channel) DAC. And since it operates via a serial connection, you need a serial interface on the target device. The list of necessary hardware requirements for the target device is pleasingly short. You will need:
- an unused UART. UARTs with a FIFO (First-In, First-Out) configuration are helpful, but not required. These allow for somewhat simpler code instrumentation, although this can come at the expense of data latency.
- a free pin on the target that you can use as the TX pin of the UART.
- a little current (<5 mA, 3.3 – 5 V) and
- a short 3-wire connection cable (GND, VCC, signal) for connecting the target and target adapter.
That's it.
What? You don't have an unused UART? In that case, you can also use an unused SPI interface. It just needs to support data formats < 10 bits, "LSBit first," so you can emulate the data stream of a UART.
You don't have an unused SPI interface either? We need to discuss this calmly sometime…
And what are the requirements on the software side?
- A little code to initialize the UART
- The code instrumentation to write values to the transmit register
- Only in special cases: a little more code to check whether the transmit register can accept the next value.
- For some specific applications (e.g., measuring task sequencing), you will also need a few bytes of RAM. Bytes – not kilobytes!
What you don't need: Interrupts, ISRs, and the like.
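The instrumentation described in the list above might look like the following sketch. The register variables `uart_tx_data` and `uart_status` and the flag `UART_TX_READY` are hypothetical stand-ins for a vendor-specific UART. Dropping a sample when the transmitter is busy, rather than waiting, is a design choice made here to keep the probe non-blocking, not something the source prescribes.

```c
#include <stdint.h>
#include <stdbool.h>

/* Stand-ins for the UART registers; names and bit layout are hypothetical. */
static volatile uint8_t uart_tx_data;
static volatile uint8_t uart_status = 0x01u;
#define UART_TX_READY 0x01u

/* Normal case: write the value and move on. Suitable when the call rate is
 * known to be lower than the frame rate of the UART. */
static inline void trace_put(uint8_t value)
{
    uart_tx_data = value;
}

/* Special case: check whether the transmit register can accept a value.
 * Returns false (sample dropped) if it cannot, so the caller never blocks. */
static inline bool trace_try_put(uint8_t value)
{
    if ((uart_status & UART_TX_READY) == 0u) {
        return false;
    }
    uart_tx_data = value;
    return true;
}
```

A call such as `trace_put(state)` placed at each state transition is all the instrumentation a state machine would need in the simple case.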
The target adapter – the interface to your target
Some things are unlikely to ever change:
- The baud rates of UARTs correspond to a maximum of half the UART's clock frequency. (Dividers of /4, /8, and /16 are more common). For the highest data transfer rates, we don't need to worry about further clock divisions by a baud rate generator.
- The same principle applies to SPI interfaces. However, your processor's SPI port might be faster than its UART. This varies from case to case.
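The first point can be put into numbers with a one-line calculation. This is only a sketch: it assumes the baud rate generator is bypassed, so that the bit rate is the UART input clock divided by the fixed divider mentioned above. The function name is hypothetical.

```c
#include <stdint.h>

/* Highest baud rate for a given UART input clock and fixed divider
 * (2, 4, 8 or 16), with no further division by a baud rate generator. */
static uint32_t max_baud(uint32_t uart_clk_hz, uint32_t divider)
{
    return (divider == 0u) ? 0u : uart_clk_hz / divider;
}
```

Under these assumptions, a UART clocked at 32 MHz would reach 16 MBd with a divider of 2, but only 2 MBd with a divider of 16.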
The target adapter actually requires more than the specified 5 mA to function. Therefore, it incorporates galvanic isolation to reduce the current draw from the target to the aforementioned 5 mA. To avoid placing an unnecessary load on your target, the isolated side of the target adapter is powered by its own separate power supply.
The target adapter is quite small. This means it might still fit inside your housing, which could improve its robustness in some cases. However, should you wish to integrate it into your target, we will gladly provide the circuit diagram and bill of materials.
The µOLDA – some key data
| Baud rate (UART or SPI) | DC – 16 MBd |
| Serial transmission format (payload data) | 8-bit (9-bit) |
| Analog outputs | 2 |
| Resolution | 7-bit, glitch-free |
| Maximum update rate per channel | 1.6 MS/s |
| Addressing the outputs | "hard" (i.e., each channel has a fixed, non-configurable address); 8-bit frames: 1 address bit |
| Supported number formats | unsigned, signed |
| Output voltage | unsigned: 0 – 5 V; signed: -2.5 – 2.5 V |
Seven bits of resolution may seem quite low at first glance. However, if you take a closer look at your oscilloscope's datasheet, you'll likely find an ENOB (Effective Number Of Bits) of 7.5 – hardly better than the 7 bits of the µOLDA.
Given the decision to use a standard serial interface, 7 data bits leave room for 1 or 2 address bits (depending on the frame length), enabling the output of 2 or 4 variables, respectively, while still allowing the highest data rates.
Overall, we consider this design decision a good compromise in terms of speed, ease of use, and resolution.
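To make the framing concrete, the sketch below packs one channel address bit and a 7-bit sample into an 8-bit payload. The bit position of the address (bit 7 here) is an assumption for illustration, since the actual µOLDA frame layout is not specified in this article. The scaling helper is likewise hypothetical.

```c
#include <stdint.h>

#define OLDA_DATA_BITS 7u
#define OLDA_DATA_MASK 0x7Fu

/* Assumed layout (not taken from the µOLDA documentation):
 * bit 7 = channel address, bits 6..0 = sample. */
static uint8_t olda_frame8(uint8_t channel, uint8_t sample7)
{
    return (uint8_t)(((channel & 0x01u) << OLDA_DATA_BITS) |
                     (sample7 & OLDA_DATA_MASK));
}

/* Reduce a 16-bit unsigned variable to its 7 most significant bits so that
 * its full range maps onto the DAC range. */
static uint8_t scale_u16_to_7bit(uint16_t value)
{
    return (uint8_t)(value >> 9);
}
```

A 16-bit variable at mid-scale (`0x8000`) sent to channel 1 would then be transmitted as `olda_frame8(1, scale_u16_to_7bit(0x8000))`, which evaluates to `0xC0` under the assumed layout.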
Summary
As demonstrated, outputting digital values as analog voltages allows real-world events and software responses to be correlated. If a DAC is not available on the target, it is now possible to connect one via a serial interface, thus enabling such measurements even for targets without a (free) DAC.
And how can you benefit from this?
We are not familiar with your current problems, so a specific answer to this question is not possible. However, if it would be helpful to be able to measure some real-time data from the software:
Stay tuned!
References
| [1] | FreeRTOS: Trace Hook Macros [More Advanced] https://www.freertos.org/rtos-trace-macros.html |
| [2] | „Debugging Embedded Systems with Minimal Resources“. Circuit Cellar July 12, 2016 https://circuitcellar.com/cc-blog/debugging-embedded-systems-with-minimal-resources/ |
| [3] | „On Real-Time Measurement Techniques“. Ulrich Dreher, February 24, 2016, embedded world Conference 2016. |
