## Specification with Ghost Functions

In my previous post I showed that the C expression sublanguage extended with quantified
expressions (**forall **and **exists**) is insufficient to allow some
specifications to be expressed. I presented this function (annotated with an incomplete specification)
to average an array of data:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)post(result== ?/* sum of elements of readings *//numReadings) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == ?/* sum of first i elements of readings */)decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

The two question marks ? need to be replaced by expressions for sums over the appropriate
elements. Last time I showed how we can replace them by **over**-expressions. In this post, I’ll
describe an alternative solution that has more general applications.

The problem with which we are faced is that there is no C expression (even if we include **forall **and
**exists **expressions) that can express the sum of some of array elements, where the number of elements
is not statically known. We can calculate the sum using a loop; but a loop is not an expression and so cannot be used
in specifications. However, specification expressions can contain function calls, provided that the functions have no
side-effects. We can resolve our problem by defining a *ghost function* that calculates the sum. We’ll
specify the function recursively so that we can handle the variable number of elements.

In the postcondition of *average*, we need to calculate the sum of all the elements of the *readings *array.
In the loop variant, we need the sum of just the first *i* elements. So let’s declare a function that returns
the sum of the first *n *elements of an array *a*, where *a* and *n *are parameters. We define
the sum to be zero if *n *is zero, otherwise it’s the sum of the first *n-1* elements plus
the *n*th element. Here’s our ghost function specification:

#ifdef__ECV__ghost integersumOf(constint16_t *arraya, size_t n)pre(a.lwb == 0; n <= a.lim)decrease(n)returns(n == 0 ? 0 : a[n - 1] + sumOf(a, n - 1)); #endif

Declaring the function **ghost **tells eCv that it is for use in specifications only. A ghost
function must have a specification but no body. It can also do a few things that wouldn’t be allowed for
a normal function. In particular, it can use types not normally allowed in C, such as **integer**,
which is eCv’s name for the type of unbounded integers. I’ve declared the function as returning
**integer **so that neither we nor eCv need worry about arithmetic overflow in the specification.

The precondition defines the conditions for the function to be well-behaved, as usual. I’ve specified
the value that the function returns using a postcondition of the form **returns**(*expression*).
This has the same meaning as **post**(**result **== *expression*), but it’s shorter.
Also, we want to define the function recursively, and eCv doesn’t currently allow a recursive call to the
function being specified inside **post**(…) but it does inside **returns**(…).
I’ve enclosed the whole function declaration in **#ifdef __ECV__** … **#endif**
so that the C compiler doesn’t see it.

Recursion would normally be forbidden in critical software (MISRA rule 16.2), however recursive calls in
specifications are safe because they don’t carry any risk of run-time stack overflow. To ensure that the
specification makes sense, eCv will need to prove that the recursion is bounded. That’s what the
**decrease **clause is for, and it works exactly as if it were a loop variant – that is,
it must decrease on each recursive call and it must not go negative.

Having defined *sumOf*, we can complete the postcondition and loop invariant:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)returns(sumOf(readings, numReadings)/numReadings) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == sumOf(readings, i))decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

I’ve taken the liberty of replacing the original **post**(…) postcondition of average with
the **returns**(…) form.

Just like the version of *average * that I gave in my previous post, this one verifies completely except for possible
overflow during integer arithmetic. I’ll show how we can deal with that next time.