## Expressing the Inexpressible

When writing preconditions, postconditions and other specifications for C programs, sometimes we need
to write expressions that can’t be expressed in plain C. That’s why formal verification systems
based on annotated programming languages almost always augment the expression sublanguage with
**forall **and **exists **expressions. In previous posts, I’ve introduced
eCv’s implementations of these. For example, the following expression yields **true **if all
elements of the array *arr *are between 0 and 100 inclusive:

forallindinarr.indices :- arr[ind] >= 0 && arr[ind] <= 100

Here, *ind *is declared as a bound variable that ranges over the values in the expression that follows the
keyword **in**, which in this case is all the indices into *arr*.

Similarly, this expression:

existsindin0..9 :- arr[ind] >= 0 && arr[ind] <= 100

yields **true **if at least one of the first ten elements is in the range 0 to 100. I’ve used one of
eCv’s special operators here: the range operator “..”, which yields a sequence of values from its
first
operand up to its second operand. In fact, in the first example,* arr.indices* is defined as
*arr.lwb .. arr.upb*, so I was implicitly using the range operator there too.

Providing **forall **and **exists **to quantify over finite sets and sequences lets
us express many sorts of specification, but isn’t always enough. Very occasionally, we need to use
**forall **or **exists **to quantify over a potentially infinite type, which eCv also allows.
For example, the following expression yields **true **if *p(x)* is true for all values of *x*
belonging to type *T*:

forallT x :- p(x)

However, there are many cases in which a specification still cannot be expressed. For example, consider the following function for averaging a set of readings stored in an array:

int16_t average(constint16_t *arrayreadings, size_t numReadings)pre(readings.lwb == 0; readings.lim == numReadings)pre(numReadings != 0)post(result== ?/* sum of elements of readings *//numReadings ) {intsum = 0; size_t i;for(i = 0; i < numReadings; ++i)keep(i <= numReadings)keep(sum == ?/* sum of first i elements of readings */)decrease(numReadings - i) { sum += readings[i]; }return(int16_t)(sum/numReadings); }

I’ve included some eCv annotations (highlighted like this)
in this example, to specify that the valid indices of *readings *are* 0..(numReadings – 1)*, and
that *numReadings *isn’t zero so that the final division operation will be valid. I’ve also provided
a loop invariant and loop variant (see my posts on verifying loops if you aren’t familiar with these).
However, in the postcondition I’ve put a question mark where I need to express “sum of all elements of
*readings*“, and in the loop invariant I’ve put another question-mark where I want “sum of
the first *i* elements of *readings*“. How can we express these quantities?

In this case, there is an easy way, and another way that is less easy but more general. Let’s start with the
easy way. eCv is derived from Perfect Developer,
and when writing eCv specifications you can use most of the library types and expression types provided by PD.
In particular, type *seq of T* from PD is treated as equivalent to *T[ ]* in C. So we can
use *Perfect *sequence operations on C arrays. A list of member functions of *seq of T* can
be found in the Perfect Developer Library Reference. In PD, most of these functions are available
for use anywhere, since code can be generated for them; but when used in eCv, they are of course
all “ghost” functions – that is, functions that can be used in specifications only.

The particular *Perfect *expression type we need here is the **over **expression,
which expresses repeated application of an operator over the elements of a collection. Those with a
background in functional programming may recognise it as *left-fold*. Our postcondition can be written:

post(result== (+overreadings.all)/numReadings)

We’ve had to use *readings.all* rather than just *readings *as the
operand of +**over**, because *readings *is an array pointer, and we need
to provide a genuine array to +**over**. Hence eCv provides the array pointer type
with ghost member *all*. You can think of* readings.all* as yielding the
sequence *readings[readings.lwb]*, *readings[readings.lwb + 1]* and so on up to
*readings[readings.upb]*. We’d have liked to use **readings *to mean the
array that *readings *points to, but of course in C that just yields the value of the first element.

For the loop invariant expression, we can use +**over **again, but we
need to apply it to the first *i* elements of *readings *rather than all elements.
The *seq of T* class provides member *take *for this purpose, allowing us to use:

keep(sum == +overreadings.all.take(i))

That’s enough to verify our function, apart from dealing with a potential integer overflow
when summing the elements, which I’ll return to in a later post. Next time I’ll demonstrate
how we can use a *ghost function* to define the notion of summation without recourse to
**over**-expressions.