## Verifying absence of integer overflow

One class of errors we need to guard against when writing critical software is arithmetic overflow. Before I go into detail, I invite you to consider the following program and decide what it prints:

```c
#include <stdio.h>

int main(int argc, char* argv[]) {
    unsigned int x = 42;
    long y = -10;
    printf("%s\n", (x > y ? "Hello, normal world!" : "Hello, strange world!"));
    return 0;
}
```

I’ll return to this example later. For now, consider the function for averaging a set of readings stored in an array that I examined in my previous two posts (the eCv-specific parts are the `pre`, `returns`, `keep` and `decrease` annotations):

```c
int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0)
returns((+ over readings.all)/numReadings)
{
    int sum = 0;
    size_t i;
    for (i = 0; i < numReadings; ++i)
    keep(i <= numReadings)
    keep(sum == + over readings.all.take(i))
    decrease(numReadings - i)
    {
        sum += readings[i];
    }
    return (int16_t)(sum/numReadings);
}
```

I reported in the earlier post that eCv is able to verify this, apart from proving absence of integer overflow. Here are the verification warnings that eCv reports:

```
c:\escher\ecvtest\ecvtest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator '+' is within limit of type 'int', suggestion available (see C:\Escher\ecvTest\ecvtest14_unproven.htm#9), did not prove: minof(int) <= (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]).

c:\Escher\ecvTest\ecvtest14.c (17,13): Warning! Exceeded time limit proving: Arithmetic result of operator '+' is within limit of type 'int', suggestion available (see C:\Escher\ecvTest\ecvtest14_unproven.htm#10), did not prove: (sum$loopstart_5398,5$ + readings[i$loopstart_5398,5$ - readings.lwb]) <= maxof(int).

c:\escher\ecvtest\ecvtest14.c (19,22): Warning! Unable to prove: Type constraint satisfied in implicit conversion from 'int' to 'unsigned int', suggestion available (see C:\Escher\ecvTest\ecvtest14_unproven.htm#18), did not prove: minof(unsigned int) <= sum$5398,5$.
```

Any time we perform an explicit or implicit conversion to a type that is not wider than the original, or to a type defined by a constrained typedef, or we perform certain arithmetic operations, we need to guard against integer overflow. This example contains several such operations.

The first implicit type conversion is when we initialize the loop counter *i* to the constant 0.
We declared *i* to be of type **size_t**, which is an unsigned type, whereas the constant
0 has type (signed) **int** because we didn’t use the U suffix. So there is an implicit
conversion from **int** to **size_t** here, and eCv will generate verification conditions
that the constant 0 is in the range of **size_t**. Of course, the proof is trivial.

The other implicit type conversion is in the subexpression *sum/numReadings* in the **return**
statement. Since *numReadings* has type **size_t** and *sum* has type **int**,
the “usual arithmetic conversions” will be applied. In practice, **size_t** is never
smaller than **int**, so there will be an implicit conversion of *sum* to **size_t**.
eCv therefore generates the verification condition that the current value of *sum* is representable
in **size_t**. As there is nothing to prevent *sum* from being negative, the proof fails –
hence the third warning message listed above.

In fact, eCv – like most ordinary static analyzers and some compilers – warns about the signed/unsigned mismatch even before starting the proofs. Avoiding signed/unsigned mismatches in the first place is probably a more efficient strategy than dealing with the verification failures that often occur when you don’t.

To fix the error, we can either change the type of the parameter *numReadings* to **int**, or cast it
to type **int** before using it as the divisor. If we do the latter, we’ll also need to add a function
precondition that *numReadings* is not too large to fit in an **int**.

For the integer operators `+`, `-` and `*`, it can be the case that applying the “usual arithmetic conversions”
to an operator expression does not involve an unsafe implicit type conversion, yet the arithmetic result is not
representable in the result type. The C standard defines the result in this case as undefined if the promoted
operands are signed, or as the arithmetic result modulo some *N* if they are unsigned. eCv treats it as an
error in either case, because unsigned types are frequently used where modulo arithmetic is not the user’s intention;
so it always generates verification conditions that the result is within the range of the result type. This is the
reason for the first two verification failures that eCv reports for our original example. The expression
`sum += readings[i]` could easily overflow, because the sum could be as high as
*maxof(int16_t) × maxof(size_t)* or as low as *minof(int16_t) × maxof(size_t)*, which exceed
*maxof(int)* and fall below *minof(int)* respectively. To avoid overflow, we will need to constrain either
the minimum and maximum values of each reading, or the number of readings. Here’s our
example with the precondition changed to limit the number of readings:

```c
int16_t average(const int16_t * array readings, size_t numReadings)
pre(readings.lwb == 0; readings.upb == numReadings)
pre(numReadings != 0; numReadings <= maxof(int)/(maxof(int16_t) + 1))
returns((+ over readings.all)/numReadings)
{
    int sum = 0;
    size_t i;
    for (i = 0; i < numReadings; ++i)
    keep(i <= numReadings)
    keep(sum == + over readings.all.take(i))
    decrease(numReadings - i)
    {
        sum += readings[i];
    }
    return (int16_t)(sum/(int)numReadings);
}
```

I’ve referred to the maximum value supported by type **int** as *maxof(int)*. eCv provides the type
operators *maxof(T)* and *minof(T)* for your convenience, to save you from having to #include
<limits.h> or similar just for specification purposes. I’ve also cast *numReadings* to **int**
prior to using it in the division.

With the compiler parameters set to assume **int** is 32 bits and **short int** is 16 bits,
eCv verifies this example completely. You may be wondering why the expression
`(+ over readings.all)/numReadings` in the **returns** specification doesn’t also
generate a signed/unsigned warning or verification failure. That’s because integer arithmetic in
specifications is always carried out after promoting operands to type **integer**, the eCv
integral type that has no bounds.

Finally, let’s return to the short program I invited you to consider at the start of this post. Is the world
normal or strange? On my computer, compiling with Microsoft Visual C++ 2008, it’s strange.
That’s because for this compiler, both (unsigned) **int** and **long** are 32 bits.
So both operands of the comparison get converted to **unsigned long** … and -10 won’t fit! For
the world to be normal, you’ll need to use a compiler for which **long** is wider than
**int**, so that both operands get converted to (signed) **long** instead.