Escher Technologies Articles on Formal Verification>

Making sure variables are initialized

One source of program bugs is the use of variables before they have been initialized.

In C/C++ all static variables get zero-initialized if they have no specified initialization, so it is only local variables we need to worry about. Bugs caused by use of uninitialized local variables can be particularly nasty, because the value of such a variable depends on whatever previously occupied the same stack location. So the value read can vary depending on the past history of the program, and on the optimization level selected when the program was compiled (thereby causing debug builds and release builds to behave differently). An uninitialized variable bug that was benign and undetected in one version of a program can suddenly rear its ugly head when an unrelated change is made to the program.

We therefore need to ensure that all automatic variables are initialized before being used (MISRA C 2004 rule 9.1). Let’s consider the case of variables of primitive types first – we’ll look at structs, classes and unions later. Most static analyzers can warn about simple cases like this:

void foo() {
  int i;
  ...      /* code that doesn't assign i */
  bar(i);  /* error: i not initialized */
}

Many static analyzers will also warn about the following:

void foo() {
  if (b1) { i = 0; }
  ...      /* code that doesn't assign i */
  if (b2) { bar(i); }
}

If b1 in the first if-statement was true whenever b2 in the second if-statement is true, then such a warning is a false positive. One approach when using such a static analyzer is to add a dummy initialization for i, after satisfying oneself that the code is correct. Modern languages such as C# and Java use a concept of “definite initialization” that also requires a dummy initialization in this situation, otherwise the compiler will refuse to compile the program. However, eCv does a more thorough analysis and will only generate a warning if it can’t prove that b2 implies b1, thereby avoiding the need for a dummy initialization.

Here is a more difficult case:

void foo() {
  int i;
  bar(&i);  /* does this require i to be initialized first? */>
  fum(i);   /* has i been initialized yet? */
}

The problem here is that we don’t know whether bar() expects its parameter to point to initialized memory or not, or whether bar() always writes to the variable addressed by its parameter. If bar() is a fairly simple function, then a static analyser may be able to work it out; but that isn’t always possible, especially if bar() calls other functions whose source is not available.

eCv gets round this problem by providing an out-parameter annotation. If bar() is declared with the following signature:

void bar(out int* p);

then bar() does not require p to point to an initialized variable, but it is obliged to assign *p prior to returning. As usual with eCv keywords, when the program is being compiled, out is defined as a macro expanding to nothing, so the compiler doesn’t object to this syntax.

What about structs and classes? If you’re using C++, then I recommend that you define at least one constructor in every class or struct declaration. This prevents you from declaring a variable of that type without a constructor being called. eCv will require a constructor to initialize all fields of the class or struct before returning, and will verify this.

What about unions? You may recall from my previous post that eCv considers each union to include a ghost discriminant field that can be interrogated using the holds operator. Well, eCv considers an uninitialized union to hold nothing. Therefore, attempting to access a member of an uninitialized union will always result in a verification failure.

List of Articles

TOP