Pure functions in C/C++

Introduction

Functions that only depend on their parameters and not on any global state are usually called pure functions. An example of a non-pure function is this:

int counter = 0;

int return_global_counter(int a)
{
  return counter++;
}

This function is a corner case. It doesn’t depend on the parameters at all and it returns a different value each time it is called. Functions returning void are another example of non-pure functions.

There are lots of examples of pure functions. For example, mathematical functions such as abs, pow and the operator+ for integers are pure. In fact, I’m telling the story backwards here. Pure functions are supposed to model mathematical functions, which are inherently pure (which “mathematical global state” should they depend on, anyway).

The question is, how are pure functions represented in C/C++. The canonical answer is: They aren’t. There’s no keyword to denote or enforce the purity of a function, so from a Standard’s viewport, this article ends here. However, almost all major compilers (except MSVC, apparently) support extensions that allow one to mark a function as pure.

In fact, there are two “purity attributes” that you can add: const and pure.

pure and const

I’ll just quote the gcc manual to explain what “const” means:

Many functions do not examine any values except their arguments, and have no effects except the return value. Note that a function that has pointer arguments and examines the data pointed to must not be declared const. Likewise, a function that calls a non-const function usually must not be const. It does not make sense for a const function to return void.

As you can see, this models purity as explained above, with some sensible constraints (for example, you cannot “break out of” purity by calling non-pure functions). The pointer constraint, however, is very restricting, especially in C++, because references seem to be treated as pointers in this context, too. I couldn’t find any good information on that, though.

The second attribute is pure and it means the following:

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables.

This is a less strict form of const. We’re allowed to read global variables, but not write to them.

You can attach these attributes in the following form to a function:

int output(int x) __attribute__((const));

int output(int x)
{
  return x+1;
}

int output2(int x) __attribute__((pure));

int output2(int x)
{
  return x+1;
}

Note that attaching an attribute to a function definition is illegal.

Implications

In theory

Determining which functions are pure/const is a vital part of the optimization process every modern compiler goes through. It is used, for example, in simplifying the loop:

for(container::iterator it = container_.begin(); it != container_.end(); ++it)
{
  ...
}

For most containers, the end member function is pure. It just returns a pointer to one-past the last element. This information allows the compiler to optimize away all but one call to the end function:

for(container::iterator it = container_.begin(),end = container_.end(); it != end; ++it)
{
  ...
}

This, of course, assumes that the loop body doesn’t do anything “nasty” like clearing the container.

The problem with this optimization is that it doesn’t scale to multiple translation units. Let’s assume we have this loop:

for(int i = 0; i < fibonacci(100); ++i)
{
	...
}

Since fibonacci(100) is a constant value, the compiler would like to optimize it this way:

for(int i = 0,end = fibonacci(100); i < end; ++i)
{
	...
}

Thus saving many unnecessary calculations. However, let’s say fibonacci is defined in a different translation unit (in a separate math.cpp file maybe). Then the compiler just cannot optimize it. It doesn’t have the information available to infer that this function is const. Bummer.

This is where the function attribute kicks in. Declaring the function like this:

int fibonacci(int) __attribute__((const));

allows the compiler to optimize the loop.

In practice

The question remains: Do the compilers give a crap? Let’s see what happens in a simple example:

#include <iostream>

int output() __attribute__((const));

int output()
{
    std::cout << "lol | ";
    return 1;
}

int main()
{
    output();
}

Compiling this with g++ -Wall test.cpp results in the following output:

test.cpp:14:11: warning: statement has no effect [-Wunused-value]

How did that happen? Well, pure and const functions just read data, they do not write to it. Also, “by contract”, they do not modify the environment (print values and such things). So calling a pure/const function and not using the result simply doesn’t make any sense, which is what gcc (and clang) is trying to tell us here.

That also explains that when we run the program, nothing is printed. Even without optimization enabled, the whole function call gets eliminated (again, by both clang and gcc). Apparently, this type of dead code is removed in a very early stage of analysis.

Let’s modify the example so the function cannot be completely eliminated:

#include <iostream>

int output() __attribute__((const));

int output()
{
		std::cout << "lol | ";
		return 1;
}

int main()
{
	std::cout << output() << " | " << output();
}

What happens here is very interesting. Let’s go though all the optimization/nonoptimization and clang/gcc cases:

Output with -O0 Output with -O3
gcc lol | lol | 1 | 1 lol | 1 | 1
clang lol | 1 | lol | 1 1 | 1

Apparently, gcc doesn’t remove the cout statement in output no matter how hard you tell it to optimize. clang removes the statement when at least -O1 is enabled. gcc also seems to reorder the code a bit so the cout statements are evaluated first and then the values in main are printed. This is, of course, permitted since declaring the function const and then “breaking the contract” implies undefined behavior.

Note that the compilers do not issue a warning about the illegal cout statement in the const function. Purity isn’t checked or enforced, unfortunately.

Random notes

Note that in gcc-4.6.3 there’s a warning flag -Wsuggest-attribute=pure and -Wsuggest-attribute=const that tells you (by using optimization analysis) candidates for purity/constness. This seems to work, but it “misses” a lot of functions. For example, trivial getter functions such as the getI function below are candidates for purity (not constness, though):

class testclass
{
private:
  int i;
public:
  int getI() const { return i; }
};

This doesn’t get suggested by gcc, though.

Be cautious when using pure and const with template classes. If you have a container template, for example, you’re better off making no assumptions on the container’s value_type. Maybe the user puts a class in it and expects certain copy semantics, for example. These would be indeterministic with pure and const attributes.

Also, be cautious with const and classes. The simple getter function above is a candidate for pure, but not for const, as this example shows:

class testclass
{
private:
  int i;
public:
  testclass() : i(0) {}
  void increase() { i++; }
  int getI() const __attribute__((const));
};

int testclass::getI() const { return i; }

int main()
{
  testclass c;
  std::cout << c.getI();
  c.increase();
  std::cout << c.getI();
}

This will output “00” instead of the correct “01”. However, returning a reference might make the getter a candidate for const:

class testclass
{
private:
  int i;
public:
  testclass() : i(0) {}
  void increase() { i++; }
  int const &getI() const __attribute__((const));
};

int const &testclass::getI() const { return i; }

int main()
{
  testclass c;
  std::cout << c.getI();
  c.increase();
  std::cout << c.getI();
}

This will output “01”, as the reference does not change because of the call to increase.

Another minor thing: Not all mathematical functions from math.h are const, or even pure. sqrt, for example, sets errno on an invalid input, and thus modifies global state.

This entry was posted in Uncategorized and tagged , , , , , , . Bookmark the permalink.

7 Responses to Pure functions in C/C++

  1. Neither “const” nor “pure” attributes are sufficient for my particular requirement – that the function always returns the same value, given the same parameters, regardless of global state.
    This is, in my opinion, the only criterion that allows a call to be eliminated as a common sub-expresson.
    My particular problem is that I want to transform a function that rebalances a tree, but that tree could be accessed through a memory-map to a file on disk.
    This is fine if (a) you have mmap, and (b) the integer values in the file were stored by another program with the same architecture/endian-ness.

    To solve this, I would need to somehow send the whole recursive rebalance function through a common-sub-expression dummy parse and use the optimised result as a compiler input.

    This is because pread() is neither const nor pure, but should (barring hardware or disk error or a pwrite to the same storage) always return the same data.
    More than that, these reads are pure, but their purity doesn’t out-scope the rebalance function.
    Nasty!

  2. Pingback: Order of parameter evaluation, some pitfalls | pimiddy

  3. Alec says:

    “-Wsuggest-attribute=pure and -Wsuggest-attribute=const that tells you (by using optimization analysis) candidates for purity/constness … but it “misses” a lot of functions. For example, trivial getter functions such as the getI function below are candidates for purity”
    I don’t know how the C++ front-end of the compiler works, but I assume it’s effectively passing a pointer to the class instance to getI.
    i.e. getI() becomes:
    int getI(testclass * self) { return self->i; }
    and thus isn’t pure.

  4. Alec says:

    *Oops, I misinterpreted you. Ignore my comment.

  5. whitslack says:

    “Note that attaching an attribute to a function definition is illegal.”

    Even if that is technically true (and I’m not sure that it is), GCC allows you to attach the const and pure attributes to function definitions:

    int __attribute__((const)) output(int x) { return x + 1; }

  6. Pingback: Thoughts On Bridging the “Lacuna Humana” | Codecraft

  7. Pingback: Is Fortran easier to optimize than C for heavy calculations?

Leave a comment