Order of parameter evaluation, some pitfalls

Introduction

In the C++ world, it’s a good idea to create multiple build configurations for your projects to try out different compilers. Especially when your project is open-source, you never know which compiler (version) people will use in the build process, so being conservative with compilers might lead to bug reports which are difficult to close. Not being open to bugs occuring with other compilers can even lead to memory leaks not being noticed. In this post, I’m going to elaborate on this, using as an example a bug I ran into the day before.

The bug

Let me first describe the scenario in which the bug occurred. The game fruitcut that I’m currently working on (shameless plug detected!), loads md3 model files to C++ structures and uploads these structures to the graphics card for rendering. The md3 loader always worked fine in gcc (multiple versions) and Microsoft Visual Studio 2010 (MSVC2010), but it had one peculiar bug in all versions of clang, even the latest trunk.

Wrong texture coordinates on a melon (in-game screenshot)

Wrong texture coordinates on a melon (in-game screenshot)

As you can see on the left, the mesh looks alright, but the texture coordinates are clearly wrong. Since this bug hasn’t been fixed in 7 months (due to not caring much about clang, to be honest), I decided to investigate. I quickly found out that the texture coordinates’ axes are flipped in clang, so instead of (x,y) we somehow get (y,x) on that compiler.

What a strange bug! This “flipping”, however, can basically only occur in two places: 1. when reading the file and storing the texture coordinates in the internal vector structure, or 2. when the loaded model is uploaded to the GPU. I decided to look at the code for step 1 first. A single (x,y) texture coordinate pair is defined as follows:

class md3_texpos
{
private:
  vector<float,2> coordinates_;
public:
  md3_texpos(
    std::istream &_istream)
  :
    coordinates_(
      md3_read_scalar<float>(
        _istream),
      md3_read_scalar<float>(
        _istream))
  {
    std::swap(
      coordinates_.x(),
      coordinates_.y());
  }

  vector<float,2> coordinates() const
  {
    return coordinates_;
  }
};

The coordinate consists of a two-dimensional vector and, on construction, gets an input stream to read its data from (whether or not this is good object-oriented style is incidental). The function md3_read_scalar<T> reads raw bytes and converts them to a numeric type T, taking care of endianness conversion. md3 stores values in little endian, the host machine might be big endian, though, so this is important.

Now, although I actually am the author of this code (thanks, git-blame!), I was still puzzled by the call to std::swap in there. There was no comment, so I didn’t know what it was supposed to do. It contradicted the md3 specification. So the first thing I did was remove it and recompile with clang and gcc. Voila, the texture coordinates were now broken in gcc and fixed in clang. The reason was now clear to me. It has to do with C++ leaving certain things implementation-defined. In this case, it is argument evaluation order.

It’s actually pretty simple. Consider the following code fragment, which has three function calls in it:

f(g(),h());

The question is: In which order do the three functions get called? Clearly, f has to be called last, because its arguments have to be initialized before the call. The C++11 standard enforces this in 1.9 15:

When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function.

However, both g → h → f and h → g → f are possible evaluation orders which satisfy that requirement. What does the Standard make of that? Well, for reasons such as performance optimizations by reordering, the order is simply not defined, see 5.2.2 4:

When a function is called, each parameter (8.3.5) shall be initialized (8.5, 12.8, 12.1) with its corresponding argument. [ Note: Such initializations are indeterminately sequenced with respect to each other (1.9) — end note ]

Thus, the implementation (i.e. the compiler) has two options for translating the constructor initialization list of md3_texpos, shown in pseudocode below:

md3_texpos(                                md3_texpos(
  std::istream &_istream)                     std::istream &_istream)
{                                           {
  float x =                                   float y =
    md3_read_scalar<float>(                     md3_read_scalar<float>(
      _istream);                                  _istream);
  float y =                                   float x =
    md3_read_scalar<float>(                     md3_read_scalar<float>(
      _istream);                                  _istream);
  coordinates_ =                              coordinates_ =
    vector<float,2>(x,y);                       vector<float,2>(x,y);
}                                           }

Of course, md3_read_scalar is not a pure statement (see my post on this). It modifies the input stream, advancing it by 4 bytes. This makes the implementation on the right side behave incorrectly, although it’s valid for the compiler to generate such code. The mysterious std::swap in the original code can now be explained. Both gcc and MSVC2010 chose the right implementation which necessitates a swap, clang chose the left one (at least in this case).

To fix the bug, I rewrote the class so we store the vector’s components separately:

class md3_texpos
{
private:
  float x_,y_;
public:
  md3_texpos(
    std::istream &_istream)
  :
    x_(
      md3_read_scalar<float>(
        _istream)),
    y_(
      md3_read_scalar<float>(
        _istream))
  {
  }

  vector<float,2> coordinates() const
  {
    return vector<float,2>(x_,y_);
  }
};

This fixes the issue and thanks to encapsulation, the API doesn’t change!

Memory leaks

The bug described above isn’t a critical one, it didn’t make the application crash or cause other serious failures. Being careless in combination with the new operator, however, can easily result in memory leaks. Consider the following code snippet:

// Function declaration:
void f(int,std::shared_ptr<T>);

// Function call
f(g(),new T());

As I’ve described above, this (loosely) translates either to

int u = g();
T *t = new T();
f(u,std::shared_ptr<T>(t));

or to

T *t = new T();
int u = g();
f(u,std::shared_ptr<T>(t));

As an exception-safe programmer, you might have already spotted the problem: The second implementation leaks memory if g() throws an exception, because nobody deletes t! And as you can see, using smart pointers like std::shared_ptr<T> does not automatically fix the leak, because the raw pointer is converted to a shared pointer after the throw statement in g.

There are two ways to solve this problem. The first is to immediately convert the raw pointer to a smart pointer:

f(g(),std::shared_ptr<T>(new T()));

This way, either…

  1. g() is executed
  2. g() throws
  3. new is never called

…or…

  1. a new T is constructed on the heap
  2. the raw pointer to the new T instance is stored inside a std::shared_ptr<T>.
  3. g() is executed
  4. g() throws
  5. The destructor of the std::shared_ptr<T> causes a delete t;
  6. No leak occurs

The second solution is std::make_shared<T>, a variadic function that returns a std::shared_ptr<T>:

f(g(),std::make_shared<T>());
Posted in Uncategorized | Tagged , , , , , , , | Leave a comment