Introduction
In the C++ world, it’s a good idea to create multiple build configurations for your projects to try out different compilers. Especially when your project is open-source, you never know which compiler (version) people will use in the build process, so being conservative with compilers might lead to bug reports which are difficult to close. Not being open to bugs occuring with other compilers can even lead to memory leaks not being noticed. In this post, I’m going to elaborate on this, using as an example a bug I ran into the day before.
The bug
Let me first describe the scenario in which the bug occurred. The game fruitcut that I’m currently working on (shameless plug detected!), loads md3 model files to C++ structures and uploads these structures to the graphics card for rendering. The md3 loader always worked fine in gcc (multiple versions) and Microsoft Visual Studio 2010 (MSVC2010), but it had one peculiar bug in all versions of clang, even the latest trunk.
As you can see on the left, the mesh looks alright, but the texture coordinates are clearly wrong. Since this bug hasn’t been fixed in 7 months (due to not caring much about clang, to be honest), I decided to investigate. I quickly found out that the texture coordinates’ axes are flipped in clang, so instead of we somehow get on that compiler.
What a strange bug! This “flipping”, however, can basically only occur in two places: 1. when reading the file and storing the texture coordinates in the internal vector structure, or 2. when the loaded model is uploaded to the GPU. I decided to look at the code for step 1 first. A single texture coordinate pair is defined as follows:
class md3_texpos { private: vector<float,2> coordinates_; public: md3_texpos( std::istream &_istream) : coordinates_( md3_read_scalar<float>( _istream), md3_read_scalar<float>( _istream)) { std::swap( coordinates_.x(), coordinates_.y()); } vector<float,2> coordinates() const { return coordinates_; } };
The coordinate consists of a two-dimensional vector and, on construction, gets an input stream to read its data from (whether or not this is good object-oriented style is incidental). The function md3_read_scalar<T>
reads raw bytes and converts them to a numeric type T
, taking care of endianness conversion. md3 stores values in little endian, the host machine might be big endian, though, so this is important.
Now, although I actually am the author of this code (thanks, git-blame!), I was still puzzled by the call to std::swap
in there. There was no comment, so I didn’t know what it was supposed to do. It contradicted the md3 specification. So the first thing I did was remove it and recompile with clang and gcc. Voila, the texture coordinates were now broken in gcc and fixed in clang. The reason was now clear to me. It has to do with C++ leaving certain things implementation-defined. In this case, it is argument evaluation order.
It’s actually pretty simple. Consider the following code fragment, which has three function calls in it:
f(g(),h());
The question is: In which order do the three functions get called? Clearly, f
has to be called last, because its arguments have to be initialized before the call. The C++11 standard enforces this in 1.9 15:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function.
However, both g → h → f
and h → g → f
are possible evaluation orders which satisfy that requirement. What does the Standard make of that? Well, for reasons such as performance optimizations by reordering, the order is simply not defined, see 5.2.2 4:
When a function is called, each parameter (8.3.5) shall be initialized (8.5, 12.8, 12.1) with its corresponding argument. [ Note: Such initializations are indeterminately sequenced with respect to each other (1.9) — end note ]
Thus, the implementation (i.e. the compiler) has two options for translating the constructor initialization list of md3_texpos
, shown in pseudocode below:
md3_texpos( md3_texpos( std::istream &_istream) std::istream &_istream) { { float x = float y = md3_read_scalar<float>( md3_read_scalar<float>( _istream); _istream); float y = float x = md3_read_scalar<float>( md3_read_scalar<float>( _istream); _istream); coordinates_ = coordinates_ = vector<float,2>(x,y); vector<float,2>(x,y); } }
Of course, md3_read_scalar
is not a pure statement (see my post on this). It modifies the input stream, advancing it by 4 bytes. This makes the implementation on the right side behave incorrectly, although it’s valid for the compiler to generate such code. The mysterious std::swap
in the original code can now be explained. Both gcc and MSVC2010 chose the right implementation which necessitates a swap, clang chose the left one (at least in this case).
To fix the bug, I rewrote the class so we store the vector’s components separately:
class md3_texpos { private: float x_,y_; public: md3_texpos( std::istream &_istream) : x_( md3_read_scalar<float>( _istream)), y_( md3_read_scalar<float>( _istream)) { } vector<float,2> coordinates() const { return vector<float,2>(x_,y_); } };
This fixes the issue and thanks to encapsulation, the API doesn’t change!
Memory leaks
The bug described above isn’t a critical one, it didn’t make the application crash or cause other serious failures. Being careless in combination with the new
operator, however, can easily result in memory leaks. Consider the following code snippet:
// Function declaration: void f(int,std::shared_ptr<T>); // Function call f(g(),new T());
As I’ve described above, this (loosely) translates either to
int u = g(); T *t = new T(); f(u,std::shared_ptr<T>(t));
or to
T *t = new T(); int u = g(); f(u,std::shared_ptr<T>(t));
As an exception-safe programmer, you might have already spotted the problem: The second implementation leaks memory if g()
throws an exception, because nobody deletes t
! And as you can see, using smart pointers like std::shared_ptr<T>
does not automatically fix the leak, because the raw pointer is converted to a shared pointer after the throw
statement in g
.
There are two ways to solve this problem. The first is to immediately convert the raw pointer to a smart pointer:
f(g(),std::shared_ptr<T>(new T()));
This way, either…
g()
is executedg()
throwsnew
is never called
…or…
- a new
T
is constructed on the heap - the raw pointer to the new
T
instance is stored inside astd::shared_ptr<T>
. g()
is executedg()
throws- The destructor of the
std::shared_ptr<T>
causes adelete t;
- No leak occurs
The second solution is std::make_shared<T>
, a variadic function that returns a std::shared_ptr<T>
:
f(g(),std::make_shared<T>());