Today, every game comes with a set of postprocessing (pp) effects. These are effects applied to the already rendered scene, meaning they don’t deal with geometry, just with textures. One of the earliest and most widely used pp effects is the so-called bloom effect (which should not be confused with the bloom filter data structure).
A game utilizing this pp effect does the following:
- Render scene to texture original
- Render texture original to texture highlight using a filter which extracts the bright parts of the image
- Render texture highlight to texture blur using some sort of low pass filter, possibly a (separable) 5×5 Gaussian blur multiple times to really smooth it.
- Render both original and blur to result, combining the two textures somehow (maybe just add them together).
This process can be conveniently expressed using a directed graph.
So the first idea is to represent this chain of filters as a directed graph using boost::graph. The library has the ability to do a topological sort on the filters, so you don’t have to worry about the correct order of application. What you do have to worry about is how to define what a “filter” really is and what information to store at the graph nodes. In the graph above, you can spot three types of filters: nullary filters take no texture as input but produce a texture as output (such as the filter to render the scene to a texture); unary filters take a texture as input and produce a texture as output (highlight, blur); and finally, binary filters take two textures and produce a result (the combine filter).
More generally we could define an n-ary filter as a function which takes n textures and produces exactly one texture as a result. So in the above example you have to do the following (and this is what is currently implemented in fruicut):
- Create the 4 filters and add them to the graph (specifying the dependencies of each filter). This might be done only once at program startup, but it’s also conceivable that you might add or delete filters later to increase performance. Also, let’s assume for now that the graph stores references to a common base type.
- Sort the filters in topological order.
- For each filter f in the resulting sequence, check how many predecessors it has and try to dynamic_cast f to filters::nullary, filters::unary or filters::binary accordingly. If that doesn’t work, there’s something wrong with the graph.
- If the filter is nullary, apply it, store its result somewhere. If the filter is not nullary, collect all result textures from the predecessors and apply the filter. Again, store the result somewhere.
- Take the result of the last filter in the sequence and render it to the framebuffer (this could also be done with a unary filter which produces an empty texture as output, but that’s an implementation detail).
So with this approach, a filter is basically a wrapper around a shader plus one or more textures (the blur shader, for example, needs two textures if it’s implemented with a Gaussian). Pretty simple, really, but there are a bunch of problems with this approach: First of all, a filter might produce more than one result. OpenGL and DirectX both support multiple render targets at once (making deferred shading feasible). But that’s nothing I personally miss – at least currently – so I won’t discuss this further.
Secondly, what if you want to use a filter more than once? Maybe you want another cool effect which, again, needs the blur filter. You’d have to create the filter twice, thus load the shader twice and create the textures twice.
Textures present a more general problem: in the above approach, you create too many of them! Assuming you add another effect using the blur shader to just blur the whole screen (something you might do to add a smooth fadeout), let’s see exactly how many.
Here, we assumed that the screen resolution is 1024×768 (which is really, really low by today’s standards) and that the bloom shader works on smaller, 512×512 textures, since that’s usually the case. As you can see, we create 4 (!!!) 1024×768 textures and 3 with size 512×512, when clearly, we can do better. For example, when combine finishes, the textures for highlight, blur and original are not used anymore and could be re-used in the last blur stage. This saving grows with the number of independent “filter paths” you have.
This problem could be tackled with a relatively simple “texture manager” which you can query for a texture:
texture_ptr t = texture_manager.query(dim(1024,768),filter::linear,flags::none);
The above would try to retrieve an existing texture from a pool of “non locked” textures and create a new texture if the query fails. This could also be implemented with a proxy class like this:
lazy_texture t = texture_manager.query(dim(1024,768),filter::linear,flags::none);
// later that day...
texture_ptr real_texture = t.get_texture();
// work with real_texture
No matter how you implement it, it will slow down the first render frame, since all the textures have to be loaded.
In a similar way, we could add a “shader pool” which stores a (ptr_)map from a pair of vertex and fragment shader file names to a loaded shader for later retrieval.
In this scheme, a filter would receive said texture manager as well as said shader pool, making it very lightweight (even copyable).
So those are my thoughts for today. Maybe I’ll find enough time to implement the texture and shader pool – if I don’t get any better ideas, that is.