Design rationale for the templog library

Logging

Logging is a very good way of finding errors in your application if those errors only appear at a client's site on the other side of the planet. It also helps tremendously where errors only appear at certain timing patterns (like when you're multi-threading) which never occur during debugging.

But logging has very conflicting goals: It should be able to produce a lot of information in case of error without drowning you in a sea of useless data and, even more important, without slowing down your application. While this combination certainly cannot be achieved, a sophisticated logging system can give you the necessary tools and levers to balance between these according to your needs. Unfortunately, sophistication usually conflicts with yet another important goal of logging: It should be easy to use.

So how can we combine these goals, in order to create a logging system that is easy to use, efficient, and yet allows to individually control the amount of information passed on?

Syntax

Many people prefer the streaming idiom for logging. They want to write

 some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << get_info() << " inserted";

where some_logger is either a stream or something stream-like. This probably stems from the fact that, at one point, everybody was using std::cerr for logging and from the general familiarity of the idiom. Using real streams for logging, however, is not very efficient. Also, it is hard to turn off logging for some or all log statements

Performance

Consider the following logging statement:

  some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << another_var << " inserted";

An important efficiency problem is that the set of parameters passed to the logging statement always has to be evaluated at run-time in order to get passed on. But if the evaluation of the logging paramters has no side-effects, it could, if this particular log statement will result in the information being discarded later, safely be skipped. However, syntactic rules of C++ demand the parameters to be evaluated before the operators (<<, here) are called.

This problem can be solved by using expressions templates to defer the evaluation of the parameters: Instead of evaluating the parameters right away, a compile-time lists of references to the parameters is created and passed to the logging library where it is only ever evaluated if the information isn't discarded. A modern compiler should be able to optimize this to the point where, if logging of this specific information is disable at compile-time, the above line generates no code at all. If it is disabled at run-time, it should boil down to the equivalent of a single if statement.

On the other hand, we do not want to skip evaluation of parameters with side-effects, because we don't want to run into the classic assert problem: If NDEBUG is defined (as is usually done for release builds), the assert macro will discard anything passed to it, so that the code executed might differ between release and debug builds. Therefor, given this log statement

  some_logger << "A string with parameter #1 " << some_func() << and parameter #2 " << another_func() << " inserted";

if the two functions have side-effects, we want them to be called at run-time, even if their results are then discarded. If they do not have any side-effects, since the fact that the results are to be discarded is determinable at compile-time, this gives the compiler the opportunity for discarding the calls altogether.

Streaming

For logging, in order to achieve maximum performance while not losing important information (which sat in some buffer when the application crashed), the following must hold:

For this a expression template mechanism using operator<< needs to be created. This would put all the operands into a compile-time list, determine whether the information provided is to be logged at all, and only if it is, iterate over the compile-time list and push all the operands down some logging sink.

However, the problem with this is that the streaming idiom actually does not lend itself easily for logging where information needs to be written immediately, because streaming does not make it easy to chunk information in a deterministic way. With streaming, information is just shoved down some operator<<, ends up in some buffer, and when that buffer is full it will get flushed to the target device. The only way to control flushing of stream buffers is by explicitly and manually inserting flushing control tokens (like std::flush or std::endl) into the stream.

What is needed for this is a way to determine when all parameters are found and the information is either to be discarded or output to the device in one chunk. But given this log statement

 some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << get_info() << " inserted";

when inside operator<< one cannot tell whether it is its first invocation in that line, its last, or one in between. So there is no way to know whether to flush the information given or not.

One way to solve this would be to require flushing control tokens to be appended to each log statement. Once the operator<< version taking that token is called, all others are known to have been called. But this can not be enforced at compile-time and if users forget it, some log statement's information might not get flushed. Users might then find out they lose valuable information not before they actuelly need that information.

Another way is to replace the left-most operator<< with some operator of a lower precedence (like operator=) which, due to operator precedence rules, would then be called as the very last. While this seems very clumsy and again relies on users remembering its syntax which differs from the streaming idiom, it can easily be hidden behind a macro. The macro could also serve to pass __FILE__ and __LINE__ automatically which further eases usage.

This is why the latter approach was chosen.

Getting started

This should get you started.