Logging is a very good way of finding errors in your application if those errors only appear at a client's site on the other side of the planet. It also helps tremendously where errors only appear at certain timing patterns (like when you're multi-threading) which never occur during debugging.
But logging has very conflicting goals: It should be able to produce a lot of information in case of error without drowning you in a sea of useless data and, even more important, without slowing down your application. While this combination certainly cannot be achieved, a sophisticated logging system can give you the necessary tools and levers to balance between these according to your needs. Unfortunately, sophistication usually conflicts with yet another important goal of logging: It should be easy to use.
So how can we combine these goals, in order to create a logging system that is easy to use, efficient, and yet allows to individually control the amount of information passed on?
Many people prefer the streaming idiom for logging. They want to write
some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << get_info() << " inserted";
where some_logger
is either a stream or something stream-like.
This probably stems from the fact that, at one point, everybody was using
std::cerr
for logging and from the general familiarity of the
idiom. Using real streams for logging, however, is not very efficient. Also, it
is hard to turn off logging for some or all log statements
Consider the following logging statement:
some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << another_var << " inserted";
An important efficiency problem is that the set of parameters passed to the logging statement always has to be evaluated at run-time in order to get passed on. But if the evaluation of the logging paramters has no side-effects, it could, if this particular log statement will result in the information being discarded later, safely be skipped. However, syntactic rules of C++ demand the parameters to be evaluated before the operators (<<, here) are called.
This problem can be solved by using expressions templates to defer the
evaluation of the parameters: Instead of evaluating the parameters right away,
a compile-time lists of references to the parameters is created and passed to
the logging library where it is only ever evaluated if the information isn't
discarded. A modern compiler should be able to optimize this to the point where,
if logging of this specific information is disable at compile-time, the above
line generates no code at all. If it is disabled at run-time, it should boil
down to the equivalent of a single if
statement.
On the other hand, we do not want to skip evaluation of parameters with
side-effects, because we don't want to run into the classic assert
problem: If NDEBUG
is defined (as is usually done for release
builds), the assert
macro will discard anything passed to it, so
that the code executed might differ between release and debug builds. Therefor,
given this log statement
some_logger << "A string with parameter #1 " << some_func() << and parameter #2 " << another_func() << " inserted";
if the two functions have side-effects, we want them to be called at run-time, even if their results are then discarded. If they do not have any side-effects, since the fact that the results are to be discarded is determinable at compile-time, this gives the compiler the opportunity for discarding the calls altogether.
For logging, in order to achieve maximum performance while not losing important information (which sat in some buffer when the application crashed), the following must hold:
For this a expression template mechanism using operator<<
needs to be created. This would put all the operands into a compile-time list,
determine whether the information provided is to be logged at all, and only if
it is, iterate over the compile-time list and push all the operands down some
logging sink.
However, the problem with this is that the streaming idiom actually does not
lend itself easily for logging where information needs to be written
immediately, because streaming does not make it easy to chunk information in a
deterministic way. With streaming, information is just shoved down some
operator<<
, ends up in some buffer, and when that
buffer is full it will get flushed to the target device. The only way to control
flushing of stream buffers is by explicitly and manually inserting flushing
control tokens (like std::flush
or std::endl
) into
the stream.
What is needed for this is a way to determine when all parameters are found and the information is either to be discarded or output to the device in one chunk. But given this log statement
some_logger << "A string with parameter #1 " << some_var << and parameter #2 " << get_info() << " inserted";when inside
operator<<
one cannot tell whether it is
its first invocation in that line, its last, or one in between. So there is no
way to know whether to flush the information given or not.
One way to solve this would be to require flushing control tokens to be
appended to each log statement. Once the operator<<
version taking that token is called, all others are known to have been called.
But this can not be enforced at compile-time and if users forget it, some log
statement's information might not get flushed. Users might then find out they
lose valuable information not before they actuelly need that information.
Another way is to replace the left-most operator<<
with some operator of a lower precedence (like operator=
)
which, due to operator precedence rules, would then be called as the very last.
While this seems very clumsy and again relies on users remembering its syntax
which differs from the streaming idiom, it can easily be hidden behind a macro.
The macro could also serve to pass __FILE__
and __LINE__
automatically which further eases usage.
This is why the latter approach was chosen.
This should get you started.