For public API methods, good programming practice recommends always checking incoming arguments for validity before using them. When a problem is found, code may return an error code, throw an exception, or abort with an assert. But when the arguments are valid, what is the runtime cost of this checking? Does declaring that a method can throw an exception slow down the method even if one isn't thrown? This article does quick benchmarking to look at some runtime costs associated with argument checking, exceptions, and asserts.
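As a rough illustration of the three reporting styles named above, the sketch below validates the same pointer argument in three ways. The function names (`sumChecked`, `sumThrowing`, `sumAsserting`) are hypothetical and are not taken from the article's benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helpers: each sums an array but reports bad arguments differently.

// Style 1: return an error code (-1 on bad arguments, 0 on success).
int sumChecked(const int* data, std::size_t n, long* out) {
    if (data == nullptr || out == nullptr) return -1;
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    *out = total;
    return 0;
}

// Style 2: throw an exception on bad arguments.
long sumThrowing(const int* data, std::size_t n) {
    if (data == nullptr) throw std::invalid_argument("data is null");
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Style 3: abort with an assert (the check is compiled out when NDEBUG is defined).
long sumAsserting(const int* data, std::size_t n) {
    assert(data != nullptr);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}
```

With valid arguments all three do the same work apart from the check itself, which is the overhead the benchmarks set out to measure.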
The C++ "inline" keyword marks methods as inlineable by the compiler to improve performance. But today's compilers seem to inline methods even if they aren't marked with "inline". So is there a need for the "inline" keyword? This article does quick benchmarking of multiple ways to define method bodies, with and without "inline", to see if there's any value to the keyword.
Object-oriented design favors class hierarchies with general base classes and specific subclasses. Virtual methods provide general implementations on the base class, overridden for specific implementations by subclasses. But how does this coding style affect runtime performance? This article does a quick benchmark to illustrate the cost of "virtual" on a method.
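For readers unfamiliar with the pattern, here is a minimal sketch of a base class with a virtual method and a subclass that overrides it. The `Shape` and `Square` classes and the `totalArea` function are hypothetical examples, not the article's benchmark code.

```cpp
#include <cstddef>

// General base class with a default implementation.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const { return 0.0; }
};

// Specific subclass that overrides the virtual method.
struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

// Each area() call here is resolved at run time through the object's
// dynamic type, which is the dispatch whose cost a benchmark would measure.
double totalArea(const Shape* const* shapes, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) total += shapes[i]->area();
    return total;
}
```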
Compiler name and version macros are predefined by all C/C++ compilers to enable #if/#endif sets around compiler-specific code, such as inline assembly, compiler-specific intrinsics, or special language features. This can be necessary in high-performance code that aims at using the best performance tricks available for each compiler. This article surveys common compilers and shows how to use predefined macros to detect the compiler name and version at compile time.
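A minimal sketch of the technique, covering only Clang, GCC, and Microsoft Visual C++; the function name `compilerName` is hypothetical. Clang is tested first because it also defines `__GNUC__` for compatibility, so the order of the checks matters.

```cpp
#include <string>

// Returns a human-readable compiler name and version, chosen at compile time.
std::string compilerName() {
#if defined(__clang__)
    // Must come before the GCC check: Clang also defines __GNUC__.
    return "Clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__);
#elif defined(__GNUC__)
    return "GCC " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__);
#elif defined(_MSC_VER)
    // _MSC_VER encodes major and minor version in one integer, e.g. 1900.
    return "MSVC " + std::to_string(_MSC_VER);
#else
    return "unknown";
#endif
}
```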
API functions to get the size of physical memory (RAM) differ between Windows, Linux, OSX, AIX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the physical memory size, and explains what works on what OS.
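The sketch below shows the general shape such a function might take. It is not the article's function: it assumes only two cases, `GlobalMemoryStatusEx` on Windows and `sysconf` with `_SC_PHYS_PAGES` where that is available (Linux, for example), and it returns 0 everywhere else. The name `physicalMemoryBytes` is hypothetical.

```cpp
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

// Returns the physical memory size in bytes, or 0 if it cannot be determined.
std::uint64_t physicalMemoryBytes() {
#if defined(_WIN32)
    MEMORYSTATUSEX status;
    status.dwLength = sizeof(status);   // must be set before the call
    if (!GlobalMemoryStatusEx(&status)) return 0;
    return status.ullTotalPhys;
#elif defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long pageSize = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || pageSize <= 0) return 0;
    return static_cast<std::uint64_t>(pages) *
           static_cast<std::uint64_t>(pageSize);
#else
    return 0;   // other OSes need their own calls; not covered in this sketch
#endif
}
```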
The "Resident set size" ("Working set size" on Windows) is the amount of physical memory (RAM) used by a process's code and data. Monitoring size changes is an important way to find memory leaks and improve performance, but methods to get this data differ between Windows, Linux, OSX, BSD, Solaris, and others. This article provides cross-platform functions to get the peak (maximum) and current resident set size of a process, and explains what works on what OS.
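As one illustration of the OS differences, the following sketch reads the peak resident set size with POSIX `getrusage`. It is an assumption-laden fragment rather than the article's cross-platform function: it does not cover Windows, and it relies on `ru_maxrss` being reported in bytes on OSX and in kilobytes on Linux and BSD. The name `peakRssBytes` is hypothetical.

```cpp
#include <cstddef>
#include <sys/resource.h>

// Returns the peak resident set size in bytes, or 0 on failure.
// POSIX-only sketch; some systems do not fill in ru_maxrss and report 0.
std::size_t peakRssBytes() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return 0;
#if defined(__APPLE__) && defined(__MACH__)
    return static_cast<std::size_t>(usage.ru_maxrss);          // already bytes
#else
    return static_cast<std::size_t>(usage.ru_maxrss) * 1024u;  // kilobytes
#endif
}
```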
2D images, 3D volumes, and other multi-dimensional data frequently require loops that sweep through an array to compute statistics, normalize values, or apply transfer functions. Maintaining a multi-dimensional array within a single linear array is a common performance technique. Popular "hand optimizations" fiddle with array indexing and pointer math to improve performance, but how well do they work? This article benchmarks nine common multi-dimensional array loop and indexing methods and four common compilers to find the fastest method to loop through multi-dimensional arrays quickly.
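To make the comparison concrete, here are two of the simplest variants for a 2D image stored in a single linear array: one computes an index from the row and column, the other walks a pointer from start to end. These are illustrative sketches with hypothetical names (`sumByIndex`, `sumByPointer`) and are not claimed to match any of the nine methods the article benchmarks.

```cpp
#include <cstddef>
#include <vector>

// Variant A: nested loops with a computed linear index (row-major layout).
long sumByIndex(const std::vector<int>& image,
                std::size_t width, std::size_t height) {
    long total = 0;
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            total += image[y * width + x];
    return total;
}

// Variant B: a single loop that increments a pointer through the array.
long sumByPointer(const std::vector<int>& image) {
    long total = 0;
    const int* p = image.data();
    const int* end = p + image.size();
    for (; p != end; ++p) total += *p;
    return total;
}
```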
Code performance always matters, and copying data is a common operation. Simple array copy code uses a loop to copy one value at a time. ISO C provides the memcpy() and memmove() functions to do this efficiently, but are they faster, by how much, and under what conditions? This article benchmarks five common methods and four common compilers to find the fastest method to copy an array quickly.
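For reference, a simple loop copy and the two library calls might be sketched as follows. The wrapper names are hypothetical, and these three are not claimed to be the five methods the article benchmarks. Note that `memcpy` requires the source and destination not to overlap, while `memmove` permits overlap.

```cpp
#include <cstddef>
#include <cstring>

// Copy one value at a time in a loop.
void copyLoop(int* dst, const int* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
}

// Copy with memcpy; the two ranges must not overlap.
void copyMemcpy(int* dst, const int* src, std::size_t n) {
    std::memcpy(dst, src, n * sizeof(int));
}

// Copy with memmove; the two ranges may overlap.
void copyMemmove(int* dst, const int* src, std::size_t n) {
    std::memmove(dst, src, n * sizeof(int));
}
```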
API functions to get the real time (wall-clock time) at sub-second resolution differ between Windows, Linux, OSX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the real time, and explains what works on what OS.
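A minimal sketch of what such a function could look like, assuming only two cases: `GetSystemTimeAsFileTime` on Windows and `clock_gettime` with `CLOCK_REALTIME` on POSIX systems that provide it. The name `realTimeSeconds` is hypothetical, and the two branches count from different epochs, so the result is only meaningful as a difference between two calls.

```cpp
#if defined(_WIN32)
#include <windows.h>
#else
#include <time.h>
#endif

// Returns the wall-clock time in seconds, or -1.0 on failure.
double realTimeSeconds() {
#if defined(_WIN32)
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);
    ULARGE_INTEGER ticks;
    ticks.LowPart = ft.dwLowDateTime;
    ticks.HighPart = ft.dwHighDateTime;
    return static_cast<double>(ticks.QuadPart) / 1.0e7;  // 100 ns ticks
#else
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) return -1.0;
    return static_cast<double>(ts.tv_sec) +
           static_cast<double>(ts.tv_nsec) / 1.0e9;
#endif
}
```

A typical use is to call it before and after a region of code and subtract the two values.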
API functions to get the CPU time used by a process differ between Windows, Linux, OSX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the process CPU time, and explains what works on what OS.
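A comparable sketch for process CPU time, again assuming only two cases: `GetProcessTimes` on Windows (kernel plus user time) and `clock_gettime` with `CLOCK_PROCESS_CPUTIME_ID` on POSIX systems that provide it. The name `cpuTimeSeconds` is hypothetical and other OSes are not covered.

```cpp
#if defined(_WIN32)
#include <windows.h>
#else
#include <time.h>
#endif

// Returns CPU time used by this process in seconds, or -1.0 on failure.
double cpuTimeSeconds() {
#if defined(_WIN32)
    FILETIME createTime, exitTime, kernelTime, userTime;
    if (!GetProcessTimes(GetCurrentProcess(), &createTime, &exitTime,
                         &kernelTime, &userTime)) {
        return -1.0;
    }
    ULARGE_INTEGER k, u;
    k.LowPart = kernelTime.dwLowDateTime;
    k.HighPart = kernelTime.dwHighDateTime;
    u.LowPart = userTime.dwLowDateTime;
    u.HighPart = userTime.dwHighDateTime;
    return static_cast<double>(k.QuadPart + u.QuadPart) / 1.0e7;  // 100 ns ticks
#else
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0) return -1.0;
    return static_cast<double>(ts.tv_sec) +
           static_cast<double>(ts.tv_nsec) / 1.0e9;
#endif
}
```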