2012

  • October 7, 2012

    All C/C++ compilers predefine compiler name and version macros so that #if/#endif blocks can wrap compiler-specific code, such as inline assembly, compiler-specific intrinsics, or special language features. This can be necessary in high-performance code that aims to use the best performance tricks available for each compiler. This article surveys common compilers and shows how to use predefined macros to detect the compiler name and version at compile time.

  • September 14, 2012

    API functions to get the size of physical memory (RAM) differ between Windows, Linux, OSX, AIX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the physical memory size, and explains what works on what OS.

  • July 12, 2012

    The "Resident set size" ("Working set size" on Windows) is the amount of physical memory (RAM) used by a process's code and data. Monitoring size changes is an important way to find memory leaks and improve performance, but methods to get this data differ between Windows, Linux, OSX, BSD, Solaris, and others. This article provides cross-platform functions to get the peak (maximum) and current resident set size of a process, and explains what works on what OS.

  • June 17, 2012

    2D images, 3D volumes, and other multi-dimensional data frequently require loops that sweep through an array to compute statistics, normalize values, or apply transfer functions. Maintaining a multi-dimensional array within a single linear array is a common performance technique. Popular "hand optimizations" fiddle with array indexing and pointer math to improve performance, but how well do they work? This article benchmarks nine common multi-dimensional array loop and indexing methods across four common compilers to find the fastest way to loop through multi-dimensional arrays.

  • May 8, 2012

    Copying data is a common operation, so its performance matters. Simple array copy code uses a loop to copy one value at a time. ISO C provides the memcpy() and memmove() functions to do this efficiently, but are they faster, by how much, and under what conditions? This article benchmarks five common methods across four common compilers to find the fastest way to copy an array.

  • April 7, 2012

    API functions to get the real time (wall-clock time) at sub-second resolution differ between Windows, Linux, OSX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the real time, and explains what works on what OS.

  • March 7, 2012

    API functions to get the CPU time used by a process differ between Windows, Linux, OSX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the process CPU time, and explains what works on what OS.

  • February 9, 2012

    All C/C++ compilers predefine processor macros so that #if/#endif blocks can wrap processor-specific code, such as inline assembly for SSE instructions on x86 processors. But there are no standards for processor macros. The same compiler may have different macros on different operating systems, and different compilers for the same processor may have different macros. This article surveys common compilers and shows how to use predefined macros to detect common desktop and server processors at compile time.
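
As a rough illustration of the processor detection described in the last entry above, the sketch below checks a few macros that GCC, Clang, and Visual C++ are known to predefine. It is a minimal sketch, not the article's own code: the function name `target_processor` and the returned strings are hypothetical, and the macro list is far from complete.

```c
/* Returns a short label for the processor family the code is being
 * compiled for. The function name and labels are illustrative only.
 * Each branch tests a GCC/Clang-style macro and, where one exists,
 * the Visual C++ equivalent. */
static const char *target_processor(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86 (32-bit)";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#elif defined(__arm__) || defined(_M_ARM)
    return "ARM (32-bit)";
#elif defined(__powerpc64__) || defined(__ppc64__)
    return "PowerPC (64-bit)";
#elif defined(__powerpc__) || defined(__ppc__)
    return "PowerPC (32-bit)";
#elif defined(__sparc__) || defined(__sparc)
    return "SPARC";
#else
    return "unknown";   /* no recognized macro was defined */
#endif
}
```

Calling `target_processor()` from any function, for example `printf("%s\n", target_processor());` with `<stdio.h>` included, reports the label chosen at compile time. The 64-bit checks come before the 32-bit ones so that the most specific match wins on compilers that define both.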

Nadeau software consulting