C/C++ tip: How to measure CPU time for benchmarking

API functions to get the CPU time used by a process differ between Windows, Linux, OSX, BSD, Solaris, and other UNIX-style OSes. This article provides a cross-platform function to get the process CPU time, and explains what works on what OS.

How to get the process CPU time

A process's CPU time accumulates as the process runs and consumes CPU cycles. During I/O operations, thread locks, and other operations that cause the process to pause, CPU time accumulation also pauses until the process can again make headway.
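A short sketch can make this distinction concrete. The hypothetical helper below (its name is an illustration, not part of this article's download) sleeps with nanosleep( ) and uses the standard clock( ) function to measure how much CPU time accumulated while the process was paused. It assumes a POSIX system; on Windows, clock( ) reports wall-clock time instead, so the result would differ.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* Request nanosleep( ) in strict modes. */
#endif
#include <time.h>

/*
 * Hypothetical helper: sleeps for the given number of milliseconds and
 * returns the CPU time, in seconds, consumed during the sleep, or -1.0
 * if a call failed or the sleep was interrupted.
 */
double cpuTimeWhileSleeping( long milliseconds )
{
	struct timespec pause;
	clock_t before, after;

	pause.tv_sec  = (time_t)(milliseconds / 1000);
	pause.tv_nsec = (milliseconds % 1000) * 1000000L;

	before = clock( );
	if ( before == (clock_t)-1 )
		return -1.0;
	if ( nanosleep( &pause, NULL ) != 0 )
		return -1.0;
	after = clock( );
	if ( after == (clock_t)-1 )
		return -1.0;

	return (double)(after - before) / (double)CLOCKS_PER_SEC;
}
```

On a typical system, cpuTimeWhileSleeping( 200 ) should return a value far below 0.2, because the process consumes almost no CPU cycles while it sleeps.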

Tools like POSIX ps, OSX Activity Monitor, and Windows Task Manager display the CPU time used by processes, but it's often useful to track it from within the process itself. This is particularly necessary when benchmarking algorithms or small parts of a more complex program. While all OSes provide APIs to get a process's CPU time, each OS has its quirks.

Code

The following getCPUTime( ) function works for most OSes (copy and paste, or download getCPUTime.c). On OSes that need it, link with librt to get POSIX timers (e.g. AIX, BSD, Cygwin, HP-UX, Linux, and Solaris, but not OSX). Otherwise, the default libraries are sufficient.

See the sections that follow for discussion, caveats, and why this code requires so many #ifdef's.

/*
 * Author:  David Robert Nadeau
 * Site:    http://NadeauSoftware.com/
 * License: Creative Commons Attribution 3.0 Unported License
 *          http://creativecommons.org/licenses/by/3.0/deed.en_US
 */
#if defined(_WIN32)
#include <Windows.h>

#elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
#include <unistd.h>
#include <sys/resource.h>
#include <sys/times.h>
#include <time.h>

#else
#error "Unable to define getCPUTime( ) for an unknown OS."
#endif





/**
 * Returns the amount of CPU time used by the current process,
 * in seconds, or -1.0 if an error occurred.
 */
double getCPUTime( )
{
#if defined(_WIN32)
	/* Windows -------------------------------------------------- */
	FILETIME createTime;
	FILETIME exitTime;
	FILETIME kernelTime;
	FILETIME userTime;
	/* Both calls return a nonzero BOOL on success and 0 on failure. */
	if ( GetProcessTimes( GetCurrentProcess( ),
		&createTime, &exitTime, &kernelTime, &userTime ) != 0 )
	{
		SYSTEMTIME userSystemTime;
		if ( FileTimeToSystemTime( &userTime, &userSystemTime ) != 0 )
			return (double)userSystemTime.wHour * 3600.0 +
				(double)userSystemTime.wMinute * 60.0 +
				(double)userSystemTime.wSecond +
				(double)userSystemTime.wMilliseconds / 1000.0;
	}

#elif defined(__unix__) || defined(__unix) || defined(unix) || (defined(__APPLE__) && defined(__MACH__))
	/* AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris --------- */

#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
	/* Prefer high-res POSIX timers, when available. */
	{
		clockid_t id;
		struct timespec ts;
#if _POSIX_CPUTIME > 0
		/* Clock ids vary by OS.  Query the id, if possible.  The */
		/* call returns 0 on success and an error number on failure. */
		if ( clock_getcpuclockid( 0, &id ) != 0 )
#endif
#if defined(CLOCK_PROCESS_CPUTIME_ID)
			/* Use known clock id for AIX, Linux, or Solaris. */
			id = CLOCK_PROCESS_CPUTIME_ID;
#elif defined(CLOCK_VIRTUAL)
			/* Use known clock id for BSD or HP-UX. */
			id = CLOCK_VIRTUAL;
#else
			id = (clockid_t)-1;
#endif
		if ( id != (clockid_t)-1 && clock_gettime( id, &ts ) != -1 )
			return (double)ts.tv_sec +
				(double)ts.tv_nsec / 1000000000.0;
	}
#endif

#if defined(RUSAGE_SELF)
	{
		struct rusage rusage;
		if ( getrusage( RUSAGE_SELF, &rusage ) != -1 )
			return (double)rusage.ru_utime.tv_sec +
				(double)rusage.ru_utime.tv_usec / 1000000.0;
	}
#endif

#if defined(_SC_CLK_TCK)
	{
		const double ticks = (double)sysconf( _SC_CLK_TCK );
		struct tms tms;
		if ( times( &tms ) != (clock_t)-1 )
			return (double)tms.tms_utime / ticks;
	}
#endif

#if defined(CLOCKS_PER_SEC)
	{
		clock_t cl = clock( );
		if ( cl != (clock_t)-1 )
			return (double)cl / (double)CLOCKS_PER_SEC;
	}
#endif

#endif

	return -1;		/* Failed. */
}

Usage

To benchmark an algorithm's CPU time, call getCPUTime( ) at the beginning and end, then report the difference. Only the difference between two calls is meaningful; do not assume that a single return value means anything by itself.

double startTime, endTime;

startTime = getCPUTime( );
...
endTime = getCPUTime( );

fprintf( stderr, "CPU time used = %f\n", (endTime - startTime) );
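When the code being measured runs for less time than the timer can resolve, one common workaround is to repeat it and divide the difference by the repetition count. The sketch below illustrates that idea. The helper averageCPUTime( ) and the workload sampleWork( ) are hypothetical names, and the getCPUTime( ) shown here is only a clock( )-based stand-in so that the block compiles without the full function above.

```c
#include <time.h>

/* Stand-in for this article's getCPUTime( ), built on clock( ) only so
 * that this sketch compiles without the rest of the code. */
double getCPUTime( void )
{
	clock_t cl = clock( );
	if ( cl == (clock_t)-1 )
		return -1.0;
	return (double)cl / (double)CLOCKS_PER_SEC;
}

/* Hypothetical workload: sums integers so there is something to time. */
void sampleWork( void )
{
	volatile unsigned long sum = 0;
	unsigned long i;
	for ( i = 0; i < 1000000UL; ++i )
		sum += i;
}

/*
 * Hypothetical helper: runs work( ) the requested number of times and
 * returns the average CPU time per call, in seconds, or -1.0 on error.
 */
double averageCPUTime( void (*work)( void ), int repetitions )
{
	double startTime, endTime;
	int i;

	if ( work == NULL || repetitions <= 0 )
		return -1.0;

	startTime = getCPUTime( );
	if ( startTime < 0.0 )
		return -1.0;	/* Do not subtract an error value. */

	for ( i = 0; i < repetitions; ++i )
		work( );

	endTime = getCPUTime( );
	if ( endTime < 0.0 )
		return -1.0;

	return (endTime - startTime) / (double)repetitions;
}
```

A call such as averageCPUTime( sampleWork, 100 ) then reports the per-call cost. Checking both return values for the -1.0 error result avoids reporting a meaningless difference.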

Discussion

Each OS has one or more ways of providing the CPU time for a process. However, some have better resolution than others.

OS        clock   clock_gettime   GetProcessTimes   getrusage   times
AIX       yes     yes                               yes         yes
BSD       yes     yes                               yes         yes
HP-UX     yes     yes                               yes         yes
Linux     yes     yes                               yes         yes
OSX       yes                                       yes         yes
Solaris   yes     yes                               yes         yes
Windows                           yes

Each of these is discussed below.

GetProcessTimes( )

On Windows and Cygwin (Linux compatibility tools under Windows), GetProcessTimes( ) fills a FILETIME struct with the CPU time used by a process, and the FileTimeToSystemTime( ) function converts the FILETIME struct to a SYSTEMTIME struct containing usable time values.

typedef struct _SYSTEMTIME
{
  WORD wYear;
  WORD wMonth;
  WORD wDayOfWeek;
  WORD wDay;
  WORD wHour;
  WORD wMinute;
  WORD wSecond;
  WORD wMilliseconds;
} SYSTEMTIME, *PSYSTEMTIME;

Availability of GetProcessTimes( ): Cygwin and Windows XP and later.

Get CPU time:

#include <Windows.h>
...

	FILETIME createTime;
	FILETIME exitTime;
	FILETIME kernelTime;
	FILETIME userTime;
	/* Both calls return a nonzero BOOL on success and 0 on failure. */
	if ( GetProcessTimes( GetCurrentProcess( ),
		&createTime, &exitTime, &kernelTime, &userTime ) != 0 )
	{
		SYSTEMTIME userSystemTime;
		if ( FileTimeToSystemTime( &userTime, &userSystemTime ) != 0 )
			return (double)userSystemTime.wHour * 3600.0 +
				(double)userSystemTime.wMinute * 60.0 +
				(double)userSystemTime.wSecond +
				(double)userSystemTime.wMilliseconds / 1000.0;
	}
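The SYSTEMTIME conversion above sums only the hour, minute, second, and millisecond fields, so the result wraps once a process has used 24 hours of CPU time. An alternative, sketched below, is to read the FILETIME directly: it holds a 64-bit count of 100-nanosecond intervals split across its dwHighDateTime and dwLowDateTime members. The function names are hypothetical, and the Windows-specific part is guarded so the arithmetic can be compiled and checked on any OS.

```c
#include <stdint.h>

#if defined(_WIN32)
#include <Windows.h>
#endif

/*
 * Converts the two 32-bit halves of a FILETIME (a count of
 * 100-nanosecond intervals) into seconds.
 */
double fileTimePartsToSeconds( uint32_t high, uint32_t low )
{
	const uint64_t ticks = ((uint64_t)high << 32) | (uint64_t)low;
	return (double)ticks / 10000000.0;
}

#if defined(_WIN32)
/*
 * Hypothetical variant of the code above: returns the user-mode CPU
 * time of the current process in seconds, or -1.0 on failure.
 */
double getUserCPUTimeWindows( void )
{
	FILETIME createTime, exitTime, kernelTime, userTime;

	if ( GetProcessTimes( GetCurrentProcess( ),
		&createTime, &exitTime, &kernelTime, &userTime ) == 0 )
		return -1.0;

	return fileTimePartsToSeconds( userTime.dwHighDateTime,
		userTime.dwLowDateTime );
}
#endif
```

Passing kernelTime instead of userTime to the same conversion gives the time spent in kernel mode on behalf of the process.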

clock_gettime( )

On most POSIX-compliant OSes, clock_gettime( ) (see manual pages for AIX, BSD, HP-UX, Linux, and Solaris) provides the most accurate measure of CPU time used by a process. The function's first argument selects a "clock id" and the second is a timespec struct filled in with the CPU time used in seconds and nanoseconds. For most OSes, the application must link with librt.

However, there are a few caveats that make this function difficult for cross-platform code:

  • The function is an optional part of the POSIX specification and is only available if _POSIX_TIMERS is defined in <unistd.h> with a value greater than 0. Currently, AIX, BSD, HP-UX, Linux, and Solaris support the function, but OSX does not.
  • The timespec structure filled by clock_gettime( ) can store a time in nanoseconds, but the clock resolution varies with the OS and system. The clock_getres( ) function returns the clock resolution if needed. This function is, again, an optional part of the POSIX specification that's only available if _POSIX_TIMERS is greater than zero. Currently, AIX, BSD, HP-UX, Linux, and Solaris all provide the function, but it always fails on Solaris.
  • The POSIX specification defines the names of several standard "clock id" values, including CLOCK_PROCESS_CPUTIME_ID to get the process CPU time. However, currently BSD and HP-UX do not define this id and instead define their own non-POSIX id CLOCK_VIRTUAL to get the process's CPU time. Confusing things further, Solaris defines both of these, but uses CLOCK_VIRTUAL to report thread CPU time, not process CPU time.
    OS        Clock to use
    AIX       CLOCK_PROCESS_CPUTIME_ID
    BSD       CLOCK_VIRTUAL
    HP-UX     CLOCK_VIRTUAL
    Linux     CLOCK_PROCESS_CPUTIME_ID
    Solaris   CLOCK_PROCESS_CPUTIME_ID
  • Instead of using one of the above defined constants, clock_getcpuclockid( ) returns a timer for a selected process. Using process 0 gets the CPU time clock for the current process. However, this is another optional part of the POSIX specification and is only available if _POSIX_CPUTIME is defined with a value greater than 0. Currently, only AIX and Linux provide the function, but Linux include files don't define _POSIX_CPUTIME and the function returns unreliable non-POSIX-compliant results.
  • The clock_gettime( ) function may be implemented using a processor timing register. On multi-processor systems, individual processors may have slightly different notions of time that can make the function return bogus values when a process is scheduled back and forth among the processors. On Linux, and only Linux, this condition can be detected if a call to clock_getcpuclockid( ) returns a non-POSIX error and sets errno to ENOENT. However, as noted above, clock_getcpuclockid( ) is unreliable on Linux.

The practical result of all these caveats is that using clock_gettime( ) requires a lot of #ifdef checks, and the ability to fall through to one of the other CPU time functions when it fails.
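The resolution caveat above can be checked at run time. The following sketch queries clock_getres( ) for the process CPU-time clock, using the same preprocessor tests as the rest of this article to choose the clock id. The function name is hypothetical, and the sketch returns -1.0 wherever the clock is unavailable or the call fails (as reported above for Solaris).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* Request POSIX clocks in strict modes. */
#endif
#include <unistd.h>
#include <time.h>

/*
 * Hypothetical helper: returns the resolution, in seconds, of the
 * process CPU-time clock, or -1.0 if it cannot be determined.
 */
double getCPUClockResolution( void )
{
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) && \
	(defined(CLOCK_PROCESS_CPUTIME_ID) || defined(CLOCK_VIRTUAL))
#if defined(CLOCK_PROCESS_CPUTIME_ID)
	/* Known clock id for AIX, Linux, or Solaris. */
	const clockid_t id = CLOCK_PROCESS_CPUTIME_ID;
#else
	/* Known clock id for BSD or HP-UX. */
	const clockid_t id = CLOCK_VIRTUAL;
#endif
	struct timespec res;

	if ( clock_getres( id, &res ) == 0 )
		return (double)res.tv_sec +
			(double)res.tv_nsec / 1000000000.0;
#endif
	return -1.0;	/* Unavailable or failed. */
}
```

A benchmark that runs for less time than this resolution cannot be measured reliably with a single pair of clock_gettime( ) calls.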

Availability of clock_gettime( ): AIX, BSD, Cygwin, HP-UX, Linux, and Solaris. But clock id's are non-standard on BSD and HP-UX.

Availability of clock_getres( ): AIX, BSD, Cygwin, HP-UX, and Linux, but it fails on Solaris.

Availability of clock_getcpuclockid( ): AIX and Cygwin, but it's unreliable on Linux.

Get CPU time:

#include <unistd.h>
#include <time.h>
...

#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0)
	clockid_t id;
	struct timespec ts;
#if _POSIX_CPUTIME > 0
	/* Clock ids vary by OS.  Query the id, if possible.  The */
	/* call returns 0 on success and an error number on failure. */
	if ( clock_getcpuclockid( 0, &id ) != 0 )
#endif

#if defined(CLOCK_PROCESS_CPUTIME_ID)
		/* Use known clock id for AIX, Linux, or Solaris. */
		id = CLOCK_PROCESS_CPUTIME_ID;
#elif defined(CLOCK_VIRTUAL)
		/* Use known clock id for BSD or HP-UX. */
		id = CLOCK_VIRTUAL;
#else
		id = (clockid_t)-1;
#endif
	if ( id != (clockid_t)-1 && clock_gettime( id, &ts ) != -1 )
		return (double)ts.tv_sec +
			(double)ts.tv_nsec / 1000000000.0;
#endif

getrusage( )

On all UNIX-style OSes, getrusage( ) is the most reliable way to get the CPU time used by the current process. The function fills an rusage struct with the time in seconds and microseconds. The ru_utime struct field contains the time spent in user mode, while the ru_stime field contains the time spent in system mode on behalf of the process.

Beware: Before 64-bit support became widespread, some OSes defined getrusage( ) to return 32-bit quantities and added getrusage64( ) to return 64-bit quantities. Today, getrusage( ) returns 64-bit quantities and getrusage64( ) is deprecated.

Availability of getrusage( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.

Get CPU time:

#include <sys/time.h>
#include <sys/resource.h>
...

	struct rusage rusage;
	if ( getrusage( RUSAGE_SELF, &rusage ) != -1 )
		return (double)rusage.ru_utime.tv_sec +
			(double)rusage.ru_utime.tv_usec / 1000000.0;
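The snippet above reports user-mode time only. When the time spent in system mode matters too, both fields can be read from the same getrusage( ) call. The sketch below shows one way to do that; the function and helper names are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Converts a timeval (seconds plus microseconds) to seconds. */
static double timevalToSeconds( struct timeval tv )
{
	return (double)tv.tv_sec + (double)tv.tv_usec / 1000000.0;
}

/*
 * Hypothetical helper: stores the user-mode and system-mode CPU time
 * of the current process, in seconds.  Returns 0 on success or -1 on
 * failure.
 */
int getUserAndSystemTime( double *userSeconds, double *systemSeconds )
{
	struct rusage usage;

	if ( userSeconds == NULL || systemSeconds == NULL )
		return -1;
	if ( getrusage( RUSAGE_SELF, &usage ) != 0 )
		return -1;

	*userSeconds   = timevalToSeconds( usage.ru_utime );
	*systemSeconds = timevalToSeconds( usage.ru_stime );
	return 0;
}
```

Adding the two values gives the total CPU time charged to the process, which is closer to what clock( ) reports.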

times( )

On all UNIX-style OSes, the obsolete times( ) function fills a tms struct with the CPU time used in clock ticks, and the sysconf( ) function returns the number of clock ticks in a second. The tms_utime struct field includes the time spent in user mode, while the tms_stime field includes time spent in system mode on behalf of the process.

Beware: The older sysconf( ) argument CLK_TCK is deprecated and may no longer be available on some OSes. When it is available, sysconf( ) usually fails with it. Use _SC_CLK_TCK instead.

Availability of times( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.

Get CPU time:

#include <unistd.h>
#include <sys/times.h>
...

	const double ticks = (double)sysconf( _SC_CLK_TCK );
	struct tms tms;
	if ( times( &tms ) != (clock_t)-1 )
		return (double)tms.tms_utime / ticks;
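The snippet above divides by the sysconf( ) result without checking it, but sysconf( ) returns -1 when it fails. A slightly more defensive sketch follows. It also adds tms_stime so the result covers both user and system mode, which is a choice made for this example rather than something the code above does. The function name is hypothetical.

```c
#include <unistd.h>
#include <sys/times.h>

/*
 * Hypothetical helper: returns the user plus system CPU time of the
 * current process, in seconds, or -1.0 on failure.
 */
double getTimesCPUTime( void )
{
	struct tms tms;
	const long ticksPerSecond = sysconf( _SC_CLK_TCK );

	if ( ticksPerSecond <= 0 )
		return -1.0;	/* sysconf( ) failed. */
	if ( times( &tms ) == (clock_t)-1 )
		return -1.0;

	return (double)(tms.tms_utime + tms.tms_stime) /
		(double)ticksPerSecond;
}
```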

clock( )

On all UNIX-style OSes, the very old clock( ) function returns the process's CPU time used in clock ticks, and the CLOCKS_PER_SEC macro defines the number of clock ticks in a second.

Note: The returned CPU time includes time spent in user mode AND system mode on behalf of the process.

Beware: CLOCKS_PER_SEC was originally intended to report a value that varies with processor speed, and the ISO C89 and C99 standards leave its value up to the implementation. The Single UNIX Specification and POSIX (for XSI-conformant systems), however, require that CLOCKS_PER_SEC have the fixed value of 1,000,000, which limits the function to microsecond resolution at best. Most OSes conform to this requirement, but FreeBSD, Cygwin, and older OSX versions use a different value.

Beware: On AIX and Solaris, the clock( ) function includes the CPU time used by the current process AND any terminated child processes for which the parent process has executed wait( ), system( ), or pclose( ) functions.

Beware: On Windows, clock( ) is supported but it returns wall-clock time, not CPU time.

Availability of clock( ): AIX, BSD, Cygwin, HP-UX, Linux, OSX, and Solaris.

Get CPU time:

#include <time.h>
...

	clock_t cl = clock( );
	if ( cl != (clock_t)-1 )
		return (double)cl / (double)CLOCKS_PER_SEC;

Other approaches

There are other OS-specific ways to get CPU time. On Linux, Solaris, and some BSD OSes, you can parse /proc/[pid]/stat to get process statistics. On OSX, the private API function proc_pidtaskinfo( ) in libproc returns process data. There are also open source libraries available, such as libproc, procps, and Sigar.
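As an illustration of the /proc approach on Linux, the sketch below reads /proc/self/stat for the current process. It relies on the layout documented in the Linux proc(5) manual page, where fields 14 (utime) and 15 (stime) hold user and system time in clock ticks. The function name is hypothetical, and the code simply returns -1.0 on systems without that file.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (Linux only): returns the user plus system CPU
 * time of the current process, in seconds, parsed from
 * /proc/self/stat, or -1.0 on failure.
 */
double getProcStatCPUTime( void )
{
	char buf[1024];
	char *afterName;
	unsigned long utime, stime;
	size_t length;
	long ticksPerSecond;
	FILE *fp = fopen( "/proc/self/stat", "r" );

	if ( fp == NULL )
		return -1.0;
	length = fread( buf, 1, sizeof( buf ) - 1, fp );
	fclose( fp );
	buf[length] = '\0';

	/* Field 2 is the command name in parentheses and may itself
	 * contain spaces, so resume parsing after the last ')'. */
	afterName = strrchr( buf, ')' );
	if ( afterName == NULL )
		return -1.0;

	/* Skip fields 3 through 13, then read utime and stime. */
	if ( sscanf( afterName + 1,
		" %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %lu %lu",
		&utime, &stime ) != 2 )
		return -1.0;

	ticksPerSecond = sysconf( _SC_CLK_TCK );
	if ( ticksPerSecond <= 0 )
		return -1.0;

	return (double)(utime + stime) / (double)ticksPerSecond;
}
```

Replacing "self" with a process id in the path reads the statistics of another process, which the functions discussed earlier cannot do for arbitrary processes.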

On UNIX, there are several command-line utilities to display process CPU time, including ps, top, mpstat, and others. You can also use the time utility to show the time spent by running a command.

On Windows, you can use the Task Manager to monitor CPU use.

On OSX, you can use the Activity Monitor to monitor CPU use. The Instruments profiling utility that comes with Xcode can monitor CPU use, and a lot more.
