Java tip: How to get CPU, system, and user time for benchmarking

Topics: Java
Technologies: Java 5+

Performance optimization requires that you measure the time to perform a task, then try algorithm and coding changes to make the task faster. Prior to Java 5, the only way to time a task was to measure wall clock time. Unfortunately, this gives inaccurate results when there is other activity on the system (and there always is). Java 5 introduced the java.lang.management package and methods to report CPU and user time per thread. These times are not affected by other system activity, making them just what we need for benchmarking. This article shows how to use the java.lang.management package to benchmark your application.

Timing a task using wall clock time

"Wall clock time" is the real-world elapsed time experienced by a user waiting for a task to complete. Prior to Java 1.5, measuring wall clock time was the conventional (and only) way to benchmark a Java task.

To measure wall clock time, call java.lang.System.currentTimeMillis() before and after the task and take the difference. The method returns the time in milliseconds (thousandths of a second).

long startTimeMs = System.currentTimeMillis( );
... do task ...
long taskTimeMs  = System.currentTimeMillis( ) - startTimeMs;

Java 1.5 introduced java.lang.System.nanoTime() to measure elapsed wall clock time in nanoseconds (billionths of a second), but see the Appendix on Times and (lack of) nanosecond accuracy. Again, get the time before and after a task and take the difference.

long startTimeNano = System.nanoTime( );
... do task ...
long taskTimeNano  = System.nanoTime( ) - startTimeNano;

Wall clock time is strongly affected by other activity on the system, such as background processes, other applications, disk or network activity, and updates to the display. On some systems, such as Windows, an application that is not in the foreground runs at a lower priority and takes longer. All of this can skew your benchmark results unless you are very careful to use an unloaded system and average across a large number of tests.

Using wall clock time isn't necessarily bad if you're interested in what the user will experience. But it makes it hard to get consistent benchmark numbers that reveal your own application's problems.
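
Averaging across many runs, as suggested above, can be sketched as follows. The WallClockBench class and its time() method are hypothetical names used only for illustration. Reporting the minimum alongside the average is an extra that this sketch adds, on the assumption that the fastest run is the one least disturbed by other system activity.

```java
/** Hypothetical helper (not a JDK class): repeats a task and summarizes wall clock times. */
final class WallClockBench {
    /** Returns { minimum, average } elapsed nanoseconds over the given number of runs. */
    static long[] time( Runnable task, int runs ) {
        if ( runs <= 0 )
            throw new IllegalArgumentException( "runs must be positive" );
        long min   = Long.MAX_VALUE;
        long total = 0L;
        for ( int i = 0; i < runs; i++ ) {
            long start = System.nanoTime( );
            task.run( );
            long elapsed = System.nanoTime( ) - start;
            if ( elapsed < min )
                min = elapsed;      // Fastest run seen so far
            total += elapsed;
        }
        return new long[] { min, total / runs };
    }
}
```

A large gap between the minimum and the average is a hint that other activity on the system disturbed some of the runs.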

Timing a single-threaded task using CPU, system, and user time

To exclude the effects of other system activity, you need to measure application "System time" and "User time" instead.

  • "User time" is the time spent running your application's own code.
  • "System time" is the time spent running OS code on behalf of your application (such as for I/O).

We often refer to "CPU time" as well:

  • "CPU time" is user time plus system time. It's the total time spent using a CPU for your application.

Java 1.5 introduced the java.lang.management package to monitor the JVM. The entry point for the package is the ManagementFactory class. Its static methods return a variety of different "MXBean" objects that report JVM information. One such bean can report thread CPU and user time.

Call ManagementFactory.getThreadMXBean() to get a ThreadMXBean that describes current JVM threads. The bean's getCurrentThreadCpuTime() method returns the CPU time for the current thread. The getCurrentThreadUserTime() method returns the thread's user time. Both of these report times in nanoseconds (but see Appendix on Times and (lack of) nanosecond accuracy).

Be sure to call isCurrentThreadCpuTimeSupported() first, though. If it returns false (rare), the JVM implementation or OS does not support getting CPU or user times. In that case, you're back to using wall clock time.

import java.lang.management.*;
 
/** Get CPU time in nanoseconds. */
public long getCpuTime( ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    return bean.isCurrentThreadCpuTimeSupported( ) ?
        bean.getCurrentThreadCpuTime( ) : 0L;
}
 
/** Get user time in nanoseconds. */
public long getUserTime( ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    return bean.isCurrentThreadCpuTimeSupported( ) ?
        bean.getCurrentThreadUserTime( ) : 0L;
}

/** Get system time in nanoseconds. */
public long getSystemTime( ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    return bean.isCurrentThreadCpuTimeSupported( ) ?
        (bean.getCurrentThreadCpuTime( ) - bean.getCurrentThreadUserTime( )) : 0L;
}

These methods return the CPU, user, and system time since the thread started. To time a task after the thread has started, call one or more of these before and after the task and take the difference:

long startSystemTimeNano = getSystemTime( );
long startUserTimeNano   = getUserTime( );
... do task ...
long taskUserTimeNano    = getUserTime( ) - startUserTimeNano;
long taskSystemTimeNano  = getSystemTime( ) - startSystemTimeNano;
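
The helper methods above return 0 when CPU time is unsupported, which makes every task look free. A minimal sketch of the fallback to wall clock time mentioned earlier might look like this. The TaskTimer class and its method names are hypothetical. The extra isThreadCpuTimeEnabled() check is an assumption that CPU timing could also have been switched off in the JVM, in which case getCurrentThreadCpuTime() would return -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: times a task on the current thread. */
final class TaskTimer {
    private final ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    private final boolean useCpuTime;

    TaskTimer( ) {
        // Decide once so the start and end readings come from the same clock.
        useCpuTime = bean.isCurrentThreadCpuTimeSupported( )
            && bean.isThreadCpuTimeEnabled( );
    }

    /** True if times are CPU times, false if they are wall clock times. */
    boolean usesCpuTime( ) {
        return useCpuTime;
    }

    /** Current reading in nanoseconds from whichever clock is in use. */
    long now( ) {
        return useCpuTime ? bean.getCurrentThreadCpuTime( ) : System.nanoTime( );
    }

    /** Runs the task on the calling thread and returns its time in nanoseconds. */
    long time( Runnable task ) {
        long start = now( );
        task.run( );
        return now( ) - start;
    }
}
```

Check usesCpuTime() when reporting results, since CPU times and wall clock times are not comparable.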

Timing a multithreaded task using CPU, system, and user time

For multithreaded tasks, ThreadMXBean methods can give you the CPU and user time for any running thread. But call isThreadCpuTimeSupported() first to be sure the JVM and OS support it.

The ThreadMXBean methods refer to threads by their long integer ID. To get this ID for a thread, call the getId() method on its java.lang.Thread object.

long id = java.lang.Thread.currentThread( ).getId( );

Then get and sum the times for all the threads you're interested in. The ThreadMXBean methods return -1 if the thread is no longer alive or if CPU time measurement is disabled in the JVM.

import java.lang.management.*;
 
/** Get CPU time in nanoseconds. */
public long getCpuTime( long[] ids ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    if ( ! bean.isThreadCpuTimeSupported( ) )
        return 0L;
    long time = 0L;
    for ( long id : ids ) {
        long t = bean.getThreadCpuTime( id );
        if ( t != -1 )
            time += t;
    }
    return time;
}
 
/** Get user time in nanoseconds. */
public long getUserTime( long[] ids ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    if ( ! bean.isThreadCpuTimeSupported( ) )
        return 0L;
    long time = 0L;
    for ( long id : ids ) {
        long t = bean.getThreadUserTime( id );
        if ( t != -1 )
            time += t;
    }
    return time;
}
  
/** Get system time in nanoseconds. */
public long getSystemTime( long[] ids ) {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
    if ( ! bean.isThreadCpuTimeSupported( ) )
        return 0L;
    long time = 0L;
    for ( long id : ids ) {
        long tc = bean.getThreadCpuTime(  id );
        long tu = bean.getThreadUserTime( id );
        if ( tc != -1 && tu != -1 )
            time += (tc - tu);
    }
    return time;
}
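
Because the bean returns -1 for a thread that has already died, the times have to be read while the worker threads are still alive. One way to arrange that is sketched below, assuming the workers can be held on a latch after they finish. The WorkerCpuTimes class and its run() method are hypothetical names, not part of the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper: runs a task on several threads and sums their CPU time. */
final class WorkerCpuTimes {
    /** Returns the summed CPU time in nanoseconds, or 0 if it is unavailable. */
    static long run( final Runnable task, int count ) {
        final CountDownLatch workDone = new CountDownLatch( count );
        final CountDownLatch release  = new CountDownLatch( 1 );
        long[] ids = new long[count];
        for ( int i = 0; i < count; i++ ) {
            Thread worker = new Thread( new Runnable( ) {
                public void run( ) {
                    try {
                        task.run( );
                    } finally {
                        workDone.countDown( );
                    }
                    try {
                        release.await( );   // Stay alive until times are read
                    } catch ( InterruptedException e ) {
                        Thread.currentThread( ).interrupt( );
                    }
                }
            } );
            ids[i] = worker.getId( );
            worker.start( );
        }

        long total = 0L;
        try {
            workDone.await( );              // All workers finished their task
            ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
            if ( bean.isThreadCpuTimeSupported( )
                && bean.isThreadCpuTimeEnabled( ) ) {
                for ( long id : ids ) {
                    long t = bean.getThreadCpuTime( id );
                    if ( t != -1 )
                        total += t;
                }
            }
        } catch ( InterruptedException e ) {
            Thread.currentThread( ).interrupt( );
        } finally {
            release.countDown( );           // Let the workers exit
        }
        return total;
    }
}
```

Each worker is a fresh thread, so its CPU time since start is approximately the time spent in the task plus a little thread start-up overhead.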

Timing the entire application

I'm afraid that as of Java 1.6 there is no simple way to get CPU and user time for the entire application. There are a few ways to get partial answers, though.

Using a Sun internal class to get JVM CPU time

The ManagementFactory.getOperatingSystemMXBean() method returns an OperatingSystemMXBean that reports the OS name, version, and a few other items. Sun's internal implementation of this bean is a com.sun.management.OperatingSystemMXBean. This class has a getProcessCpuTime() method to get the overall CPU time of the JVM. There is no method to report the user time for the JVM.

import java.lang.management.*;
 
/** Get JVM CPU time in nanoseconds. */
public long getJVMCpuTime( ) {
    OperatingSystemMXBean bean =
        ManagementFactory.getOperatingSystemMXBean( );
    if ( ! (bean instanceof
        com.sun.management.OperatingSystemMXBean) )
        return 0L;
    return ((com.sun.management.OperatingSystemMXBean)bean)
        .getProcessCpuTime( );
}

Using Sun's undocumented internal OperatingSystemMXBean is certainly questionable practice for production code. But until there is a better method, it's useful for benchmarking during development.
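
As with the per-thread methods, the process CPU time is cumulative, so timing a task means taking a reading before and after. A minimal sketch follows, assuming a JVM that provides com.sun.management.OperatingSystemMXBean. The ProcessCpuTimer class and its method names are hypothetical, and this version returns -1 rather than 0 when the time is unavailable.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper: measures CPU time used by the whole JVM process. */
final class ProcessCpuTimer {
    /** Returns JVM process CPU time in nanoseconds, or -1 if unavailable. */
    static long getJvmCpuTime( ) {
        OperatingSystemMXBean bean =
            ManagementFactory.getOperatingSystemMXBean( );
        if ( !( bean instanceof com.sun.management.OperatingSystemMXBean ) )
            return -1L;
        return ( (com.sun.management.OperatingSystemMXBean) bean )
            .getProcessCpuTime( );
    }

    /** Runs the task and returns the process CPU time it used, or -1. */
    static long time( Runnable task ) {
        long start = getJvmCpuTime( );
        task.run( );
        long end = getJvmCpuTime( );
        return ( start < 0 || end < 0 ) ? -1L : end - start;
    }
}
```

The result covers every thread in the JVM, including garbage collection and other internal threads that ran while the task was executing.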

Polling threads listed by the ThreadMXBean to get total CPU, system, and user time

The getAllThreadIds() method on ThreadMXBean returns the IDs of all of the running threads. You can loop through them all and sum their CPU and user times. However, the list does not include threads that have already died. If your task spawned such threads to do significant work and you don't include the time they took, your benchmark results will be skewed.

As of Java 1.6, there is no way to set up code to be notified each time a thread starts or dies so that you can count their CPU and user time. Instead you'll need to poll ThreadMXBean and keep track of threads yourself. On each poll, get a list of the running threads and record their most recent CPU and user times. The class below does this by using a hash table with the thread ID as a hash key. Technically, thread IDs are not unique for all time, so it is possible that a new thread will re-use the thread ID of an old dead thread. In practice, though, thread IDs increment by one on each new thread and only wrap around after 1.8e19 threads. So, unless you use an extraordinary number of threads or run for a very long time, this class will work fine.

import java.lang.management.*;
import java.util.HashMap;
import java.util.Collection;
 
public final class ThreadTimes
    extends Thread
{
    private class Times {
        public long id;
        public long startCpuTime;
        public long startUserTime;
        public long endCpuTime;
        public long endUserTime;
    }
 
    private final long interval;
    private final long threadId;
    private final HashMap<Long,Times> history =
        new HashMap<Long,Times>( );
 
    /** Create a polling thread to track times. */
    public ThreadTimes( final long interval ) {
        super( "Thread time monitor" );
        this.interval = interval;
        threadId = getId( );
        setDaemon( true );
    }
 
    /** Run the thread until interrupted. */
    public void run( ) {
        while ( !isInterrupted( ) ) {
            update( );
            try { sleep( interval ); }
            catch ( InterruptedException e ) { break; }
        }
    }
 
    /** Update the hash table of thread times. */
    private void update( ) {
        final ThreadMXBean bean =
            ManagementFactory.getThreadMXBean( );
        final long[] ids = bean.getAllThreadIds( );
        for ( long id : ids ) {
            if ( id == threadId )
                continue;   // Exclude polling thread
            final long c = bean.getThreadCpuTime( id );
            final long u = bean.getThreadUserTime( id );
            if ( c == -1 || u == -1 )
                continue;   // Thread died
 
            Times times = history.get( id );
            if ( times == null ) {
                times = new Times( );
                times.id = id;
                times.startCpuTime  = c;
                times.startUserTime = u;
                times.endCpuTime    = c;
                times.endUserTime   = u;
                history.put( id, times );
            } else {
                times.endCpuTime  = c;
                times.endUserTime = u;
            }
        }
    }
 
    /** Get total CPU time so far in nanoseconds. */
    public long getTotalCpuTime( ) {
        final Collection<Times> hist = history.values( );
        long time = 0L;
        for ( Times times : hist )
            time += times.endCpuTime - times.startCpuTime;
        return time;
    }
 
    /** Get total user time so far in nanoseconds. */
    public long getTotalUserTime( ) {
        final Collection<Times> hist = history.values( );
        long time = 0L;
        for ( Times times : hist )
            time += times.endUserTime - times.startUserTime;
        return time;
    }
 
    /** Get total system time so far in nanoseconds. */
    public long getTotalSystemTime( ) {
        return getTotalCpuTime( ) - getTotalUserTime( );
    }
}

The constructor's only argument is a polling interval in milliseconds. Be careful when picking this value. The shorter the interval, the more rapid the polling and the more up to date the results. But rapid polling slows down the JVM, which slows down your application and changes the timing of interactions between threads and with the OS. This can skew your benchmark results. For my work, I find that polling every 100ms is plenty, but your needs may require faster or slower polling.

To use this class, create and start the thread before a task. Interrupt the thread after the task and get the total CPU and user times.

ThreadTimes tt = new ThreadTimes( 100 );  // 100ms interval
tt.start( );
... do task ...
tt.interrupt( );
long taskUserTimeNano   = tt.getTotalUserTime( );
long taskSystemTimeNano = tt.getTotalSystemTime( );

The above class excludes the time used by the polling thread itself, but it includes times for the JVM's internal threads. These include the "Reference Handler", "Finalizer", "Secondary finalizer", "Signal Dispatcher" and more for other system tasks. The names for these threads are not documented nor guaranteed to never change, and the threads come and go during an application run.

These system threads are there to support your application, so perhaps they should be included in benchmark results. But if you want to exclude them, you can watch for these threads by name by getting a list of ThreadInfo objects from ThreadMXBean. Each of these objects describes a thread, including its current state and its name. Modify the update loop in the class above to watch for specific thread names.

final long[] ids = bean.getAllThreadIds( );
final ThreadInfo[] infos = bean.getThreadInfo( ids );
for ( int i = 0; i < ids.length; i++ ) {
    long id = ids[i];
    if ( id == threadId )
        continue;   // Exclude polling thread
    if ( infos[i] == null )
        continue;   // Thread died since the ID list was read
    String name = infos[i].getThreadName( );
    if ( name.equals( "Reference Handler" ) )
        continue;
    if ( name.equals( "Signal Dispatcher" ) )
        continue;
    if ( name.equals( "Finalizer" ) )
        continue;
    ...
}
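
When the list of names to skip grows, a set lookup may be easier to maintain than a chain of equals() calls. The sketch below is one way to do it. The ThreadNameFilter class, its EXCLUDED set, and its method names are hypothetical. The thread names are the ones mentioned above and, as noted, are not guaranteed to stay the same across JVM releases.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: sums CPU time while skipping named JVM threads. */
final class ThreadNameFilter {
    /** Names of JVM threads to skip (an assumed, incomplete list). */
    static final Set<String> EXCLUDED = new HashSet<String>( Arrays.asList(
        "Reference Handler", "Finalizer", "Signal Dispatcher" ) );

    /** True for threads that have died or whose name is in EXCLUDED. */
    static boolean isExcluded( ThreadInfo info ) {
        return info == null || EXCLUDED.contains( info.getThreadName( ) );
    }

    /** Returns the summed CPU time in nanoseconds of all other live threads. */
    static long applicationCpuTime( ) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
        if ( !bean.isThreadCpuTimeSupported( )
            || !bean.isThreadCpuTimeEnabled( ) )
            return 0L;
        long[] ids = bean.getAllThreadIds( );
        ThreadInfo[] infos = bean.getThreadInfo( ids );
        long total = 0L;
        for ( int i = 0; i < ids.length; i++ ) {
            if ( isExcluded( infos[i] ) )
                continue;
            long t = bean.getThreadCpuTime( ids[i] );
            if ( t != -1 )
                total += t;
        }
        return total;
    }
}
```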

Alternatives

There are lots of external tools that you can use to get timing information. I'll just note these.

Using the UNIX time command to get total CPU and user time

For Linux, FreeBSD, Solaris, Mac OS X, and other UNIX-like operating systems, the time command reports the wall clock, user, and system times for an application. Times are for the entire application, from start to finish. It isn't possible to time just a portion of the application.

To use the command, use a terminal window and type "time" followed by the application and its arguments:

time java MyApplication args...

Using profilers

There are several profilers for Java. Any good profiler can show you the time spent within different methods. However, because profilers collect very detailed information, they can significantly slow down the application. This can change its behavior – particularly when disk or network resources are involved. Profiling is very useful, but the above CPU and user time approaches are better when benchmarking specific portions of an application, or when the impact of a profiler is a problem.

Appendix. Times and (lack of) nanosecond accuracy

The CPU and user time methods on ThreadMXBean report times measured in nanoseconds. That sounds really accurate, but the Java documentation notes:

"The returned value is of nanoseconds precision but not necessarily nanoseconds accuracy."

Nanosecond timing won't be exact due to hardware limitations and overhead in the OS and JVM. The accuracy available may not be documented and may vary between OS and JVM releases.

You can get a feel for timer accuracy on your system by using a loop to print the current CPU time of a thread, then note how many zeroes always seem to be at the end:

ThreadMXBean threadData = ManagementFactory.getThreadMXBean( );
for ( int i = 0; i < 100; i++ )
    System.err.println( "CPU time:  " +
        threadData.getCurrentThreadCpuTime( ) );

...
CPU time:  90000000
CPU time:  90000000

...
CPU time:  100000000
CPU time:  100000000
...

In the above output from a Linux host, the system's timer accuracy is clearly no better than 10,000,000 nanoseconds, or 10 milliseconds. The output below from a Mac has accuracy of a few thousand nanoseconds:

CPU time:  66666000
CPU time:  66764000
CPU time:  66872000
CPU time:  66965000
...

Time accuracies are typically no better than several thousand nanoseconds. More often they are a few milliseconds. Fortunately, for many benchmarking purposes, measuring times in milliseconds is good enough.
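
Instead of reading trailing zeroes by eye, you can record the smallest nonzero step between successive readings. The sketch below does that. TimerGranularity and estimate() are hypothetical names, and the result is only a rough upper bound on the timer's granularity, since each step also includes the cost of the call itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: estimates the step size of the thread CPU timer. */
final class TimerGranularity {
    /** Returns the smallest nonzero step seen, in nanoseconds, or -1 if none. */
    static long estimate( int samples ) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean( );
        if ( !bean.isCurrentThreadCpuTimeSupported( )
            || !bean.isThreadCpuTimeEnabled( ) )
            return -1L;
        long smallest = Long.MAX_VALUE;
        long previous = bean.getCurrentThreadCpuTime( );
        for ( int i = 0; i < samples; i++ ) {
            long current = bean.getCurrentThreadCpuTime( );
            long step = current - previous;
            if ( step > 0 && step < smallest )
                smallest = step;    // Smallest visible tick so far
            previous = current;
        }
        return smallest == Long.MAX_VALUE ? -1L : smallest;
    }
}
```

Use a sample count large enough for the timer to advance at least once. On a system with 10 millisecond accuracy, that means the loop has to consume more than 10 milliseconds of CPU time.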
