Java 8 GC Tutorials - Herong's Tutorial Examples - v1.03, by Dr. Herong Yang - 99th Percentile Performance
This section provides a GC test program, GCPerfP99, that uses 99th percentile performance measurements.
From previous tutorials, we learned that a long system interruption has a huge impact on latency and a small impact on throughput. This is because latency is defined based on the worst execution, while throughput is defined based on the average execution time.
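The difference between the two definitions can be illustrated with a small sketch. The class name, the method names, and the numbers (1000 runs of 10 ms each, one of them hit by a 500 ms interruption) are made up for illustration and are not taken from any test result in this tutorial.

```java
import java.util.Arrays;

public class InterruptionImpact {
    // Worst-case (maximum) run time in milliseconds: what latency is based on.
    static long worst(long[] runMillis) {
        long max = Long.MIN_VALUE;
        for (long t : runMillis) {
            max = Math.max(max, t);
        }
        return max;
    }

    // Average run time in milliseconds: what throughput is based on.
    static double average(long[] runMillis) {
        long sum = 0;
        for (long t : runMillis) {
            sum += t;
        }
        return (double) sum / runMillis.length;
    }

    public static void main(String[] args) {
        long[] runs = new long[1000];
        Arrays.fill(runs, 10);             // hypothetical: every run takes 10 ms
        System.out.println("Before: worst=" + worst(runs)
            + " ms, average=" + average(runs) + " ms");

        runs[500] = 510;                   // hypothetical: one run delayed by 500 ms
        System.out.println("After:  worst=" + worst(runs)
            + " ms, average=" + average(runs) + " ms");
    }
}
```

With these made-up numbers, the single interruption raises the worst run time from 10 ms to 510 ms, while the average only moves from 10 ms to 10.5 ms.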
One way to reduce the impact of system interruptions on latency is to define it as the 99th percentile (or P99) latency, which throws away the worst 1% of runs, then takes the worst latency among the remaining 99% of runs.
P99 latency is a better measurement because, if system interruptions affect less than 1% of the runs, all interrupted runs fall into the discarded 1% and the P99 latency is not affected by them at all.
I have created another GC test program, GCPerfP99, that uses the P99 latency measurement:
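As a minimal sketch of this idea, the helper below sorts the run times and returns the slowest run that survives after the worst 1% is dropped. It uses the same index rule as the GCPerfP99 program in this section, (runs*99)/100. The class name, method name, and sample values are hypothetical.

```java
import java.util.Arrays;

public class P99Sketch {
    // Returns the slowest run time left after dropping the worst 1% of runs.
    // Assumes at least 2 samples, so that at least one run is kept.
    static long p99(long[] runMillis) {
        long[] sorted = runMillis.clone();
        Arrays.sort(sorted);                     // sorted low to high
        int kept = (sorted.length * 99) / 100;   // number of runs kept
        return sorted[kept - 1];                 // slowest of the kept runs
    }

    public static void main(String[] args) {
        long[] runs = new long[200];
        Arrays.fill(runs, 10);                   // hypothetical: 10 ms per run

        runs[7] = 500;                           // 1 of 200 runs (0.5%) interrupted
        System.out.println("0.5% interrupted: P99 = " + p99(runs) + " ms");

        runs[8] = 500;                           // now 3 of 200 runs (1.5%) interrupted
        runs[9] = 500;
        System.out.println("1.5% interrupted: P99 = " + p99(runs) + " ms");
    }
}
```

In the first case the interrupted run is discarded and the P99 value stays at 10 ms. In the second case more than 1% of the runs are interrupted, so one of them survives the cut and the P99 value becomes 500 ms.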
/*
 * Copyright (c) All Rights Reserved.
 */
class GCPerfP99 {
   static MyList objList = null;
   static int objSize = 1024; // in KB, default = 1 MB
   static int baseSize = 32; // # of objects in the base
   static int chunkSize = 32; // # of objects per run chunk
   static int warmup = 64; // warmup loops: 64*32 = 2GB
   static int runs = 1000; // number of runs
   public static void main(String[] arg) {
      if (arg.length>0) objSize = Integer.parseInt(arg[0]);
      if (arg.length>1) baseSize = Integer.parseInt(arg[1]);
      if (arg.length>2) chunkSize = Integer.parseInt(arg[2]);
      if (arg.length>3) warmup = Integer.parseInt(arg[3]);
      if (arg.length>4) runs = Integer.parseInt(arg[4]);
      System.out.println("Parameters:");
      System.out.println(" Size="+objSize+"KB"
         +", Base="+baseSize
         +", Chunk="+chunkSize
         +", Warmup="+warmup+", Runs="+runs);
      objList = new MyList();
      myTest();
   }
   public static void myTest() {
      // Build the base set of objects that stays in the list
      for (int m=0; m<baseSize; m++) {
         objList.add(new MyObject());
      }

      // Warmup loops: add and remove chunks without timing them
      for (int k=0; k<warmup; k++) {
         for (int m=0; m<chunkSize; m++) {
            objList.add(new MyObject());
         }
         for (int m=0; m<chunkSize; m++) {
            objList.removeTail();
         }
      }

      // Timed runs: record a timestamp after each run
      long[] times = new long[runs+1];
      times[0] = System.currentTimeMillis();
      for (int i=0; i<runs; i++) {
         for (int m=0; m<chunkSize; m++) {
            objList.add(new MyObject());
         }
         for (int m=0; m<chunkSize; m++) {
            objList.removeTail();
         }
         times[i+1] = System.currentTimeMillis();
      }

      // Convert timestamps to per-run execution times
      long[] samples = new long[runs];
      for (int i=0; i<runs; i++) {
         samples[i] = times[i+1] - times[i]; // in millis
      }
      java.util.Arrays.sort(samples); // sorted low to high

      // Keep the best 99% of runs only
      int p99 = (runs*99)/100; // 99th percentile
      long duration = 0;
      for (int i=0; i<p99; i++) {
         duration += samples[i];
      }
      long avePerf = (1000*p99*chunkSize)/duration; // obj/second
      long maxPerf = (1000*chunkSize)/samples[0];
      long minPerf = (1000*chunkSize)/samples[p99-1];
      long latency = 1000000/minPerf; // millis/1000 obj

      System.out.println("Results:");
      System.out.println(" Total execution time = "
         +(duration/1000)+" seconds");
      System.out.println(" Total objects processed = "
         +(runs*chunkSize));
      System.out.println(" Average time per run = "
         +(duration/p99)+" milliseconds");
      System.out.println(" Throughput = "
         +avePerf+" objects/second");
      System.out.println(" Latency = "
         +latency+" milliseconds/1000 objects");
      System.out.println(" Throughput (max, ave, min) = ("
         +maxPerf+", "+avePerf+", "+minPerf+")");
      System.out.println(" Latency (min, ave, max) = ("
         +(1000000/maxPerf)+", "+(1000000/avePerf)+", "
         +(1000000/minPerf)+")");

      System.out.println("1% worst runs dropped:");
      for (int i=p99; i<runs; i++) {
         System.out.println(" Run, Time, Throughput = "
            +(i+1)+", "+samples[i]+", "+(1000*chunkSize)/samples[i]);
      }

      // Keep the JVM alive until the user presses ENTER
      System.out.println("Press ENTER to end...");
      try {
         System.in.read();
      } catch (Exception e) {
      }
   }
   static class MyObject {
      private long[] obj = null;
      public MyObject next = null;
      public MyObject prev = null;
      public MyObject() {
         obj = new long[objSize*128]; // 128*8=1024 bytes
         for (int i=0; i<objSize*128; i++) {
            obj[i] = i/2+i/3+i/4+i/5; // some work load
         }
      }
   }
   static class MyList {
      MyObject head = null; // newest object
      MyObject tail = null; // oldest object
      void add(MyObject o) {
         if (head==null) {
            head = o;
            tail = o;
         } else {
            o.prev = head;
            head.next = o;
            head = o;
         }
      }
      void removeTail() {
         if (tail!=null) {
            if (tail.next==null) {
               tail = null;
               head = null;
            } else {
               tail = tail.next;
               tail.prev = null;
            }
         }
      }
   }
}
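One caveat worth keeping in mind: the program divides by run times measured in whole milliseconds. With the default 1 MB objects each run is likely long enough, but if a small object size or chunk size is given, a run could finish in under 1 millisecond, the sample would be 0, and the integer division would throw an ArithmeticException. The sketch below shows one possible way around this, timing with System.nanoTime() and guarding the division. It is not part of the original program; the class name, the helper method, and the stand-in workload are assumptions for illustration.

```java
public class NanoTimingSketch {
    // Objects per second for one run, computed from a duration in nanoseconds.
    // Returns -1 when the duration is not positive, instead of dividing by zero.
    static long objectsPerSecond(int chunkSize, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return -1;
        }
        return (1_000_000_000L * chunkSize) / elapsedNanos;
    }

    public static void main(String[] args) {
        int chunkSize = 32;                      // same default as the test program
        long sink = 0;                           // keeps the work from being optimized away

        long start = System.nanoTime();
        for (int m = 0; m < chunkSize; m++) {
            long[] obj = new long[128];          // hypothetical stand-in for MyObject (1 KB)
            for (int i = 0; i < obj.length; i++) {
                obj[i] = i/2 + i/3 + i/4 + i/5;  // some work load
            }
            sink += obj[obj.length - 1];
        }
        long elapsed = System.nanoTime() - start;

        System.out.println("Elapsed = " + elapsed + " nanoseconds (sink=" + sink + ")");
        System.out.println("Throughput = "
            + objectsPerSecond(chunkSize, elapsed) + " objects/second");
    }
}
```

System.nanoTime() is intended for measuring elapsed time, so it fits short runs better than System.currentTimeMillis(), although its actual resolution depends on the platform.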
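Changes made on the test program center on the P99 measurement logic: the execution time of each run is recorded in an array, the samples are sorted from low to high, throughput and latency are calculated from the best 99% of runs only, and the 1% worst runs that were dropped are printed at the end for review.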