Java GC Tutorials - Herong's Tutorial Examples - v1.11, by Dr. Herong Yang
GCPerfP99.java - 99th Percentile Performance
This section provides a GC test program, GCPerfP99.java, that uses 99th percentile performance measurements.
From previous tutorials, we learned that a long system interruption has a huge impact on latency but only a small impact on throughput. This is because latency is defined by the worst (slowest) run, while throughput is defined by the average execution time over all runs.
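The difference can be illustrated with a small sketch. The class name InterruptionImpact and the numbers (1000 runs of 10 ms each, one 500 ms pause) are hypothetical values chosen for illustration, not measurements from the tutorial:

```java
import java.util.Arrays;

class InterruptionImpact {
    // Worst (largest) run time in milliseconds; drives the latency figure.
    static long worst(long[] samples) {
        return Arrays.stream(samples).max().getAsLong();
    }

    // Average run time in milliseconds; drives the throughput figure.
    static double average(long[] samples) {
        return Arrays.stream(samples).average().getAsDouble();
    }

    public static void main(String[] args) {
        long[] samples = new long[1000];
        Arrays.fill(samples, 10);      // assumption: every run takes 10 ms
        System.out.println("Without interruption: worst=" + worst(samples)
            + " ms, average=" + average(samples) + " ms");

        samples[500] += 500;           // assumption: one run is hit by a 500 ms pause
        System.out.println("With interruption:    worst=" + worst(samples)
            + " ms, average=" + average(samples) + " ms");
    }
}
```

With these assumed numbers, the single pause raises the worst run time from 10 ms to 510 ms, while the average only moves from 10 ms to 10.5 ms.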
A better way to define latency is the 99th percentile (or P99) latency, which discards the worst 1% of runs and then takes the worst latency among the remaining 99% of runs.
P99 latency is a better measurement because, if system interruptions affect less than 1% of the runs, all affected runs are discarded and the P99 latency is not distorted by them at all.
I have created another GC test program, GCPerfP99.java, that uses the P99 latency measurement:
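As a minimal sketch of this idea, the helper below sorts the run times, keeps the best 99%, and returns the slowest of the kept runs. The class name P99Sketch and the sample values are hypothetical; the integer rule (n*99)/100 is the same one GCPerfP99.java uses to decide how many runs to keep:

```java
import java.util.Arrays;

class P99Sketch {
    // Returns the slowest run time among the best 99% of the samples.
    // Assumes samples.length >= 2, so that at least one sample is kept.
    static long p99(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);                    // sorted low to high
        int kept = (sorted.length * 99) / 100;  // number of runs kept
        return sorted[kept - 1];
    }

    // Builds `total` runs of 10 ms each, with `slow` of them taking 510 ms
    // (hypothetical numbers for illustration).
    static long[] makeSamples(int total, int slow) {
        long[] samples = new long[total];
        Arrays.fill(samples, 10);
        for (int i = 0; i < slow; i++) {
            samples[i] = 510;
        }
        return samples;
    }

    public static void main(String[] args) {
        // 0.5% of runs interrupted: all slow runs are dropped.
        System.out.println("P99 with 5 slow runs:  " + p99(makeSamples(1000, 5)) + " ms");
        // 2% of runs interrupted: some slow runs survive the cut.
        System.out.println("P99 with 20 slow runs: " + p99(makeSamples(1000, 20)) + " ms");
    }
}
```

With 5 slow runs out of 1000, the P99 value stays at 10 ms. With 20 slow runs, only 10 of them are dropped, so the P99 value rises to 510 ms. This shows that the measurement is only immune to interruptions while they stay under the 1% limit.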
/* GCPerfP99.java
 * Copyright (c) HerongYang.com. All Rights Reserved.
 */
class GCPerfP99 {
   static long startTime = System.currentTimeMillis();
   static MyList objList = null;
   static int objSize = 1024; // in KB, default = 1 MB
   static int baseSize = 32; // # of objects in the base
   static int chunkSize = 32; // # of objects per run chunk
   static int warmup = 64; // warmup loops: 64*32 = 2GB
   static int runs = 1000; // number of runs
   public static void main(String[] arg) {
      System.out.println("["
         +((System.currentTimeMillis()-startTime)/1000.0)
         +"s] main() started");
      if (arg.length>0) objSize = Integer.parseInt(arg[0]);
      if (arg.length>1) baseSize = Integer.parseInt(arg[1]);
      if (arg.length>2) chunkSize = Integer.parseInt(arg[2]);
      if (arg.length>3) warmup = Integer.parseInt(arg[3]);
      if (arg.length>4) runs = Integer.parseInt(arg[4]);
      System.out.println("Parameters:");
      System.out.println("   Size="+objSize+"KB"
         +", Base="+baseSize
         +", Chunk="+chunkSize
         +", Warmup="+warmup+", Runs="+runs);
      objList = new MyList();
      myTest();
   }
   public static void myTest() {
      for (int m=0; m<baseSize; m++) {
         objList.add(new MyObject());
      }
      for (int k=0; k<warmup; k++) {
         for (int m=0; m<chunkSize; m++) {
            objList.add(new MyObject());
         }
         for (int m=0; m<chunkSize; m++) {
            objList.removeTail();
         }
      }
      long[] times = new long[runs+1];
      times[0] = System.currentTimeMillis();
      for (int i=0; i<runs; i++) {
         System.out.println("["
            +((System.currentTimeMillis()-startTime)/1000.0)
            +"s] Run start "+(i+1));
         for (int m=0; m<chunkSize; m++) {
            objList.add(new MyObject());
         }
         for (int m=0; m<chunkSize; m++) {
            objList.removeTail();
         }
         times[i+1] = System.currentTimeMillis();
         System.out.println("["+((times[i+1]-startTime)/1000.0)
            +"s] Run end "+(i+1)+": "+(times[i+1]-times[i])+"ms");
      }
      long[] samples = new long[runs];
      for (int i=0; i<runs; i++) {
         samples[i] = times[i+1] - times[i]; // in millis
      }
      java.util.Arrays.sort(samples); // sorted low to high
      int p99 = (runs*99)/100; // 99th percentile
      long duration = 0;
      for (int i=0; i<p99; i++) {
         duration += samples[i];
      }
      long avePerf = (1000*p99*chunkSize)/duration; // obj/second
      long maxPerf = 999999;
      if (samples[0]>0) maxPerf = (1000*chunkSize)/samples[0];
      long minPerf = (1000*chunkSize)/samples[p99-1];
      long latency = 1000000/minPerf; // millis/1000 obj
      System.out.println("Results:");
      System.out.println("   Total execution time = "
         +(duration/1000)+" seconds");
      System.out.println("   Total objects processed = "
         +(runs*chunkSize));
      System.out.println("   Average time per run = "
         +(duration/p99)+" milliseconds");
      System.out.println("   Throughput = "
         +avePerf+" objects/second");
      System.out.println("   Latency = "
         +latency+" milliseconds/1000 objects");
      System.out.println("   Throughput (max, ave, min) = ("
         +maxPerf+", "+avePerf+", "+minPerf+")");
      System.out.println("   Latency (min, ave, max) = ("
         +(1000000/maxPerf)+", "+(1000000/avePerf)+", "
         +(1000000/minPerf)+")");
      System.out.println("1% worst runs dropped:");
      for (int i=p99; i<runs; i++) {
         System.out.println("   Run, Time, Throughput = "
            +(i+1)+", "+samples[i]+", "+(1000*chunkSize)/samples[i]);
      }
      System.err.println("Press ENTER to end...");
      try {
         System.in.read();
      } catch (Exception e) {
      }
   }
   static class MyObject {
      private long[] obj = null;
      public MyObject next = null;
      public MyObject prev = null;
      public MyObject() {
         obj = new long[objSize*128]; // 128*8=1024 bytes
         for (int i=0; i<objSize*128; i++) {
            obj[i] = i/2+i/3+i/4+i/5; // some work load
         }
      }
   }
   static class MyList {
      MyObject head = null;
      MyObject tail = null;
      void add(MyObject o) {
         if (head==null) {
            head = o;
            tail = o;
         } else {
            o.prev = head;
            head.next = o;
            head = o;
         }
      }
      void removeTail() {
         if (tail!=null) {
            if (tail.next==null) {
               tail = null;
               head = null;
            } else {
               tail = tail.next;
               tail.prev = null;
            }
         }
      }
   }
}
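The unit conversions in the result section may be easier to follow with a worked example. The sketch below repeats the same integer formulas as the program in two helper methods; the class name LatencyUnits and the 40 ms run time are hypothetical, chosen only to make the arithmetic easy to check:

```java
class LatencyUnits {
    // Throughput in objects/second for one run that processed
    // `chunkSize` objects in `runMillis` milliseconds.
    static long objectsPerSecond(int chunkSize, long runMillis) {
        return (1000L * chunkSize) / runMillis;
    }

    // Latency in milliseconds per 1000 objects, derived from a throughput
    // value in objects/second (integer division, as in the test program).
    static long millisPer1000Objects(long objectsPerSecond) {
        return 1000000 / objectsPerSecond;
    }

    public static void main(String[] args) {
        int chunkSize = 32;       // default chunk size of the test program
        long slowestKept = 40;    // assumption: slowest kept run took 40 ms
        long minPerf = objectsPerSecond(chunkSize, slowestKept);
        long latency = millisPer1000Objects(minPerf);
        System.out.println("minPerf = " + minPerf + " objects/second");
        System.out.println("latency = " + latency + " milliseconds/1000 objects");
    }
}
```

With these assumed values, 32 objects in 40 ms gives 800 objects/second, which corresponds to 1250 milliseconds per 1000 objects. Because both steps use integer division, results for other inputs may be rounded down slightly.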
Changes made on the test program: