This section provides a tutorial example on how to do performance comparison on search operations on different collection classes implemented in JDK 1.4.1. HashSet and LinkedHashSet have the best performance on search operations.
Once a collection is created, search is probably the most frequently used
operation. So let's compare the performance of the search operation among
the collection classes offered in JDK. Here is the main program,
CollectionTest:
import java.util.*;
class CollectionTest {
public static void main(String[] a) {
Object[] arr = getObjects();
Collection col;
System.out.println("Collection Class, Number of Objects, Time");
col = new HashSet();
testCollection(arr,col);
col = new LinkedHashSet();
testCollection(arr,col);
// col = new TreeSet(); // requires element.compareTo()
// testCollection(arr,col);
col = new Vector();
testCollection(arr,col);
col = new ArrayList();
testCollection(arr,col);
col = new LinkedList();
testCollection(arr,col);
Map dic;
System.out.println("Map Class, Number of Objects, Time");
dic = new TreeMap();
testMap(arr,dic);
dic = new HashMap();
testMap(arr,dic);
dic = new IdentityHashMap();
testMap(arr,dic);
dic = new WeakHashMap();
testMap(arr,dic);
dic = new Hashtable();
testMap(arr,dic);
}
private static void testCollection(Object[] arr, Collection col) {
for (int l=0; l<arr.length; l++) {
col.add(arr[l]);
}
Date t1 = new Date();
boolean ok = true;
for (int l=0; l<arr.length; l++) {
ok = ok && col.contains(arr[l]);
}
Date t2 = new Date();
if (ok==false) System.out.println("Search failed.");
System.out.println(col.getClass().getName()+", "+
arr.length+", "+(t2.getTime()-t1.getTime()));
}
private static void testMap(Object[] arr, Map col) {
for (int l=0; l<arr.length; l++) {
col.put(new Integer(l), arr[l]);
}
Date t1 = new Date();
boolean ok = true;
for (int l=0; l<arr.length; l++) {
ok = ok && col.containsValue(arr[l]);
}
Date t2 = new Date();
if (ok==false) System.out.println("Search failed.");
System.out.println(col.getClass().getName()+", "+
arr.length+", "+(t2.getTime()-t1.getTime()));
}
private static Object[] getObjects() {
int max_l = 5000;
Object[] arr = new Object[max_l];
for (int l=0; l<max_l; l++) {
arr[l] = new MySimpleDate();
// arr[l] = new MyDate(); // with element.compareTo() implemented
}
return arr;
}
}
The idea of the test is simple. First, an array of objects of my own class, MySimpleDate,
is created. Then, an object of the to-be-tested collection class is created, and
filled with all the objects from the array. The performance test is concentrated
on the CPU time used to search all the objects in the collection one by one.
MySimpledate class is really simple. Here is the source code:
import java.util.*;
public class MySimpleDate {
private static Random r;
private Date my_date;
public MySimpleDate () {
my_date = new Date();
long l = my_date.getTime();
int i = (int) (l/1000/60/60/24);
if (r==null) r = new Random();
i = r.nextInt(i);
l = ((long) i)*24*60*60*1000;
my_date.setTime(l);
}
}
It contains only one piece of information, an instance of time with minutes,
seconds and milliseconds truncated, and date randomized.
Since MySimpleDate class is not implementing the Comparable interface, its
objects can not be managed by TreeSet collection. So I have
commented out some lines in the CollectionTest program to avoid testing
on TreeSet collection.
Here is output of the program compiled and executed by JDK 1.4.1_01:
Collection Class, Number of Objects, Time
java.util.HashSet, 5000, 10
java.util.LinkedHashSet, 5000, 0
java.util.Vector, 5000, 661
java.util.ArrayList, 5000, 651
java.util.LinkedList, 5000, 762
Map Class, Number of Objects, Time
java.util.TreeMap, 5000, 1021
java.util.HashMap, 5000, 1712
java.util.IdentityHashMap, 5000, 391
java.util.WeakHashMap, 5000, 1572
java.util.Hashtable, 5000, 3145
A couple of notes about the output:
LinkedHashSet took no time (0 millisecond) to search 5000 elements. I can
not believe this. Something must be wrong here.
HashSet performed better than most other collection classes, as expected,
because locations of elements in a HashSet are based the hash value of each
elements, which will provide almost instant access time.
Vector, and ArrayList, took about the same time, as expected,
because they both are based on array data structure.
LinkedList took longer time than ArrayList, as expected, because iterating
through a linked list is slower that an array.
TreeMap, HashMap, WeakHashMap and Hashtable took must longer time, because maps are
not really designed to search elements directly. We should search the keys
instead of elements.
IdentityHashMap out performed all others map collections. I don't know why.
I repeated the execution 3 times with different number of elements and recorded
the following results:
HashSet and LinkedHashSet maintained an almost constant performance level,
while the number of elements doubled twice. They are perfectly designed for
the search operation.
Vector, ArrayList and LinkedList decreased their performance exponentially
as the number of elements doubles.
TreeMap, HashMap, IdentityHashMap and Hashtable decreased their performance exponentially
as the number of elements doubles.
WeakHashMap is not reliable as mentioned in the JDK specification.