It’s a fairly simple program
that has only a fixed quantity of objects with known lifetimes.
In general, your programs will always be
creating new objects based on some criteria that will be known only at the time
the program is running. You won’t know until run-time the quantity or even
the exact type of the objects you need. To solve the general programming
problem, you need to create any number of objects, anytime, anywhere. So you
can’t rely on creating a named handle to hold each one of your
objects:
MyObject myHandle;
since you’ll never know how many of
these things you’ll actually need.
To solve this rather essential problem,
Java has several ways to hold objects (or rather, handles to objects). The
built-in type is the array, which has been discussed before and will get
additional coverage in this chapter. Also, the Java utilities library has some
collection classes (also known as
container classes, but the term
“container” is used by the Swing GUI library so
“collection” will be used here) that provide more sophisticated ways
to hold and even manipulate your objects. This will comprise the remainder of
this
chapter.
Most of the necessary introduction to
arrays is in the last section of Chapter 4, which shows
how you define and initialize an array. Holding objects is the focus of this
chapter, and an array is just one way to hold objects. But there are a number of
other ways to hold objects, so what makes an array special?
There are two issues that distinguish
arrays from other types of collections: efficiency and
type. The array is the most efficient way that Java
provides to store and access a sequence of objects (actually, object handles).
The array is a simple linear sequence, which makes element access fast, but you
pay for this speed: when you create an array object, its size is fixed and
cannot be changed for the lifetime of that array object. You might suggest
creating an array of a particular size and then, if you run out of space,
creating a new one and moving all the handles from the old one to the new one.
This is the behavior of the ArrayList class, which will be studied later
in the chapter. However, because of the overhead of this size flexibility, a
ArrayList is measurably less efficient than an array.
The vector
class in C++ does know the type of objects it holds, but it has a
different drawback when compared with arrays in Java: the C++
vector’s operator[] doesn’t do bounds checking, so you
can run past the end. (It’s possible, however, to ask how big the
vector is, and the at( ) method does perform bounds
checking.) In Java, you get bounds checking regardless of whether you’re
using an array or a collection – you’ll get a
RuntimeException if you exceed the bounds. As
you’ll learn in Chapter 9, this type of exception indicates a programmer
error and thus you don’t need to check for it in your code. As an aside,
the reason the C++ vector doesn’t check bounds with every access is
speed – in Java you have the constant performance overhead of bounds
checking all the time for both arrays and collections.
The other generic collection classes that
will be studied in this chapter, List,
Set, and Map, all
deal with objects as if they had no specific type. That is, they treat them as
type Object, the root class of all classes in
Java. This works fine from one standpoint: you need to build only one
collection, and any Java object will go into that collection. (Except for
primitives – these can be placed in collections as constants using the
Java primitive wrapper classes, or as changeable values by wrapping in your own
class.) This is the second place where an array is superior to the generic
collections: when you create an array, you create it to hold a specific type.
This means that you get compile-time type checking to prevent you from putting
the wrong type in, or mistaking the type that you’re extracting. Of
course, Java will prevent you from sending an inappropriate message to an
object, either at compile-time or at run-time. So it’s not as if
it’s riskier one way or the other, it’s just nicer if the compiler
points it out to you, faster at run-time, and there’s less likelihood that
the end user will get surprised by an exception.
For efficiency and type checking
it’s always worth trying to use an array if you can. However, when
you’re trying to solve a more general problem arrays can be too
restrictive. After looking at arrays, the rest of this chapter will be devoted
to the collection classes provided by
Java.
Regardless of what type of array
you’re working with, the array identifier is actually a handle to a true
object that’s created on the heap. The heap object can be created either
implicitly, as part of the array initialization syntax, or explicitly with a
new expression. Part of the heap object (in fact, the only field or
method you can access) is the read-only length member that tells you how
many elements can be stored in that array object.
The ‘[]’ syntax
is the only other access that you have to the array object.
The following example shows the various
ways that an array can be initialized, and how the array handles can be assigned
to different array objects. It also shows that arrays of
objects and arrays of primitives are almost identical in
their use. The only difference is that arrays of objects hold handles while
arrays of primitives hold the primitive values directly. (See page 141 if you
have trouble executing this program.)
//: c08:ArraySize.java // Initialization & re-assignment of arrays class Weeble {} // A small mythical creature public class ArraySize { public static void main(String[] args) { // Arrays of objects: Weeble[] a; // Null handle Weeble[] b = new Weeble[5]; // Null handles Weeble[] c = new Weeble[4]; for(int i = 0; i < c.length; i++) c[i] = new Weeble(); Weeble[] d = { new Weeble(), new Weeble(), new Weeble() }; // Compile error: variable a not initialized: //!System.out.println("a.length=" + a.length); System.out.println("b.length = " + b.length); // The handles inside the array are // automatically initialized to null: for(int i = 0; i < b.length; i++) System.out.println("b[" + i + "]=" + b[i]); System.out.println("c.length = " + c.length); System.out.println("d.length = " + d.length); a = d; System.out.println("a.length = " + a.length); // Java 1.1 initialization syntax: a = new Weeble[] { new Weeble(), new Weeble() }; System.out.println("a.length = " + a.length); // Arrays of primitives: int[] e; // Null handle int[] f = new int[5]; int[] g = new int[4]; for(int i = 0; i < g.length; i++) g[i] = i*i; int[] h = { 11, 47, 93 }; // Compile error: variable e not initialized: //!System.out.println("e.length=" + e.length); System.out.println("f.length = " + f.length); // The primitives inside the array are // automatically initialized to zero: for(int i = 0; i < f.length; i++) System.out.println("f[" + i + "]=" + f[i]); System.out.println("g.length = " + g.length); System.out.println("h.length = " + h.length); e = h; System.out.println("e.length = " + e.length); // Java 1.1 initialization syntax: e = new int[] { 1, 2 }; System.out.println("e.length = " + e.length); } } ///:~
Here’s the output from the
program:
b.length = 5 b[0]=null b[1]=null b[2]=null b[3]=null b[4]=null c.length = 4 d.length = 3 a.length = 3 a.length = 2 f.length = 5 f[0]=0 f[1]=0 f[2]=0 f[3]=0 f[4]=0 g.length = 4 h.length = 3 e.length = 3 e.length = 2
The array a is initially just a
null handle, and the compiler prevents you from
doing anything with this handle until you’ve properly initialized it. The
array b is initialized to point to an array of Weeble handles, but
no actual Weeble objects are ever placed in that array. However, you can
still ask what the size of the array is, since b is pointing to a
legitimate object. This brings up a slight drawback: you can’t find out
how many elements are actually in the array, since length tells
you only how many elements can be placed in the array; that is, the size
of the array object, not the number of elements it actually holds. However, when
an array object is created its handles are automatically initialized to
null so you can see whether a particular array slot has an object in it
by checking to see whether it’s null. Similarly, an array of
primitives is automatically initialized to zero for numeric types, null
for char, and false for boolean.
Array c shows the creation of the
array object followed by the assignment of Weeble objects to all the
slots in the array. Array d shows the “aggregate
initialization” syntax that causes the array object to be created
(implicitly with new on the heap, just like for array c)
and initialized with Weeble objects, all in one
statement.
The expression
a = d;
shows how you can take a handle
that’s attached to one array object and assign it to another array object,
just as you can do with any other type of object handle. Now both a and
d are pointing to the same array object on the heap.
Java 1.1 adds a
new array initialization syntax, which could be thought of as a “dynamic
aggregate initialization.” The Java 1.0 aggregate
initialization used by d must be used at the point of d’s
definition, but with the Java 1.1 syntax you can create and initialize an array
object anywhere. For example, suppose hide( ) is a method that takes
an array of Weeble objects. You could call it by saying:
hide(d);
but in Java 1.1 you can also dynamically
create the array you want to pass as the argument:
hide(new Weeble[] { new Weeble(), new Weeble() });
This new syntax provides a more
convenient way to write code in some situations.
The second part of the above example
shows that primitive arrays work just like object arrays except that
primitive arrays hold the primitive values
directly.
Collection classes can hold only handles
to objects. An array, however, can be created to hold primitives directly, as
well as handles to objects. It is possible to use the
“wrapper” classes such as Integer, Double, etc. to
place primitive values inside a collection, but as you’ll see later in
this chapter in the WordCount.java example, the wrapper classes for
primitives are only somewhat useful anyway. Whether you put primitives in arrays
or wrap them in a class that’s placed in a collection is a question of
efficiency. It’s much more efficient to create and access an array of
primitives than a collection of wrapped primitives.
Of course, if you’re using a
primitive type and you need the flexibility of a collection that automatically
expands when more space is needed, the array won’t work and you’re
forced to use a collection of wrapped primitives. You might think that there
should be a specialized type of Vector for each of the primitive data
types, but Java doesn’t provide this for you. Some sort of templatizing
mechanism might someday provide a better way for Java to handle this
problem.[41]
Suppose you’re writing a method and
you don’t just want to return one thing, but a whole bunch of things.
Languages like C and C++ make this difficult because you can’t just return
an array, only a pointer to an array. This introduces problems because it
becomes messy to control the lifetime of the array, which easily leads to memory
leaks.
Java takes a similar approach, but you
just “return an array.” Actually, of course, you’re returning
a handle to an array, but with Java you never worry about responsibility for
that array – it will be around as long as you need it, and the garbage
collector will clean it up when you’re done.
As an example, consider returning an
array of String:
//: c08:IceCream.java // Returning arrays from methods public class IceCream { static String[] flav = { "Chocolate", "Strawberry", "Vanilla Fudge Swirl", "Mint Chip", "Mocha Almond Fudge", "Rum Raisin", "Praline Cream", "Mud Pie" }; static String[] flavorSet(int n) { // Force it to be positive & within bounds: n = Math.abs(n) % (flav.length + 1); String[] results = new String[n]; boolean[] picked = new boolean[flav.length]; for (int i = 0; i < n; i++) { int t; do t = (int)(Math.random() * flav.length); while (picked[t]); results[i] = flav[t]; picked[t] = true; } return results; } public static void main(String[] args) { for(int i = 0; i < 20; i++) { System.out.println( "flavorSet(" + i + ") = "); String[] fl = flavorSet(flav.length); for(int j = 0; j < fl.length; j++) System.out.println("\t" + fl[j]); } } } ///:~
The method flavorSet( )
creates an array of String called results. The size of this array
is n, determined by the argument you pass into the method. Then it
proceeds to choose flavors randomly from the array flav and place them
into results, which it finally returns. Returning an array is just like
returning any other object – it’s a handle. It’s not important
that the array was created within flavorSet( ), or that the array
was created anyplace else, for that matter. The garbage collector takes care of
cleaning up the array when you’re done with it, and the array will persist
for as long as you need it.
As an aside, notice that when
flavorSet( ) chooses flavors randomly, it ensures that a random
choice hasn’t been picked before. This is performed in a do loop
that keeps making random choices until it finds one that’s not already in
the picked array. (Of course, a String comparison could also have
been performed to see if the random choice was already in the results
array, but String comparisons are inefficient.) If it’s successful
it adds the entry and finds the next one (i gets incremented).
main( ) prints out 20 full
sets of flavors, so you can see that flavorSet( ) chooses the
flavors in a random order each time. It’s easiest to see this if you
redirect the output into a file. And while you’re looking at the file,
remember, you’re not really hungry. (You just want the ice cream,
you don’t need
it.)
To summarize what we’ve seen so
far, your first, most efficient choice to hold a group of objects should be an
array, and you’re forced into this choice if you want to hold a group of
primitives. In the remainder of the chapter we’ll look at the more general
case, when you don’t know at the time you’re writing the program how
many objects you’re going to need, or if you need a more sophisticated way
to store your objects. Java provides three types of
collection classes to solve this problem:
List, Set,
and Map. You can solve a surprising number of
problems using these tools.
Among their other characteristics –
Set, for example, holds only one object of each value, and Map is
an associative array that lets you associate any
object with any other object – the Java collection classes will
automatically resize themselves. Thus, you can put in any number of objects and
you don’t need to worry about how big to make the collection while
you’re writing the
program.
The “disadvantage” to using
the Java collections is that you lose type information when you put an object
into a collection. This happens because, when the collection was written, the
programmer of that collection had no idea what specific type you wanted to put
in the collection, and making the collection hold only your type would prevent
it from being a general-purpose tool. So instead, the collection holds handles
to objects of type Object, which is of course every object in Java, since
it’s the root of all the classes. (Of course, this doesn’t include
primitive types, since they aren’t inherited from anything.) This is a
great solution, except for these reasons:
On the up side, Java
won’t let you misuse the objects that you put into a collection. If
you throw a dog into a collection of cats, then go through and try to treat
everything in the collection as a cat, you’ll get an exception when you
get to the dog. In the same vein, if you try to cast the dog handle that you
pull out of the cat collection into a cat, you’ll get an exception at
run-time.
Here’s an example:
//: c08:CatsAndDogs.java // Simple collection example import java.util.*; class Cat { private int catNumber; Cat(int i) { catNumber = i; } void print() { System.out.println("Cat #" + catNumber); } } class Dog { private int dogNumber; Dog(int i) { dogNumber = i; } void print() { System.out.println("Dog #" + dogNumber); } } public class CatsAndDogs { public static void main(String[] args) { ArrayList cats = new ArrayList(); for(int i = 0; i < 7; i++) cats.add(new Cat(i)); // Not a problem to add a dog to cats: cats.add(new Dog(7)); for(int i = 0; i < cats.size(); i++) ((Cat)cats.get(i)).print(); // Dog is detected only at run-time } } ///:~
You can see that using an ArrayList is
straightforward: create one, put objects in using
add( ), and later get
them out with get( ).
(Note that Vector has a method
size( ) to let you
know how many elements have been added so you don’t inadvertently run off
the end and cause an exception.)
The classes Cat and Dog are
distinct – they have nothing in common except that they are
Objects. (If you don’t explicitly say what class you’re
inheriting from, you automatically inherit from Object.) The
Vector class, which comes from java.util, holds Objects, so
not only can you put Cat objects into this collection using the Vector
method add( ), but you can also add Dog objects without
complaint at either compile-time or run-time. When you go to fetch out what you
think are Cat objects using the Vector method get( ),
you get back a handle to an Object that you must cast to a Cat.
Then you need to surround the entire expression with parentheses to force the
evaluation of the cast before calling the print( ) method for
Cat, otherwise you’ll get a syntax error. Then, at run-time, when
you try to cast the Dog object to a Cat, you’ll get an
exception.
This is more than just an annoyance.
It’s something that can create some difficult-to-find bugs. If one part
(or several parts) of a program inserts objects into a collection, and you
discover only in a separate part of the program through an exception that a bad
object was placed in the collection, then you must find out where the bad insert
occurred. You do this by code inspection, which is about the worst debugging
tool you have. On the upside, it’s convenient to start with some
standardized collection classes for programming, despite the scarcity and
awkwardness.
It turns out that in some cases things
seem to work correctly without casting back to your original type. The first
case is quite special: the String class has some extra help from the
compiler to make it work smoothly. Whenever the compiler expects a String
object and it hasn’t got one, it will automatically call the
toString( ) method
that’s defined in Object and can be overridden by any Java class.
This method produces the desired String object, which is then used
wherever it was wanted.
Thus, all you need to do to make objects
of your class print out is to override the toString( ) method, as
shown in the following example:
//: c08:WorksAnyway.java // In special cases, things just seem // to work correctly. import java.util.*; class Mouse { private int mouseNumber; Mouse(int i) { mouseNumber = i; } // Magic method: public String toString() { return "This is Mouse #" + mouseNumber; } void print(String msg) { if(msg != null) System.out.println(msg); System.out.println( "Mouse number " + mouseNumber); } } class MouseTrap { static void caughtYa(Object m) { Mouse mouse = (Mouse)m; // Cast from Object mouse.print("Caught one!"); } } public class WorksAnyway { public static void main(String[] args) { ArrayList mice = new ArrayList(); for(int i = 0; i < 3; i++) mice.add(new Mouse(i)); for(int i = 0; i < mice.size(); i++) { // No cast necessary, automatic call // to Object.toString(): System.out.println( "Free mouse: " + mice.get(i)); MouseTrap.caughtYa(mice.get(i)); } } } ///:~
You can see the redefinition of
toString( ) in Mouse. In the second for loop in
main( ) you find the statement:
System.out.println("Free mouse: " + mice.get(i));
After the ‘+’ sign the
compiler expects to see a
String object.
get( ) produces an Object, so to get the desired
String the compiler implicitly calls toString( ).
Unfortunately, you can work this kind of magic only with String; it
isn’t available for any other type.
A second approach to hiding the cast has
been placed inside MouseTrap. The caughtYa( ) method accepts
not a Mouse, but an Object, which it then casts to a Mouse.
This is quite presumptuous, of course, since by accepting an Object
anything could be passed to the method. However, if the cast is incorrect
– if you passed the wrong type – you’ll get an exception at
run-time. This is not as good as compile-time checking but it’s still
robust. Note that in the use of this method:
MouseTrap.caughtYa(mice.get(i));
no cast is necessary.
You might not want to give up on this
issue just yet. A more ironclad solution is to create a new class using the
Vector, such that it will accept only your type and produce only your
type:
//: c08:GopherList.java // A type-conscious ArrayList import java.util.*; class Gopher { private int gopherNumber; Gopher(int i) { gopherNumber = i; } void print(String msg) { if(msg != null) System.out.println(msg); System.out.println( "Gopher number " + gopherNumber); } } class GopherTrap { static void caughtYa(Gopher g) { g.print("Caught one!"); } } class GopherList { private ArrayList v = new ArrayList(); public void add(Gopher m) { v.add(m); } public Gopher get(int index) { return (Gopher)v.get(index); } public int size() { return v.size(); } public static void main(String[] args) { GopherList gophers = new GopherList(); for(int i = 0; i < 3; i++) gophers.add(new Gopher(i)); for(int i = 0; i < gophers.size(); i++) GopherTrap.caughtYa(gophers.get(i)); } } ///:~
This is similar to the previous example,
except that the new GopherVector class has a private member of
type Vector (inheriting from Vector tends to be frustrating, for
reasons you’ll see later), and methods just like Vector. However,
it doesn’t accept and produce generic Objects, only Gopher
objects.
Because a GopherVector will accept
only a Gopher, if you were to say:
gophers.add(new Pigeon());
you would get an error message at
compile time. This approach, while more tedious from a coding standpoint,
will tell you immediately if you’re using a type
improperly.
Note that no cast is necessary when using
get( ) – it’s always a Gopher.
This kind of problem isn’t isolated
– there are numerous cases in which you need to create new types based on
other types, and in which it is useful to have specific type information at
compile-time. This is the concept of a
parameterized type. In C++,
this is directly supported by the language in
templates. At one point,
Java had reserved the keyword generic to someday
support parameterized types, but it’s uncertain if this will ever
occur.
In any collection class, you must have a
way to put things in and a way to get things out. After all, that’s the
primary job of a collection – to hold things. In the
ArrayList, add( ) is the way that you
insert objects, and get( ) is one way to get things out.
ArrayList is quite flexible – you can select anything at any time,
and select multiple elements at once using different
indexes.
If you want to start thinking at a higher
level, there’s a drawback: you need to know the exact type of the
collection in order to use it. This might not seem bad at first, but what if you
start out using a ArrayList, and later on in your program you decide, for
efficiency, that you want to change to a LinkedList? Or you’d like
to write a piece of code that doesn’t know or care what type of collection
it’s working with.
The concept of an
iterator can be used to achieve this next level of
abstraction. This is an object whose job is to move through a sequence of
objects and select each object in that sequence without the client programmer
knowing or caring about the underlying structure of that sequence. In addition,
an iterator is usually what’s called a “light-weight” object;
that is, one that’s cheap to create. For that reason, you’ll often
find seemingly strange constraints for iterators; for example, some iterators
can move in only one direction.
The Java Iterator is an example of
an iterator with these kinds of constraints. There’s not much you can do
with one except:
That’s
all. It’s a simple implementation of an iterator, but still powerful. To
see how it works, let’s revisit the CatsAndDogs.java program from
earlier in the chapter. In the original version, the method get( )
was used to select each element, but in the following modified version an
Iterator is used:
//: c08:CatsAndDogs2.java // Simple collection with Iterator import java.util.*; class Cat2 { private int catNumber; Cat2(int i) { catNumber = i; } void print() { System.out.println("Cat number " +catNumber); } } class Dog2 { private int dogNumber; Dog2(int i) { dogNumber = i; } void print() { System.out.println("Dog number " +dogNumber); } } public class CatsAndDogs2 { public static void main(String[] args) { ArrayList cats = new ArrayList(); for(int i = 0; i < 7; i++) cats.add(new Cat2(i)); // Not a problem to add a dog to cats: cats.add(new Dog2(7)); Iterator e = cats.iterator(); while(e.hasNext()) ((Cat2)e.next()).print(); // Dog is detected only at run-time } } ///:~
You can see that the only change is in
the last few lines. Instead of:
for(int i = 0; i < cats.size(); i++) ((Cat)cats.get(i)).print();
an Iterator is used to step
through the sequence:
while(e.hasNext()) ((Cat2)e.next()).print();
With the Iterator, you don’t
need to worry about the number of elements in the collection. That’s taken
care of for you by
hasNext( ) and
next( ).
As another example, consider the creation
of a general-purpose printing method:
//: c08:HamsterMaze.java // Using an Iterator import java.util.*; class Hamster { private int hamsterNumber; Hamster(int i) { hamsterNumber = i; } public String toString() { return "This is Hamster #" + hamsterNumber; } } class Printer { static void printAll(Iterator e) { while(e.hasNext()) System.out.println(e.next()); } } public class HamsterMaze { public static void main(String[] args) { ArrayList v = new ArrayList(); for(int i = 0; i < 3; i++) v.add(new Hamster(i)); Printer.printAll(v.iterator()); } } ///:~
Look closely at the printing
method:
static void printAll(Iterator e) { while(e.hasNext()) System.out.println(e.next()); }
Note that there’s no information
about the type of sequence. All you have is an Iterator, and that’s
all you need to know about the sequence: that you can get the next object, and
that you can know when you’re at the end. This idea of taking a collection
of objects and passing through it to perform an operation on each one is
powerful and will be seen throughout this book.
This particular example is even more
generic, since it implicitly uses the ubiquitous
toString( ) method (ubiquitous only because
it’s part of the Object class).
This uses the “automatic
conversion to String” that’s wired into Java. When any object
is handed to println( ), a String is automatically produced by
calling toString( ).
Although it’s unnecessary, you can
be more explicit using a cast, which has the effect of calling
toString( ):
System.out.println((String)e.next());
In general, however, you’ll want to
do something more than call Object methods, so you’ll run up
against the type-casting issue again. You must assume you’ve gotten an
Iterator to a sequence of the particular type you’re interested in,
and cast the resulting objects to that type (getting a run-time exception if
you’re
wrong).
The standard Java
1.0 and 1.1 library comes with a bare minimum set of
collection classes, but they’re probably enough to get by with for many of
your programming projects. (As you’ll see at the end of this chapter, Java
2 provides a radically redesigned and filled-out library
of collections.)
The ArrayList is quite simple to
use, as you’ve seen so far. Although most of the time you’ll just
use add( ) to insert objects, get( ) to get them out one
at a time, and elements( ) to get an Iterator to the
sequence, there’s also a set of other methods that can be useful. As usual
with the Java libraries, we won’t use or talk about them all here, but be
sure to look them up in the electronic documentation to get a feel for what they
can do.
The Java standard collections contain a
toString( ) method so they can produce a String
representation of themselves, including the objects they hold. Inside of
ArrayList, for example, the toString( ) steps through the
elements of the ArrayList and calls toString( ) for each one.
Suppose you’d like to print out the address of your class. It seems to
make sense to simply refer to this (in particular, C++ programmers are
prone to this approach):
//: c08:CrashJava.java // One way to crash Java import java.util.*; public class CrashJava { public String toString() { return "CrashJava address: " + this + "\n"; } public static void main(String[] args) { ArrayList v = new ArrayList(); for(int i = 0; i < 10; i++) v.add(new CrashJava()); System.out.println(v); } } ///:~
It turns out that if you simply create a
CrashJava object and print it out, you’ll get an endless sequence
of exceptions. However, if you place the CrashJava objects in an
ArrayList and print out that ArrayList as shown here, it can’t
handle it and you don’t even get an exception; Java just crashes. (But at
least it didn’t bring down my operating system.) This was tested with Java
1.1.
What’s happening is automatic type
conversion for Strings. When you say:
"CrashJava address: " + this
The compiler sees a String
followed by a ‘+’ and something that’s not a
String, so it tries to convert this to a String. It does
this conversion by calling toString( ), which produces a
recursive call. When this occurs
inside an ArrayList, it appears that the
stack overflows without the
exception-handling mechanism getting a chance to respond.
If you really do want to print the
address of the object in this case, the solution is to call the Object
toString( ) method, which does just that. So instead of saying
this, you’d say super.toString( ). (This only works if
you're directly inheriting from Object or if none of your parent classes
have overridden the toString( )
method).
A BitSet is
really a Vector of bits, and it is used if you want to efficiently store
a lot of on-off information. It’s efficient only from the standpoint of
size; if you’re looking for efficient access, it is slightly slower than
using an array of some native type.
In addition, the minimum size of the
BitSet is that of a long: 64 bits. This implies that if you’re
storing anything smaller, like 8 bits, a BitSet will be wasteful, so
you’re better off creating your own class to hold your
flags.
In a normal Vector, the collection
will expand as you add more elements. The BitSet does this as well
– sort of. That is, sometimes it works and sometimes it doesn’t,
which makes it appear that the Java version 1.0 implementation of BitSet
is just badly done. (It is fixed in Java 1.1.) The
following example shows how the BitSet works and demonstrates the version
1.0 bug:
//: c08:Bits.java // Demonstration of BitSet import java.util.*; public class Bits { public static void main(String[] args) { Random rand = new Random(); // Take the LSB of nextInt(): byte bt = (byte)rand.nextInt(); BitSet bb = new BitSet(); for(int i = 7; i >=0; i--) if(((1 << i) & bt) != 0) bb.set(i); else bb.clear(i); System.out.println("byte value: " + bt); printBitSet(bb); short st = (short)rand.nextInt(); BitSet bs = new BitSet(); for(int i = 15; i >=0; i--) if(((1 << i) & st) != 0) bs.set(i); else bs.clear(i); System.out.println("short value: " + st); printBitSet(bs); int it = rand.nextInt(); BitSet bi = new BitSet(); for(int i = 31; i >=0; i--) if(((1 << i) & it) != 0) bi.set(i); else bi.clear(i); System.out.println("int value: " + it); printBitSet(bi); // Test bitsets >= 64 bits: BitSet b127 = new BitSet(); b127.set(127); System.out.println("set bit 127: " + b127); BitSet b255 = new BitSet(65); b255.set(255); System.out.println("set bit 255: " + b255); BitSet b1023 = new BitSet(512); // Without the following, an exception is thrown // in the Java 1.0 implementation of BitSet: // b1023.set(1023); b1023.set(1024); System.out.println("set bit 1023: " + b1023); } static void printBitSet(BitSet b) { System.out.println("bits: " + b); String bbits = new String(); for(int j = 0; j < b.size() ; j++) bbits += (b.get(j) ? "1" : "0"); System.out.println("bit pattern: " + bbits); } } ///:~
The random number generator is used to
create a random byte, short, and int, and each one is
transformed into a corresponding bit pattern in a BitSet. This works fine
because a BitSet is 64 bits, so none of these cause it to increase in
size. But in Java 1.0, when the BitSet is greater
than 64 bits, some strange behavior occurs. If you set a bit that’s just
one greater than the BitSet’s currently-allocated storage, it will
expand nicely. But if you try to set bits at higher locations than that without
first just touching the boundary, you’ll get an exception, since the
BitSet won’t expand properly in Java 1.0. The example shows a
BitSet of 512 bits being created. The constructor allocates storage for
twice that number of bits. Then if you try to set bit 1024 or greater without
first setting bit 1023, you’ll throw an exception in
Java 1.0. Fortunately, this is fixed in Java
1.1, but avoid using the BitSet if you write code
for Java
1.0.
A Stack is
sometimes referred to as a “last-in, first-out” (LIFO) collection.
That is, whatever you “push” on the Stack last is the first
item you can “pop” out. Like all of the other collections in Java,
what you push and pop are Objects, so you must cast what you
pop.
What’s rather odd is that instead
of using a Vector as a building block to create a
Stack, Stack is inherited from Vector. So it has all of the
characteristics and behaviors of a Vector plus some extra
Stack behaviors. It’s difficult to know whether the designers
explicitly decided that this was an especially useful way to do things, or
whether it was just a naïve design.
Here’s a simple demonstration of
Stack that reads each line from an array and pushes it as a
String:
//: c08:Stacks.java // Demonstration of Stack Class import java.util.*; public class Stacks { static String[] months = { "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" }; public static void main(String[] args) { Stack stk = new Stack(); for(int i = 0; i < months.length; i++) stk.push(months[i] + " "); System.out.println("stk = " + stk); // Treating a stack as a Vector: stk.addElement("The last line"); System.out.println( "element 5 = " + stk.elementAt(5)); System.out.println("popping elements:"); while(!stk.empty()) System.out.println(stk.pop()); } } ///:~
Each line in the months array is
inserted into the Stack with push( ), and later fetched from
the top of the stack with a pop( ). To make a point, Vector
operations are also performed on the Stack object. This is possible
because, by virtue of inheritance, a Stack is a Vector.
Thus, all operations that can be performed on a Vector can also be
performed on a Stack, such as
elementAt( ).
A Vector allows you to select from
a sequence of objects using a number, so in a sense it associates numbers to
objects. But what if you’d like to select from a sequence of objects using
some other criterion? A Stack is an example: its selection criterion is
“the last thing pushed on the stack.” A powerful twist on this idea
of “selecting from a sequence” is alternately termed a
map, a dictionary,
or an associative array.
Conceptually, it seems like a vector, but instead of looking up objects using a
number, you look them up using another object! This is often a key
process in a program.
The concept shows up in Java as the
abstract class Dictionary. The interface for this class is
straightforward: size( ) tells you how many elements are within,
isEmpty( ) is true if there are no elements, put(Object
key, Object value) adds a value (the thing you want), and associates it with
a key (the thing you look it up with). get(Object key) produces the value
given the corresponding key, and remove(Object key) removes the key-value
pair from the list. There are Iterators: keys( ) produces an
Iterator of the keys, and elements( ) produces an
Iterator of all the values. That’s all there is to a
Dictionary.
A Dictionary isn’t terribly
difficult to implement. Here’s a simple approach, which uses two
Vectors, one for keys and one for values:
//: c08:AssocArray.java // Simple version of a Map import java.util.*; public class AssocArray extends AbstractMap { private ArrayList keys = new ArrayList(); private ArrayList values = new ArrayList(); public int size() { return keys.size(); } public boolean isEmpty() { return keys.isEmpty(); } public Object put(Object key, Object value) { int index = keys.indexOf(key); if (index == -1) { // Key not found keys.add(key); values.add(value); return null; } else { // Key already in table; replace Object returnval = values.get(index); values.set(index, value); return returnval; } } public Object get(Object key) { int index = keys.indexOf(key); // indexOf() Returns -1 if key not found: if(index == -1) return null; return values.get(index); } public Object remove(Object key) { int index = keys.indexOf(key); if(index == -1) return null; keys.remove(index); Object returnval = values.get(index); values.remove(index); return returnval; } public Set keySet() { return new HashSet(keys); } public Collection values() { return values; } public Set entrySet() { Set set = new HashSet(); // Iterator it = keys.iterator(); // while(it.hasNext()) { // Object k = it.next(); // Object v = values.get(values.indexOf(k)); // set.add(new Map.Entry(k, v)); // } return set; } // Test it: public static void main(String[] args) { AssocArray aa = new AssocArray(); for(char c = 'a'; c <= 'z'; c++) aa.put(String.valueOf(c), String.valueOf(c) .toUpperCase()); char[] ca = { 'a', 'e', 'i', 'o', 'u' }; for(int i = 0; i < ca.length; i++) System.out.println("Uppercase: " + aa.get(String.valueOf(ca[i]))); } } ///:~
The first thing you see in the definition
of AssocArray is that it extends Dictionary. This means that
AssocArray is a type of Dictionary, so you can make the
same requests of it that you can a Dictionary. If you make your own
Dictionary, as is done here, all you need to do is fill in all the
methods that are in Dictionary. (And you must override all the
methods because all of them – with the exception of the constructor
– are abstract.)
The Vectors keys and
values are linked by a common index number. That is, if you call
put( ) with a key of “roof” and a value of
“blue” (assuming you’re associating the various parts of a
house with the colors they are to be painted) and there are already 100 elements
in the AssocArray, then “roof” will be the 101 element of
keys and “blue” will be the 101 element of values. And
if you look at get( ), when you pass “roof” in as the
key, it produces the index number with keys.indexOf( ), and
then uses that index number to produce the value in the associated values
vector.
The test in main( ) is
simple; it’s just a map of lowercase characters to uppercase characters,
which could obviously be done in a number of more efficient ways. But it shows
that AssocArray is functional.
The standard Java library contains two
different types of Maps: HashMap and TreeMap. Both have the
same interface HashMap (since they both implement Map), but they
differ in one distinct way: efficiency. If you look at what must be done for a
get( ), it seems pretty slow to search through an ArrayList for the
key. This is where HashMap speeds things up. Instead of the tedious
linear search for the key, it uses a special value called a hash code.
The hash code is a way to take some information in the object in question
and turn it into a “relatively unique” int for that object.
All objects have a hash code, and hashCode( )
is a method in the root class Object. A HashMap
takes the hashCode( ) of the object and uses it to quickly hunt
for the key. This results in a dramatic performance
improvement.[42]
The way that a HashMap works is beyond the scope of this
book[43] –
all you need to know is that HashMap is a fast Dictionary, and
that a Dictionary is a useful tool.
As an example of the use of a
HashMap, consider a program to check the randomness of Java’s
Math.random( ) method.
Ideally, it would produce a perfect distribution of random numbers, but to test
this you need to generate a bunch of random numbers and count the ones that fall
in the various ranges. A HashMap is perfect for this, since it associates
objects with objects (in this case, the values produced by
Math.random( ) with the number of times those values
appear):
//: c08:Statistics.java // Simple demonstration of HashMap import java.util.*; class Counter { int i = 1; public String toString() { return Integer.toString(i); } } class Statistics { public static void main(String[] args) { HashMap ht = new HashMap(); for(int i = 0; i < 10000; i++) { // Produce a number between 0 and 20: Integer r = new Integer((int)(Math.random() * 20)); if(ht.containsKey(r)) ((Counter)ht.get(r)).i++; else ht.put(r, new Counter()); } System.out.println(ht); } } ///:~
In main( ), each time a
random number is generated it is wrapped inside an Integer object so that
handle can be used with the HashMap. (You can’t use a primitive
with a collection, only an object handle.) The containsKey( ) method
checks to see if this key is already in the collection. (That is, has the number
been found already?) If so, the get( )
methods gets the associated value for the key, which in this case is a
Counter object. The value i inside the counter is then incremented
to indicate that one more of this particular random number has been
found.
If the key has not been found yet, the
method put( ) will place a new key-value pair
into the HashMap. Since Counter automatically initializes its
variable i to one when it’s created, it indicates the first
occurrence of this particular random number.
To display the HashMap, it is
simply printed out. The HashMap toString( ) method moves
through all the key-value pairs and calls the toString( ) for each
one. The Integer toString( ) is pre-defined, and you can see the
toString( ) for Counter. The output from one run (with some
line breaks added) is:
{19=526, 18=533, 17=460, 16=513, 15=521, 14=495, 13=512, 12=483, 11=488, 10=487, 9=514, 8=523, 7=497, 6=487, 5=480, 4=489, 3=509, 2=503, 1=475, 0=505}
You might wonder at the necessity of the
class Counter, which seems like it doesn’t even have the
functionality of the wrapper class Integer. Why not use int or
Integer? Well, you can’t use an int because all of the
collections can hold only Object handles. After seeing collections the
wrapper classes might begin to make a little more sense to you, since you
can’t put any of the primitive types in collections. However, the only
thing you can do with the Java wrappers is to
initialize them to a particular value and read that value. That is,
there’s no way to change a value once a wrapper object has been created.
This makes the Integer wrapper immediately useless to solve our problem,
so we’re forced to create a new class that does satisfy the
need.
In the previous example, a standard
library class (Integer) was used as a key for the HashMap. It
worked fine as a key, because it has all the necessary wiring to make it work
correctly as a key. But a common pitfall occurs when using HashMaps when
you create your own classes to be used as keys. For example, consider a weather
predicting system that matches Groundhog objects to Prediction
objects. It seems fairly straightforward: you create the two classes and use
Groundhog as the key and Prediction as the value:
//: c08:SpringDetector.java // Looks plausible, but doesn't work right. import java.util.*; class Groundhog { int ghNumber; Groundhog(int n) { ghNumber = n; } } class Prediction { boolean shadow = Math.random() > 0.5; public String toString() { if(shadow) return "Six more weeks of Winter!"; else return "Early Spring!"; } } public class SpringDetector { public static void main(String[] args) { HashMap ht = new HashMap(); for(int i = 0; i < 10; i++) ht.put(new Groundhog(i), new Prediction()); System.out.println("ht = " + ht + "\n"); System.out.println( "Looking up prediction for groundhog #3:"); Groundhog gh = new Groundhog(3); if(ht.containsKey(gh)) System.out.println((Prediction)ht.get(gh)); } } ///:~
Each Groundhog is given an
identity number, so you can look up a Prediction in the HashMap by
saying “Give me the Prediction associated with Groundhog
number 3.” The Prediction class contains a boolean that is
initialized using Math.random( ), and a toString( ) that
interprets the result for you. In main( ), a HashMap is
filled with Groundhogs and their associated Predictions. The
HashMap is printed so you can see that it has been filled. Then a
Groundhog with an identity number of 3 is used to look up the prediction
for Groundhog #3.
It seems simple enough, but it
doesn’t work. The problem is that Groundhog is inherited from the
common root class Object (which is what happens if you don’t
specify a base class, thus all classes are ultimately inherited from
Object). It is Object’s hashCode( ) method that
is used to generate the hash code for each object, and by default it just uses
the address of its object. Thus, the first instance of Groundhog(3) does
not produce a hash code equal to the hash code for the second instance of
Groundhog(3) that we tried to use as a lookup.
You might think that all you need to do
is write an appropriate override for
hashCode( ). But it still won’t work
until you’ve done one more thing: override the
equals( ) that is also part of Object.
This method is used by the HashMap when trying to determine if your key
is equal to any of the keys in the table. Again, the default
Object.equals( ) simply compares object addresses, so one
Groundhog(3) is not equal to another
Groundhog(3).
Thus, to use your own classes as keys in
a HashMap, you must override both hashCode( ) and
equals( ), as shown in the following solution to the problem
above:
//: c08:SpringDetector2.java // If you create a class that's used as a key in // a HashMap, you must override hashCode() // and equals(). import java.util.*; class Groundhog2 { int ghNumber; Groundhog2(int n) { ghNumber = n; } public int hashCode() { return ghNumber; } public boolean equals(Object o) { return (o instanceof Groundhog2) && (ghNumber == ((Groundhog2)o).ghNumber); } } public class SpringDetector2 { public static void main(String[] args) { HashMap ht = new HashMap(); for(int i = 0; i < 10; i++) ht.put(new Groundhog2(i),new Prediction()); System.out.println("ht = " + ht + "\n"); System.out.println( "Looking up prediction for groundhog #3:"); Groundhog2 gh = new Groundhog2(3); if(ht.containsKey(gh)) System.out.println((Prediction)ht.get(gh)); } } ///:~
Note that this uses the Prediction
class from the previous example, so SpringDetector.java must be compiled
first or you’ll get a compile-time error when you try to compile
SpringDetector2.java.
Groundhog2.hashCode( )
returns the ground hog number as an identifier. (In this example, the programmer
is responsible for ensuring that no two ground hogs exist with the same ID
number.) The hashCode( ) is not required to return a unique
identifier, but the equals( ) method must be able to strictly
determine whether two objects are equivalent.
Even though it appears that the
equals( ) method is only checking to see whether the argument is an
instance of Groundhog2 (using the instanceof keyword, which is
fully explained in Chapter 11), the instanceof actually quietly does a
second sanity check, to see if the object is null, since
instanceof produces false if the left-hand argument is
null. Assuming it’s the correct type and not null, the
comparison is based on the actual ghNumbers. This time, when you run the
program, you’ll see it produces the correct output. (Many of the Java
library classes override the hashCode( ) and equals( )
methods to be based upon their contents.)
In the first example in this book, a type
of HashMap was used called Properties. In that example, the
lines:
Properties p = System.getProperties(); p.list(System.out);
called the static method
getProperties( ) to get a special
Properties object that described the system characteristics. The method
list( ) is a method of Properties that sends the contents to
any stream output that you choose. There’s also a save( )
method to allow you to write your property list to a file in a way that it can
be retrieved later with the load( ) method.
Although the
Properties class is inherited from HashMap,
it also contains a second HashMap that acts to hold the list of
“default” properties. So if a property isn’t found in the
primary list, the defaults will be searched.
The Properties class is also
available for use in your programs (an example is ClassScanner.java in
Chapter 16). You can find more complete details in the Java library
documentation.
We can now demonstrate the true power of
the Iterator: the ability to separate the
operation of traversing a sequence from the underlying structure of that
sequence. In the following example, the class PrintData uses an
Iterator to move through a sequence and call the
toString( ) method for every object. Two
different types of collections are created, a
Vector and a
HashMap, and they are each filled with,
respectively, Mouse and Hamster objects. (These classes are
defined earlier in the chapter; notice you must have compiled
HamsterMaze.java and WorksAnyway.java for the following program to
compile.) Because an Iterator hides the structure of the underlying
collection, PrintData doesn’t know or care what kind of collection
the Iterator comes from:
//: c08:Iterators2.java // Revisiting Iterators import java.util.*; class PrintData { static void print(Iterator e) { while(e.hasNext()) System.out.println(e.next()); } } class Enumerators2 { public static void main(String[] args) { ArrayList v = new ArrayList(); for(int i = 0; i < 5; i++) v.add(new Mouse(i)); HashMap h = new HashMap(); for(int i = 0; i < 5; i++) h.put(new Integer(i), new Hamster(i)); System.out.println("ArrayList"); PrintData.print(v.iterator()); System.out.println("HashMap"); PrintData.print(h.entrySet().iterator()); } } ///:~
Note that PrintData.print( )
takes advantage of the fact that the objects in these collections are of class
Object so it can call toString( ). It’s more likely
that in your problem, you must make the assumption that your Iterator is
walking through a collection of some specific type. For example, you might
assume that everything in the collection is a Shape with a
draw( ) method. Then you must downcast from the Object that
Iterator.next() returns to produce a
Shape.
One of the things missing in the Java
1.0 and 1.1 libraries is algorithmic operations, even
simple sorting. So it makes sense to create an ArrayList
that sorts itself using the classic
Quicksort.
A problem with writing generic sorting
code is that sorting must perform comparisons based on the actual type of the
object. Of course, one approach is to write a different sorting method for every
different type, but you should be able to recognize that this does not produce
code that is easily re-used for new types.
A primary goal of programming design is
to “separate things that change from things that stay the same,” and
here, the code that stays the same is the general sort algorithm, but the thing
that changes from one use to the next is the way objects are compared. So
instead of hard-wiring the comparison code into many different sort routines,
the technique of the callback will be used. With a
callback, the part of the code that varies from case to case is encapsulated
inside its own class, and the part of the code that’s always the same will
call back to the code that changes. That way you can make different objects to
express different ways of comparison and feed them to the same sorting
code.
The following interface describes
how to compare two objects, and thus encapsulates “the things that
change” for this particular problem:
//: c08:Compare.java // Interface for sorting callback package c08; interface Compare { boolean lessThan(Object lhs, Object rhs); boolean lessThanOrEqual(Object lhs, Object rhs); } ///:~
For both methods, the lhs
represents the “left hand” object and the rhs represents the
“right hand” object in the comparison.
A subclass of ArrayList can be
created that implements the Quicksort using Compare. The algorithm, which
is known for its speed, will not be explained here. For details, see
Practical Algorithms for Programmers, by Binstock & Rex,
Addison-Wesley 1995.
//: c08:SortList.java // A generic sorting list package c08; import java.util.*; public class SortList extends ArrayList { private Compare compare; // To hold the callback public SortList(Compare comp) { compare = comp; } public void sort() { quickSort(0, size() - 1); } private void quickSort(int left, int right) { if(right > left) { Object o1 = get(right); int i = left - 1; int j = right; while(true) { while(compare.lessThan( get(++i), o1)) ; while(j > 0) if(compare.lessThanOrEqual( get(--j), o1)) break; // out of while if(i >= j) break; swap(i, j); } swap(i , right); quickSort(left, i-1); quickSort(i+1, right); } } private void swap(int loc1, int loc2) { Object tmp = get(loc1); set(loc1, get(loc2)); set(loc2, tmp); } } ///:~
You can now see the reason for the term
“callback,” since the quickSort( ) method “calls
back” to the methods in Compare. You can also see how this
technique has produced generic, reusable code.
To use the SortList, you must
create a class that implements Compare for the kind of objects that
you’re sorting. This is a place where an inner class is not essential, but
it can make sense for code organization. Here’s an example for
String objects:
//: c08:StringSortTest.java // Testing the generic sorting ArrayList package c08; import java.util.*; public class StringSortTest { static class StringCompare implements Compare { public boolean lessThan(Object l, Object r) { return ((String)l).toLowerCase().compareTo( ((String)r).toLowerCase()) < 0; } public boolean lessThanOrEqual(Object l, Object r) { return ((String)l).toLowerCase().compareTo( ((String)r).toLowerCase()) <= 0; } } public static void main(String[] args) { SortList sv = new SortList(new StringCompare()); sv.add("d"); sv.add("A"); sv.add("C"); sv.add("c"); sv.add("b"); sv.add("B"); sv.add("D"); sv.add("a"); sv.sort(); Iterator e = sv.iterator(); while(e.hasNext()) System.out.println(e.next()); } } ///:~
The inner class is static because
it does not need a link to an outer class in order for it to
function.
You can see how, once the framework is
set up, it’s easy to reuse a design like this – you simply write the
class that encapsulates “the things that change” and hand an object
to the SortList.
The comparison forces the strings to
lower case, so that the capital A’s end up next to the small
a’s and not in some entirely different place. This example shows,
however, a slight deficiency in this approach, since the test code above puts
the uppercase and lowercase single letters of the same letter in the order that
they appear: A a b B c C d D. This is not usually much of a problem, because
you’re usually working with longer strings and in that situation the
effect doesn’t show up. (The Java 2 collections
provide sorting functionality that solves this problem.)
Inheritance (extends) is used here
to create a new type of ArrayList – that is, SortList is
an ArrayList with some added functionality. The use of inheritance
here is powerful but it presents problems. It turns out that some methods are
final (described in Chapter 7), so you cannot override them. If you want
to create a sorted ArrayList that accepts and produces only String
objects you run into a wall, since add( ) and
elementAt( ) are final, and these are precisely the methods
you’d need to override so they accept and produce only String
objects. No luck there.
On the other hand, consider
composition: the placing of an object inside a new
class. Rather than rewrite the above code to accomplish this, we can simply use
a SortList inside the new class. In this case, the inner class to
implement the interface Compare will be created
anonymously:
//: c08:StrSortList.java // Automatically sorted ArrayList that // accepts and produces only Strings package c08; import java.util.*; public class StrSortList { private SortList v = new SortList( // Anonymous inner class: new Compare() { public boolean lessThan(Object l, Object r) { return ((String)l).toLowerCase().compareTo( ((String)r).toLowerCase()) < 0; } public boolean lessThanOrEqual(Object l, Object r) { return ((String)l).toLowerCase().compareTo( ((String)r).toLowerCase()) <= 0; } } ); private boolean sorted = false; public void add(String s) { v.add(s); sorted = false; } public String get(int index) { if(!sorted) { v.sort(); sorted = true; } return (String)v.get(index); } public Iterator iterator() { if(!sorted) { v.sort(); sorted = true; } return v.iterator(); } // Test it: public static void main(String[] args) { StrSortList sv = new StrSortList(); sv.add("d"); sv.add("A"); sv.add("C"); sv.add("c"); sv.add("b"); sv.add("B"); sv.add("D"); sv.add("a"); Iterator e = sv.iterator(); while(e.hasNext()) System.out.println(e.next()); } } ///:~
This quickly reuses the code from
SortList to create the desired functionality. However, not all of the
public methods from SortList and ArrayList appear in
StrSortList. When reusing code this way, you can make a definition in the
new class for each one in the contained class, or you can start with just a few
and periodically go back and add more when you need them. Eventually the new
class design will settle down.
The advantage to this approach is that it
will take only String objects and produce only String objects, and
the checking happens at compile time instead of run time. Of course,
that’s only true for add( ) and elementAt( );
elements( ) still produces an Iterator that is untyped at
compile time. Type checking for the Iterator and
in StrSortList still happens, of course, it just happens at run-time by
throwing exceptions if you do something wrong. It’s a trade-off: do you
find out about something for sure at compile time or probably at
run-time? (That is, “probably not while you’re testing the
code” and “probably when the program user tries something you
didn’t test for.”) Given the choices and the hassle, it’s
easier to use inheritance and just grit your teeth while casting – again,
if parameterized types are ever added to Java, they will
solve this problem.
You can see there’s a flag called
sorted in this class. You could sort the ArrayList every time
add( ) is called, and constantly keep it in a sorted state. But
usually people add a lot of elements to a ArrayList before beginning to
read it. So sorting after every add( ) would be less efficient than
waiting until someone wants to read the ArrayList and then sorting it, which is
what is done here. The technique of delaying a process until it is absolutely
necessary is called lazy evaluation. (There is an
analogous technique called lazy initialization which waits until a field
value is necessary before initializing
it.)
To me, collection classes are one of the
most powerful tools for raw programming. You might have gathered that I’m
somewhat disappointed in the collections provided in Java through version 1.1.
As a result, it’s a tremendous pleasure to see that collections were given
proper attention in
Java
2, and thoroughly redesigned (by Joshua Bloch at Sun). I consider the
collections library to be one of the two major features in Java 2 (the other is
the Swing library, covered in Chapter 13) because they significantly increase
your programming muscle and help bring Java in line with more mature programming
systems.
Some of the redesign makes things tighter
and more sensible. For example, many names are shorter, cleaner, and easier to
understand, as well as to type. Some names are changed to conform to accepted
terminology: a particular favorite of mine is “iterator” instead of
“enumeration.”
The redesign also fills out the
functionality of the collections library. You can now have the behavior of
linked lists, queues, and
dequeues (double-ended queues, pronounced
“decks”).
The design of a collections library is
difficult (true of most library design problems). In
C++, the STL covered the bases
with many different classes. This was better than what was available prior to
the STL (nothing), but it didn’t translate well into Java. The result was
a rather confusing morass of classes. On the other extreme, I’ve seen a
collections library that consists of a single class, “collection,”
which acts like a ArrayList and a HashMap at the same time. The
designers of the Java 2 collections library wanted to strike a balance: the full
functionality that you expect from a mature collections library, but easier to
learn and use than the STL and other similar collections libraries. The result
can seem a bit odd in places. Unlike some of the decisions made in the early
Java libraries, these oddities were not accidents, but carefully considered
decisions based on tradeoffs in complexity. It might take you a little while to
get comfortable with some aspects of the library, but I think you’ll find
yourself rapidly acquiring and using these new tools.
The Java 2 Collections library takes the
issue of “holding your objects” and divides it into two distinct
concepts:
Collections and
Maps may be implemented in many different ways, according to your
programming needs. It’s helpful to look at a diagram of the Java 2
Collections:
This diagram can be a bit overwhelming at
first, but throughout the rest of this section you’ll see that there are
really only three collection components: Map, List, and
Set, and only two or three implementations of each one (with, typically,
a preferred version). When you see this, the Java 2 Collections should not seem
so daunting.
The dotted boxes represent
interfaces, the dashed boxes represent abstract classes, and the
solid boxes are regular (concrete) classes. The dotted-line arrows indicate that
a particular class is implementing an interface (or in the case of an
abstract class, partially implementing that interface). The solid
arrows show that a class can produce objects of the class the arrow is pointing
to. For example, any Collection can produce an Iterator, while a
List can produce a ListIterator (as well as an ordinary
Iterator, since List is inherited from
Collection).
The interfaces that are concerned with
holding objects are Collection, List, Set, and
Map. Typically, you’ll write the bulk of your code to talk
to these interfaces, and the only place where you’ll specify the precise
type you’re using is at the point of creation. So you can create a
List like this:
List x = new LinkedList();
Of course, you can also decide to make
x a LinkedList (instead of a generic List) and carry
the precise type information around with x. The beauty (and the intent)
of using the interface is that if you decide you want to change your
implementation, all you need to do is change it at the point of creation, like
this:
List x = new ArrayList();
The rest of your code can remain
untouched.
In the class hierarchy, you can see a
number of classes whose names begin with “Abstract,” and
these can seem a bit confusing at first. They are simply tools that partially
implement a particular interface. If you were making your own Set, for
example, you wouldn’t start with the Set interface and implement
all the methods, instead you’d inherit from
AbstractSet and do the minimal necessary work to
make your new class. However, the Java 2 Collections library contains enough
functionality to satisfy your needs virtually all the time. So for our purposes,
you can ignore any class that begins with
“Abstract.”
Therefore, when you look at the diagram,
you’re really concerned with only those interfaces at the top of
the diagram and the concrete classes (those with solid boxes around them).
You’ll typically make an object of a concrete class, upcast it to the
corresponding interface, and then use the interface throughout the
rest of your code. Here’s a simple example, which fills a
Collection with String objects and then prints each element in the
Collection:
//: c08:newcollections:SimpleCollection.java // A simple example using Java 2 Collections package c08.newcollections; import java.util.*; public class SimpleCollection { public static void main(String[] args) { Collection c = new ArrayList(); // Upcast because we just want to // work with Collection features for(int i = 0; i < 10; i++) c.add(Integer.toString(i)); Iterator it = c.iterator(); while(it.hasNext()) System.out.println(it.next()); } } ///:~
All the code examples for the Java 2
Collections libraries will be placed in the subdirectory newcollections,
so you’ll be reminded that these work only with Java 2. As a result, you
must invoke the program by saying:
java c08.newcollections.SimpleCollection
with a similar syntax for the rest of the
programs in the package.
You can see that the Java 2 Collections
are part of the java.util library, so you don’t need to add any
extra import statements to use them.
The first line in main( )
creates an ArrayList object and then upcasts it to
a Collection. Since this example uses only the Collection methods,
any object of a class inherited from Collection would work, but
ArrayList is the typical workhorse Collection and takes the place
of Vector.
The add( ) method, as its
name suggests, puts a new element in the Collection. However, the
documentation carefully states that add( ) “ensures that this
Collection contains the specified element.” This is to allow for the
meaning of Set, which adds the element only if it isn’t already
there. With an ArrayList, or any sort of List, add( )
always means “put it in.”
All Collections can produce an
Iterator via their
iterator( ) method. An Iterator is
just like an Enumeration, which it replaces,
except:
In
SimpleCollection.java, you can see that an Iterator is created and
used to traverse the Collection, printing each
element.
The following table shows everything you
can do with a Collection, and thus, everything you can do with a
Set or a List. (List also has additional functionality.)
Maps are not inherited from Collection, and will be treated
separately.
Boolean
add(Object) |
*Ensures that the Collection contains the
argument. Returns false if it doesn’t add the argument. |
Boolean
addAll(Collection) |
*Adds all the elements in the argument.
Returns true if any elements were added. |
void
clear( ) |
*Removes all the elements in the
Collection. |
Boolean
contains(Object) |
True if the Collection contains the
argument. |
Boolean
containsAll(Collection) |
True if the Collection contains all the
elements in the argument. |
Boolean
isEmpty( ) |
True if the Collection has no elements.
|
Iterator
iterator( ) |
Returns an Iterator that you can use to
move through the elements in the Collection. |
Boolean
remove(Object) |
*If the argument is in the Collection,
one instance of that element is removed. Returns true if a removal
occurred. |
Boolean
removeAll(Collection) |
*Removes all the elements that are
contained in the argument. Returns true if any removals
occurred. |
Boolean
retainAll(Collection) |
*Retains only elements that are contained
in the argument (an “intersection” from set theory). Returns true if
any changes occurred. |
int size( ) |
Returns the number of elements in the
Collection. |
Object[]
toArray( ) |
Returns an array containing all the
elements in the Collection. |
Object[] toArray(Object[]
a) |
Returns an array containing all the
elements in the Collection, whose type is that of the array a rather than
plain Object (you must cast the array to the right
type). |
|
*This is an “optional”
method, which means it might not be implemented by a particular Collection. If
not, that method throws an UnsupportedOperationException. Exceptions will
be covered in Chapter 9. |
The following example demonstrates all of
these methods. Again, these work with anything that inherits from
Collection; an ArrayList is used as a kind of “least-common
denominator”:
//: c08:newcollections:Collection1.java // Things you can do with all Collections package c08.newcollections; import java.util.*; public class Collection1 { // Fill with 'size' elements, start // counting at 'start': public static Collection fill(Collection c, int start, int size) { for(int i = start; i < start + size; i++) c.add(Integer.toString(i)); return c; } // Default to a "start" of 0: public static Collection fill(Collection c, int size) { return fill(c, 0, size); } // Default to 10 elements: public static Collection fill(Collection c) { return fill(c, 0, 10); } // Create & upcast to Collection: public static Collection newCollection() { return fill(new ArrayList()); // ArrayList is used for simplicity, but it's // only seen as a generic Collection // everywhere else in the program. } // Fill a Collection with a range of values: public static Collection newCollection(int start, int size) { return fill(new ArrayList(), start, size); } // Moving through a List with an iterator: public static void print(Collection c) { for(Iterator x = c.iterator(); x.hasNext();) System.out.print(x.next() + " "); System.out.println(); } public static void main(String[] args) { Collection c = newCollection(); c.add("ten"); c.add("eleven"); print(c); // Make an array from the List: Object[] array = c.toArray(); // Make a String array from the List: String[] str = (String[])c.toArray(new String[1]); // Find max and min elements; this means // different things depending on the way // the Comparable interface is implemented: System.out.println("Collections.max(c) = " + Collections.max(c)); System.out.println("Collections.min(c) = " + Collections.min(c)); // Add a Collection to another Collection c.addAll(newCollection()); print(c); c.remove("3"); // Removes the first one print(c); c.remove("3"); // Removes the second one print(c); // Remove all components that are in the // argument collection: c.removeAll(newCollection()); print(c); c.addAll(newCollection()); print(c); // Is an element in this Collection? System.out.println( "c.contains(\"4\") = " + c.contains("4")); // Is a Collection in this Collection? System.out.println( "c.containsAll(newCollection()) = " + c.containsAll(newCollection())); Collection c2 = newCollection(5, 3); // Keep all the elements that are in both // c and c2 (an intersection of sets): c.retainAll(c2); print(c); // Throw away all the elements in c that // also appear in c2: c.removeAll(c2); System.out.println("c.isEmpty() = " + c.isEmpty()); c = newCollection(); print(c); c.clear(); // Remove all elements System.out.println("after c.clear():"); print(c); } } ///:~
The first methods provide a way to fill
any Collection with test data, in this case just ints converted to
Strings. The second method will be used frequently throughout the rest of
this chapter.
The two versions of
newCollection( ) create ArrayLists containing different sets
of data and return them as Collection objects, so it’s clear that
nothing other than the Collection interface is being
used.
The print( ) method will also
be used throughout the rest of this section. Since it moves through a
Collection using an Iterator, which any Collection can
produce, it will work with Lists and Sets and any
Collection that a Map produces.
main( ) uses simple exercises
to show all of the methods in Collection.
The following sections compare the
various implementations of List, Set, and Map and indicate
in each case (with an asterisk) which one should be your default choice.
You’ll notice that the legacy classes Vector, Stack, and
Hashtable are not included because in all cases there are
preferred classes within the Java 2
Collections.
List (interface) |
Order is the most important feature of a
List; it promises to maintain elements in a particular sequence.
List adds a number of methods to Collection that allow
insertion and removal of elements in the middle of a List. (This is
recommended only for a LinkedList.) A List will produce a
ListIterator, and using this you can traverse the List in both
directions, as well as insert and remove elements in the middle of the list
(again, recommended only for a LinkedList). |
ArrayList* |
A List backed by an array. Use
instead of Vector as a general-purpose object holder. Allows rapid random
access to elements, but is slow when inserting and removing elements from the
middle of a list. ListIterator should be used only for back-and-forth
traversal of an ArrayList, but not for inserting and removing elements,
which is expensive compared to LinkedList. |
LinkedList |
Provides optimal sequential access, with
inexpensive insertions and deletions from the middle of the list. Relatively
slow for random access. (Use ArrayList instead.) Also has
addFirst( ), addLast( ), getFirst( ),
getLast( ), removeFirst( ), and
removeLast( ) (which are not defined in any interfaces or base
classes) to allow it to be used as a stack, a queue, and a
dequeue. |
The methods in the following example each
cover a different group of activities: things that every list can do
(basicTest( )), moving around with an Iterator
(iterMotion( )) versus changing things with an
Iterator (iterManipulation( )), seeing the effects of
List manipulation (testVisual( )), and operations available
only to LinkedLists.
//: c08:newcollections:List1.java // Things you can do with Lists package c08.newcollections; import java.util.*; public class List1 { // Wrap Collection1.fill() for convenience: public static List fill(List a) { return (List)Collection1.fill(a); } // You can use an Iterator, just as with a // Collection, but you can also use random // access with get(): public static void print(List a) { for(int i = 0; i < a.size(); i++) System.out.print(a.get(i) + " "); System.out.println(); } static boolean b; static Object o; static int i; static Iterator it; static ListIterator lit; public static void basicTest(List a) { a.add(1, "x"); // Add at location 1 a.add("x"); // Add at end // Add a collection: a.addAll(fill(new ArrayList())); // Add a collection starting at location 3: a.addAll(3, fill(new ArrayList())); b = a.contains("1"); // Is it in there? // Is the entire collection in there? b = a.containsAll(fill(new ArrayList())); // Lists allow random access, which is cheap // for ArrayList, expensive for LinkedList: o = a.get(1); // Get object at location 1 i = a.indexOf("1"); // Tell index of object b = a.isEmpty(); // Any elements inside? it = a.iterator(); // Ordinary Iterator lit = a.listIterator(); // ListIterator lit = a.listIterator(3); // Start at loc 3 i = a.lastIndexOf("1"); // Last match a.remove(1); // Remove location 1 a.remove("3"); // Remove this object a.set(1, "y"); // Set location 1 to "y" // Keep everything that's in the argument // (the intersection of the two sets): a.retainAll(fill(new ArrayList())); // Remove everything that's in the argument: a.removeAll(fill(new ArrayList())); i = a.size(); // How big is it? a.clear(); // Remove all elements } public static void iterMotion(List a) { ListIterator it = a.listIterator(); b = it.hasNext(); b = it.hasPrevious(); o = it.next(); i = it.nextIndex(); o = it.previous(); i = it.previousIndex(); } public static void iterManipulation(List a) { ListIterator it = a.listIterator(); it.add("47"); // Must move to an element after add(): it.next(); // Remove the element that was just produced: it.remove(); // Must move to an element after remove(): it.next(); // Change the element that was just produced: it.set("47"); } public static void testVisual(List a) { print(a); List b = new ArrayList(); fill(b); System.out.print("b = "); print(b); a.addAll(b); a.addAll(fill(new ArrayList())); print(a); // Insert, remove, and replace elements // using a ListIterator: ListIterator x = a.listIterator(a.size()/2); x.add("one"); print(a); System.out.println(x.next()); x.remove(); System.out.println(x.next()); x.set("47"); print(a); // Traverse the list backwards: x = a.listIterator(a.size()); while(x.hasPrevious()) System.out.print(x.previous() + " "); System.out.println(); System.out.println("testVisual finished"); } // There are some things that only // LinkedLists can do: public static void testLinkedList() { LinkedList ll = new LinkedList(); Collection1.fill(ll, 5); print(ll); // Treat it like a stack, pushing: ll.addFirst("one"); ll.addFirst("two"); print(ll); // Like "peeking" at the top of a stack: System.out.println(ll.getFirst()); // Like popping a stack: System.out.println(ll.removeFirst()); System.out.println(ll.removeFirst()); // Treat it like a queue, pulling elements // off the tail end: System.out.println(ll.removeLast()); // With the above operations, it's a dequeue! print(ll); } public static void main(String args[]) { // Make and fill a new list each time: basicTest(fill(new LinkedList())); basicTest(fill(new ArrayList())); iterMotion(fill(new LinkedList())); iterMotion(fill(new ArrayList())); iterManipulation(fill(new LinkedList())); iterManipulation(fill(new ArrayList())); testVisual(fill(new LinkedList())); testLinkedList(); } } ///:~
In basicTest( ) and
iterMotion( ) the calls are simply made to show the proper syntax,
and while the return value is captured, it is not used. In some cases, the
return value isn’t captured since it isn’t typically used. You
should look up the full usage of each of these methods in your online
documentation before you use them.
Set has exactly the same interface
as Collection, so there isn’t any extra functionality as there is
with the two different Lists. Instead, the Set is exactly a
Collection, it just has different behavior. (This is the ideal use of
inheritance and polymorphism: to express different behavior.) A Set
allows only one instance of each object value to exist (what constitutes the
“value” of an object is more complex, as you shall see).
Set (interface) |
Each element that you add to the
Set must be unique; otherwise the Set doesn’t add the
duplicate element. Objects added to a Set must define
equals( ) to establish object uniqueness. Set has exactly the
same interface as Collection. The Set interface does not guarantee
it will maintain its elements in any particular order. |
HashSet* |
For Sets where fast lookup time is
important. Objects must also define hashCode( ). |
TreeSet |
An ordered Set backed by a
red-black tree. This way, you can extract an ordered sequence from a
Set. |
The following example does not
show everything you can do with a Set, since the interface is the same as
Collection and so was exercised in the previous example. Instead, this
demonstrates the behavior that makes a Set unique:
//: c08:newcollections:Set1.java // Things you can do with Sets package c08.newcollections; import java.util.*; public class Set1 { public static void testVisual(Set a) { Collection1.fill(a); Collection1.fill(a); Collection1.fill(a); Collection1.print(a); // No duplicates! // Add another set to this one: a.addAll(a); a.add("one"); a.add("one"); a.add("one"); Collection1.print(a); // Look something up: System.out.println("a.contains(\"one\"): " + a.contains("one")); } public static void main(String[] args) { testVisual(new HashSet()); testVisual(new TreeSet()); } } ///:~
Duplicate values are added to the
Set, but when it is printed you’ll see the Set has accepted
only one instance of each value.
When you run this program you’ll
notice that the order maintained by the HashSet is different from
TreeSet, since each has a different way of storing elements so they can
be located later. (TreeSet keeps them sorted, while HashSet uses a
hashing function, which is designed specifically for rapid lookups.) When
creating your own types, be aware that a Set needs a way to maintain a
storage order, just as with the “groundhog” examples shown earlier
in this chapter. To implement comparability with the Java 2 Collections,
however, you must implement the Comparable interface and define the
compareTo( ) method (this will be described more fully later).
Here’s an example:
//: c08:newcollections:Set2.java // Putting your own type in a Set package c08.newcollections; import java.util.*; class MyType implements Comparable { private int i; public MyType(int n) { i = n; } public boolean equals(Object o) { return (o instanceof MyType) && (i == ((MyType)o).i); } public int hashCode() { return i; } public String toString() { return i + " "; } public int compareTo(Object o) { int i2 = ((MyType) o).i; return (i2 < i ? -1 : (i2 == i ? 0 : 1)); } } public class Set2 { public static Set fill(Set a, int size) { for(int i = 0; i < size; i++) a.add(new MyType(i)); return a; } public static Set fill(Set a) { return fill(a, 10); } public static void test(Set a) { fill(a); fill(a); // Try to add duplicates fill(a); a.addAll(fill(new TreeSet())); System.out.println(a); } public static void main(String[] args) { test(new HashSet()); test(new TreeSet()); } } ///:~
The definitions for
equals( ) and hashCode( ) follow
the form given in the “groundhog” examples. You must define an
equals( ) in both cases, but the hashCode( ) is
absolutely necessary only if the class will be placed in a HashSet (which
is likely, since that should generally be your first choice as a Set
implementation). However, as a programming style you should always override
hashCode( ) when you override
equals( ).
In the compareTo( ), note
that I did not use the “simple and obvious” form return
i-i2. Although this is a common programming error, it would only work
properly if i and i2 were unsigned ints (if Java had
an “unsigned” keyword, which it does not). It breaks for
Java’s signed int, which is not big enough to represent the
difference of two signed ints. If i is a large positive integer
and j is a large negative integer, i-j will overflow and return a
negative value, which will not work.
Map (interface) |
Maintains key-value associations (pairs),
so you can look up a value using a key. |
HashMap* |
Implementation based on a hash table.
(Use this instead of Hashtable.) Provides constant-time performance for
inserting and locating pairs. Performance can be adjusted via constructors that
allow you to set the capacity and load factor of the hash
table. |
TreeMap |
Implementation based on a red-black tree.
When you view the keys or the pairs, they will be in sorted order (determined by
Comparable or Comparator, discussed later). The point of a
TreeMap is that you get the results in sorted order. TreeMap is
the only Map with the subMap( ) method, which allows you to
return a portion of the tree. |
The following example contains two sets
of test data and a fill( ) method that allows you to fill any map
with any two-dimensional array of Objects. These tools will be used in
other Map examples, as well.
//: c08:newcollections:Map1.java // Things you can do with Maps package c08.newcollections; import java.util.*; public class Map1 { public final static String[][] testData1 = { { "Happy", "Cheerful disposition" }, { "Sleepy", "Prefers dark, quiet places" }, { "Grumpy", "Needs to work on attitude" }, { "Doc", "Fantasizes about advanced degree"}, { "Dopey", "'A' for effort" }, { "Sneezy", "Struggles with allergies" }, { "Bashful", "Needs self-esteem workshop"}, }; public final static String[][] testData2 = { { "Belligerent", "Disruptive influence" }, { "Lazy", "Motivational problems" }, { "Comatose", "Excellent behavior" } }; public static Map fill(Map m, Object[][] o) { for(int i = 0; i < o.length; i++) m.put(o[i][0], o[i][1]); return m; } // Producing a Set of the keys: public static void printKeys(Map m) { System.out.print("Size = " + m.size() +", "); System.out.print("Keys: "); Collection1.print(m.keySet()); } // Producing a Collection of the values: public static void printValues(Map m) { System.out.print("Values: "); Collection1.print(m.values()); } // Iterating through Map.Entry objects (pairs): public static void print(Map m) { Collection entries = m.entrySet(); Iterator it = entries.iterator(); while(it.hasNext()) { Map.Entry e = (Map.Entry)it.next(); System.out.println("Key = " + e.getKey() + ", Value = " + e.getValue()); } } public static void test(Map m) { fill(m, testData1); // Map has 'Set' behavior for keys: fill(m, testData1); printKeys(m); printValues(m); print(m); String key = testData1[4][0]; String value = testData1[4][1]; System.out.println("m.containsKey(\"" + key + "\"): " + m.containsKey(key)); System.out.println("m.get(\"" + key + "\"): " + m.get(key)); System.out.println("m.containsValue(\"" + value + "\"): " + m.containsValue(value)); Map m2 = fill(new TreeMap(), testData2); m.putAll(m2); printKeys(m); m.remove(testData2[0][0]); printKeys(m); m.clear(); System.out.println("m.isEmpty(): " + m.isEmpty()); fill(m, testData1); // Operations on the Set change the Map: m.keySet().removeAll(m.keySet()); System.out.println("m.isEmpty(): " + m.isEmpty()); } public static void main(String args[]) { System.out.println("Testing HashMap"); test(new HashMap()); System.out.println("Testing TreeMap"); test(new TreeMap()); } } ///:~
The printKeys( ),
printValues( ), and print( ) methods are not only useful
utilities, they also demonstrate the production of Collection views of a
Map. The keySet( ) method produces a Set backed by the
keys in the Map; here, it is treated as only a Collection. Similar
treatment is given to values( ), which produces a List
containing all the values in the Map. (Note that keys must be unique,
while values can contain duplicates.) Since these Collections are backed
by the Map, any changes in a Collection will be reflected in the
associated Map.
The print( ) method grabs the
Iterator produced by entries and uses it to print both the key and
value for each pair. The rest of the program provides simple examples of each
Map operation, and tests each type of Map.
When creating your own class to use as a
key in a Map, you must deal with the same issues discussed previously for
Sets.
From the diagram on page Error!
Bookmark not defined. you can see that there are really only three
collection components: Map, List, and Set, and only two or
three implementations of each interface. If you need to use the functionality
offered by a particular interface, how do you decide which particular
implementation to use?
To understand the answer, you must be
aware that each different implementation has its own features, strengths, and
weaknesses. For example, you can see in the diagram that the
“feature” of Hashtable, Vector, and Stack is
that they are legacy classes, so that existing code doesn’t break. On the
other hand, it’s best if you don’t use those for new (Java 2)
code.
The distinction between the other
collections often comes down to what they are ”backed by;” that is,
the data structures that physically implement your desired interface.
This means that, for example, ArrayList, LinkedList, and Vector
(which is roughly equivalent to ArrayList) all implement the
List interface so your program will produce the same results regardless
of the one you use. However, ArrayList (and Vector) is
backed by an array, while the LinkedList is implemented in the usual way
for a doubly-linked list, as individual objects each containing data along with
handles to the previous and next elements in the list. Because of this, if you
want to do many insertions and removals in the middle of a list a
LinkedList is the appropriate choice. (LinkedList also has
additional functionality that is established in
AbstractSequentialList.) If not, an
ArrayList is probably faster.
As another example, a Set can be
implemented as either a TreeSet or a HashSet. A TreeSet is
backed by a TreeMap and is designed to produce a constantly-sorted set.
However, if you’re going to have larger quantities in your Set, the
performance of TreeSet insertions will get slow. When you’re
writing a program that needs a Set, you should choose HashSet by
default, and change to TreeSet when it's more important to have a
constantly-sorted set.
The most convincing way to see the
differences between the implementations of List is with a performance
test. The following code establishes an inner base class to use as a test
framework, then creates an
anonymous inner class for each
different test. Each of these inner classes is called by the test( )
method. This approach allows you to easily add and remove new kinds of
tests.
//: c08:newcollections:ListPerformance.java // Demonstrates performance differences in Lists package c08.newcollections; import java.util.*; public class ListPerformance { private static final int REPS = 100; private abstract static class Tester { String name; int size; // Test quantity Tester(String name, int size) { this.name = name; this.size = size; } abstract void test(List a); } private static Tester[] tests = { new Tester("get", 300) { void test(List a) { for(int i = 0; i < REPS; i++) { for(int j = 0; j < a.size(); j++) a.get(j); } } }, new Tester("iteration", 300) { void test(List a) { for(int i = 0; i < REPS; i++) { Iterator it = a.iterator(); while(it.hasNext()) it.next(); } } }, new Tester("insert", 1000) { void test(List a) { int half = a.size()/2; String s = "test"; ListIterator it = a.listIterator(half); for(int i = 0; i < size * 10; i++) it.add(s); } }, new Tester("remove", 5000) { void test(List a) { ListIterator it = a.listIterator(3); while(it.hasNext()) { it.next(); it.remove(); } } }, }; public static void test(List a) { // A trick to print out the class name: System.out.println("Testing " + a.getClass().getName()); for(int i = 0; i < tests.length; i++) { Collection1.fill(a, tests[i].size); System.out.print(tests[i].name); long t1 = System.currentTimeMillis(); tests[i].test(a); long t2 = System.currentTimeMillis(); System.out.println(": " + (t2 - t1)); } } public static void main(String[] args) { test(new ArrayList()); test(new LinkedList()); } } ///:~
The inner class Tester is
abstract, to provide a base class for the specific tests. It contains a
String to be printed when the test starts, a size parameter to be
used by the test for quantity of elements or repetitions of tests, a constructor
to initialize the fields, and an abstract method test( ) that
does the work. All the different types of tests are collected in one place, the
array tests, which is initialized with different anonymous inner classes
that inherit from Tester. To add or remove tests, simply add or remove an
inner class definition from the array, and everything else happens
automatically.
The List that’s handed to
test( ) is first filled with elements, then each test in the
tests array is timed. The results will vary from machine to machine; they
are intended to give only an order of magnitude comparison between the
performance of the different collections. Here is a summary of one
run:
Type |
Get |
Iteration |
Insert |
Remove |
ArrayList |
110 |
490 |
3790 |
8730 |
LinkedList |
1980 |
220 |
110 |
110 |
You can see that random accesses
(get( )) are cheap for ArrayLists and expensive for
LinkedLists. (Oddly, iteration is faster for a LinkedList
than an ArrayList, which is counter-intuitive.) On the other hand,
insertions and removals from the middle of a list are dramatically cheaper for a
LinkedList than for an ArrayList. The best approach is probably to
choose an ArrayList as your default and to change to a LinkedList
if you discover performance problems because of many insertions and removals
from the middle of the list.
You can choose between an TreeSet
and a HashSet, depending on the size of the Set (if you need
to produce an ordered sequence from a Set, use TreeSet).
The following test program gives an indication of this
tradeoff:
//: c08:newcollections:SetPerformance.java package c08.newcollections; import java.util.*; public class SetPerformance { private static final int REPS = 200; private abstract static class Tester { String name; Tester(String name) { this.name = name; } abstract void test(Set s, int size); } private static Tester[] tests = { new Tester("add") { void test(Set s, int size) { for(int i = 0; i < REPS; i++) { s.clear(); Collection1.fill(s, size); } } }, new Tester("contains") { void test(Set s, int size) { for(int i = 0; i < REPS; i++) for(int j = 0; j < size; j++) s.contains(Integer.toString(j)); } }, new Tester("iteration") { void test(Set s, int size) { for(int i = 0; i < REPS * 10; i++) { Iterator it = s.iterator(); while(it.hasNext()) it.next(); } } }, }; public static void test(Set s, int size) { // A trick to print out the class name: System.out.println("Testing " + s.getClass().getName() + " size " + size); Collection1.fill(s, size); for(int i = 0; i < tests.length; i++) { System.out.print(tests[i].name); long t1 = System.currentTimeMillis(); tests[i].test(s, size); long t2 = System.currentTimeMillis(); System.out.println(": " + ((double)(t2 - t1)/(double)size)); } } public static void main(String[] args) { // Small: test(new TreeSet(), 10); test(new HashSet(), 10); // Medium: test(new TreeSet(), 100); test(new HashSet(), 100); // Large: test(new HashSet(), 1000); test(new TreeSet(), 1000); } } ///:~
The following table shows the results of
one run (using Beta3 software on one particular platform; you should run the
test yourself as well):
Type |
Test size |
Add |
Contains |
Iteration |
|
10 |
22.0 |
11.0 |
16.0 |
TreeSet |
100 |
22.5 |
13.2 |
12.1 |
|
1000 |
31.1 |
18.7 |
11.8 |
|
10 |
5.0 |
6.0 |
27.0 |
HashSet |
100 |
6.6 |
6.6 |
10.9 |
|
1000 |
7.4 |
6.6 |
9.5 |
HashSet is generally superior to
TreeSet for all operations, and the performance is effectively
independent of size.
When choosing between implementations of
Map, the size of the Map is what most strongly affects
performance, and the following test program gives an indication of this
tradeoff:
//: c08:newcollections:MapPerformance.java // Demonstrates performance differences in Maps package c08.newcollections; import java.util.*; public class MapPerformance { private static final int REPS = 200; public static Map fill(Map m, int size) { for(int i = 0; i < size; i++) { String x = Integer.toString(i); m.put(x, x); } return m; } private abstract static class Tester { String name; Tester(String name) { this.name = name; } abstract void test(Map m, int size); } private static Tester[] tests = { new Tester("put") { void test(Map m, int size) { for(int i = 0; i < REPS; i++) { m.clear(); fill(m, size); } } }, new Tester("get") { void test(Map m, int size) { for(int i = 0; i < REPS; i++) for(int j = 0; j < size; j++) m.get(Integer.toString(j)); } }, new Tester("iteration") { void test(Map m, int size) { for(int i = 0; i < REPS * 10; i++) { Iterator it = m.entrySet().iterator(); while(it.hasNext()) it.next(); } } }, }; public static void test(Map m, int size) { // A trick to print out the class name: System.out.println("Testing " + m.getClass().getName() + " size " + size); fill(m, size); for(int i = 0; i < tests.length; i++) { System.out.print(tests[i].name); long t1 = System.currentTimeMillis(); tests[i].test(m, size); long t2 = System.currentTimeMillis(); System.out.println(": " + ((double)(t2 - t1)/(double)size)); } } public static void main(String[] args) { // Small: test(new Hashtable(), 10); test(new HashMap(), 10); test(new TreeMap(), 10); // Medium: test(new Hashtable(), 100); test(new HashMap(), 100); test(new TreeMap(), 100); // Large: test(new HashMap(), 1000); test(new Hashtable(), 1000); test(new TreeMap(), 1000); } } ///:~
Because the size of the map is the issue,
you’ll see that the timing tests divide the time by the size to normalize
each measurement. Here is one set of results. (Yours will probably be
different.)
Type |
Test size |
Put |
Get |
Iteration |
---|---|---|---|---|
|
10 |
11.0 |
5.0 |
44.0 |
Hashtable |
100 |
7.7 |
7.7 |
16.5 |
|
1000 |
8.0 |
8.0 |
14.4 |
|
10 |
16.0 |
11.0 |
22.0 |
TreeMap |
100 |
25.8 |
15.4 |
13.2 |
|
1000 |
33.8 |
20.9 |
13.6 |
|
10 |
11.0 |
6.0 |
33.0 |
HashMap |
100 |
8.2 |
7.7 |
13.7 |
|
1000 |
8.0 |
7.8 |
11.9 |
As you might expect, Hashtable
performance is roughly equivalent to HashMap (you can also see that
HashMap is generally a bit faster. Remember that HashMap is
intended to replace Hashtable). The TreeMap is generally slower
than the HashMap, so why would you use it? So you could use it not as a
Map, but as a way to create an ordered list. The behavior of a
tree is such that it’s always in order and doesn’t have to be
specially sorted. (The way it is ordered will be discussed later.) Once
you fill a TreeMap, you can call
keySet( ) to get a Set view of the
keys, then toArray( ) to produce an array of
those keys. You can then use the static method
Arrays.binarySearch( ) (discussed later) to rapidly find objects in
your sorted array. Of course, you would probably only do this if, for some
reason, the behavior of a HashMap was unacceptable, since HashMap
is designed to rapidly find things. In the end, when you’re using a
Map your first choice should be HashMap, and only if you need a
constantly-sorted Map will you need TreeMap.
There is another performance issue that
the above table does not address, and that is speed of creation. The following
program tests creation speed for different types of Map:
//: c08:newcollections:MapCreation.java // Demonstrates time differences in Map creation package c08.newcollections; import java.util.*; public class MapCreation { public static void main(String[] args) { final long REPS = 100000; long t1 = System.currentTimeMillis(); System.out.print("Hashtable"); for(long i = 0; i < REPS; i++) new Hashtable(); long t2 = System.currentTimeMillis(); System.out.println(": " + (t2 - t1)); t1 = System.currentTimeMillis(); System.out.print("TreeMap"); for(long i = 0; i < REPS; i++) new TreeMap(); t2 = System.currentTimeMillis(); System.out.println(": " + (t2 - t1)); t1 = System.currentTimeMillis(); System.out.print("HashMap"); for(long i = 0; i < REPS; i++) new HashMap(); t2 = System.currentTimeMillis(); System.out.println(": " + (t2 - t1)); } } ///:~
At the time this program was written, the
creation speed of TreeMap was dramatically faster than the other two
types. This, along with the acceptable and consistent put( )
performance of TreeMap, suggests a possible strategy if you’re
creating many Maps, and only later in your program doing many lookups:
Create and fill TreeMaps, and when you start looking things up, convert
the important TreeMaps into HashMaps using the HashMap(Map)
constructor. Again, you should only worry about this sort of thing after
it’s been proven that you have a performance bottleneck. (“First
make it work, then make it fast – if you
must.”)
It’s possible to turn an array into
a List with the static Arrays.asList( )
method:
//: c08:newcollections:Unsupported.java // Sometimes methods defined in the Collection // interfaces don't work! package c08.newcollections; import java.util.*; public class Unsupported { private static String[] s = { "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten", }; static List a = Arrays.asList(s); static List a2 = Arrays.asList( new String[] { s[3], s[4], s[5] }); public static void main(String[] args) { Collection1.print(a); // Iteration System.out.println( "a.contains(" + s[0] + ") = " + a.contains(s[0])); System.out.println( "a.containsAll(a2) = " + a.containsAll(a2)); System.out.println("a.isEmpty() = " + a.isEmpty()); System.out.println( "a.indexOf(" + s[5] + ") = " + a.indexOf(s[5])); // Traverse backwards: ListIterator lit = a.listIterator(a.size()); while(lit.hasPrevious()) System.out.print(lit.previous()); System.out.println(); // Set the elements to different values: for(int i = 0; i < a.size(); i++) a.set(i, "47"); Collection1.print(a); // Compiles, but won't run: lit.add("X"); // Unsupported operation a.clear(); // Unsupported a.add("eleven"); // Unsupported a.addAll(a2); // Unsupported a.retainAll(a2); // Unsupported a.remove(s[0]); // Unsupported a.removeAll(a2); // Unsupported } } ///:~
You’ll discover that only a portion
of the Collection and List interfaces are actually implemented.
The rest of the methods cause the unwelcome appearance of something called an
UnsupportedOperationException. You’ll learn
all about exceptions in the next chapter, but the short story is that the
Collection interface, as well as some of the other
interfaces in the Java 2 Collections library, contain
“optional” methods, which might or might not
be “supported” in the concrete class that implements that
interface. Calling an unsupported method causes an
UnsupportedOperationException to indicate a programming
error.
“What?!?” you say,
incredulous. “The whole point of interfaces and base classes is
that they promise these methods will do something meaningful! This breaks that
promise – it says that not only will calling some methods not
perform a meaningful behavior, they will stop the program! Type safety was just
thrown out the window!” It’s not quite that bad. With a
Collection, List, Set, or Map, the compiler still
restricts you to calling only the methods in that interface, so
it’s not like Smalltalk (in which you can call any method for any object,
and find out only when you run the program whether your call does anything). In
addition, most methods that take a Collection as an argument only read
from that Collection –all the “read” methods of
Collection are not optional.
This approach prevents an explosion of
interfaces in the design. Other designs for collection libraries always seem to
end up with a confusing plethora of interfaces to describe each of the
variations on the main theme and are thus difficult to learn. It’s not
even possible to capture all of the special cases in interfaces, because
someone can always invent a new interface. The “unsupported
operation” approach achieves an important goal of the Java 2 Collections
library: it is simple to learn and use. For this approach to work,
however:
In the example above,
Arrays.asList( ) produces a List that
is backed by a fixed-size array. Therefore it makes sense that the only
supported operations are the ones that don’t change the size of the array.
If, on the other hand, a new interface were required to express this
different kind of behavior (called, perhaps,
“FixedSizeList”), it would throw open the door to complexity
and soon you wouldn’t know where to start when trying to use the
library.
The documentation for a method that takes
a Collection, List, Set, or Map as an argument
should specify which of the optional methods must be implemented. For example,
sorting requires the set( ) and Iterator.set( ) methods
but not add( ) and remove( ).
Java 2 adds utilities to perform sorting
and searching for arrays or Lists. These utilities
are static methods of two new classes:
Arrays for sorting and searching arrays, and
Collections for sorting and searching
Lists.
The Arrays class has an overloaded
sort( ) and
binarySearch( ) for arrays of all the
primitive types, as well as for String and Object. Here’s an
example that shows sorting and searching an array of byte (all the other
primitives look the same) and an array of String:
//: c08:newcollections:Array1.java // Testing the sorting & searching in Arrays package c08.newcollections; import java.util.*; public class Array1 { static Random r = new Random(); static String ssource = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "abcdefghijklmnopqrstuvwxyz"; static char[] src = ssource.toCharArray(); // Create a random String public static String randString(int length) { char[] buf = new char[length]; int rnd; for(int i = 0; i < length; i++) { rnd = Math.abs(r.nextInt()) % src.length; buf[i] = src[rnd]; } return new String(buf); } // Create a random array of Strings: public static String[] randStrings(int length, int size) { String[] s = new String[size]; for(int i = 0; i < size; i++) s[i] = randString(length); return s; } public static void print(byte[] b) { for(int i = 0; i < b.length; i++) System.out.print(b[i] + " "); System.out.println(); } public static void print(String[] s) { for(int i = 0; i < s.length; i++) System.out.print(s[i] + " "); System.out.println(); } public static void main(String[] args) { byte[] b = new byte[15]; r.nextBytes(b); // Fill with random bytes print(b); Arrays.sort(b); print(b); int loc = Arrays.binarySearch(b, b[10]); System.out.println("Location of " + b[10] + " = " + loc); // Test String sort & search: String[] s = randStrings(4, 10); print(s); Arrays.sort(s); print(s); loc = Arrays.binarySearch(s, s[4]); System.out.println("Location of " + s[4] + " = " + loc); } } ///:~
The first part of the class contains
utilities to generate random String objects using an array of characters
from which random letters can be selected. randString( ) returns a
string of any length, and randStrings( ) creates an array of random
Strings, given the length of each String and the desired size of
the array. The two print( ) methods simplify the display of
the sample arrays. In main( ),
Random.nextBytes( ) fills the array
argument with randomly-selected bytes. (There are no corresponding
Random methods to create arrays of the other primitive data types.) Once
you have an array, you can see that it’s only a single method call to
perform a sort( ) or binarySearch( ). There’s an
important warning concerning binarySearch( ): If you do not call
sort( ) before you perform a binarySearch( ),
unpredictable behavior can occur, including infinite loops.
Sorting and searching with Strings
looks the same, but when you run the program you’ll notice something
interesting: the sorting is lexicographic, so uppercase
letters precede lowercase letters in the character set. Thus, all the capital
letters are at the beginning of the list, followed by the lowercase letters, so
‘Z’ precedes ‘a’. It turns out that even telephone books
are typically sorted this way.
What if this isn’t what you want?
For example, the index in this book would not be too useful if you had to look
in two places for everything that begins with ‘A’ or
‘a’.
When you want to sort an array of
Object, there’s a problem. What determines the ordering of two
Objects? Unfortunately, the original Java designers didn’t consider
this an important problem, or it would have been defined in the root class
Object. As a result, ordering must be imposed on Objects from the
outside, and the Java 2 Collections library provides a standard way to do this
(which is almost as good as defining it in Object).
There is a sort( ) for arrays
of Object (and String, of course, is an Object) that
takes a second argument: an object that implements the
Comparator interface (part of the Java 2
Collections library) and performs comparisons with its single
compare( ) method. This method takes the two
objects to be compared as its arguments and returns a negative integer if the
first argument is less than the second, zero if they’re equal, and a
positive integer if the first argument is greater than the second. With this
knowledge, the String portion of the example above can be re-implemented
to perform an alphabetic sort:
//: c08:newcollections:AlphaComp.java // Using Comparator to perform an alphabetic sort package c08.newcollections; import java.util.*; public class AlphaComp implements Comparator { public int compare(Object o1, Object o2) { // Assume it's used only for Strings... String s1 = ((String)o1).toLowerCase(); String s2 = ((String)o2).toLowerCase(); return s1.compareTo(s2); } public static void main(String[] args) { String[] s = Array1.randStrings(4, 10); Array1.print(s); AlphaComp ac = new AlphaComp(); Arrays.sort(s, ac); Array1.print(s); // Must use the Comparator to search, also: int loc = Arrays.binarySearch(s, s[3], ac); System.out.println("Location of " + s[3] + " = " + loc); } } ///:~
By casting to String, the
compare( ) method implicitly tests to ensure that it is used only
with String objects – the run-time system will catch any
discrepancies. After forcing both Strings to lower case, the
String.compareTo( ) method produces the desired
results.
When you use your own Comparator
to perform a sort( ), you must use that same Comparator when
using binarySearch( ).
The Arrays class has another
sort( ) method that takes a single argument: an array of
Object, but with no Comparator. This sort( ) method
must also have some way to compare two Objects. It uses the
natural comparison method
that is imparted to a class by implementing the
Comparable interface. This interface
has a single method, compareTo( ), which
compares the object to its argument and returns negative, zero, or positive
depending on whether it is less than, equal to, or greater than the argument. A
simple example demonstrates this:
//: c08:newcollections:CompClass.java // A class that implements Comparable package c08.newcollections; import java.util.*; public class CompClass implements Comparable { private int i; public CompClass(int ii) { i = ii; } public int compareTo(Object o) { // Implicitly tests for correct type: int argi = ((CompClass)o).i; if(i == argi) return 0; if(i < argi) return -1; return 1; } public static void print(Object[] a) { for(int i = 0; i < a.length; i++) System.out.print(a[i] + " "); System.out.println(); } public String toString() { return i + ""; } public static void main(String[] args) { CompClass[] a = new CompClass[20]; for(int i = 0; i < a.length; i++) a[i] = new CompClass( (int)(Math.random() *100)); print(a); Arrays.sort(a); print(a); int loc = Arrays.binarySearch(a, a[3]); System.out.println("Location of " + a[3] + " = " + loc); } } ///:~
Of course, your compareTo( )
method can be as complex as necessary.
A List can
be sorted and searched in the same fashion as an array. The static
methods to sort and search a List are contained in the class
Collections, but they have similar signatures as the ones in
Arrays: sort(List) to sort a List of objects that implement
Comparable, binarySearch(List, Object) to find an object in
the list, sort(List, Comparator) to sort a List using a
Comparator, and binarySearch(List, Object, Comparator) to find an
object in that
list.[44] This
example uses the previously-defined CompClass and AlphaComp to
demonstrate the sorting tools in
Collections:
//: c08:newcollections:ListSort.java // Sorting and searching Lists with 'Collections' package c08.newcollections; import java.util.*; public class ListSort { public static void main(String[] args) { final int SZ = 20; // Using "natural comparison method": List a = new ArrayList(); for(int i = 0; i < SZ; i++) a.add(new CompClass( (int)(Math.random() *100))); Collection1.print(a); Collections.sort(a); Collection1.print(a); Object find = a.get(SZ/2); int loc = Collections.binarySearch(a, find); System.out.println("Location of " + find + " = " + loc); // Using a Comparator: List b = new ArrayList(); for(int i = 0; i < SZ; i++) b.add(Array1.randString(4)); Collection1.print(b); AlphaComp ac = new AlphaComp(); Collections.sort(b, ac); Collection1.print(b); find = b.get(SZ/2); // Must use the Comparator to search, also: loc = Collections.binarySearch(b, find, ac); System.out.println("Location of " + find + " = " + loc); } } ///:~
The use of these methods is identical to
the ones in Arrays, but you’re using a List instead of an
array.
enumeration(Collection)
|
Produces an old-style Enumeration
for the argument. |
max(Collection) min(Collection) |
Produces the maximum or minimum element
in the argument using the natural comparison method of the objects in the
Collection. |
max(Collection, Comparator)
min(Collection,
Comparator) |
Produces the maximum or minimum element
in the Collection using the Comparator. |
nCopies(int n, Object o)
|
Returns an immutable List of size
n whose handles all point to o. |
subList(List, int min, int max)
|
Returns a new List backed by the
specified argument List that is a window into that argument with indexes
starting at min and stopping just before max. |
Note that min( ) and
max( ) work with Collection objects, not with Lists,
so you don’t need to worry about whether the Collection should be
sorted or not. (As mentioned earlier, you do need to sort( )
a List or an array before performing a
binarySearch( ).)
Often it is convenient to create a
read-only version of a Collection or Map. The Collections
class allows you to do this by passing the original container into a method that
hands back a read-only version. There are four variations on this method, one
each for Collection (if you don’t want to treat a Collection
as a more specific type), List, Set, and Map. This
example shows the proper way to build read-only versions of
each:
//: c08:newcollections:ReadOnly.java // Using the Collections.unmodifiable methods package c08.newcollections; import java.util.*; public class ReadOnly { public static void main(String[] args) { Collection c = new ArrayList(); Collection1.fill(c); // Insert useful data c = Collections.unmodifiableCollection(c); Collection1.print(c); // Reading is OK //! c.add("one"); // Can't change it List a = new ArrayList(); Collection1.fill(a); a = Collections.unmodifiableList(a); ListIterator lit = a.listIterator(); System.out.println(lit.next()); // Reading OK //! lit.add("one"); // Can't change it Set s = new HashSet(); Collection1.fill(s); s = Collections.unmodifiableSet(s); Collection1.print(s); // Reading OK //! s.add("one"); // Can't change it Map m = new HashMap(); Map1.fill(m, Map1.testData1); m = Collections.unmodifiableMap(m); Map1.print(m); // Reading OK //! m.put("Ralph", "Howdy!"); } } ///:~
In each case, you must fill the container
with meaningful data before you make it read-only. Once it is loaded, the
best approach is to replace the existing handle with the handle that is produced
by the “unmodifiable” call. That way, you
don’t run the risk of accidentally changing the contents once you’ve
made it unmodifiable. On the other hand, this tool also allows you to keep a
modifiable container as private within a class and to return a read-only
handle to that container from a method call. So you can change it from within
the class but everyone else can only read it.
Calling the “unmodifiable”
method for a particular type does not cause compile-time checking, but once the
transformation has occurred, any calls to methods that modify the contents of a
particular container will produce an
UnsupportedOperationException.
The
synchronized keyword is an important part of the
subject of multithreading, a more complicated
topic that will not be introduced until Chapter 14. Here, I shall note only that
the Collections class contains a way to automatically synchronize an
entire container. The syntax is similar to the “unmodifiable”
methods:
//: c08:newcollections:Synchronization.java // Using the Collections.synchronized methods package c08.newcollections; import java.util.*; public class Synchronization { public static void main(String[] args) { Collection c = Collections.synchronizedCollection( new ArrayList()); List list = Collections.synchronizedList( new ArrayList()); Set s = Collections.synchronizedSet( new HashSet()); Map m = Collections.synchronizedMap( new HashMap()); } } ///:~
In this case, you immediately pass the
new container through the appropriate “synchronized” method; that
way there’s no chance of accidentally exposing the unsynchronized
version.
The Java 2 Collections also have a
mechanism to prevent more than one process from modifying the contents of a
container. The problem occurs if you’re iterating through a container and
some other process steps in and inserts, removes, or changes an object in that
container. Maybe you’ve already passed that object, maybe it’s ahead
of you, maybe the size of the container shrinks after you call
size( ) – there are many scenarios for disaster. The Java 2
Collections library incorporates a fail fast
mechanism that looks for any changes to the container other than the ones your
process is personally responsible for. If it detects that someone else is
modifying the container, it immediately produces a
ConcurrentModificationException. This is the
“fail-fast” aspect – it doesn’t try to detect a problem
later on using a more complex
algorithm.
To review the collections provided in the
standard Java (1.0 and 1.1) library (BitSet is not included here since
it’s more of a special-purpose class):
If you’re
familiar with data structures, you might wonder why there’s not a larger
set of collections. From a functionality standpoint, do you really need a
larger set of collections? With a Hashtable, you can put things in and
find them quickly, and with an Enumeration, you can iterate through the
sequence and perform an operation on every element in the sequence. That’s
a powerful tool, and maybe it should be enough.
But a Hashtable has no concept of
order. Vectors and arrays give you a linear order, but it’s
expensive to insert an element into the middle of either one. In addition,
queues, dequeues, priority queues, and trees are about ordering the
elements, not just putting them in and later finding them or moving through them
linearly. These data structures are also useful, and that’s why they were
included in Standard C++. For this reason, you should consider the collections
in the standard Java library only as a starting point, and, if you must use Java
1.0 or 1.1, use the JGL when your needs go beyond that.
If you can use Java 2 you should use only
the Java 2 Collections, which are likely to satisfy all your needs. Note that
the bulk of this book was created using Java 1.1, so you’ll see that the
collections used through the rest of the book are the ones that are available
only in Java 1.1: Vector and Hashtable. This is a somewhat painful
restriction at times, but it provides better backward compatibility with older
Java code. If you’re writing new code in Java 2, the Java 2 Collections
will serve you much
better.
[41]
This is one of the places where C++ is distinctly superior to Java, since C++
supports parameterized types with the template
keyword.
[42]
If these speedups still don’t meet your performance needs, you can further
accelerate table lookup by writing your own hash table routine. This avoids
delays due to casting to and from Objects and synchronization built into
the Java Class Library hash table routine. To reach even higher levels of
performance, speed enthusiasts can use Donald Knuth’s The Art of
Computer Programming, Volume 3: Sorting and Searching, Second Edition to
replace overflow bucket lists with arrays that have two additional benefits:
they can be optimized for disk storage characteristics and they can save most of
the time of creating and garbage collecting individual
records.
[43]
The best reference I know of is Practical Algorithms for Programmers, by
Andrew Binstock and John Rex, Addison-Wesley 1995.
[44]
At the time of this writing,
Collections.sort( ) has been modified to use a stable sort
algorithm
(one that does not reorder equal
elements).