This course has already ended.

Java generics, part 2

This section discusses some finer details of how Java generics is actually implemented. The mechanism is quite different and more restricted than e.g. C++ generics.

The following example generic class was shown before:

public class MyGenericClass<A, B> {
  private A valA;
  private B valB;

  MyGenericClass(A valA, B valB) {
    this.valA = valA;
    this.valB = valB;
  }

  public A getValA() {
    return valA;
  }

  public B getValB() {
    return valB;
  }
}

We previously noted that it would be a crude simplification to think e.g. the type MyGenericClass<String, Double> would really mean the following class:

public class MyGenericClass {
  private String valA;
  private Double valB;

  MyGenericClass(String valA, Double valB) {
    this.valA = valA;
    this.valB = valB;
  }

  public String getValA() {
    return valA;
  }

  public Double getValB() {
    return valB;
  }
}

This interpretation indeed does hold to some extent, but only in the early phase of code compilation. The class MyGenericClass will eventually have the following structure:

public class MyGenericClass {
  private Object valA;
  private Object valB;

  MyGenericClass(Object valA, Object valB) {
    this.valA = valA;
    this.valB = valB;
  }

  public Object getValA() {
    return valA;
  }

  public Object getValB() {
    return valB;
  }
}

You may notice that all occurrences of a type parameter have been replaced by Object! This is caused by the fact that Java “generics” is actually not much more than a code syntax that allows the compiler to verify that a generic class or function is used only with the expected types. Once this check is complete, the compiler performs “type erasure”: the generic class or function is transformed into a “raw” form that uses Object in place of each type parameter. The raw type of a generic class can be referred to by omitting the parameter list. E.g. the raw type of ArrayList<E> is ArrayList. Raw types should not be used explicitly in code; the Java specification even states that their explicit use might be forbidden in the future.

A generic class will in the end be represented by its corresponding raw type. E.g. both ArrayList<String> and ArrayList<Double> will eventually share the same raw type ArrayList. The notion that these two would truly be two different types, one for storing strings and the other for numbers, is just a compile time illusion (but still useful as such).

Type parameters can in many cases be replaced by Object without problems. After all Object is the supertype of all reference types, so e.g. all function parameters and member variables in the class MyGenericClass can be handled using Object references. There are also some significant restrictions, some of which will be discussed soon.

Consider the following code snippet that uses MyGenericClass:

MyGenericClass<String, Double> mgc1 = new MyGenericClass<>("data", 2.5);
String s = mgc1.getValA();
Double d = mgc1.getValB();
MyGenericClass<Integer, Character> mgc2 = new MyGenericClass<>(7, 'Y');
int i = mgc2.getValA();
char c = mgc2.getValB();

A Java compiler would first perform type checks: are the operations involving mgc1 and mgc1 legal with respect to their concrete type parameters? For example attempting to set int i = mgc1.getValB(); would lead into a compiler error since the a concrete type parameter has declared the type of mgc1.valB as String. The compiler would then transform MyGenericClass into a raw type and add possibly add some necessary type conversions between the concrete type parameters and Object. The end result would roughly correspond to:

MyGenericClass mgc = new MyGenericClass("data", 2.5); // Ok: parameters can be taken as Object's.
String s = (String) mgc.getValA();                    // Type cast Object -> String.
Double d = (Double) mgc.getValB();                    // Type cast Object -> Double.
MyGenericClass mgc2 = new MyGenericClass(7, 'Y');     // Ok: parameters can be taken as Object's.
int i = mgc2.getValA();                               // Type cast Object -> Integer -> int.
char c = mgc2.getValB();                              // Type cast Object -> Character -> char.

This example illustrates how simple operations related to passing parameters are not affected by type erasure. But there are restrictions to what kinf of operations can be done inside a generic class (or a function).

Consider next the following simple generic (and illegal!) class:

public class IllegalGenerics<T> {
  public T val = new T();
}

Why is this class illegal? Consider the following code snippet:

IllegalGenerics<Date> ig1 = new IllegalGenerics<>();
IllegalGenerics<String> ig2 = new IllegalGenerics<>();
Date d = ig1.val;
String s = ig2.val;

The code looks legal at first glance: we could imagine that constructing the IllegalGenerics<Date> object ig1 initalizes Date val = new Date() and constructing the IllegalGenerics<String> object ig2 initializes String val = new String(). Thus we might expect (assuming Date is java.util.Date) that ig1 becomes a Date object representing the current time and ig2 becomes an empty string. But the preceding initializations are impossible because the generic class IllegalGenerics will in effect be a raw type, similar to shown below, during runtime (when the initializations actually take place):

public class IllegalGenerics {
  public Object val = new Object();
}

Such a raw type that would in essence need to initialize its member as a general Object, which clearly cannot lead into creating a Date object at some point and a String object in another. A generic class does not know what kind of type parameters have originally been used with it in the original source code. To avoid such problems, Java does not allow to use the new operator to create objects whose type depends on a type paratemer, and therefore the IllegalGenerics class would fail to compile.

The preceding limitation applies also to arrays: Java does not allow creating arrays whose item type depends on a type parameter. For example an attempt to create new T[10], where T is a type parameter, would give a compiler error. This might feel surprising since the corresponding raw array new Object[10] could be used for storing any type of objects, including type T (what ever it actually would be). The problem here is that Java arrays know their own item type and check during runtime that incompatible items are not inserted into them. This kind of type checks would become futile as a raw array would have no knowledge about its intended item type and would thus be unable to ensure that only items of type T are stored into it. If you need to store generic items into an array, you need to explicitly use an Object array and use type casts Object T when reading values from the array as type T objects. Such type casts would work without errors if we only store items of type T in the array.

Below is an example generic class that stores items in an Object array. This also serves as an example of how an simple array-based generic container could be implemented.

public class GenericArray<E> implements Iterable<E> {  // An iterable type.
  private Object[] vals;         // An Object array that will store objects of type E.
  private int size;              // Array size.

  public GenericArray(int size) {
    vals = new Object[size];
    this.size = size;
  }

  @SuppressWarnings("unchecked")
  public E get(int i) {
    return (E) vals[i];  // A type cast that converts the value into type E. We assume that this
  }                      // is always legal (that the array truly contains only E objects).

  public void set(int i, E val) {
    vals[i] = val;
  }

  public int size() {
    return size;
  }

  // The remaining part concerns implementing the interface Iterable<E>.
  public Iterator<E> iterator() {
    return new GAIterator();
  }

  private class GAIterator implements Iterator<E> {
    private int pos = 0;        // The current iterator index (wrt. the array "vals").

    @Override
    public boolean hasNext() {
      return pos < size;        // Is the current index still within the array bounds?
    }

    @Override
    @SuppressWarnings("unchecked")
    public E next() {
      if(pos >= size) {         // Throw an exception if next item does not exist.
         throw new NoSuchElementException("No more values in the array!");
      }
      return (E) vals[pos++];   // Return the item at current iterator index in the vals
    }                           // array. Also increment the index (move iterator forward).
  }
}

The example uses the annotation @SuppressWarnings("unchecked") in front of the functions get and next. This tells to the compiler that “trust me; I know what I’m doing: do not warn me about unchecked type conversions done in this function”. The code would compile without these annotations, but the compiler would output a warning “Note: GenericArray.java uses unchecked or unsafe operations.”. The warning is related to type conversions that involve a generic type E, for example (E) vals[i]. Java usually checks the legality of type conversions during runtime. This would be impossible in case of generic types since the original type E is no longer known during runtime. Therefore this type of type conversions will not be checked during runtime; the responsibility lies on us to truly only handle items that are compatible with the original type E. If we fail to do this, the program might behave in an undefined manner (a runtime check that notices an illegal type conversion would at least produce a sensible exception).

This simple GenericArray container could be used e.g. as follows:

GenericArray<String> sa = new GenericArray<>(2);
GenericArray<Double> da = new GenericArray<>(3);
sa.set(0, "abc");
da.set(1, 3.14);
System.out.format("Stored string \"%s\" and double %.2f%n", sa.get(0), da.get(1));
for(String s : sa) {       // Implements Iterable, so the iterating for loop can be used.
  System.out.print("\"" + s + "\" ");
}
System.out.println();
for(Double d : da) {       // Implements Iterable, so the iterating for loop can be used.
  System.out.print(d + " ");
}

The code snippet would ouput roughly:

Stored string "abc" and double 3.14
"abc" "null"
null 3.14 null

For example Java’s ArrayList<E> has been implemented in a somewhat similar manner as GenericArray<E>: also it stores items into an Object array and uses @SuppressWarnings("unchecked") annotations to mute warnings related to casting Object E.

Type parameter wildcards <?>, <? extends E> and <? super E>

If you browse e.g. the documentation of the generic interface List<E>, you will notice e.g. the following member functions:

  • addAll(Collection<? extends E> c)

  • removeAll(Collection<?> c)

  • sort(Comparator<? super E> c)

These might make you wonder what the perhaps strange looking type parameter lists <?>, <? extends E> and <? super E> mean. These are type parameter wildcards that allow a type parameter declaration to match a variety of types (instead of e.g. matching only the main type parameter E as such).

  • <?> matches any type. The function removeAll(Collection<?> c) accepts a Collection container storing any type of items. The item type may e.g. be completely unrelated with the type parameter E of List<E>.

    • This freedom is feasible in removeAll: it revomes items that the function equals determines to be equal, and all objects are guaranteed to have at least the default equals function inherited from Object.

  • <? extends E> matches type E or its any subtype. The function addAll(Collection<? extends E> c) thus accepts as a parameter any Collection container whose item type is E or its subtype.

    • It is sensible for addAll to accept also subtypes of E because subtypes are by definition compatible with their supertype (and can here be treated as objects of type E).

    • One further possible motive for using a wildcard like this is that it allows also to restrict the type parameter to be compatible with some concrete type. E.g. <? extends SomeClass> would accept a type parameter that is SomeClass or its subtybe, and this allows to refer to all members of SomeClass via an object whose type is defined by this type parameter. If we use a normal type parameter E instead, we can only refer to members of the guaranteed superclass Object via an object of type E.

  • <? super E> matches type E or its any supertype. The function sort(Comparator<? super E> c) thus accepts any Comparator object that can compare objects of type E or its supertype.

    • This is feasible in terms of the sort function: since objects of type E can be handled also as objects of any supertype of E, the comparison function is able to compare objects of type E, too.

<? extends Number> is one example of using a concrete type with a type parameter wildcard. It would match the class Number and its subtypes, such as e.g. Double, Float and Integer.

The above described three functions already gave an idea of how type parameter wildcards could be useful. Here it is good to further note that wildcards are necessary if we want to match a type that depends on a type parameter that should be allowed to vary. E.g. if a function needs to accept any kind of ArrayList as a parameter, the parameter type must be ArrayList<?>. For example ArrayList<Object> would not work because it would literally accept only the type ArrayList<Object> and not e.g. ArrayList<Integer>. Here we cannot reason that since Object is the supertype of all types, a container storing any kind of objects should be compatible with a container storing Object objects. E.g. the attempt to initialize ArrayList<Object> ol = new ArrayList<Integer>() would be illegal.

The preceding is illegal in order to prevent the following scenario:

// An illegal example that would fail to compile.
ArrayList<Integer> ia = new ArrayList<>();
ArrayList<Object> oa = ia;  // Ok, if ArrayList<Object> compatible with ArrayList<Integer>.
oa.add("I am not an Integer!");  // This is legal as a String is an Object.

Here we would have been able to insert a String object into a list that is supposed to store Integer objects. But the code does not compile because the compiler does not allow the assignment oa = ia due to the incompatibility of ArrayList<Object> and ArrayList<Integer>.

Java is slightly awkward in the sense that arrays behave differently: the array type Object[] is compatible with all reference arrays! A function that wants to accept any type of an array (as long as it contains reference type items) could take the array as Object[]. Furthermore e.g. the initialization Object[] oa = new Integer[5] is legal. Therefore an array variant of the preceding illegal example code would compile without errors:

// An example that compiles (but leads into a runtime error).
Integer[] ia = new Integer[2];
Object[] oa = ia;   // This is legal: Object[] and Integer[] are compatible.
oa[0] = "I am not an Integer!";  // This is legal as a String is an Object.

Here we seemed to be truly able to insert a String object into an Integer array. This code would however cause a runtime error (exception) because Java arrays know their own item type and check during runtime that inserted items truly are compatible. A similar check would be impossible with generic containers as they do not know their intended item type anymore during runtime. This difference in part explains why array compatibility and container compatibility are determined differently.

Posting submission...