Thursday, August 30, 2012

Java Oddities (Part I)

There's a famous lightning talk given by Gary Bernhardt about JavaScript and Ruby oddities.
I would like to start a series of blog posts documenting some oddities in the Java language for fun! I'll explain why or where these oddities come from with reference to the Java Language Specification when possible. I hope you learn some new things. Feel free to email or tweet me if you would like to add to the list.

Array Declarations

Java programmers can declare array variables in two ways:

int[] a;
int b[]; // allowed to make C/C++ people happy

However, the grammar doesn't enforce a particular style for arrays of dimensions greater than one. The [] may appear as part of the type, or as part of the declarator for a particular variable, or both. The following declarations are therefore valid:

int[][] c;
int d[][];
int[] e[]; // :(
int[][] f[]; // :(

This mixed notation is discouraged by the Java Language Specification (Array Variables) because it can lead to confusion, and it is reported by code convention tools such as Checkstyle.
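To make the potential for confusion concrete, here is a minimal sketch. The class name MixedArrayDeclarations and the dimensions helper are made up for illustration; the sketch only counts the dimensions of each variable at runtime. Note how a single declaration can introduce variables of different types.

```java
final class MixedArrayDeclarations {
    // Hypothetical helper: counts array dimensions by unwrapping component types.
    static int dimensions(Object array) {
        int n = 0;
        for (Class<?> t = array.getClass(); t.isArray(); t = t.getComponentType()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        int[] e[] = new int[2][3];       // same type as int[][]
        int[][] f[] = new int[2][3][4];  // same type as int[][][]
        int[] row, grid[];               // row is int[], but grid is int[][]
        row = new int[3];
        grid = new int[2][3];

        System.out.println(dimensions(e));    // 2
        System.out.println(dimensions(f));    // 3
        System.out.println(dimensions(row));  // 1
        System.out.println(dimensions(grid)); // 2
    }
}
```

The brackets on the type apply to every variable in the declaration, while the brackets on a declarator apply to that variable only, which is why row and grid end up with different types.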
This can be taken to the extreme. The following method signature in a class or interface declaration is accepted by the standard javac parser:

public abstract int[] foo(int[] arg)[][][][][][][][][][][];

The return type of the method foo is int[][][][][][][][][][][][].
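A shorter variant of the same idea can be checked directly. In this sketch the class TrailingBracketsDemo and the method wrap are hypothetical names, not taken from the post; the point is only that the brackets after the parameter list are appended to the declared return type.

```java
final class TrailingBracketsDemo {
    // One pair of brackets comes from int[], two more from the suffix,
    // so the effective return type is int[][][].
    static int[] wrap(int[] arg)[][] {
        return new int[][][] { { arg } };
    }

    public static void main(String[] args) {
        // This assignment only compiles because wrap returns int[][][].
        int[][][] result = wrap(new int[] { 1, 2, 3 });
        System.out.println(result[0][0][2]);                   // 3
        System.out.println(result.getClass().getSimpleName()); // int[][][]
    }
}
```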

In fact, the grammar of ClassBodyDeclaration is defined as follows:

ClassBodyDeclaration = 
   .. | TypeParameters (Type | VOID) Ident MethodDeclaratorRest | ..

MethodDeclaratorRest = 
    FormalParameters BracketsOpt [Throws TypeList] ( MethodBody | [DEFAULT AnnotationValue] ";")

BracketsOpt = {"[" "]"}

The BracketsOpt rule allows a sequence of [] to be inserted after the formal parameters definition.
The relevant lines within com.sun.tools.javac.parser.JavacParser start at 2938.

Array Covariance

Java arrays are covariant: given a type S that is a subtype of a type T, S[] is considered a subtype of T[]. This property is described in the Java Language Specification (Subtyping among Array Types). It is known to lead to ArrayStoreException at runtime, as documented in the Java Language Specification (Array Store Exception). For example:

Object[] o = new String[4];
o[0] = new Object(); // compiles, but throws ArrayStoreException at runtime
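The same behaviour as a runnable sketch, with the exception caught so the outcome can be observed. The class ArrayStoreDemo and the helper storeFails are illustrative names only.

```java
final class ArrayStoreDemo {
    // Hypothetical helper: tries to store value in slot 0 and reports
    // whether the runtime element-type check rejected it.
    static boolean storeFails(Object[] target, Object value) {
        try {
            target[0] = value;
            return false;
        } catch (ArrayStoreException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] o = new String[4]; // allowed because arrays are covariant

        System.out.println(storeFails(o, new Object())); // true: an Object is not a String
        System.out.println(storeFails(o, "hello"));      // false: a String fits
    }
}
```

The static type Object[] lets the assignment compile, but the array remembers its real element type (String) and checks every store against it.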

Arrays were made covariant because, before the introduction of generics, this allowed library designers to write generic code (without type safety). For example, one could write a method findItems as follows:

public boolean findItems(Object[] array, Object item)
{
    ...
}

This method will accept arguments such as (String[], String) or (Integer[], Integer) and in a sense reduces code duplication since you don't need to write several methods specific to the types of the arguments. However, there is no contract between the element type of the array that is passed and the type of the item that needs to be found.

Nowadays one can use generic methods (making use of a type parameter) to achieve the same reuse with additional type safety:

public <T> boolean findItems(T[] array, T item)
{
    ...
}
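The post leaves the method body out, so the following is only one possible way to fill it in: a linear search that compares elements with Objects.equals. The class name FindItemsSketch and the body itself are assumptions made for illustration.

```java
import java.util.Objects;

final class FindItemsSketch {
    // Assumed body: the original post elides the implementation.
    static <T> boolean findItems(T[] array, T item) {
        for (T element : array) {
            if (Objects.equals(element, item)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = { "ada", "grace", "linus" };
        Integer[] numbers = { 1, 2, 3 };

        System.out.println(findItems(names, "grace")); // true
        System.out.println(findItems(names, "alan"));  // false
        System.out.println(findItems(numbers, 3));     // true
    }
}
```

One caveat: when the argument types differ, type inference may still pick a common supertype for T, so the check is strictest when the type argument is given explicitly, as in FindItemsSketch.<String>findItems(names, "grace").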

Integer Caching

int a = 1000, b = 1000;  
System.out.println(a == b); // true
Integer c = 1000, d = 1000;  
System.out.println(c == d); // false
Integer e = 100, f = 100;  
System.out.println(e == f); // true

This behaviour is documented in the Java Language Specification (Boxing Conversion):

If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.

For those curious, you can look up the implementation of Integer.valueOf(int), which confirms the specification:

public static Integer valueOf(int i) {
    assert IntegerCache.high >= 127;
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}
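In practice this means boxed integers should be compared by value rather than with ==. The sketch below uses a made-up class IntegerCacheDemo and helper sameValue. It only prints results that hold regardless of the cache size, because the upper bound of the cache may be larger than 127 in some JVM configurations.

```java
final class IntegerCacheDemo {
    // Hypothetical helper: compares boxed values by value, not by reference.
    static boolean sameValue(Integer x, Integer y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;
        Integer big1 = 1000, big2 = 1000;

        // Inside the range guaranteed by the specification: same cached object.
        System.out.println(small1 == small2);                   // true

        // Outside that range, == compares references and depends on the
        // cache size, so compare by value instead.
        System.out.println(sameValue(big1, big2));              // true
        System.out.println(big1.intValue() == big2.intValue()); // true
    }
}
```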

5 comments:

  1. Even generics are not safe, since they are covariant as well, and can be freely typecast away (think C-style void* casting) due to erasure:

    ArrayList<String> myList = new ArrayList<String>();
    ArrayList<Object> castList = (ArrayList<Object>)myList; // the road to hell is paved with good intentions...
    ArrayList<Integer> brokenList = (ArrayList<Integer>)castList; // oh dear, here we go...
    brokenList.add(42); // KABOOM!

    (btw, is there any way to type angle brackets in here without "Your html tag is not allowed" errors or resorting to character entities?)

    Replies
    1. Generics are actually invariant in Java, i.e., given a parametric class C<E> and a type S that is a subtype of T, there is no relation between C<S> and C<T>. This means that the following line is a compile error in Java:

      ArrayList<Object> myList = new ArrayList<String>(); // compile error

      Type casts are a different story. They are unsafe by their nature as you basically take over the type system.

    2. No, generics /are/ safe (in the absence of an unchecked warning). The cast doesn't change that; the second line of your example doesn't compile.

  2. Nice post! I'm a DZone.com curator and I'd love to feature this on Javalobby if you're interested. Shoot me an email at mpron[at]dzone[dot]com

  3. Semi mid-level Java developer here; great post for me.
