Saturday, April 3, 2010

Testing Identity

A quick review of hashCode(). The commonly stated rules are:
1. hashCode() must be overridden when equals() is overridden
2. objects that are equal according to equals() must return the same hash code

We begin with the Car class, which relies on the default equality inherited from the Object class, where the equals() method is simply a reference comparison.
package com.cars;
public class Car {
 private int vin; //Vehicle ID Number
 public Car(int vin) {
  this.vin = vin;
 }
 public int getVin() { 
  return vin;
 }
}
We test this using Java assertions (a review of Java assertions never hurts).
package com.cars;
public class TestingIdentity {  
 public static void main(String[] args) {
  Car car1 = new Car(1);
  Car car2 = new Car(1);
  assert car1 != car2;       // line 6
  assert car1.equals(car2);  // line 7
 }
}
On line 6 of TestingIdentity, the assertion passes because car1 and car2 are distinct objects. On line 7, however, the inherited equals() compares references, so two objects holding the same VIN are still unequal and the assertion fails when we compile and run the code.
javac -sourcepath src -classpath ./build -d build src/com/cars/Car.java
javac -sourcepath src -classpath ./build -d build src/com/cars/TestingIdentity.java
java -ea -classpath ./build com.cars.TestingIdentity
Exception in thread "main" java.lang.AssertionError
 at com.cars.TestingIdentity.main(TestingIdentity.java:7)
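A short aside on the -ea flag used above: Java assertions are disabled by default, so without -ea the assert statements would be skipped and the program would exit silently. The sketch below uses a common idiom to detect whether assertions are switched on. The class name AssertionStatus is made up for this illustration, and the package declaration is omitted to keep the example self-contained.

```java
// Hypothetical helper, not part of the original example.
class AssertionStatus {

    // Returns true only when the JVM was started with assertions enabled (-ea).
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert runs only if assertions are enabled.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Running this with plain java reports false, and running it with java -ea reports true, which is why the test commands in this post always pass -ea.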
In the semantics of our application, we would like instances having the same VIN to be equal. Let's write that code.
package com.cars;

public class Car {
 private int vin;
 public Car(int vin) {
  this.vin = vin;
 }
 public int getVin() { 
  return vin;
 }
 
 @Override
 public boolean equals(Object o) {
  if (o == null) 
   return false;
  if (this == o) 
   return true;
  if (o.getClass() != this.getClass())
   return false;
  Car car = (Car)o;
  return car.getVin() == this.vin;
 }
}
When we run this code, the line assert car1.equals(car2); no longer complains. So, are we done? Well, we are always told that we must override hashCode() whenever we override equals(). Let's explore hashCode() to see why it must be overridden. The documentation of the HashSet.add() method states that the method "Adds the specified element to this set if it is not already present". Let's show that failing to override hashCode() causes HashSet to fail to perform its prescribed behavior.
package com.cars;
import java.util.Set;
import java.util.HashSet;
public class TestingIdentity {  
 public static void main(String[] args) {
  Car car1 = new Car(1);
  Car car2 = new Car(1);
  assert car1 != car2;       // line 8
  assert car1.equals(car2);  // line 9
  
  Set<Car> set = new HashSet<Car>();
  set.add(car1);
  set.add(car2);
  assert set.size() == 1;  // line 14
 } 
}
Because the objects are 'equal', we expect only one instance to be in the HashSet. When we run the code, however, we get the trouble we've been looking for. The new assertion fails because HashSet uses hashCode() to decide where to look for an existing element before it ever calls equals(). Car still inherits Object.hashCode(), which typically returns different values for distinct objects, so car2 is not recognized as a duplicate of car1.
javac -sourcepath src -classpath ./build -d build src/com/cars/TestingIdentity.java
java -ea -classpath ./build com.cars.TestingIdentity
Exception in thread "main" java.lang.AssertionError
 at com.cars.TestingIdentity.main(TestingIdentity.java:14)
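To see the cause more directly, the following sketch prints the hash codes' relationship alongside the set size. It assumes a stripped-down copy of Car, named VinOnlyEqualsCar here, that overrides equals() but not hashCode(). Both class names are made up for this illustration, and the package declaration is omitted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; the names below are illustrative only.
class HashMismatchDemo {

    // Counts how many of the two objects a HashSet keeps.
    static int distinctCount(Object first, Object second) {
        Set<Object> set = new HashSet<Object>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        VinOnlyEqualsCar car1 = new VinOnlyEqualsCar(1);
        VinOnlyEqualsCar car2 = new VinOnlyEqualsCar(1);

        System.out.println("equals:         " + car1.equals(car2));
        // Inherited Object.hashCode(): usually differs between distinct objects.
        System.out.println("same hash code: " + (car1.hashCode() == car2.hashCode()));
        System.out.println("set size:       " + distinctCount(car1, car2));
    }
}

// Same equality rule as Car, but hashCode() is deliberately NOT overridden.
class VinOnlyEqualsCar {
    private final int vin;

    VinOnlyEqualsCar(int vin) {
        this.vin = vin;
    }

    @Override
    public boolean equals(Object o) {
        if (o == null || o.getClass() != getClass()) {
            return false;
        }
        return ((VinOnlyEqualsCar) o).vin == vin;
    }
}
```

On a typical JVM this reports that the cars are equal, that their hash codes differ, and that the set holds two elements. The hash values themselves are not guaranteed, which is exactly why relying on the inherited hashCode() is unsafe here.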
To solve this problem, we follow the admonition and override hashCode() in the Car class:
 @Override
 public int hashCode() {
  return vin;
 }
The code above now runs without violating the assertions because hashCode() is overridden so that whenever a.equals(b) is true, a.hashCode() == b.hashCode() is true as well.
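As a quick check of the fix, the sketch below combines both overrides and looks an element up through a fresh but equal instance, which is the everyday case where a missing hashCode() hurts. HashedCar, LookupDemo, and the owner name are made up for this illustration, and the package declaration is omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; the names below are illustrative only.
class LookupDemo {

    // True if a set holding one car reports an equal-but-distinct car as present.
    static boolean foundByEqualInstance(int vin) {
        Set<HashedCar> set = new HashSet<HashedCar>();
        set.add(new HashedCar(vin));
        return set.contains(new HashedCar(vin));
    }

    // Looks up a map value using a new key object built from the given VIN.
    static String ownerOf(int vin) {
        Map<HashedCar, String> owners = new HashMap<HashedCar, String>();
        owners.put(new HashedCar(1), "Alice"); // hypothetical sample data
        return owners.get(new HashedCar(vin));
    }

    public static void main(String[] args) {
        System.out.println(foundByEqualInstance(1)); // true
        System.out.println(ownerOf(1));              // Alice
        System.out.println(ownerOf(2));              // null
    }
}

// Car-like class with equals() and hashCode() both based on the VIN.
class HashedCar {
    private final int vin;

    HashedCar(int vin) {
        this.vin = vin;
    }

    @Override
    public boolean equals(Object o) {
        if (o == null || o.getClass() != getClass()) {
            return false;
        }
        return ((HashedCar) o).vin == vin;
    }

    @Override
    public int hashCode() {
        return vin; // equal VINs always give equal hash codes
    }
}
```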

The solution is a little simplistic, but it works for the given problem because Car has a single int field. For classes with several fields, a helper such as org.apache.commons.lang.builder.HashCodeBuilder is a more convenient choice.
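For a sense of what combining several fields involves, here is a hand-written sketch. It does not call HashCodeBuilder itself, so that it runs without extra dependencies. The second field (model), the class name ModelCar, and the constants 17 and 37 are assumptions chosen for illustration.

```java
// Hypothetical two-field variant of Car; not part of the original example.
class ModelCar {
    private final int vin;
    private final String model; // hypothetical second field

    ModelCar(int vin, String model) {
        this.vin = vin;
        this.model = model;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || o.getClass() != getClass()) {
            return false;
        }
        ModelCar other = (ModelCar) o;
        boolean sameModel = (model == null) ? other.model == null : model.equals(other.model);
        return vin == other.vin && sameModel;
    }

    @Override
    public int hashCode() {
        // Fold every field used by equals() into a running total.
        int result = 17;
        result = 37 * result + vin;
        result = 37 * result + (model == null ? 0 : model.hashCode());
        return result;
    }

    public static void main(String[] args) {
        ModelCar a = new ModelCar(1, "coupe");
        ModelCar b = new ModelCar(1, "coupe");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true
    }
}
```

The point to carry over is that hashCode() should draw only on the fields that equals() compares, so that equal objects cannot end up with different hash codes.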