Java: == vs. .equals(), and hashCode()

The difference, and how to override .equals() and .hashCode()

The difference between == and .equals is:

What == does

With primitive variables: == does exactly what you would expect; it compares the two values for equality. No surprises here. And for primitive variables, you can only use ==. That makes sense: equals() is a method, and you can’t invoke a method on a primitive in Java.

With reference variables: == compares the references to see if they refer to the same object or not. It is similar to the idea of comparing “pointer values” in C/C++. That means that if I create:

Student s1 = new Student(1234567,"Chris","Gaucho");
Student s2 = new Student(1234567,"Chris","Gaucho");
Integer i1 = new Integer(42);
Integer i2 = new Integer(42);

Then both s1==s2 and i1==i2 will evaluate to false. In both cases, even though the object instances referred to by the two references contain identical values, they are different objects on the heap. So == returns false.

By contrast, if I do this:

Student s1 = new Student(1234567,"Chris","Gaucho");
Student s2 = s1;
Integer i1 = new Integer(42);
Integer i2 = i1;

Now both s1==s2 and i1==i2 will evaluate to true. Both references refer to the same object on the heap, so the references have the same value.
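The two cases above can be put side by side in one small program. This is only a sketch: the Student class shown here is a hypothetical minimal version (three fields and a constructor, nothing else), written just so the example compiles.

```java
class ReferenceComparisonDemo {
    public static void main(String[] args) {
        // Two separate objects that happen to hold identical values.
        Student s1 = new Student(1234567, "Chris", "Gaucho");
        Student s2 = new Student(1234567, "Chris", "Gaucho");

        // A second reference to the same object as s1.
        Student s3 = s1;

        System.out.println(s1 == s2); // false: different objects on the heap
        System.out.println(s1 == s3); // true: both refer to the same object

        // With primitives, == simply compares the values.
        int a = 42;
        int b = 42;
        System.out.println(a == b);   // true
    }
}

// Hypothetical minimal Student class, for illustration only.
class Student {
    private final int perm;
    private final String fname;
    private final String lname;

    Student(int perm, String fname, String lname) {
        this.perm = perm;
        this.fname = fname;
        this.lname = lname;
    }
}
```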

What .equals() does

The situation for the .equals() method depends greatly on whether we are talking about classes that are part of the standard Java API, or classes that have been coded by an application programmer.

The first thing to know is the default behavior of .equals() for the class java.lang.Object is identical to the behavior of the == operation on references.

Here is what the javadoc for .equals() says on this point:

The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
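A short sketch of that default behavior follows. BasicStudent is a hypothetical class invented for this example; the only thing that matters about it is that it does not override equals(), so it inherits the version from java.lang.Object.

```java
class DefaultEqualsDemo {
    public static void main(String[] args) {
        BasicStudent s1 = new BasicStudent(1234567);
        BasicStudent s2 = new BasicStudent(1234567);
        BasicStudent s3 = s1;

        // The inherited equals() only asks "is this the same object?"
        System.out.println(s1.equals(s2)); // false, just like s1 == s2
        System.out.println(s1.equals(s3)); // true, just like s1 == s3
    }
}

// Hypothetical class that deliberately does NOT override equals().
class BasicStudent {
    private final int perm;

    BasicStudent(int perm) {
        this.perm = perm;
    }

    int getPerm() {
        return perm;
    }
}
```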

The second thing to know is that classes that extend java.lang.Object (and recall that every class extends java.lang.Object, so that means every class!) are supposed to override .equals() in a way that “makes sense” for the particular object in question. Note that the javadoc expresses this idea by putting “equal to” in quotation marks: the one-line summary of the .equals() method of java.lang.Object treats “equal” as a kind of term of art.

Indicates whether some other object is “equal to” this one.

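That’s easy to understand when, for example, we are talking about java.lang.String or java.util.ArrayList<E>. Two String objects are equal when they represent the same sequence of characters, and two lists are equal when they have the same size and contain equal elements in the same order.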

(At least, I hope that’s easy to understand. We typically use the language of CMPSC 40 (our Discrete Math course) to express these ideas, because that language has a precise and unambiguous meaning to Computer Scientists and Applied Mathematicians. There is a reason that CMPSC 40 is called “Foundations of Computer Science”. You’ll be hearing that kind of language over and over again in your CS courses. And indeed, if you thought it was just professors that talked this way, now look! It’s showing up in the documentation of the fundamental object of one of the “real world’s” most widely used Object-Oriented Programming languages. Go figure.)

In any case, the idea of what .equals() means for String and ArrayList is hardly surprising.
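For concreteness, here is a small sketch of how == and .equals() differ for these two standard classes. The variable names and values are made up for the example; new String(...) is used on purpose so that each variable refers to its own object.

```java
import java.util.ArrayList;
import java.util.List;

class StandardEqualsDemo {
    public static void main(String[] args) {
        // Two distinct String objects with the same characters.
        String a = new String("Gaucho");
        String b = new String("Gaucho");
        System.out.println(a == b);      // false: different objects
        System.out.println(a.equals(b)); // true: same sequence of characters

        // Two distinct lists with equal elements in the same order.
        List<String> x = new ArrayList<>();
        x.add("Chris");
        x.add("Gaucho");
        List<String> y = new ArrayList<>();
        y.add("Chris");
        y.add("Gaucho");
        System.out.println(x == y);      // false: different objects
        System.out.println(x.equals(y)); // true: same elements, same order

        // Same elements in a different order: not equal as lists.
        List<String> z = new ArrayList<>();
        z.add("Gaucho");
        z.add("Chris");
        System.out.println(x.equals(z)); // false
    }
}
```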

But, if we implement our own class Student, with attributes int perm, String fname and String lname, then what does it mean for one Student object to be “equal” to another student object?

The answer is nuanced. It truly depends on what we are using the .equals method for in our application for this particular object.

But the first and most important thing to remember here is that unless the programmer overrides the java.lang.Object version of .equals(), you will get a version of .equals() that behaves identically to the == operator.

So we should pretty much always override .equals for any class we implement.

How we do it is application and context dependent. But four things are always crystal clear, and spelled out in the javadoc for the equals() method of java.lang.Object:

  1. We should implement it as an equivalence relation, in the CMPSC 40, MATH 8 sense of an equivalence relation (that is, we need to remember our Discrete Math course, and recall that an equivalence relation is a binary relation that is reflexive, symmetric and transitive).
  2. We should make sure it behaves consistently; in the words of the javadoc: “For any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.”
  3. We should make sure that if x isn’t null, then x.equals(null) always returns false (rather than throwing a null-pointer exception, or returning true).
  4. We should likely also override the hashCode() function.
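Here is one possible sketch of a Student class that follows those four points. It assumes that two students should count as “equal” when perm, fname and lname all match; that is only one reasonable choice, and a different application might compare perm alone. The use of java.util.Objects helpers is also just one common way to write this.

```java
import java.util.Objects;

class StudentEqualsDemo {
    public static void main(String[] args) {
        Student s1 = new Student(1234567, "Chris", "Gaucho");
        Student s2 = new Student(1234567, "Chris", "Gaucho");
        Student s3 = new Student(7654321, "Pat", "Gaucho");

        System.out.println(s1 == s2);        // false: still two objects
        System.out.println(s1.equals(s2));   // true: all fields match
        System.out.println(s1.equals(s3));   // false: fields differ
        System.out.println(s1.equals(null)); // false, and no exception
        System.out.println(s1.hashCode() == s2.hashCode()); // true
    }
}

final class Student {
    private final int perm;
    private final String fname;
    private final String lname;

    Student(int perm, String fname, String lname) {
        this.perm = perm;
        this.fname = fname;
        this.lname = lname;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;  // reflexive: x.equals(x) is true
        }
        if (other == null || getClass() != other.getClass()) {
            return false; // x.equals(null) is false; other types never match
        }
        Student that = (Student) other;
        // Assumption for this sketch: equality means every field matches.
        return perm == that.perm
                && Objects.equals(fname, that.fname)
                && Objects.equals(lname, that.lname);
    }

    @Override
    public int hashCode() {
        // Built from the same fields as equals(), so equal students
        // always produce the same hash code.
        return Objects.hash(perm, fname, lname);
    }
}
```

Because the fields are final and equals() only compares those fields, repeated calls give the same answer, which covers the consistency requirement.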

Overriding the hashCode() function

So, let’s take a look at the javadoc for the hashCode function of java.lang.Object.

It says:

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
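The contract can be seen in action with a small sketch. PermId is a hypothetical class made up for this example; it wraps a single int and bases both equals() and hashCode() on it. The last two lines use the strings "Aa" and "BB", which are a well-known pair of unequal strings that share a hash code.

```java
import java.util.HashSet;
import java.util.Set;

class HashCodeContractDemo {
    public static void main(String[] args) {
        PermId a = new PermId(1234567);
        PermId b = new PermId(1234567);

        // Equal objects must report the same hash code.
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Since equals() and hashCode() agree, a hash-based collection
        // that holds a will also report that it contains b.
        Set<PermId> set = new HashSet<>();
        set.add(a);
        System.out.println(set.contains(b));              // true

        // Unequal objects are allowed to share a hash code.
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}

// Hypothetical value class, for illustration only.
final class PermId {
    private final int perm;

    PermId(int perm) {
        this.perm = perm;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PermId)) {
            return false; // also covers other == null
        }
        return perm == ((PermId) other).perm;
    }

    @Override
    public int hashCode() {
        // Uses the same field that equals() compares.
        return Integer.hashCode(perm);
    }
}
```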

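We can summarize: hashCode() must agree with equals(). Objects that are equal must return the same hash code, and the value must stay the same for as long as the fields used by equals() are unchanged. Objects that are not equal are allowed to share a hash code, although giving them distinct values may help hash table performance.

How we implement hashCode() depends on the object.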

Reading in HFJ