What’s the deal with Java equals() and hashCode()?
January 26th, 2010 Jeff Howell, Consultant
I came across this issue a couple of years ago. It was surprisingly non-obvious to the developer, a good developer, who was tracking down a very elusive bug in a large Java application.
The symptom was that a Map of objects sometimes returned null when queried. The developer ran the code in a debugger and could see that the object was put into the Map. Yet, when the map was asked to retrieve the object, it was not found (even though it could be seen by inspection in the debugger).
Other objects were successfully stored and retrieved from the same Map.
After a lot of digging, he discovered that the stored objects had a custom equals() method, but not a custom hashCode() method. Once hashCode() was also overridden with a correct implementation, the problem went away.
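Here is a minimal sketch of how a bug like this can be reproduced. The class BrokenKey and its id field are hypothetical stand-ins, not the code from the original application. The key overrides equals() but inherits hashCode() from Object, so two equal keys usually report different hash values.

```java
import java.util.HashMap;
import java.util.Map;

class MissingHashCodeDemo {

    // Hypothetical key class: equals() is overridden, hashCode() is not.
    static final class BrokenKey {
        private final String id;

        BrokenKey(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof BrokenKey)) return false;
            return id.equals(((BrokenKey) other).id);
        }
        // No hashCode() override: the inherited Object.hashCode() is used,
        // which typically differs between distinct instances.
    }

    public static void main(String[] args) {
        Map<BrokenKey, String> map = new HashMap<>();
        BrokenKey stored = new BrokenKey("42");
        map.put(stored, "value");

        BrokenKey probe = new BrokenKey("42"); // equal to 'stored' by equals()

        System.out.println(stored.equals(probe)); // the two keys are equal
        System.out.println(map.get(stored));      // same instance, same hash: found
        System.out.println(map.get(probe));       // almost always null
    }
}
```

The last lookup fails in practice because the probe key nearly always hashes differently from the stored key, so the map never gets as far as calling equals() on the stored one. The entry is still in the map, which is why it shows up in the debugger.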
I was on the architecture team and got the question, “so why did the inventors of Java allow you to shoot off a toe?”
The answer is twofold. First, we need to understand the way hash-based Maps such as HashMap work in Java. The whole point of such a Map is to be able to find an object faster than a linear search. Using hashed keys to locate objects is a two-step process. Internally the Map stores its entries in an array of buckets, where each bucket holds a small collection of entries (a linked list in the classic implementation). The index into the array is derived from the hashCode() value of the key, normalized to the size of the array. This locates the bucket, which is then searched linearly, using equals() to decide whether the key is found.
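The two-step lookup can be sketched in a few lines. This is a deliberately simplified illustration, not the real HashMap source: the class TinyHashMap is hypothetical, it never resizes, it does not accept null keys, and it assumes a bucket count greater than zero.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified hash table used only to illustrate the lookup.
class TinyHashMap<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    TinyHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: normalize hashCode() into a bucket index.
    private List<Entry<K, V>> bucketFor(K key) {
        int index = Math.floorMod(key.hashCode(), buckets.size());
        return buckets.get(index);
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> entry : bucket) {
            if (entry.key.equals(key)) { // existing key: overwrite the value
                entry.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    // Step 2: search only the selected bucket, using equals().
    V get(K key) {
        for (Entry<K, V> entry : bucketFor(key)) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
        }
        return null; // not in this bucket, so reported as absent
    }
}
```

Note that get() only ever looks in one bucket. If an equal key produces a different hashCode(), it will usually be sent to a different bucket and equals() is never given the chance to match.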
The contract between equals() and hashCode() requires them to be mutually consistent: if two keys are equal (by the equals() method), they must have the same hashCode() value. The reverse is not required; two unequal keys may share a hash value, which only costs some speed. If equal keys return different hash values, the lookup can land in the wrong bucket and the object may not be found, as in the bug described above. This is simply mechanical. If you want the real answer, read the source code of HashMap (which is pretty cool code).
The second part of the answer is to address the “why” question. The language allows developers flexibility when determining how objects are compared for equality. If you want your User objects to be ‘equal’ if firstName, lastName, and age are all equal, but disregard Social Security Number, you can code it that way. Flexibility is good because object equality can be based on the content of the objects. So why doesn’t the language automatically create a hashCode() value that is consistent with equals()? Because there is no way for the compiler to know what would be consistent. Unless you override it, your class inherits hashCode() from Object, which is based on object identity rather than on the fields your equals() compares.
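As a sketch of the User example, here is one way equals() and hashCode() could be kept consistent. The field names firstName, lastName, and age come from the example above; the ssn field name, the constructor, and the use of java.util.Objects (which assumes Java 7 or later) are my own choices for illustration.

```java
import java.util.Objects;

final class User {
    private final String firstName;
    private final String lastName;
    private final int age;
    private final String ssn; // deliberately ignored by equals() and hashCode()

    User(String firstName, String lastName, int age, String ssn) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
        this.ssn = ssn;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof User)) return false;
        User that = (User) other;
        return age == that.age
                && Objects.equals(firstName, that.firstName)
                && Objects.equals(lastName, that.lastName);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals() compares, and no others.
        return Objects.hash(firstName, lastName, age);
    }
}
```

The rule of thumb is that hashCode() may use only fields that equals() uses. Including ssn in the hash here would break the contract, because two users that are equal by name and age could then produce different hash values.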
So developers have flexibility defining equality, along with the responsibility of maintaining the contract between equals() and hashCode().
“But,” you ask, “what if I don’t plan to put this type in a Map? I don’t have to worry about hashCode(), right?” Right, sort of. You can end up in the situation above if some later maintenance or feature enhancement changes that assumption and puts those objects into a Map. The bug described took two days to unravel and fix. The best strategy is to always maintain the equals() - hashCode() contract.
So how do you know if there are violations of the contract lurking in your codebase? Eclipse (surprisingly) will not flag it unless you change the settings (Preferences -> Java -> Compiler -> Errors/Warnings -> Potential programming problems -> Class overrides ‘equals()’ but not ‘hashCode()’). If you use Eclipse, you might want to set this to at least Warning. PMD will find the problem; see the basic rule OverrideBothEqualsAndHashcode. FindBugs can also find the problem; see the “HE:” bug descriptions.
Entry Filed under: Software Development





