It seems that there are still some people who are unconvinced about the use of instanceof
in equality testing. Fortunately, it validates the purpose of writing the article; to get the point across to those who have been misled in the past about Java's over-abundant use of instanceof
for equality checking, where they should have been using getClass()
.
One of the nice things about the opposing point of view was that it made me think in detail about why it is bad. Essentially, it all stems from the asymmetric operation of the instanceof
operator; if the equals()
method is coded with instanceof
, then the assumed good behaviour can in fact be broken in one of two easy ways.
The only counter argument presented was that it breaks the so-called "Liskov Substitution Principle", that being that given a class A, you should be able to drop in a replacement B that extends A and have the same runtime behaviour.
Unfortunately, that doesn't hold; and what amused me was the laughably failed attempt to show that given an operation:
public void compare(Point2D a, Point2D b) { if (a.equals(b)) { // do something } }
then you can't substitue 'b' for a Point3D
without breaking code, unless it is implemented using instanceof
. Well, technically, that's right -- but whichever implementation you use, if you provide 'a' with Point3D
, it does the opposite in both cases. So I actually used his own example to debunk him :-)
This actually leads me to the point of this post: THE LISKOV SUBSTITUTION PRINCIPLE DOES NOT HOLD IN JAVA. Unfortunately, some not-so-bright spark incorporated the LSP as being a Good Thing, and from there, it has evolved like a cancer through the object oriented community as gospel.
Actually, the LSP was never stated as something that exists, or that can (or should) be applied to the Java language, but developers bring it out in rote and chant it as a mantra to justify their contract-breaking implementations. It's been discussed and debunked often enough for people who understand the problem to be flawed, but for some reason generally ignored.
The quote is often misleadingly reported as:
"For each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T."
In actual fact, the quote comes from the the original "Family Values" paper, and in its entirety, is:
"What is wanted here is something like the following substitution property: For each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2 then S is a subtype of T."
(emphasis mine).
The point is, that the research paper was proposing a way of defining a subtype hierarchy, in such a way, that this behaviour occurred. It was never stated that either (a) this was to do with a specific object-oriented language (or even any object-oriented language), or (b) that this property of 'subtyping' had anything to do with the actual inheritance model in Java, or any other language. What they were looking for was a mathematical definition of subtyping, such that that behaviour occurred.
It should be fairly obvious that this isn't anything to do with Java, and at no point did any of the authors claim to have proven that this link occurred in a real programming language, but for some reason, it has seeped through the OO community without consideration of the initial research purpose, nor the laughably easily way it can be denounced.
The main problem is that the LSP defines a weak constraint on the set of properties, but then claims that for any property it can be shown that the program behaviour is the same. The only way this can ever be validated is if the original objects in the system are so tightly defined by their specification that they admit no other implementation. For example, the authors of the above paper claim that given a set of properties about a method, they can claim 100% behavioural compatibility with another method that has the same contract -- and this is NOT TRUE. In essence, I can claim that the contract of the toString()
method in Java is very weak -- "Returns a string representation of the object". Now, according to the LSP, I can substitute in any method that obeys that contract, and I am guaranteed the same behaviour. Clearly, this is false.
public boolean lspHolds(Object o) { return o.contains("@"); }
I can pass in an instance of Object
in here, and it will return true
, because the default behaviour prints out the value of the reference, followed by an @, followed by the class name. I can now pass in an instance of String
, which most of the time will return false
-- which is a violation of the LSP.
The point is that a weak specification does not allow you to make such a strong statement such as "For any substitution S subtyping T, P will hold". It only allows you to make a statement based on the weak specification of the object, which in this case, is not enough to ensure that our lspHolds
method is always true
.
It should be noted that the original paper was attempting to find such a typing relation, which is a distinct mathematical statement (as are the objects they refer to) and do not necessarily map to our concept of Java classes and methods. This isn't saying that the original paper is wrong; rather, it shows that it doesn't map to real languages like Java. In any case, it was a research proposal, and often, research doesn't turn out the way you'd expect or the way that you are looking for -- but negative results are just as important as positive results are.
In any case, the main crux of LSP is showing that given method M in T, when you have method M in S, M is supposed to obey the contract of the equals method, which is unambiguous in its specification:
The equals
method implements an equivalence relation on non-null object references:
- It is reflexive: for any non-null reference value
x
,x.equals(x)
should returntrue
. - It is symmetric: for any non-null reference values
x
andy
,x.equals(y)
should returntrue
if and only ify.equals(x)
returnstrue
. - It is transitive: for any non-null reference values
x
,y
, andz
, ifx.equals(y)
returnstrue
andy.equals(z)
returnstrue
, thenx.equals(z)
should returntrue
. - It is consistent: for any non-null reference values
x
andy
, multiple invocations of x.equals(y) consistently returntrue
or consistently returnfalse
, provided no information used inequals
comparisons on the objects is modified. - For any non-null reference value
x
,x.equals(null)
should returnfalse
.
(quoted verbatim from the JavaDoc for .equals())
Specifically, in order to rely on the LSP, you have to ensure that M> obeys the contract defined in T, which implies that it must be reflexive, symmetric and transitive. If you don't have that, you don't have a valid overridden method as per the LSP definition, so you can't even use that as your argument.
Lastly, if you do try and use instanceof
, you can guarantee that you break symmetry, because the instanceof
is asymmetric. I did point out in my proof on JavaWorld, that it is possible to overcome this behaviour; however, for any implementation of .equals()
using instanceof
that obeys reflexivity and symmetry will be forced to break transitivity.
So the arguments are set in stone. The LSP doesn't hold in Java, and never has. Even if you rely on this argument to break .equals()
, you've broken the LSP, because it relies on you obeying the contracts defined in the super class. And if you get around both of those, then you can't have reflexivity, symmetry and transitivity all in one using instanceof
; thus, you've broken the contract, and hence fails the LSP.
So why do so many people use it? I blame Object Mentor, because they've integrated it into their Principles of Object Design course, from where it seems to have grown.
What's important is that there is a large amount of overlap between good ideas in the LSP -- that being, that subclassess should obey their super-class contracts -- and good design. However, the two aren't the same thing; something that their author would have done well to investigate and find out in advance before making the claim. They do have an article on LSP that goes into some detail, and does highlight the ideas of Design by Contract -- and this is where the LSP and good OO depart. To summarise; the good parts of LSP are accurately described by the Design By Contract approach; the LSP makes stronger statements that don't hold. But unfortunately, the good people at Object Mentor don't tell the students the difference between the two.