Virtually all programming languages have an "equality" operator, and many have several. Several different TypesOfEquality are commonly used in programming languages, and each has its own issues.
In decreasing order of strength, they are:
- IdentityOperator? (ident)
- SemanticEquivalenceOperator? (sem_eqv)
- EqualityOperator (equals)
- IsomorphicOperator? (iso)
The label following each of the WikiBadge's above indicates the syntax we will use in the following discussion--ident(A,B) means A is identical to B.
The notion of strength refers to the following set of properties:
ident(a,b) -> sem_eqv(a,b)
sem_eqv(a,b) -> equals(a,b)
equals(a,b) -> iso(a,b)
In other words, if two objects are identical, then they are also semantically equivalent, equal, and isomorphic. Contrapositively, if two objects are not isomorphic, they cannot possibly be equal, semantically equivalent, or identical.
Many real programming languages provide semantics that are "between" the ones defined here.
Identity is essentially a pointer comparison. Two objects are identical IfAndOnlyIf they occupy the same storage. Identity is quick and easy to implement. Identity also has the nice property that it is invariant: if two objects are identical they will always be so, and if they are not identical they never will be.
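As a rough illustration of identity, here is a small Python sketch. The helper name ident is hypothetical and only mirrors the ident(A,B) notation used on this page; Python's built-in "is" operator is what actually performs the same-storage check.

```python
def ident(x, y):
    """Hypothetical helper mirroring ident(A,B): true only for the same object."""
    return x is y

a = [1, 2, 3]
b = a            # b is another name for the same list object
c = [1, 2, 3]    # c is a distinct list that happens to have the same contents

same_before = ident(a, b)      # a and b share storage
distinct_before = ident(a, c)  # a and c do not

# Identity is invariant: mutating the object changes neither answer.
a.append(4)
same_after = ident(a, b)
distinct_after = ident(a, c)

print(same_before, distinct_before, same_after, distinct_after)
```

Mutating the list changes its state but not which storage it occupies, so both answers stay the same.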
Skipping semantic equivalence, we next consider the EqualityOperator. It asks the question: "do the objects have the same (logical) state at the current time?" (We ignore issues of physical vs. logical equality here.) Most if not all languages define this for language-defined types, and many provide defaults for user-defined types. There are a few potential pitfalls with language-provided defaults, which are discussed in EqualityOperator. Note that equality as discussed here assumes the two objects are of the same type; we define equals(A,B) to be false whenever typeof(A) != typeof(B).
For ReferenceObjects, equality is a useful but often transitory state; just because two objects are equal now does not mean that they will be equal in the future--one or both may have their state (or the state of an object they depend on) mutated at any time. In other words, equality is not invariant. For ValueObjects (and anything else which is immutable), equality is invariant; if two ints are equal they will always be equal.
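The difference between transitory and invariant equality can be sketched in Python. The helper equals below is hypothetical; it follows this page's convention that objects of different types are never equal, which is stricter than Python's own == operator.

```python
def equals(x, y):
    """Hypothetical helper mirroring equals(A,B): same type, same state now."""
    return type(x) is type(y) and x == y

# Mutable objects: equality can come and go.
p = [1, 2]
q = [1, 2]
equal_before = equals(p, q)   # same state at this moment
q.append(3)                   # mutate one of them
equal_after = equals(p, q)    # no longer the same state

# Immutable values: nothing can mutate an int, so the answer never changes.
m = 7
n = 7
ints_equal = equals(m, n)

# Different types are defined to be unequal under this page's convention.
mixed = equals(5, 5.0)

print(equal_before, equal_after, ints_equal, mixed)
```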
This observation leads to the definition of semantic equivalence. Two objects are semantically equivalent IfAndOnlyIf they will always appear to be equal. As stated above, identity implies semantic equivalence; for ValueObjects, semantic equivalence may be implied by equality. This property of ValueObjects allows all sorts of optimizations, such as passing them by either value or reference, use of CopyOnWrite semantics when creating new instances of a ValueObject, etc. Semantic equivalence is, in many cases, more useful than pure identity--"can I ever tell them apart" is frequently a more useful question than "do they occupy the same location in memory".
Identity is often used as an approximation for semantic equivalence because it's easy to compute. For ValueObjects, as mentioned above, equality is exactly equivalent to semantic equivalence. However, no language that I can think of provides an exact (default) implementation of SemanticEquivalence?, as it can be tricky to deal with--comparing two complicated ValueObjects for equality might take a long time (or diverge, if you aren't careful). Some approaches taken by various languages:
- C++: Has no real notion of semantic equivalence. Comparison of pointers is used for identity and comparison of objects (operator ==) is used for equality, though the latter might be redefined to mean semantic equivalence.
- JavaLanguage: the "equals" method is often used for semantic equivalence; the default behavior (the same as identity) is appropriate for ReferenceObjects and can be overridden for ValueObjects.
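For comparison, one possible approximation of semantic equivalence can be sketched in Python. Everything named here is an assumption made for illustration: the helper sem_eqv, the hand-picked LEAF_TYPES tuple standing in for "ValueObject", and the decision to treat only tuples as immutable containers. It is not how any language defines the concept.

```python
# Assumption: these built-in immutable types stand in for simple ValueObjects.
LEAF_TYPES = (int, float, bool, str, bytes, type(None))

def sem_eqv(a, b):
    """Hypothetical approximation of sem_eqv(A,B)."""
    if a is b:
        return True               # identity implies semantic equivalence
    if type(a) is not type(b):
        return False              # different types: left to isomorphism
    if type(a) in LEAF_TYPES:
        return a == b             # immutable leaf: equality is invariant
    if type(a) is tuple:
        # An immutable container is equivalent only if its parts are.
        return len(a) == len(b) and all(sem_eqv(x, y) for x, y in zip(a, b))
    return False                  # anything else may mutate: identity only

shared = [1]
print(sem_eqv((1, "a"), (1, "a")))      # immutable all the way down
print(sem_eqv([1], [1]))                # equal now, but either may change
print(sem_eqv((shared,), (shared,)))    # distinct tuples, same mutable part
```

The sketch is conservative: it answers false for any mutable or unrecognized type unless the two arguments are identical. It also inherits the quirks of ==; for example, 0.0 and -0.0 compare equal even though a program can tell them apart.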
Finally, there is isomorphism. Isomorphism applies to objects of two different types which are "alike" in some meaningful way, such as the integer 5 and the float 5.0. In many languages, the equality operator (when applied to objects of different types) can implement isomorphism. A meaningful default behavior for a heterogeneous comparison like this is to return false (however, many languages consider this an error).
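Python's == operator happens to behave this way for numbers, which makes for a short illustration. The helper names iso and equals are hypothetical and only mirror the notation used on this page.

```python
def equals(x, y):
    """Hypothetical helper: this page's same-type equality."""
    return type(x) is type(y) and x == y

def iso(x, y):
    """Hypothetical helper: defer to Python's ==, which compares across numeric types."""
    return x == y

int_vs_float = iso(5, 5.0)       # alike in value, different in type
strict = equals(5, 5.0)          # same-type equality rejects the pair
int_vs_str = iso(5, "5")         # unrelated types: == returns False, no error

print(int_vs_float, strict, int_vs_str)
```

Here the heterogeneous comparison of an int with a str falls back to the "return false" default rather than raising an error.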
I don't think object equality is common enough to justify a dedicated symbol (such as 3 equal signs). I think an "isSameObject" function or method is sufficient for most languages and techniques, easier to spot, and better self-documenting. Some people get symbol-happy when designing languages when leaving it to a library function/method would be plenty acceptable.