A Curious Case of Mistaken Identity: How Lambdas Break Data Class Hashing
Introduction: The Scene of the Crime
It was a dark and stormy night. My hands were flying across the keys when the codebase began to exhibit strange behavior. Hashes, which once returned the same values for identical objects, suddenly became unpredictable. Collections shuffled unexpectedly, deduplication failed, and instances thought to be identical went unrecognized. Today we dive into this mystery and unmask the culprit.
Act 1: The Setup
We introduce the protagonist of our story, DetectiveDataClass.
data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
)
Here, DetectiveDataClass includes a lambda, onDetectiveAlert, meant to serve as a callback when an alert is triggered. All seems calm until…
Act 2: A Case of Mistaken Identity
The team quickly notices something off. When two detectives are instantiated with identical properties, they’re expected to be the same, right? Not quite, it seems.
val detective1 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = { println("Elementary!") }
)
val detective2 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = { println("Elementary!") }
)
println(detective1 == detective2)
// Expected true, but prints false
println(detective1.hashCode() == detective2.hashCode())
// Expected true, but in practice prints false
Surprise! Although the detectives have identical names, ages, aliases, and even identical alert messages, Kotlin reports them as different. The code’s consistency is broken, and it’s clear that hashCode and equals are not behaving as expected.
Act 3: Tracking the Evidence
Why does this happen? It turns out that Kotlin cannot compare two lambdas by what they do: a function value is equal only to itself. The two lambda expressions above look identical, but each one evaluates to its own object, meaning onDetectiveAlert in detective1 and detective2 are fundamentally different objects in memory. Because the generated equals and hashCode include every property in the primary constructor, the identity of the lambda breaks the hashing logic, and here’s proof:
// Unique ID for the first lambda
println(System.identityHashCode(detective1.onDetectiveAlert))
// Unique ID for the second lambda
println(System.identityHashCode(detective2.onDetectiveAlert))
System.identityHashCode returns the hash code that the default Object.hashCode() implementation would produce for an object. That value is tied to the object’s identity, not its contents, and it is not guaranteed to be a memory address. It is therefore not a content-based hash code like the one a data class generates. So even if two objects contain the same information, they will almost always have different identity hash codes unless they are the literal same object.
So each lambda instance prints its own identity hash code (collisions are possible but rare), highlighting the critical difference that affects equality and hashCode results. To collections and hash-based logic, these objects are not the same!
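To see the effect on a collection, here is a minimal sketch. The names alertA, alertB, and sherlock are helpers introduced only for this illustration. It assumes Kotlin on the JVM and shows that a lambda is equal only to itself, so a HashSet keeps both detectives unless they share the same lambda instance.

```kotlin
data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
)

// Two separate lambda expressions with identical bodies (illustrative names).
val alertA: () -> Unit = { println("Elementary!") }
val alertB: () -> Unit = { println("Elementary!") }

// Hypothetical helper: builds the same detective around a given callback.
fun sherlock(alert: () -> Unit) = DetectiveDataClass("Sherlock", 40, "Holmes", alert)

fun main() {
    // A lambda is equal to itself, but not to a look-alike lambda.
    println(alertA == alertA) // true
    println(alertA == alertB) // false

    // The data class inherits that behavior through its generated equals.
    println(sherlock(alertA) == sherlock(alertA)) // true
    println(sherlock(alertA) == sherlock(alertB)) // false

    // Deduplication only works when the callback is the same instance.
    println(hashSetOf(sherlock(alertA), sherlock(alertA)).size) // 1
    println(hashSetOf(sherlock(alertA), sherlock(alertB)).size) // 2
}
```

Sharing one lambda instance restores equality, which is the idea behind the alternatives discussed later in the article.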
Act 4: The Clues Come Together
To solve the mystery, let’s redefine our suspects with a new plan. We’ll exclude the onDetectiveAlert lambda from our equality and hashcode calculations by overriding them:
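data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
) {
    // Compare only the identifying fields; the callback is ignored on purpose.
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is DetectiveDataClass) return false
        return name == other.name &&
            age == other.age &&
            alias == other.alias
    }

    // Hash exactly the fields that equals compares, so equal objects share a hash.
    override fun hashCode(): Int {
        var result = name.hashCode()
        result = 31 * result + age
        result = 31 * result + alias.hashCode()
        return result
    }
}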
Now, let’s run the check again:
val detective1 = DetectiveDataClass("Sherlock", 40, "Holmes") {
    println("Elementary!")
}
val detective2 = DetectiveDataClass("Sherlock", 40, "Holmes") {
    println("Elementary!")
}
println(detective1 == detective2) // Now true
println(detective1.hashCode() == detective2.hashCode()) // Now true
With this solution, the detective instances match as expected.
Act 5: The Mystery is Solved
By excluding the lambda from equals and hashCode, we’ve eliminated the source of instability. This allows objects to be considered identical based on core attributes rather than transient callbacks.
However, this fix increases maintenance cost and gives up the equals and hashCode implementations that data classes generate for free. Any new fields added to DetectiveDataClass will also need to be added to our overridden equals and hashCode functions.
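To make that maintenance risk concrete, here is a small sketch. It assumes a hypothetical badgeNumber field is added later, and the class is renamed BadgedDetective purely for this illustration. Because the hand-written equals and hashCode were not updated, two detectives with different badge numbers still compare as equal.

```kotlin
data class BadgedDetective(
    val name: String,
    val age: Int,
    val alias: String,
    val badgeNumber: Int, // hypothetical new field, not part of the original class
    val onDetectiveAlert: () -> Unit
) {
    // Written before badgeNumber existed and never updated.
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is BadgedDetective) return false
        return name == other.name &&
            age == other.age &&
            alias == other.alias
    }

    override fun hashCode(): Int {
        var result = name.hashCode()
        result = 31 * result + age
        result = 31 * result + alias.hashCode()
        return result
    }
}

fun main() {
    val first = BadgedDetective("Sherlock", 40, "Holmes", 221) { println("Elementary!") }
    val second = BadgedDetective("Sherlock", 40, "Holmes", 999) { println("Elementary!") }

    // The badge numbers differ, yet the stale overrides treat both as one detective.
    println(first == second)                       // true
    println(first.hashCode() == second.hashCode()) // true
    println(setOf(first, second).size)             // 1
}
```

The compiler gives no warning here, so the mismatch only shows up as a silently lost entry in a set or map.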
Instead you can consider:
Using a function reference, so that both instances receive a callback that compares equal.
fun alertFunction() {
    println("Elementary!")
}

val detective1 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = ::alertFunction
)
val detective2 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = ::alertFunction
)
Using an interface with a singleton implementation instead of a lambda if a stable reference is needed.
interface DetectiveAlert {
    fun onAlert()
}

data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: DetectiveAlert
)

object ElementaryAlert : DetectiveAlert {
    override fun onAlert() {
        println("Elementary!")
    }
}
Storing lambdas separately from data classes when object equality matters.
Overriding equals and hashCode carefully to exclude fields that could vary unexpectedly, or that you know don’t matter for equality.
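The first three alternatives in the list above can be checked with a small sketch. RefDetective, ObjDetective, DetectiveId, and Detective are hypothetical names introduced only for this example, and the trimmed field lists are a simplification. It assumes Kotlin on the JVM, where references to the same top-level function compare equal.

```kotlin
// Option 1: pass a function reference instead of a lambda literal.
fun alertFunction() {
    println("Elementary!")
}

data class RefDetective(val name: String, val onDetectiveAlert: () -> Unit)

// Option 2: an interface with a singleton implementation.
interface DetectiveAlert {
    fun onAlert()
}

object ElementaryAlert : DetectiveAlert {
    override fun onAlert() {
        println("Elementary!")
    }
}

data class ObjDetective(val name: String, val onDetectiveAlert: DetectiveAlert)

// Option 3: keep the callback outside the data class entirely.
data class DetectiveId(val name: String, val age: Int, val alias: String)

// Plain class: equality questions are asked of `id`, never of the callback.
class Detective(val id: DetectiveId, val onDetectiveAlert: () -> Unit)

fun main() {
    // Two references to the same function compare equal.
    println(RefDetective("Sherlock", ::alertFunction) == RefDetective("Sherlock", ::alertFunction)) // true

    // Both instances hold the same singleton object.
    println(ObjDetective("Sherlock", ElementaryAlert) == ObjDetective("Sherlock", ElementaryAlert)) // true

    // The generated equals on DetectiveId never sees a lambda.
    val first = Detective(DetectiveId("Sherlock", 40, "Holmes")) { println("Elementary!") }
    val second = Detective(DetectiveId("Sherlock", 40, "Holmes")) { println("Elementary!") }
    println(first.id == second.id)                 // true
    println(setOf(first.id, second.id).size)       // 1
}
```

All three keep the generated equals and hashCode intact, so new fields are picked up automatically. The trade-off in the third variant is that callers must compare and hash the id rather than the Detective itself.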
Epilogue
With the mystery unraveled, our detectives can rest assured, knowing their identities are now stable and consistent. The tale of the unstable hash is a warning to all: in the world of Kotlin, lambdas may be useful but can be fickle co-conspirators when data class hashing is involved.