A Curious Case of Mistaken Identity: How Lambdas Break Data Class Hashing

Introduction: The Scene of the Crime

It was a dark and stormy night. My hands were flying across the keys when suddenly the codebase began to exhibit strange behavior. Hashes, which once returned the same values for identical objects, suddenly became unpredictable. Collections shuffled unexpectedly, deduplication failed, and instances thought to be identical went unrecognized. Today we dive into this mystery, revealing the culprit.

Act 1: The Setup

We introduce the protagonist of our story, DetectiveDataClass.

data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
)

Here, DetectiveDataClass includes a lambda, onDetectiveAlert, meant to serve as a callback when an alert is triggered. All seems calm until…

Act 2: A Case of Mistaken Identity

The team quickly notices something off. When two detectives are instantiated with identical properties, they’re expected to be the same, right? Not quite, it seems.

val detective1 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = { println("Elementary!") }
)

val detective2 = DetectiveDataClass(
    name = "Sherlock",
    age = 40,
    alias = "Holmes",
    onDetectiveAlert = { println("Elementary!") }
)

println(detective1 == detective2)
// Expected true, but is actually false

println(detective1.hashCode() == detective2.hashCode())
// Expected true, but is actually false (barring a rare hash collision)

Surprise! Although the detectives have identical names, ages, aliases, and even alert lambdas with identical bodies, Kotlin reports them as different. Two objects that look the same no longer compare as equal, and it’s clear that hashCode and equals are not behaving as expected.

Act 3: Tracking the Evidence

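Why does this happen? A data class generates equals and hashCode from every property in its primary constructor, and that includes onDetectiveAlert. Function types have no content-based equality: a lambda is equal only to itself. Each lambda expression written in the source produces its own object, even when its body looks identical to another, so onDetectiveAlert in detective1 and detective2 are different objects. The lambda’s identity leaks into the equality and hashing logic, and here’s proof: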

// Unique ID for the first lambda
println(System.identityHashCode(detective1.onDetectiveAlert))

// Unique ID for the second lambda
println(System.identityHashCode(detective2.onDetectiveAlert))

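System.identityHashCode returns the hash code that the default Object.hashCode implementation would give an object, whether or not its class overrides hashCode. It is tied to the instance’s identity rather than its contents, unlike the hash code a data class generates, and it is not guaranteed to be a memory address. So even if two objects contain the same information, they will usually have different identity hash codes because they are not the very same object.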
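
To make the contrast between content-based and identity-based hash codes concrete, here is a minimal sketch. Suspect is a hypothetical, lambda-free data class invented for this illustration; it is not part of the detective example.

```kotlin
// Hypothetical lambda-free data class, used only for this illustration.
data class Suspect(val name: String, val age: Int)

fun main() {
    val first = Suspect("Moriarty", 50)
    val second = Suspect("Moriarty", 50)

    // Content-based: computed from name and age, so equal contents give equal hashes.
    println(first.hashCode() == second.hashCode()) // true

    // Two separate instances, even though their contents match.
    println(first === second) // false

    // Identity-based: tied to each instance and ignores the generated hashCode.
    // The two values usually differ, but the JVM does not guarantee it.
    println(System.identityHashCode(first))
    println(System.identityHashCode(second))
}
```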

Two different values here confirm that the lambdas are separate instances, and that is the difference that changes the equality and hash code results. To collections and hash-based logic, these objects are not the same!
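
The effect on a hash-based collection can be sketched as follows. The helper names distinctLambdas and sharedLambda are made up for this illustration. The first builds two detectives from two separately written lambdas, and the second passes one lambda instance to both.

```kotlin
data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
)

// Hypothetical helper: two lambda expressions, so two different lambda objects.
fun distinctLambdas(): Set<DetectiveDataClass> = setOf(
    DetectiveDataClass("Sherlock", 40, "Holmes") { println("Elementary!") },
    DetectiveDataClass("Sherlock", 40, "Holmes") { println("Elementary!") }
)

// Hypothetical helper: one lambda object shared by both detectives.
fun sharedLambda(): Set<DetectiveDataClass> {
    val alert: () -> Unit = { println("Elementary!") }
    return setOf(
        DetectiveDataClass("Sherlock", 40, "Holmes", alert),
        DetectiveDataClass("Sherlock", 40, "Holmes", alert)
    )
}

fun main() {
    println(distinctLambdas().size) // 2: deduplication fails
    println(sharedLambda().size)    // 1: same lambda instance, so the detectives are equal
}
```

The second helper hints at the way out: equality only holds when both detectives hold the very same callback object.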

Act 4: The Clues Come Together

To solve the mystery, let’s redefine our suspects with a new plan. We’ll exclude the onDetectiveAlert lambda from our equality and hashcode calculations by overriding them:

data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
) {
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is DetectiveDataClass) return false

        return name == other.name &&
               age == other.age &&
               alias == other.alias
    }

    override fun hashCode(): Int {
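        // Combine the same three fields that equals compares
        var result = name.hashCode()
        result = 31 * result + age
        result = 31 * result + alias.hashCode()
        return result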
    }
}

Now, let’s run the check again:

val detective1 = DetectiveDataClass("Sherlock", 40, "Holmes") {
  println("Elementary!")
}
val detective2 = DetectiveDataClass("Sherlock", 40, "Holmes") {
  println("Elementary!")
}

println(detective1 == detective2) // Now true
println(detective1.hashCode() == detective2.hashCode()) // Now true

With this solution, the detective instances match as expected.
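
One consequence is worth seeing in a collection. The sketch below assumes the same overridden equals and hashCode, and uses two hypothetical detectives, quiet and loud, whose callbacks differ. Because the callback no longer takes part in equality, a set treats them as duplicates and keeps only the first one.

```kotlin
data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String,
    val onDetectiveAlert: () -> Unit
) {
    // Equality ignores onDetectiveAlert on purpose.
    override fun equals(other: Any?): Boolean {
        if (this === other) return true
        if (other !is DetectiveDataClass) return false
        return name == other.name && age == other.age && alias == other.alias
    }

    // Uses the same three fields as equals, so the two stay consistent.
    override fun hashCode(): Int {
        var result = name.hashCode()
        result = 31 * result + age
        result = 31 * result + alias.hashCode()
        return result
    }
}

fun main() {
    val quiet = DetectiveDataClass("Sherlock", 40, "Holmes") { println("Elementary!") }
    val loud = DetectiveDataClass("Sherlock", 40, "Holmes") { println("ELEMENTARY!") }

    val detectives = setOf(quiet, loud)
    println(detectives.size)               // 1
    detectives.first().onDetectiveAlert()  // prints "Elementary!"; loud was dropped
}
```

Whether silently dropping the second callback is acceptable depends on what equality is supposed to mean for your type.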

Act 5: The Mystery is Solved

By excluding the lambda from equals and hashCode, we’ve eliminated the source of instability. This allows objects to be considered identical based on core attributes rather than transient callbacks.

The trade-off is that overriding them by hand increases maintenance cost and gives up the equals and hashCode implementations that data classes generate for free. Any new field added to DetectiveDataClass must also be added to the overridden equals and hashCode functions.

Note: This is an example for this article and probably not the best practice in real projects.

Instead, you can consider:

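  • Using a function reference, so that both instances point at the same function. References to the same function compare equal, unlike two separately written lambdas.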

        fun alertFunction() {
            println("Elementary!")
        }
      
        val detective1 = DetectiveDataClass(
            name = "Sherlock",
            age = 40,
            alias = "Holmes",
            onDetectiveAlert = ::alertFunction
        )
      
        val detective2 = DetectiveDataClass(
            name = "Sherlock",
            age = 40,
            alias = "Holmes",
            onDetectiveAlert = ::alertFunction
        )
      
  • Using an interface instead of a function type when stable references are needed, and implementing it with a named object so that every instance shares the same callback.

        interface DetectiveAlert {
            fun onAlert()
        }
      
        data class DetectiveDataClass(
            val name: String,
            val age: Int,
            val alias: String,
            val onDetectiveAlert: DetectiveAlert
        )
      
        object ElementaryAlert : DetectiveAlert {
            override fun onAlert() {
                println("Elementary!")
            }
        }
      
  • Storing lambdas separately from data classes when object equality matters.

  • Overriding equals and hashCode carefully to exclude fields that could vary unexpectedly, or that you know are irrelevant to equality.
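
One way to keep the lambda out of equality without hand-written overrides is to declare it in the class body instead of the primary constructor, since the generated functions only use primary-constructor properties. This is a minimal sketch of that idea and only one possible reading of "storing lambdas separately"; making the callback a var with an empty default is an assumption made for the example.

```kotlin
data class DetectiveDataClass(
    val name: String,
    val age: Int,
    val alias: String
) {
    // Declared in the class body, so the generated equals, hashCode,
    // toString, and copy all ignore it. The empty default is an assumption.
    var onDetectiveAlert: () -> Unit = {}
}

fun main() {
    val detective1 = DetectiveDataClass("Sherlock", 40, "Holmes").apply {
        onDetectiveAlert = { println("Elementary!") }
    }
    val detective2 = DetectiveDataClass("Sherlock", 40, "Holmes").apply {
        onDetectiveAlert = { println("Elementary!") }
    }

    println(detective1 == detective2)                       // true
    println(detective1.hashCode() == detective2.hashCode()) // true
}
```

New constructor properties are picked up by the generated functions automatically, so nothing needs to be kept in sync by hand. The catch is that copy() does not carry the callback over: the copy starts with the empty default.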

Epilogue

With the mystery unraveled, our detectives can rest assured, knowing their identities are now stable and consistent. The tale of the unstable hash is a warning to all: in the world of Kotlin, lambdas may be useful but can be fickle co-conspirators when data class hashing is involved.