What is the use of the private "hash" variable in the java.lang.String class? It is private and calculated/recalculated every time the hashCode method is called.

http://hg.openjdk.java.net/jdk7u/jdk7u6/jdk/file/8c2c5d63a17e/src/share/classes/java/lang/String.java

1 Answer

It's used to cache the hashCode of the String. Because String is immutable, its hashCode will never change, so attempting to recalculate it after it's already been calculated is pointless.

In the code that you've posted, it's only recalculated when the value of hash is 0, which can either occur if the hashCode hasn't been calculated yet or if the hashCode of the String is actually 0, which is possible!
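
As a rough illustration of that behaviour, here is a minimal sketch of the same caching pattern. The classes `CachedHashDemo` and `CachedHashText` and the `computations` counter are hypothetical names made up for this example, not part of the JDK. The sketch uses the one-character string "\0", whose hash is certainly 0, to show the repeated work.

```java
import java.util.Arrays;

class CachedHashDemo {
    // Calls hashCode() the given number of times and reports how often
    // the hash loop actually ran.
    static int computationsAfter(String s, int calls) {
        CachedHashText t = new CachedHashText(s);
        for (int i = 0; i < calls; i++) {
            t.hashCode();
        }
        return t.computations;
    }

    public static void main(String[] args) {
        System.out.println(computationsAfter("abc", 3)); // 1: the non-zero hash is cached
        System.out.println(computationsAfter("\0", 3));  // 3: the hash is 0, so nothing is cached
    }
}

// Hypothetical demo class that only mimics the caching pattern; not the JDK source.
final class CachedHashText {
    private final char[] value;
    private int hash;   // cached hash; 0 doubles as "not computed yet"
    int computations;   // demo-only counter

    CachedHashText(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            computations++;
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedHashText
                && Arrays.equals(value, ((CachedHashText) o).value);
    }
}
```

For ordinary strings the loop runs once. Only strings whose hash happens to be 0 pay for it on every call.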

For example, the hashCode of "aardvark polycyclic bitmap" is 0.
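
To check a claim like this yourself, the following sketch recomputes the hash with the formula documented for String.hashCode, s[0]*31^(n-1) + ... + s[n-1] in wrapping int arithmetic, and prints it next to the built-in result. `PolyHashCheck` and `polyHash` are hypothetical names used only for this example.

```java
class PolyHashCheck {
    // Documented String.hashCode formula, evaluated with Horner's rule.
    // int overflow wraps around, which is how a non-trivial string can land on 0.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String candidate = "aardvark polycyclic bitmap";
        // Print both values so they can be compared by eye.
        System.out.println("polyHash: " + polyHash(candidate));
        System.out.println("hashCode: " + candidate.hashCode());
    }
}
```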

This oversight seems to have been corrected in Java 13 with the introduction of a hashIsZero field:

public int hashCode() {
    // The hash or hashIsZero fields are subject to a benign data race,
    // making it crucial to ensure that any observable result of the
    // calculation in this method stays correct under any possible read of
    // these fields. Necessary restrictions to allow this to be correct
    // without explicit memory fences or similar concurrency primitives is
    // that we can ever only write to one of these two fields for a given
    // String instance, and that the computation is idempotent and derived
    // from immutable state
    int h = hash;
    if (h == 0 && !hashIsZero) {
        h = isLatin1() ? StringLatin1.hashCode(value)
                       : StringUTF16.hashCode(value);
        if (h == 0) {
            hashIsZero = true;
        } else {
            hash = h;
        }
    }
    return h;
}
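
To see the effect of the extra flag in isolation, here is a minimal single-threaded sketch of the same two-field idea. `ZeroAwareDemo`, `ZeroAwareText`, and the `computations` counter are hypothetical names for this example. The sketch ignores the Latin-1/UTF-16 split and is not the JDK implementation.

```java
import java.util.Arrays;

class ZeroAwareDemo {
    // Calls hashCode() the given number of times and reports how often
    // the hash loop actually ran.
    static int computationsAfter(String s, int calls) {
        ZeroAwareText t = new ZeroAwareText(s);
        for (int i = 0; i < calls; i++) {
            t.hashCode();
        }
        return t.computations;
    }

    public static void main(String[] args) {
        System.out.println(computationsAfter("abc", 3)); // 1
        System.out.println(computationsAfter("\0", 3));  // 1: the zero result is remembered too
    }
}

// Hypothetical demo class mirroring the hash / hashIsZero pair; not the JDK source.
final class ZeroAwareText {
    private final char[] value;
    private int hash;            // cached hash when it is non-zero
    private boolean hashIsZero;  // set when the computed hash turned out to be 0
    int computations;            // demo-only counter

    ZeroAwareText(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && !hashIsZero) {
            computations++;
            for (char c : value) {
                h = 31 * h + c;
            }
            // Write only one of the two fields, as in the method above.
            if (h == 0) {
                hashIsZero = true;
            } else {
                hash = h;
            }
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ZeroAwareText
                && Arrays.equals(value, ((ZeroAwareText) o).value);
    }
}
```

With the flag in place, a zero hash is computed once and then treated like any other cached value.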

2 Comments

That’s not an oversight. It was a trade-off between having to recalculate the value for some strings and adding an additional field to every string instance. Since then, the fields of the String class and the memory layout of JVM implementations have changed multiple times. Furthermore, the relevance of this problem has been reconsidered, as it may open applications to DoS attacks. So nowadays, the additional field is considered to be no problem, especially since, depending on the JVM, some fields may have no impact on the required storage.
@Holger Beautiful comment. This is exactly the case with the String class: its layout had enough of a gap left to fit that field without impacting the size at all.