Let's start with one of the simplest print statements:
System.out.println(new Object());
This prints the fully qualified class name, followed by an @ sign and a string of characters:
java.lang.Object@6659c656
What follows the @ sign? Is it the hash code, the object's memory address, or some other value?
In fact, what follows the @ is simply the object's hash code, displayed in hexadecimal. To verify:
Object o = new Object();
int hashcode = o.hashCode();
// toString
System.out.println(o);
// hashCode in hexadecimal
System.out.println(Integer.toHexString(hashcode));
// hashcode
System.out.println(hashcode);
// This method also returns the object's hash code; unlike Object.hashCode, it ignores any overridden hashCode
System.out.println(System.identityHashCode(o));
Output result:
java.lang.Object@6659c656
6659c656
1717159510
1717159510
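The difference between hashCode() and System.identityHashCode() only shows up when a class overrides hashCode(). The following is a minimal sketch of that case; the class names IdentityHashDemo and FixedHash are hypothetical and chosen only for illustration.

```java
// Hypothetical demo; the class names are illustrative and not from the article.
final class IdentityHashDemo {

    /** A class that overrides hashCode() with a fixed value. */
    static final class FixedHash {
        @Override
        public int hashCode() {
            return 42; // 0x2a in hexadecimal
        }
    }

    public static void main(String[] args) {
        FixedHash o = new FixedHash();

        // Object.toString() calls the overridden hashCode(), so the output ends with "@2a".
        System.out.println(o);

        // The overridden value: 42.
        System.out.println(o.hashCode());

        // The JVM-generated identity hash, which is not affected by the override.
        System.out.println(System.identityHashCode(o));
    }
}
```

The first two lines of output are fixed by the override, while the third depends on the JVM's generation algorithm, which is what the rest of this article looks at.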
How is the hashcode of the object generated? Is it really the memory address?
This article is based on Java 8 HotSpot.
The generation logic of hashCode
The logic of generating hashCode in JVM is not that simple. It provides several strategies, and each strategy generates different results.
The core method that generates the hashCode in the OpenJDK source code:
static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
    // This form uses an unguarded global Park-Miller RNG,
    // so it's possible for two threads to race and generate the same RNG.
    // On MP system we'll have lots of RW access to a global, so the
    // mechanism induces lots of coherency traffic.
    value = os::random() ;
  } else
  if (hashCode == 1) {
    // This variation has the property of being stable (idempotent)
    // between STW operations. This can be useful in some of the 1-0
    // synchronization schemes.
    intptr_t addrBits = intptr_t(obj) >> 3 ;
    value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
    value = 1 ; // for sensitivity testing
  } else
  if (hashCode == 3) {
    value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
    value = intptr_t(obj) ;
  } else {
    // Marsaglia's xor-shift scheme with thread-specific state
    // This is probably the best overall implementation -- we'll
    // likely make this the default in future releases.
    unsigned t = Self->_hashStateX ;
    t ^= (t << 11) ;
    Self->_hashStateX = Self->_hashStateY ;
    Self->_hashStateY = Self->_hashStateZ ;
    Self->_hashStateZ = Self->_hashStateW ;
    unsigned v = Self->_hashStateW ;
    v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
    Self->_hashStateW = v ;
    value = v ;
  }
  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}
The source code shows that the generation strategy is controlled by a global variable named hashCode, which defaults to 5. This variable is defined in another header file:
product(intx, hashCode, 5,
"(Unstable) select hashCode generation algorithm" )
The flag description says it all: "(Unstable) select hashCode generation algorithm". The value defined here can be overridden through a JVM startup parameter. First, confirm the default value:
java -XX:+PrintFlagsFinal -version | grep hashCode
intx hashCode = 5 {product}
openjdk version "1.8.0_282"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_282-b08)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.282-b08, mixed mode)
So we can select a different hashCode generation algorithm through a JVM startup parameter and compare the results each one produces:
-XX:hashCode=N
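A small program makes it easy to compare the algorithms. The sketch below prints the identity hash codes of a few fresh objects in hexadecimal; the class name HashCodeFlagDemo and the helper sampleHashes are hypothetical names introduced here for illustration.

```java
// Hypothetical helper for experimenting with -XX:hashCode=N; names are illustrative.
final class HashCodeFlagDemo {

    /** Returns the identity hash codes of `count` freshly created objects, in hexadecimal. */
    static String[] sampleHashes(int count) {
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            result[i] = Integer.toHexString(System.identityHashCode(new Object()));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String hex : sampleHashes(5)) {
            System.out.println(hex);
        }
    }
}
```

After compiling, it can be started with a chosen algorithm, for example java -XX:hashCode=3 HashCodeFlagDemo, and the printed values will follow whatever strategy that setting selects.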
Now let's take a look at how each hashCode generation algorithm behaves.
Algorithm 0
if (hashCode == 0) {
  // This form uses an unguarded global Park-Miller RNG,
  // so it's possible for two threads to race and generate the same RNG.
  // On MP system we'll have lots of RW access to a global, so the
  // mechanism induces lots of coherency traffic.
  value = os::random();
}
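This algorithm takes its value from os::random(), a Park-Miller random number generator. Note, however, that the generator's state is a single unguarded global: as the source comment says, two threads can race and obtain the same value, and on multiprocessor systems the heavy read/write access to that global causes a lot of cache-coherency traffic.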
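To make the Park-Miller idea concrete, here is a minimal Java sketch of the "minimal standard" generator, next = (16807 * seed) mod (2^31 - 1). It is not HotSpot's code: the class ParkMillerSketch and its methods are hypothetical, and the constants are the classic Park-Miller ones, assumed here to match what os::random() uses.

```java
// Illustrative sketch of the Park-Miller "minimal standard" generator.
// Not HotSpot's implementation; class and method names are hypothetical.
final class ParkMillerSketch {
    private static final long A = 16807L;       // multiplier, 7^5
    private static final long M = 2147483647L;  // modulus, 2^31 - 1

    // Shared, unguarded state: concurrent callers can race on this field,
    // similar to the "unguarded global" described in the HotSpot comment.
    private static long seed = 1L;

    /** Computes the next value of the sequence: (A * current) mod M. */
    static long next(long current) {
        return (A * current) % M;
    }

    /** Returns the next value and updates the shared seed (deliberately not thread-safe). */
    static int random() {
        seed = next(seed);
        return (int) seed;
    }

    public static void main(String[] args) {
        long value = 1L;
        for (int i = 0; i < 10_000; i++) {
            value = next(value);
        }
        // Park and Miller's published check value for seed 1 after 10,000 steps.
        System.out.println(value); // prints 1043618065
    }
}
```

Because every caller reads and writes the same seed field, all threads contend for one memory location, which is the cost the source comment warns about.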
Algorithm 1
if (hashCode == 1) {
  // This variation has the property of being stable (idempotent)
  // between STW operations. This can be useful in some of the 1-0
  // synchronization schemes.
  intptr_t addrBits = intptr_t(obj) >> 3 ;
  value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
}
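This algorithm really is derived from the object's memory address. It casts the object pointer to intptr_t, shifts it right by 3 bits, and XORs the result with a further-shifted copy of itself and with GVars.stwRandom, so the value stays stable between stop-the-world (STW) operations, as the source comment notes.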
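The arithmetic of this algorithm can be replayed in Java with made-up inputs. The sketch below is only an illustration: the class AddressHashSketch, the sample address, and the stwRandom values are hypothetical, and the mask 0x7FFFFFFF is an assumption about markOopDesc::hash_mask on a 64-bit JVM.

```java
// Illustrative replay of algorithm 1's arithmetic; names and inputs are hypothetical.
final class AddressHashSketch {
    // Assumed value of markOopDesc::hash_mask on a 64-bit JVM (31 hash bits).
    static final long HASH_MASK = 0x7FFFFFFFL;

    /** Mixes an address with stwRandom the way algorithm 1 does, then masks the result. */
    static long mix(long address, long stwRandom) {
        // Drop the low 3 bits (typically zero because of 8-byte object alignment).
        long addrBits = address >> 3;
        long value = addrBits ^ (addrBits >> 5) ^ stwRandom;
        value &= HASH_MASK;                 // keep only the bits that fit in the hash field
        return value == 0 ? 0xBAD : value;  // 0 is replaced, as in get_next_hash
    }

    public static void main(String[] args) {
        // Hypothetical address 0x100 with stwRandom 0: 32 ^ 1 ^ 0 = 33.
        System.out.println(mix(0x100L, 0L)); // prints 33
        // Same address, different stwRandom: 33 ^ 7 = 38.
        System.out.println(mix(0x100L, 7L)); // prints 38
    }
}
```

The second call shows why the result is tied to the address yet is not the address itself: a different stwRandom changes the value for the same object location.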
Algorithm 2
if (hashCode == 2) {
  value = 1 ; // for sensitivity testing
}
This one needs little explanation: it always returns 1 and, judging by the comment, is meant for internal sensitivity testing.
If you are curious, start the JVM with -XX:hashCode=2 to enable this algorithm and check whether every hashCode is 1.
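To get a feel for what a constant hash code means for hash-based collections, the sketch below imitates the situation in plain Java, without any JVM flag. The class ConstantHashDemo and its Key type are hypothetical; Key simply overrides hashCode() to return 1 for every instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo; it imitates a constant hash code without changing JVM flags.
final class ConstantHashDemo {

    /** Key type whose hash code is always 1, so every key collides. */
    static final class Key {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 1;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Key && ((Key) other).id == id;
        }
    }

    /** Fills a map with `n` distinct keys that all share the same hash code. */
    static Map<Key, Integer> fill(int n) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, Integer> map = fill(1000);
        System.out.println(map.size());            // prints 1000
        System.out.println(map.get(new Key(123))); // prints 123
    }
}
```

The map stays correct because equals() still tells the keys apart, but every entry lands in the same bucket, so lookups lose the speed that a well-spread hash code provides.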
Algorithm 3
if (hashCode == 3) {
  value = ++GVars.hcSequence ;
}
This algorithm is also very simple: a global auto-incrementing sequence. Every object's hashCode is taken from this one counter. Let's try it out with -XX:hashCode=3:
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
//output
java.lang.Object@144
java.lang.Object@145
java.lang.Object@146
java.lang.Object@147
java.lang.Object@148
java.lang.Object@149
Sure enough, the values increase by one each time (note that they are printed in hexadecimal). Interesting.
Algorithm 4
if (hashCode == 4) {
  value = intptr_t(obj) ;
}
This differs little from algorithm 1: both are based on the object's address, but algorithm 1 mixes the address bits while this one uses the address as is.
Algorithm 5
The last one is also the default generation algorithm. It is used whenever the hashCode setting is not 0, 1, 2, 3, or 4:
else {
  // Marsaglia's xor-shift scheme with thread-specific state
  // This is probably the best overall implementation -- we'll
  // likely make this the default in future releases.
  unsigned t = Self->_hashStateX ;
  t ^= (t << 11) ;
  Self->_hashStateX = Self->_hashStateY ;
  Self->_hashStateY = Self->_hashStateZ ;
  Self->_hashStateZ = Self->_hashStateW ;
  unsigned v = Self->_hashStateW ;
  v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
  Self->_hashStateW = v ;
  value = v ;
}
Here the hash value is obtained by applying shift and exclusive OR (XOR) operations to state stored in the current thread. Because it does not touch any shared global state, it is more efficient than the auto-increment and random algorithms above, although the repetition rate may be somewhat higher. But does a repeated hashCode matter?
Not really: the JVM never guaranteed that this value is unique in the first place. HashMap, for example, uses separate chaining precisely to resolve hash collisions.
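The default xor-shift algorithm can be approximated in Java as a small class that holds the four state words. This is a sketch under assumptions, not the JVM's implementation: the class XorShiftHashSketch is hypothetical, the seeds 1, 2, 3, 4 are arbitrary placeholders rather than HotSpot's initial values, and the mask 0x7FFFFFFF is an assumed value of markOopDesc::hash_mask on a 64-bit JVM.

```java
// Illustrative Java port of the xor-shift branch; names and seeds are hypothetical.
final class XorShiftHashSketch {
    // Assumed value of markOopDesc::hash_mask on a 64-bit JVM (31 hash bits).
    static final int HASH_MASK = 0x7FFFFFFF;

    // Per-thread state, standing in for Self->_hashStateX .. _hashStateW.
    private int x, y, z, w;

    /** Seeds are supplied by the caller; they are placeholders, not HotSpot's values. */
    XorShiftHashSketch(int x, int y, int z, int w) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.w = w;
    }

    /** One xor-shift step, followed by the mask and the zero check from get_next_hash. */
    int nextHash() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        // >>> is Java's unsigned right shift, matching the C++ "unsigned" arithmetic.
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int value = v & HASH_MASK;
        return value == 0 ? 0xBAD : value;
    }

    public static void main(String[] args) {
        XorShiftHashSketch state = new XorShiftHashSketch(1, 2, 3, 4); // arbitrary seeds
        for (int i = 0; i < 5; i++) {
            System.out.println(Integer.toHexString(state.nextHash()));
        }
    }
}
```

In a multi-threaded setting, one such state object per thread (for example via ThreadLocal) would mirror the thread-specific state mentioned in the source comment, so no thread ever waits on another to produce a hash.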
Summary
A hashCode may or may not be a memory address; it can even be the constant 1 or an auto-incrementing number. It all depends on which algorithm the JVM is configured to use!