hash value in java objects
Krystal Mok
rednaxelafx at gmail.com
Sat Dec 8 07:50:17 PST 2012
The identity hash code of a Java object is calculated lazily because
most Java objects never need it. Objects are usually hashed only when
used as keys in hash-based containers such as HashMap, and the objects
used as keys this way are usually a small fraction of all objects. Of
those used as keys, many belong to classes that provide their own
implementation of hashCode(), which doesn't need the identity hash code.
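To make the distinction concrete, here is a minimal sketch in Java. The class names (IdentityHashDemo, Plain, Point) are made up for illustration. It only relies on the documented behavior of Object.hashCode() and System.identityHashCode(): for a class that doesn't override hashCode(), the two return the same value; for a class that does, HashMap uses the overriding method and never asks for the identity hash code.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityHashDemo {
    // Hypothetical class that does NOT override hashCode():
    // calling hashCode() on it yields the identity hash code.
    static final class Plain {}

    // Hypothetical class that overrides equals()/hashCode():
    // using it as a HashMap key never touches the identity hash code.
    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    public static void main(String[] args) {
        Plain p = new Plain();
        // Same value either way, since Plain inherits Object.hashCode().
        System.out.println(p.hashCode() == System.identityHashCode(p));

        Point a = new Point(1, 2);
        Map<Point, String> m = new HashMap<>();
        m.put(a, "a");
        // Lookup with an equal but distinct key works because the map
        // hashes with Point.hashCode(), not the identity hash code.
        System.out.println(m.get(new Point(1, 2)));
    }
}
```

Only an explicit System.identityHashCode(a) call would make the VM compute an identity hash code for the Point in this sketch.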
It takes space to store the hash code. In the HotSpot VM, the mark word is
multiplexed to store a few kinds of metadata, but of course it can't
store all of them when they're all present. Storing data that may never
be used would waste space; that's the original idea behind
calculating the identity hash code lazily. The problem of wasted
space is especially bad for objects that have only a couple of fields.
The same goes for monitors.
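As a rough illustration of what "multiplexed" means here, the following Java sketch models a mark word as a plain long. It is a toy model, not HotSpot code: the class and method names are hypothetical, the bit positions are my assumption based on the 64-bit layout described in markOop.hpp (31 hash bits sitting above the lock, biased-lock, age and one unused bit), and the locked states, where the hash moves to the displaced mark word or the inflated lock, are not modeled at all. A hash field of 0 stands for "not calculated yet".

```java
import java.util.concurrent.ThreadLocalRandom;

public class MarkWordSketch {
    // Assumed layout, low bits first: lock:2, biased_lock:1, age:4,
    // unused:1, hash:31. Treat these numbers as illustrative.
    static final int LOCK_BITS = 2;
    static final int BIASED_LOCK_BITS = 1;
    static final int AGE_BITS = 4;
    static final int UNUSED_GAP_BITS = 1;
    static final int HASH_BITS = 31;
    static final int HASH_SHIFT =
            LOCK_BITS + BIASED_LOCK_BITS + AGE_BITS + UNUSED_GAP_BITS;
    static final long HASH_MASK = (1L << HASH_BITS) - 1;
    static final long UNLOCKED = 0b01; // lock bits of an unlocked object

    // Extract the hash field with a shift and a mask.
    static int hashOf(long mark) {
        return (int) ((mark >>> HASH_SHIFT) & HASH_MASK);
    }

    // Return a copy of the mark word with the hash field replaced.
    static long withHash(long mark, int hash) {
        long cleared = mark & ~(HASH_MASK << HASH_SHIFT);
        return cleared | ((hash & HASH_MASK) << HASH_SHIFT);
    }

    // Produce a non-zero 31-bit value, since 0 is reserved for "no hash".
    // A random number is just a stand-in for whatever the VM really uses.
    static int nextHash() {
        int h;
        do {
            h = ThreadLocalRandom.current().nextInt() & (int) HASH_MASK;
        } while (h == 0);
        return h;
    }

    // Hypothetical stand-in for an object header (unlocked case only).
    static final class FakeObject {
        long mark = UNLOCKED; // hash bits are 0 right after "allocation"

        int identityHash() {
            int h = hashOf(mark);
            if (h == 0) {            // not calculated yet: do it now
                h = nextHash();
                mark = withHash(mark, h);
            }
            return h;                // later calls just decode the field
        }
    }

    public static void main(String[] args) {
        FakeObject o = new FakeObject();
        System.out.println("before: " + hashOf(o.mark));
        System.out.println("first call: " + o.identityHash());
        System.out.println("second call: " + o.identityHash());
    }
}
```

The point of the sketch is only that the header pays nothing for the hash until someone asks for it, and that once stored, reading it back is a shift and a mask.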
There have been quite a few schemes for populating the identity hash
code and the monitor info lazily.
For example, the Monty VM (better known as the CLDC HotSpot Implementation, or
CLDC-HI) uses another scheme based on a "prototypical near class". You
can read section 3.1.1.3 of [1] if you're interested.
By the way, there can also be some obscure downsides to using the
identity hash code unnecessarily in HotSpot. See Topic2 of [2] for an
example. That blog is in Japanese, but I'm sure you can get the idea
just by reading the code examples. In short, it tries to
show how unnecessary use of the identity hash code can cause excessive
memory use during GC.
- Kris
[1]: http://verdich.dk/kasper/RES.pdf
[2]: http://www.nminoru.jp/~nminoru/diary/2012/02.html#20120218p1
On Sat, Dec 8, 2012 at 1:39 PM, Xin Tong <xerox.time.tech at gmail.com> wrote:
> On Fri, Dec 7, 2012 at 11:38 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>> Short answer is the identity hash code is calculated when:
>> 1. the hashCode() method is called the first time for this object, if
>> the type of this object doesn't override Object.hashCode();
>> 2. System.identityHashCode() is called the first time for this object.
>>
> would not materialize the hashcode takes much code. why not materialize lazily ?
>
> Xin
>> A "0" value encoded as the hash in the mark word indicates that the
>> identity hash code hasn't been calculated yet (which is the initial
>> state after an object is created).
>>
>> If the type of an object overrides Object.hashCode(), then calling it
>> doesn't have anything to do with the identity hash code.
>>
>> The identity hash code of a Java object is stored in the mark word of
>> the object header if it has been calculated and the object is
>> unlocked. Otherwise it may be stored in the displaced mark word or in
>> the inflated lock.
>>
>> - Kris
>>
>> On Sat, Dec 8, 2012 at 6:31 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>> Hash is encoded in the mark word, so I think the answer to your question is
>>> it's created when object is allocated (and thus gets a markOop). The actual
>>> hash value is decoded from the mark word at call time, but it's just a shift
>>> and mask at that point. Someone can correct me if this is wrong.
>>>
>>> Thanks
>>>
>>> Sent from my phone
>>>
>>> On Dec 7, 2012 4:52 PM, "Xin Tong" <xerox.time.tech at gmail.com> wrote:
>>>>
>>>> I am wondering when the hash values in the java objects are
>>>> materialized. are they materialized when the object is created ? or
>>>> when hashCode on the object is called. what about modifications to
>>>> the object ?
>>>>
>>>> Xin
More information about the hotspot-compiler-dev
mailing list