Question about mark words in allocated objects
Deneau, Tom
tom.deneau at amd.com
Mon Sep 16 11:39:37 PDT 2013
Doug --
Yes, that was it.
-- Tom
-----Original Message-----
From: Doug Simon [mailto:doug.simon at oracle.com]
Sent: Monday, September 16, 2013 1:35 PM
To: Deneau, Tom
Cc: graal-dev at openjdk.java.net; sumatra-dev at openjdk.java.net
Subject: Re: Question about mark words in allocated objects
On Sep 16, 2013, at 7:01 PM, "Deneau, Tom" <tom.deneau at amd.com> wrote:
> We are experimenting with an HSA device doing object allocation.
>
> In our prototype, one or more idle Java threads act as "donor threads", in that each donates its TLAB.
> In the simple case below, there is only one donor thread.
>
> Since there can be more HSA device workitems than there are donor threads, the workitems use atomic operations to bump the tlab.top pointer of a donor thread. The usual Graal code then takes over to fill in the contents of the allocated item, including the mark word, class pointer, etc.
> In our usual JUnit test cases, we run the "kernel function" sequentially on the CPU and then as an HSA kernel on the GPU and compare results. This is a "--vm server" run, so the CPU side is not going through Graal.
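For readers following along, the atomic bump of tlab.top described above can be modeled roughly as below. This is only a toy sketch in plain Java, not the prototype's HSA or Graal code: the class name DonorTlabSketch, the -1 "exhausted" return value, and the use of AtomicLong to stand in for the shared top pointer are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of a donor TLAB whose top pointer is bumped atomically by many workers. */
final class DonorTlabSketch {
    private final AtomicLong top; // stands in for tlab.top
    private final long end;       // stands in for tlab.end

    DonorTlabSketch(long start, long end) {
        this.top = new AtomicLong(start);
        this.end = end;
    }

    /** Reserves sizeInBytes and returns the block's start address, or -1 if it does not fit. */
    long allocate(long sizeInBytes) {
        while (true) {
            long oldTop = top.get();
            long newTop = oldTop + sizeInBytes;
            if (newTop > end) {
                return -1L; // hypothetical "TLAB exhausted" signal
            }
            // Only one worker wins the CAS for a given oldTop; losers retry.
            if (top.compareAndSet(oldTop, newTop)) {
                return oldTop;
            }
        }
    }

    long top() {
        return top.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 8 workers x 8 allocations x 16 bytes exactly fills a 1024-byte buffer.
        DonorTlabSketch tlab = new DonorTlabSketch(0x1000, 0x1000 + 1024);
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 8; j++) {
                    tlab.allocate(16);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(Long.toHexString(tlab.top())); // prints 1400
    }
}
```

Each successful compare-and-set hands a disjoint address range to exactly one worker, which is why the header and field initialization that follows can proceed without further synchronization.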
>
> In my test case, I am using a JNI function to print out the full object contents including the header after the "kernel" completes. The atomic pointer bumps seem to work correctly but I've noticed a slight difference in the header contents.
>
> Since we are running with the default UseBiasedLocking enabled, I can see that when the object is initialized from the "prototype mark word", it is initialized with the value 5 (I assume this means anonymously biased?).
It means biasable but unlocked (see HotSpotReplacementsUtil.biasedLockPattern() and its usage in MonitorSnippets).
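To make the two values concrete: the low three bits of the mark word hold the biased-lock bit followed by the two lock bits, so 5 is binary 101 (bias pattern) and 1 is binary 001 (plain unlocked). A small sketch of that decoding follows; the class and method names are made up for illustration, and it only distinguishes the cases discussed in this thread.

```java
/** Decodes the low bits of a HotSpot mark word for the two cases seen in this thread. */
final class MarkWordBits {
    static final long BIASED_LOCK_MASK = 0b111; // biased-lock bit plus two lock bits
    static final long LOCK_MASK = 0b011;        // two lock bits only
    static final long BIASED_PATTERN = 0b101;   // value 5
    static final long UNLOCKED = 0b001;         // value 1

    static String describe(long markWord) {
        if ((markWord & BIASED_LOCK_MASK) == BIASED_PATTERN) {
            return "biasable";
        }
        if ((markWord & LOCK_MASK) == UNLOCKED) {
            return "unlocked";
        }
        return "other"; // locked, inflated monitor, or GC-marked; not decoded here
    }

    public static void main(String[] args) {
        System.out.println(describe(5L)); // prints biasable
        System.out.println(describe(1L)); // prints unlocked
    }
}
```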
> I have noticed that when we run normal sequential Java and print out the mark word after the kernel has run, the mark word has a value of 1 (unlocked). But if the kernel has run on the GPU and we print out the mark word, it is still 5 in each object. The rest of the object contents match between the CPU and GPU runs.
>
> Where does the mark word get changed from 5 to 1 on the CPU side?
This is almost certainly due to the delayed initialization of biased locking - see BiasedLocking::init (in biasedLocking.cpp). Try adding -XX:BiasedLockingStartupDelay=0 as a VM option.
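One way to confirm which settings a given run actually used is to query the flags from inside the test JVM. The sketch below uses the standard com.sun.management.HotSpotDiagnosticMXBean; the class name BiasedLockingFlags and the "unavailable" fallback string are assumptions for illustration. The fallback matters because newer JDK releases no longer have the biased locking flags, in which case the lookup throws.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Reads HotSpot flag values at run time, tolerating flags the running VM does not know. */
final class BiasedLockingFlags {
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the VM has no option with this name.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseBiasedLocking = " + flagValue("UseBiasedLocking"));
        System.out.println("BiasedLockingStartupDelay = "
                + flagValue("BiasedLockingStartupDelay"));
    }
}
```

Running this with and without -XX:BiasedLockingStartupDelay=0 should show whether the option reached the VM that executed the test.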
-Doug