On constant folding of final field loads
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Mon Jun 29 13:10:42 UTC 2015
Aleksey,
Thanks a lot for the feedback!
See my answers inline.
On 6/29/15 1:35 PM, Aleksey Shipilev wrote:
> Hi,
>
> On 06/27/2015 04:27 AM, Vladimir Ivanov wrote:
>> Current prototype:
>> http://cr.openjdk.java.net/~vlivanov/8058164/webrev.00/hotspot
>> http://cr.openjdk.java.net/~vlivanov/8058164/webrev.00/jdk
>>
>> The idea is simple: JIT tracks final field changes and throws away
>> nmethods which are affected.
>
> Big picture question: do we actually care about propagating final field
> values once the object escaped (and in this sense, available to be
> introspected by the compiler)?
>
> Java memory model does not guarantee the final field visibility when the
> object had escaped. The very reason why deserialization works is because
> the deserialized object had not yet been published.
>
> That is, are we in line with the spec and general expectations by
> folding the final values, *and* not deoptimizing on the store?
Can you elaborate a bit on your point and on how it interacts with the JMM?
Are you suggesting not tracking constant-folded final field values at
all, since the JMM gives no guarantee that such updates are visible?
>> Though Unsafe.objectFieldOffset/staticFieldOffset javadoc explicitly
>> states that returned value is not guaranteed to be a byte offset [1],
>> after following that road I don't see how offset encoding scheme can be
>> changed.
>
> Yes. Lots and lots of users rely on *fieldOffset to return the actual
> byte offset, even though it is not specified as such. This understanding
> is so prevalent, that it leaks into Unsafe.get*Unaligned, etc.
>
>
>> More realistically, since there are external dependencies on Unsafe API,
>> I'd prefer to leave sun.misc.Unsafe as is and switch to VarHandles (when
>> they are available in 9) all over JDK. Or temporarily make a private
>> copy (finally :-)) of field accessors from Unsafe, switch it to encoded
>> offsets, and use it in Reflection & java.lang.invoke API.
>
> Or, introduce Unsafe.invalidateFinalDep(Field/offset/etc), and add the
> call to it to Reflection accessors, MethodHandles invoke, VarHandle
> handles, etc. When/if Unsafe goes away, so do the unsafe
> (non-dependency-firing) final field stores. Raw memory access via Unsafe
> already escapes whatever traps you are setting in (oop + offset) path,
> so it would be nice to have the option to fire the dependency check for
> an arbitrary (?) offset.
>
>
>> Regarding alternative approaches to track the finality, an offset bitmap
>> on per-class basis can be used (containing locations of final fields).
>> Possible downsides are: (1) memory footprint (1/8th of instance size per
>> class); and (2) more complex checking logic (load a relevant piece of a
>> bitmap from a klass, instead of checking locally available offset
>> cookie). The advantage is that it is completely transparent to a user:
>> it doesn't change offset translation scheme.
>
> I like this one. Paying with slightly larger memory footprint for API
> compatibility sounds reasonable to me.
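To make the bitmap alternative concrete, here is a minimal sketch, assuming one bit per byte of the instance layout (which is where the 1/8th footprint estimate comes from). The class and method names are hypothetical and are not taken from the prototype; a real implementation would live in the VM's klass metadata, not in Java code.

```java
import java.util.BitSet;

// Hypothetical per-class table: bit i is set if byte i of an instance
// belongs to a final field.
final class FinalFieldBitmap {
    private final BitSet finalBytes;

    FinalFieldBitmap(int instanceSizeInBytes) {
        this.finalBytes = new BitSet(instanceSizeInBytes);
    }

    // Mark every byte occupied by a final field of the given size.
    void markFinal(int offset, int sizeInBytes) {
        finalBytes.set(offset, offset + sizeInBytes);
    }

    // True if a write of sizeInBytes at offset overlaps any final field.
    boolean touchesFinal(int offset, int sizeInBytes) {
        int next = finalBytes.nextSetBit(offset);
        return next >= 0 && next < offset + sizeInBytes;
    }
}
```

The check needs the class-level table on every write, which is the "more complex checking logic" downside mentioned in the quoted text, but the offsets handed out to users stay plain byte offsets.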
I don't care about cases where the Unsafe API is abused (e.g. raw memory
writes to an absolute address or to an arbitrary offset in an object). In
the end, it's an unsafe API, right? :-)
What I want to cover is proper use of the Unsafe API to access
instance/static fields. That's the part which is used in Reflection &
java.lang.invoke API. Unsafe is used there to bypass access checks.
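For illustration, this is the kind of "proper" final field write in question, shown at the core reflection level. It is a minimal sketch: the Holder class is hypothetical, and how the write reaches memory inside the JDK is not shown. Note that newer JDK releases may warn about or restrict this kind of mutation.

```java
import java.lang.reflect.Field;

// Hypothetical class with a non-static final field set in the constructor.
final class Holder {
    final int value;
    Holder(int value) { this.value = value; }
}

final class FinalWriteDemo {
    // Overwrite Holder.value through core reflection.
    static void overwrite(Holder h, int newValue) {
        try {
            Field f = Holder.class.getDeclaredField("value");
            f.setAccessible(true); // required before a final instance field can be set
            f.setInt(h, newValue);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the field back through reflection.
    static int read(Holder h) {
        try {
            Field f = Holder.class.getDeclaredField("value");
            f.setAccessible(true);
            return f.getInt(h);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If a compiled method had constant-folded the old value of the field, a write like this is exactly the event that would have to invalidate it.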
That doesn't mean I'm fine with breaking existing user code. But since
Unsafe is not a supported API, I think some limited changes in a major
release (e.g. 9) are acceptable. What I'm trying to understand is to what
extent it can be changed.
My experiments show that simply changing the offset encoding strategy
doesn't work: there are cases where absolute offsets are needed.
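As an illustration of why an encoded offset stops being usable as an absolute one, here is a sketch of one possible encoding. The layout (finality flag in the lowest bit) is purely an assumption for the example; the scheme used in the prototype may differ, and all names are hypothetical.

```java
// Hypothetical offset cookie: the byte offset shifted left by one,
// with the lowest bit recording whether the field is final.
final class OffsetCookie {
    private static final long FINAL_BIT = 1L;

    static long encode(long byteOffset, boolean isFinal) {
        return (byteOffset << 1) | (isFinal ? FINAL_BIT : 0L);
    }

    static long byteOffset(long cookie) {
        return cookie >>> 1;
    }

    static boolean isFinal(long cookie) {
        return (cookie & FINAL_BIT) != 0;
    }
}
```

With such a scheme the finality check is local to the cookie, but any caller that does arithmetic on the returned value (for example, adding a field offset to a base address) gets a wrong location unless it decodes the cookie first.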
So, my next question is how to proceed. Would changing the API to provide
two sets of functions, one working with absolute offsets and one with
encoded offsets, solve the problem? Or would leaving Unsafe as is (but
clarifying the API) and migrating Reflection/j.l.i to VarHandles solve it?
That's what I'm trying to understand.
>
>> II. Managing relations between final fields and nmethods
>> Another aspect is how expensive dependency checking becomes.
>>
>> I took a benchmark from Nashorn/Octane (Box2D), since MethodHandle
>> inlining heavily relies on constant folding of instance final fields.
>>
>>                       Before   After
>> checks (#)               420   12.5K
>> nmethods checked (#)      3K    1.5M
>> total time              60ms      2s
>> deps total               19K     26K
>>
>> Though the total number of dependencies in the VM didn't change much
>> (+37%: 19K -> 26K), the total number of checked dependencies (500x:
>> 3K -> 1.5M) and the time spent on dependency checking (30x: 60ms -> 2s)
>> increased dramatically.
>>
>> The reason is that constant field value dependencies created heavily
>> populated contexts which are regularly checked:
>>
>> #1                      #2   #3/#4
>> Before
>>   KlassDep             254   47/    2,632
>>   CallSiteDep          167   46/      358
>>
>> After
>>   ConstantFieldDep  11,790    0/1,494,112
>>   KlassDep             286   41/    2,769
>>   CallSiteDep          249   58/      393
>>
>> (#1 - dependency kind; #2 - total number of unique dependencies;
>> #3/#4 - invalidated nmethods/checked dependencies)
>
> Isn't the underlying problem being the dependencies are searched
> linearly? At least in ConstantFieldDep, can we compartmentalize the
> dependencies by holder class in some sort of hash table?
In some cases (when coarse-grained, per-class tracking is used), linear
traversal is fine, since all nmethods will be invalidated anyway.
In order to construct a more efficient data structure, you need a way to
order or hash oops. The problem is that oops aren't stable: they can
change at any GC. So, either some stable value should be associated
with them (System.identityHashCode()?) or the dependency tables should be
updated on every GC.
As long as the existing machinery can be sped up to an acceptable level, I
wouldn't consider complicating things that much.
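For reference, the "stable value" variant could look roughly like the sketch below at the Java level. It is only an illustration under stated assumptions: compiled methods are represented by plain strings, dependencies are tracked per holder object instead of per field, and all names are hypothetical. The VM-internal tables work on oops and nmethods, not on Java collections.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical dependency table keyed by the holder object's identity.
final class ConstantFieldDeps {
    // IdentityHashMap compares keys by reference and hashes them by identity
    // hash, which stays the same when the GC moves the object.
    // Simplification: the map holds strong references to the holders.
    private final Map<Object, List<String>> dependents = new IdentityHashMap<>();

    // Record that compiledMethod folded a final field value of holder.
    void record(Object holder, String compiledMethod) {
        dependents.computeIfAbsent(holder, k -> new ArrayList<>()).add(compiledMethod);
    }

    // On a final field write, return (and forget) only the methods that
    // depend on this holder, without scanning unrelated dependencies.
    List<String> invalidate(Object holder) {
        List<String> hit = dependents.remove(holder);
        return hit == null ? Collections.<String>emptyList() : hit;
    }
}
```

The lookup cost no longer depends on the total number of recorded dependencies, at the price of maintaining an extra identity-keyed structure.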
The 3 optimizations I initially proposed make it possible to isolate
ConstantFieldDep from the other kinds of dependencies, so dependency
traversal speed will affect only final field writes, which is acceptable
IMO.
Best regards,
Vladimir Ivanov
More information about the hotspot-compiler-dev
mailing list