On constant folding of final field loads
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Jul 21 18:48:07 UTC 2015
Ian,
Thanks for sharing the reference! Though jython problem was about static
final fields, it is still applicable to the discussion. Multiple writes
to instance final fields are allowed by JVMS and should be correctly
handled.
There's still some mismatch in field finality between JVMS & JLS, but it
seems JVMS was slightly tied up since 2010. JVM is required to throw
runtime exception when attempting to write to a final field from outside
of <init>/<clinit> [1].
JVMS 6 [2] allowed to write to final fields in current class.
Fixing the problem for static final fields doesn't look like a hard
problem. JIT can inspect field holder class state and skip the
optimization if the class hasn't been fully initialized yet.
Best regards,
Vladimir Ivanov
[1]
http://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html#jvms-6.5.putfield:
"Otherwise, if the field is final, it must be declared in the current
class, and the instruction must occur in an instance initialization
method (<init>) of the current class. Otherwise, an IllegalAccessError
is thrown."
[2]
https://docs.oracle.com/javase/specs/jvms/se6/html/Instructions2.doc11.html
"Otherwise, if the field is final, it must be declared in the current
class. Otherwise, an IllegalAccessError is thrown."
On 7/21/15 8:24 PM, Ian Rogers wrote:
> Fwiw, Jikes RVM will propagate final field values without any checking
> if reflection or JNI is being used to abuse the meaning of finality. As
> Jikes RVM is seldom used to run anything other than (elderly) benchmarks
> this has only turned up one problem, specifically in jython:
>
> http://bugs.jython.org/issue1611
>
> In the bug there is some (hopefully) interesting discussion on the
> meaning of finality, notably that the language and VM spec disagree on
> when final fields may be written. This was why the problem hadn't
> occurred with Java bytecode but did occur with Jython generated bytecode.
>
> A related topic maybe that perhaps Java finalizers should be allowed to
> write to final fields too, in particular for the case that they
> represent some kind of native resource.
>
> Thanks,
> Ian Rogers, Google.
>
>
> On Fri, Jun 26, 2015 at 6:27 PM, Vladimir Ivanov
> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>> wrote:
>
> Hi there,
>
> Recently I started looking at constant folding of loads from
> instance final fields:
> https://bugs.openjdk.java.net/browse/JDK-8058164
>
> I made some progress and wanted to share my findings and initiate a
> discussion about the problems I spotted.
>
> Current prototype:
> http://cr.openjdk.java.net/~vlivanov/8058164/webrev.00/hotspot
> http://cr.openjdk.java.net/~vlivanov/8058164/webrev.00/jdk
>
> The idea is simple: JIT tracks final field changes and throws away
> nmethods which are affected.
>
> There are 2 parts of the problem:
> - how to track changes of final field
> - how to manage relation between final fields and nmethods;
>
> I. Tracking changes of final fields
>
> There are 4 ways to circumvent runtime limitations and change a
> final field value:
> - Reflection API (Field.setAccessible())
> - Unsafe
> - JNI
> - java.lang.invoke (MethodHandles)
>
> (It's also possible to write to a final field in a constructor, but
> I consider it as a corner case and haven't addressed yet. VM can
> ignore )
>
> Since Reflection & java.lang.invoke APIs use Unsafe, it ends up with
> only 2 cases: JNI & Unsafe.
>
> For JNI it's possible to encode field "finality" in jfieldID and
> check corresponding bit in Set*Field JNI methods before updating a
> field. There are already some data encoded in the field ID, so
> extending it to record final bit as well went pretty smooth.
>
> For Unsafe it's much more complex.
> I started with a similar approach (mostly implemented in the current
> prototype) - record "finality" bit in offset cookie and check it
> when performing a write.
>
> Though Unsafe.objectFieldOffset/staticFieldOffset javadoc explicitly
> states that returned value is not guaranteed to be a byte offset
> [1], after following that road I don't see how offset encoding
> scheme can be changed.
>
> First of all, there are Unsafe.get*Unaligned methods (added in 9),
> which require byte offset (Unsafe.getLong()):
> "Fetches a value at some byte offset into a given Java object
> ...
> The specification of this method is the same as {@link
> #getLong(Object, long)} except that the offset does not need to
> have been obtained from {@link #objectFieldOffset} on the
> {@link java.lang.reflect.Field} of some Java field."
>
> Unsafe.getInt supports 3 addressing modes:
> (1) NULL + address
> (2) oop + offset
> (3) base + index * scale
>
> Since there are no methods in Unsafe to get byte offsets, there's no
> way to make (3) work with non-byte offsets for Unaligned versions.
> Both base and scale should be either byte offsets or offset cookies
> to make things work. You can get a sense of the problems looking
> into Unsafe & java.nio hacks I did to make things somewhat function
> after switching offset encoding strategy.
>
> Also, Unsafe.copyMemory() doesn't work well with offset cookies (see
> java.nio.Bits changes I did). Though source and destination
> addressing shares the same mode with Unsage.getInt() et al., the
> size of the copied region is defined in bytes. So, in order to
> perform bulk copies of consecutive memory blocks, the user should be
> able to convert offset cookie to byte offset and vice versa. There's
> no way to solve that with current API right now.
>
> I don't want to touch compatibility concerns of switching from byte
> offsets to encoded offsets, but it looks like Unsafe API needs some
> overhaul in 9 to make offset encoding viable.
>
> More realistically, since there are external dependencies on Unsafe
> API, I'd prefer to leave sun.misc.Unsafe as is and switch to
> VarHandles (when they are available in 9) all over JDK. Or
> temporarily make a private copy (finally :-)) of field accessors
> from Unsafe, switch it to encoded offsets, and use it in Reflection
> & java.lang.invoke API.
>
> Regarding alternative approaches to track the finality, an offset
> bitmap on per-class basis can be used (containing locations of final
> fields). Possible downsides are: (1) memory footprint (1/8th of
> instance size per class); and (2) more complex checking logic (load
> a relevant piece of a bitmap from a klass, instead of checking
> locally available offset cookie). The advantage is that it is
> completely transparent to a user: it doesn't change offset
> translation scheme.
>
>
> II. Managing relations between final fields and nmethods
>
> Nmethods dependencies suits that purpose pretty well, but some
> enhancements are needed.
>
> I envision 2 types of dependencies: (1) per-class (field holder);
> and (2) per-instance (value holder). Field holder is used as a context.
>
> Unless a final field is changed, there's no need to track
> per-instance dependency. VM optimistically starts in per-class mode
> and switch to per-instance mode when it sees a field change. The
> policy is chosen on per-class basis. VM should be pretty
> conservative, since false positives are expensive - a change of
> unrelated field causes recompilation of all nmethods where the same
> field was inlined (even if the value was taken from a different
> instance). Aliasing also causes false positives (same instance, but
> different final field), so fields in the same class should be
> differentiated as well.
>
> Unilke methods, fields don't have any dedicated metadata associated
> with them. All data is confined in holder klass. To be able to
> identify a field in a dependency, byte offset can be used. Right
> now, dependency management machinery supports only oops and
> metadata. So, it should be extended to support primitive values in
> dependencies (haven't done yet).
>
> Byte offset + per-instance modes completely eliminates false positives.
>
> Another aspect is how expensive dependency checking becomes.
>
> I took a benchmark from Nashorn/Octane (Box2D), since MethodHandle
> inlining heavily relies on constant folding of instance final fields.
>
> Before After
> checks (#) 420 12,5K
> nmethods checked(#) 3K 1,5M
> total time: 60ms 2s
> deps total 19K 26K
>
> Though total number of dependencies in VM didn't change much (+37% =
> 19K->26K), total number of checked dependencies (500x: 3K -> 1,5M)
> and time spent on dependency checking (30x: 60ms -> 2s) dramatically
> increased.
>
> The reason is that constant field value dependencies created heavily
> populated contextes which are regularly checked:
>
> #1 #2 #3/#4
> Before
> KlassDep 254 47/2,632
> CallSiteDep 167 46/ 358
>
> After
> ConstantFieldDep 11,790 0/1,494,112
> KlassDep 286 41/ 2,769
> CallSiteDep 249 58/ 393
>
> (#1 - dependency kind; #2 - total number of unique dependencies;
> #3/#4 - invalidated nmethods/checked dependencies)
>
> I have 3 ideas how to improve performance of dependency checking:
>
> (1) split dependency context list (nmethodBucket) into 3
> independent lists (Klass, CallSite & ConstantValue); (IMPLEMENTED)
>
> It trades size for speed - duplicate nmethods are possible, but the
> lists should be shorter on average. I already implemented it, but it
> didn't improve the benchmark I'm playing with, since the fraction of
> CallSite/Klass deps is very small compared to ConstantField.
>
> (2) group nmethodBucket entries into chunks of k-nmethods; (TODO)
>
> It should improve nmethod iteration speed in heavily populated contexts.
>
> (3) iterate only dependencies of appropriate kind; (TODO)
>
> There are 3 kinds of changes which require dependency checking:
> changes in CHA (KlassDepChange), call site target change
> (CallSiteDepChange), and constant field value change
> (ConstantFieldDepChange). Different types of changes affect disjoint
> sets of dependencies. So, instead of enumerating all dependencies in
> a nmethod, a more focused approach can be used (e.g. check only
> call_site_target_value deps for CallSiteDepChange).
>
> Since dependencies are sorted by type when serialized in a nmethod,
> it's possible to compute offsets for 3 disjoint sets and use them in
> DepStream to iterate only relevant dependencies.
>
> I hope it'll significantly reduce dependency checking costs I'm seeing.
>
> That's all for now. Thanks!
>
> Best regards,
> Vladimir Ivanov
>
> [1] "Do not expect to perform any sort of arithmetic on this offset;
> it is just a cookie which is passed to the unsafe heap memory
> accessors."
>
>
More information about the hotspot-compiler-dev
mailing list