RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken

Mon Sep 23 06:59:01 UTC 2019

Hi Richard,

On 22/09/2019 7:06 am, Reingruber, Richard wrote:
> Hi David,
> 
> I still think that JVMTI has to reflect the state of the abstract virtual machine and it must hide
> optimizations from its client (Joe the Java developer).

I'm not sure I agree as a blanket statement. It may not be feasible to 
perform certain optimisations and at the same time keep around enough 
information to reconstruct the state as it would have been had the 
optimization not occurred. That's why I want to see a discussion of the 
broader issue of how something like escape analysis is supposed to 
interact with facilities like JVM TI. You've flagged a couple of 
conditions under which EA should be disabled, but are they sufficient? 
Given they are both dynamic facilities, what is supposed to happen if you 
keep switching things on and off? Does enabling 
can_get_owned_monitor_info require previous EA actions to be undone?
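
As a rough illustration of the kind of observation under discussion, here is a
minimal Java sketch. It is an analogue only, not the code under review: it uses
the standard java.lang.management API (ThreadMXBean.getThreadInfo with
lockedMonitors=true) instead of a JVMTI agent calling GetOwnedMonitorInfo(),
and the names OwnedMonitorsSketch and LockCls are made up for illustration.
It only shows what an observer sees when the lock on a purely local object is
really taken; what a JIT-compiled frame with an elided lock should report is
the open question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class OwnedMonitorsSketch {
    /** Hypothetical lock class, named after the one in the test under review. */
    static final class LockCls { }

    /** Counts the LockCls monitors the current thread is reported to own. */
    public static int ownedLockClsMonitors() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isObjectMonitorUsageSupported()) {
            return -1; // this VM cannot report owned monitors at all
        }
        long[] ids = { Thread.currentThread().getId() };
        // lockedMonitors = true, lockedSynchronizers = false
        ThreadInfo info = mx.getThreadInfo(ids, true, false)[0];
        int count = 0;
        for (MonitorInfo m : info.getLockedMonitors()) {
            if (m.getClassName().equals(LockCls.class.getName())) {
                count++;
            }
        }
        return count;
    }

    /** Locks a purely local object and asks the VM which monitors are owned. */
    public static int queryWhileLocked() {
        LockCls l1 = new LockCls(); // never shared with another thread
        synchronized (l1) {
            return ownedLockClsMonitors();
        }
    }

    public static void main(String[] args) {
        System.out.println("inside:  " + queryWhileLocked());
        System.out.println("outside: " + ownedLockClsMonitors());
    }
}
```

In a short run the methods execute in the interpreter, so the monitor is
expected to be reported inside the synchronized block and not outside it.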

I don't know the answers here. I certainly hope these questions have 
been considered when EA was being developed - in which case I'd like to 
see the discussion. Otherwise this is a conversation that needs to be 
had, and the development of an overall approach for dealing with this, 
not trying to address things piecemeal as someone recognises a new problem.

> There has to be a way to debug an app without optimizations getting in your way. I like to debug
> my C/C++ program after I compiled it with -O0 -g. There has to be something equivalent for Java
> debugging. And I don't mean -Xint. Since 1.4.1 we've got "Full Speed Debugging" [1].
> 
> So far JVMTI does an excellent job of making optimizations transparent to the user. If it didn't,
> I would like to file a JEP for an interface that shows the pure JVM state (at full speed of
> course).
> 
>    > >    > You seem to have completely missed my point. If the object is local and
>    > >    > is synchronized upon then the synchronization can be elided (and should
>    > >    > be) in which case it won't appear in GetOwnedMonitorInfo and so does not
>    > >    > escape. If the synchronization cannot be elided then the object cannot
>    > >    > be considered local. That is how Escape Analysis should be operating
>    > >    > here IMHO.
>    > >
>    > > I presume we agree that it is the state of the abstract virtual machine that must be observed
>    > > through JVMTI, right?
>    > >
>    > > The locking state of an object O after a monitorenter on O is locked on the abstract vm.
>    > >
>    > > The JIT can still elide synchronization based on a proof that it is actually redundant for the
>    > > computation. But at a safepoint JVMTI must report O as locked, because that's its state on the
>    > > abstract virtual machine.
>    >
>    > I don't agree. If the locking can be elided then it is completely
>    > elided. If the state of the "abstract VM" had to be perfectly preserved
>    > then we wouldn't be able to do the optimisations referred to in JLS 12.6:
>    >
>    > "Optimizing transformations of a program can be designed that reduce the
>    > number of objects that are reachable to be less than those which would
>    > naively be considered reachable."
> 
> I'd like to write a few words about the relationship of specification and optimizations: It's the purpose
> of a specification to define an abstract model. Optimizations are techniques employed by an
> implementation to reduce resource consumption for better performance within the bounds set by the
> spec, meaning optimizations have to be transparent to the program. You can't write a program that
> detects an optimization. If you could then the implementation would violate rules given by the
> specification.  (And I don't consider a benchmark a proof.)
> 
> You'll very rarely find specs that mention optimizations. You just don't want to mix specification and
> implementation! If they do, they trade the beauty and simplicity of an abstract model, ideally based
> on formal definitions, for performance.
> 
> They do mention an optimization if it is impossible to hide it. In the case of JLS 12.6 it is
> possible to detect that an object reference is removed from a local variable by means of
> finalization and java.lang.ref. Implementors wanted it so badly, though, that they changed the
> spec.

Specs are imperfect and always evolving. Sometimes the consequences of 
specific optimizations may not be realised at the spec level yet need to 
be accounted for. That may or may not be the case with EA.

I find it hard to see how to do lock elision yet at the same time track 
when the lock would have been taken, without negating the whole purpose 
of eliding the lock. After all, if the object is thread-local then the 
locking would amount to only biased-locking and it could never be 
contended. So it doesn't seem like there was much to elide in the first 
place.
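
To make the disagreement concrete, here is a minimal Java sketch of a lock on
a thread-local object. The class and method names (ElisionCandidateSketch,
LockCls, heldInsideBlock, heldAfterBlock) are hypothetical and not taken from
the webrev. It uses Thread.holdsLock() as an in-program observer of the
locking state; note that passing l1 to that call may by itself stop the JIT
from treating l1 as non-escaping, so this shows the state the program must
observe, not whether elision actually happened.

```java
public class ElisionCandidateSketch {
    /** Hypothetical lock class, named after the one in the test under review. */
    static final class LockCls { }

    /** Asks, from inside the synchronized block, whether the local lock is held. */
    public static boolean heldInsideBlock() {
        LockCls l1 = new LockCls();      // never published to another thread
        synchronized (l1) {
            // After monitorenter the program must see l1 as locked,
            // whatever the compiler did with the actual lock operation.
            return Thread.holdsLock(l1);
        }
    }

    /** Asks the same question after the synchronized block has been left. */
    public static boolean heldAfterBlock() {
        LockCls l1 = new LockCls();
        synchronized (l1) {
            // empty critical section: a textbook elision candidate
        }
        return Thread.holdsLock(l1);
    }

    public static void main(String[] args) {
        System.out.println(heldInsideBlock()); // prints "true"
        System.out.println(heldAfterBlock());  // prints "false"
    }
}
```

The open question in this thread is whether an external observer such as
GetOwnedMonitorInfo() is owed the same answer as the program itself.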

Lock coarsening is also problematic.
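
A minimal Java sketch of why, with hypothetical names (CoarseningSketch, LOCK,
heldBetweenRegions): two adjacent synchronized regions on the same object are
the usual candidate for coarsening into one. Between the regions the lock is
not owned as far as the program is concerned, but after coarsening an observer
sampling the thread at that point would find it owned. The holdsLock() call
here stands in for such an observer and may itself prevent the JIT from
merging the regions.

```java
public class CoarseningSketch {
    private static final Object LOCK = new Object();
    private static int counter;

    /** Returns whether the current thread owns LOCK between the two regions. */
    public static boolean heldBetweenRegions() {
        synchronized (LOCK) {
            counter++;
        }
        // monitorexit has executed at this point, so LOCK is not owned here.
        boolean between = Thread.holdsLock(LOCK);
        synchronized (LOCK) {
            counter++;
        }
        return between;
    }

    /** Reads the counter under the same lock. */
    public static int counter() {
        synchronized (LOCK) {
            return counter;
        }
    }

    public static void main(String[] args) {
        System.out.println(heldBetweenRegions()); // prints "false"
    }
}
```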

> Another example in the future could be tail-call-optimization ...
> 
>    > I place lock elision in the same category of "optimising
>    > transformations" that changes what would "naively" be expected. Now this
>    > should be explicitly covered in the JLS/JVMS somewhere but I'm having
>    > trouble finding exactly where. This article discusses lock elision:
>    >
>    > https://www.ibm.com/developerworks/library/j-jtp10185/index.html
>    >
>    > and states:
>    >
>    > "It stands to reason, then, that if a thread enters a synchronized block
>    > protected by a lock that no other thread will ever synchronize on, then
>    > that synchronization has no effect and can therefore be removed by the
>    > optimizer. (The Java Language Specification explicitly allows this
>    > optimization.)"
> 
> I doubt you'll find it in the JLS/JVMS. It's an implementation detail permitted by the memory
> model. That's what the article is saying. If a Java app has data races, then you would not reason
> about removed synchronization. You would argue with the means of the memory model: e.g. the
> synchronization action A1 (JLS 17.4.2 [2]) does not 'synchronize-with' (17.4.4 [3]) sync. action
> A2, so they are not ordered.
> 
>    > I'll have to ping Brian to see if he recalls exactly where this is
>    > covered. :)
> 
> Ok :)
> Please let me know!

Brian conceded it may be more implicit in the JMM as you described above :)

Cheers,
David

> Richard.
> 
> [1] https://docs.oracle.com/javase/8/docs/technotes/guides/jpda/enhancements1.4.html#fsd
> [2] https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.2
> [3] https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4
> 
> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Friday, 20 September 2019 11:07
> To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
> Subject: Re: RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken
> 
> Hi Richard,
> 
> On 20/09/2019 6:31 pm, Reingruber, Richard wrote:
>> Hi David,
>>
>>     > On 20/09/2019 2:42 am, Reingruber, Richard wrote:
>>     > > Hi David,
>>     > >
>>     > > thanks for looking at this issue. And my apologies for the lengthy mail...
>>     > >
>>     > >    > > The JVMTI functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo() can be used to
>>     > >    > > retrieve objects locked by a thread. In terms of escape analysis those references escape and
>>     > >    > > optimizations like scalar replacement become invalid.
>>     > >    >
>>     > >    > What bothers me about this is that it seems like escape analysis is
>>     > >    > doing something wrong in this case.
>>     > >
>>     > > Yes it is.
>>     > >
>>     > >    > If the object is thread-local but is
>>     > >    > being synchronized upon then either:
>>     > >
>>     > > The object is not local, because it can escape through JVMTI GetOwnedMonitorInfo(). Escape analysis
>>     > > does not recognize this. That's what it is doing wrong. Consequently the state of the virtual
>>     > > machine, as observed through JVMTI, is wrong. See below...
>>     >
>>     > You seem to have completely missed my point. If the object is local and
>>     > is synchronized upon then the synchronization can be elided (and should
>>     > be) in which case it won't appear in GetOwnedMonitorInfo and so does not
>>     > escape. If the synchronization cannot be elided then the object cannot
>>     > be considered local. That is how Escape Analysis should be operating
>>     > here IMHO.
>>
>> I presume we agree that it is the state of the abstract virtual machine that must be observed
>> through JVMTI, right?
>>
>> The locking state of an object O after a monitorenter on O is locked on the abstract vm.
>>
>> The JIT can still elide synchronization based on a proof that it is actually redundant for the
>> computation. But at a safepoint JVMTI must report O as locked, because that's its state on the
>> abstract virtual machine.
> 
> I don't agree. If the locking can be elided then it is completely
> elided. If the state of the "abstract VM" had to be perfectly preserved
> then we wouldn't be able to do the optimisations referred to in JLS 12.6:
> 
> "Optimizing transformations of a program can be designed that reduce the
> number of objects that are reachable to be less than those which would
> naively be considered reachable."
> 
> I place lock elision in the same category of "optimising
> transformations" that changes what would "naively" be expected. Now this
> should be explicitly covered in the JLS/JVMS somewhere but I'm having
> trouble finding exactly where. This article discusses lock elision:
> 
> https://www.ibm.com/developerworks/library/j-jtp10185/index.html
> 
> and states:
> 
> "It stands to reason, then, that if a thread enters a synchronized block
> protected by a lock that no other thread will ever synchronize on, then
> that synchronization has no effect and can therefore be removed by the
> optimizer. (The Java Language Specification explicitly allows this
> optimization.)"
> 
> I'll have to ping Brian to see if he recalls exactly where this is
> covered. :)
> 
> David
> -----
> 
>>
>> Cheers, Richard.
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Friday, 20 September 2019 00:59
>> To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
>> Subject: Re: RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken
>>
>> On 20/09/2019 2:42 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>> thanks for looking at this issue. And my apologies for the lengthy mail...
>>>
>>>      > > The JVMTI functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo() can be used to
>>>      > > retrieve objects locked by a thread. In terms of escape analysis those references escape and
>>>      > > optimizations like scalar replacement become invalid.
>>>      >
>>>      > What bothers me about this is that it seems like escape analysis is
>>>      > doing something wrong in this case.
>>>
>>> Yes it is.
>>>
>>>      > If the object is thread-local but is
>>>      > being synchronized upon then either:
>>>
>>> The object is not local, because it can escape through JVMTI GetOwnedMonitorInfo(). Escape analysis
>>> does not recognize this. That's what it is doing wrong. Consequently the state of the virtual
>>> machine, as observed through JVMTI, is wrong. See below...
>>
>> You seem to have completely missed my point. If the object is local and
>> is synchronized upon then the synchronization can be elided (and should
>> be) in which case it won't appear in GetOwnedMonitorInfo and so does not
>> escape. If the synchronization cannot be elided then the object cannot
>> be considered local. That is how Escape Analysis should be operating
>> here IMHO.
>>
>> Cheers,
>> David
>> -----
>>
>>>      > a) the synchronization is elided and so the object will not appear in
>>>      > the set of owned monitors; or
>>>      > b) the fact synchronization occurs renders the object ineligible to be
>>>      > considered thread-local, and so there is no problem with it appearing in
>>>      > the set of owned monitors
>>>      >
>>>      > I think there is a bigger design philosophy issue here about the
>>>      > interaction of escape analysis and debugging/management frameworks in
>>>      > general. I'd like to see a very clear description on exactly how they
>>>      > should interact.
>>>      >
>>>
>>> I don't see too many design alternatives here. The JVMTI implementation has to present the correct
>>> state of the virtual machine according to the spec. I think it fails to do so in this case.
>>>
>>> Please look again at the test:
>>>
>>>     172         public long dontinline_endlessLoop() {
>>>     173             long cs = checkSum;
>>>     174             while (doLoop && loopCount-- > 0) {
>>>     175                 targetIsInLoop = true;
>>>     176                 checkSum += checkSum % ++cs;
>>>     177             }
>>>     178             loopCount = 3;
>>>     179             targetIsInLoop = false;
>>>     180             return checkSum;
>>>     181         }
>>>
>>>     249         public void dontinline_testMethod() {
>>>     250             LockCls l1 = new LockCls();        // to be scalar replaced
>>>     251             synchronized (l1) {
>>>     252                 inlinedTestMethodWithNestedLocking(l1);
>>>     253             }
>>>     254         }
>>>     255
>>>     256         public void inlinedTestMethodWithNestedLocking(LockCls l1) {
>>>     257             synchronized (l1) {              // nested
>>>     258                 dontinline_endlessLoop();
>>>     259             }
>>>     260         }
>>>
>>> This is the stack when the agent calls GetOwnedMonitorInfo()
>>>
>>> dontinline_endlessLoop()                    at line 176
>>> inlinedTestMethodWithNestedLocking()        at line 258  // inlined into caller frame
>>> dontinline_testMethod()                     at line 252  // compiled frame
>>>
>>> The state of the _virtual_ machine at that point is obvious:
>>>
>>> - An instance of LockCls must exist. It was allocated by a new bytecode at line 250.
>>> - That instance was locked by a monitorenter bytecode at line 251
>>>
>>> This could be proven by interpreting the execution trace bytecode by bytecode using paper and
>>> pencil (hope you won't make me do this, though ;))
>>>
>>> JVMTI is used to examine the state of the virtual machine. The result of the JVMTI call
>>> GetOwnedMonitorInfo() must include that locked instance of LockCls. It is obviously a bug if it does
>>> not.
>>>
>>> From a more philosophical point of view, compiled code is free to change the state of the physical
>>> machine in a way such that it cannot be mapped to a valid state of the virtual machine after each
>>> and every machine instruction. But it must reach points in its execution trace, where it is actually
>>> possible to present a valid state of the virtual machine to observers, e.g. JVMTI agents. These
>>> points are called safepoints.
>>>
>>> The test is proof that compiled code fails to do so, as it reaches a safepoint where an invalid vm
>>> state is presented. EA does not take into account that the lock state can be observed using
>>> GetOwnedMonitorInfo(). As a fix EA is disabled if the corresponding capability
>>> can_get_owned_monitor_info is taken. With the fix the test passes.
>>>
>>> Note that for the very same reason EA is disabled if can_access_local_variables is taken, because
>>> the JVMTI implementation cannot hand out references to objects stored in local variables if they were
>>> scalar replaced.
>>>
>>> With the proposed enhancement JDK-8227745 [1] it is not necessary to disable EA. It allows EA-based
>>> optimizations to be reverted just in time, before local objects escape. Note that EA opts are already
>>> reverted today if a compiled frame gets replaced by corresponding interpreted frames (see
>>> realloc_objects() and relock_objects() in class Deoptimization).
>>>
>>> Thanks and cheers, Richard.
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8227745
>>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Thursday, 19 September 2019 02:43
>>> To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-compiler-dev at openjdk.java.net; serviceability-dev at openjdk.java.net
>>> Subject: Re: RFR(S) 8230677: Should disable Escape Analysis if JVMTI capability can_get_owned_monitor_info was taken
>>>
>>> Hi Richard,
>>>
>>> On 7/09/2019 12:24 am, Reingruber, Richard wrote:
>>>> Hi,
>>>>
>>>> could I please get reviews for
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/2019/8230677/webrev.0/
>>>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8230677
>>>>
>>>> The JVMTI functions GetOwnedMonitorInfo() and GetOwnedMonitorStackDepthInfo() can be used to
>>>> retrieve objects locked by a thread. In terms of escape analysis those references escape and
>>>> optimizations like scalar replacement become invalid.
>>>
>>> What bothers me about this is that it seems like escape analysis is
>>> doing something wrong in this case. If the object is thread-local but is
>>> being synchronized upon then either:
>>> a) the synchronization is elided and so the object will not appear in
>>> the set of owned monitors; or
>>> b) the fact synchronization occurs renders the object ineligible to be
>>> considered thread-local, and so there is no problem with it appearing in
>>> the set of owned monitors
>>>
>>> I think there is a bigger design philosophy issue here about the
>>> interaction of escape analysis and debugging/management frameworks in
>>> general. I'd like to see a very clear description on exactly how they
>>> should interact.
>>>
>>> Cheers,
>>> David
>>>
>>>> The runtime currently cannot cope with objects escaping through JVMTI (try included
>>>> tests). Therefore escape analysis should be disabled if an agent requests the capabilities
>>>> can_get_owned_monitor_info or can_get_owned_monitor_stack_depth_info.
>>>>
>>>> This was taken out of JDK-8227745 [1] to make it smaller. With JDK-8227745 there's no need to
>>>> disable escape analysis, instead optimizations based on escape analysis will be reverted just before
>>>> objects escape through JVMTI.
>>>>
>>>> I've run tier1 tests.
>>>>
>>>> Thanks, Richard.
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8227745
>>>>