From david.holmes at oracle.com  Wed Jul  1 04:53:54 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 1 Jul 2020 14:53:54 +1000
Subject: RFR(S): 8243586: Optimize calls to
 SystemDictionaryShared::define_shared_package for classpath
In-Reply-To: <b9d3a805-d948-9a57-3f7d-e3d788bfb5bc@oracle.com>
References: <b6aedbcc-7675-e32e-2c1b-c18d223931f9@oracle.com>
 <fb0c3a15-2855-9ad9-50a3-65140fda281e@oracle.com>
 <b9d3a805-d948-9a57-3f7d-e3d788bfb5bc@oracle.com>
Message-ID: <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>

Hi Calvin,

Updates look good - thanks.

One comment below ...

On 30/06/2020 4:17 am, Calvin Cheung wrote:
> Hi David,
> 
> On 6/28/20 8:04 PM, David Holmes wrote:
>> Hi Calvin,
>>
>> Generally looks okay but a few comments/suggestions.
> Thanks for your review.
>>
>> On 25/06/2020 8:53 am, Calvin Cheung wrote:
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243586
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.00/
>>>
>>> The proposed change is to reduce the calls to 
>>> SystemDictionaryShared::define_shared_package which does a java call 
>>> to AppClassLoader.defineOrCheckPackage. Currently, 
>>> define_shared_package is called for every shared class but it is 
>>> needed only for each package in each jar specified in the classpath.
>>
>> src/hotspot/share/classfile/packageEntry.hpp
>>
>> +   volatile intptr_t _defined_in_class_path; // a Package java object 
>> has been define via CDS
>>
>> The name and description of this field are somewhat lacking. It is a 
>> bitmap indicating which CDS classpath entries have defined classes in 
>> this package.
> I've renamed the field to _defined_by_cds_in_class_path and expanded the 
> comment.
>>
>> +   bool is_defined_by_cds_in_class_path(intptr_t idx) const {
>> +   void set_defined_in_class_path(intptr_t idx) {
>>
>> These names should be consistent i.e.
>>
>> +   void set_defined_by_cds_in_class_path(intptr_t idx) {
> Renamed the 'set' function as you suggested.
>>
>> I agree with Ioi that the idx should be range checked by an assert.
> Added the assert.
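
To make the bitmap idea concrete, here is a minimal sketch in Java rather than HotSpot C++. The class name, the AtomicLong field and the exception used for the range check are assumptions for illustration only; the patch itself uses a volatile intptr_t and an assert in packageEntry.hpp.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical Java rendering of the per-package bitmap; not the HotSpot code.
final class PackageEntrySketch {
    // Bit i set => a Package object has already been defined for this package
    // from the CDS classpath entry with index i.
    private final AtomicLong definedByCdsInClassPath = new AtomicLong();

    // Range check; the real code uses an assert, an exception is used here
    // so the check is always active.
    private static void checkIndex(int idx) {
        if (idx < 0 || idx >= Long.SIZE) {
            throw new IndexOutOfBoundsException("classpath index out of range: " + idx);
        }
    }

    boolean isDefinedByCdsInClassPath(int idx) {
        checkIndex(idx);
        return (definedByCdsInClassPath.get() & (1L << idx)) != 0;
    }

    void setDefinedByCdsInClassPath(int idx) {
        checkIndex(idx);
        final long mask = 1L << idx;
        // Atomically OR the bit in so concurrent setters do not lose updates.
        definedByCdsInClassPath.updateAndGet(v -> v | mask);
    }
}
```

A caller would test the bit for the class's classpath index first and only make the expensive upcall (and then set the bit) when it is still clear.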
>>
>> test/hotspot/jtreg/runtime/cds/appcds/PackageSealing.java
>>
>> I don't really understand what this test was doing previously, but it 
>> is not obvious that it is exercising the new bitmap logic in any 
>> rigorous way. Is this test change really related to the code change?
> 
> The original test was to dump and load the "sealed/pkg/C1" and "pkg/C2" 
> classes from the same jar but from different packages. Only 
> "sealed/pkg/C1" is from a sealed package.
> 
> The added test is to dump "sealed/pkg/C3", which is from a non-sealed 
> package, from non_sealed.jar.

Okay that is what confused me. The name of the package is "sealed" but 
it's not actually sealed as the jar containing C3 is not marked as sealed.
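
For reference, whether a package is sealed is decided by the jar's manifest, not by the package name. The following is a small self-contained sketch using java.util.jar; the class and method names are hypothetical, and it only assumes that a manifest such as package_seal.mf carries a per-package "Sealed: true" entry (the actual file contents are not shown in this thread).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper, not part of the test under review.
final class SealedJarSketch {
    // Builds an in-memory jar whose manifest seals one package directory,
    // e.g. "sealed/pkg/".
    static byte[] jarWithSealedPackage(String packageDir) {
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        Attributes perPackage = new Attributes();
        perPackage.put(Attributes.Name.SEALED, "true");
        mf.getEntries().put(packageDir, perPackage);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, mf)) {
            // The manifest alone is enough for this illustration.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the manifest back and reports whether the package is marked sealed.
    static boolean isSealed(byte[] jarBytes, String packageDir) {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest mf = in.getManifest();
            if (mf == null) {
                return false;
            }
            Attributes attrs = mf.getAttributes(packageDir);
            return attrs != null
                && "true".equalsIgnoreCase(attrs.getValue(Attributes.Name.SEALED));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A jar written without such an entry, like non_sealed.jar, leaves the same package name unsealed.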

Thanks,
David
-----

> During runtime, it will load 
> "sealed/pkg/C3" and "sealed/pkg/C1" from the same package but from 
> non_sealed.jar and pkg_seal.jar, respectively. So this makes sure define 
> package is called for each package encountered in each jar.
> 
> As suggested by Ioi, I also added another test to dump "sealed/pkg/C1" 
> from a sealed package from pkg_seal.jar. During runtime, the order of 
> loading the classes and the jars in the classpath are reversed.
> 
>>
>>    * @compile test-classes/C1.java
>>    * @compile test-classes/C2.java
>> +  * @compile test-classes/C3.java
>>    * @compile test-classes/PackageSealingTest.java
>>    * @compile test-classes/Hello.java
>>
>> Unless there is a reason separate compilation is required, the number 
>> of @compile commands can be reduced - perhaps to one.
> I've combined them into one @compile command but split it across 2 lines 
> to avoid a long line.
>>
>> +         classList[1] = "sealed/pkg/C3"; // C3 is from a non-sealed 
>> package
>>
>> The comment seems in contradiction to the path - is it sealed or not?
> 
> C3 is from a non-sealed package. That's why I added a comment.
> 
> Note non_sealed.jar was created as follows:
> 
>     String nonSealedJar =
>         ClassFileInstaller.writeJar("non_sealed.jar", "sealed/pkg/C3");
> 
> The pkg_seal.jar was created as follows: (note the package_seal.mf 
> manifest)
> 
>     String appJar = ClassFileInstaller.writeJar("pkg_seal.jar",
>         ClassFileInstaller.Manifest.fromSourceFile("test-classes/package_seal.mf"),
>         "PackageSealingTest", "sealed/pkg/C1", "pkg/C2");
> 
>>
>> +         TestCommon.testDump(jars, TestCommon.list(classList2), 
>> "-XX:+TraceExceptions");
>>
>> TraceExceptions is deprecated - use -Xlog:exceptions=info
> 
> It turns out it was left over from debugging. I've removed it.
> 
> updated webrev:
>     http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.01/
> 
> thanks,
> 
> Calvin
> 
>>
>> Thanks,
>> David
>> -----
>>
>>> zprint perf results summary:
>>>
>>> instr delta = -104380126 -2.2487%
>>> time delta = -23.779 ms -3.3640%
>>>
>>> Testing: running mach5 tier1 and 2 tests.
>>>
>>> thanks,
>>>
>>> Calvin
>>>

From daniil.x.titov at oracle.com  Wed Jul  1 05:24:46 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 30 Jun 2020 22:24:46 -0700
Subject: RFR(M): 8244383: jhsdb/HeapDumpTestWithActiveProcess.java fails
 with "AssertionFailure: illegal bci"
In-Reply-To: <C507C5CC-1C25-4D28-8156-3D08C77B5422@oracle.com>
References: <28e1b453-e1ea-0a1c-0ae0-0494b52f4b71@oracle.com>
 <6efbc900-732f-ee8b-5561-f9a813ebfeca@oracle.com>
 <C507C5CC-1C25-4D28-8156-3D08C77B5422@oracle.com>
Message-ID: <C94FA2CF-8A9C-47C2-A92A-1D010DE3B3C8@oracle.com>

Hi Chris,

The fix, in general, looks good to me.

Some small comments for src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java:

  1) Imports at lines 30 and 35 are not used and could be removed.

        30 import sun.jvm.hotspot.interpreter.*;
        35 import sun.jvm.hotspot.utilities.*;

  2) Local variable "end" defined at line 189 is not used.

       189     Address end = sp.addOffsetTo(regionInBytesToSearch);

No new webrev is required.

Thanks,
Daniil


On 6/25/20, 1:55 PM, "serviceability-dev on behalf of Chris Plummer" <serviceability-dev-retn at openjdk.java.net on behalf of chris.plummer at oracle.com> wrote:

    Ping #2. I still need one more reviewer (Thanks for the review, Dan). I 
    updated the webrev based on Dan's comments:

    http://cr.openjdk.java.net/~cjplummer/8244383/webrev.01/

    I can still make the simplification mentioned below if necessary.

    thanks,

    Chris

    On 6/23/20 11:29 AM, Chris Plummer wrote:
    > Ping!
    >
    > If this fix is too complicated, there is a simplification I can make, 
    > but at the cost of abandoning some attempts to determine the current 
    > frame when this error condition pops up. At the start of 
    > validateInterpreterFrame() it attempts to verify that the frame is 
    > valid by verifying that frame->method and frame->bcp are valid. This 
    > part is pretty simple. The complicated part is everything that follows 
    > if the verification fails. It attempts to error correct the situation 
    > by looking at various register contents and stack contents. I could 
    > just abandon this complicated code and return false if frame->method 
    > and frame->bcp don't check out. Upon return, the caller's code would 
    > be simplified to:
    >
    >             if (validateInterpreterFrame(sp, fp, pc)) {
    >               // We're done. setValues() has been called for valid
    >               // interpreter frame.
    >               return true;
    >             } else {
    >               return checkLastJavaSP();
    >             }
    >
    > So there's still a chance we can determine a valid current frame if 
    > "last java frame" has been set up. However, if it is not set up we would 
    > not be able to. This is where the complicated code in 
    > validateInterpreterFrame() is useful because it can usually determine 
    > the current frame, even if "last java frame" is not set up, but it's 
    > rare enough that we run into this situation that I think failing to 
    > get the current frame is ok.
    >
    > So if I can get a couple promises for reviews if I make this change, 
    > I'll go ahead and do it and send out a new RFR.
    >
    > thanks,
    >
    > Chris
    >
    > On 6/18/20 5:54 PM, Chris Plummer wrote:
    >> [I've added runtime-dev to this SA review since understanding 
    >> interpreter invokes (code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry()) and stack 
    >> walking is probably more important than understanding SA.]
    >>
    >> Hello,
    >>
    >> Please help review the following:
    >>
    >> https://bugs.openjdk.java.net/browse/JDK-8244383
    >> http://cr.openjdk.java.net/~cjplummer/8244383/webrev.00/index.html
    >>
    >> The crux of the bug is when doing stack walking the topmost frame is 
    >> in an inconsistent state because we are in the middle of pushing a 
    >> new interpreter frame. Basically we are executing code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry(). Since the PC 
    >> register is in this code, SA assumes the topmost frame is an 
    >> interpreter frame.
    >>
    >> The first issue with this interpreter frame assumption is if we 
    >> haven't actually pushed the frame yet, then the current frame is the 
    >> caller's frame, and could be compiled. But since SA thinks it's 
    >> interpreted, later on it tries to convert the frame->bcp to a BCI, 
    >> but frame->bcp is only valid for interpreter frames. Thus the 
    >> "illegal BCI" failures. If the previous frame happened to be 
    >> interpreted, then the existing SA code works fine.
    >>
    >> The other state of frame pushing that was problematic was when the 
    >> new frame had been pushed, but frame->method and frame->bcp were not 
    >> set up yet. This also would lead to "illegal BCI" later on because 
    >> garbage would be stored in these locations.
    >>
    >> Fixing the above problems requires trying to determine the state of 
    >> the frame push through a series of checks, and then adapting what is 
    >> considered to be the current frame based on the outcome of the 
    >> checks. The first thing checked is that frame->method is valid (we 
    >> can successfully instantiate a wrapper for the Method* without 
    >> failure) and that frame->bcp is within the method. If both these pass 
    >> then we can use the frame as-is.
    >>
    >> If the above checks fail, then we try to determine whether the issue 
    >> is that the frame is not yet pushed and the current frame is actually 
    >> compiled, or the frame has been pushed but not yet initialized. This 
    >> is done by first getting the return address from the stack or RAX 
    >> (its location depends on how far along we are in the entry code) and 
    >> comparing this to what is stored in frame->return_addr. If they are 
    >> the same, then we have pushed the frame but not yet initialized it. 
    >> In this case we use the previous frame (senderSP() and senderFP()) as 
    >> the current frame since the current frame is not yet initialized. If 
    >> the return address check fails, then we assume the new frame is not 
    >> yet pushed, and treat the current frame as compiled, even though 
    >> PC points into the interpreter (we replace PC with RAX in this case).
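
The decision procedure described above can be condensed into a small sketch. This is not the code in AMD64CurrentFrameGuess.java; the class, enum and parameter names are hypothetical, and the inputs are assumed to have been read from the target process already.

```java
// Hypothetical condensation of the checks described in the RFR text.
final class FrameGuessSketch {
    enum Decision {
        USE_FRAME_AS_IS,          // frame->method and frame->bcp both check out
        USE_SENDER_FRAME,         // frame pushed but not yet initialized
        TREAT_CURRENT_AS_COMPILED // frame not yet pushed; PC is replaced with RAX
    }

    static Decision classify(boolean methodValid,
                             boolean bcpWithinMethod,
                             long observedReturnAddr, // from the stack or RAX
                             long frameReturnAddr) {  // frame->return_addr
        if (methodValid && bcpWithinMethod) {
            return Decision.USE_FRAME_AS_IS;
        }
        if (observedReturnAddr == frameReturnAddr) {
            // Same return address: the new frame exists but is uninitialized,
            // so fall back to senderSP()/senderFP().
            return Decision.USE_SENDER_FRAME;
        }
        return Decision.TREAT_CURRENT_AS_COMPILED;
    }
}
```

Keeping the classification separate from the register and stack reads is only a presentation choice here; it makes the three outcomes easy to see in one place.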
    >>
    >> Comments in the code pretty well explain all the above, so it is 
    >> probably easier to follow the logic in the code along with the 
    >> comments rather than apply my above description to the code.
    >>
    >> I should add that it's very rare that we ever get into this special 
    >> error handling code. This bug was very hard to reproduce initially. I 
    >> was only able to make progress with reproducing and debugging by 
    >> inserting delay loops in various spots in the code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry(). By doing this 
    >> I was able to reproduce the issue quite easily and hit all the logic 
    >> in the new code I've added.
    >>
    >> The fix is basically entirely contained within 
    >> AMD64CurrentFrameGuess.java. The rest of the changes are minor:
    >>
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java 
    >>
    >> -Main fix for CR
    >>
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java 
    >>
    >> -Added getInterpreterFrameBCP(), which is now needed by 
    >> AMD64CurrentFrameGuess.java
    >> -I also simplified some code by using the existing 
    >> getInterpreterFrameMethod()
    >>  rather than replicating inline what it does.
    >>
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java 
    >>
    >> -I noticed the windows version of this code had some extra checks 
    >> that were missing
    >>  from the bsd version. I then looked at the linux version, but it had 
    >> been heavily modified
    >>  a short while back to leverage DWARF info to determine frames. So I 
    >> looked at the previous
    >>  rev and it too had these extra checks. I decided to add them to the 
    >> BSD port. I'm not sure
    >>  if it helps at all, but it certainly doesn't seem to do any harm.
    >>
    >> thanks,
    >>
    >> Chris
    >>
    >
    >


From ioi.lam at oracle.com  Wed Jul  1 05:53:08 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 30 Jun 2020 22:53:08 -0700
Subject: RFR(S): 8243586: Optimize calls to
 SystemDictionaryShared::define_shared_package for classpath
In-Reply-To: <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>
References: <b6aedbcc-7675-e32e-2c1b-c18d223931f9@oracle.com>
 <fb0c3a15-2855-9ad9-50a3-65140fda281e@oracle.com>
 <b9d3a805-d948-9a57-3f7d-e3d788bfb5bc@oracle.com>
 <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>
Message-ID: <e9c7ea58-0f20-58b6-e161-b25065247809@oracle.com>


On 6/30/20 9:53 PM, David Holmes wrote:
> Hi Calvin,
>
> Updates look good - thanks.
>
> One comment below ...
>
> On 30/06/2020 4:17 am, Calvin Cheung wrote:
>> Hi David,
>>
>> On 6/28/20 8:04 PM, David Holmes wrote:
>>> Hi Calvin,
>>>
>>> Generally looks okay but a few comments/suggestions.
>> Thanks for your review.
>>>
>>> On 25/06/2020 8:53 am, Calvin Cheung wrote:
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243586
>>>>
>>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.00/
>>>>
>>>> The proposed change is to reduce the calls to 
>>>> SystemDictionaryShared::define_shared_package which does a java 
>>>> call to AppClassLoader.defineOrCheckPackage. Currently, 
>>>> define_shared_package is called for every shared class but it is 
>>>> needed only for each package in each jar specified in the classpath.
>>>
>>> src/hotspot/share/classfile/packageEntry.hpp
>>>
>>> +   volatile intptr_t _defined_in_class_path; // a Package java 
>>> object has been define via CDS
>>>
>>> The name and description of this field are somewhat lacking. It is a 
>>> bitmap indicating which CDS classpath entries have defined classes 
>>> in this package.
>> I've renamed the field to _defined_by_cds_in_class_path and expanded 
>> the comment.
>>>
>>> +   bool is_defined_by_cds_in_class_path(intptr_t idx) const {
>>> +   void set_defined_in_class_path(intptr_t idx) {
>>>
>>> These names should be consistent i.e.
>>>
>>> +   void set_defined_by_cds_in_class_path(intptr_t idx) {
>> Renamed the 'set' function as you suggested.
>>>
>>> I agree with Ioi that the idx should be range checked by an assert.
>> Added the assert.
>>>
>>> test/hotspot/jtreg/runtime/cds/appcds/PackageSealing.java
>>>
>>> I don't really understand what this test was doing previously, but 
>>> it is not obvious that it is exercising the new bitmap logic in any 
>>> rigorous way. Is this test change really related to the code change?
>>
>> The original test was to dump and load the "sealed/pkg/C1" and 
>> "pkg/C2" classes from the same jar but from different packages. Only 
>> "sealed/pkg/C1" is from a sealed package.
>>
>> The added test is to dump "sealed/pkg/C3", which is from a 
>> non-sealed package, from non_sealed.jar.
>
> Okay that is what confused me. The name of the package is "sealed" but 
> it's not actually sealed as the jar containing C3 is not marked as 
> sealed.


For clarity, maybe we can rename the "sealed/pkg" package to something 
neutral like "foo", and then have

    foo-sealed.jar    -- contains foo/C1, and specifies that "foo" is a 
                         sealed package
    foo-unsealed.jar  -- contains foo/C3; does not specify sealed packages.

Thanks
- Ioi

>
> Thanks,
> David
> -----
>
>> During runtime, it will load "sealed/pkg/C3" and "sealed/pkg/C1" from 
>> the same package but from non_sealed.jar and pkg_seal.jar, 
>> respectively. So this makes sure define package is called for each 
>> package encountered in each jar.
>>
>> As suggested by Ioi, I also added another test to dump 
>> "sealed/pkg/C1" from a sealed package from pkg_seal.jar. During 
>> runtime, the order of loading the classes and the jars in the 
>> classpath are reversed.
>>
>>>
>>>    * @compile test-classes/C1.java
>>>    * @compile test-classes/C2.java
>>> +  * @compile test-classes/C3.java
>>>    * @compile test-classes/PackageSealingTest.java
>>>    * @compile test-classes/Hello.java
>>>
>>> Unless there is a reason separate compilation is required, the 
>>> number of @compile commands can be reduced - perhaps to one.
>> I've combined them into one @compile command but split it across 2 
>> lines to avoid a long line.
>>>
>>> +         classList[1] = "sealed/pkg/C3"; // C3 is from a non-sealed 
>>> package
>>>
>>> The comment seems in contradiction to the path - is it sealed or not?
>>
>> C3 is from a non-sealed package. That's why I added a comment.
>>
>> Note non_sealed.jar was created as follows:
>>
>>     String nonSealedJar =
>>         ClassFileInstaller.writeJar("non_sealed.jar", "sealed/pkg/C3");
>>
>> The pkg_seal.jar was created as follows: (note the package_seal.mf 
>> manifest)
>>
>>     String appJar = ClassFileInstaller.writeJar("pkg_seal.jar",
>>         ClassFileInstaller.Manifest.fromSourceFile("test-classes/package_seal.mf"),
>>         "PackageSealingTest", "sealed/pkg/C1", "pkg/C2");
>>
>>>
>>> +         TestCommon.testDump(jars, TestCommon.list(classList2), 
>>> "-XX:+TraceExceptions");
>>>
>>> TraceExceptions is deprecated - use -Xlog:exceptions=info
>>
>> It turns out it was left over from debugging. I've removed it.
>>
>> updated webrev:
>> http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.01/
>>
>> thanks,
>>
>> Calvin
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> zprint perf results summary:
>>>>
>>>> instr delta = -104380126 -2.2487%
>>>> time delta = -23.779 ms -3.3640%
>>>>
>>>> Testing: running mach5 tier1 and 2 tests.
>>>>
>>>> thanks,
>>>>
>>>> Calvin
>>>>


From harold.seigel at oracle.com  Wed Jul  1 12:16:44 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Wed, 1 Jul 2020 08:16:44 -0400
Subject: RFR 8247741: Test
 test/hotspot/jtreg/runtime/7162488/TestUnrecognizedVmOption.java fails when
 -XX:+IgnoreUnrecognizedVMOptions is set
In-Reply-To: <4963874d-c15a-46d5-eb7c-00420ef8c0e2@oracle.com>
References: <51c50c41-4f3a-2144-41ab-840d986362ac@oracle.com>
 <4963874d-c15a-46d5-eb7c-00420ef8c0e2@oracle.com>
Message-ID: <4b2ea3b0-d796-89ec-55c1-af7333496a18@oracle.com>

Thanks David!

Harold

On 6/30/2020 6:58 PM, David Holmes wrote:
> Looks good! Thanks for fixing. :)
>
> David
>
> On 1/07/2020 12:32 am, Harold Seigel wrote:
>> Hi,
>>
>> Please review this small fix (suggested by dholmes) for JDK-8247741. 
>> The fix avoids using createTestJvm() to prevent JTreg flags from 
>> being passed to the new process created by the test.
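
The effect of the fix can be pictured with a plain-Java sketch that does not use the jtreg test library. All names below are hypothetical; the point is only that the child JVM's command line is built from the test's own arguments, so a harness-supplied flag such as -XX:+IgnoreUnrecognizedVMOptions never reaches it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the jtreg ProcessTools API.
final class ChildJvmCommandSketch {
    // Builds the child command line, optionally prepending harness flags.
    static List<String> command(boolean inheritHarnessFlags,
                                List<String> harnessFlags,
                                String... testArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        if (inheritHarnessFlags) {
            // Roughly what happens when the harness flags are forwarded.
            cmd.addAll(harnessFlags);
        }
        cmd.addAll(Arrays.asList(testArgs));
        return cmd;
    }
}
```

With inheritHarnessFlags set to false, a test that expects the child JVM to reject an unrecognized option sees the same behavior regardless of the -vmoptions passed to the harness.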
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8247741/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8247741
>>
>> The fix was tested by running the test using Mach5 and by running it 
>> by hand with -vmoptions:"-XX:+IgnoreUnrecognizedVMOptions" specified 
>> on the JTReg command line.
>>
>> Thanks, Harold
>>

From goetz.lindenmaier at sap.com  Wed Jul  1 13:01:30 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 1 Jul 2020 13:01:30 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
Message-ID: <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi David,
 
This obviously works, but I'm not sure whether the case
skipped is worth adding a field to NPE. 
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/03/
(This is not a final webrev, the other fix is still in there commented out.)

The NPE message is already skipped or simplified in a number of other cases:
  - serialization
  - expressions involving more than 5 bytecodes (recursion depth
     of the algorithm)
  - hidden top frames
  - internal backtrace information is lost
So this further case, which I consider quite unlikely, just adds
to the existing ones. 
Other decisions were made to reduce the changes to the 
NPE class and memory consumption. E.g., the message is 
not persisted once computed, but recomputed on each
call to getMessage().

I'll post it to core-libs-dev if we follow this version of the
fix further.

Best regards,
  Goetz. 


> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Wednesday, July 1, 2020 12:50 AM
> To: Remi Forax <forax at univ-mlv.fr>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>
> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> <hotspot-runtime-dev at openjdk.java.net>
> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> after calling fillInStackTrace
> 
> On 30/06/2020 5:53 pm, Remi Forax wrote:
> > ----- Original Message -----
> >> From: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>
> >> To: "Christoph Dreis" <christoph.dreis at freenet.de>,
> >> "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>
> >> Sent: Monday, 29 June 2020 14:07:08
> >> Subject: RE: [15] RFR: 8248476: No helpful NullPointerException message
> after calling fillInStackTrace
> >
> > [...]
> >
> >> An alternative fix I could imagine would be to
> >> override fillInStackTrace in NPE.java. It could
> >> call getMessage() and then super.fillInStackTrace,
> >> and return a new exception with the message.
> >
> > overriding fillInStackTrace() and marking it when it is called twice is the
> other solution.
> 
> That was my thought now I understand how/when this process kicks in.
> Just add a "filled" count that the overriding fillInStackTrace
> increments, and have getMessage check the count before calling
> getExtendedNPEMessage.
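
A minimal sketch of that "filled" count idea, written against an ordinary exception class rather than NullPointerException itself. The class name and computeDetail() are hypothetical stand-ins (computeDetail() plays the role of getExtendedNPEMessage); this is not the code in the webrev.

```java
// Hypothetical stand-in for the NPE change discussed in this thread.
final class FillCountingException extends RuntimeException {
    // Deliberately no initializer: Throwable's constructor already calls
    // fillInStackTrace(), and a field initializer would run afterwards and
    // reset the count to zero.
    private transient int filled;

    @Override
    public synchronized Throwable fillInStackTrace() {
        filled++;
        return super.fillInStackTrace();
    }

    @Override
    public String getMessage() {
        // filled == 1 means only the constructor filled in the trace, so the
        // recorded throw site is still the original one.
        return filled > 1 ? super.getMessage() : computeDetail();
    }

    // Placeholder for the expensive, backtrace-dependent message computation.
    private String computeDetail() {
        return "detail computed from the original throw site";
    }
}
```

After a second, user-initiated fillInStackTrace() call the sketch falls back to the plain message (null here), which mirrors omitting the extended message once the backtrace no longer matches the original throw site.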
> 
> Cheers,
> David
> -----
> 
> >> But this would also compute the message in cases where it is not printed.
> >
> > you don't need that, you can set the field Throwable.stackTrace to a
> marker object not unlike UNASSIGNED_STACK to detect if NPE.fillInStackTrace
> is called twice,
> > but as I said earlier, it's not a small change because Throwable should stay
> thread safe and serializable.
> >
> > So I agree with you and Christoph that the "best" solution is to document
> that null.fillInStackTrace() doesn't get a detailed error message, unless
> someone has a better fix.
> >
> >>
> >> Best regards,
> >>   Goetz.
> >
> > Rémi
> >
> >>
> >>> -----Original Message-----
> >>> From: Christoph Dreis <christoph.dreis at freenet.de>
> >>> Sent: Monday, June 29, 2020 1:28 PM
> >>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>> hotspot-runtime-dev at openjdk.java.net
> >>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> >>> after calling fillInStackTrace
> >>>
> >>> Hi Goetz,
> >>>
> >>>> If changing the stack trace by calling fillInStackTrace in user code, the
> >>>> NPE algorithm lacks the proper information to compute the message.
> >>>> Thus, we must omit it after that call.
> >>>
> >>>> I implement this by checking for a call to fillInStackTrace at the bci
> >>>> recorded in the exception.
> >>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/01/
> >>>
> >>> I tried this when reporting the issue already. The problem with this is
> >>> that it suppresses the message even for a valid exception.
> >>>
> >>> E.g. the following example would no longer produce a helpful NPE message.
> >>>
> >>> public class Main {
> >>> 	public static void main(String[] args) {
> >>> 		NullPointerException ex = null;
> >>> 		ex.fillInStackTrace();
> >>> 	}
> >>> }
> >>>
> >>> Cheers,
> >>> Christoph
> >>>

From calvin.cheung at oracle.com  Wed Jul  1 18:02:10 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 1 Jul 2020 11:02:10 -0700
Subject: RFR(S): 8243586: Optimize calls to
 SystemDictionaryShared::define_shared_package for classpath
In-Reply-To: <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>
References: <b6aedbcc-7675-e32e-2c1b-c18d223931f9@oracle.com>
 <fb0c3a15-2855-9ad9-50a3-65140fda281e@oracle.com>
 <b9d3a805-d948-9a57-3f7d-e3d788bfb5bc@oracle.com>
 <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>
Message-ID: <9b6e7b44-2cbf-e4b2-00c2-40d451273f4e@oracle.com>

Hi David,

Thanks for taking another look.

thanks,

Calvin

On 6/30/20 9:53 PM, David Holmes wrote:
> Hi Calvin,
>
> Updates look good - thanks.
>
> One comment below ...
>
> On 30/06/2020 4:17 am, Calvin Cheung wrote:
>> Hi David,
>>
>> On 6/28/20 8:04 PM, David Holmes wrote:
>>> Hi Calvin,
>>>
>>> Generally looks okay but a few comments/suggestions.
>> Thanks for your review.
>>>
>>> On 25/06/2020 8:53 am, Calvin Cheung wrote:
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243586
>>>>
>>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.00/
>>>>
>>>> The proposed change is to reduce the calls to 
>>>> SystemDictionaryShared::define_shared_package which does a java 
>>>> call to AppClassLoader.defineOrCheckPackage. Currently, 
>>>> define_shared_package is called for every shared class but it is 
>>>> needed only for each package in each jar specified in the classpath.
>>>
>>> src/hotspot/share/classfile/packageEntry.hpp
>>>
>>> +   volatile intptr_t _defined_in_class_path; // a Package java 
>>> object has been define via CDS
>>>
>>> The name and description of this field are somewhat lacking. It is a 
>>> bitmap indicating which CDS classpath entries have defined classes 
>>> in this package.
>> I've renamed the field to _defined_by_cds_in_class_path and expanded 
>> the comment.
>>>
>>> +   bool is_defined_by_cds_in_class_path(intptr_t idx) const {
>>> +   void set_defined_in_class_path(intptr_t idx) {
>>>
>>> These names should be consistent i.e.
>>>
>>> +   void set_defined_by_cds_in_class_path(intptr_t idx) {
>> Renamed the 'set' function as you suggested.
>>>
>>> I agree with Ioi that the idx should be range checked by an assert.
>> Added the assert.
>>>
>>> test/hotspot/jtreg/runtime/cds/appcds/PackageSealing.java
>>>
>>> I don't really understand what this test was doing previously, but 
>>> it is not obvious that it is exercising the new bitmap logic in any 
>>> rigorous way. Is this test change really related to the code change?
>>
>> The original test was to dump and load the "sealed/pkg/C1" and 
>> "pkg/C2" classes from the same jar but from different packages. Only 
>> "sealed/pkg/C1" is from a sealed package.
>>
>> The added test is to dump "sealed/pkg/C3", which is from a 
>> non-sealed package, from non_sealed.jar.
>
> Okay that is what confused me. The name of the package is "sealed" but 
> it's not actually sealed as the jar containing C3 is not marked as 
> sealed.
>
> Thanks,
> David
> -----
>
>> During runtime, it will load "sealed/pkg/C3" and "sealed/pkg/C1" from 
>> the same package but from non_sealed.jar and pkg_seal.jar, 
>> respectively. So this makes sure define package is called for each 
>> package encountered in each jar.
>>
>> As suggested by Ioi, I also added another test to dump 
>> "sealed/pkg/C1" from a sealed package from pkg_seal.jar. During 
>> runtime, the order of loading the classes and the jars in the 
>> classpath are reversed.
>>
>>>
>>>    * @compile test-classes/C1.java
>>>    * @compile test-classes/C2.java
>>> +  * @compile test-classes/C3.java
>>>    * @compile test-classes/PackageSealingTest.java
>>>    * @compile test-classes/Hello.java
>>>
>>> Unless there is a reason separate compilation is required, the 
>>> number of @compile commands can be reduced - perhaps to one.
>> I've combined them into one @compile command but split it across 2 
>> lines to avoid a long line.
>>>
>>> +         classList[1] = "sealed/pkg/C3"; // C3 is from a non-sealed 
>>> package
>>>
>>> The comment seems in contradiction to the path - is it sealed or not?
>>
>> C3 is from a non-sealed package. That's why I added a comment.
>>
>> Note non_sealed.jar was created as follows:
>>
>>     String nonSealedJar =
>>         ClassFileInstaller.writeJar("non_sealed.jar", "sealed/pkg/C3");
>>
>> The pkg_seal.jar was created as follows: (note the package_seal.mf 
>> manifest)
>>
>>     String appJar = ClassFileInstaller.writeJar("pkg_seal.jar",
>>         ClassFileInstaller.Manifest.fromSourceFile("test-classes/package_seal.mf"),
>>         "PackageSealingTest", "sealed/pkg/C1", "pkg/C2");
>>
>>>
>>> +         TestCommon.testDump(jars, TestCommon.list(classList2), 
>>> "-XX:+TraceExceptions");
>>>
>>> TraceExceptions is deprecated - use -Xlog:exceptions=info
>>
>> It turns out it was left over from debugging. I've removed it.
>>
>> updated webrev:
>> http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.01/
>>
>> thanks,
>>
>> Calvin
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> zprint perf results summary:
>>>>
>>>> instr delta = -104380126 -2.2487%
>>>> time delta = -23.779 ms -3.3640%
>>>>
>>>> Testing: running mach5 tier1 and 2 tests.
>>>>
>>>> thanks,
>>>>
>>>> Calvin
>>>>

From calvin.cheung at oracle.com  Wed Jul  1 18:04:17 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 1 Jul 2020 11:04:17 -0700
Subject: RFR(S): 8243586: Optimize calls to
 SystemDictionaryShared::define_shared_package for classpath
In-Reply-To: <e9c7ea58-0f20-58b6-e161-b25065247809@oracle.com>
References: <b6aedbcc-7675-e32e-2c1b-c18d223931f9@oracle.com>
 <fb0c3a15-2855-9ad9-50a3-65140fda281e@oracle.com>
 <b9d3a805-d948-9a57-3f7d-e3d788bfb5bc@oracle.com>
 <2bc0cc46-23f5-122d-06ec-b4a695c0f53a@oracle.com>
 <e9c7ea58-0f20-58b6-e161-b25065247809@oracle.com>
Message-ID: <b564cb19-3d4b-0c18-acc4-605b03146e69@oracle.com>

Hi Ioi,

I've renamed the package as you suggested.

Updated webrev:
    http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.02/

Only test-related files have been changed.

I'll push after the current round of testing looks good.

thanks,

Calvin

On 6/30/20 10:53 PM, Ioi Lam wrote:
>
>
> On 6/30/20 9:53 PM, David Holmes wrote:
>> Hi Calvin,
>>
>> Updates look good - thanks.
>>
>> One comment below ...
>>
>> On 30/06/2020 4:17 am, Calvin Cheung wrote:
>>> Hi David,
>>>
>>> On 6/28/20 8:04 PM, David Holmes wrote:
>>>> Hi Calvin,
>>>>
>>>> Generally looks okay but a few comments/suggestions.
>>> Thanks for your review.
>>>>
>>>> On 25/06/2020 8:53 am, Calvin Cheung wrote:
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8243586
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.00/
>>>>>
>>>>> The proposed change is to reduce the calls to 
>>>>> SystemDictionaryShared::define_shared_package which does a java 
>>>>> call to AppClassLoader.defineOrCheckPackage. Currently, 
>>>>> define_shared_package is called for every shared class but it is 
>>>>> needed only for each package in each jar specified in the classpath.
>>>>
>>>> src/hotspot/share/classfile/packageEntry.hpp
>>>>
>>>> +   volatile intptr_t _defined_in_class_path; // a Package java
>>>> object has been define via CDS
>>>>
>>>> The name and description of this field are somewhat lacking. It is a
>>>> bit map indicating which CDS classpath entries have defined classes
>>>> in this package.
>>> I've renamed the field to _defined_by_cds_in_class_path and expanded
>>> the comment.
>>>>
>>>> +   bool is_defined_by_cds_in_class_path(intptr_t idx) const {
>>>> +   void set_defined_in_class_path(intptr_t idx) {
>>>>
>>>> These names should be consistent i.e.
>>>>
>>>> +   void set_defined_by_cds_in_class_path(intptr_t idx) {
>>> Renamed the 'set' function as you suggested.
>>>>
>>>> I agree with Ioi that the idx should be range checked by an assert.
>>> Added the assert.
>>>>
>>>> test/hotspot/jtreg/runtime/cds/appcds/PackageSealing.java
>>>>
>>>> I don't really understand what this test was doing previously, but 
>>>> it is not obvious that it is exercising the new bitmap logic in any 
>>>> rigorous way. Is this test change really related to the code change?
>>>
>>> The original test was to dump and load the "sealed/pkg/C1" and
>>> "pkg/C2" classes from the same jar but from different packages. Only
>>> "sealed/pkg/C1" is from a sealed package.
>>>
>>> The added test dumps "sealed/pkg/C3", which is from a
>>> non-sealed package, from non_sealed.jar.
>>
>> Okay that is what confused me. The name of the package is "sealed" 
>> but it's not actually sealed as the jar containing C3 is not marked 
>> as sealed.
>
>
> For clarity, maybe we can rename the "sealed/pkg" package to
> something neutral like "foo", and then have
>
>    foo-sealed.jar   -- contains foo/C1, and specifies that "foo" is a
>                        sealed package
>    foo-unsealed.jar -- contains foo/C3; does not specify sealed
>                        packages.
>
> Thanks
> - Ioi
>
>>
>> Thanks,
>> David
>> -----
>>
>>> During runtime, it will load "sealed/pkg/C3" and "sealed/pkg/C1" 
>>> from the same package but from non_sealed.jar and pkg_seal.jar, 
>>> respectively. So this makes sure define package is called for each 
>>> package encountered in each jar.
>>>
>>> As suggested by Ioi, I also added another test to dump 
>>> "sealed/pkg/C1" from a sealed package from pkg_seal.jar. During 
>>> runtime, the order of loading the classes and the jars in the 
>>> classpath are reversed.
>>>
>>>>
>>>>    * @compile test-classes/C1.java
>>>>    * @compile test-classes/C2.java
>>>> +  * @compile test-classes/C3.java
>>>>    * @compile test-classes/PackageSealingTest.java
>>>>    * @compile test-classes/Hello.java
>>>>
>>>> Unless there is a reason separate compilation is required, the
>>>> number of @compile commands can be reduced - perhaps to one.
>>> I've combined them into one @compile command but split it across 2
>>> lines to avoid a long line.
>>>>
>>>> +         classList[1] = "sealed/pkg/C3"; // C3 is from a
>>>> non-sealed package
>>>>
>>>> The comment seems in contradiction to the path - is it sealed or not?
>>>
>>> C3 is from a non-sealed package. That's why I added a comment.
>>>
>>> Note non_sealed.jar was created as follows:
>>>
>>> String nonSealedJar =
>>>     ClassFileInstaller.writeJar("non_sealed.jar", "sealed/pkg/C3");
>>>
>>> The pkg_seal.jar was created as follows: (note the package_seal.mf
>>> manifest)
>>>
>>> String appJar = ClassFileInstaller.writeJar("pkg_seal.jar",
>>>     ClassFileInstaller.Manifest.fromSourceFile("test-classes/package_seal.mf"),
>>>     "PackageSealingTest", "sealed/pkg/C1", "pkg/C2");
>>>
>>>>
>>>> +         TestCommon.testDump(jars, TestCommon.list(classList2),
>>>> "-XX:+TraceExceptions");
>>>>
>>>> TraceExceptions is deprecated - use -Xlog:exceptions=info
>>>
>>> It turns out it was leftover for debugging. I've removed it.
>>>
>>> updated webrev:
>>> http://cr.openjdk.java.net/~ccheung/jdk16/8243586/webrev.01/
>>>
>>> thanks,
>>>
>>> Calvin
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> zprint perf results summary:
>>>>>
>>>>> instr delta = -104380126 -2.2487%
>>>>> time delta = -23.779 ms -3.3640%
>>>>>
>>>>> Testing: running mach5 tier1 and 2 tests.
>>>>>
>>>>> thanks,
>>>>>
>>>>> Calvin
>>>>>
>

From coleen.phillimore at oracle.com  Wed Jul  1 19:15:29 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 1 Jul 2020 15:15:29 -0400
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <ce8be08a-c40d-6ebc-22d2-b4d0bb57e7c6@oracle.com>


Hi Goetz,
I think this last version makes the most sense to me because it moves
the decision to compute the helpful message up to the Java code, where
the condition is more easily tested, rather than down in the
implementation, where we look for the method name of the invoke bytecode.
Hopefully core-libs will concur.
Coleen

On 7/1/20 9:01 AM, Lindenmaier, Goetz wrote:
> Hi David,
>   
> This obviously works, but I'm not sure whether the case
> skipped is worth adding a field to NPE.
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/03/
> (This is not a final webrev, the other fix is still in there commented out.)
>
> The NPE message is already skipped or simplified in a number of other cases
>    - serialization
>    - expressions involving more than 5 bytecodes (recursion depth
>       of the algorithm)
>    - hidden top frames
>    - internal backtrace information is lost
> So this further case, which I consider quite unlikely, just adds
> to existing ones.
> Other decisions were made to reduce the changes to the
> NPE class and memory consumption. E.g., the message is
> not persisted once computed, but recomputed on each
> call to getMessage().
>
> I'll post it to core-libs-dev if we follow this version of the
> fix further.
>
> Best regards,
>    Goetz.
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Wednesday, July 1, 2020 12:50 AM
>> To: Remi Forax <forax at univ-mlv.fr>; Lindenmaier, Goetz
>> <goetz.lindenmaier at sap.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>> On 30/06/2020 5:53 pm, Remi Forax wrote:
>>> ----- Original Message -----
>>>> From: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>
>>>> To: "Christoph Dreis" <christoph.dreis at freenet.de>, "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>
>>>> Sent: Monday, June 29, 2020 14:07:08
>>>> Subject: RE: [15] RFR: 8248476: No helpful NullPointerException message
>>>> after calling fillInStackTrace
>>> [...]
>>>
>>>> An alternative fix I could imagine would be to
>>>> override fillInStackTrace in NPE.java. It could
>>>> call getMessage() and then super.fillInStackTrace,
>>>> and return a new exception with the message.
>>> Overriding fillInStackTrace() and marking it when it is called twice is the
>>> other solution.
>>
>> That was my thought now I understand how/when this process kicks in.
>> Just add a "filled" count that the overriding fillInStackTrace
>> increments, and have getMessage check the count before calling
>> getExtendedNPEMessage.
>>
>> Cheers,
>> David
>> -----
>>
>>>> But this would also compute the message in cases where it is not printed.
>>> You don't need that: you can set the field Throwable.stackTrace to a
>>> marker object, not unlike UNASSIGNED_STACK, to detect if NPE.fillInStackTrace
>>> is called twice,
>>> but as I said earlier, it's not a small change because Throwable should stay
>>> thread safe and serializable.
>>> So I agree with you and Christoph that the "best" solution is to document
>>> that null.fillInStackTrace() doesn't get a detailed error message, unless
>>> someone has a better fix.
>>>> Best regards,
>>>>    Goetz.
>>> Rémi
>>>
>>>>> -----Original Message-----
>>>>> From: Christoph Dreis <christoph.dreis at freenet.de>
>>>>> Sent: Monday, June 29, 2020 1:28 PM
>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>> after calling fillInStackTrace
>>>>>
>>>>> Hi Goetz,
>>>>>
>>>>>> If changing the stack trace by calling fillInStackTrace in user code, the
>>>>>> NPE algorithm lacks the proper information to compute the message.
>>>>>> Thus, we must omit it after that call.
>>>>>> I implement this by checking for a call to fillInStackTrace at the bci
>>>>>> recorded in the exception.
>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/01/
>>>>>
>>>>> I tried this when reporting the issue already. The problem with this is
>>>>> that it suppresses any valid exception.
>>>>>
>>>>> E.g. the following example would not throw any helpful NPE anymore.
>>>>>
>>>>> public class Main {
>>>>> 	public static void main(String[] args) {
>>>>> 		NullPointerException ex = null;
>>>>> 		ex.fillInStackTrace();
>>>>> 	}
>>>>> }
>>>>>
>>>>> Cheers,
>>>>> Christoph
>>>>>


From mandy.chung at oracle.com  Wed Jul  1 20:39:45 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 1 Jul 2020 13:39:45 -0700
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <42e22eec-541f-89c8-7538-6afbb18091cc@oracle.com>

Hi Goetz,

This approach seems reasonable. This is only specific to NPE and so I
also prefer implementing an overridden NPE::fillInStackTrace to setting
the Throwable::stackTrace field to another marker object.

For the javadoc of NPE::fillInStackTrace, it should simply inherit from
Throwable, i.e. using {@inheritDoc}. I think your @implNote is not
needed.


    if (result == this) numStackTracesFilledIn++;

Do you see any case that returns a different object than this? I
believe fillInStackTrace always returns this.
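
A quick way to check this on a given JDK is sketched below. The class and
method names are hypothetical and not taken from the webrev; the only API
used is Throwable.fillInStackTrace(), whose javadoc specifies the return
value as a reference to this Throwable instance.

```java
// Hypothetical check, not part of the webrev.
final class FillInReturnsThis {
    /** Returns true if fillInStackTrace() hands back the receiver itself. */
    static boolean returnsSelf(Throwable t) {
        return t.fillInStackTrace() == t;
    }

    public static void main(String[] args) {
        System.out.println(returnsSelf(new Throwable("plain")));
        System.out.println(returnsSelf(new NullPointerException("npe")));
    }
}
```

An override in some other subclass could still return a different object,
which is presumably the only case the `result == this` check would guard
against.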

Mandy



From david.holmes at oracle.com  Wed Jul  1 23:58:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 2 Jul 2020 09:58:58 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>

Hi Goetz,

On 1/07/2020 11:01 pm, Lindenmaier, Goetz wrote:
> Hi David,
>   
> This obviously works, but I'm not sure whether the case
> skipped is worth adding a field to NPE.
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/03/
> (This is not a final webrev, the other fix is still in there commented out.)

As Mandy points out, you don't need to repeat the documentation, nor add an
impl note, as these are purely internal implementation details.

The method itself can simply be:

public synchronized Throwable fillInStackTrace() {
     numStackTracesFilledIn++;
     return super.fillInStackTrace();
}
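
For illustration, here is a minimal, self-contained sketch of how such a
counter could gate the message. It uses a hypothetical CountingException
rather than NullPointerException itself, and the string "extended message"
stands in for whatever getExtendedNPEMessage would compute. The synchronized
getMessage() is an assumption of this sketch, not something taken from the
webrev.

```java
// Hypothetical stand-in for the NPE change; only numStackTracesFilledIn and
// fillInStackTrace are names from the thread.
class CountingException extends RuntimeException {
    private static final long serialVersionUID = 1L;

    // No explicit initializer: the Throwable constructor already calls
    // fillInStackTrace(), and "= 0" here would run afterwards and reset
    // the count.
    private transient int numStackTracesFilledIn;

    @Override
    public synchronized Throwable fillInStackTrace() {
        numStackTracesFilledIn++;
        return super.fillInStackTrace();
    }

    // Assumption of this sketch: read the counter under the same lock
    // that fillInStackTrace() uses when writing it.
    @Override
    public synchronized String getMessage() {
        String message = super.getMessage();
        if (message == null && numStackTracesFilledIn == 1) {
            // Placeholder for the VM-computed helpful text.
            return "extended message";
        }
        return message;
    }
}
```

The Throwable constructor calls fillInStackTrace() once, so a count of 1
means the backtrace is still the one recorded when the exception was
created; any later user call pushes the count past 1 and the sketch falls
back to the plain message.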

> The NPE message is already skipped or simplified in a number of other cases
>    - serialization
>    - expressions involving more than 5 bytecodes (recursion depth
>       of the algorithm)
>    - hidden top frames
>    - internal backtrace information is lost
> So this further case, which I consider quite unlikely, just adds
> to existing ones.
> Other decisions were made to reduce the changes to the
> NPE class and memory consumption. E.g., the message is
> not persisted once computed, but recomputed on each
> call to getMessage().

I don't think one int field is a problem in terms of memory consumption.

> I'll post it to core-libs-dev if we follow this version of the
> fix further.

Sounds good.

Thanks,
David
-----


From Alan.Bateman at oracle.com  Thu Jul  2 06:10:59 2020
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 2 Jul 2020 07:10:59 +0100
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
Message-ID: <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>

On 02/07/2020 00:58, David Holmes wrote:
>
> I don't think one int field is a problem in terms of memory consumption.
No, but it may change the serial form. If there is a counter for the
number of times that fillInStackTrace is called, then it should be
transient (and there is no need to explicitly initialize it to its default
value).
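
As a rough illustration of the serial-form point, the sketch below
round-trips a hypothetical exception type through Java serialization in
memory. The class name, the fillCount() accessor and the roundTrip() helper
are made up for this example; only the counter name comes from the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical exception type used only for illustration.
class TransientCounterException extends RuntimeException {
    private static final long serialVersionUID = 1L;

    // transient: not written to the stream, so the serial form is unchanged.
    // No "= 0" initializer, which would reset the count set during construction.
    private transient int numStackTracesFilledIn;

    @Override
    public synchronized Throwable fillInStackTrace() {
        numStackTracesFilledIn++;
        return super.fillInStackTrace();
    }

    synchronized int fillCount() {
        return numStackTracesFilledIn;
    }

    /** Serializes and deserializes the exception in memory. */
    static TransientCounterException roundTrip(TransientCounterException e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientCounterException) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

After construction the counter is 1; in the deserialized copy it is back at
the default 0, because no constructor of a Serializable class runs during
deserialization and the transient field is not restored from the stream.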

-Alan

From david.holmes at oracle.com  Thu Jul  2 07:46:42 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 2 Jul 2020 17:46:42 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
Message-ID: <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>

On 2/07/2020 4:10 pm, Alan Bateman wrote:
> On 02/07/2020 00:58, David Holmes wrote:
>>
>> I don't think one int field is a problem in terms of memory consumption.
> No, but it may change the serial form. If there is a counter for the
> number of times that fillInStackTrace is called, then it should be
> transient (and there is no need to explicitly initialize it to its default
> value).

Yes good point! Thanks Alan!

David

> -Alan

From goetz.lindenmaier at sap.com  Thu Jul  2 14:11:47 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 2 Jul 2020 14:11:47 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
Message-ID: <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi everybody, 

Thanks for the plentiful feedback!

I tried to incorporate all of it in this webrev.
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/04/

Alan, thanks for pointing out the transient issue. After serialization, no
message will be generated anyway, so checking the field is pointless,
and the field can have any value.
Mandy, thanks for pointing me to @inheritDoc.

About checking
  if (result == this) numStackTracesFilledIn++;
I could also do result.numStackTracesFilledIn++;
But yes, looking at the implementation of
fillInStackTrace, it returns this. Still, I like the
code to be locally correct. If the called function is
changed, the local code will break. After all,
fillInStackTrace returns the object.

But as several of you agreed, let's keep it as
it is now.

Best regards,
  Goetz.


From christoph.dreis at freenet.de  Thu Jul  2 15:18:12 2020
From: christoph.dreis at freenet.de (Christoph Dreis)
Date: Thu, 02 Jul 2020 17:18:12 +0200
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>

Hi Goetz,

> Thanks for the plentiful feedback!

> I tried to incorporate all in this webrev.
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/04/

Thanks for pushing this forward.

I noticed a minor typo:
In line 1286 of the new test in NullPointerExceptionTest the comment says "crated" while it probably should read "created".

Cheers,
Christoph




From goetz.lindenmaier at sap.com  Thu Jul  2 16:45:36 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 2 Jul 2020 16:45:36 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
Message-ID: <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Christoph, 

I fixed the comment, thanks for pointing that out.

Best regards,
  Goetz



From forax at univ-mlv.fr  Thu Jul  2 17:28:34 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 2 Jul 2020 19:28:34 +0200 (CEST)
Subject: [15] RFR: 8248476: No helpful NullPointerException message
 after calling fillInStackTrace
In-Reply-To: <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <1622497653.718295.1593710914814.JavaMail.zimbra@u-pem.fr>

Hi Goetz,
I believe numStackTracesFilledIn has to be declared volatile too,
because otherwise, if one thread explicitly calls fillInStackTrace() and another calls getMessage(), the second thread may not see the updated value of numStackTracesFilledIn.

Rémi

----- Original Message -----
> From: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>
> To: "Christoph Dreis" <christoph.dreis at freenet.de>
> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Alan
> Bateman" <Alan.Bateman at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
> Sent: Thursday, 2 July 2020 18:45:36
> Subject: RE: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace

> Hi Christoph,
> 
> I fixed the comment, thanks for pointing that out.
> 
> Best regards,
>  Goetz
> 

From Alan.Bateman at oracle.com  Thu Jul  2 18:47:31 2020
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 2 Jul 2020 19:47:31 +0100
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <7281D8A1-2C61-45AF-807E-B9026ED631B4@freenet.de>
 <AM4PR0202MB2964B42555C5BCFE1D5F0992EC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <90229272.987804.1593503634927.JavaMail.zimbra@u-pem.fr>
 <3f09e223-0046-4771-0a76-793cf92a69f7@oracle.com>
 <AM4PR0202MB29649BF48721B1BC4330431BEC6C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>


On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> Hi Christoph,
>
> I fixed the comment, thanks for pointing that out.
>
One other thing is that NPE::getMessage reads numStackTracesFilledIn 
without synchronization.

-Alan

From forax at univ-mlv.fr  Thu Jul  2 18:52:12 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 2 Jul 2020 20:52:12 +0200 (CEST)
Subject: [15] RFR: 8248476: No helpful NullPointerException message
 after calling fillInStackTrace
In-Reply-To: <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
Message-ID: <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>

Yes,
that's what I was saying.
Given that an NPE can be thrown very early, before VarHandle is initialized, I believe that declaring numStackTracesFilledIn volatile is the best way to tackle that.

Rémi


From goetz.lindenmaier at sap.com  Thu Jul  2 19:30:16 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 2 Jul 2020 19:30:16 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
Message-ID: <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>

Hi Remi,

But how does volatile help?
I see that the test for numStackTracesFilledIn == 1 then always gets the
right value.
But the backtrace must not change before I read it in
getExtendedNPEMessage. The other thread could change it after I check
numStackTracesFilledIn and before I read the backtrace.

I want to vote again for the much simpler version
proposed in webrev 02:
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/

Its only drawback is that for this code:
  ex = null;
  ex.fillInStackTrace()
no message is created.

I think this really is acceptable.


Remi, I didn't comment on this statement from a previous mail:
> > Hmm, Throwable.stackTrace is used for the stack trace at some point.
> yes, it contains the Java stack trace, but if the Java stack trace is filled you don't 
> compute any helpful message anyway.
The internal structure is no longer deleted when the stack trace
is filled, so the message can also be computed later.

Best regards,
  Goetz.


From forax at univ-mlv.fr  Thu Jul  2 20:09:56 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 2 Jul 2020 22:09:56 +0200 (CEST)
Subject: [15] RFR: 8248476: No helpful NullPointerException message
 after calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <687478080.754273.1593720596913.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "Alan Bateman" <Alan.Bateman at oracle.com>
> Cc: "Christoph Dreis" <christoph.dreis at freenet.de>, "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David
> Holmes" <david.holmes at oracle.com>
> Sent: Thursday, 2 July 2020 21:30:16
> Subject: RE: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace

> Hi Remi,
> 
> But how does volatile help?
> I see that the test for numStackTracesFilledIn == 1 then always gets the
> right value.
> But the backtrace must not change before I read it in
> getExtendedNPEMessage. The other thread could change it after I check
> numStackTracesFilledIn and before I read the backtrace.

It helps because, if the thread that explicitly calls fillInStackTrace() is not the one that calls getExtendedNPEMessage(), the latter can see 0 instead of 1 even though the stack trace has already changed,
because there is no happens-before relationship between the write to the stacktrace field and the write to numStackTracesFilledIn.
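The happens-before argument can be sketched as follows. This is a minimal illustration with hypothetical names (`Holder`, `payload`, `counter`), not the NullPointerException code: a plain write followed by a volatile write in one thread becomes visible to another thread once that thread has read the volatile field and seen the new value.

```java
public class VolatilePublishDemo {
    static class Holder {
        int payload;           // plain field, stands in for the stack trace data
        volatile int counter;  // volatile field, stands in for the counter

        void publish(int value) {
            payload = value;   // plain write, ordered before the volatile write
            counter = 1;       // volatile write publishes everything written before it
        }

        int readWhenPublished() {
            // The volatile read that observes 1 happens-after the volatile write,
            // so the following plain read is guaranteed to see the payload.
            while (counter == 0) {
                // busy-wait until the writer has published
            }
            return payload;
        }
    }

    // Starts a reader thread, publishes 42 from the calling thread,
    // and returns what the reader observed.
    static int run() {
        Holder holder = new Holder();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = holder.readWhenPublished());
        reader.start();
        holder.publish(42);
        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that this only covers visibility. It does not stop a second thread from changing the data between the check and the later read, which is the objection raised earlier in the thread.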

> 
> I want to vote again for the much simpler version
> proposed in webrev 02:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
> 
> Its only drawback is that for this code:
>  ex = null;
>  ex.fillInStackTrace()
> no message is created.
> 
> I think this really is acceptable.

As I said earlier, I also think it's a better idea than adding a bunch of code to NPE.java just for one corner of a corner case (explicitly calling fillInStackTrace() ... with null).

Rémi

From Monica.Beckwith at microsoft.com  Thu Jul  2 21:38:07 2020
From: Monica.Beckwith at microsoft.com (Monica Beckwith)
Date: Thu, 2 Jul 2020 21:38:07 +0000
Subject: Update: JEP drafted (was: RE: [EXTERNAL] Re: [aarch64-port-dev ]
 OpenJDK extension to AArch64 and Windows)
Message-ID: <DM6PR21MB1323FA866ECB223970E4947DE56D0@DM6PR21MB1323.namprd21.prod.outlook.com>

Hello all,

Here's our JEP: https://bugs.openjdk.java.net/browse/JDK-8248496. Thank you all so much for helping us out with the process.

We also have refined our changesets, and we are in the process of attaching them to the umbrella bug: https://bugs.openjdk.java.net/browse/JDK-8248238.

@Dalibor Topic: I wanted to provide more details on the UI support - we have recently tested "JFC applications and applets"[0] in the JDK `demo/jfc` directory on three different Arm64 systems[1]. I found a bug related to Vectored Exception Handling (VEH) and Structured Exception Handling (SEH). @Ludovic is involved in the VEH/SEH discussions on the mailing list. And internally, we will make sure that the changeset for `VEH for aarch64` will incorporate the changes.

@David Holmes: I will start a new RFR with the umbrella bug ID (8248238) in the subject once we have all the changesets ready to go.

Thanks,

Monica

[0]: https://docs.oracle.com/javase/7/docs/technotes/samples/demos.html 
[1]: https://github.com/microsoft/openjdk-aarch64/blob/master/Arm64_systems.md

-----Original Message-----
From: Dalibor Topic <dalibor.topic at oracle.com> 
Sent: Friday, June 26, 2020 8:07 AM
To: Andrew Haley <aph at redhat.com>; Mario Torre <neugens at redhat.com>
Cc: Monica Beckwith <Monica.Beckwith at microsoft.com>; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
Subject: [EXTERNAL] Re: [aarch64-port-dev ] OpenJDK extension to AArch64 and Windows

On 26.06.2020 14:32, Andrew Haley wrote:
> On 26/06/2020 13:18, Dalibor Topic wrote:
>> Since there is no aarch64-port repository tracking jdk/jdk yet to 
>> host the port's changes in development (and stage it for a later 
>> merge into mainline once it's ready), I assume that's going to be the 
>> first step in that case.
> 
> Given that the patches are simpler and smaller than many changes that 
> just go into mainline, is there an actual reason for a new repo?

I assume that there would be some work remaining to be done on the port, since it's not quite done yet. For example, the UI layer has not been ported, according to Microsoft, which means that the port is not really fully functional in its current state. [0]

If that's just a matter of days, then sure, I fully understand that you may not want to add a new aarch64-port repo just for that.

On the other hand, if there is a question mark over whether the port would become fully functional in the coming weeks or months, then trying to integrate the port-specific parts of it piecemeal into mainline before that is the case ... would seem a bit premature to me.

In that case, an aarch64-port specific jdk/jdk staging repo might be more useful.

cheers,
dalibor topic

[0]
https://www.reddit.com/r/java/comments/hf4ofr/announcing_openjdk_for_windows_10_on_arm/fvvmi8g/

--
<http://www.oracle.com/> Dalibor Topic Consulting Product Manager
Phone: +494089091214 <tel:+494089091214>, Mobile: +491737185961 <tel:+491737185961>, Video: dalibor.topic at oracle.com <sip:dalibor.topic at oracle.com>

Oracle Global Services Germany GmbH
Hauptverwaltung: Riesstr. 25, D-80992 München
Registergericht: Amtsgericht München, HRB 246209
Geschäftsführer: Ralf Herrmann


From david.holmes at oracle.com  Fri Jul  3 01:37:20 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 3 Jul 2020 11:37:20 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>

Hi Goetz,

On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> Hi Remi,
> 
> But how does volatile help?
> I see that the test for numStackTracesFilledIn == 1 then always gets the
> right value.
> But the backtrace must not change before I read it in
> getExtendedNPEMessage. The other thread could change it after I check
> numStackTracesFilledIn and before I read the backtrace.

True. To ensure you process the original backtrace only you need to add 
synchronization in getMessage():

       public String getMessage() {
           String message = super.getMessage();
           // If the stack trace was changed the extended NPE algorithm
           // will compute a wrong message.
+         synchronized(this) {
!             if (message == null && numStackTracesFilledIn == 1) {
!                 return getExtendedNPEMessage();
!             }
+         }
           return message;
       }

To be honest the idea that someone would share an exception instance and 
concurrently mutate it with fillInStackTrace() whilst printing out 
information about it just seems highly unrealistic. But the above fixes 
it simply. Though after looking at comments in the test I would also 
suggest that setStackTrace be updated:

        synchronized (this) {
             if (this.stackTrace == null && // Immutable stack
                 backtrace == null) // Test for out of protocol state
                 return;
+           numStackTracesFilledIn++;
             this.stackTrace = defensiveCopy;
         }
     }

as that would seem to be another hole in the mechanism.
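The two suggestions above can be put together in a self-contained sketch. `TrackedException` and the placeholder message string are hypothetical; the real change lives in NullPointerException and Throwable and computes the message natively. The sketch assumes the counter is bumped on every fillInStackTrace() and setStackTrace() call and that getMessage() checks it under the same monitor.

```java
public class NpeCounterSketch {
    // Hypothetical stand-in for NullPointerException with the proposed counter.
    static class TrackedException extends RuntimeException {
        private static final long serialVersionUID = 1L;

        // No initializer on purpose: the Throwable constructor calls
        // fillInStackTrace() before subclass field initializers run, so an
        // explicit "= 0" would wipe out that first increment.
        private transient int numStackTracesFilledIn;

        @Override
        public synchronized Throwable fillInStackTrace() {
            numStackTracesFilledIn++;
            return super.fillInStackTrace();
        }

        @Override
        public void setStackTrace(StackTraceElement[] stackTrace) {
            synchronized (this) {
                // Replacing the trace also invalidates the original one.
                numStackTracesFilledIn++;
            }
            super.setStackTrace(stackTrace);
        }

        @Override
        public String getMessage() {
            String message = super.getMessage();
            synchronized (this) {
                // Only the trace filled in by the constructor may be analysed.
                if (message == null && numStackTracesFilledIn == 1) {
                    return "message derived from the original trace";
                }
            }
            return message;
        }
    }

    public static void main(String[] args) {
        TrackedException fresh = new TrackedException();
        System.out.println(fresh.getMessage());

        TrackedException refilled = new TrackedException();
        refilled.fillInStackTrace(); // second fill: counter is now 2
        System.out.println(refilled.getMessage());
    }
}
```

Holding the monitor in getMessage() means a concurrent fillInStackTrace(), which is itself synchronized, cannot slip in between the counter check and the use of the trace.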

> I want to vote again for the much simpler version
> proposed in webrev 02:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/

I much prefer the latest version that recognises that only the original 
stack can be processed.

In the test:

+         // This holds for explicitly crated NPEs, but also for implicilty

Two typos: crated  & implicilty

Thanks,
David
-----



From goetz.lindenmaier at sap.com  Fri Jul  3 06:32:18 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 3 Jul 2020 06:32:18 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
Message-ID: <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi,

> True. To ensure you process the original backtrace only you need to add
> synchronization in getMessage():
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/

I added the volatile too, but as I understand it, the synchronized
block provides sufficient memory barriers that this also works
without it.

> To be honest the idea that someone would share an exception instance and
> concurrently mutate it with fillInStackTrace() whilst printing out
> information about it just seems highly unrealistic.
Yes, contention here is quite unlikely, so it should not harm performance.

> Though after looking at comments in the test I would also
> suggest that setStackTrace be updated:
The test shows that the correct message is still computed after
setStackTrace. This is because the algorithm uses Throwable::backtrace
and not Throwable::stackTrace. Throwable::backtrace is not
affected by setStackTrace.
The behavior is the same as with any exception: if you fiddle
with the stack trace but don't adapt the message text,
the message might refer to different code than the stack trace
points to.

Best regards,
  Goetz.



From david.holmes at oracle.com  Fri Jul  3 07:29:51 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 3 Jul 2020 17:29:51 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>

Hi Goetz,

On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> Hi,
> 
>> True. To ensure you process the original backtrace only you need to add
>> synchronization in getMessage():
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
> 
> I added the volatile, too, but as I understand the synchronized
> block brings sufficient memory barriers that this also works
> without.

No "volatile" needed, or wanted, when all access is within synchronized 
regions.
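A minimal sketch of that rule, using a hypothetical `Counter` class: the field is a plain int, every read and write happens while holding the same monitor, and both atomicity and visibility follow from the lock alone.

```java
public class SynchronizedCounterDemo {
    static class Counter {
        private int count; // deliberately not volatile

        synchronized void increment() {
            count++;
        }

        synchronized int get() {
            return count;
        }
    }

    // Runs two threads that each increment the counter perThread times.
    static int run(int perThread) {
        Counter counter = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000));
    }
}
```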

>> To be honest the idea that someone would share an exception instance and
>> concurrently mutate it with fillInStackTrace() whilst printing out
>> information about it just seems highly unrealistic.
> Yes, contention here is quite unlikely, so it should not harm performance.

Contention was not my concern at all. :)

>> Though after looking at comments in the test I would also
>> suggest that setStackTrace be updated:
> The test shows that after setStackTrace still the correct message
> is computed. This is because the algorithm uses Throwable::backtrace
> and not Throwable::stacktrace.  Throwable::backtrace is not
> affected by setStackTrace.
> The behavior is just as with any exception. If you fiddle
> with the stack trace, but don't adapt the message text,
> the message might refer to other code than the stack trace
> points to.

But you can't adapt the message text - there is no setMessage! If the 
message is NULL and you call setStackTrace() then getMessage(), it makes 
no sense to return the extended error message that was associated with 
the original stack/backtrace.
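
The asymmetry can be seen with any ordinary exception. The following is a small sketch, using a plain RuntimeException rather than an NPE so that the result does not depend on the helpful-NPE feature or the JDK version; the class name StackTraceVsMessage and the fake frame are made up for illustration. The stack trace can be replaced through the public API, but the message stays as it was.

```java
// Illustrative only: shows that setStackTrace() replaces the reported
// frames while the message of the exception cannot be changed afterwards.
final class StackTraceVsMessage {
    // Returns { message before, message after, method name of first frame after }.
    static String[] demo() {
        RuntimeException e = new RuntimeException("original message");
        String before = e.getMessage();

        // A made-up frame pointing at code unrelated to where 'e' was created.
        StackTraceElement fake =
            new StackTraceElement("com.example.Elsewhere", "otherMethod", "Elsewhere.java", 42);
        e.setStackTrace(new StackTraceElement[] { fake });

        return new String[] {
            before,
            e.getMessage(),
            e.getStackTrace()[0].getMethodName()
        };
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```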

Cheers,
David

> Best regards,
>    Goetz.
> 
> 
> 
> 
> 
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Friday, July 3, 2020 3:37 AM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>> Hi Goetz,
>>
>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>> Hi Remi,
>>>
>>> But how does volatile help?
>>> I see that the test for numStackTracesFilledIn == 1 then always gets
>>> the right value.
>>> But the backtrace must not change before I read it in
>>> getExtendedNPEMessage.  The other thread could change it after I
>>> check numStackTracesFilledIn and before I read the backtrace.
>>
>> True. To ensure you process the original backtrace only you need to add
>> synchronization in getMessage():
>>
>>         public String getMessage() {
>>             String message = super.getMessage();
>>             // If the stack trace was changed the extended NPE algorithm
>>             // will compute a wrong message.
>> +         synchronized(this) {
>> !             if (message == null && numStackTracesFilledIn == 1) {
>> !                 return getExtendedNPEMessage();
>> !             }
>> +         }
>>             return message;
>>         }
>>
>> To be honest the idea that someone would share an exception instance and
>> concurrently mutate it with fillInStackTrace() whilst printing out
>> information about it just seems highly unrealistic. But the above fixes
>> it simply. Though after looking at comments in the test I would also
>> suggest that setStackTrace be updated:
>>
>>          synchronized (this) {
>>               if (this.stackTrace == null && // Immutable stack
>>                   backtrace == null) // Test for out of protocol state
>>                   return;
>> +           numStackTracesFilledIn++;
>>               this.stackTrace = defensiveCopy;
>>           }
>>       }
>>
>> as that would seem to be another hole in the mechanism.
>>
>>> I want to vote again for the much simpler version
>>> proposed in webrev 02:
>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>
>> I much prefer the latest version that recognises that only the original
>> stack can be processed.
>>
>> In the test:
>>
>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>
>> Two typos: crated  & implicilty
>>
>> Thanks,
>> David
>> -----
>>
>>
>>> Its only drawback is that for this code:
>>>     ex = null;
>>>     ex.fillInStackTrace()
>>> no message is created.
>>>
>>> I think this really is acceptable.
>>>
>>>
>>> Remi, I didn't comment on this statement from a previous mail:
>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
>>>> yes, it contains the Java stack trace, but if the Java stack trace is filled you don't
>>>> compute any helpful message anyway.
>>> The internal structure is no longer deleted when the stack trace
>>> is filled. So the message can be computed later, too.
>>>
>>> Best regards,
>>>     Goetz.
>>>
>>>> -----Original Message-----
>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis
>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>> <hotspot-runtime-dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>> after calling fillInStackTrace
>>>>
>>>> yes,
>>>> it's what I was saying,
>>>> given that an NPE can be thrown very early, before VarHandle is initialized, I
>>>> believe that declaring numStackTracesFilledIn volatile is the best way to
>>>> tackle that.
>>>>
>>>> Rémi
>>>>
>>>> ----- Original Message -----
>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
>>>>> Sent: Thursday, July 2, 2020 20:47:31
>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>
>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>> Hi Christoph,
>>>>>>
>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>
>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>> without synchronization.
>>>>>
>>>>> -Alan

From goetz.lindenmaier at sap.com  Fri Jul  3 09:01:29 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 3 Jul 2020 09:01:29 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
Message-ID: <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi David,

> But you can't adapt the message text - there is no setMessage! If the
> message is NULL and you call setStackTrace() then getMessage(), it makes
> no sense to return the extended error message that was associated with
> the original stack/backtrace.
That is true. This is because it is complete nonsense to call setStackTrace() on an
exception thrown by the runtime. If someone does so, it's his
problem.
We would have to fix this for all the exceptions thrown by the runtime
that give a message.  Otherwise it is not consistent.

I guess the normal use case of setStackTrace is the other way around:
change the message and throw a new exception with the existing
stack trace:

try {
  int x = a.x;
} catch (NullPointerException e) {
  // setStackTrace() returns void, so it cannot be chained inside the throw.
  NullPointerException npe = new NullPointerException("My own error message");
  npe.setStackTrace(e.getStackTrace());
  throw npe;
}

And not taking an arbitrary stack trace and putting it into an exception
with an existing message.

Best regards,
  Goetz.



From felix.yang at huawei.com  Fri Jul  3 09:07:29 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 3 Jul 2020 09:07:29 +0000
Subject: RFR(XS): 8248219: aarch64: missing memory barrier in
 fast_storefield and fast_accessfield
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com> 
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>

Ping...

Is it OK to push?

Thanks,
Felix

> -----Original Message-----
> From: Yangfei (Felix)
> Sent: Sunday, June 28, 2020 8:32 PM
> To: 'Andrew Haley' <aph at redhat.com>; hotspot-runtime-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: RE: RFR(XS): 8248219: aarch64: missing memory barrier in
> fast_storefield and fast_accessfield
> 
> Hi Andrew,
> 
>   Sorry for the late reply.  It's Dragon Boat Festival here in China.
> 
> > -----Original Message-----
> > From: Andrew Haley [mailto:aph at redhat.com]
> > Sent: Wednesday, June 24, 2020 6:45 PM
> > To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-dev at openjdk.java.net
> > Cc: aarch64-port-dev at openjdk.java.net
> > Subject: Re: RFR(XS): 8248219: aarch64: missing memory barrier in
> > fast_storefield and fast_accessfield
> >
> > On 24/06/2020 10:38, Yangfei (Felix) wrote:
> > >   Suggestions?
> >
> > Great catch!
> 
> Thanks for the quick review :-)
> 
> > Thanks for that, I completely agree. Please benchmark the two and
> > unless there is an advantage for the address dependency we'll go with
> > LoadLoad.  It
> 
> I used a simple test [1] that exercises object field reads and writes, and
> ran it in -Xint mode.
> The results on our aarch64 platform show no advantage for the address
> dependency approach.
> So I think we are OK to go with LoadLoad.
> 
> Webrev: http://cr.openjdk.java.net/~fyang/8248219/webrev.00/
> 
> > looks like all of the ConstantPoolCacheEntry::set methods use
> > Atomic::release_store, so everything should be fine there.
> >
> > Did you also look to see if we need similar memory barriers elsewhere?
> 
> Yes, I checked and only saw these two places.
> 
> > We're going to need backports for all extant OpenJDK versions. Please
> > let us know if you can handle that too.
> 
> Well, I think at least I can handle jdk8u-shenandoah, jdk11u and jdk/jdk.
> 
> Felix
> 
> [1]
> class TestCP {
> 
>   public int num;
> 
>   void run(int reps) {
>     for (int i = 0; i < reps; i++) {
>       num += i;
>     }
>   }
> 
>   public static void main(String[] args) throws Exception {
>     int reps = Integer.parseInt(System.getProperty("repos"));
>     TestCP t = new TestCP();
> 
>     long startTime = System.nanoTime();
>     t.run(reps);
>     long endTime = System.nanoTime();
>     System.out.println(endTime - startTime);
>   }
> }
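
The barrier pairing discussed in this thread (a releasing store on the writer side, a LoadLoad barrier on the reader side) can be sketched in Java terms. This is only a rough analogy under stated assumptions: the class PublishedEntry and its fields are hypothetical, it uses the static fence methods of java.lang.invoke.VarHandle, and it is not the interpreter code touched by the webrev.

```java
import java.lang.invoke.VarHandle;

// Hypothetical analogy: 'payload' is data that must be visible once
// 'resolvedFlag' has been observed as set.
final class PublishedEntry {
    private int payload;       // written before publication
    private int resolvedFlag;  // plays the role of a "resolved" marker

    // Writer side: store the data, then publish the flag.
    void publish(int value) {
        payload = value;
        VarHandle.releaseFence();   // keeps the payload store ahead of the flag store
        resolvedFlag = 1;
    }

    // Reader side: returns the payload if published, otherwise -1.
    int tryRead() {
        int flag = resolvedFlag;
        VarHandle.loadLoadFence();  // keeps the flag load ahead of the payload load
        return flag == 1 ? payload : -1;
    }

    public static void main(String[] args) {
        PublishedEntry e = new PublishedEntry();
        System.out.println(e.tryRead());
        e.publish(7);
        System.out.println(e.tryRead());
    }
}
```

Without the reader-side fence, a weakly ordered CPU such as aarch64 may satisfy the payload load before the flag load and return stale data even though the flag was seen as set. The main method above is single-threaded and only shows the call pattern; it does not demonstrate the reordering itself.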

From aph at redhat.com  Fri Jul  3 17:03:32 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 3 Jul 2020 18:03:32 +0100
Subject: RFR(XS): 8248219: aarch64: missing memory barrier in
 fast_storefield and fast_accessfield
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
Message-ID: <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>

On 03/07/2020 10:07, Yangfei (Felix) wrote:
> Is it OK to push?

Sure, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From david.holmes at oracle.com  Fri Jul  3 22:08:33 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 4 Jul 2020 08:08:33 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>

On 3/07/2020 7:01 pm, Lindenmaier, Goetz wrote:
> Hi David,
> 
>> But you can't adapt the message text - there is no setMessage! If the
>> message is NULL and you call setStackTrace() then getMessage(), it makes
>> no sense to return the extended error message that was associated with
>> the original stack/backtrace.
> That is true. This is because it is complete nonsense to call setStackTrace() on the
> exception thrown by the runtime. If someone does so, it's his
> problem.
> We would have to fix this for all the exceptions thrown by the runtime
> that give a message.  Else it is not consistent.

Sorry but that consistency argument is a huge stretch in the case of the 
helpful NPE message because the original message is empty! This is only 
about the helpful NPE message and you can trivially disable it for this case.
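
A minimal model of what disabling it could look like, assuming the counter approach quoted earlier in the thread. The class CountingNpeModel and its computeMessage() helper are hypothetical; the real change would live in NullPointerException and Throwable. Any later fillInStackTrace() or setStackTrace() call bumps the counter, and the computed message is only returned while the counter is still 1.

```java
// Hypothetical model of the counter idea; not the JDK implementation.
class CountingNpeModel extends RuntimeException {
    // No initializer on purpose: Throwable's constructor already calls
    // fillInStackTrace(), and an initializer would reset the count afterwards.
    private int numStackTracesFilledIn;

    @Override
    public synchronized Throwable fillInStackTrace() {
        numStackTracesFilledIn++;
        return super.fillInStackTrace();
    }

    @Override
    public void setStackTrace(StackTraceElement[] stackTrace) {
        synchronized (this) {
            // A user-supplied trace no longer matches the original backtrace.
            numStackTracesFilledIn++;
        }
        super.setStackTrace(stackTrace);
    }

    @Override
    public String getMessage() {
        String message = super.getMessage();
        synchronized (this) {
            // Only the untouched, original trace may be used for the computed text.
            if (message == null && numStackTracesFilledIn == 1) {
                return computeMessage();
            }
        }
        return message;
    }

    // Stand-in for getExtendedNPEMessage().
    private String computeMessage() {
        return "computed from the original backtrace";
    }
}
```

With this model, a fresh instance returns the computed text, while an instance whose trace was replaced or refilled returns null, which is the behaviour argued for above.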

> I guess the normal use case of setStackTrace is the other way around:
> change the message and throw a new exception with the existing
> stack trace:
> 
> try {
>    int x = a.x;
> } catch (NullPointerException e) {
>    // setStackTrace() returns void, so it cannot be chained inside the throw.
>    NullPointerException npe = new NullPointerException("My own error message");
>    npe.setStackTrace(e.getStackTrace());
>    throw npe;
> }
> 
> And not taking an arbitrary stack trace and putting it into an exception
> with an existing message.

Interesting usage.

Cheers,
David
-----

> Best regards,
>    Goetz.
> 
> 
> 

From thomas.stuefe at gmail.com  Sat Jul  4 05:35:19 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 4 Jul 2020 07:35:19 +0200
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region()
 does not properly update summary counting
In-Reply-To: <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
Message-ID: <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>

Hi Zhengyu,

sorry for the wait.

On Tue, Jun 30, 2020 at 3:51 PM Zhengyu Gu <zgu at redhat.com> wrote:

> Hi Thomas,
>
> On 6/30/20 12:26 AM, Thomas Stüfe wrote:
> >
> >
> > On Mon, Jun 29, 2020 at 4:29 PM Zhengyu Gu <zgu at redhat.com> wrote:
> >
> >     Hi Thomas,
> >
> >     On 6/29/20 2:30 AM, Thomas Stüfe wrote:
> >      >
> >      >     - splitting the reserved region
> >      >
> >      >     Could we not simply add the call
> >      >     to VirtualMemorySummary::record_released_memory(size,
> >      >     reserved_rgn->flag()); to
> >      >     VirtualMemoryTracker::split_reserved_region() instead?
> >
> >     I assume you could. Then removing and adding reserved region calls
> >     are asymmetrical.
> >
> >
> > How so?
>
> I don't like to mix different level calls in one method.
>
> >
> >     For this particular case, remove_released_region() turns out to be
> >     exactly what you suggested, since
> >     reserved_rgn->same_region(addr, size) == true.
> >
> >     ...
> >     VirtualMemorySummary::record_released_memory(size,
> >     reserved_rgn->flag());
> >
> >     ...
> >
> >     return _reserved_regions->remove(rgn);
> >
> >
> >
> > One other disadvantage is that
> > using VirtualMemoryTracker::remove_released_region() will do the region
> > lookup again, so we pay twice for the lookup.
>
> Yep.
>
> How about refactoring the remove_released_region() method into two? It
> addresses both problems.
>
> http://cr.openjdk.java.net/~zgu/JDK-8248426/webrev.01/index.html
>
>
This looks good, but why did you remove the "!same region" condition? I
believe that is needed for the case when CDS' first mapping encounters
errors, so before it rebuilds CDS at another location it removes (a) all
mappings and then (b) the enclosing reservation. (a) should be ignored by
NMT but (b) should not. I may be wrong, I have had no coffee yet.
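
For context, the summary-counting problem this review is about can be modelled in a few lines. The sketch below is a toy model, not NMT code: RegionTracker, its Flag values and the signature of split() are assumptions made for illustration. The point it shows is that the removal performed while splitting has to go through the path that also updates the per-category summary; otherwise the original reservation stays counted under its old category.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of reserved-region tracking with per-category summary counters.
final class RegionTracker {
    enum Flag { UNKNOWN, CLASS_SHARED, CLASS_SPACE }   // hypothetical categories

    private static final class Region {
        final long base;
        final long size;
        final Flag flag;

        Region(long base, long size, Flag flag) {
            this.base = base;
            this.size = size;
            this.flag = flag;
        }
    }

    private final TreeMap<Long, Region> regions = new TreeMap<>();
    private final Map<Flag, Long> reservedByFlag = new EnumMap<>(Flag.class);

    long reserved(Flag f) {
        return reservedByFlag.getOrDefault(f, 0L);
    }

    void reserve(long base, long size, Flag flag) {
        regions.put(base, new Region(base, size, flag));
        reservedByFlag.merge(flag, size, Long::sum);
    }

    // Counting removal: drops the region AND subtracts it from the summary.
    private void removeCounted(Region r) {
        regions.remove(r.base);
        reservedByFlag.merge(r.flag, -r.size, Long::sum);
    }

    // Splits the region starting at 'base' after 'split' bytes and re-tags both parts.
    void split(long base, long split, Flag first, Flag second) {
        Region r = regions.get(base);
        if (r == null || split <= 0 || split >= r.size) {
            throw new IllegalArgumentException("no such region or bad split point");
        }
        removeCounted(r);   // a bare regions.remove() here would leave stale counts
        reserve(base, split, first);
        reserve(base + split, r.size - split, second);
    }
}
```

After reserving 100 bytes as UNKNOWN and splitting at 40, the model reports 0 bytes for UNKNOWN, 40 for the first category and 60 for the second. Replacing removeCounted() with a plain map removal would leave 100 bytes under UNKNOWN, which mirrors the "Unknown (reserved=...)" line in the report quoted below.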

Cheers, Thomas

> Thanks,
>
> -Zhengyu
>
> >
> > But if you are still unconvinced, I won't hold you up. The change
> > certainly works as it is now and is okay for me, it just would not be my
> > preferred solution.
> >
> > Cheers, Thomas
> >
> >     Thanks,
> >
> >     -Zhengyu
> >
> >
> >      >
> >      >     Thanks, Thomas
> >      >
> >      >
> >      >
> >      >
> >      >
> >      >     On Fri, Jun 26, 2020 at 10:54 PM Zhengyu Gu <zgu at redhat.com> wrote:
> >      >
> >      >         Hi,
> >      >
> >      >         Please review this trivial patch that fixes summary counting in
> >      >         VirtualMemoryTracker::split_reserved_region().
> >      >
> >      >         The method uses an internal method to remove a reserved region, which does
> >      >         not update counting information. It should use the high-level tracking
> >      >         method VirtualMemoryTracker::remove_released_region() instead.
> >      >
> >      >         Without the patch, the NMT summary reports uncategorized memory, e.g.
> >      >
> >      >         -                   Unknown (reserved=1060416KB, committed=0KB)
> >      >                                       (mmap: reserved=1060416KB, committed=0KB)
> >      >
> >      >
> >      >
> >      >         Bug: https://bugs.openjdk.java.net/browse/JDK-8248426
> >      >         Webrev:
> >     http://cr.openjdk.java.net/~zgu/JDK-8248426/webrev.00/
> >      >
> >      >         Test:
> >      >             hotspot_nmt
> >      >             Submit test in progress
> >      >
> >      >
> >      >         Thanks,
> >      >
> >      >         -Zhengyu
> >      >
> >
>
>

From aph at redhat.com  Sat Jul  4 07:30:52 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 4 Jul 2020 08:30:52 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
Message-ID: <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>

On 03/07/2020 18:03, Andrew Haley wrote:
> On 03/07/2020 10:07, Yangfei (Felix) wrote:
>> Is it OK to push?
> 
> Sure, thanks.

We will need backports for (deep intake of breath) 15, 11 and 8. Can
you please do these as well? We'd be very grateful!

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From felix.yang at huawei.com  Mon Jul  6 02:09:51 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Mon, 6 Jul 2020 02:09:51 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>

Hi,

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Saturday, July 4, 2020 3:31 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
> dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
> barrier in fast_storefield and fast_accessfield
> 
> On 03/07/2020 18:03, Andrew Haley wrote:
> > On 03/07/2020 10:07, Yangfei (Felix) wrote:
> >> Is it OK to push?
> >
> > Sure, thanks.

Thanks.  This has been pushed to jdk/jdk: https://hg.openjdk.java.net/jdk/jdk/rev/b9529fcbbd33 

> We will need backports for (deep intake of breath) 15, 11 and 8. Can you
> please do these as well? We'd be very grateful!

Sure.  These are on my radar.

1. For 8, I have prepared a backport webrev: http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/ 
     Jtreg tested with an 8u aarch64 release build.  OK for aarch64-port/jdk8u-shenandoah?

2. For 11, the patch applies cleanly to jdk11u-dev and I have added a jdk11u-fix-request label and a corresponding comment on the issue.

3. For 15, should I simply add 15 to "Affects Version/s" of the issue and push to jdk/jdk15 after the necessary testing?
     Please confirm whether this is the correct procedure.

Felix

From aph at redhat.com  Mon Jul  6 08:44:29 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 6 Jul 2020 09:44:29 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
Message-ID: <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>

On 06/07/2020 03:09, Yangfei (Felix) wrote:
> 1. For 8, I have prepared a backport webrev: http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/ 
>      Jtreg tested with an 8u aarch64 release build.  OK for aarch64-port/jdk8u-shenandoah?

We need another approver. When we have one, please push to jdk8u-dev, not jdk8u-shenandoah.

> 2. For 11, patch applies cleanly to jdk11u-dev and I have added a jdk11u-fix-request label and corresponding comment on the issue.

Please also add a jdk8u-fix-request.

> 3. For 15, should I simply add 15 to " Affects Version/s" of the issue and push to jdk/jdk15 after necessary test?
>      Please confirm if this is correct in procedure.

I'm not sure.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From daniel.daugherty at oracle.com  Mon Jul  6 16:35:29 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 6 Jul 2020 12:35:29 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
Message-ID: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>

Greetings,

It's time to remove the AsyncDeflateIdleMonitors option from JDK16. We can
also get rid of the safepoint based deflation mechanism since turning off
async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way left to
use it.

This is marked as an "S/M" review because the number of touched/deleted
lines makes it a Medium review, but the number of touched/changed lines
(outside of the deletions) makes it a Small review. It's actually a pretty
fast read... :-)

Here's the bug ID:

    JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
                based deflation mechanism
    https://bugs.openjdk.java.net/browse/JDK-8246476

Here's the webrev URL:

    http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/

The webrev is baselined on Thomas S's fix for 8248650 which is jdk-16+4
plus a dozen or so changesets.

This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and there are
no regressions (and very few known failures). My inflation stress testing
is still in progress. I had to restart that testing after a thunderstorm
related power failure took down my servers in Florida. Sigh...

Thanks, in advance, for any comments, questions, or suggestions.

Dan

From mark.reinhold at oracle.com  Mon Jul  6 22:36:16 2020
From: mark.reinhold at oracle.com (mark.reinhold at oracle.com)
Date: Mon,  6 Jul 2020 15:36:16 -0700 (PDT)
Subject: New candidate JEP: 387: Elastic Metaspace
Message-ID: <20200706223616.1370F3BA704@eggemoggin.niobe.net>

https://openjdk.java.net/jeps/387

- Mark

From ioi.lam at oracle.com  Tue Jul  7 00:01:17 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 6 Jul 2020 17:01:17 -0700
Subject: RFR(T) 8248886 InstanceKlass::initialize_impl crashes with
 -XX:-UsePerfData after JDK-8246019
Message-ID: <7d3a687e-d97d-26f6-d55d-355135e81dd5@oracle.com>

Hi, here's a quick fix for a crash that happens with "java 
-XX:-UsePerfData -version"

$ hg diff
diff -r d90de88ba4d0 src/hotspot/share/oops/instanceKlass.cpp
--- a/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 15:14:44 2020 -0700
+++ b/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 17:00:52 2020 -0700
@@ -1169,7 +1169,9 @@
       call_class_initializer(THREAD);
     } else {
       // The elapsed time is so small it's not worth counting.
-      ClassLoader::perf_classes_inited()->inc();
+      if (UsePerfData) {
+        ClassLoader::perf_classes_inited()->inc();
+      }
       call_class_initializer(THREAD);
     }
   }


Thanks
- Ioi
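
For readers following the patch above, here is a minimal standalone sketch of
the guard pattern it applies. It assumes the crash comes from the counter not
being created when perf data is disabled. PerfCounter, Counters and
note_class_initialized below are hypothetical stand-ins written for this
illustration, not HotSpot's actual declarations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a HotSpot perf counter.
struct PerfCounter {
  std::int64_t value = 0;
  void inc() { ++value; }
};

// Hypothetical holder: the counter only exists when perf data is enabled,
// which is the assumed reason an unguarded inc() can crash.
struct Counters {
  bool use_perf_data;
  PerfCounter* classes_inited;  // stays null when use_perf_data is false

  explicit Counters(bool enabled)
      : use_perf_data(enabled),
        classes_inited(enabled ? new PerfCounter() : nullptr) {}
  ~Counters() { delete classes_inited; }
  Counters(const Counters&) = delete;
  Counters& operator=(const Counters&) = delete;
};

// Same shape as the fix: touch the counter only when the flag says it exists.
// Returns true if the counter was incremented.
inline bool note_class_initialized(Counters& c) {
  if (c.use_perf_data) {
    assert(c.classes_inited != nullptr);
    c.classes_inited->inc();
    return true;
  }
  return false;
}
```

With the flag off, note_class_initialized() returns false and never
dereferences the null pointer; with the flag on, the counter goes up by one per
call.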


From calvin.cheung at oracle.com  Tue Jul  7 00:31:23 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Mon, 6 Jul 2020 17:31:23 -0700
Subject: RFR(T) 8248886 InstanceKlass::initialize_impl crashes with
 -XX:-UsePerfData after JDK-8246019
In-Reply-To: <7d3a687e-d97d-26f6-d55d-355135e81dd5@oracle.com>
References: <7d3a687e-d97d-26f6-d55d-355135e81dd5@oracle.com>
Message-ID: <9e0e6be1-dace-01bd-b183-eec85bb53dc6@oracle.com>

Hi Ioi,

The fix looks good and it seems trivial.

thanks,

Calvin

On 7/6/20 5:01 PM, Ioi Lam wrote:
> Hi, here's a quick fix for a crash that happens with "java 
> -XX:-UsePerfData -version"
>
> $ hg diff
> diff -r d90de88ba4d0 src/hotspot/share/oops/instanceKlass.cpp
> --- a/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 15:14:44 2020 -0700
> +++ b/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 17:00:52 2020 -0700
> @@ -1169,7 +1169,9 @@
>        call_class_initializer(THREAD);
>      } else {
>        // The elapsed time is so small it's not worth counting.
> -      ClassLoader::perf_classes_inited()->inc();
> +      if (UsePerfData) {
> +        ClassLoader::perf_classes_inited()->inc();
> +      }
>        call_class_initializer(THREAD);
>      }
>    }
>
>
> Thanks
> - Ioi
>
>

From ioi.lam at oracle.com  Tue Jul  7 00:53:43 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 6 Jul 2020 17:53:43 -0700
Subject: RFR(T) 8248886 InstanceKlass::initialize_impl crashes with
 -XX:-UsePerfData after JDK-8246019
In-Reply-To: <9e0e6be1-dace-01bd-b183-eec85bb53dc6@oracle.com>
References: <7d3a687e-d97d-26f6-d55d-355135e81dd5@oracle.com>
 <9e0e6be1-dace-01bd-b183-eec85bb53dc6@oracle.com>
Message-ID: <f69d3d38-ddb0-3f01-5f09-f8ffbc7a474d@oracle.com>

Thanks Calvin! Pushed.

- Ioi

On 7/6/20 5:31 PM, Calvin Cheung wrote:
> Hi Ioi,
>
> The fix looks good and it seems trivial.
>
> thanks,
>
> Calvin
>
> On 7/6/20 5:01 PM, Ioi Lam wrote:
>> Hi, here's a quick fix for a crash that happens with "java 
>> -XX:-UsePerfData -version"
>>
>> $ hg diff
>> diff -r d90de88ba4d0 src/hotspot/share/oops/instanceKlass.cpp
>> --- a/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 15:14:44 2020 -0700
>> +++ b/src/hotspot/share/oops/instanceKlass.cpp    Mon Jul 06 17:00:52 2020 -0700
>> @@ -1169,7 +1169,9 @@
>>        call_class_initializer(THREAD);
>>      } else {
>>        // The elapsed time is so small it's not worth counting.
>> -      ClassLoader::perf_classes_inited()->inc();
>> +      if (UsePerfData) {
>> +        ClassLoader::perf_classes_inited()->inc();
>> +      }
>>        call_class_initializer(THREAD);
>>      }
>>    }
>>
>>
>> Thanks
>> - Ioi
>>
>>


From felix.yang at huawei.com  Tue Jul  7 01:20:34 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Tue, 7 Jul 2020 01:20:34 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>

Hi,

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Monday, July 6, 2020 4:44 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
> dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
> barrier in fast_storefield and fast_accessfield
> 
> On 06/07/2020 03:09, Yangfei (Felix) wrote:
> > 1. For 8, I have prepared a backport webrev:
> http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/
> >      Jtreg tested with an 8u aarch64 release build.  OK for aarch64-
> port/jdk8u-shenandoah?
> 
> We need another approver. When we have one, please push to jdk8u-dev,
> not jdk8u-shenandoah.

Hmm... I haven't seen an aarch64 port in jdk8u-dev yet.

> > 2. For 11, patch applies cleanly to jdk11u-dev and I have added a jdk11u-fix-
> request label and corresponding comment on the issue.
> 
> Please also add a jdk8u-fix-request.

I saw that the jdk11u-fix-yes label was added, so I have pushed to jdk11u-dev.
As I remember, the jdk8u-fix-request label can be added once the aarch64 port is merged to jdk8u master.

> > 3. For 15, should I simply add 15 to " Affects Version/s" of the issue and
> push to jdk/jdk15 after necessary test?
> >      Please confirm if this is correct in procedure.
> 
> I'm not sure.

CCing to jdk-dev. 
So the question is: Is it OK to push this fix to jdk/jdk15 for now given that JDK15 is currently in Rampdown Phase One? 
Could someone help elaborate on the correct procedure please?  

From david.holmes at oracle.com  Tue Jul  7 01:30:41 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 7 Jul 2020 11:30:41 +1000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
Message-ID: <240133a3-17c9-ef16-5359-812d153ef0d7@oracle.com>

Hi Felix,

On 7/07/2020 11:20 am, Yangfei (Felix) wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Andrew Haley [mailto:aph at redhat.com]
>> Sent: Monday, July 6, 2020 4:44 PM
>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
>> dev at openjdk.java.net
>> Cc: aarch64-port-dev at openjdk.java.net
>> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
>> barrier in fast_storefield and fast_accessfield
>>
>> On 06/07/2020 03:09, Yangfei (Felix) wrote:
>>> 1. For 8, I have prepared a backport webrev:
>> http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/
>>>       Jtreg tested with an 8u aarch64 release build.  OK for aarch64-
>> port/jdk8u-shenandoah?
>>
>> We need another approver. When we have one, please push to jdk8u-dev,
>> not jdk8u-shenandoah.
> 
> Hmm... I haven't seen an aarch64 port in jdk8u-dev yet.
> 
>>> 2. For 11, patch applies cleanly to jdk11u-dev and I have added a jdk11u-fix-
>> request label and corresponding comment on the issue.
>>
>> Please also add a jdk8u-fix-request.
> 
> I saw the jdk11u-fix-yes label was added and I have pushed to jdk11u-dev.
> As I remembered, jdk8u-fix-request label could be added when the aarch64 port is merged to jdk8u master.
> 
>>> 3. For 15, should I simply add 15 to " Affects Version/s" of the issue and
>> push to jdk/jdk15 after necessary test?
>>>       Please confirm if this is correct in procedure.
>>
>> I'm not sure.
> 
> CCing to jdk-dev.
> So the question is: Is it OK to push this fix to jdk/jdk15 for now given that JDK15 is currently in Rampdown Phase One?
> Could someone help elaborate on the correct procedure please?

This is a P3 bug and so can be fixed in RDP1 [1]. It could have been 
pushed directly to the jdk15 repo and would then have been automatically 
forward ported to the jdk repo for JDK 16. But it is fine to hg export 
from the main jdk repo and hg import into jdk15.

[1] http://openjdk.java.net/jeps/3#rdp-1

Thanks,
David
-----

From sakatakui at oss.nttdata.com  Tue Jul  7 06:46:58 2020
From: sakatakui at oss.nttdata.com (Koichi Sakata)
Date: Tue, 7 Jul 2020 15:46:58 +0900
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
Message-ID: <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>

Hi all,

>>>  > https://bugs.openjdk.java.net/browse/JDK-8247818

Could we start addressing the issue (about GCC 10 build warnings)?

One point, quoted below, is still unclear. In my opinion, the change doesn't
cause any problems whether the loop was historical or related to an alignment
issue, mainly because byte_at_put is a private function.

>>>>> +  } else {
>>>>> +    memcpy(_body, name, length);
>>>>>     }
>>>>>   }
>>>>
>>>> So you are replacing byte_at_put with a memcpy call. On the surface
>>>> that seems reasonable, but I have to wonder why we were using the
>>>> loop in the first place. It may just be historical or it may relate
>>>> to an alignment issue, or something else. Hopefully someone else
>>>> (e.g. Coleen :) ) can shed more light here.

Thanks,
Koichi


On 2020/06/19 15:11, Koichi Sakata wrote:
>  > Probably best to re-send as the mention of "Hotspot" in subject might
>  > put off core-libs folk from looking at it. :)
> 
> I will do so. Thank you.
> 
> Thanks,
> Koichi
> 
> On 2020/06/19 14:56, David Holmes wrote:
>> On 19/06/2020 11:59 am, Koichi Sakata wrote:
>>> Hi David,
>>>
>>>  > This is in relation to the hotspot part as these issues need to be
>>>  > handled separately. I have filed:
>>>  >
>>>  > https://bugs.openjdk.java.net/browse/JDK-8247818
>>>
>>> Thank you, David. I have something to ask you.
>>> Should I send only the other part of the patch (i.e. 
>>> NetworkInterface.c and k_standard.c) to core-lib ML again? I've sent 
>>> the whole one to core-lib before.
>>
>> Probably best to re-send as the mention of "Hotspot" in subject might 
>> put off core-libs folk from looking at it. :)
>>
>> Cheers,
>> David
>>
>>>  > I'm not really clear on the warning here but this is an area where we
>>>  > trick the compiler somewhat. The _body[] is declared with a size of 2,
>>>  > but when we allocate Symbols we allocate sufficient memory for _body to
>>>  > contain the entire symbol.
>>>  >
>>>  > That said I'm struggling to see how we allocate the additional space
>>>  > needed for the _hash_and_refcount and _length fields ???
>>>
>>> I was thinking exactly the same thing. I've learned a lot from Ioi's 
>>> explanation.
>>>
>>>  > The check for length==0 introduces more overhead than just always
>>>  > setting _body[0]=0, so there is no need to add it.
>>>
>>> I understood that clearly. Thank you for teaching me.
>>>
>>> Thanks,
>>> Koichi
>>>
>>> On 2020/06/18 10:56, David Holmes wrote:
>>>> Hi Koichi,
>>>>
>>>> This is in relation to the hotspot part as these issues need to be 
>>>> handled separately. I have filed:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8247818
>>>>
>>>> On 18/06/2020 8:46 am, Koichi Sakata wrote:
>>>>> Hi all,
>>>>>
>>>>> I tried to build OpenJDK fastdebug with GCC 10.1 on Ubuntu 18.04, 
>>>>> but I saw some compiler warnings as follows:
>>>>>
>>>>> In file included from /home/jyukutyo/code/jdk/src/hotspot/share/classfile/systemDictionary.hpp:31,
>>>>>                  from /home/jyukutyo/code/jdk/src/hotspot/share/classfile/javaClasses.hpp:28,
>>>>>                  from /home/jyukutyo/code/jdk/src/hotspot/share/precompiled/precompiled.hpp:35:
>>>>> In member function 'void Symbol::byte_at_put(int, u1)',
>>>>>     inlined from 'Symbol::Symbol(const u1*, int, int)' at /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.cpp:55:16:
>>>>> /home/jyukutyo/code/jdk/src/hotspot/share/oops/symbol.hpp:130:18: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
>>>>>   130 |     _body[index] = value;
>>>>>       |     ~~~~~~~~~~~~~^~~~~~~
>>>>
>>>> I'm not really clear on the warning here but this is an area where 
>>>> we trick the compiler somewhat. The _body[] is declared with a size 
>>>> of 2, but when we allocate Symbols we allocate sufficient memory for 
>>>> _body to contain the entire symbol.
>>>>
>>>> That said I'm struggling to see how we allocate the additional space 
>>>> needed for the _hash_and_refcount and _length fields ???
>>>>
>>>>> I can resolve them with the following patch. I believe it fixes 
>>>>> those potential bugs, so I'd like to contribute it.
>>>>> (Our company has signed OCA.)
>>>>>
>>>>> Thanks,
>>>>> Koichi
>>>>>
>>>>> ===== PATCH =====
>>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.cpp
>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jun 16 03:16:41 2020 +0000
>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Thu Jun 18 07:08:50 2020 +0900
>>>>> @@ -50,9 +50,10 @@
>>>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>>>    _length = length;
>>>>> -  _body[0] = 0;  // in case length == 0
>>>>> -  for (int i = 0; i < length; i++) {
>>>>> -    byte_at_put(i, name[i]);
>>>>> +  if (length == 0) {
>>>>> +    _body[0] = 0;
>>>>
>>>> The check for length==0 introduces more overhead than just always 
>>>> setting _body[0]=0, so there is no need to add it.
>>>>
>>>>> +  } else {
>>>>> +    memcpy(_body, name, length);
>>>>>    }
>>>>>  }
>>>>
>>>> So you are replacing byte_at_put with a memcpy call. On the surface 
>>>> that seems reasonable, but I have to wonder why we were using the 
>>>> loop in the first place. It may just be historical or it may relate 
>>>> to an alignment issue, or something else. Hopefully someone else 
>>>> (e.g. Coleen :) ) can shed more light here.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> diff -r 20d92fe3ac52 src/hotspot/share/oops/symbol.hpp
>>>>> --- a/src/hotspot/share/oops/symbol.hpp    Tue Jun 16 03:16:41 2020 +0000
>>>>> +++ b/src/hotspot/share/oops/symbol.hpp    Thu Jun 18 07:08:50 2020 +0900
>>>>> @@ -125,11 +125,6 @@
>>>>>      return (int)heap_word_size(byte_size(length));
>>>>>    }
>>>>>
>>>>> -  void byte_at_put(int index, u1 value) {
>>>>> -    assert(index >=0 && index < length(), "symbol index overflow");
>>>>> -    _body[index] = value;
>>>>> -  }
>>>>> -
>>>>>    Symbol(const u1* name, int length, int refcount);
>>>>>    void* operator new(size_t size, int len) throw();
>>>>>    void* operator new(size_t size, int len, Arena* arena) throw();
>>>>
>>>>
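
The layout trick discussed in this thread (a small declared _body[] with the
allocation sized to hold the whole name) can be sketched outside HotSpot as
follows. MiniSymbol, its size computation and its helper names are assumptions
made up for this illustration and are not the real Symbol class. The copy is a
single memcpy after unconditionally setting _body[0], following the review
comment that the length==0 check is not needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical, simplified stand-in for a Symbol-like object.
class MiniSymbol {
  unsigned int   _hash_and_refcount;
  unsigned short _length;
  unsigned char  _body[2];  // declared small; the allocation extends past it

  // Address the body through the object's base so accesses beyond the
  // declared two bytes do not go through the array type.
  unsigned char* body() {
    return reinterpret_cast<unsigned char*>(this) + offsetof(MiniSymbol, _body);
  }
  const unsigned char* body() const {
    return reinterpret_cast<const unsigned char*>(this) + offsetof(MiniSymbol, _body);
  }

  MiniSymbol(const char* name, std::size_t length)
      : _hash_and_refcount(0), _length(static_cast<unsigned short>(length)) {
    _body[0] = 0;                          // in case length == 0
    std::memcpy(body(), name, length);     // replaces the per-byte loop
  }

 public:
  // Allocate enough raw storage for the header plus the full name.
  static MiniSymbol* create(const char* name) {
    const std::size_t len = std::strlen(name);
    assert(len <= 0xFFFF);                 // must fit in _length
    const std::size_t declared = sizeof(MiniSymbol::_body);
    const std::size_t extra = len > declared ? len - declared : 0;
    void* mem = std::malloc(sizeof(MiniSymbol) + extra);
    if (mem == nullptr) return nullptr;
    return new (mem) MiniSymbol(name, len);
  }

  static void destroy(MiniSymbol* s) {
    if (s != nullptr) {
      s->~MiniSymbol();
      std::free(s);
    }
  }

  int length() const { return _length; }
  unsigned int hash_and_refcount() const { return _hash_and_refcount; }

  bool equals(const char* s) const {
    const std::size_t n = std::strlen(s);
    return n == _length && std::memcmp(body(), s, n) == 0;
  }
};
```

A caller would pair MiniSymbol::create("java/lang/Object") with
MiniSymbol::destroy(). Whether the real constructor has alignment or historical
reasons for the byte loop is the open question in the thread, and this sketch
does not answer it.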

From sgehwolf at redhat.com  Tue Jul  7 08:45:47 2020
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 07 Jul 2020 10:45:47 +0200
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
Message-ID: <28f344dfa7e834662d68db89d984a07e45b38e60.camel@redhat.com>

On Mon, 2020-07-06 at 09:44 +0100, Andrew Haley wrote:
> On 06/07/2020 03:09, Yangfei (Felix) wrote:
> 
> > 1. For 8, I have prepared a backport webrev: http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/ 
> >       Jtreg tested with an 8u aarch64 release build.  OK for aarch64-port/jdk8u-shenandoah?
> 
> 
> We need another approver. When we have one, please push to jdk8u-dev, not jdk8u-shenandoah.

When you say jdk8u-dev do you mean this tree?
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/

If so, that one doesn't have aarch64. I'd think aarch64-port/jdk8u-
shenandoah would be the right place, no?

Thanks,
Severin


From aph at redhat.com  Tue Jul  7 08:50:09 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Jul 2020 09:50:09 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
Message-ID: <d16feffb-31d9-6a9c-0ae3-8cb0ee2ab2e7@redhat.com>

On 07/07/2020 02:20, Yangfei (Felix) wrote:
>>> 1. For 8, I have prepared a backport webrev:
>> http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/
>>>      Jtreg tested with an 8u aarch64 release build.  OK for aarch64-
>> port/jdk8u-shenandoah?
>>
>> We need another approver. When we have one, please push to jdk8u-dev,
>> not jdk8u-shenandoah.
> 
> Hmm... I haven't seen an aarch64 port in jdk8u-dev yet.

Sorry, my brain crashed. You are right.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Tue Jul  7 08:50:42 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Jul 2020 09:50:42 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <28f344dfa7e834662d68db89d984a07e45b38e60.camel@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <28f344dfa7e834662d68db89d984a07e45b38e60.camel@redhat.com>
Message-ID: <b6e9b967-ea68-53a2-a4f2-e94af28e9176@redhat.com>

On 07/07/2020 09:45, Severin Gehwolf wrote:
> When you say jdk8u-dev do you mean this tree?
> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/
> 
> If so, that one doesn't have aarch64. I'd think aarch64-port/jdk8u-
> shenandoah would be the right place, no?

Yes, yes.  :-)

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From felix.yang at huawei.com  Tue Jul  7 08:59:48 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Tue, 7 Jul 2020 08:59:48 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <d16feffb-31d9-6a9c-0ae3-8cb0ee2ab2e7@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
 <d16feffb-31d9-6a9c-0ae3-8cb0ee2ab2e7@redhat.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E6404D@dggeml527-mbx.china.huawei.com>

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Tuesday, July 7, 2020 4:50 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
> dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net; jdk-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
> barrier in fast_storefield and fast_accessfield
> 
> On 07/07/2020 02:20, Yangfei (Felix) wrote:
> >>> 1. For 8, I have prepared a backport webrev:
> >> http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/
> >>>      Jtreg tested with an 8u aarch64 release build.  OK for aarch64-
> >> port/jdk8u-shenandoah?
> >>
> >> We need another approver. When we have one, please push to jdk8u-
> dev,
> >> not jdk8u-shenandoah.
> >
> > Hmm... I haven't seen an aarch64 port in jdk8u-dev yet.
> 
> Sorry, my brain crashed. You are right.

Thanks for confirming that.  Will push this to aarch64-port/jdk8u-shenandoah.

Felix

From zgu at redhat.com  Tue Jul  7 11:53:21 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 7 Jul 2020 07:53:21 -0400
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region() does not properly update
 summary counting
In-Reply-To: <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
Message-ID: <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>

Hi Thomas,

On 7/4/20 1:35 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
> 
> sorry for the wait.

No problem.

> 
> 
> This looks good, but why did you remove the "!same region" condition? I 
> believe that is needed for the case when CDS' first mapping encounters 
> errors, so before it rebuilds CDS at another location it removes (a) all 
> mappings and then (b) the enclosing reservation. (a) should be ignored 
> by NMT but (b) should not. I may be wrong, I have had no coffee yet.
> Cheers, Thomas

In the new version, the same-region case is handled at line #470:

  470   if (reserved_rgn->same_region(addr, size)) {
  471     return remove_released_region(reserved_rgn);
  472   }

so past that point, !same region is always true.

Thanks,

-Zhengyu
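
As a rough illustration of the two-method split described above, the sketch
below keeps one overload that takes an already-located region (so the lookup is
paid only once) and one that looks the region up by address, with both updating
the summary count. Region, MiniTracker and all member names are hypothetical
simplifications for this example, not the actual VirtualMemoryTracker code, and
releases from the middle of a region are deliberately not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical, simplified reserved region.
struct Region {
  std::uintptr_t base;
  std::size_t    size;
  bool same_region(std::uintptr_t addr, std::size_t sz) const {
    return base == addr && size == sz;
  }
  bool contains(std::uintptr_t addr, std::size_t sz) const {
    return addr >= base && addr + sz <= base + size;
  }
};

class MiniTracker {
  typedef std::list<Region>::iterator RegionIter;
  std::list<Region> _regions;
  std::size_t       _reserved_bytes = 0;  // stands in for the summary counter

  RegionIter find(std::uintptr_t addr, std::size_t sz) {
    for (RegionIter it = _regions.begin(); it != _regions.end(); ++it) {
      if (it->contains(addr, sz)) return it;
    }
    return _regions.end();
  }

  // Overload for callers that already hold the region: no second lookup,
  // and the summary is updated in the same place as the removal.
  bool remove_released_region(RegionIter rgn) {
    _reserved_bytes -= rgn->size;
    _regions.erase(rgn);
    return true;
  }

 public:
  void add_reserved_region(std::uintptr_t addr, std::size_t size) {
    _regions.push_back(Region{addr, size});
    _reserved_bytes += size;
  }

  std::size_t reserved_bytes() const { return _reserved_bytes; }
  std::size_t region_count() const { return _regions.size(); }

  // Lookup-by-address entry point.
  bool remove_released_region(std::uintptr_t addr, std::size_t size) {
    RegionIter rgn = find(addr, size);
    if (rgn == _regions.end()) return false;
    if (rgn->same_region(addr, size)) {
      return remove_released_region(rgn);  // whole region released
    }
    // From here on the released range is a strict part of the region.
    const bool at_front = (addr == rgn->base);
    const bool at_back  = (addr + size == rgn->base + rgn->size);
    if (!at_front && !at_back) return false;  // middle split not modeled
    if (at_front) rgn->base += size;
    rgn->size -= size;
    _reserved_bytes -= size;
    return true;
  }
};
```

Because the whole-region case returns early, the code after it can assume the
range is not the same region, which is the point made about line #470.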

> 
>     Thanks,
> 
>     -Zhengyu
> 
>      >
>      > But if you are still unconvinced, I won't hold you up. The change
>      > certainly works as it is now and is okay for me, it just would
>     not be my
>      > preferred solution.
>      >
>      > Cheers, Thomas
>      >
>      >     Thanks,
>      >
>      >     -Zhengyu
>      >
>      >
>      >      >
>      >      >     Thanks, Thomas
>      >      >
>      >      >
>      >      >
>      >      >
>      >      >
>      >      >     On Fri, Jun 26, 2020 at 10:54 PM Zhengyu Gu
>     <zgu at redhat.com <mailto:zgu at redhat.com>
>      >     <mailto:zgu at redhat.com <mailto:zgu at redhat.com>>
>      >      >     <mailto:zgu at redhat.com <mailto:zgu at redhat.com>
>     <mailto:zgu at redhat.com <mailto:zgu at redhat.com>>>> wrote:
>      >      >
>      >      >         Hi,
>      >      >
>      >      >         Please review this trivial patch that fixes summary
>      >     counting in
>      >      >         VirtualMemoryTracker::split_reserved_region().
>      >      >
>      >      >         The method uses internal method to remove a
>     reserved region,
>      >      >         which does
>      >      >         not update counting information. It should use
>     high level
>      >     tracking
>      >      >         method VirtualMemoryTracker::remove_released_region()
>      >     instead.
>      >      >
>      >      >         Without patch, NMT summary reports uncategorized
>     memory, e.g.
>      >      >
>      >      >         -                   Unknown (reserved=1060416KB,
>      >     committed=0KB)
>      >      >                                       (mmap:
>     reserved=1060416KB,
>      >      >         committed=0KB)
>      >      >
>      >      >
>      >      >
>      >      >
>      >      >         Bug: https://bugs.openjdk.java.net/browse/JDK-8248426
>      >      >         Webrev:
>      > http://cr.openjdk.java.net/~zgu/JDK-8248426/webrev.00/
>      >      >
>      >      >         Test:
>      >      >             hotspot_nmt
>      >      >             Submit test in progress
>      >      >
>      >      >
>      >      >         Thanks,
>      >      >
>      >      >         -Zhengyu
>      >? ? ? >
>      >
> 


From felix.yang at huawei.com  Tue Jul  7 12:47:18 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Tue, 7 Jul 2020 12:47:18 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <240133a3-17c9-ef16-5359-812d153ef0d7@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
 <240133a3-17c9-ef16-5359-812d153ef0d7@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E646A0@dggeml527-mbx.china.huawei.com>

Hi David,

> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Tuesday, July 7, 2020 9:31 AM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Andrew Haley
> <aph at redhat.com>; hotspot-runtime-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net; jdk-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
> barrier in fast_storefield and fast_accessfield
> 

Cut...

> >
> > CCing to jdk-dev.
> > So the question is: Is it OK to push this fix to jdk/jdk15 for now given that
> JDK15 is currently in Rampdown Phase One?
> > Could someone help elaborate on the correct procedure please?
> 
> This is a P3 bug and so can be fixed in RDP1 [1]. It could have been pushed
> directly to the jdk15 repo and would then have been automatically forward
> ported to the jdk repo for JDK 16. But it is fine to hg export from the main jdk
> repo and hg import into jdk15.
> 
> [1] http://openjdk.java.net/jeps/3#rdp-1

Thanks for the quick reply.  It's clear now.
Patch from the main jdk repo applies cleanly to jdk15.
I have performed Tier1-3 testing with an aarch64 jdk15 release build.  Will push.

Felix

From thomas.stuefe at gmail.com  Tue Jul  7 13:34:18 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 7 Jul 2020 15:34:18 +0200
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region()
 does not properly update summary counting
In-Reply-To: <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
 <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
Message-ID: <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>

Hi Zhengyu,

okay, I get it now. Thank you. Reviewed from my side.

Cheers, Thomas

On Tue, Jul 7, 2020 at 1:53 PM Zhengyu Gu <zgu at redhat.com> wrote:

> Hi Thomas,
>
> On 7/4/20 1:35 AM, Thomas Stüfe wrote:
> > Hi Zhengyu,
> >
> > sorry for the wait.
>
> No problem.
>
> >
> >
> > This looks good, but why did you remove the "!same region" condition? I
> > believe that is needed for the case when CDS' first mapping encounters
> > errors, so before it rebuilds CDS at another location it removes (a) all
> > mappings and then (b) the enclosing reservation. (a) should be ignored
> > by NMT but (b) should not. I may be wrong, I have had no coffee yet.
> > Cheers, Thomas
>
> In new version, same region is handled in line #470
>
>   470   if (reserved_rgn->same_region(addr, size)) {
>   471     return remove_released_region(reserved_rgn);
>   472   }
>
> so, !same region is always true.
>
> Thanks,
>
> -Zhengyu
>
>
>

From kim.barrett at oracle.com  Tue Jul  7 14:36:33 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 7 Jul 2020 10:36:33 -0400
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
Message-ID: <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>

> On Jul 7, 2020, at 2:46 AM, Koichi Sakata <sakatakui at oss.nttdata.com> wrote:
> 
> Hi all,
> 
>>>> > https://bugs.openjdk.java.net/browse/JDK-8247818
> 
> Could we start addressing the issue (about GCC 10 build warnings)?
> 
> There is one unclear point, quoted below. In my opinion, it doesn't cause any problems whether the loop is historical or relates to an alignment issue, mainly because byte_at_put is a private function.
> 
>>>>>> +  } else {
>>>>>> +    memcpy(_body, name, length);
>>>>>>    }
>>>>>>  }
>>>>> 
>>>>> So you are replacing byte_at_put with a memcpy call. On the surface
>>>>> that seems reasonable, but I have to wonder why we were using the
>>>>> loop in the first place. It may just be historical or it may relate
>>>>> to an alignment issue, or something else. Hopefully someone else
>>>>> (e.g. Coleen :) ) can shed more light here.

I've only looked at the HotSpot change (to symbol.[ch]pp), so only
JDK-8247818.  Presumably there should be another bug for the core-libs
issues. 

Using memcpy instead of byte_at_put (and getting rid of byte_at_put)
seems like a good idea to me.

However, the first two elements of _body are used by identity_hash().
That seems like a possible reason to force initialization of both
elements, which currently isn't done for length == 1.  But maybe it
doesn't matter that identity_hash isn't consistent between processes,
in which case forcing the initialization of _body[0] also shouldn't
be needed.
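
As a rough illustration of the concern above (a sketch only: SymbolSketch,
its fixed 8-byte body and first_two_bytes() are hypothetical and do not
mirror the real Symbol layout or identity_hash()), zeroing the two bytes
that a hash reads before the memcpy keeps the result deterministic even
when length == 1:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical miniature of a symbol body; the real Symbol differs.
struct SymbolSketch {
  std::size_t   length;
  unsigned char body[8];   // assumed fixed capacity, for illustration only

  SymbolSketch(const char* name, std::size_t len)
      : length(len > sizeof(body) ? sizeof(body) : len) {
    // Force-initialize the two bytes the hash reads, so a 0- or 1-byte
    // name never leaves body[1] (or body[0]) uninitialized.
    body[0] = 0;
    body[1] = 0;
    std::memcpy(body, name, length);
  }

  // Stand-in for the part of a hash that mixes in the first two body bytes.
  unsigned first_two_bytes() const {
    return (static_cast<unsigned>(body[0]) << 8) | body[1];
  }
};
```

Whether the extra stores are worth it depends on whether the hash has to be
consistent between processes, which is exactly the open question above.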


From zgu at redhat.com  Tue Jul  7 15:12:23 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 7 Jul 2020 11:12:23 -0400
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region() does not properly update
 summary counting
In-Reply-To: <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
 <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
 <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
Message-ID: <d08d855e-b435-a5a5-bd78-90ff2efc2ea5@redhat.com>

Thanks, Thomas.

-Zhengyu

On 7/7/20 9:34 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
> 
> okay, I get it now. Thank you. Reviewed from my side.
> 
> Cheers, Thomas
> 
> On Tue, Jul 7, 2020 at 1:53 PM Zhengyu Gu <zgu at redhat.com 
> <mailto:zgu at redhat.com>> wrote:
> 
>     Hi Thomas,
> 
>     On 7/4/20 1:35 AM, Thomas Stüfe wrote:
>      > Hi Zhengyu,
>      >
>      > sorry for the wait.
> 
>     No problem.
> 
>      >
>      >
>      > This looks good, but why did you remove the "!same region"
>     condition? I
>      > believe that is needed for the case when CDS' first mapping
>     encounters
>      > errors, so before it rebuilds CDS at another location it removes
>     (a) all
>      > mappings and then (b) the enclosing reservation. (a) should be
>     ignored
>      > by NMT but (b) should not. I may be wrong, I have had no coffee yet.
>      > Cheers, Thomas
> 
>     In new version, same region is handled in line #470
> 
>        470   if (reserved_rgn->same_region(addr, size)) {
>        471     return remove_released_region(reserved_rgn);
>        472   }
> 
>     so, !same region is always true.
> 
>     Thanks,
> 
>     -Zhengyu
> 
> 


From john.r.rose at oracle.com  Tue Jul  7 16:45:05 2020
From: john.r.rose at oracle.com (John Rose)
Date: Tue, 7 Jul 2020 09:45:05 -0700
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
Message-ID: <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>

On Jul 7, 2020, at 7:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
> 
> Using memcpy instead of byte_at_put (and getting rid of byte_at_put)
> seems like a good idea to me.

We have had problems with memcpy in the long past, and I'm
personally still nervous when I see a call to it.  The problem is
subtle, and escaped everybody's notice the first time around,
and I'd prefer not to make more bugs of the same kind.

What problem?  Well, memcpy does not guarantee integrity
of the copy from the point of view of racing threads.  Most
specifically, it is allowed to copy machine words (in memory)
byte-by-byte.  This leads to race conditions (word tearing)
if the destination of the memcpy is being read by another
thread while it is being written by the memcpy thread.
If the data being written is in the Java heap, the Java
memory model will be violated.  If the data is somewhere
else, perhaps some assumption about lock-free programming
could be violated.  And (here is the important point) the
choice of memcpy to copy using bytes instead of words
is private to memcpy and can change (essentially at any
moment).  Clearly a well-groomed memcpy will use word-wise
copies when it can, and so *in practice* memcpy will never
cause word tearing.  But *in rare cases* we have seen bugs
where some edge case is touched and a word gets torn.
These bugs *being rare* are exceedingly difficult to detect.
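
For readers unfamiliar with the issue, the contrast can be sketched as
follows. This is a minimal illustration, not HotSpot's Copy class:
copy_words_atomic is a hypothetical helper, and the use of std::atomic is an
assumption made for the example. Each word is written with one indivisible
store, so a racing reader sees either the old or the new word, never a mix
of bytes from both, which is a guarantee memcpy does not make:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copy `count` machine words, one atomic store per word.
// Relaxed ordering is enough to rule out tearing of an individual word; it
// says nothing about the order in which different words become visible.
inline void copy_words_atomic(const std::uintptr_t* from,
                              std::atomic<std::uintptr_t>* to,
                              std::size_t count) {
  for (std::size_t i = 0; i < count; i++) {
    to[i].store(from[i], std::memory_order_relaxed);
  }
}
```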

All this happened long ago, and was a painful learning
experience for the whole team.  One of our responses was
to pivot away from using memcpy and use our own loops
instead, even though we knew they were less performant
in some cases.  Another response was to codify our best
practices in the Copy class, and to more clearly document
which copy operations required exactly which behaviors.

Nowadays memcpy participates fully in C++ optimizations,
so there are new reasons to reach for it.  But the old reasons
to be cautious of it are still present.  (AFAIK all platforms
preserve memcpy's "right" to tear words.  But even if some
platform made it better behaved, HotSpot's cross-platform
code must make worst-case assumptions.)  So I want to
add this cautionary note about memcpy; I don't want us
to use memcpy as a cure-all for copy problems, and
forget that it's not always the right tool for the job.

I have a specific ask:  Most parts of the HotSpot code base
are under our direct and detailed control, and many parts
of those have heightened concurrency requirements.
(By heightened, I mean that the usual C++ rules about
undetermined behavior in the case of races or word
tearing, etc., are unacceptable to us.  We are building
a runtime for a language with different rules than C++.)
So my ask is that we don't just say "memcpy" in (most
of) our source code; we should say something like
Copy::private_bytes, not just memcpy.  (There's a
bikeshed color here, which I don't care about.)
The Copy:: routine can be just an alias for memcpy.
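
A sketch of what such an alias might look like, assuming a free-standing
namespace (CopySketch is hypothetical, and the from/to/count argument order
is an assumption borrowed from the existing Copy routines; no function named
private_bytes exists in HotSpot at the time of writing):

```cpp
#include <cstddef>
#include <cstring>

namespace CopySketch {
  // Hypothetical documented alias for memcpy.
  //
  // Use only when BOTH ranges are private to the calling thread (for
  // example during object construction, or at a safepoint) and do not
  // overlap.  memcpy may copy byte-by-byte, so a concurrent reader of the
  // destination could observe torn words.
  inline void private_bytes(const void* from, void* to, std::size_t count) {
    std::memcpy(to, from, count);
  }
}
```

The body is deliberately trivial; the value is in the name and in the
comment a maintainer reads when looking it up.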

So what's the advantage of having a private reserve
Special Name for good old memcpy?  Isn't this just
obscuring what's going on in the code?  No, it's not.
Bitter experience tells us that memcpy has some
hidden sharp edges, and our best move for protecting
ourselves is to wrap it in a custom-made handle.

By asking maintainers to refer to a new file, copy.hpp,
we are *also* giving them the chance to read some
informative *documentation* on the memcpy-alias,
that explains, for *our* source base, what are the
best practices for using this tricky primitive.  Yes,
memcpy is documented adequately, and surely
competent maintainers know that documentation,
but (as you can see) we need HotSpot-specific
documentation for tricky primitives like memcpy,
and the best way to ensure that such documentation
is appropriately referenced is to make a local alias,
a better handle, for the tricky primitive.

All that said, I have no problem with memcpy being
used (though I prefer a locally documented alias!) in
the places where it is appearing in our source base.
Those places are, yes, private to some construction
process, or singly-threaded for some other reason,
perhaps a safepoint.  But I think you understand now
that I am nervous when I see more and more uses
of memcpy in our code, because I think that now
it's just a matter of time before someone concludes
that memcpy is the new (old) best way to copy data,
and the hard work of codifying our practices in
Copy will start to be neglected.  A new programmer
in our code base could make the mistake of just
reaching for memcpy instead of doing the work of
deciding which kind of Copy to use.  And that will
eventually cause a difficult-to-detect bug.  Let's not.

-- John


From gnu.andrew at redhat.com  Tue Jul  7 17:00:42 2020
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Tue, 7 Jul 2020 18:00:42 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E6404D@dggeml527-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
 <d16feffb-31d9-6a9c-0ae3-8cb0ee2ab2e7@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E6404D@dggeml527-mbx.china.huawei.com>
Message-ID: <346ecc4f-bf55-6278-30b0-fd64a1586980@redhat.com>

On 07/07/2020 09:59, Yangfei (Felix) wrote:
>> -----Original Message-----
>> From: Andrew Haley [mailto:aph at redhat.com]
>> Sent: Tuesday, July 7, 2020 4:50 PM
>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
>> dev at openjdk.java.net
>> Cc: aarch64-port-dev at openjdk.java.net; jdk-dev at openjdk.java.net
>> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
>> barrier in fast_storefield and fast_accessfield
>>
>> On 07/07/2020 02:20, Yangfei (Felix) wrote:
>>>>> 1. For 8, I have prepared a backport webrev:
>>>> http://cr.openjdk.java.net/~fyang/8248219-8u-backport/webrev.00/
>>>>>      Jtreg tested with an 8u aarch64 release build.  OK for aarch64-
>>>> port/jdk8u-shenandoah?
>>>>
>>>> We need another approver. When we have one, please push to jdk8u-
>> dev,
>>>> not jdk8u-shenandoah.
>>>
>>> Hmm... I haven't seen an aarch64 port in jdk8u-dev yet.
>>
>> Sorry, my brain crashed. You are right.
> 
> Thanks for confirming that.  Will push this to aarch64-port/jdk8u-shenandoah.
> 
> Felix
> 

It would be preferable to avoid pushing to aarch64-port/jdk8u-shenandoah
at the moment, as it is being prepared for the release next week.

This change will be part of aarch64-shenandoah-jdk8u272-b01.
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From zgu at redhat.com  Tue Jul  7 17:13:14 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 7 Jul 2020 13:13:14 -0400
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region() does not properly update
 summary counting
In-Reply-To: <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
 <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
 <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
Message-ID: <91358426-2d18-e57e-cdb8-25972c436b28@redhat.com>

The change is no longer trivial; may I get a second review?

Thanks,

-Zhengyu

On 7/7/20 9:34 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
> 
> okay, I get it now. Thank you. Reviewed from my side.
> 
> Cheers, Thomas
> 
> On Tue, Jul 7, 2020 at 1:53 PM Zhengyu Gu <zgu at redhat.com 
> <mailto:zgu at redhat.com>> wrote:
> 
>     Hi Thomas,
> 
>     On 7/4/20 1:35 AM, Thomas Stüfe wrote:
>      > Hi Zhengyu,
>      >
>      > sorry for the wait.
> 
>     No problem.
> 
>      >
>      >
>      > This looks good, but why did you remove the "!same region"
>     condition? I
>      > believe that is needed for the case when CDS' first mapping
>     encounters
>      > errors, so before it rebuilds CDS at another location it removes
>     (a) all
>      > mappings and then (b) the enclosing reservation. (a) should be
>     ignored
>      > by NMT but (b) should not. I may be wrong, I have had no coffee yet.
>      > Cheers, Thomas
> 
>     In new version, same region is handled in line #470
> 
>        470   if (reserved_rgn->same_region(addr, size)) {
>        471     return remove_released_region(reserved_rgn);
>        472   }
> 
>     so, !same region is always true.
> 
>     Thanks,
> 
>     -Zhengyu
> 
> 


From daniel.daugherty at oracle.com  Tue Jul  7 21:21:52 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 7 Jul 2020 17:21:52 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
Message-ID: <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>

Ping! Any takers??? Code deletion should be really appealing here!!

Dan


On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> It's time to remove the AsyncDeflateIdleMonitors option from JDK16. We can
> also get rid of the safepoint based deflation mechanism since turning off
> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way left to
> use it.
>
> This is marked as an "S/M" review because the number of touched/deleted
> lines makes it a Medium review, but the number of touched/changed lines
> (outside of the deletions) makes it a Small review. It's actually a pretty
> fast read... :-)
>
> Here's the bug ID:
>
>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>                 based deflation mechanism
>     https://bugs.openjdk.java.net/browse/JDK-8246476
>
> Here's the webrev URL:
>
>     http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>
> The webrev is baselined on Thomas S's fix for 8248650 which is jdk-16+4
> plus a dozen or so changesets.
>
> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and there are
> no regressions (and very few known failures). My inflation stress testing
> is still in process. I had to restart that testing after a thunderstorm
> related power failure took down my servers in Florida. Sigh...
>
> Thanks, in advance, for any comments, questions, or suggestions.
>
> Dan


From kim.barrett at oracle.com  Tue Jul  7 23:29:17 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 7 Jul 2020 19:29:17 -0400
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
Message-ID: <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>

> On Jul 7, 2020, at 12:45 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Jul 7, 2020, at 7:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>> 
>> Using memcpy instead of byte_at_put (and getting rid of byte_at_put)
>> seems like a good idea to me.
> 
> We have had problems with memcpy in the long past, and I'm
> personally still nervous when I see a call to it.  The problem is
> subtle, and escaped everybody's notice the first time around,
> and I'd prefer not to make more bugs of the same kind.
> 
> What problem?  [... snipped long discussion ...]
> 
> All that said, I have no problem with memcpy being
> used (though I prefer a locally documented alias!) in
> the places where it is appearing in our source base.
> Those places are, yes, private to some construction
> process, or singly-threaded for some other reason,
> perhaps a safepoint.  But I think you understand now
> that I am nervous when I see more and more uses
> of memcpy in our code, because I think that now
> it's just a matter of time before someone concludes
> that memcpy is the new (old) best way to copy data,
> and the hard work of codifying our practices in
> Copy will start to be neglected.  A new programmer
> in our code base could make the mistake of just
> reaching for memcpy instead of doing the work of
> deciding which kind of Copy to use.  And that will
> eventually cause a difficult-to-detect bug.  Let's not.
> 
> -- John

[This is somewhat summarizing an off-email discussion John and I had.]

It shouldn't be surprising that we have uses of bare memcpy, since we
don't have Copy::disjoint_bytes. In the particular case at hand,
there's no issue of word tearing since we aren't copying words, and
we're not even doing an aligned copy. But I wouldn't object to, and in
fact would encourage, the use of Copy::disjoint_bytes there if it
existed. Not having that function effectively denormalizes Copy in
favor of bare memcpy, making it easy to forget that Copy even exists.
I think the split between bare memcpy and Copy::conjoint_words in
HotSpot currently is close to even. And there are a score or so bare
memmoves.

I think the places where word tearing is an issue ought to be using
one of the "atomic" Copy functions.  (And it's not just word tearing
that can be an issue, as we found with SPARC BIS.  You convinced me
some time ago that memset_with_concurrent_readers (my fault for that)
should have been an "atomic" Copy function.  I'll probably deal with
that RFE soon; it's much easier now that SPARC has been removed.)

All of this is something of a digression from the change at hand.  In
the absence of Copy::disjoint_bytes (which this change shouldn't be
worrying about), I think using memcpy is fine for this change.


From felix.yang at huawei.com  Wed Jul  8 01:15:18 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Wed, 8 Jul 2020 01:15:18 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
 barrier in fast_storefield and fast_accessfield
In-Reply-To: <346ecc4f-bf55-6278-30b0-fd64a1586980@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E4C9F1@dggeml527-mbx.china.huawei.com>
 <e7e130a4-41aa-2bbd-88f0-31e4899be50b@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A578@dggeml507-mbx.china.huawei.com>
 <f8083adb-bdf0-675b-0c84-ebad7ccd4704@redhat.com>
 <dad22956-2382-c2a9-00a4-ee1702a22bee@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E60126@dggeml527-mbx.china.huawei.com>
 <7a418bcb-3abd-d95e-d379-7a5e9722eb45@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E63C7B@dggeml527-mbx.china.huawei.com>
 <d16feffb-31d9-6a9c-0ae3-8cb0ee2ab2e7@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E6404D@dggeml527-mbx.china.huawei.com>
 <346ecc4f-bf55-6278-30b0-fd64a1586980@redhat.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E64DE0@dggeml527-mbx.china.huawei.com>

Hi,

> -----Original Message-----
> From: Andrew Hughes [mailto:gnu.andrew at redhat.com]
> Sent: Wednesday, July 8, 2020 1:01 AM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Andrew Haley
> <aph at redhat.com>; hotspot-runtime-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net; jdk-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(XS): 8248219: aarch64: missing memory
> barrier in fast_storefield and fast_accessfield
> 

Cut...

> >
> > Thanks for confirming that.  Will push this to aarch64-port/jdk8u-
> shenandoah.
> >
> > Felix
> >
> 
> It would be preferable to avoid pushing to aarch64-port/jdk8u-shenandoah
> at the moment, as it is being prepared for the release next week.
>
> This change will be part of aarch64-shenandoah-jdk8u272-b01.

I had already pushed before I saw this email.  Sorry if this causes trouble.
This reminds me to put a jdk8u-fix-request label on another 8u-specific bug (JDK-8248851) which is currently under review.
I am looking forward to seeing the aarch64 port merged into 8u upstream :-)  Then push approval could be done the same way.

Thanks,
Felix

From david.holmes at oracle.com  Wed Jul  8 07:51:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 8 Jul 2020 17:51:31 +1000
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
Message-ID: <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>

On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
> Ping! Any takers??? Code deletion should be really appealing here!!

Sorry Dan, I didn't get to it before vacation. But if you can wait till
Monday ...

Cheers,
David

> Dan
> 
> 
> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> It's time to remove the AsyncDeflateIdleMonitors option from JDK16. We 
>> can
>> also get rid of the safepoint based deflation mechanism since turning off
>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way left to
>> use it.
>>
>> This is marked as an "S/M" review because the number of touched/deleted
>> lines makes it a Medium review, but the number of touched/changed lines
>> (outside of the deletions) makes it a Small review. It's actually a 
>> pretty
>> fast read... :-)
>>
>> Here's the bug ID:
>>
>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>                 based deflation mechanism
>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>
>> Here's the webrev URL:
>>
>>     http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>
>> The webrev is baselined on Thomas S's fix for 8248650 which is jdk-16+4
>> plus a dozen or so changesets.
>>
>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and there are
>> no regressions (and very few known failures). My inflation stress testing
>> is still in process. I had to restart that testing after a thunderstorm
>> related power failure took down my servers in Florida. Sigh...
>>
>> Thanks, in advance, for any comments, questions, or suggestions.
>>
>> Dan
> 

From xxinliu at amazon.com  Wed Jul  8 08:26:01 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 8 Jul 2020 08:26:01 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
Message-ID: <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>

hi, Reviewers,

Please allow me to ping this CR.
It's the last left-over task for -XX:ControlIntrinsic=. It adds a sanity check for user input.

Thanks,
--lx

On 6/25/20 6:59 PM, Liu, Xin wrote:

hi, Reviewers,

Could you review this patch?

bug: https://bugs.openjdk.java.net/browse/JDK-8247732

webrev: http://cr.openjdk.java.net/~xliu/8247732/00/webrev/


The core logic is class ControlIntrinsicValidator in compilerDirectives.hpp

It iterates the ccstrlist option and makes sure user-input intrinsic ids are all valid.  It stops and takes a record when it meets the first unrecognized intrinsic.

I used constraints to validate the global options ControlIntrinsic and DisableIntrinsic.

ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.

I validate at that time, and the JVM will crash if the guarantee() statement is not met.

I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".

HotSpot probably already has similar code, but I failed to find it. Please let me know and I will remove mine.
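
The validation pass described above (walk the list, stop at the first unrecognized id and record it) can be sketched roughly as follows. This is only an illustration in Java, not the C++ ControlIntrinsicValidator from the webrev: the class name, the KNOWN set, and the assumption that entries are separated by commas or whitespace with an optional +/- prefix are all made up for the example.

```java
import java.util.Optional;
import java.util.Set;

public class IntrinsicListValidator {
    // Hypothetical stand-in for the VM's table of intrinsic names.
    static final Set<String> KNOWN = Set.of("_dabs", "_fabs", "_getClass");

    /**
     * Returns the first entry that is not a known intrinsic id,
     * or an empty Optional if every entry is valid.
     */
    static Optional<String> firstInvalid(String option) {
        // Assumption: entries are separated by commas and/or whitespace.
        for (String raw : option.split("[,\\s]+")) {
            if (raw.isEmpty()) {
                continue; // skip the empty token produced by a leading separator
            }
            // Assumption: an optional '+' or '-' prefix enables/disables the id.
            String id = (raw.startsWith("+") || raw.startsWith("-"))
                    ? raw.substring(1)
                    : raw;
            if (!KNOWN.contains(id)) {
                return Optional.of(raw); // stop at the first bad entry
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstInvalid("+_dabs,-_fabs"));
        System.out.println(firstInvalid("+_dabs,-_bogus,_alsoBogus"));
    }
}
```

Returning the offending entry (rather than just a boolean) mirrors the "take a record" step, so the caller can name the bad id in its error message.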


Test:

passed hotspot:tier1 and gtest:all

manually tested with wrong inputs.

https://bugs.openjdk.java.net/browse/JDK-8247732?focusedCommentId=14349960&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14349960


From goetz.lindenmaier at sap.com  Wed Jul  8 09:23:53 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 8 Jul 2020 09:23:53 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <25d5804f-2a8a-036a-3a5a-9795934551ed@oracle.com>
 <c3a21eaf-c801-44e8-9910-1f9862c14c63@oracle.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
Message-ID: <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi,

Is this good to be pushed now?  I would like to push it 
before RDP2 of jdk15, which is next week. 

@David
> Sorry but that consistency argument is a huge stretch in the case of the
> helpful NPE message because the original message is empty! 
No, it is not empty. It is computed lazily, but this is not visible to the 
user.  Especially, if I implement what you propose, the user can first 
see the message, and then suddenly it is gone.  This is really unexpected!
> This is only
> about helpful NPE message and you can trivially disable it for this case.
It's not hard to do it for all the exceptions, either.  The counter 
would have to be moved to Throwable, and all exceptions that 
get a message from the runtime would have to be marked as such.
Then setStackTrace in throwable would just reset the message.

I implemented an example where wrong stack traces are 
printed with LinkageError and NPE, modifying a jtreg test:
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
See also the generated output added to a comment in the patch.
If the NPE message text was missing in the second printout, I think 
this really would be unexpected.
Please note that the correct message is printed after messing
with the stack trace, it's the stack trace that is wrong.
(Not as with the problem I am fixing here where a wrong
message is printed.)
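
As a rough illustration of that point (a minimal sketch, not the jtreg test from the patch; the class, method, and frame names are made up): for an exception created with an explicit message, setStackTrace replaces only the reported frames and leaves getMessage untouched.

```java
public class MessageSurvivesSetStackTrace {
    /** Builds an exception and then overwrites its stack trace with a fabricated frame. */
    static RuntimeException withForeignTrace() {
        RuntimeException e = new IllegalStateException("original message");
        // Hypothetical frame that has nothing to do with where 'e' was created.
        StackTraceElement fake =
                new StackTraceElement("some.other.Klass", "unrelatedMethod", "Klass.java", 42);
        e.setStackTrace(new StackTraceElement[] { fake });
        return e;
    }

    public static void main(String[] args) {
        RuntimeException e = withForeignTrace();
        // The message is still the one passed to the constructor ...
        System.out.println(e.getMessage());
        // ... while the first reported frame is now the fabricated one.
        System.out.println(e.getStackTrace()[0]);
    }
}
```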

Best regards,
  Goetz.


> 
> > I guess the normal usecase of setStackTrace is the other way around:
> > Change the message and throw a new exception with the existing
> > stack trace:
> >
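> > try {
> >    int x = a.x;  // may throw NullPointerException
> > } catch (NullPointerException e) {
> >    // setStackTrace returns void, so it cannot be chained onto the throw
> >    NullPointerException npe = new NullPointerException("My own error message");
> >    npe.setStackTrace(e.getStackTrace());
> >    throw npe;
> > }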
> >
> > And not taking an arbitrary stack trace and put it into an exception
> > with existing message.
> 
> Interesting usage.
> 
> Cheers,
> David
> -----
> 
> > Best regards,
> >    Goetz.
> >
> >
> >
> >> -----Original Message-----
> >> From: David Holmes <david.holmes at oracle.com>
> >> Sent: Friday, July 3, 2020 9:30 AM
> >> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-
> mlv.fr'
> >> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >> <hotspot-runtime-dev at openjdk.java.net>
> >> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> >> after calling fillInStackTrace
> >>
> >> Hi Goetz,
> >>
> >> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> >>> Hi,
> >>>
> >>>> True. To ensure you process the original backtrace only you need to
> add
> >>>> synchronization in getMessage():
> >>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-
> >> jdk15/05/
> >>>
> >>> I added the volatile, too, but as I understand the synchronized
> >>> block brings sufficient memory barriers that this also works
> >>> without.
> >>
> >> No "volatile" needed, or wanted, when all access is within synchronized
> >> regions.
> >>
> >>>> To be honest the idea that someone would share an exception instance
> >> and
> >>>> concurrently mutate it with fillInStackTrace() whilst printing out
> >>>> information about it just seems highly unrealistic.
> >>> Yes, contention here is quite unlikely, so it should not harm performance
> >>
> >> Contention was not my concern at all. :)
> >>
> >>>> Though after looking at comments in the test I would also
> >>>> suggest that setStackTrace be updated:
> >>> The test shows that after setStackTrace still the correct message
> >>> is computed. This is because the algorithm uses Throwable::backtrace
> >>> and not Throwable::stacktrace.  Throwable::backtrace is not
> >>> affected by setStackTrace.
> >>> The behavior is just as with any exception. If you fiddle
> >>> with the stack trace, but don't adapt the message text,
> >>> the message might refer to other code than the stack trace
> >>> points to.
> >>
> >> But you can't adapt the message text - there is no setMessage! If the
> >> message is NULL and you call setStackTrace() then getMessage(), it makes
> >> no sense to return the extended error message that was associated with
> >> the original stack/backtrace.
> >>
> >> Cheers,
> >> David
> >>
> >>> Best regards,
> >>>     Goetz.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: David Holmes <david.holmes at oracle.com>
> >>>> Sent: Friday, July 3, 2020 3:37 AM
> >>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-
> >> mlv.fr'
> >>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> message
> >>>> after calling fillInStackTrace
> >>>>
> >>>> Hi Goetz,
> >>>>
> >>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> >>>>> Hi Remi,
> >>>>>
> >>>>> But how does volatile help?
> >>>>> I see the test for numStackTracesFilledIn == 1 then gets always the
> >>>>> right value.
> >>>>> But the backtrace may not be changed until I read it in
> >>>>> getExtendedNPEMessage.  The other thread could change it after
> >>>>> checking numStackTracesFilledIn and before I read the backtrace.
> >>>>
> >>>> True. To ensure you process the original backtrace only you need to
> add
> >>>> synchronization in getMessage():
> >>>>
> >>>>          public String getMessage() {
> >>>>              String message = super.getMessage();
> >>>>              // If the stack trace was changed the extended NPE algorithm
> >>>>              // will compute a wrong message.
> >>>> +         synchronized(this) {
> >>>> !             if (message == null && numStackTracesFilledIn == 1) {
> >>>> !                 return getExtendedNPEMessage();
> >>>> !             }
> >>>> +         }
> >>>>              return message;
> >>>>          }
> >>>>
> >>>> To be honest the idea that someone would share an exception instance
> >> and
> >>>> concurrently mutate it with fillInStackTrace() whilst printing out
> >>>> information about it just seems highly unrealistic. But the above fixes
> >>>> it simply. Though after looking at comments in the test I would also
> >>>> suggest that setStackTrace be updated:
> >>>>
> >>>>           synchronized (this) {
> >>>>                if (this.stackTrace == null && // Immutable stack
> >>>>                    backtrace == null) // Test for out of protocol state
> >>>>                    return;
> >>>> +           numStackTracesFilledIn++;
> >>>>                this.stackTrace = defensiveCopy;
> >>>>            }
> >>>>        }
> >>>>
> >>>> as that would seem to be another hole in the mechanism.
> >>>>
> >>>>> I want to vote again for the much more simple version
> >>>>> proposed in webrev 02:
> >>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
> NPE_fillInStackTrace-
> >>>> jdk15/02/
> >>>>
> >>>> I much prefer the latest version that recognises that only the original
> >>>> stack can be processed.
> >>>>
> >>>> In the test:
> >>>>
> >>>> +         // This holds for explicitly crated NPEs, but also for implicilty
> >>>>
> >>>> Two typos: crated  & implicilty
> >>>>
> >>>> Thanks,
> >>>> David
> >>>> -----
> >>>>
> >>>>
> >>>>> Its drawback is only that for this code:
> >>>>>      ex = null;
> >>>>>      ex.fillInStackTrace()
> >>>>> no message is created.
> >>>>>
> >>>>> I think this really is acceptable.
> >>>>>
> >>>>>
> >>>>> Remi, I didn't comment on this statement from a previous mail:
> >>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
> point.
> >>>>>> yes, it contains the Java stack trace, but if the Java stack trace is filled
> >> you
> >>>> don't
> >>>>>> compute any helpful message anyway.
> >>>>> The internal structure is no more deleted when the stack trace
> >>>>> is filled. So the message can be computed later, too.
> >>>>>
> >>>>> Best regards,
> >>>>>      Goetz.
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> >>>>>> Sent: Thursday, July 2, 2020 8:52 PM
> >>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph
> Dreis
> >>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
> runtime-
> >>>>>> dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
> >>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >> message
> >>>>>> after calling fillInStackTrace
> >>>>>>
> >>>>>> yes,
> >>>>>> it's what i was saying,
> >>>>>> given that a NPE can be thrown very early, before VarHandle is
> >> initialized,
> >>>> i
> >>>>>> believe that declaring numStackTracesFilledIn volatile is the best way
> to
> >>>>>> tackle that.
> >>>>>>
> >>>>>> Rémi
> >>>>>>
> >>>>>> ----- Original Message -----
> >>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> >>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph
> >>>> Dreis"
> >>>>>> <christoph.dreis at freenet.de>
> >>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-
> dev at openjdk.java.net>,
> >>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
> >>>>>>> <forax at univ-mlv.fr>
> >>>>>>> Sent: Thursday, July 2, 2020 20:47:31
> >>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> message
> >>>>>> after calling fillInStackTrace
> >>>>>>
> >>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> >>>>>>>> Hi Christoph,
> >>>>>>>>
> >>>>>>>> I fixed the comment, thanks for pointing that out.
> >>>>>>>>
> >>>>>>> One other thing is that NPE::getMessage reads
> numStackTracesFilledIn
> >>>>>>> without synchronization.
> >>>>>>>
> >>>>>>> -Alan

From sakatakui at oss.nttdata.com  Wed Jul  8 12:35:38 2020
From: sakatakui at oss.nttdata.com (Koichi Sakata)
Date: Wed, 8 Jul 2020 21:35:38 +0900
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
Message-ID: <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>

Thank you, John and Kim. I was able to understand that deeply.
I recognize that we can use memcpy in this situation.

I fixed my patch because it had unnecessary code that was pointed out before.
I would appreciate it if anyone could sponsor this patch.

Thanks,
Koichi

===== PATCH =====
diff -r f0792f0ffce9 src/hotspot/share/oops/symbol.cpp
--- a/src/hotspot/share/oops/symbol.cpp	Tue Jun 23 21:23:00 2020 -0700
+++ b/src/hotspot/share/oops/symbol.cpp	Wed Jul 08 17:51:27 2020 +0900
@@ -50,10 +50,8 @@
  Symbol::Symbol(const u1* name, int length, int refcount) {
    _hash_and_refcount =  pack_hash_and_refcount((short)os::random(), refcount);
    _length = length;
-  _body[0] = 0;  // in case length == 0
-  for (int i = 0; i < length; i++) {
-    byte_at_put(i, name[i]);
-  }
+  _body[0] = 0;
+  memcpy(_body, name, length);
  }

  void* Symbol::operator new(size_t sz, int len) throw() {
diff -r f0792f0ffce9 src/hotspot/share/oops/symbol.hpp
--- a/src/hotspot/share/oops/symbol.hpp	Tue Jun 23 21:23:00 2020 -0700
+++ b/src/hotspot/share/oops/symbol.hpp	Wed Jul 08 17:51:27 2020 +0900
@@ -125,11 +125,6 @@
      return (int)heap_word_size(byte_size(length));
    }

-  void byte_at_put(int index, u1 value) {
-    assert(index >=0 && index < length(), "symbol index overflow");
-    _body[index] = value;
-  }
-
    Symbol(const u1* name, int length, int refcount);
    void* operator new(size_t size, int len) throw();
    void* operator new(size_t size, int len, Arena* arena) throw();


On 2020/07/08 8:29, Kim Barrett wrote:
>> On Jul 7, 2020, at 12:45 PM, John Rose <john.r.rose at oracle.com> wrote:
>>
>> On Jul 7, 2020, at 7:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>>
>>> Using memcpy instead of byte_at_put (and getting rid of byte_at_put)
>>> seems like a good idea to me.
>>
>> We have had problems with memcpy in the long past, and I'm
>> personally still nervous when I see a call to it.  The problem is
>> subtle, and escaped everybody's notice the first time around,
>> and I'd prefer not to make more bugs of the same kind.
>>
>> What problem?  [... snipped long discussion ...]
>>
>> All that said, I have no problem with memcpy being
>> used (though I prefer a locally documented alias!) in
>> the places where it is appearing in our source base.
>> Those places are, yes, private to some construction
>> process, or singly-threaded for some other reason,
>> perhaps a safepoint.  But I think you understand now
>> that I am nervous when I see more and more uses
>> of memcpy in our code, because I think that now
>> it's just a matter of time before someone concludes
>> that memcpy is the new (old) best way to copy data,
>> and the hard work of codifying our practices in
>> Copy will start to be neglected.  A new programmer
>> in our code base could make the mistake of just
>> reaching for memcpy instead of doing the work of
>> deciding which kind of Copy to use.  And that will
>> eventually cause a difficult-to-detect bug.  Let's not.
>>
>> -- John
> 
> [This is somewhat summarizing an off-email discussion John and I had.]
> 
> It shouldn't be surprising that we have uses of bare memcpy, since we
> don't have Copy::disjoint_bytes. In the particular case at hand,
> there's no issue of word tearing since we aren't copying words, and
> we're not even doing an aligned copy. But I wouldn't object to, and in
> fact would encourage, the use of Copy::disjoint_bytes there if it
> existed. Not having that function effectively denormalizes Copy in
> favor of bare memcpy, making it easy to forget that Copy even exists.
> I think the split between bare memcpy and Copy::conjoint_words in
> HotSpot currently is close to even. And there are a score or so bare
> memmoves.
> 
> I think the places where word tearing is an issue ought to be using
> one of the "atomic" Copy functions.  (And it's not just word tearing
> that can be an issue, as we found with SPARC BIS.  You convinced me
> some time ago that memset_with_concurrent_readers (my fault for that)
> should have been an "atomic" Copy function.  I'll probably deal with
> that RFE soon; it's much easier now that SPARC has been removed.)
> 
> All of this is something of a digression from the change at hand.  In
> the absence of Copy::disjoint_bytes (which this change shouldn't be
> worrying about), I think using memcpy is fine for this change.
> 

From david.holmes at oracle.com  Wed Jul  8 12:52:50 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 8 Jul 2020 22:52:50 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>

On 8/07/2020 7:23 pm, Lindenmaier, Goetz wrote:
> Hi,
> 
> Is this good to be pushed now?  I would like to push it
> before RDP2 of jdk15, which is next week.
> 
> @David
>> Sorry but that consistency argument is a huge stretch in the case of the
>> helpful NPE message because the original message is empty!
> No, it is not empty. It is computed lazy, but this is not visible to the
> user.  Especially, if I implement what you propose, the user can first
> see the message, and then suddenly it is gone.  This is really unexpected!

Your extended message is only computed when there is no original message.

You're concerned about this scenario:

catch (NullPointerException npe) {
   String msg1 = npe.getMessage(); // gets extended NPE message
   npe.setStackTrace(...);
   String msg2 = npe.getMessage(); // gets null
}

While I find it hard to imagine anyone doing this you can easily have 
specified that the extended message is only available with the original 
stacktrace, hence after a second call to fillInStackTrace, or a call to 
setStackTrace, then the message reverts to being empty. To me that makes 
far more sense than having msg2 continue to report the extended info for 
the original stacktrace when it now has a new stacktrace.

I'm really not seeing why calling fillInStackTrace() a second time 
should be treated any differently to calling setStackTrace(). They 
should be handled consistently IMO.
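
A minimal sketch of that consistent handling, using a hypothetical exception class rather than the actual NullPointerException code in the webrev: both fillInStackTrace() and setStackTrace() bump the same counter, and the lazily computed message is only returned while the original stack trace is still in place. The class name and the placeholder message text are assumptions made for the example.

```java
public class StackTraceMessageDemo {
    static class LazyMessageException extends RuntimeException {
        // Deliberately no initializer: Throwable's constructor calls
        // fillInStackTrace() before subclass field initializers run, and an
        // explicit "= 0" here would wipe out that first increment.
        private int numStackTracesFilledIn;

        @Override
        public synchronized Throwable fillInStackTrace() {
            numStackTracesFilledIn++;
            return super.fillInStackTrace();
        }

        @Override
        public void setStackTrace(StackTraceElement[] stackTrace) {
            synchronized (this) {
                numStackTracesFilledIn++; // treated the same as a refill
            }
            super.setStackTrace(stackTrace);
        }

        @Override
        public String getMessage() {
            String message = super.getMessage();
            synchronized (this) {
                // Only the original stack trace qualifies for the lazy message.
                if (message == null && numStackTracesFilledIn == 1) {
                    return "extended message for the original stack trace";
                }
            }
            return message;
        }
    }

    /** Returns the message seen before and after replacing the stack trace. */
    static String[] messagesAroundSetStackTrace() {
        LazyMessageException e = new LazyMessageException();
        String before = e.getMessage();
        e.setStackTrace(new StackTraceElement[0]);
        String after = e.getMessage();
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] m = messagesAroundSetStackTrace();
        System.out.println("before: " + m[0]);
        System.out.println("after:  " + m[1]);
    }
}
```

With this shape the message reverts to null after either kind of stack trace change, which is the msg1/msg2 scenario above.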

>> This is only
>> about helpful NPE message and you can trivially disable it for this case.
> It's not hard to do it for all the exceptions, either.  The counter
> would have to be moved to Throwable, and all exceptions that
> get a message from the runtime would have to be marked as such.
> Then setStackTrace in throwable would just reset the message.

We are not talking about all exceptions only about your NPE extended 
error message.

David
-----

> I implemented an example where wrong stack traces are
> printed with LinkageError and NPE, modifying a jtreg test:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
> See also the generated output added to a comment in the patch.
> If the NPE message text was missing in the second printout, I think
> this really would be unexpected.
> Please note that the correct message is printed after messing
> with the stack trace, it's the stack trace that is wrong.
> (Not as with the problem I am fixing here where a wrong
> message is printed.)
> 
> Best regards,
>    Goetz.
> 
> 
> 
>>
>>> I guess the normal usecase of setStackTrace is the other way around:
>>> Change the message and throw a new exception with the existing
>>> stack trace:
>>>
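>>> try {
>>>     int x = a.x;  // may throw NullPointerException
>>> } catch (NullPointerException e) {
>>>     // setStackTrace returns void, so it cannot be chained onto the throw
>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>     npe.setStackTrace(e.getStackTrace());
>>>     throw npe;
>>> }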
>>>
>>> And not taking an arbitrary stack trace and put it into an exception
>>> with existing message.
>>
>> Interesting usage.
>>
>> Cheers,
>> David
>> -----
>>
>>> Best regards,
>>>     Goetz.
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-
>> mlv.fr'
>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>> after calling fillInStackTrace
>>>>
>>>> Hi Goetz,
>>>>
>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>> Hi,
>>>>>
>>>>>> True. To ensure you process the original backtrace only you need to
>> add
>>>>>> synchronization in getMessage():
>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-
>>>> jdk15/05/
>>>>>
>>>>> I added the volatile, too, but as I understand the synchronized
>>>>> block brings sufficient memory barriers that this also works
>>>>> without.
>>>>
>>>> No "volatile" needed, or wanted, when all access is within synchronized
>>>> regions.
>>>>
>>>>>> To be honest the idea that someone would share an exception instance
>>>> and
>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>> information about it just seems highly unrealistic.
>>>>> Yes, contention here is quite unlikely, so it should not harm performance
>>>>
>>>> Contention was not my concern at all. :)
>>>>
>>>>>> Though after looking at comments in the test I would also
>>>>>> suggest that setStackTrace be updated:
>>>>> The test shows that after setStackTrace still the correct message
>>>>> is computed. This is because the algorithm uses Throwable::backtrace
>>>>> and not Throwable::stacktrace.  Throwable::backtrace is not
>>>>> affected by setStackTrace.
>>>>> The behavior is just as with any exception. If you fiddle
>>>>> with the stack trace, but don't adapt the message text,
>>>>> the message might refer to other code than the stack trace
>>>>> points to.
>>>>
>>>> But you can't adapt the message text - there is no setMessage! If the
>>>> message is NULL and you call setStackTrace() then getMessage(), it makes
>>>> no sense to return the extended error message that was associated with
>>>> the original stack/backtrace.
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Best regards,
>>>>>      Goetz.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-
>>>> mlv.fr'
>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>> message
>>>>>> after calling fillInStackTrace
>>>>>>
>>>>>> Hi Goetz,
>>>>>>
>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>> Hi Remi,
>>>>>>>
>>>>>>> But how does volatile help?
>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets always the
>>>>>>> right value.
>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>> getExtendedNPEMessage.  The other thread could change it after
>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>
>>>>>> True. To ensure you process the original backtrace only you need to
>> add
>>>>>> synchronization in getMessage():
>>>>>>
>>>>>>           public String getMessage() {
>>>>>>               String message = super.getMessage();
>>>>>>               // If the stack trace was changed the extended NPE algorithm
>>>>>>               // will compute a wrong message.
>>>>>> +         synchronized(this) {
>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>> !                 return getExtendedNPEMessage();
>>>>>> !             }
>>>>>> +         }
>>>>>>               return message;
>>>>>>           }
>>>>>>
>>>>>> To be honest the idea that someone would share an exception instance
>>>> and
>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>> information about it just seems highly unrealistic. But the above fixes
>>>>>> it simply. Though after looking at comments in the test I would also
>>>>>> suggest that setStackTrace be updated:
>>>>>>
>>>>>>            synchronized (this) {
>>>>>>                 if (this.stackTrace == null && // Immutable stack
>>>>>>                     backtrace == null) // Test for out of protocol state
>>>>>>                     return;
>>>>>> +           numStackTracesFilledIn++;
>>>>>>                 this.stackTrace = defensiveCopy;
>>>>>>             }
>>>>>>         }
>>>>>>
>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>
>>>>>>> I want to vote again for the much simpler version
>>>>>>> proposed in webrev 02:
>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
>> NPE_fillInStackTrace-
>>>>>> jdk15/02/
>>>>>>
>>>>>> I much prefer the latest version that recognises that only the original
>>>>>> stack can be processed.
>>>>>>
>>>>>> In the test:
>>>>>>
>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>
>>>>>> Two typos: crated  & implicilty
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>
>>>>>>> Its only drawback is that for this code:
>>>>>>>       ex = null;
>>>>>>>       ex.fillInStackTrace()
>>>>>>> no message is created.
>>>>>>>
>>>>>>> I think this really is acceptable.
>>>>>>>
>>>>>>>
>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
>> point.
>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is filled
>>>> you
>>>>>> don't
>>>>>>>> compute any helpful message anyway.
>>>>>>> The internal structure is no longer deleted when the stack trace
>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>       Goetz.
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph
>> Dreis
>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
>> runtime-
>>>>>>>> dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>> message
>>>>>>>> after calling fillInStackTrace
>>>>>>>>
>>>>>>>> yes,
>>>>>>>> it's what i was saying,
>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>>>> initialized,
>>>>>> i
>>>>>>>> believe that declaring numStackTracesFilledIn volatile is the best way
>> to
>>>>>>>> tackle that.
>>>>>>>>
>>>>>>>> Rémi
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph
>>>>>> Dreis"
>>>>>>>> <christoph.dreis at freenet.de>
>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-
>> dev at openjdk.java.net>,
>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>> message
>>>>>>>> after calling fillInStackTrace
>>>>>>>>
>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi Christoph,
>>>>>>>>>>
>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>
>>>>>>>>> One other thing is that NPE::getMessage reads
>> numStackTracesFilledIn
>>>>>>>>> without synchronization.
>>>>>>>>>
>>>>>>>>> -Alan

From daniel.daugherty at oracle.com  Wed Jul  8 14:19:50 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 8 Jul 2020 10:19:50 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
Message-ID: <2b53179b-2af3-3ac8-1926-74d44f15f72c@oracle.com>

On 7/8/20 3:51 AM, David Holmes wrote:
> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>> Ping! Any takers??? Code deletion should be really appealing here!!
>
> Sorry Dan, I didn't get to it before vacation. But if you can wait till 
> Monday ...

Enjoy the time away!  As always, I'll wait for your code review.

Dan


>
> Cheers,
> David
>
>> Dan
>>
>>
>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> It's time to remove the AsyncDeflateIdleMonitors option from JDK16. 
>>> We can
>>> also get rid of the safepoint based deflation mechanism since 
>>> turning off
>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way 
>>> left to
>>> use it.
>>>
>>> This is marked as an "S/M" review because the number of touched/deleted
>>> lines makes it a Medium review, but the number of touched/changed lines
>>> (outside of the deletions) makes it a Small review. It's actually a 
>>> pretty
>>> fast read... :-)
>>>
>>> Here's the bug ID:
>>>
>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>                 based deflation mechanism
>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>
>>> Here's the webrev URL:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>
>>> The webrev is baselined on Thomas S's fix for 8248650 which is jdk-16+4
>>> plus a dozen or so changesets.
>>>
>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and there 
>>> are
>>> no regressions (and very few known failures). My inflation stress 
>>> testing
>>> is still in process. I had to restart that testing after a thunderstorm
>>> related power failure took down my servers in Florida. Sigh...
>>>
>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>
>>> Dan
>>


From kim.barrett at oracle.com  Wed Jul  8 18:04:52 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 8 Jul 2020 14:04:52 -0400
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
 <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
Message-ID: <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>

> On Jul 8, 2020, at 8:35 AM, Koichi Sakata <sakatakui at oss.nttdata.com> wrote:
> 
> I fixed my patch because it had unnecessary code that was pointed out before.
> I would appreciate it if anyone could sponsor this patch.

What about this unaddressed comment?

> On Jul 7, 2020, at 10:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
> However, the first two elements of _body are used by identity_hash().
> That seems like a possible reason to force initialization of both
> elements, which currently isn't done for length == 1.  But maybe it
> doesn't matter that identity_hash isn't consistent between processes,
> in which case forcing the initialization of _body[0] also shouldn't
> be needed.

I think it's a waste to initialize _body[0] or a bug to not initialize _body[1].
I've no idea which.
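
To make the concern above concrete, here is a minimal sketch. It assumes a
hash that mixes the first two body bytes; toy_identity_hash and
hash_with_leftover are hypothetical names, and the formula is only loosely
modelled on the description in this thread, not copied from HotSpot. The
second byte is set explicitly to stand in for whatever memory contents a
length == 1 symbol would otherwise inherit.

```cpp
#include <cstdint>

// Hypothetical hash that mixes the first two body bytes with the length.
// This is an illustration only, not HotSpot's identity_hash() formula.
inline unsigned toy_identity_hash(const uint8_t* body, unsigned length) {
  return ((length << 8) ^ ((unsigned(body[0]) << 8) | body[1])) & 0xffffu;
}

// Models a symbol of length 1: only body[0] is meaningful, while body[1]
// holds `leftover`, standing in for memory the constructor never wrote.
inline unsigned hash_with_leftover(uint8_t first, uint8_t leftover) {
  uint8_t body[2] = { first, leftover };
  return toy_identity_hash(body, 1);
}
```

Under these assumptions, two one-byte symbols with the same name hash
differently whenever the leftover byte differs, which is the inconsistency
being asked about.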


From ioi.lam at oracle.com  Wed Jul  8 18:32:18 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 8 Jul 2020 11:32:18 -0700
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
 <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
 <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>
Message-ID: <716bce70-7cbc-98da-5588-e08dffbe21fa@oracle.com>


On 7/8/20 11:04 AM, Kim Barrett wrote:
>> On Jul 8, 2020, at 8:35 AM, Koichi Sakata <sakatakui at oss.nttdata.com> wrote:
>>
>> I fixed my patch because it had unnecessary code that was pointed out before.
>> I would appreciate it if anyone could sponsor this patch.

I can sponsor the patch. I un-edited the line that assigns _body[0] to 
minimize the delta.

diff -r c29c9012c0ed src/hotspot/share/oops/symbol.cpp
--- a/src/hotspot/share/oops/symbol.cpp    Tue Jul 07 23:11:13 2020 -0700
+++ b/src/hotspot/share/oops/symbol.cpp    Wed Jul 08 11:25:48 2020 -0700
@@ -52,9 +52,7 @@
   _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
   _length = length;
   _body[0] = 0;  // in case length == 0
-  for (int i = 0; i < length; i++) {
-    byte_at_put(i, name[i]);
-  }
+  memcpy(_body, name, length);
 }


> What about this unaddressed comment?
>
>> On Jul 7, 2020, at 10:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>> However, the first two elements of _body are used by identity_hash().
>> That seems like a possible reason to force initialization of both
>> elements, which currently isn't done for length == 1.  But maybe it
>> doesn't matter that identity_hash isn't consistent between processes,
>> in which case forcing the initialization of _body[0] also shouldn't
>> be needed.
> I think it's a waste to initialize _body[0] or a bug to not initialize _body[1].
> I've no idea which.
>
I think this should be done in a separate RFE. I filed
https://bugs.openjdk.java.net/browse/JDK-8249087
Symbol constructor unnecessarily initializes _body[0]

Thanks
- Ioi
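
The loop-to-memcpy change in the diff above can be sketched outside HotSpot
as follows. ToyBody, kCapacity, copy_with_loop, copy_with_memcpy and
copies_agree are hypothetical names introduced for illustration; the real
Symbol allocates its body inline with a variable length, which this
fixed-capacity stand-in does not attempt to model.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-capacity stand-in for the Symbol body.
constexpr int kCapacity = 16;

struct ToyBody {
  uint8_t bytes[kCapacity];
};

// Original shape: one store per byte, as in the removed loop.
inline void copy_with_loop(ToyBody& dst, const uint8_t* name, int length) {
  assert(length >= 0 && length <= kCapacity);
  for (int i = 0; i < length; i++) {
    dst.bytes[i] = name[i];
  }
}

// Patched shape: a single memcpy of `length` bytes.
inline void copy_with_memcpy(ToyBody& dst, const uint8_t* name, int length) {
  assert(length >= 0 && length <= kCapacity);
  std::memcpy(dst.bytes, name, static_cast<std::size_t>(length));
}

// Returns true when both copies leave identical bytes in a zeroed body.
inline bool copies_agree(const uint8_t* name, int length) {
  ToyBody a, b;
  std::memset(a.bytes, 0, sizeof(a.bytes));
  std::memset(b.bytes, 0, sizeof(b.bytes));
  copy_with_loop(a, name, length);
  copy_with_memcpy(b, name, length);
  return std::memcmp(a.bytes, b.bytes, sizeof(a.bytes)) == 0;
}
```

In this sketch both forms write exactly `length` bytes, including the
length == 0 case where neither writes anything, so the change should be
behavior-preserving for the copy itself.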

From ioi.lam at oracle.com  Wed Jul  8 18:42:14 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 8 Jul 2020 11:42:14 -0700
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <716bce70-7cbc-98da-5588-e08dffbe21fa@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
 <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
 <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>
 <716bce70-7cbc-98da-5588-e08dffbe21fa@oracle.com>
Message-ID: <399f5689-3181-31ab-cc8b-9f20bd942aab@oracle.com>


On 7/8/20 11:32 AM, Ioi Lam wrote:
>
>
> On 7/8/20 11:04 AM, Kim Barrett wrote:
>>> On Jul 8, 2020, at 8:35 AM, Koichi Sakata 
>>> <sakatakui at oss.nttdata.com> wrote:
>>>
>>> I fixed my patch because it had unnecessary code that was pointed out
>>> before.
>>> I would appreciate it if anyone could sponsor this patch.
>
> I can sponsor the patch. I un-edited the line that assigns _body[0] to 
> minimize the delta.
>
> diff -r c29c9012c0ed src/hotspot/share/oops/symbol.cpp
> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jul 07 23:11:13 2020 -0700
> +++ b/src/hotspot/share/oops/symbol.cpp    Wed Jul 08 11:25:48 2020 -0700
> @@ -52,9 +52,7 @@
>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>    _length = length;
>    _body[0] = 0;  // in case length == 0
> -  for (int i = 0; i < length; i++) {
> -    byte_at_put(i, name[i]);
> -  }
> +  memcpy(_body, name, length);
>  }
>

Here's the webrev. Is everyone OK with it?

8247818: GCC 10 warning stringop-overflow with symbol code
Reviewed-by: kbarrett, iklam
Contributed-by: sakatakui at oss.nttdata.com

http://cr.openjdk.java.net/~iklam/jdk16/8247818-symbol-gcc-warning.v01/

Thanks
- Ioi

>
>> What about this unaddressed comment?
>>
>>> On Jul 7, 2020, at 10:36 AM, Kim Barrett <kim.barrett at oracle.com> 
>>> wrote:
>>> However, the first two elements of _body are used by identity_hash().
>>> That seems like a possible reason to force initialization of both
>>> elements, which currently isn't done for length == 1.  But maybe it
>>> doesn't matter that identity_hash isn't consistent between processes,
>>> in which case forcing the initialization of _body[0] also shouldn't
>>> be needed.
>> I think it's a waste to initialize _body[0] or a bug to not
>> initialize _body[1].
>> I've no idea which.
>>
> I think this should be done in a separate RFE. I filed
> https://bugs.openjdk.java.net/browse/JDK-8249087
> Symbol constructor unnecessarily initializes _body[0]
>
> Thanks
> - Ioi


From igor.ignatyev at oracle.com  Wed Jul  8 19:43:43 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 8 Jul 2020 12:43:43 -0700
Subject: RFR [15] : 8249029: clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_defmeth tests
Message-ID: <9EC87F8D-662E-44B6-9EA1-F798A74D54B8@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
> 750 lines changed: 0 ins; 376 del; 374 mod;

Hi all,

could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_defmeth tests?
from the main issue(8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we first need to identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.

effectively, the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/vm/runtime/defmeth  | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`

testing: :vmTestbase_vm_defmeth on linux-x64
webrev: http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8249029

Thanks,
-- Igor

From kim.barrett at oracle.com  Wed Jul  8 20:13:57 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 8 Jul 2020 16:13:57 -0400
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <399f5689-3181-31ab-cc8b-9f20bd942aab@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
 <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
 <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>
 <716bce70-7cbc-98da-5588-e08dffbe21fa@oracle.com>
 <399f5689-3181-31ab-cc8b-9f20bd942aab@oracle.com>
Message-ID: <5E119BF3-B88D-4A6A-8099-73A89B94634D@oracle.com>

> On Jul 8, 2020, at 2:42 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
> 
> 
> 
> On 7/8/20 11:32 AM, Ioi Lam wrote:
>> 
>> 
>> On 7/8/20 11:04 AM, Kim Barrett wrote:
>>>> On Jul 8, 2020, at 8:35 AM, Koichi Sakata <sakatakui at oss.nttdata.com> wrote:
>>>> 
>>>> I fixed my patch because it had unnecessary code that was pointed out before.
>>>> I would appreciate it if anyone could sponsor this patch.
>> 
>> I can sponsor the patch. I un-edited the line that assigns _body[0] to minimize the delta.
>> 
>> diff -r c29c9012c0ed src/hotspot/share/oops/symbol.cpp
>> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jul 07 23:11:13 2020 -0700
>> +++ b/src/hotspot/share/oops/symbol.cpp    Wed Jul 08 11:25:48 2020 -0700
>> @@ -52,9 +52,7 @@
>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>    _length = length;
>>    _body[0] = 0;  // in case length == 0
>> -  for (int i = 0; i < length; i++) {
>> -    byte_at_put(i, name[i]);
>> -  }
>> +  memcpy(_body, name, length);
>>  }
>> 
> 
> Here's the webrev. Is everyone OK with it?
> 
> 8247818: GCC 10 warning stringop-overflow with symbol code
> Reviewed-by: kbarrett, iklam
> Contributed-by: sakatakui at oss.nttdata.com
> 
> http://cr.openjdk.java.net/~iklam/jdk16/8247818-symbol-gcc-warning.v01/

Looks good.

>> 
>>> What about this unaddressed comment?
>>> 
>>>> On Jul 7, 2020, at 10:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>>> However, the first two elements of _body are used by identity_hash().
>>>> That seems like a possible reason to force initialization of both
>>>> elements, which currently isn't done for length == 1.  But maybe it
>>>> doesn't matter that identity_hash isn't consistent between processes,
>>>> in which case forcing the initialization of _body[0] also shouldn't
>>>> be needed.
>>> I think it's a waste to initialize _body[0] or a bug to not initialize _body[1].
>>> I've no idea which.
>>> 
>> I think this should be done in a separate RFE. I filed
>> https://bugs.openjdk.java.net/browse/JDK-8249087
>> Symbol constructor unnecessarily initializes _body[0]

OK.


From sakatakui at oss.nttdata.com  Thu Jul  9 00:55:35 2020
From: sakatakui at oss.nttdata.com (Koichi Sakata)
Date: Thu, 9 Jul 2020 09:55:35 +0900
Subject: Avoid some GCC 10.X warnings in HotSpot
In-Reply-To: <5E119BF3-B88D-4A6A-8099-73A89B94634D@oracle.com>
References: <00995823-80f2-539d-aeb0-f3751dd43969@oss.nttdata.com>
 <3991a38f-e382-16e4-07d7-b75f1c0347c2@oracle.com>
 <a5efddbf-ab22-2c0d-05e2-da91e34fb61a@oss.nttdata.com>
 <5efde308-dbb6-acc1-3ba9-6d9d2a5e297f@oracle.com>
 <35a0a7ca-9ebc-d563-f434-36ce1064340d@oss.nttdata.com>
 <505f9b5e-e642-7fa4-ef09-ab5860c47ee5@oss.nttdata.com>
 <58B7F53B-6F35-485C-AC33-0577F367FEB1@oracle.com>
 <6FCD96F0-8ECA-4048-A672-2255F0365743@oracle.com>
 <58B4572E-2731-48C2-A944-ED767BDCA57F@oracle.com>
 <9ddcf08f-291a-bc2b-7652-22b32b02ad87@oss.nttdata.com>
 <F53BEF9B-1F39-4F9F-B971-E73400D8FF7E@oracle.com>
 <716bce70-7cbc-98da-5588-e08dffbe21fa@oracle.com>
 <399f5689-3181-31ab-cc8b-9f20bd942aab@oracle.com>
 <5E119BF3-B88D-4A6A-8099-73A89B94634D@oracle.com>
Message-ID: <41bef195-5eab-1004-2c2b-0250aa599c75@oss.nttdata.com>

Thank you for all your help.

Koichi

On 2020/07/09 5:13, Kim Barrett wrote:
>> On Jul 8, 2020, at 2:42 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>
>>
>> On 7/8/20 11:32 AM, Ioi Lam wrote:
>>>
>>>
>>> On 7/8/20 11:04 AM, Kim Barrett wrote:
>>>>> On Jul 8, 2020, at 8:35 AM, Koichi Sakata <sakatakui at oss.nttdata.com> wrote:
>>>>>
>>>>> I fixed my patch because it had unnecessary code that was pointed out before.
>>>>> I would appreciate it if anyone could sponsor this patch.
>>>
>>> I can sponsor the patch. I un-edited the line that assigns _body[0] to minimize the delta.
>>>
>>> diff -r c29c9012c0ed src/hotspot/share/oops/symbol.cpp
>>> --- a/src/hotspot/share/oops/symbol.cpp    Tue Jul 07 23:11:13 2020 -0700
>>> +++ b/src/hotspot/share/oops/symbol.cpp    Wed Jul 08 11:25:48 2020 -0700
>>> @@ -52,9 +52,7 @@
>>>     _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>     _length = length;
>>>     _body[0] = 0;  // in case length == 0
>>> -  for (int i = 0; i < length; i++) {
>>> -    byte_at_put(i, name[i]);
>>> -  }
>>> +  memcpy(_body, name, length);
>>>   }
>>>
>>
>> Here's the webrev. Is everyone OK with it?
>>
>> 8247818: GCC 10 warning stringop-overflow with symbol code
>> Reviewed-by: kbarrett, iklam
>> Contributed-by: sakatakui at oss.nttdata.com
>>
>> http://cr.openjdk.java.net/~iklam/jdk16/8247818-symbol-gcc-warning.v01/
> 
> Looks good.
> 
>>>
>>>> What about this unaddressed comment?
>>>>
>>>>> On Jul 7, 2020, at 10:36 AM, Kim Barrett <kim.barrett at oracle.com> wrote:
>>>>> However, the first two elements of _body are used by identity_hash().
>>>>> That seems like a possible reason to force initialization of both
>>>>> elements, which currently isn't done for length == 1.  But maybe it
>>>>> doesn't matter that identity_hash isn't consistent between processes,
>>>>> in which case forcing the initialization of _body[0] also shouldn't
>>>>> be needed.
>>>> I think it's a waste to initialize _body[0] or a bug to not initialize _body[1].
>>>> I've no idea which.
>>>>
>>> I think this should be done in a separate RFE. I filed
>>> https://bugs.openjdk.java.net/browse/JDK-8249087
>>> Symbol constructor unnecessarily initializes _body[0]
> 
> OK.
> 

From shade at redhat.com  Thu Jul  9 06:36:15 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 9 Jul 2020 08:36:15 +0200
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
Message-ID: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8249137

CollectedHeap::obj_size was added by JDK-8211270 to support old-style Shenandoah, which needed a separate
fwdptr slot. After JDK-8224584 Shenandoah no longer needs this. Additionally, CH::obj_size may disagree with other
code that pokes at layout helper directly, for example GraphKit::new_instance.

This also avoids a virtual call on some paths, although those paths are not very performance-sensitive.

The patch is a simple series of few-liners:

diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
--- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
+++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
@@ -578,6 +578,2 @@

-size_t CollectedHeap::obj_size(oop obj) const {
-  return obj->size();
-}
-
 uint32_t CollectedHeap::hash_oop(oop obj) const {
diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
--- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
+++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
@@ -495,4 +495,2 @@

-  virtual size_t obj_size(oop obj) const;
-
   // Non product verification and debugging.
diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
--- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
+++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
@@ -488,3 +488,3 @@
   NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
-  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
+  *size_ptr = (jlong)mirror->size() * wordSize;
   return JVMTI_ERROR_NONE;
diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
--- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
+++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
@@ -1067,3 +1067,3 @@
      _jobj = (jobject)to_jobject(obj);
-     _size = Universe::heap()->obj_size(obj) * wordSize;
+     _size = obj->size() * wordSize;
    };
diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
--- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
+++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
@@ -389,3 +389,3 @@
   oop p = JNIHandles::resolve(obj);
-  return Universe::heap()->obj_size(p) * HeapWordSize;
+  return p->size() * HeapWordSize;
 WB_END

Testing: tier{1,2}; jdk-submit (running)

-- 
Thanks,
-Aleksey
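
The shape of the change can be sketched in miniature as follows. ToyObject,
ToyHeap, kWordSize and the two helper functions are hypothetical names for
illustration, not HotSpot code; the sketch only assumes what the diff above
shows, namely that the default obj_size implementation forwards to the
object's own size().

```cpp
#include <cstddef>

// Hypothetical stand-in for an oop that knows its size in words.
struct ToyObject {
  std::size_t words;
  std::size_t size() const { return words; }
};

// Hypothetical stand-in for the heap with the virtual hook being removed.
struct ToyHeap {
  virtual ~ToyHeap() {}
  // Before: callers went through a virtual method that just forwards.
  virtual std::size_t obj_size(const ToyObject* obj) const {
    return obj->size();
  }
};

// Assumed word size for the sketch: one pointer per word.
const std::size_t kWordSize = sizeof(void*);

// Call site before the patch: virtual dispatch through the heap.
inline std::size_t size_in_bytes_before(const ToyHeap& heap,
                                        const ToyObject* obj) {
  return heap.obj_size(obj) * kWordSize;
}

// Call site after the patch: ask the object directly.
inline std::size_t size_in_bytes_after(const ToyObject* obj) {
  return obj->size() * kWordSize;
}
```

As long as no heap overrides the hook, both call sites return the same
value, and the direct form cannot drift away from other code that computes
sizes without consulting the heap.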


From shade at redhat.com  Thu Jul  9 08:25:19 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 9 Jul 2020 10:25:19 +0200
Subject: RFR (XXS/T) 8249141: Fix indent in java_lang_Record definition in
 vmSymbols.hpp
Message-ID: <56883f17-2069-b6e7-419a-80213e369c32@redhat.com>

Trivial thing:
  https://bugs.openjdk.java.net/browse/JDK-8249141

Noticed this while adding another definition there. Seems cleaner to adjust the indent separately:
  https://cr.openjdk.java.net/~shade/8249141/webrev.01/

Testing: builds; nothing else

-- 
Thanks,
-Aleksey


From david.holmes at oracle.com  Thu Jul  9 08:52:09 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 9 Jul 2020 18:52:09 +1000
Subject: RFR (XXS/T) 8249141: Fix indent in java_lang_Record definition in
 vmSymbols.hpp
In-Reply-To: <56883f17-2069-b6e7-419a-80213e369c32@redhat.com>
References: <56883f17-2069-b6e7-419a-80213e369c32@redhat.com>
Message-ID: <5616a1d1-8f48-0a75-7d64-13f819a853b2@oracle.com>

Looks good and trivial. But I would have just fixed it while in the area as 
this adds a lot of process overhead.

Cheers,
David

On 9/07/2020 6:25 pm, Aleksey Shipilev wrote:
> Trivial thing:
>    https://bugs.openjdk.java.net/browse/JDK-8249141
> 
> Noticed this while adding another definition there. Seems cleaner to adjust the indent separately:
>    https://cr.openjdk.java.net/~shade/8249141/webrev.01/
> 
> Testing: builds; nothing else
> 

From rkennke at redhat.com  Thu Jul  9 08:55:04 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 09 Jul 2020 10:55:04 +0200
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
Message-ID: <b544098a92fbdcaba1273860bccc76f4269768f3.camel@redhat.com>

Looks good to me!

Thanks,
Roman

> > 
> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8249137
> 
> It was added by JDK-8211270 to support old-style Shenandoah that
> needed a separate fwdptr slot.
> After JDK-8224584 it does not need this anymore. Additionally,
> CH::obj_size may disagree with other
> code that pokes at layout helper directly, for example
> GraphKit::new_instance.
> 
> This also avoids a virtual call on some paths, although those paths
> are not very performance-sensitive.
> 
> The patch is a simple series of few-liners:
> 
> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
> --- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
> @@ -578,6 +578,2 @@
> 
> -size_t CollectedHeap::obj_size(oop obj) const {
> -  return obj->size();
> -}
> -
>  uint32_t CollectedHeap::hash_oop(oop obj) const {
> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
> --- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
> @@ -495,4 +495,2 @@
> 
> -  virtual size_t obj_size(oop obj) const;
> -
>    // Non product verification and debugging.
> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
> --- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
> @@ -488,3 +488,3 @@
>    NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
> -  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
> +  *size_ptr = (jlong)mirror->size() * wordSize;
>    return JVMTI_ERROR_NONE;
> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
> --- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
> @@ -1067,3 +1067,3 @@
>       _jobj = (jobject)to_jobject(obj);
> -     _size = Universe::heap()->obj_size(obj) * wordSize;
> +     _size = obj->size() * wordSize;
>     };
> diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
> --- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
> @@ -389,3 +389,3 @@
>    oop p = JNIHandles::resolve(obj);
> -  return Universe::heap()->obj_size(p) * HeapWordSize;
> +  return p->size() * HeapWordSize;
>  WB_END
> 
> Testing: tier{1,2}; jdk-submit (running)
> 


From shade at redhat.com  Thu Jul  9 10:27:48 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 9 Jul 2020 12:27:48 +0200
Subject: RFR (XXS/T) 8249141: Fix indent in java_lang_Record definition in
 vmSymbols.hpp
In-Reply-To: <5616a1d1-8f48-0a75-7d64-13f819a853b2@oracle.com>
References: <56883f17-2069-b6e7-419a-80213e369c32@redhat.com>
 <5616a1d1-8f48-0a75-7d64-13f819a853b2@oracle.com>
Message-ID: <cfb79b55-95b1-f428-4154-db48b377802b@redhat.com>

On 7/9/20 10:52 AM, David Holmes wrote:
> Looks good and trivial. But I would have just fixed it while in the area as 
> this adds a lot of process overhead.

Thanks, pushed. Process overhead for trivial one-liners does not bother me, and it makes subsequent
patches squeaky clean :)

-- 
Thanks,
-Aleksey


From daniel.daugherty at oracle.com  Thu Jul  9 13:34:15 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 9 Jul 2020 09:34:15 -0400
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
Message-ID: <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>

Adding serviceability-dev at ... since a couple of JVM/TI files are changed
in this RFR. Also, I moved the bug from hotspot/runtime -> hotspot/gc.

Dan


On 7/9/20 2:36 AM, Aleksey Shipilev wrote:
> RFE:
>    https://bugs.openjdk.java.net/browse/JDK-8249137
>
> It was added by JDK-8211270 to support old-style Shenandoah that needed a separate fwdptr slot.
> After JDK-8224584 it does not need this anymore. Additionally, CH::obj_size may disagree with other
> code that pokes at layout helper directly, for example GraphKit::new_instance.
>
> This also avoids a virtual call on some paths, although those paths are not very performance-sensitive.
>
> The patch is a simple series of few-liners:
>
> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
> --- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
> @@ -578,6 +578,2 @@
>
> -size_t CollectedHeap::obj_size(oop obj) const {
> -  return obj->size();
> -}
> -
>   uint32_t CollectedHeap::hash_oop(oop obj) const {
> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
> --- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
> @@ -495,4 +495,2 @@
>
> -  virtual size_t obj_size(oop obj) const;
> -
>     // Non product verification and debugging.
> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
> --- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
> @@ -488,3 +488,3 @@
>     NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
> -  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
> +  *size_ptr = (jlong)mirror->size() * wordSize;
>     return JVMTI_ERROR_NONE;
> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
> --- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
> @@ -1067,3 +1067,3 @@
>        _jobj = (jobject)to_jobject(obj);
> -     _size = Universe::heap()->obj_size(obj) * wordSize;
> +     _size = obj->size() * wordSize;
>      };
> diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
> --- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
> +++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
> @@ -389,3 +389,3 @@
>     oop p = JNIHandles::resolve(obj);
> -  return Universe::heap()->obj_size(p) * HeapWordSize;
> +  return p->size() * HeapWordSize;
>   WB_END
>
> Testing: tier{1,2}; jdk-submit (running)
>


From luhenry at microsoft.com  Thu Jul  9 13:55:15 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 9 Jul 2020 13:55:15 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
Message-ID: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>

Hello,

As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.

The three areas I have found are:
- Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
- Exception handling: there is some code repetition which, even if functional, is subpar.
- Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
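
To make the atomics point concrete, here is a minimal sketch (not taken from the webrevs) of a compare-and-exchange wrapper that defers to a compiler intrinsic instead of hand-written assembly. The helper name cmpxchg_long is hypothetical, and the GCC/Clang branch is an assumption added only so the sketch also builds off Windows.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper, not HotSpot's Atomic::cmpxchg: atomically stores
// `exchange` into *dest if *dest equals `compare`, and returns the value
// that *dest held before the call.
inline long cmpxchg_long(volatile long* dest, long compare, long exchange) {
#if defined(_MSC_VER)
  // MSVC intrinsic; note the (destination, exchange, comparand) order.
  return _InterlockedCompareExchange(dest, exchange, compare);
#else
  // GCC/Clang builtin, used here only as a non-Windows fallback.
  long expected = compare;
  __atomic_compare_exchange_n(dest, &expected, exchange, /*weak=*/false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;  // the old value, whether or not the exchange happened
#endif
}
```

A caller compares the return value with `compare` to learn whether the exchange took place.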

I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.

JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
Webrevs:
http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/
http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/

Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.

Thank you,

--
Ludovic

From frederic.parain at oracle.com  Thu Jul  9 15:04:07 2020
From: frederic.parain at oracle.com (Frederic Parain)
Date: Thu, 9 Jul 2020 11:04:07 -0400
Subject: RFR: 8249149 Remove obsolete UseNewFieldLayout option and associated
 code
Message-ID: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>

Please review this patch removing the old field layout code
that was deprecated in JDK15.

CR: https://bugs.openjdk.java.net/browse/JDK-8249149

Webrev: http://cr.openjdk.java.net/~fparain/8249149/webrev.00/index.html

Tested with tier 1 to 3.

Thank you,

Fred


From harold.seigel at oracle.com  Thu Jul  9 15:34:12 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 9 Jul 2020 11:34:12 -0400
Subject: RFR: 8249149 Remove obsolete UseNewFieldLayout option and
 associated code
In-Reply-To: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>
References: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>
Message-ID: <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>

Hi Fred,

The changes look good.

Thanks, Harold

On 7/9/2020 11:04 AM, Frederic Parain wrote:
> Please review this patch removing the old field layout code
> that was deprecated in JDK15.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8249149
>
> Webrev: http://cr.openjdk.java.net/~fparain/8249149/webrev.00/index.html
>
> Tested with tier 1 to 3.
>
> Thank you,
>
> Fred
>

From yumin.qi at oracle.com  Thu Jul  9 15:43:59 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 9 Jul 2020 08:43:59 -0700
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region() does not properly update
 summary counting
In-Reply-To: <91358426-2d18-e57e-cdb8-25972c436b28@redhat.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
 <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
 <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
 <91358426-2d18-e57e-cdb8-25972c436b28@redhat.com>
Message-ID: <0cdac550-7a7e-71fc-220d-d9aa9cf8f5d0@oracle.com>

Hi, Zhengyu

  Looks good to me!


Thanks

Yumin

On 7/7/20 10:13 AM, Zhengyu Gu wrote:
> The change is no longer trivial, may I get a second review?
>
> Thanks,
>
> -Zhengyu
>
> On 7/7/20 9:34 AM, Thomas Stüfe wrote:
>> Hi Zhengyu,
>>
>> okay, I get it now. Thank you. Reviewed from my side.
>>
>> Cheers, Thomas
>>
>> On Tue, Jul 7, 2020 at 1:53 PM Zhengyu Gu <zgu at redhat.com> wrote:
>>
>>     Hi Thomas,
>>
>>     On 7/4/20 1:35 AM, Thomas Stüfe wrote:
>>      > Hi Zhengyu,
>>      >
>>      > sorry for the wait.
>>
>>     No problem.
>>
>>      >
>>      > This looks good, but why did you remove the "!same region" condition? I
>>      > believe that is needed for the case when CDS' first mapping encounters
>>      > errors, so before it rebuilds CDS at another location it removes (a) all
>>      > mappings and then (b) the enclosing reservation. (a) should be ignored
>>      > by NMT but (b) should not. I may be wrong, I have had no coffee yet.
>>      > Cheers, Thomas
>>
>>     In new version, same region is handled in line #470
>>
>>       470   if (reserved_rgn->same_region(addr, size)) {
>>       471     return remove_released_region(reserved_rgn);
>>       472   }
>>
>>     so, !same region is always true.
>>
>>     Thanks,
>>
>>     -Zhengyu
>>
>>      >
>>      >     Thanks,
>>      >
>>      >     -Zhengyu
>>      >
>>      >      >
>>      >      > But if you are still unconvinced, I won't hold you up. The change
>>      >      > certainly works as it is now and is okay for me, it just would
>>      >      > not be my preferred solution.
>>      >      >
>>      >      > Cheers, Thomas
>>      >      >
>>      >      >     Thanks,
>>      >      >
>>      >      >     -Zhengyu
>>      >      >
>>      >      >      >
>>      >      >      >     Thanks, Thomas
>>      >      >      >
>>      >      >      >     On Fri, Jun 26, 2020 at 10:54 PM Zhengyu Gu
>>      >      >      >     <zgu at redhat.com> wrote:
>>      >      >      >
>>      >      >      >         Hi,
>>      >      >      >
>>      >      >      >         Please review this trivial patch that fixes summary counting in
>>      >      >      >         VirtualMemoryTracker::split_reserved_region().
>>      >      >      >
>>      >      >      >         The method uses internal method to remove a reserved region,
>>      >      >      >         which does not update counting information. It should use
>>      >      >      >         high level tracking method
>>      >      >      >         VirtualMemoryTracker::remove_released_region() instead.
>>      >      >      >
>>      >      >      >         Without patch, NMT summary reports uncategorized memory, e.g.
>>      >      >      >
>>      >      >      >         -  Unknown (reserved=1060416KB, committed=0KB)
>>      >      >      >            (mmap: reserved=1060416KB, committed=0KB)
>>      >      >      >
>>      >      >      >         Bug: https://bugs.openjdk.java.net/browse/JDK-8248426
>>      >      >      >         Webrev: http://cr.openjdk.java.net/~zgu/JDK-8248426/webrev.00/
>>      >      >      >
>>      >      >      >         Test:
>>      >      >      >             hotspot_nmt
>>      >      >      >             Submit test in progress
>>      >      >      >
>>      >      >      >         Thanks,
>>      >      >      >
>>      >      >      >         -Zhengyu
>>      >      >      >
>>      >      >
>>      >
>>
>

From zgu at redhat.com  Thu Jul  9 17:20:07 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 9 Jul 2020 13:20:07 -0400
Subject: [16] RFR(T) 8248426: NMT:
 VirtualMemoryTracker::split_reserved_region() does not properly update
 summary counting
In-Reply-To: <0cdac550-7a7e-71fc-220d-d9aa9cf8f5d0@oracle.com>
References: <9d25bba3-f651-a8bd-aab0-3f561c262b37@redhat.com>
 <CAA-vtUzod1A4xikHuKjMX9TO7ctzH-C=OFQJgKou3OfCaDz-_g@mail.gmail.com>
 <CAA-vtUyjXL0dCOUjzn+T-41c_VYBJkPJx2PCt6_qEvze9muA4A@mail.gmail.com>
 <e9a27b65-da90-a8db-3953-0c235acd4bf0@redhat.com>
 <CAA-vtUzM0f0+LeYDD9JW6FP-0n18+7F9_C-AZZMWPL0L8STE_A@mail.gmail.com>
 <75c254fe-64dc-96ab-4a7b-7975bbfb29a6@redhat.com>
 <CAA-vtUyoZuRBkMkMmpdQuK+TUZmU6Op8DWHfWG8YZniAhWPK2Q@mail.gmail.com>
 <f80a519d-e0f1-a1c7-6e2c-77a86c14bae1@redhat.com>
 <CAA-vtUzd0kOEY5a1-zskhn6WjNt_t8ZeFgkEGJEF6=Cy+woATA@mail.gmail.com>
 <91358426-2d18-e57e-cdb8-25972c436b28@redhat.com>
 <0cdac550-7a7e-71fc-220d-d9aa9cf8f5d0@oracle.com>
Message-ID: <036fae2e-76d9-b8a8-eb82-6c78d2ae7282@redhat.com>

Thanks, Yumin.

-Zhengyu

On 7/9/20 11:43 AM, Yumin Qi wrote:
> Hi, Zhengyu
> 
>   Looks good to me!
> 
> 
> Thanks
> 
> Yumin
> 
> On 7/7/20 10:13 AM, Zhengyu Gu wrote:
>> The change is no longer trivial, may I get a second review?
>>
>> Thanks,
>>
>> -Zhengyu
>>
>> On 7/7/20 9:34 AM, Thomas Stüfe wrote:
>>> Hi Zhengyu,
>>>
>>> okay, I get it now. Thank you. Reviewed from my side.
>>>
>>> Cheers, Thomas
>>>
>>> On Tue, Jul 7, 2020 at 1:53 PM Zhengyu Gu <zgu at redhat.com> wrote:
>>>
>>>     Hi Thomas,
>>>
>>>     On 7/4/20 1:35 AM, Thomas Stüfe wrote:
>>>      > Hi Zhengyu,
>>>      >
>>>      > sorry for the wait.
>>>
>>>     No problem.
>>>
>>>      >
>>>      > This looks good, but why did you remove the "!same region" condition? I
>>>      > believe that is needed for the case when CDS' first mapping encounters
>>>      > errors, so before it rebuilds CDS at another location it removes (a) all
>>>      > mappings and then (b) the enclosing reservation. (a) should be ignored
>>>      > by NMT but (b) should not. I may be wrong, I have had no coffee yet.
>>>      > Cheers, Thomas
>>>
>>>     In new version, same region is handled in line #470
>>>
>>>       470   if (reserved_rgn->same_region(addr, size)) {
>>>       471     return remove_released_region(reserved_rgn);
>>>       472   }
>>>
>>>     so, !same region is always true.
>>>
>>>     Thanks,
>>>
>>>     -Zhengyu
>>>
>>>      >
>>>      >     Thanks,
>>>      >
>>>      >     -Zhengyu
>>>      >
>>>      >      >
>>>      >      > But if you are still unconvinced, I won't hold you up. The change
>>>      >      > certainly works as it is now and is okay for me, it just would
>>>      >      > not be my preferred solution.
>>>      >      >
>>>      >      > Cheers, Thomas
>>>      >      >
>>>      >      >     Thanks,
>>>      >      >
>>>      >      >     -Zhengyu
>>>      >      >
>>>      >      >      >
>>>      >      >      >     Thanks, Thomas
>>>      >      >      >
>>>      >      >      >     On Fri, Jun 26, 2020 at 10:54 PM Zhengyu Gu
>>>      >      >      >     <zgu at redhat.com> wrote:
>>>      >      >      >
>>>      >      >      >         Hi,
>>>      >      >      >
>>>      >      >      >         Please review this trivial patch that fixes summary counting in
>>>      >      >      >         VirtualMemoryTracker::split_reserved_region().
>>>      >      >      >
>>>      >      >      >         The method uses internal method to remove a reserved region,
>>>      >      >      >         which does not update counting information. It should use
>>>      >      >      >         high level tracking method
>>>      >      >      >         VirtualMemoryTracker::remove_released_region() instead.
>>>      >      >      >
>>>      >      >      >         Without patch, NMT summary reports uncategorized memory, e.g.
>>>      >      >      >
>>>      >      >      >         -  Unknown (reserved=1060416KB, committed=0KB)
>>>      >      >      >            (mmap: reserved=1060416KB, committed=0KB)
>>>      >      >      >
>>>      >      >      >         Bug: https://bugs.openjdk.java.net/browse/JDK-8248426
>>>      >      >      >         Webrev: http://cr.openjdk.java.net/~zgu/JDK-8248426/webrev.00/
>>>      >      >      >
>>>      >      >      >         Test:
>>>      >      >      >             hotspot_nmt
>>>      >      >      >             Submit test in progress
>>>      >      >      >
>>>      >      >      >         Thanks,
>>>      >      >      >
>>>      >      >      >         -Zhengyu
>>>      >      >      >
>>>      >      >
>>>      >
>>>
>>


From ioi.lam at oracle.com  Thu Jul  9 20:53:04 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 9 Jul 2020 13:53:04 -0700
Subject: RFR: 8249149 Remove obsolete UseNewFieldLayout option and
 associated code
In-Reply-To: <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>
References: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>
 <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>
Message-ID: <f0e45852-2441-0e4f-331d-4ba73eb7ca72@oracle.com>

LGTM.

Thanks
- Ioi

On 7/9/20 8:34 AM, Harold Seigel wrote:
> Hi Fred,
>
> The changes look good.
>
> Thanks, Harold
>
> On 7/9/2020 11:04 AM, Frederic Parain wrote:
>> Please review this patch removing the old field layout code
>> that was deprecated in JDK15.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8249149
>>
>> Webrev: http://cr.openjdk.java.net/~fparain/8249149/webrev.00/index.html
>>
>> Tested with tier 1 to 3.
>>
>> Thank you,
>>
>> Fred
>>


From luhenry at microsoft.com  Thu Jul  9 22:19:33 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 9 Jul 2020 22:19:33 +0000
Subject: RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding
 memory model
Message-ID: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>

Hello,

This small fix is in the context of the larger support for Windows-AArch64. I am using Bernhard Urban's CR as I am currently not an author.

ThreadCritical is used to synchronize the allocation of new Arena chunks. However, on platforms with weaker memory models than x86 (primarily ARM), the original ThreadCritical initialization code would be racy, leading to crashes. To fix that, we switch to initializing the ThreadCritical static data by using a functionally-sound Win32 API focused on initialization [1]. This approach also has the advantage of simplifying the code and bringing it closer to how it is done on Linux.
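
For readers unfamiliar with the issue, the following is a portable sketch of the ordering that one-time initialization needs on a weakly ordered CPU. It is not the webrev's code, which relies on the Win32 InitOnce API [1] to provide the equivalent guarantee; the names g_state, g_init_count and ensure_initialized are hypothetical and exist only for this illustration.

```cpp
#include <atomic>

enum InitState { UNINITIALIZED, IN_PROGRESS, DONE };

static std::atomic<int> g_state{UNINITIALIZED};
static int g_init_count = 0;  // stands in for the static data being set up

void ensure_initialized() {
  // The acquire load pairs with the release store below, so a thread that
  // observes DONE also observes every write made during initialization.
  if (g_state.load(std::memory_order_acquire) == DONE) return;

  int expected = UNINITIALIZED;
  if (g_state.compare_exchange_strong(expected, IN_PROGRESS,
                                      std::memory_order_acquire)) {
    ++g_init_count;  // the one-time setup happens here
    g_state.store(DONE, std::memory_order_release);
  } else {
    // Another thread won the race; wait until it publishes DONE.
    while (g_state.load(std::memory_order_acquire) != DONE) { /* spin */ }
  }
}
```

Without the release/acquire pairing, x86's strong ordering can hide the problem, while a weaker model may let another thread see the "initialized" flag before the initialized data.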

JBS: https://bugs.openjdk.java.net/browse/JDK-8248657
Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/
Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Windows-x86_64, no regressions

Thank you,

--
Ludovic

[1] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initonceinitialize

From shade at redhat.com  Fri Jul 10 08:37:51 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 10 Jul 2020 10:37:51 +0200
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
 <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>
Message-ID: <a2432ff6-79b2-aefc-0450-d1c2b3cabd22@redhat.com>

Okay, thanks.

I already have 2 reviewers (rkennke, tschatzl), do I need more specifically from serviceability-dev@?

-Aleksey

On 7/9/20 3:34 PM, Daniel D. Daugherty wrote:
> Adding serviceability-dev at ... since a couple of JVM/TI files are changed
> in this RFR. Also, I moved the bug from hotspot/runtime -> hotspot/gc.
> 
> Dan
> 
> 
> On 7/9/20 2:36 AM, Aleksey Shipilev wrote:
>> RFE:
>>    https://bugs.openjdk.java.net/browse/JDK-8249137
>>
>> It was added by JDK-8211270 to support old-style Shenandoah that needed a separate fwdptr slot.
>> After JDK-8224584 it does not need this anymore. Additionally, CH::obj_size may disagree with other
>> code that pokes at layout helper directly, for example GraphKit::new_instance.
>>
>> This also avoids a virtual call on some paths, although those paths are not very performance-sensitive.
>>
>> The patch is a simple series of few-liners:
>>
>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
>> --- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
>> +++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
>> @@ -578,6 +578,2 @@
>>
>> -size_t CollectedHeap::obj_size(oop obj) const {
>> -  return obj->size();
>> -}
>> -
>>   uint32_t CollectedHeap::hash_oop(oop obj) const {
>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
>> --- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
>> +++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
>> @@ -495,4 +495,2 @@
>>
>> -  virtual size_t obj_size(oop obj) const;
>> -
>>     // Non product verification and debugging.
>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
>> --- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
>> +++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
>> @@ -488,3 +488,3 @@
>>     NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
>> -  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
>> +  *size_ptr = (jlong)mirror->size() * wordSize;
>>     return JVMTI_ERROR_NONE;
>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
>> --- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
>> +++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
>> @@ -1067,3 +1067,3 @@
>>        _jobj = (jobject)to_jobject(obj);
>> -     _size = Universe::heap()->obj_size(obj) * wordSize;
>> +     _size = obj->size() * wordSize;
>>      };
>> diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
>> --- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
>> +++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
>> @@ -389,3 +389,3 @@
>>     oop p = JNIHandles::resolve(obj);
>> -  return Universe::heap()->obj_size(p) * HeapWordSize;
>> +  return p->size() * HeapWordSize;
>>   WB_END
>>
>> Testing: tier{1,2}; jdk-submit (running)
>>
> 


-- 
Thanks,
-Aleksey


From frederic.parain at oracle.com  Fri Jul 10 12:49:03 2020
From: frederic.parain at oracle.com (Frederic Parain)
Date: Fri, 10 Jul 2020 08:49:03 -0400
Subject: RFR: 8249149 Remove obsolete UseNewFieldLayout option and
 associated code
In-Reply-To: <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>
References: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>
 <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>
Message-ID: <A69E086C-8A48-46F2-9ED3-DCEF014BB7CF@oracle.com>

Thanks Harold!

Fred

> On Jul 9, 2020, at 11:34, Harold Seigel <harold.seigel at oracle.com> wrote:
> 
> Hi Fred,
> 
> The changes look good.
> 
> Thanks, Harold
> 
> On 7/9/2020 11:04 AM, Frederic Parain wrote:
>> Please review this patch removing the old field layout code
>> that was deprecated in JDK15.
>> 
>> CR: https://bugs.openjdk.java.net/browse/JDK-8249149
>> 
>> Webrev: http://cr.openjdk.java.net/~fparain/8249149/webrev.00/index.html
>> 
>> Tested with tier 1 to 3.
>> 
>> Thank you,
>> 
>> Fred
>> 


From frederic.parain at oracle.com  Fri Jul 10 12:49:18 2020
From: frederic.parain at oracle.com (Frederic Parain)
Date: Fri, 10 Jul 2020 08:49:18 -0400
Subject: RFR: 8249149 Remove obsolete UseNewFieldLayout option and
 associated code
In-Reply-To: <f0e45852-2441-0e4f-331d-4ba73eb7ca72@oracle.com>
References: <AB74C263-6F9C-4660-A302-8469E13E38C7@oracle.com>
 <8aefdca2-76b5-5bf2-f1f5-2b62df8af826@oracle.com>
 <f0e45852-2441-0e4f-331d-4ba73eb7ca72@oracle.com>
Message-ID: <4E570507-7D13-4872-AA80-87B29147955C@oracle.com>

Thanks Ioi!

Fred


> On Jul 9, 2020, at 16:53, Ioi Lam <ioi.lam at oracle.com> wrote:
> 
> LGTM.
> 
> Thanks
> - Ioi
> 
> On 7/9/20 8:34 AM, Harold Seigel wrote:
>> Hi Fred,
>> 
>> The changes look good.
>> 
>> Thanks, Harold
>> 
>> On 7/9/2020 11:04 AM, Frederic Parain wrote:
>>> Please review this patch removing the old field layout code
>>> that was deprecated in JDK15.
>>> 
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8249149
>>> 
>>> Webrev: http://cr.openjdk.java.net/~fparain/8249149/webrev.00/index.html
>>> 
>>> Tested with tier 1 to 3.
>>> 
>>> Thank you,
>>> 
>>> Fred
>>> 
> 


From daniel.daugherty at oracle.com  Fri Jul 10 15:07:01 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 10 Jul 2020 11:07:01 -0400
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <a2432ff6-79b2-aefc-0450-d1c2b3cabd22@redhat.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
 <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>
 <a2432ff6-79b2-aefc-0450-d1c2b3cabd22@redhat.com>
Message-ID: <99b679ec-fcae-6f30-1186-1cf8de809c6f@oracle.com>

On 7/10/20 4:37 AM, Aleksey Shipilev wrote:
> Okay, thanks.
>
> I already have 2 reviewers (rkennke, tschatzl), do I need more specifically from serviceability-dev@?

Since you're touching the Serviceability team's code, it would be
polite to wait for a review...

Dan


>
> -Aleksey
>
> On 7/9/20 3:34 PM, Daniel D. Daugherty wrote:
>> Adding serviceability-dev at ... since a couple of JVM/TI files are changed
>> in this RFR. Also, I moved the bug from hotspot/runtime -> hotspot/gc.
>>
>> Dan
>>
>>
>> On 7/9/20 2:36 AM, Aleksey Shipilev wrote:
>>> RFE:
>>>     https://bugs.openjdk.java.net/browse/JDK-8249137
>>>
>>> It was added by JDK-8211270 to support old-style Shenandoah that needed a separate fwdptr slot.
>>> After JDK-8224584 it does not need this anymore. Additionally, CH::obj_size may disagree with other
>>> code that pokes at layout helper directly, for example GraphKit::new_instance.
>>>
>>> This also avoids a virtual call on some paths, although those paths are not very performance-sensitive.
>>>
>>> The patch is a simple series of few-liners:
>>>
>>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
>>> --- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
>>> +++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
>>> @@ -578,6 +578,2 @@
>>>
>>> -size_t CollectedHeap::obj_size(oop obj) const {
>>> -  return obj->size();
>>> -}
>>> -
>>>    uint32_t CollectedHeap::hash_oop(oop obj) const {
>>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
>>> --- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
>>> +++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
>>> @@ -495,4 +495,2 @@
>>>
>>> -  virtual size_t obj_size(oop obj) const;
>>> -
>>>      // Non product verification and debugging.
>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
>>> --- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
>>> +++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
>>> @@ -488,3 +488,3 @@
>>>      NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
>>> -  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
>>> +  *size_ptr = (jlong)mirror->size() * wordSize;
>>>      return JVMTI_ERROR_NONE;
>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
>>> --- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
>>> +++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
>>> @@ -1067,3 +1067,3 @@
>>>         _jobj = (jobject)to_jobject(obj);
>>> -     _size = Universe::heap()->obj_size(obj) * wordSize;
>>> +     _size = obj->size() * wordSize;
>>>       };
>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
>>> --- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
>>> +++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
>>> @@ -389,3 +389,3 @@
>>>      oop p = JNIHandles::resolve(obj);
>>> -  return Universe::heap()->obj_size(p) * HeapWordSize;
>>> +  return p->size() * HeapWordSize;
>>>    WB_END
>>>
>>> Testing: tier{1,2}; jdk-submit (running)
>>>
>


From calvin.cheung at oracle.com  Fri Jul 10 16:39:52 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 10 Jul 2020 09:39:52 -0700
Subject: RFR(S): 8246308: Reference count for PackageEntry::name may be
 incorrectly decremented
Message-ID: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>

JBS: https://bugs.openjdk.java.net/browse/JDK-8246308

webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8246308/webrev.00/

Please refer to the bug report for a description of the problem.
The proposed change also fixes a similar problem in 
systemDictionaryShared.cpp.

Passed tier1,2 tests.

thanks,

Calvin


From yumin.qi at oracle.com  Fri Jul 10 16:42:04 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Fri, 10 Jul 2020 09:42:04 -0700
Subject: RFR(S): 8246308: Reference count for PackageEntry::name may be
 incorrectly decremented
In-Reply-To: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
References: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
Message-ID: <316aefab-020b-2ce1-0f1b-9effd6e75a48@oracle.com>

Hi, Calvin

  Looks good to me!


Thanks

Yumin

On 7/10/20 9:39 AM, Calvin Cheung wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8246308
>
> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8246308/webrev.00/
>
> Please refer to the bug report for a description of the problem.
> The proposed change also fixes a similar problem in 
> systemDictionaryShared.cpp.
>
> Passed tier1,2 tests.
>
> thanks,
>
> Calvin
>
>

From calvin.cheung at oracle.com  Fri Jul 10 21:07:30 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 10 Jul 2020 14:07:30 -0700
Subject: RFR(S): 8246308: Reference count for PackageEntry::name may be
 incorrectly decremented
In-Reply-To: <316aefab-020b-2ce1-0f1b-9effd6e75a48@oracle.com>
References: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
 <316aefab-020b-2ce1-0f1b-9effd6e75a48@oracle.com>
Message-ID: <6fc3ea5c-ee34-a7f2-03c6-088533f2fbfd@oracle.com>

Thanks Yumin!

On 7/10/20 9:42 AM, Yumin Qi wrote:
> Hi, Calvin
>
>   Looks good to me!
>
>
> Thanks
>
> Yumin
>
> On 7/10/20 9:39 AM, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246308
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8246308/webrev.00/
>>
>> Please refer to the bug report for a description of the problem.
>> The proposed change also fixes a similar problem in 
>> systemDictionaryShared.cpp.
>>
>> Passed tier1,2 tests.
>>
>> thanks,
>>
>> Calvin
>>
>>

From ioi.lam at oracle.com  Fri Jul 10 21:30:54 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 10 Jul 2020 14:30:54 -0700
Subject: RFR(S): 8246308: Reference count for PackageEntry::name may be
 incorrectly decremented
In-Reply-To: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
References: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
Message-ID: <bf53d3b8-7cd8-8c9c-e9d4-6d9c46a3036e@oracle.com>

Looks good to me.

Thanks
- Ioi

On 7/10/20 9:39 AM, Calvin Cheung wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8246308
>
> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8246308/webrev.00/
>
> Please refer to the bug report for a description of the problem.
> The proposed change also fixes a similar problem in 
> systemDictionaryShared.cpp.
>
> Passed tier1,2 tests.
>
> thanks,
>
> Calvin
>
>


From calvin.cheung at oracle.com  Fri Jul 10 23:51:45 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Fri, 10 Jul 2020 16:51:45 -0700
Subject: RFR(S): 8246308: Reference count for PackageEntry::name may be
 incorrectly decremented
In-Reply-To: <bf53d3b8-7cd8-8c9c-e9d4-6d9c46a3036e@oracle.com>
References: <16cdc27e-2f2c-cf03-7284-e1d71f9cb79e@oracle.com>
 <bf53d3b8-7cd8-8c9c-e9d4-6d9c46a3036e@oracle.com>
Message-ID: <eedc3256-027f-64a1-78da-e2d58b2278b5@oracle.com>

Thanks Ioi!

On 7/10/20 2:30 PM, Ioi Lam wrote:
> Looks good to me.
>
> Thanks
> - Ioi
>
> On 7/10/20 9:39 AM, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246308
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8246308/webrev.00/
>>
>> Please refer to the bug report for a description of the problem.
>> The proposed change also fixes a similar problem in 
>> systemDictionaryShared.cpp.
>>
>> Passed tier1,2 tests.
>>
>> thanks,
>>
>> Calvin
>>
>>
>

From david.holmes at oracle.com  Sat Jul 11 13:14:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 11 Jul 2020 23:14:43 +1000
Subject: RFR(S): 8248657: Windows: strengthening in ThreadCritical
 regarding memory model
In-Reply-To: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>

Hi Ludovic,

Sorry but this fix seems specific to the Windows-Aarch64 port work and 
as such should be fixed as part of that port when the JEP is approved 
and targeted.

David

On 10/07/2020 8:19 am, Ludovic Henry wrote:
> Hello,
> 
> This small fix is in the context of the larger support for Windows-AArch64. I am using Bernhard Urban's CR as I am currently not an author.
> 
> ThreadCritical is used to synchronize the allocation of new Arena chunks. However, on platforms with weaker memory models than x86 (primarily ARM), the original ThreadCritical initialization code would be racy, leading to crashes. To fix that, we switch to initializing the ThreadCritical static data by using a functionally-sound Win32 API focused on initialization [1]. This approach also has the advantage of simplifying the code and bringing it closer to how it is done on Linux.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248657
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/
> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Windows-x86_64, no regressions
> 
> Thank you,
> 
> --
> Ludovic
> 
> [1] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initonceinitialize
> 

From kim.barrett at oracle.com  Sat Jul 11 14:15:05 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 11 Jul 2020 10:15:05 -0400
Subject: RFR(S): 8248657: Windows: strengthening in ThreadCritical
 regarding memory model
In-Reply-To: <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
Message-ID: <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>

> On Jul 11, 2020, at 9:14 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Hi Ludovic,
> 
> Sorry but this fix seems specific to the Windows-Aarch64 port work and as such should be fixed as part of that port when the JEP is approved and targeted.
> 
> David

I'm inclined to disagree with that assessment.

This change seems to me to make the code noticeably simpler and easier
to understand, while also making it platform-independent (eliminating
a non-TSO race). Those seem like good things, regardless of any
aarch64 port that might be coming.

My only (very small) quibble with it is that I think it could be made
simpler still by using a thread-safe function scoped static variable
to control the initialization, rather than using InitOnceExecuteOnce.
But unfortunately, that thread-safety guarantee is a C++11 feature. I
expect that feature is present in VS 2015 or later, but until JEP 347
gets integrated (very soon, I hope), we still support older versions.
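
For illustration, the function-scoped static alternative could look
roughly like the sketch below. It is not taken from the webrev:
CriticalState, make_state, critical_state and init_calls are
hypothetical names, and the struct only stands in for the
ThreadCritical static data. The sketch relies on the C++11 rule that
the initialization of a function-scoped static runs exactly once, even
when several threads reach it concurrently.

```cpp
#include <cassert>

// Hypothetical stand-in for the ThreadCritical static data.
struct CriticalState {
  int lock_count;
};

// Counts how often the initializer ran; only written inside make_state().
static int g_init_calls = 0;

// Hypothetical initializer; a real one would also create the lock event.
static CriticalState make_state() {
  ++g_init_calls;
  CriticalState s;
  s.lock_count = 0;
  return s;
}

// Since C++11, the first call initializes 'state' exactly once, and
// concurrent callers block until that initialization has completed.
CriticalState& critical_state() {
  static CriticalState state = make_state();
  return state;
}

// Reports how many times the initializer ran (expected: at most once).
int init_calls() {
  assert(g_init_calls <= 1);
  return g_init_calls;
}
```

Every caller goes through critical_state(), so no explicit "is it
initialized yet" flag is needed. With a pre-C++11 toolchain this
guarantee does not hold, which is the reason for the caveat above.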

So I think this change looks good.


> On 10/07/2020 8:19 am, Ludovic Henry wrote:
>> Hello,
>> This small fix is in the context of the larger support for Windows-AArch64. I am using Bernhard Urban's CR as I am currently not an author.
>> ThreadCritical is used to synchronize the allocation of new Arena chunks. However, on platforms with weaker memory models than x86 (primarily ARM), the original ThreadCritical initialization code would be racy, leading to crashes. To fix that, we switch to initializing the ThreadCritical static data by using a functionally-sound Win32 API focused on initialization [1]. This approach also has the advantage of simplifying the code, and getting it closer to how it is done on Linux.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248657
>> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/
>> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Windows-x86_64, no regressions
>> Thank you,
>> --
>> Ludovic
>> [1] https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initonceinitialize


From thomas.stuefe at gmail.com  Sun Jul 12 05:26:44 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sun, 12 Jul 2020 07:26:44 +0200
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
Message-ID: <CAA-vtUzSXHOkq7aZL8VUOvZ-VJn1=i-6hjKtqxDgSjiuFp=g9w@mail.gmail.com>

I agree with Kim, this is so much better.

I also like that lock_count now starts at 0.

And that we assert if initial event creation fails instead of just ignoring
it (Okay we would have noticed in ~ThreadCritical()).

Only small style nit (unrelated to this patch) is that IMHO we should use
TRUE or FALSE, not true or false, when facing win32 APIs taking BOOL.

This is okay from my side.

Cheers, Thomas


On Sat, Jul 11, 2020 at 4:17 PM Kim Barrett <kim.barrett at oracle.com> wrote:

> > On Jul 11, 2020, at 9:14 AM, David Holmes <david.holmes at oracle.com>
> wrote:
> >
> > Hi Ludovic,
> >
> > Sorry but this fix seems specific to the Windows-Aarch64 port work and
> as such should be fixed as part of that port when the JEP is approved and
> targeted.
> >
> > David
>
> I'm inclined to disagree with that assessment.
>
> This change seems to me to make the code noticeably simpler and easier
> to understand, while also making it platform-independent (eliminating
> a non-TSO race). Those seem like good things, regardless of any
> aarch64 port that might be coming.
>
> My only (very small) quibble with it is that I think it could be made
> simpler still by using a thread-safe function scoped static variable
> to control the initialization, rather than using InitOnceExecuteOnce.
> But unfortunately, that thread-safety guarantee is a C++11 feature. I
> expect that feature is present in VS 2015 or later, but until JEP 347
> gets integrated (very soon, I hope), we still support older versions.
>
> So I think this change looks good.
>
>
> > On 10/07/2020 8:19 am, Ludovic Henry wrote:
> >> Hello,
> >> This small fix is in the context of the larger support for
> Windows-AArch64. I am using Bernhard Urban's CR as I am currently not an
> author.
> >> ThreadCritical is used to synchronize the allocation of new Arena
> chunks. However, on platforms with weaker memory models than x86 (primarily
> ARM), the original ThreadCritical initialization code would be racy,
> leading to crashes. To fix that, we switch to initializing the
> ThreadCritical static data by using a functionally-sound Win32 API focused
> on initialization [1]. This approach also has the advantage of simplifying
> the code, and getting it closer to how it is done on Linux.
> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8248657
> >> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/
> >> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1,
> jtreg:test/jdk:tier2, jtreg:test/langtools on Windows-x86_64, no regressions
> >> Thank you,
> >> --
> >> Ludovic
> >> [1]
> https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initonceinitialize
>
>
>

From thomas.stuefe at gmail.com  Sun Jul 12 06:08:17 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sun, 12 Jul 2020 08:08:17 +0200
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>

Hi Ludovic,

sorry for the delay, and thanks for the extensive answer. Please find
remarks inline.

On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com>
wrote:

> Hi Thomas,
>
> It seems that the problem you're describing stems from the current
> exception handler treating two cases: 1. any exception knowingly triggered
> by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy
> stubs, stackoverflow in Java code), and 2. exceptional cases leading to
> crashes (ex: uncaught C++ exception, an access violation in VM or
> native/external code, etc.). There is the same problem on Unix because
> there is only one system (signal handling) for both cases. Fortunately,
> Windows proposes different systems, each with its own advantages.
>
> The order in which Windows invokes each of these systems is the following:
>  1. Vectored Exception Handler registered with
> `AddVectoredExceptionHandler`
>  2. Structured Exception Handler
>  3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
>  4. Unhandled Exception Handler
>
> Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler
> registered with `RtlAddFunctionTable`. This handler does both the
> Java-triggered exceptions and any other exceptions.
>
> Now, from the point of view of an external library or application
> embedding the JVM inside their own process, they still have all the above
> options to register an exception handler, irrespective of how Hotspot does
> it. This creates the following cases:
>  - If the application uses VEH: they will (with Hotspot using SEH) be
> called _before_ Hotspot's exception handler and will then have to be aware
> that they may get exceptions unrelated to them and will have to ignore them
> accordingly
>  - If the application uses SEH: they will only get exceptions related to
> their code area
>
> If Hotspot is to use VEH, an exception would play out as follows:
>  - If the application uses VEH and their registered handler executes
> _before_ Hotspot's one: same as above
>  - If the application uses VEH and their registered handler executes
> _after_ Hotspot's one: Hotspot has to make sure that the exception was
> triggered by Hotspot and ignore them otherwise (a range check on the PC can
> be used here to emulate how it's done with RtlAddFunctionTable)
>  - If the application uses SEH: the same case as to where the
> application's handler executes _after_ Hotspot's one
>
> This all assumes that Hotspot's VEH handler doesn't trigger a crash report
> (VMError::report_and_die) on any exception it doesn't know how to handle.
> The simplest way to do that is simply _not_ to do it in Hotspot's VEH
> handler, and to do it by registering a Win32 Unhandled Exception Handler
> (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when
> no other exception handler treated the exception (by returning
> EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it
> means the application is "toast" and not in a runnable state anymore, which
> fits nicely with the purpose of the Hotspot crash report.
>
>
Okay, If I get this correctly:

Today:
  App uses VEH - they execute before us and have to handle this correctly
(->A)
  App uses SEH - no interaction

With proposed switch:
  App uses VEH - they may or may not execute before us. If they come before
us: (->A). If they come after us -> (B)
  App uses SEH -> (B)

A) this case exists today. An app getting signals via VEH would have to
willingly ignore signals for us to get them. This does not change, your
patch would mean this happens less often, so I do not see a backward
compatibility problem here.

B) this is a new case. We would have to ignore signals not meant for us.
Technically by just ignoring them. Distinguishing this is a bit difficult
though. Note the subtle difference to Unix: there we have signal chaining,
so an application which is really really interested in signals for its own
purposes uses it (e.g. by preloading libjsig) and then we know its handler
and hand over the signal.

On Windows we do not know this (?); we can only distinguish our crashes
from their crashes via crash pc, rejecting any crash not in our code
(dynamic or static). Well, arguably this would be just how it is today with
our code scoped via SEH. With the added safety net of the unhandled
exception filter (what happens if multiple parties call this?).

Okay this seems safe enough to try it at least.
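
As a rough sketch of the crash-pc check described above (not code from
the patch): a handler that sees every exception in the process would
first test whether the faulting pc lies in code it owns, and pass the
exception on otherwise. CodeRange, Disposition, filter_by_pc and the
vm_can_recover flag are hypothetical names; the two enum values mirror
the Win32 constants EXCEPTION_CONTINUE_SEARCH (0) and
EXCEPTION_CONTINUE_EXECUTION (-1).

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the Win32 values EXCEPTION_CONTINUE_SEARCH and
// EXCEPTION_CONTINUE_EXECUTION; defined locally to stay portable.
enum Disposition { ContinueSearch = 0, ContinueExecution = -1 };

// Hypothetical description of one code region owned by the VM,
// as a half-open address interval [low, high).
struct CodeRange {
  std::uintptr_t low;
  std::uintptr_t high;
  bool contains(std::uintptr_t pc) const { return pc >= low && pc < high; }
};

// Decides what a process-wide handler would do for a fault at 'pc'.
// Exceptions outside every VM range are never claimed, so handlers
// registered by the application still get to see them.
Disposition filter_by_pc(std::uintptr_t pc, const CodeRange* ranges, int n,
                         bool vm_can_recover) {
  assert(n >= 0);
  for (int i = 0; i < n; i++) {
    if (ranges[i].contains(pc)) {
      // Ours: resume only if the VM knows how to handle this fault.
      return vm_can_recover ? ContinueExecution : ContinueSearch;
    }
  }
  return ContinueSearch;  // Not ours: leave it to whoever comes next.
}
```

Anything that falls through with ContinueSearch would eventually reach
the unhandled exception filter, which is where the crash report would
be produced under the scheme discussed here.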

My only very small personal gripe would be that I always liked how I can
quickly use SEH to check if a pointer is valid without disturbing anyone.
But within the hotspot at least I can just as well use SafeFetch.

Thank you,

Thomas

> I hope this sheds some light on possible solutions ahead of us.
>
> Thank you,
>
> --
> Ludovic
>
> [1]
> https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Sunday, June 21, 2020 05:55
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi,
>
> We at SAP had used VEH in our own Windows Itanium port and I dimly
> remember it being a source of problems. That is many years ago and I
> realize that it is not worth much, but it makes me a bit apprehensive of this
> change.
>
> The main problem I see is that this will be an observable change in
> behavior.
>
> We currently use SEH, so our error handler is guaranteed to be invoked
> only for exceptions from within our own code. With VEH we now follow the
> Unix way of things and suddenly our error handler becomes a global resource.
>
> We will suddenly be invoked for crashes outside the VM, e.g. in foreign
> launcher code atop of us or in non-java side threads, which will generate
> whole new classes of hs-err files for crashes the VM is not responsible
> for. Which are then perceived as VM crashes and sent to us vendors instead
> of going to the right people. This is the way it works on Unix today, and
> it is a constant annoyance and increases our support workload.
>
> We also may introduce new problems since suddenly we interfere with
> application exception handling. At the very least, we have to think up a
> scheme for signal chaining (both ways: VM->foreign code and foreign
> code->VM). For the first, we probably need some form of libjsig preloading,
> or some other way to divert signal handler instalment. That would also need
> cooperation from the application programmers and/or operators.
>
> Matters are even more complicated, since foreign code may use SEH instead
> of VEH, so what happens if a JNI library below me wants to use SEH, does
> that still work?
>
> I feel this should not be rushed. Even if considered "brittle", SEH has served
> us well, I do not recall many problems in the past aside from having to add
> the occasional __try/__except. Are there actual bugs we have to solve?
>
> Lastly, personally I always found SEH quite a neat concept, and one of the
> few places where Windows was superior to Unix :)
>
> Thanks, Thomas
>
>
> On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com
> <mailto:luhenry at microsoft.com>> wrote:
> Hello,
>
> First, some context and definitions:
> - when talking about exceptions here, I'm talking about Win32 exceptions,
> which are equivalent to signals on Linux and other Unix systems; I am _not_
> talking about Java exceptions.
> - an explanation of an _exception filter_ can be found at
> https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019
> There is only a limited concept of that in Java with type-based exception
> filter (ex: `try { ... } catch (IOException ioe) { ... } catch (Throwable
> t) { ... }`).
> - in Win32, there exist two exception handling mechanisms:
>   - Structured Exception Handling: the historical one, based on `__try {}
> __except (...) {}`
>   - Vectored Exception Handling: introduced in Windows XP / Windows Server
> 2003, much more similar to signals on Linux
>
> These exception handling mechanisms are used to catch any exceptions like
> Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These
> exceptions are equivalent to signals on Linux and are therefore core to many
> mechanisms in the OpenJDK.
>
> Today, the OpenJDK uses Structured Exception Handling to catch such
> exceptions, creating several requirements. First, all code that might
> trigger an exception on purpose (like an Access Violation / SIGSEGV in the
> arraycopy stub), needs to be wrapped up in a __try / __except. Because it's
> not feasible to wrap every single instance of such code, these __try /
> __except are put at the top-level most function of any thread started by
> the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable`
> is used to simulate the use of __try / __except for a specific code area.
> This function needs platform-specific code with the generation of a
> trampoline that calls the exception filter declared in the runtime. It's
> also meant to be used as a one to one mapping with try / catch in user
> code, and not as a "catch all the exceptions in this code area". Third,
> Structured Exception Handling expects to be able to unwind the stack.
> However, because Hotspot doesn't guarantee the usage of the
> platform-specific ABI internally, the platform-specific unwinder might
> break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies
> on the assumption that Structured Exception Handling never tries to unwind
> the stack (which it would fail to do because of the different ABI) before
> calling the registered exception filter.
>
> Discussing that with Windows Kernel maintainers, this approach is highly
> discouraged, considered brittle, and the better solution is Vectored
> Exception Handling. Vectored Exception Handling is conceptually much more
> similar to signal / sigaction on Linux and other Unix systems. It will
> catch all exceptions happening across the process, and no __try / __except
> will be required. It also removes the requirement to call
> `RtlAddFunctionTable`.  The exception filter then behaves like a signal
> handler with the possibility to modify the registers at will, modifying the
> PC to step over an instruction after an expected Access Violation for
> example. Vectored Exception Handling is also already used for AOT code.
>
> The changes can be found at
> http://cr.openjdk.java.net/~burban/ludovic_vecexc/
> As I am not an author, I have not created a corresponding bug in JBS.
>
> Thank you, and looking forward to your feedback!
>
> --
> Ludovic
>
>
>

From aph at redhat.com  Sun Jul 12 14:25:56 2020
From: aph at redhat.com (Andrew Haley)
Date: Sun, 12 Jul 2020 15:25:56 +0100
Subject: RFR(S): 8248657: Windows: strengthening in ThreadCritical
 regarding memory model
In-Reply-To: <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
Message-ID: <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>

On 11/07/2020 15:15, Kim Barrett wrote:

 > This change seems to me to make the code noticeably simpler and
 > easier to understand, while also making it platform-independent
 > (eliminating a non-TSO race).

Which is, let us not forget, undefined behaviour. It's best to treat
all such cases as bugs, even if they don't affect x86. But it's not
always non-TSO machines that come out badly: this reminds me of
JDK-8225716, a race condition which only showed on x86-32.

 > Those seem like good things, regardless of any aarch64 port that
 > might be coming.

Indeed.

My opinion is that unnecessary platform dependencies are in effect
technical debt. Another example is the use of non-portable integer
types in AArch64 -- to a large extent my doing -- and it makes sense
to do the cleanup in mainline. That way the Windows import patches
will be as clean and simple as they can be.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From chris.plummer at oracle.com  Sun Jul 12 19:08:11 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 12 Jul 2020 12:08:11 -0700
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <99b679ec-fcae-6f30-1186-1cf8de809c6f@oracle.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
 <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>
 <a2432ff6-79b2-aefc-0450-d1c2b3cabd22@redhat.com>
 <99b679ec-fcae-6f30-1186-1cf8de809c6f@oracle.com>
Message-ID: <f68ba7d4-2693-57c4-59ef-6b0c04b3260c@oracle.com>

Looks good to me.

Chris

On 7/10/20 8:07 AM, Daniel D. Daugherty wrote:
> On 7/10/20 4:37 AM, Aleksey Shipilev wrote:
>> Okay, thanks.
>>
>> I already have 2 reviewers (rkennke, tschatzl), do I need more 
>> specifically from serviceability-dev@?
>
> Since you're touching the Serviceability team's code, it would be
> polite to wait for a review...
>
> Dan
>
>
>>
>> -Aleksey
>>
>> On 7/9/20 3:34 PM, Daniel D. Daugherty wrote:
>>> Adding serviceability-dev at ... since a couple of JVM/TI files are 
>>> changed
>>> in this RFR. Also, I moved the bug from hotspot/runtime -> hotspot/gc.
>>>
>>> Dan
>>>
>>>
>>> On 7/9/20 2:36 AM, Aleksey Shipilev wrote:
>>>> RFE:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8249137
>>>>
>>>> It was added by JDK-8211270 to support old-style Shenandoah that 
>>>> needed a separate fwdptr slot.
>>>> After JDK-8224584 it does not need this anymore. Additionally, 
>>>> CH::obj_size may disagree with other
>>>> code that pokes at layout helper directly, for example 
>>>> GraphKit::new_instance.
>>>>
>>>> This also avoids a virtual call on some paths, although those paths 
>>>> are not very performance-sensitive.
>>>>
>>>> The patch is a simple series of few-liners:
>>>>
>>>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.cpp
>>>> --- a/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 04:32:30 2020 +0200
>>>> +++ b/src/hotspot/share/gc/shared/collectedHeap.cpp     Thu Jul 09 08:05:46 2020 +0200
>>>> @@ -578,6 +578,2 @@
>>>>
>>>> -size_t CollectedHeap::obj_size(oop obj) const {
>>>> -  return obj->size();
>>>> -}
>>>> -
>>>>    uint32_t CollectedHeap::hash_oop(oop obj) const {
>>>> diff -r 9cc348ebdc82 src/hotspot/share/gc/shared/collectedHeap.hpp
>>>> --- a/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 04:32:30 2020 +0200
>>>> +++ b/src/hotspot/share/gc/shared/collectedHeap.hpp     Thu Jul 09 08:05:46 2020 +0200
>>>> @@ -495,4 +495,2 @@
>>>>
>>>> -  virtual size_t obj_size(oop obj) const;
>>>> -
>>>>      // Non product verification and debugging.
>>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiEnv.cpp
>>>> --- a/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 04:32:30 2020 +0200
>>>> +++ b/src/hotspot/share/prims/jvmtiEnv.cpp      Thu Jul 09 08:05:46 2020 +0200
>>>> @@ -488,3 +488,3 @@
>>>>      NULL_CHECK(mirror, JVMTI_ERROR_INVALID_OBJECT);
>>>> -  *size_ptr = (jlong)Universe::heap()->obj_size(mirror) * wordSize;
>>>> +  *size_ptr = (jlong)mirror->size() * wordSize;
>>>>      return JVMTI_ERROR_NONE;
>>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/jvmtiExport.cpp
>>>> --- a/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 04:32:30 2020 +0200
>>>> +++ b/src/hotspot/share/prims/jvmtiExport.cpp   Thu Jul 09 08:05:46 2020 +0200
>>>> @@ -1067,3 +1067,3 @@
>>>>         _jobj = (jobject)to_jobject(obj);
>>>> -     _size = Universe::heap()->obj_size(obj) * wordSize;
>>>> +     _size = obj->size() * wordSize;
>>>>       };
>>>> diff -r 9cc348ebdc82 src/hotspot/share/prims/whitebox.cpp
>>>> --- a/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 04:32:30 2020 +0200
>>>> +++ b/src/hotspot/share/prims/whitebox.cpp      Thu Jul 09 08:05:46 2020 +0200
>>>> @@ -389,3 +389,3 @@
>>>>      oop p = JNIHandles::resolve(obj);
>>>> -  return Universe::heap()->obj_size(p) * HeapWordSize;
>>>> +  return p->size() * HeapWordSize;
>>>>    WB_END
>>>>
>>>> Testing: tier{1,2}; jdk-submit (running)
>>>>
>>
>


From david.holmes at oracle.com  Mon Jul 13 01:54:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 11:54:37 +1000
Subject: RFR(S): 8248657: Windows: strengthening in ThreadCritical
 regarding memory model
In-Reply-To: <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
Message-ID: <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>

On 13/07/2020 12:25 am, Andrew Haley wrote:
> On 11/07/2020 15:15, Kim Barrett wrote:
> 
>  > This change seems to me to make the code noticeably simpler and
>  > easier to understand, while also making it platform-independent
>  > (eliminating a non-TSO race).
> 
> Which is, let us not forget, undefined behaviour. It's best to treat
> all such cases as bugs, even if they don't affect x86. But it's not
> always non-TSO machines that come out badly: this reminds me of
> JDK-8225716, a race condition which only showed on x86-32.
> 
>  > Those seem like good things, regardless of any aarch64 port that
>  > might be coming.
> 
> Indeed.
> 
> My opinion is that unnecessary platform dependencies are in effect
> technical debt. Another example is the use of non-portable integer
> types in AArch64 -- to a large extent my doing -- and it makes sense
> to do the cleanup in mainline. That way the Windows import patches
> will be as clean and simple as they can be.

We'll have to agree to disagree on which side of the "general cleanup" 
versus "part of the Windows-aarch64 port" fence this change sits. But I 
won't push further on that aspect.

But if we are dealing with non-TSO races then it would be good to get 
some guidance from Microsoft as to the memory ordering properties of 
various APIs to ensure that we are maintaining correct ordering. For 
example, in the destructor we have:

81     lock_owner = 0;
82     // No lost wakeups, lock_event stays signaled until reset.
83     DWORD ret = SetEvent(lock_event);

but unless we are guaranteed that the store to lock_owner cannot be 
reordered by the compiler or the hardware, to appear to be after the 
SetEvent, then the logic is broken. Generally, because Windows only 
supported TSO systems, we have assumed that the compiler will not 
reorder code across these kinds of API calls. But now we also need 
hardware guarantees.
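
To make the required ordering concrete, here is a minimal sketch in
portable C++11 atomics. It says nothing about what SetEvent itself
guarantees; the atomic flag 'signaled' merely stands in for the event,
and OwnerHandoff, take, release_lock and try_observe_release are
hypothetical names. The release store on the flag is what keeps the
earlier store to lock_owner from becoming visible after the signal,
for both the compiler and the hardware.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the lock_owner / lock_event pair.
struct OwnerHandoff {
  std::atomic<unsigned long> lock_owner{0};
  std::atomic<bool> signaled{false};  // plays the role of the event

  // Record the owner and clear the signal (waiting is omitted here).
  void take(unsigned long tid) {
    assert(tid != 0);
    signaled.store(false, std::memory_order_relaxed);
    lock_owner.store(tid, std::memory_order_relaxed);
  }

  // Clear the owner, then signal. The release store orders the write
  // to lock_owner before the signal becomes visible to other threads.
  void release_lock() {
    lock_owner.store(0, std::memory_order_relaxed);
    signaled.store(true, std::memory_order_release);
  }

  // A waiter that sees the signal via an acquire load is guaranteed
  // to also see lock_owner == 0, never the stale owner.
  bool try_observe_release(unsigned long* owner_out) {
    if (!signaled.load(std::memory_order_acquire)) {
      return false;
    }
    *owner_out = lock_owner.load(std::memory_order_relaxed);
    return true;
  }
};
```

Whether the Win32 event calls provide an equivalent release/acquire
pairing on AArch64 is exactly the open question raised here.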

Overall I do like the initialization cleanup to use the "init once" API.

Thanks,
David
-----

From david.holmes at oracle.com  Mon Jul 13 02:43:30 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 12:43:30 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>

Hi Ludovic,

On 9/07/2020 11:55 pm, Ludovic Henry wrote:
> Hello,
> 
> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
> 
> The three areas I have found are:
> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
> - Exception handling: there is some code repetition which, even if functional, is subpar.
> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
> 
> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.

Thanks for doing the split!

As a general comment can you please ensure that the Oracle copyright 
second year is updated to 2020. Thanks.

Overall these cleanups look good. Thanks for providing them.

> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
> Webrevs:
> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/

Love this cleanup! Great to see all the stubroutines go for x86.

src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp

Please delete this entire (archaic) comment block.

  42 // The following alternative implementations are needed because
  43 // Windows 95 doesn't support (some of) the corresponding Windows NT
  44 // calls. Furthermore, these versions allow inlining in the caller.
  45 // (More precisely: The documentation for InterlockedExchange says
  46 // it is supported for Windows 95. However, when single-stepping
  47 // through the assembly code we cannot step into the routine and
  48 // when looking at the routine address we see only garbage code.
  49 // Better safe then sorry!). Was bug 7/31/98 (gri).
  50 //
  51 // Performance note: On uniprocessors, the 'lock' prefixes are not
  52 // necessary (and expensive). We should generate separate cases if
  53 // this becomes a performance problem.

In this (and elsewhere):

  80 DEFINE_STUB_ADD(4, long,    InterlockedAdd)
  81 DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)

can we use __int32 for clarity rather than "long"?
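
The clarity argument can be illustrated in portable C++ (a sketch
only, not the macro from the webrev): "long" is 32 bits on Windows but
64 bits on LP64 Unix platforms, whereas a fixed-width name states the
size directly. IntOfSize and add_and_fetch are hypothetical names, and
std::atomic stands in for the Interlocked intrinsics. Like
InterlockedAdd, the helper returns the result of the addition.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Maps a byte size to a fixed-width integer type, so the size is
// visible in the name instead of depending on the platform's 'long'.
template <int N> struct IntOfSize;
template <> struct IntOfSize<4> { typedef std::int32_t type; };
template <> struct IntOfSize<8> { typedef std::int64_t type; };

static_assert(sizeof(IntOfSize<4>::type) == 4, "4-byte variant must be 4 bytes");
static_assert(sizeof(IntOfSize<8>::type) == 8, "8-byte variant must be 8 bytes");

// Atomically adds add_value to *dest and returns the new value.
// Overflow of the returned sum is not handled in this sketch.
template <int N>
typename IntOfSize<N>::type add_and_fetch(
    std::atomic<typename IntOfSize<N>::type>* dest,
    typename IntOfSize<N>::type add_value) {
  assert(dest != nullptr);
  return dest->fetch_add(add_value) + add_value;
}
```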

> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/

Looks good!

> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/

Looks good!

Thanks,
David
-----

> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
> 
> Thank you,
> 
> --
> Ludovic
> 

From david.holmes at oracle.com  Mon Jul 13 02:57:47 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 12:57:47 +1000
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
Message-ID: <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>

Hi Dan,

This all looks good to me.

Thanks,
David
-----

On 8/07/2020 5:51 pm, David Holmes wrote:
> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>> Ping! Any takers??? Code deletion should be really appealing here!!
> 
> Sorry Dan didn't get to it before vacation. But if you can wait till 
> Monday ...
> 
> Cheers,
> David
> 
>> Dan
>>
>>
>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> It's time to remove the AsyncDeflateIdleMonitors option from JDK16. 
>>> We can
>>> also get rid of the safepoint based deflation mechanism since turning 
>>> off
>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way left to
>>> use it.
>>>
>>> This is marked as an "S/M" review because the number of touched/deleted
>>> lines makes it a Medium review, but the number of touched/changed lines
>>> (outside of the deletions) makes it a Small review. It's actually a 
>>> pretty
>>> fast read... :-)
>>>
>>> Here's the bug ID:
>>>
>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>                 based deflation mechanism
>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>
>>> Here's the webrev URL:
>>>
>>>     http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>
>>> The webrev is baselined on Thomas S's fix for 8248650 which is jdk-16+4
>>> plus a dozen or so changesets.
>>>
>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and there are
>>> no regressions (and very few known failures). My inflation stress 
>>> testing
>>> is still in process. I had to restart that testing after a thunderstorm
>>> related power failure took down my servers in Florida. Sigh...
>>>
>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>
>>> Dan
>>

From thomas.stuefe at gmail.com  Mon Jul 13 04:41:01 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 13 Jul 2020 06:41:01 +0200
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
Message-ID: <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>

On Mon, Jul 13, 2020 at 3:55 AM David Holmes <david.holmes at oracle.com>
wrote:

> On 13/07/2020 12:25 am, Andrew Haley wrote:
> > On 11/07/2020 15:15, Kim Barrett wrote:
> >
> >  > This change seems to me to make the code noticeably simpler and
> >  > easier to understand, while also making it platform-independent
> >  > (eliminating a non-TSO race).
> >
> > Which is, let us not forget, undefined behaviour. It's best to treat
> > all such cases as bugs, even if they don't affect x86. But it's not
> > always non-TSO machines that come out badly: this reminds me of
> > JDK-8225716, a race condition which only showed on x86-32.
> >
> >  > Those seem like good things, regardless of any aarch64 port that
> >  > might be coming.
> >
> > Indeed.
> >
> > My opinion is that unnecessary platform dependencies are in effect
> > technical debt. Another example is the use of non-portable integer
> > types in AArch64 -- to a large extent my doing -- and it makes sense
> > to do the cleanup in mainline. That way the Windows import patches
> > will be as clean and simple as they can be.
>
> We'll have to agree to disagree on which side of the "general cleanup"
> versus "part of the Windows-aarch64 port" fence this change sits. But I
> won't push further on that aspect.
>
> But if we are dealing with non-TSO races then it would be good to get
> some guidance from Microsoft as to the memory ordering properties of
> various APIs to ensure that we are maintaining correct ordering. For
> example, in the destructor we have:
>
> 81     lock_owner = 0;
> 82     // No lost wakeups, lock_event stays signaled until reset.
> 83     DWORD ret = SetEvent(lock_event);
>
> but unless we are guaranteed that the store to lock_owner cannot be
> reordered by the compiler or the hardware, to appear to be after the
> SetEvent, then the logic is broken.


Can a compiler reorder system calls and stores? How would it determine if
this is safe to do?

I'd be surprised if Microsoft loosened up reordering since this would mean
existing software cannot just be recompiled for arm and expected to work.
But this is just a guess of course.

> Generally, because Windows only
> supported TSO systems, we have assumed that the compiler will not
> reorder code across these kinds of API calls. But now we also need
> hardware guarantees.
>
> Overall I do like the initialization cleanup to use the "init once" API.
>
> Thanks,
> David
> -----
>

Thomas

From shade at redhat.com  Mon Jul 13 05:40:48 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 13 Jul 2020 07:40:48 +0200
Subject: RFR (S) 8249137: Remove CollectedHeap::obj_size
In-Reply-To: <f68ba7d4-2693-57c4-59ef-6b0c04b3260c@oracle.com>
References: <a9bd4c4d-2d6b-733e-abbf-90e873c70951@redhat.com>
 <70984ebf-2717-c7b3-7076-12e2c8c7515c@oracle.com>
 <a2432ff6-79b2-aefc-0450-d1c2b3cabd22@redhat.com>
 <99b679ec-fcae-6f30-1186-1cf8de809c6f@oracle.com>
 <f68ba7d4-2693-57c4-59ef-6b0c04b3260c@oracle.com>
Message-ID: <8fea1a3f-8d62-4408-8f07-b92af35b6632@redhat.com>

On 7/12/20 9:08 PM, Chris Plummer wrote:
> Looks good to me.
Thanks! Pushed.

-- 
-Aleksey


From david.holmes at oracle.com  Mon Jul 13 05:48:49 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 15:48:49 +1000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
Message-ID: <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>

Hi Thomas,

On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
> On Mon, Jul 13, 2020 at 3:55 AM David Holmes <david.holmes at oracle.com 
> <mailto:david.holmes at oracle.com>> wrote:
> 
>     On 13/07/2020 12:25 am, Andrew Haley wrote:
>      > On 11/07/2020 15:15, Kim Barrett wrote:
>      >
>      >  > This change seems to me to make the code noticeably simpler and
>      >  > easier to understand, while also making it platform-independent
>      >  > (eliminating a non-TSO race).
>      >
>      > Which is, let us not forget, undefined behaviour. It's best to treat
>      > all such cases as bugs, even if they don't affect x86. But it's not
>      > always non-TSO machines that come out badly: this reminds me of
>      > JDK-8225716, a race condition which only showed on x86-32.
>      >
>      >  > Those seem like good things, regardless of any aarch64 port that
>      >  > might be coming.
>      >
>      > Indeed.
>      >
>      > My opinion is that unnecessary platform dependencies are in effect
>      > technical debt. Another example is the use of non-portable integer
>      > types in AArch64 -- to a large extent my doing -- and it makes sense
>      > to do the cleanup in mainline. That way the Windows import patches
>      > will be as clean and simple as they can be.
> 
>     We'll have to agree to disagree on which side of the "general cleanup"
>     versus "part of the Windows-aarch64 port" fence this change sits. But I
>     won't push further on that aspect.
> 
>     But if we are dealing with non-TSO races then it would be good to get
>     some guidance from Microsoft as to the memory ordering properties of
>     various APIs to ensure that we are maintaining correct ordering. For
>     example, in the destructor we have:
> 
>     81     lock_owner = 0;
>     82     // No lost wakeups, lock_event stays signaled until reset.
>     83     DWORD ret = SetEvent(lock_event);
> 
>     but unless we are guaranteed that the store to lock_owner cannot be
>     reordered by the compiler or the hardware, to appear to be after the
>     SetEvent, then the logic is broken.
> 
> 
> Can a compiler reorder system calls and stores? How would it determine 
> if this is safe to do?

A compiler can reorder anything it likes if it can determine it is safe 
to do so. :) In general, as it is impractical, if not impossible, to 
make such a determination, the response to such queries has typically 
been "the compiler would never do that". And for gcc and clang we have 
some very knowledgeable folk that can attest to that. For VS, as I 
stated (now below) we assume this is also the case.
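
To make the ordering concern concrete, here is a minimal portable sketch. 
It is not the ThreadCritical code: EventLock, event_signaled, try_acquire 
and release are hypothetical names, and a std::atomic<bool> merely stands 
in for the Win32 event. Assuming C++11 atomics, the release store on the 
signal keeps the earlier store to lock_owner from appearing after it, and 
the acquire on the waiting side makes that store visible:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of an event-based lock; not the HotSpot code.
struct EventLock {
  std::atomic<int>  lock_owner{0};         // 0 means "unowned"
  std::atomic<bool> event_signaled{true};  // stand-in for the Win32 event

  // Try to take the lock for thread id `tid` (non-zero).
  bool try_acquire(int tid) {
    bool expected = true;
    // Acquire pairs with the release store in release(): a thread that
    // consumes the signal also observes lock_owner == 0.
    if (!event_signaled.compare_exchange_strong(expected, false,
                                                std::memory_order_acquire)) {
      return false;
    }
    lock_owner.store(tid, std::memory_order_relaxed);
    return true;
  }

  void release() {
    lock_owner.store(0, std::memory_order_relaxed);
    // Release: the store to lock_owner above cannot be reordered, by the
    // compiler or the hardware, to appear after this signal.
    event_signaled.store(true, std::memory_order_release);
  }
};
```

Whether SetEvent itself gives a comparable release guarantee on AArch64 
is exactly the documentation question here; the sketch does not answer it.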

> I'd be surprised if Microsoft loosened up reordering since this would 
> mean existing software cannot just be recompiled for arm and expected to 
> work. But this is just a guess of course.

It's an interesting point because I would expect there to be a lot of 
software written for Windows that contains assumptions of TSO that would 
in fact fail when run on AArch64. I don't know if there are any special 
mechanisms to force a binary to run in TSO mode on AArch64 under Windows 
(or build flags), that would allow for ease of migration. But unless all 
Windows software will run in such a mode there is a need for MS to 
document what the memory consistency properties of various APIs are (as 
POSIX does [1]). This may already exist in the Windows-AArch64 "SDK" but 
I have no knowledge of that and so can only see the general win32 
documentation available online.

Cheers,
David

[1] 
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_12

>     Generally, because Windows only
>     supported TSO systems, we have assumed that the compiler will not
>     reorder code across these kinds of API calls. But now we also need
>     hardware guarantees.
> 
>     Overall I do like the initialization cleanup to use the "init once" API.
> 
>     Thanks,
>     David
>     -----
> 
> 
> Thomas

From richard.reingruber at sap.com  Mon Jul 13 06:42:13 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 13 Jul 2020 06:42:13 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

thanks for looking at this!

And my apologies for taking that long...

So here is the new webrev.6

Webrev.6: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
Delta:    http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/

I spent most of the time running a microbenchmark [1] I wrote to answer questions from your
review. At first I had trouble with variance in the results until I found out it was due to the NUMA
architecture of the server I used. After that I noticed that there was a performance regression of
about 5% even at low agent activity. I finally found out that it was due to the implementation of
JavaThread::wait_for_object_deoptimization() which is called by the target of the JVMTI operation to
self suspend for object deoptimization. I fixed this by adding limited spinning before calling
wait() on the monitor.
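
The idea of limited spinning before blocking can be sketched as follows. This is only an
illustration, not the code in the webrev: spin_then_block, WaitResult and the two callbacks are
hypothetical names, and the blocking step is passed in as a callback so the sketch stays independent
of any monitor implementation.

```cpp
#include <cassert>
#include <functional>

// Outcome of one wait: how often we polled in the spin phase and whether
// we had to fall back to blocking.
struct WaitResult {
  int  spins;
  bool blocked;
};

// Poll `is_suspended` up to `spin_limit` times; only if the condition still
// holds afterwards call `block` (e.g. wait() on a monitor).
WaitResult spin_then_block(const std::function<bool()>& is_suspended,
                           int spin_limit,
                           const std::function<void()>& block) {
  int spins = 0;
  while (spins < spin_limit && is_suspended()) {
    ++spins;  // a real implementation would pause or yield here
  }
  if (is_suspended()) {
    block();  // slow path: give up the CPU and wait to be notified
    return {spins, true};
  }
  return {spins, false};  // fast path: the suspension ended while spinning
}
```

If the suspension is typically short, the fast path avoids the cost of blocking and being woken up
again, which is the effect described above for low agent activity.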

The delta includes many changes to comments, renamings, etc., so I'd like to summarize the
functional changes:

* Collected all the code for the testing feature DeoptimizeObjectsALot in compileBroker.cpp and
  reworked it.

  With DeoptimizeObjectsALot enabled, internal threads are started that deoptimize frames and
  objects. The number of threads started is given by DeoptimizeObjectsALotThreadCountAll and
  DeoptimizeObjectsALotThreadCountSingle. The former targets all existing threads whereas the
  latter operates on a single thread selected round robin.

  I removed the mode where deoptimizations were performed at every nth exit from the runtime. I
  never used it.

* EscapeBarrier::sync_and_suspend_one(): use a direct handshake and execute it always independently
  of is_thread_fully_suspended().

* Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found this testing with
  DeoptimizeObjectsALot.

* Added EscapeBarrier::thread_removed().

* EscapeBarrier constructors: barriers can now be entirely disabled by disabling DoEscapeAnalysis.
  This effectively disables the enhancement.

* JavaThread::wait_for_object_deoptimization():

  - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the safepoint check! This
    caused issues with non-walkable stacks with DeoptimizeObjectsALot.

  - Added limited spinning inspired by HandshakeSpinYield to fix regression in microbenchmark [1]

I refer to some more changes answering your questions and comments inline below.

Thanks,
Richard.

[1] Microbenchmark: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/

> Hi Richard,
> 
> I had a look at your change.  It's complex, but not that big.
> A lot of code is just passing info through layers of abstraction.

Also it leverages preexisting functionality like materialization of virtual objects in non-top
frames (see materializeVirtualObjects).

> Also, one can tell this went through some iterations by now, 
> I think it's very well engineered.
> I had a look at webrev.05
> 
> Unfortunately
> "8242425: JVMTI monitor operations should use Thread-Local Handshakes" 
> breaks webrev.05.
> I updated to before that change and took that as base of my review.
> 
> I see four parts of the change that can be looked at
> rather individually.
> 
>  * Refactoring the scopeDesc constructors. Trivial.
>  * Persisting information about the optimizations done by the compilers.
>    Large and mostly trivial.
>  * Deoptimizing. The most complicated part. Really well abstracted, though.
>  * DeoptimizeObjectsALot for testing and the tests.
> 
> Review of compiler changes:
> 
> I understand you annotate at safepoints where the escape analysis
> finds out that an object is "better" than global escape. 
> These are the cases where the analysis identifies optimization 
> opportunities. These annotations are then used to deoptimize
> frames and the objects referenced by them.
> Doesn't this overestimate the optimized 
> objects?  E.g., eliminate_alloc_node has many cases where it bails
> out.

Yes, the implementation is conservative, but it is comparatively simple and the additional debug
info is just 2 flags per safepoint. On the other hand, those JVMTI operations that really trigger
deoptimizations are expected to be comparatively infrequent such that switching to the interpreter
for a few microseconds will hardly have an effect.

I've done microbenchmarking to check this.

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/

I found that in the worst case performance can be impacted by 10%. If the agent is extremely active
and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every millisecond or more often,
then the performance impact can be 30%. But I would think that this is not realistic. These calls
are issued in interactive sessions to analyze deadlocks.

We could get more precise deoptimizations by adding a third flag per safepoint for ea-local objects
among the owned monitors. This would help improve the worst case in the benchmark. But I'm not
convinced it is worth it.

Refer to the README.txt of the microbenchmark for a more detailed discussion.

> c1_IR.hpp   
> 
> OK, nothing to do for C1, just adapt to extended method signature.
> 
> Break line once more so that it matches above line length.

Done.

> ciEnv.h|cpp
> 
> Pass through another jvmti capability.  Trivial & good.
> 
> 
> debugInfoRec.hpp
> 
> Pass through escape info that must be recorded. OK.
> 
> pcDesc.hpp
> 
> I would like to see some documentation of the methods.
>
> Maybe:
>   // There is an object in the scope that does not escape globally.
>   // It either does not escape at all or it escapes as argument.
> and
>   // One of the arguments is an object that is not globally visible
>   // but escapes to the callee.

Done. I didn't take your text, though, because I only noticed it after writing my own. Let me know
if you are not ok with it.

> scopeDesc.cpp
> 
>   Besides refactoring copy escape info from pcDesc to scopeDesc
>   and add accessors. Trivial.
> 
>   In scopeDesc.hpp you talk about NoEscape and ArgEscape. 
>   These are opto terms, but scopeDesc is a shared data structure
>   that does not depend on a specific compiler. 
>   Please explain what is going on without using these terms.

Actually these are not too opto specific terms. They are used in the paper referenced in
escape.hpp. Also you can easily google them. I'd rather keep the comments as they are.

> jvmciCodeInstaller.cpp
> 
>   OK, nothing for JVMCI. Here support for Object Optimizations 
>   for JVMCI compilers could be added. Leave this to graal people.
> 
> callnode.hpp
> 
> You add functionality to annotate callnodes with escape information.
> This is carried through code generation to final output where it is
> added to the compiled methods meta information.
> 
> At Safepoints in general jvmti can access
>   - Objects that were scalar replaced. They must be reallocated.
>     (Flag EliminateAllocations)
>   - Objects that should be locked but are not because they never 
>     escape the thread. They need to be relocked.
> 
> At calls, Objects where locks have been removed escape to callees.
> We must persist this information so that if jvmti accesses the 
> object in a callee, we can determine by looking at the caller that
> it needs to be relocked.

Note that the ea-optimization need not be at the current location; it can also follow when control
returns to the caller. Lock elimination isn't the only relevant optimization. Accesses to instance
members or array elements can be optimized as well.

> A side comment: 
> I think the flag handling in Opto is not very intuitive.
> DoEscapeAnalysis depends on the jvmti capabilities.
> This makes no sense. It is only an analysis. The optimizations
> should depend on the jvmti capabilities.
> The correct setup would be to handle this in 
> CompilerConfig::ergo_initialize():
> If the jvmti capabilities allow, enable the optimizations 
> EliminateAllocations or  EliminateLocks/EliminateNestedLocks.
> If one of these optimizations is on, enable EscapeAnalysis.
>  -- end side comment.
>
> So I would propose the following comments:
> 
>   // In the scope of this safepoint there are objects
>   // that do not globally escape. They are either NoEscape or
>   // ArgEscape. As such, they might be subject to optimizations.
>   // Persist this information here so that the frame and the
>   // Objects in scope can 
>   // be deoptimized if jvmti accesses an object at this safepoint.
>   void set_not_global_escape_in_scope(bool b) {
> 
>   // This call passes objects that do not globally escape 
>   // to its callee. The object might be subject to optimization, 
>   // e.g. a lock might be omitted. Persist this information here 
>   // so that on a jvmti access to the callee frame we can deoptimize
>   // the object and this frame.
>   void  set_arg_escape(bool f)             { _arg_escape = f; }

I do not really like these comments. They are too verbose and do not match the comment style of the
surrounding code. The names are descriptive enough IMO. Also the measures taken depending on the
flags should be commented at the locations, where the flags are read.

> Actually I am not sure whether the name of these fields (and all 
> the others in the course of this change) should refer to 
> escape analysis.  I think the term "Object deoptimization" 
> you also use is much better. You could call these properties 
> (throughout the whole change) 
>   set_optimized_objects_in_scope()
> and
>   set_passes_optimized_objects().
> 
> I think this would make the whole matter much easier
> to understand. 

I'd prefer the current names. They are closer to established terminology.  And it is actually
unknown, if optimizations based on their escape state exist.

> Anyways, locks can already be removed without running
> escape analysis at all. C2 recognizes some local patterns
> that allow this.
> 
> escape.h|cpp
> 
> The code looks good. 
> 
> Line 325: The comment could be a bit more elaborate:
>   // Annotate at safepoints if they have <= ArgEscape objects in their
>   // scope. Additionally, if the safepoint is a java call, annotate
>   // whether it passes ArgEscape objects as parameters.
> 
> And maybe add these comments?:
> 
> // Returns true if an oop in the scope of sfn does not escape
> // globally.
> bool ConnectionGraph::has_not_global_escape_in_scope(SafePointNode* sfn) {
> 
> // Returns true if at least one of the arguments to the call is an oop
> // that does not escape globally.
> bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {

IMHO the method names are descriptive and don't need the comments. But I give in :) (only replaced
"oop" with "object")

> General question:
> You collect the information you want to annotate to the 
> method during escape analysis.
> Don't you overestimate the optimized objects by this?
> E.g. elimination of allocations does bail out for 
> various reasons. At the end, no optimization might 
> have happened, but then during runtime the frame is 
> deoptimized nevertheless.

Please see statements and worst case microbenchmark above.

> machnode.hpp:
> 
> Extends MachSafePointNode similar to the ideal version.  Good.
> 
> matcher.cpp
>   
> Copy info from ideal to mach node. good.
> 
> output.cpp
> 
> Now finally the information is written to the 
> debug info.  Good.
> 
> ---------------------------------------------------------
> 
> So now let's have a look at the runtime part (including
> relaxing constraints to escape analysis):
> 
> rootResolver.cpp
> 
> Adapt to changed interface. good.
> 
> c2compiler.cpp / macro.cpp
> 
> Make EscapeAnalysis independent of jvmti capabilities. Good.
> 
> jvmtiEnv.cpp/jvmtiEnvBase.cpp
> 
> You add deoptimization of objects where they are 
> accessed. good.
> 
> jvmtiImpl.cpp
> 
> In deoptimize_objects, you check for DoEscapeAnalysis.
> This is correct given the current design of the flag
> handling in the compiler.
> It's not really nice to have a dependency to C2 here, 
> though. I understand it's an optimization, the code 
> could be run anyways, it would check but not find
> anything. But actually I would expect dependencies
> on EliminateLocks and EliminateAllocations (if they
> were set according to jvmti capabilities as I elaborated
> above.)  
> Would it make sense to protect the ArgEscape
> loop by if (EliminateLocks)?

You are right, it is not correct how flags are checked. Especially if only running with the JVMCI
compiler.

I changed Deoptimization::deoptimize_objects_internal() to make reallocation and relocking dependent
on similar checks as in Deoptimization::fetch_unroll_info_helper(). Furthermore EscapeBarriers are
conditionally activated depending on the following (see EscapeBarrier ctors):

JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false) COMPILER2_PRESENT(|| DoEscapeAnalysis)

So the enhancement can be practically completely disabled by disabling DoEscapeAnalysis, which is
what C2 currently does if JVMTI capabilities that allow access to local references are taken.
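
For readers not familiar with these macros, here is a rough sketch of how the condition expands. The
macro definitions below are simplified replicas written for this illustration (the real ones live in
macros.hpp and depend on the build configuration), and barrier_active() is a hypothetical helper, not
a function in the webrev. The sketch assumes a build with both JVMCI and C2 included:

```cpp
#include <cassert>

// Assumed build configuration for this sketch: JVMCI and C2 both present.
#ifndef INCLUDE_JVMCI
#define INCLUDE_JVMCI 1
#endif
#ifndef COMPILER2
#define COMPILER2 1
#endif

// Simplified replicas of the conditional-code macros.
#if INCLUDE_JVMCI
#define JVMCI_ONLY(code) code
#define NOT_JVMCI(code)
#else
#define JVMCI_ONLY(code)
#define NOT_JVMCI(code) code
#endif

#ifdef COMPILER2
#define COMPILER2_PRESENT(code) code
#else
#define COMPILER2_PRESENT(code)
#endif

// Hypothetical helper: the flags are passed in as parameters instead of
// being read from VM globals.
bool barrier_active(bool UseJVMCICompiler, bool DoEscapeAnalysis) {
  // With JVMCI and C2:    UseJVMCICompiler || DoEscapeAnalysis
  // With C2 only:         false || DoEscapeAnalysis
  // With neither:         false
  (void)UseJVMCICompiler;
  (void)DoEscapeAnalysis;
  return JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
         COMPILER2_PRESENT(|| DoEscapeAnalysis);
}
```

In this configuration the barrier is inactive only if neither the JVMCI compiler is in use nor
DoEscapeAnalysis is enabled.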

> jvmtiTagMap.cpp
> 
> Deoptimize for jvmti operations.  Good.
> 
> deoptimization.cpp
> 
> I guess this is the core of your work.
> 
> 
> You add a new mode that just deoptimizes objects but not frames. 
> Good idea. You have to use reallocated objects in upper frames, 
> or by jvmti accesses to inner frames, which can not easily be
> replaced by interpreter frames.
> This way you can wait with replacing the frame until just before
> execution returns.
> 
> eliminate_allocations():
> (Strange method name, should at least be in past tense, even
> better reallocate_eliminated_allocations() or 
> allocate_scalarized_objects(). Confused me until
> I grokked the code. Legacy though, not your business.)

I still don't grok the name... ;) but it's preexisting as you noted

> It's not that nice to return whether you only deoptimized
> objects by the boolean reference argument. After all, 
> it again depends on the mode you pass in.
> A different design would be to clone the method and 
> have an eliminate_allocations_no_unpack() variant, but that would
> not be better as some code would be duplicated.
> Maybe a comment for argument eliminate_allocations:
> // deoptimized_objects is set to true if objects were deoptimized
> // but not the frame. It is unchanged if there are no objects to 
> // be deoptimized, or if the frame was deoptim

I agree: duplicating the code would be really bad, but I don't agree that reference parameters are
not nice. I think it is a common pattern if you return an error code and additional result data.
The variable is a minor detail. With the meaningful name it is not necessary to document it.

In my eyes it should be set independently of the exec_mode. I didn't do it to make the change smaller.

> Similar for eliminate_locks():
> // deoptimized_objects is set to true if objects were relocked,
> // else it is left unchanged.
> 
> You reuse and extend the existing realloc/relock_objects.
> 
> deoptimize_objects_internal()
> 
> Simple version of fetch_unroll_info_helper for EscapeBarrier.
> Good.
> I attributed the comment "Then relock objects if synchronization on them was eliminated."
> to the if() just below. Add an empty line to make clear the comment
> refers to the next 10 lines.
> Alternatively, replace the whole comment by 
> // At first, reallocate the non-escaping objects and restore their fields
> // so they are available for relocking.
> And add 
> // Now relock objects with eliminated locks.
> before the if ((DoEscape... below.

I went for the latter.

> In fetch_unroll_info_helper, I don't understand why you need 
>  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> for eliminated locks, but not for scalar replaced objects?

In short reallocation is idempotent, relocking is not.

Without the enhancement Deoptimization::realloc_objects() can already be called more than once for a frame:

First call in materializeVirtualObjects() (also iterateFrames()).

Second (indirect) call in fetch_unroll_info_helper().

The objects from the first call are saved as jvmti deferred updates when realloc_objects()
returns. Note that there is no relationship to jvmti. The thing in common is that updates cannot be
directly installed into a compiled frame; it is necessary to deoptimize the frame and defer the
updates until the compiled frame gets replaced. Every time the vframes corresponding to the owner
frame are iterated, they get the deferred updates. So in fetch_unroll_info_helper() the
GrowableArray<compiledVFrame*>* chunk reference them too. All references to the objects created by
the second (indirect) call to realloc_objects() are never used, because compiledVFrame accessors to
locals, expressions, and monitors override them with the deferred updates. The objects become
unreachable and get gc'ed.

materializeVirtualObjects() does not bother with relocking. deoptimize_objects_internal(), which is
introduced by the enhancement, does relock objects; after all, the lock elimination becomes illegal
with the change in escape state. Relocking twice does not work, so the enhancement avoids it by
checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).

Note that materializeVirtualObjects() can be called more than once and will always return the very
same objects, even though it calls realloc_objects() again.
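
A toy model may help to see the difference. It is not HotSpot code: Obj, FrameModel and their
members are hypothetical, a map stands in for the deferred updates, and a counter stands in for the
recursive lock state. Reallocating again yields the object already recorded for the slot, whereas
relocking again increments the lock count unless it is guarded by a flag:

```cpp
#include <cassert>
#include <map>
#include <memory>

struct Obj {
  int lock_count = 0;  // stand-in for the recursion count of the monitor
};

struct FrameModel {
  // slot -> object recorded by the first reallocation ("deferred update")
  std::map<int, std::shared_ptr<Obj>> deferred_updates;
  bool objs_are_deoptimized = false;

  // Always allocates, but an existing deferred update wins, so callers
  // keep seeing the very same object; the fresh one becomes garbage.
  std::shared_ptr<Obj> realloc_object(int slot) {
    std::shared_ptr<Obj> fresh = std::make_shared<Obj>();
    auto inserted = deferred_updates.emplace(slot, fresh);
    return inserted.first->second;
  }

  // Relocking changes state each time, so it must happen only once.
  void relock_objects(int slot, bool check_already_done) {
    if (check_already_done && objs_are_deoptimized) {
      return;  // already relocked for this frame
    }
    realloc_object(slot)->lock_count++;
    objs_are_deoptimized = true;
  }
};
```

In the model, calling relock_objects() twice without the check leaves the object locked twice, which
is the situation the objs_are_deoptimized() check is there to avoid.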

> I would guess it is because the eliminated locks can be applied to
> argEscape, but scalar replacement only to noescape objects?
> I.e. it might have been done before?
> 
> But why isn't this the case for eliminate_allocations?
> deoptimize_objects_internal does both unconditionally,
> so both can happen to inner frames, right?

Sorry, I don't quite understand. Hope the explanation above helps.

> relock_objects()
> 
> Ok, you need to undo biased locking. Also, you remember the 
> lock nesting for later relocking if waiting for lock.
> 
> revoke_for_object_deoptimization()
>   I like if boolean operators are at the beginning of broken lines, 
>   but I think hotspot convention is to have them at the end.

Ok, fixed.

> Code will get much more simple if BiasedLocking is removed.
> 
> EscapeBarrier:: ...
> 
> (This class maybe would qualify for a file of its own.)
> 
> deoptimize_objects()
> I would mention escape analysis only as side remark.  Also, as I understand, 
> there is only one frame at given depth?
> // Deoptimize frames with optimized objects. This can be omitted locks and 
> // objects not allocated but replaced by scalars. In C2, these optimizations
> // are based on escape analysis.
> // Up to depth, deoptimize frames with any optimized objects.
> // From depth to entry_frame, deoptimize only frames that
> // pass optimized objects to their callees.
> (First part similar for the comment above EscapeBarrier::deoptimize_objects_internal().)

I've reworked the comment. Let me know if you still think it needs to be improved.

> 
> What is the check (cur_depth <= depth) good for? Can you 
> ever walk past entry_frame?  

Yes (assuming you mean the outer while-statement), there are java frames beyond the entry frame if a
native method calls java methods again. So we visit all frames up to the given depth and from there
we continue to the entry frame. It is not necessary to continue beyond that entry frame, because
escape analysis assumes that arguments to native functions escape globally.

Example: Let the java stack look like this:

+---------+
| Frame A |
+---------+
| Frame N |
+---------+
| Frame B |
+---------+ <- top of stack

Where java method A calls native method N and N calls java method B.

Very simplified the native stack will look like this

+-------------------------+
| Frame of JIT Compiled A |
+-------------------------+
| Frame N                 |
+-------------------------+
| Entry Frame             |
+-------------------------+
| Frame B                 |
+-------------------------+ <- top of stack

The entry frame is an activation of the call stub, which is a small assembler routine that
translates from the native calling convention to the java calling convention.

There cannot be any ArgEscape that is passed to B (see above), therefore we can stop the stackwalk
at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A, as it is directly
accessed.
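
The stopping rule from the example can be written down as a small model. Again this is only a
sketch with hypothetical names (Kind, Frame, frames_to_visit), not the stack walk in the webrev:
frames are listed from the top of the stack, the entry frame does not count towards the depth, and
the walk stops at the first entry frame reached once the requested depth has been covered.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Java, Native, Entry };

struct Frame {
  std::string name;
  Kind kind;
};

// `stack` is ordered from the top of the stack towards the callers.
// Returns the names of the frames on the java stack that are visited.
std::vector<std::string> frames_to_visit(const std::vector<Frame>& stack,
                                         int depth) {
  std::vector<std::string> visited;
  int cur_depth = 0;
  for (const Frame& f : stack) {
    if (f.kind == Kind::Entry) {
      if (cur_depth >= depth) {
        break;  // depth covered: no need to go beyond this entry frame
      }
      continue;  // depth not yet reached: keep walking past the call stub
    }
    ++cur_depth;
    visited.push_back(f.name);
  }
  return visited;
}
```

For the stack above (B, entry frame, N, A) the model visits only B for depth 1 and B, N and A for
depth 3.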


> Isn't vf->is_compiled_frame() prerequisite that "Move to next physical frame" 
> is needed? You could move it into the other check.
> If so, similar for deoptimize_objects_all_threads().

Only compiledVFrames require moving to the /top/ frame. Fixed.

> Synchronization: looks good. I think others had a look at this before.
> 
> EscapeBarrier::deoptimize_objects_internal()
>   The method name is misleading, it is not used by 
>   deoptimize_objects().
>   Also, a method with the same name is in Deoptimization.
>   Proposal: deoptimize_objects_thread() ?

Sorry, but I don't see, why it would be misleading.
What would be the meaning of 'deoptimize_objects_thread'? I don't understand that name.

> C1 stubs: this really shows you tested all configurations, great!
> 
> 
> mutexLocker: ok.
> objectMonitor.cpp: ok
> stackValue.hpp   Is this missing clearing a bug?

In short: that change is not needed anymore. I'll remove it again.

Details: it is not a real bug, but the assertion in vframeArrayElement::fill_in() was triggered:

assert(!value->obj_is_scalar_replaced() || realloc_failures) failed: object should be reallocated already.

But only with the first version of the enhancement (webrev.0), where objects were only reallocated
when replacing a compiled frame with equivalent interpreter frames iff virtual objects were not
reallocated before.

I changed this after preexisting code was refactored (JDK-8226705), because already reallocated
objects practically never exist, and if there should be any, it does no harm to reallocate again:
the unnecessarily allocated objects immediately become garbage and, last but not least, no tricky
synchronization is required.

Also that's what happens with the preexisting code if virtual objects are materialized with
materializeVirtualObjects().

> 
> thread.hpp
> 
> I would remove "_ea" from the flag and method names.

Done.

> 
> Renaming deferred_locals to deferred_updates is good, as well as 
> adding a datastructure for it. 
> (Adding this data structure might be a breakout, too.)
> 
> good.
> 
> thread.cpp
> 
> good.
> 
> vframe.cpp
> 
> Is this a bug in existing code?
> Makes sense. 

Depends on your definition of bug. There are no references to vframe::is_entry_frame() in the
existing code. I would think it is a bug.

> 
> vframe_hp.hpp 
> (What stands _hp for? helper? The file should be named compiledVFrame ...)
> 
> not_global_escape_in_scope() ...
> Again, you mention escape analysis here. Comments above hold, too.

I think it is the right name, because it is meaningful and simple.

> You introduce JvmtiDeferredUpdates. Good.
> 
> vframe_hp.cpp
> 
> Changes for JvmtiDeferredUpdates, escape state accessors,
> 
> line 422:
> Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> 
> 
> macros.hpp
>   Good.  
> 
> 
> Test coding
> ============
> 
> compileBroker.h|cpp
> 
> You introduce a third class of threads handled here and 
> add a new flag to distinguish it. Before, the two kinds
> of threads were distinguished implicitly by passing in 
> a compiler for compiler threads.
> The new thread kind is only used for testing in debug.
> 
> make_thread:
> You could assert (comp != NULL...) to assure previous
> conditions.

I replaced the if-statements with a switch-statement, made sure all enum-elements are covered, and
added the assertion you suggested.
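
Roughly like this. It is a self-contained sketch and not the code in the webrev: the ThreadType
enum with compiler_t is assumed here, AbstractCompiler is an empty stand-in, and thread_kind_name
is a made-up helper.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins, not the HotSpot declarations.
enum ThreadType { compiler_t, sweeper_t, deoptimizer_t };
struct AbstractCompiler {};

// Every enum element has its own case, and each case states its
// expectation about the compiler argument explicitly.
inline std::string thread_kind_name(ThreadType type, const AbstractCompiler* comp) {
  switch (type) {
    case compiler_t:
      assert(comp != nullptr && "compiler threads need a compiler");
      return "compiler";
    case sweeper_t:
      assert(comp == nullptr && "the sweeper thread has no compiler");
      return "sweeper";
    case deoptimizer_t:
      assert(comp == nullptr && "deoptimizer threads have no compiler");
      return "deoptimizer";
  }
  assert(false && "unhandled ThreadType");
  return std::string();
}
```

Without a default label the compiler can warn if a new enum element is added and not handled.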

> line 989 indentation broken

You are referring to this block I assume:
(from http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)

 976   if (MethodFlushing) {
 977     // Initialize the sweeper thread
 978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
 979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
 980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
 981   }
 982 
 983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
 984   if (DeoptimizeObjectsALot == 2) {
 985     // Initialize and start the object deoptimizer threads
 986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
 987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
 988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
 989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
 990     }
 991   }
 992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI

I cannot really see broken indentation here. Am I looking at the wrong location?

> escape.cpp
> 
> You enable the optimization in case of testruns. good.
> 
> whitebox.cpp  ok.
> 
> deoptimization.cpp
> 
> deoptimize_objects_alot_loop()  Good.
> 
> globals.hpp
> 
> Nice docu of flags, but please mention "for testing purposes"
> or the like in DeoptimizeObjectsALot.
> I would place the flags next to each other. 
> 
> interfaceSupport.cpp: good.

Thanks! :)

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Wednesday, May 6, 2020 12:28
To: Reingruber, Richard <richard.reingruber at sap.com>; Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

I had a look at your change.  It's complex, but not that big.
A lot of code is just passing info through layers of abstraction.
Also, one can tell this went through some iterations by now, 
I think it's very well engineered.
I had a look at webrev.05

Unfortunately
"8242425: JVMTI monitor operations should use Thread-Local Handshakes" 
breaks webrev.05.
I updated to before that change and took that as base of my review.

I see four parts of the change that can be looked at
rather individually.

 * Refactoring the scopeDesc constructors. Trivial.
 * Persisting information about the optimizations done by the compilers.
   Large and mostly trivial.
 * Deoptimizing. The most complicated part. Really well abstracted, though.
 * DeoptimizeObjectsALot for testing and the tests.

Review of compiler changes:

I understand you annotate at safepoints where the escape analysis
finds out that an object is "better" than global escape. 
These are the cases where the analysis identifies optimization 
opportunities. These annotations are then used to deoptimize
frames and the objects referenced by them.
Doesn't this overestimate the optimized 
objects?  E.g., eliminate_alloc_node has many cases where it bails
out.

c1_IR.hpp   

OK, nothing to do for C1, just adapt to extended method signature.

Break line once more so that it matches above line length.


ciEnv.h|cpp

Pass through another jvmti capability.  Trivial & good.


debugInfoRec.hpp

Pass through escape info that must be recorded. OK.

pcDesc.hpp

I would like to see some documentation of the methods.

Maybe:
  // There is an object in the scope that does not escape globally.
  // It either does not escape at all or it escapes as argument.
and
  // One of the arguments is an object that is not globally visible
  // but escapes to the callee.

scopeDesc.cpp

  Besides refactoring copy escape info from pcDesc to scopeDesc
  and add accessors. Trivial.

  In scopeDesc.hpp you talk about NoEscape and ArgEscape. 
  These are opto terms, but scopeDesc is a shared data structure
  that does not depend on a specific compiler. 
  Please explain what is going on without using these terms.

jvmciCodeInstaller.cpp

  OK, nothing for JVMCI. Here support for Object Optimizations 
  for JVMCI compilers could be added. Leave this to graal people.

callnode.hpp

You add functionality to annotate callnodes with escape information.
This is carried through code generation to final output where it is
added to the compiled method's meta information.

At Safepoints in general jvmti can access
  - Objects that were scalar replaced. They must be reallocated.
    (Flag EliminateAllocations)
  - Objects that should be locked but are not because they never 
    escape the thread. They need to be relocked.

At calls, Objects where locks have been removed escape to callees.
We must persist this information so that if jvmti accesses the 
object in a callee, we can determine by looking at the caller that
it needs to be relocked.

A side comment: 
I think the flag handling in Opto is not very intuitive.
DoEscapeAnalysis depends on the jvmti capabilities.
This makes no sense. It is only an analysis. The optimizations
should depend on the jvmti capabilities.
The correct setup would be to handle this in 
CompilerConfig::ergo_initialize():
If the jvmti capabilities allow, enable the optimizations 
EliminateAllocations or  EliminateLocks/EliminateNestedLocks.
If one of these optimizations is on, enable EscapeAnalysis.
 -- end side comment.
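
In code, the setup I have in mind would look roughly like this. It is a standalone toy and not a
patch: AgentConstraints, OptFlags and toy_ergo_initialize are names invented for this
illustration, and how the jvmti capabilities map to the two "allows" bits is left open.

```cpp
#include <cassert>

// Hypothetical summary of what the jvmti capabilities permit.
struct AgentConstraints {
  bool allows_eliminate_allocations;
  bool allows_eliminate_locks;
};

// Hypothetical stand-ins for the C2 flags discussed above.
struct OptFlags {
  bool eliminate_allocations;
  bool eliminate_locks;
  bool do_escape_analysis;
};

// The optimizations depend on the capabilities. The analysis only
// depends on whether one of its consumers is still enabled.
inline OptFlags toy_ergo_initialize(const AgentConstraints& caps, const OptFlags& requested) {
  OptFlags f = requested;
  f.eliminate_allocations = requested.eliminate_allocations && caps.allows_eliminate_allocations;
  f.eliminate_locks = requested.eliminate_locks && caps.allows_eliminate_locks;
  f.do_escape_analysis = requested.do_escape_analysis &&
                         (f.eliminate_allocations || f.eliminate_locks);
  // The analysis never runs without an optimization that uses its result.
  assert(!f.do_escape_analysis || f.eliminate_allocations || f.eliminate_locks);
  return f;
}
```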

So I would propose the following comments:

  // In the scope of this safepoint there are objects
  // that do not globally escape. They are either NoEscape or
  // ArgEscape. As such, they might be subject to optimizations.
  // Persist this information here so that the frame and the
  // objects in scope can
  // be deoptimized if jvmti accesses an object at this safepoint.
  void set_not_global_escape_in_scope(bool b) {

  // This call passes objects that do not globally escape 
  // to its callee. The object might be subject to optimization, 
  // e.g. a lock might be omitted. Persist this information here 
  // so that on a jvmti access to the callee frame we can deoptimize
  // the object and this frame.
  void  set_arg_escape(bool f)             { _arg_escape = f; }

Actually I am not sure whether the name of these fields (and all 
the others in the course of this change) should refer to 
escape analysis.  I think the term "Object deoptimization" 
you also use is much better. You could call these properties 
(throughout the whole change) 
  set_optimized_objects_in_scope()
and
  set_passes_optimized_objects().

I think this would make the whole matter much easier
to understand. 

Anyways, locks can already be removed without running
escape analysis at all. C2 recognizes some local patterns
that allow this.

escape.h|cpp

The code looks good. 

Line 325: The comment could be a bit more elaborate:
  // Annotate at safepoints if they have <= ArgEscape objects in their
  // scope. Additionally, if the safepoint is a java call, annotate
  // whether it passes ArgEscape objects as parameters.

And maybe add these comments?:

// Returns true if an oop in the scope of sfn does not escape
// globally.
bool ConnectionGraph::has_not_global_escape_in_scope(SafePointNode* sfn) {

// Returns true if at least one of the arguments to the call is an oop
// that does not escape globally.
bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {
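
To make the intended meaning of the two predicates concrete, here is a toy version. It is only a
sketch of the semantics described in the comments above: ToySafepoint is a made-up type, and the
numeric ordering of the escape states is an assumption of this example.

```cpp
#include <cassert>
#include <vector>

// Escape states ordered from most local to global (assumed ordering).
enum EscapeState { NoEscape = 1, ArgEscape = 2, GlobalEscape = 3 };

// Hypothetical stand-in for a safepoint node.
struct ToySafepoint {
  std::vector<EscapeState> objects_in_scope;  // objects visible at the safepoint
  std::vector<EscapeState> call_arguments;    // empty if this is not a java call
};

// Returns true if an object in the scope of sfn does not escape globally.
inline bool has_not_global_escape_in_scope(const ToySafepoint& sfn) {
  for (EscapeState es : sfn.objects_in_scope) {
    assert(es >= NoEscape && es <= GlobalEscape);
    if (es < GlobalEscape) return true;
  }
  return false;
}

// Returns true if at least one argument of the call is an object
// that does not escape globally.
inline bool has_arg_escape(const ToySafepoint& call) {
  for (EscapeState es : call.call_arguments) {
    assert(es >= NoEscape && es <= GlobalEscape);
    if (es < GlobalEscape) return true;
  }
  return false;
}
```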

General question:
You collect the information you want to annotate to the 
method during escape analysis.
Don't you overestimate the optimized objects by this?
E.g. elimination of allocations does bail out for 
various reasons. At the end, no optimization might 
have happened, but then during runtime the frame is 
deoptimized nevertheless.

machnode.hpp:

Extends MachSafePointNode similar to the ideal version.  Good.

matcher.cpp
  
Copy info from ideal to mach node. good.

output.cpp

Now finally the information is written to the 
debug info.  Good.

---------------------------------------------------------

So now let's have a look at the runtime part (including
relaxing constraints to escape analysis):

rootResolver.cpp

Adapt to changed interface. good.

c2compiler.cpp / macro.cpp

Make EscapeAnalysis independent of jvmti capabilities. Good.

jvmtiEnv.cpp/jvmtiEnvBase.cpp

You add deoptimization of objects where they are 
accessed. good.

jvmtiImpl.cpp

In deoptimize_objects, you check for DoEscapeAnalysis.
This is correct given the current design of the flag
handling in the compiler.
It's not really nice to have a dependency to C2 here, 
though. I understand it's an optimization, the code 
could be run anyways, it would check but not find
anything. But actually I would expect dependencies
on EliminateLocks and EliminateAllocations (if they
were set according to jvmti capabilities as I elaborated
above.)  
Would it make sense to protect the ArgEscape
loop by if (EliminateLocks)?

jvmtiTagMap.cpp

Deoptimize for jvmti operations.  Good.

deoptimization.cpp

I guess this is the core of your work.


You add a new mode that just deoptimizes objects but not frames. 
Good idea. You have to use reallocated objects in upper frames, 
or by jvmti accesses to inner frames, which can not easily be
replaced by interpreter frames.
This way you can wait with replacing the frame until just before
execution returns.

eliminate_allocations():
(Strange method name, should at least be in past tense, even
better reallocate_eliminated_allocations() or 
allocate_scalarized_objects(). Confused me until
I grokked the code. Legacy though, not your business.)

It's not that nice to return whether you only deoptimized
objects by the boolean reference argument. After all, 
it again depends on the mode you pass in.
A different design would be to clone the method and 
have an eliminate_allocations_no_unpack() variant, but that would
not be better as some code would be duplicated.
Maybe a comment for argument eliminate_allocations:
// deoptimized_objects is set to true if objects were deoptimized
// but not the frame. It is unchanged if there are no objects to 
// be deoptimized, or if the frame was deoptim

Similar for eliminate_locks():
// deoptimized_objects is set to true if objects were relocked,
// else it is left unchanged.
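
The contract I would like to see documented for the reference argument is the following. This is a
toy sketch and not the real eliminate_locks(): ToyMonitor and relock_eliminated are hypothetical
names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for monitor info of a compiled frame.
struct ToyMonitor {
  bool eliminated;  // locking was optimized away by the compiler
  bool locked;      // the object is currently locked
};

// Relocks monitors whose locking was eliminated.
// deoptimized_objects is set to true if at least one object was relocked,
// else it is left unchanged.
inline void relock_eliminated(std::vector<ToyMonitor>& monitors, bool& deoptimized_objects) {
  for (ToyMonitor& m : monitors) {
    if (m.eliminated && !m.locked) {
      m.locked = true;
      deoptimized_objects = true;
    }
    // Afterwards every eliminated monitor is locked.
    assert(!m.eliminated || m.locked);
  }
}
```

Because the flag is never reset to false, a caller can pass the same variable through several
such calls and check once at the end whether anything was deoptimized.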

You reuse and extend the existing realloc/relock_objects.

deoptimize_objects_internal()

Simple version of fetch_unroll_info_helper for EscapeBarrier.
Good.
I attributed the comment "Then relock objects if synchronization on them was eliminated."
to the if() just below. Add an empty line to make clear the comment
refers to the next 10 lines.
Alternatively, replace the whole comment by 
// At first, reallocate the non-escaping objects and restore their fields
// so they are available for relocking.
And add 
// Now relock objects with eliminated locks.
before the if ((DoEscape... below.

In fetch_unroll_info_helper, I don't understand why you need 
 && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
for eliminated locks, but not for scalar replaced objects?
I would guess it is because the eliminated locks can be applied to
argEscape, but scalar replacement only to noescape objects?
I.e. it might have been done before?

But why isn't this the case for eliminate_allocations?
deoptimize_objects_internal does both unconditionally,
so both can happen to inner frames, right?

relock_objects()

Ok, you need to undo biased locking. Also, you remember the 
lock nesting for later relocking if waiting for lock.

revoke_for_object_deoptimization()
  I like if boolean operators are at the beginning of broken lines, 
  but I think hotspot convention is to have them at the end.

Code will get much more simple if BiasedLocking is removed.

EscapeBarrier:: ...

(This class maybe would qualify for a file of its own.)

deoptimize_objects()
I would mention escape analysis only as side remark.  Also, as I understand, 
there is only one frame at given depth?
// Deoptimize frames with optimized objects. These can be objects with omitted locks and
// objects not allocated but replaced by scalars. In C2, these optimizations
// are based on escape analysis.
// Up to depth, deoptimize frames with any optimized objects.
// From depth to entry_frame, deoptimize only frames that
// pass optimized objects to their callees.
(First part similar for the comment above EscapeBarrier::deoptimize_objects_internal().)

What is the check (cur_depth <= depth) good for? Can you 
ever walk past entry_frame?  

Isn't vf->is_compiled_frame() prerequisite that "Move to next physical frame" 
is needed? You could move it into the other check.
If so, similar for deoptimize_objects_all_threads().

Synchronization: looks good. I think others had a look at this before.

EscapeBarrier::deoptimize_objects_internal()
  The method name is misleading, it is not used by 
  deoptimize_objects().
  Also, a method with the same name is in Deoptimization.
  Proposal: deoptimize_objects_thread() ?

C1 stubs: this really shows you tested all configurations, great!


mutexLocker: ok.
objectMonitor.cpp: ok
stackValue.hpp   Is this missing clearing a bug?

thread.hpp

I would remove "_ea" from the flag and method names.

Renaming deferred_locals to deferred_updates is good, as well as 
adding a datastructure for it. 
(Adding this data structure might be a breakout, too.)

good.

thread.cpp

good.

vframe.cpp

Is this a bug in existing code?
Makes sense. 

vframe_hp.hpp 
(What stands _hp for? helper? The file should be named compiledVFrame ...)

not_global_escape_in_scope() ...
Again, you mention escape analysis here. Comments above hold, too.

You introduce JvmtiDeferredUpdates. Good.

vframe_hp.cpp

Changes for JvmtiDeferredUpdates, escape state accessors,

line 422:
Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?


macros.hpp
  Good.  


Test coding
============

compileBroker.h|cpp

You introduce a third class of threads handled here and 
add a new flag to distinguish it. Before, the two kinds
of threads were distinguished implicitly by passing in 
a compiler for compiler threads.
The new thread kind is only used for testing in debug.

make_thread:
You could assert (comp != NULL...) to assure previous
conditions.

line 989 indentation broken

escape.cpp

You enable the optimization in case of testruns. good.

whitebox.cpp  ok.

deoptimization.cpp

deoptimize_objects_alot_loop()  Good.

globals.hpp

Nice docu of flags, but please mention "for testing purposes"
or the like in DeoptimizeObjectsALot.
I would place the flags next to each other. 

interfaceSupport.cpp: good.

I'll look at the tests themselves in an extra mail (learning from 
Martin)

Best regards,
  Goetz.


> -----Original Message-----
> From: Reingruber, Richard <richard.reingruber at sap.com>
> Sent: Wednesday, April 1, 2020 8:15 AM
> To: Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>;
> serviceability-dev at openjdk.java.net; hotspot-compiler-
> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Martin,
> 
> > thanks for addressing all my points. I've looked over webrev.5 and I'm
> satisfied with your changes.
> 
> Thanks!
> 
> > I had also promised to review the tests.
> 
> Thanks++
> I appreciate it very much, the tests are many lines of code.
> 
> > test/jdk/com/sun/jdi/EATests.java
> > This is a substantial amount of tests which is appropriate for a such a large
> change. Skipping some subtests with UseJVMCICompiler makes sense
> because it doesn't provide the necessary JVMTI functionality, yet.
> > Nice work!
> > I also like that you test with and without BiasedLocking. Your tests will still
> be fine after BiasedLocking deprecation.
> 
> Hope so :)
> 
> > Very minor nits:
> > - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are
> ommitted" (sounds funny)
> > - You sometimes write "graal" and sometimes "Graal". I guess the capital G
> is better. (Also in EATestsJVMCI.java.)
> 
> > test/jdk/com/sun/jdi/EATestsJVMCI.java
> > EATests with Graal enabled. Nice that you support Graal to some extent.
> Maybe Graal folks want to enhance them in the future. I think this is a good
> starting point.
> 
> Will change this in the next webrev.
> 
> > Conclusion: Looks good and not trivial :-)
> > Now, you have one full review. I'd be ok with covering 2nd review by partial
> reviews.
> > Compiler and JVMTI parts are not too complicated IMHO.
> > Runtime part should get at least one additional careful review.
> 
> Thanks a lot,
> Richard.
> 
> -----Original Message-----
> From: Doerr, Martin <martin.doerr at sap.com>
> Sent: Tuesday, March 31, 2020 16:01
> To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>;
> serviceability-dev at openjdk.java.net; hotspot-compiler-
> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> thanks for addressing all my points. I've looked over webrev.5 and I'm
> satisfied with your changes.
> 
> 
> I had also promised to review the tests.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysis
> Enabled.java
> Thanks for updating the @summary comment. Looks good in webrev.5.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnaly
> sisEnabled.c
> JVMTI agent for object tagging and heap iteration. Good.
> 
> test/jdk/com/sun/jdi/EATests.java
> This is a substantial amount of tests which is appropriate for a such a large
> change. Skipping some subtests with UseJVMCICompiler makes sense
> because it doesn't provide the necessary JVMTI functionality, yet.
> Nice work!
> I also like that you test with and without BiasedLocking. Your tests will still be
> fine after BiasedLocking deprecation.
> 
> Very minor nits:
> - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are
> ommitted" (sounds funny)
> - You sometimes write "graal" and sometimes "Graal". I guess the capital G is
> better. (Also in EATestsJVMCI.java.)
> 
> test/jdk/com/sun/jdi/EATestsJVMCI.java
> EATests with Graal enabled. Nice that you support Graal to some extent.
> Maybe Graal folks want to enhance them in the future. I think this is a good
> starting point.
> 
> 
> Conclusion: Looks good and not trivial :-)
> Now, you have one full review. I'd be ok with covering 2nd review by partial
> reviews.
> Compiler and JVMTI parts are not too complicated IMHO.
> Runtime part should get at least one additional careful review.
> 
> Best regards,
> Martin
> 
> 
> > -----Original Message-----
> > From: Reingruber, Richard <richard.reingruber at sap.com>
> > Sent: Monday, March 30, 2020 10:32
> > To: Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn'
> > <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > in the Presence of JVMTI Agents
> >
> > Hi,
> >
> > this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> >
> > The change affects jvmti, hotspot and c2. Partial reviews are very welcome
> > too.
> >
> > Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> > Delta:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> >
> > Robbin, Martin, please let me know, if anything shouldn't be quite as you
> > wanted it. Also find my
> > comments on your feedback below.
> >
> > Robbin, can I count you as Reviewer for the runtime part?
> >
> > Thanks, Richard.
> >
> > --
> >
> > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > You can move both declaration and definition to that file, no need to
> > clobber
> > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> >
> > Done.
> >
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in
> it's
> > own
> > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> >
> > I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting
> > jvmtiDeferredLocalVariableSet is
> > declared.
> >
> > > src/hotspot/share/code/compiledMethod.cpp
> > > Nice cleanup!
> >
> > Thanks :)
> >
> > > src/hotspot/share/code/debugInfoRec.cpp
> > > src/hotspot/share/code/debugInfoRec.hpp
> > > Additional parameters. (Remark: I think "non_global_escape_in_scope"
> > would read better than "not_global_escape_in_scope", but your version is
> > consistent with existing code, so no change request from my side.) Ok.
> >
> > I've been thinking about this too and finally stayed with
> > not_global_escape_in_scope. It's supposed
> > to mean an object whose escape state is not GlobalEscape is in scope.
> >
> > > src/hotspot/share/compiler/compileBroker.cpp
> > > src/hotspot/share/compiler/compileBroker.hpp
> > > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into
> > a follow up change together with the test in order to make this webrev
> > smaller, but since it is included, I'm reviewing everything at once. Not a big
> > deal.) Ok.
> >
> > Yes the change would be a little smaller. And if it helps I'll split it off. In
> > general I prefer
> > patches that bring along a suitable amount of tests.
> >
> > > src/hotspot/share/opto/c2compiler.cpp
> > > Make do_escape_analysis independent of JVMCI capabilities. Nice!
> >
> > It is the main goal of the enhancement. It is done for C2, but could be done
> > for JVMCI compilers
> > with just a small effort as well.
> >
> > > src/hotspot/share/opto/escape.cpp
> > > Annotation for MachSafePointNodes. Your added functionality looks
> > correct.
> > > But I'd prefer to move the bulky code out of the large function.
> > > I suggest to factor out something like has_not_global_escape and
> > has_arg_escape. So the code could look like this:
> > >       SafePointNode* sfn = sfn_worklist.at(next);
> > >       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> > >       if (sfn->is_CallJava()) {
> > >         CallJavaNode* call = sfn->as_CallJava();
> > >         call->set_arg_escape(has_arg_escape(call));
> > >       }
> > > This would also allow us to get rid of the found_..._escape_in_args
> > variables making the loops better readable.
> >
> > Done.
> >
> > > It's kind of ugly to use strcmp to recognize uncommon trap, but that
> seems
> > to be the way to do it (there are more such places). So it's ok.
> >
> > Yeah. I copied the snippet.
> >
> > > src/hotspot/share/prims/jvmtiImpl.cpp
> > > src/hotspot/share/prims/jvmtiImpl.hpp
> > > The sequence is pretty complex:
> > > VM_GetOrSetLocal element initialization executes EscapeBarrier code
> > which suspends the target thread (extra VM Operation).
> >
> > Note that the target threads have to be suspended already for
> > VM_GetOrSetLocal*. So it's mainly the
> > synchronization effect of EscapeBarrier::sync_and_suspend_one() that is
> > required here. Also no extra
> > _handshake_ is executed, since sync_and_suspend_one() will find the
> > target threads already
> > suspended.
> >
> > > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by
> VM
> > Thread to prepare VM Operation with frame deoptimization).
> > > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor
> > which resumes the target thread.
> > > But I don't have any improvement proposal. Performance is probably not
> a
> > concern, here. So it's ok.
> >
> > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it
> > has non-globally escaping objects and other frames if they have arg
> escaping
> > ones. Good.
> >
> > It's not specifically the top frame, but the frame that is accessed.
> >
> > > src/hotspot/share/runtime/deoptimization.cpp
> > > Object deoptimization. I have more comments and proposals, here.
> > > First of all, handling recursive and waiting locks in relock_objects is tricky,
> > but looks correct.
> > > Comments are sufficient to understand why things are done as they are
> > implemented.
> >
> > > BiasedLocking related parts are complex, but we may get rid of them in
> the
> > future (with BiasedLocking removal).
> > > Anyway, looks correct, too.
> >
> > > Typo in comment: "regularily" => "regularly"
> >
> > > Deoptimization::fetch_unroll_info_helper is the only place where
> > _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> > But I think we always go through it, so I can't see a memory leak or such
> kind
> > of issues.
> >
> > That's correct. The compiled frame for which deferred updates are
> allocated
> > is always deoptimized
> > before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
> > compiledVFrame::update_deferred_value(). I've added the same assertion
> > to
> > Deoptimization::relock_objects(). So we can be sure that
> > _jvmti_deferred_updates are deallocated
> > again in fetch_unroll_info_helper().
> >
> > > EscapeBarrier::deoptimize_objects: ResourceMark should use
> > calling_thread().
> >
> > Sure, well spotted!
> >
> > > You can use MutexLocker and MonitorLocker with Thread* to save the
> > Thread::current() call.
> >
> > Right, good hint. This was recently introduced with 8235678. I even had to
> > resolve conflicts. Should
> > have done this then.
> >
> > > I'd make set_objs_are_deoptimized static and remove it from the
> > EscapeBarrier interface because I think it shouldn't be used outside of
> > EscapeBarrier::deoptimize_objects.
> >
> > Done.
> >
> > > Typo in comment: "we must only deoptimize" => "we only have to
> > deoptimize"
> >
> > Replaced with "[...] we deoptimize iff local objects are passed as args"
> >
> > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> > barrier_active() is redundant. Implementation can get moved to hpp file.
> >
> > Ok. Done.
> >
> > > I'll get back to suspend flags, later.
> >
> > > There are weird cases regarding _self_deoptimization_in_progress.
> > > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C.
> > C can set _self_deoptimization_in_progress while A performs the
> handshake
> > for suspending C. I think this doesn't lead to errors, but it's probably not
> > desired.
> > > I think it would be better to use only one "wait" call in
> > sync_and_suspend_one and sync_and_suspend_all.
> >
> > You're right. We've discussed that face-to-face, but couldn't find a real
> issue.
> > But now, thinking again, I reckon I found one:
> >
> > 2808   // Sync with other threads that might be doing deoptimizations
> > 2809   {
> > 2810     // Need to switch to _thread_blocked for the wait() call
> > 2811     ThreadBlockInVM tbivm(_calling_thread);
> > 2812     MonitorLocker ml(EscapeBarrier_lock,
> > Mutex::_no_safepoint_check_flag);
> > 2813     while (_self_deoptimization_in_progress) {
> > 2814       ml.wait();
> > 2815     }
> > 2816
> > 2817     if (self_deopt()) {
> > 2818       _self_deoptimization_in_progress = true;
> > 2819     }
> > 2820
> > 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> > 2822       ml.wait();
> > 2823     }
> > 2824
> > 2825     if (self_deopt()) {
> > 2826       return;
> > 2827     }
> > 2828
> > 2829     // set suspend flag for target thread
> > 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> > 2831   }
> >
> > - A waits in 2822
> > - C is suspended
> > - B notifies all in resume_one()
> > - A and C wake up
> > - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> > - C does the self deoptimization
> > - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
> >
> > C will self suspend at some undefined point. The resulting state is illegal.
> >
> > > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > reduce thread state transitions, but that seems to be problematic because
> > ThreadBlockInVM destructor contains a safepoint check which we
> shouldn't
> > do while holding EscapeBarrier_lock. So no change request.
> >
> > Yes, would be nice to have the state change only if needed, but for the
> > reason you mentioned it is
> > not quite as easy as it seems to be. I experimented as well with a second
> > lock, but did not succeed.
> >
> > > Change in thread_added:
> > > I think the sequence would be more comprehensive if we waited for
> > deopt_all_threads in Thread::start and all other places where a new thread
> > can run into Java code (e.g. JVMTI attach).
> > > Your version makes new threads come up with suspend flag set. That
> looks
> > correct, too. Advantage is that you only have to change one place
> > (thread_added). It'll be interesting to see how it will look like when we use
> > async handshakes instead of suspend flags.
> > > For now, I'm ok with your version.
> >
> > I had a version that did what you are suggesting. The current version also
> has
> > the advantage, that
> > there are fewer places where a thread has to wait for ongoing object
> > deoptimization. This means
> > fewer places where you have to worry about correct thread state
> > transitions, possible deadlocks,
> > and if all oops are properly Handle'ed.
> >
> > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after
> > > if (!jt->is_hidden_from_external_view()).
> >
> > Done.
> >
> > > Having 4 different deoptimize_objects functions makes it a little hard to
> > keep an overview of which one is used for what.
> > > Maybe adding suffixes would help a little bit, but I can also live with what
> > you have.
> > > Implementation looks correct to me.
> >
> > 2 are internal. I added the suffix _internal to them. This leaves 2 to choose
> > from.
> >
> > > src/hotspot/share/runtime/deoptimization.hpp
> > > Escape barriers and object deoptimization functions.
> > > Typo in comment: "helt" => "held"
> >
> > Done in place already.
> >
> > > src/hotspot/share/runtime/interfaceSupport.cpp
> > > InterfaceSupport::deoptimizeAllObjects() is only used for
> > DeoptimizeObjectsALot = 1.
> > > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not
> bad
> > to have DeoptimizeObjectsALot = 1 in addition. Ok.
> >
> > I never used DeoptimizeObjectsALot = 1 that much. It could be more
> > deterministic in single threaded
> > scenarios. I wouldn't object to get rid of it though.
> >
> > > src/hotspot/share/runtime/stackValue.hpp
> > > Better reinitialization in StackValue. Good.
> >
> > StackValue::obj_is_scalar_replaced() should not return true after calling
> > set_obj().
> >
> > > src/hotspot/share/runtime/thread.cpp
> > > src/hotspot/share/runtime/thread.hpp
> > > src/hotspot/share/runtime/thread.inline.hpp
> > > wait_for_object_deoptimization, suspend flag, deferred updates and test
> > feature to deoptimize objects.
> >
> > > In the long term, we want to get rid of suspend flags, so it's not so nice to
> > > introduce a new one. But I agree with Götz that it should be acceptable as
> > temporary solution until async handshakes are available (which takes more
> > time). So I'm ok with your change.
> >
> > I'm keen to build the feature on async handshakes when they arrive.
> >
> > > You can use MutexLocker with Thread*.
> >
> > Done.
> >
> > > JVMTIDeferredUpdates: I agree with Robin. It'd be nice to move the class
> > out of thread.hpp.
> >
> > Done.
> >
> > > src/hotspot/share/runtime/vframe.cpp
> > > Added support for entry frame to new_vframe. Ok.
> >
> >
> > > src/hotspot/share/runtime/vframe_hp.cpp
> > > src/hotspot/share/runtime/vframe_hp.hpp
> >
> > > I think code()->as_nmethod() in not_global_escape_in_scope() and
> > arg_escape() should better be under #ifdef ASSERT or inside the assert
> > statement (no need for code cache walking in product build).
> >
> > Done.
> >
> > > jvmtiDeferredLocalVariableSet::update_monitors:
> > > Please add a comment explaining that owner referenced by original info
> > may be scalar replaced, but it is deoptimized in the vframe.
> >
> > Done.
> >
> > -----Original Message-----
> > From: Doerr, Martin <martin.doerr at sap.com>
> > Sent: Thursday, 12 March 2020 17:28
> > To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn'
> > <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > in the Presence of JVMTI Agents
> >
> > Hi Richard,
> >
> >
> > I managed to find time for an (almost) complete review of webrev.4. (I'll
> > review the tests separately.)
> >
> > First of all, the change seems to be of pretty good quality for its significant
> > complexity. I couldn't find any real bugs. But I'd like to propose minor
> > improvements.
> > I'm convinced that it's mature because we did substantial testing.
> >
> > I like the new functionality for object deoptimization. It can possibly be
> > reused for future escape analysis based optimizations. So I appreciate
> having
> > it available in the code base.
> > In addition to that, your change makes the JVMTI implementation better
> > integrated into the VM.
> >
> >
> > Now to the details:
> >
> >
> > src/hotspot/share/c1/c1_IR.hpp
> > describe_scope parameters. Ok.
> >
> >
> > src/hotspot/share/ci/ciEnv.cpp
> > src/hotspot/share/ci/ciEnv.hpp
> > Fix for JvmtiExport::can_walk_any_space() capability. Ok.
> >
> >
> > src/hotspot/share/code/compiledMethod.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/code/debugInfoRec.cpp
> > src/hotspot/share/code/debugInfoRec.hpp
> > Additional parameters. (Remark: I think "non_global_escape_in_scope"
> > would read better than "not_global_escape_in_scope", but your version is
> > consistent with existing code, so no change request from my side.) Ok.
> >
> >
> > src/hotspot/share/code/nmethod.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/code/pcDesc.hpp
> > Additional parameters. Ok.
> >
> >
> > src/hotspot/share/code/scopeDesc.cpp
> > src/hotspot/share/code/scopeDesc.hpp
> > Improved implementation + additional parameters. Ok.
> >
> >
> > src/hotspot/share/compiler/compileBroker.cpp
> > src/hotspot/share/compiler/compileBroker.hpp
> > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a
> > follow up change together with the test in order to make this webrev
> > smaller, but since it is included, I'm reviewing everything at once. Not a big
> > deal.) Ok.
> >
> >
> > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
> > Additional parameters. Ok.
> >
> >
> > src/hotspot/share/opto/c2compiler.cpp
> > Make do_escape_analysis independent of JVMCI capabilities. Nice!
> >
> >
> > src/hotspot/share/opto/callnode.hpp
> > Additional fields for MachSafePointNodes. Ok.
> >
> >
> > src/hotspot/share/opto/escape.cpp
> > Annotation for MachSafePointNodes. Your added functionality looks
> correct.
> > But I'd prefer to move the bulky code out of the large function.
> > I suggest factoring out something like has_not_global_escape and
> > has_arg_escape. So the code could look like this:
> >       SafePointNode* sfn = sfn_worklist.at(next);
> >       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> >       if (sfn->is_CallJava()) {
> >         CallJavaNode* call = sfn->as_CallJava();
> >         call->set_arg_escape(has_arg_escape(call));
> >       }
> > This would also allow us to get rid of the found_..._escape_in_args
> variables
> > making the loops more readable.
> >
> > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems
> > to be the way to do it (there are more such places). So it's ok.
> >
> >
> > src/hotspot/share/opto/machnode.hpp
> > Additional fields for MachSafePointNodes. Ok.
> >
> >
> > src/hotspot/share/opto/macro.cpp
> > Allow elimination of non-escaping allocations. Ok.
> >
> >
> > src/hotspot/share/opto/matcher.cpp
> > src/hotspot/share/opto/output.cpp
> > Copy attribute / pass parameters. Ok.
> >
> >
> > src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/prims/jvmtiEnv.cpp
> > src/hotspot/share/prims/jvmtiEnvBase.cpp
> > Escape barriers + deoptimize objects for target thread. Good.
> >
> >
> > src/hotspot/share/prims/jvmtiImpl.cpp
> > src/hotspot/share/prims/jvmtiImpl.hpp
> > The sequence is pretty complex:
> > VM_GetOrSetLocal element initialization executes EscapeBarrier code
> which
> > suspends the target thread (extra VM Operation).
> > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM
> > Thread to prepare VM Operation with frame deoptimization).
> > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor
> which
> > resumes the target thread.
> > But I don't have any improvement proposal. Performance is probably not a
> > concern, here. So it's ok.
> >
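The construct / prologue / destruct sequence described above follows a scoped-resource pattern. As a rough illustration only, here is a toy Java model of that ordering. It is not HotSpot code: the class names, method names and log strings are all invented for this sketch, and try-with-resources merely stands in for the C++ constructor/destructor pairing.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the sequence described above. Not HotSpot code; every name is illustrative.
final class EscapeBarrierSequenceModel {

    /** Stand-in for an escape barrier: "suspends" on construction, "resumes" on close(). */
    static final class Barrier implements AutoCloseable {
        private final List<String> log;

        Barrier(List<String> log) {
            this.log = log;
            log.add("suspend target thread");
        }

        void deoptimizeObjects() {
            log.add("deoptimize objects");
        }

        @Override
        public void close() {
            log.add("resume target thread");
        }
    }

    /** Runs one simulated get/set-local operation and returns the recorded steps. */
    static List<String> runOperation() {
        List<String> log = new ArrayList<>();
        try (Barrier barrier = new Barrier(log)) {
            barrier.deoptimizeObjects();        // models the prologue step
            log.add("access local variable");   // models the operation itself
        }                                       // close() runs here, modelling the destructor
        return log;
    }

    public static void main(String[] args) {
        runOperation().forEach(System.out::println);
    }
}
```

The point of the sketch is only the ordering: the resume step is tied to scope exit, so it cannot be forgotten on any path out of the operation.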
> > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has
> > non-globally escaping objects and other frames if they have arg escaping
> > ones. Good.
> >
> >
> > src/hotspot/share/prims/jvmtiTagMap.cpp
> > Escape barriers + deoptimize objects for all threads. Ok.
> >
> >
> > src/hotspot/share/prims/whitebox.cpp
> > Added WB_IsFrameDeoptimized to API. Ok.
> >
> >
> > src/hotspot/share/runtime/deoptimization.cpp
> > Object deoptimization. I have more comments and proposals, here.
> > First of all, handling recursive and waiting locks in relock_objects is tricky,
> but
> > looks correct.
> > Comments are sufficient to understand why things are done as they are
> > implemented.
> >
> > BiasedLocking related parts are complex, but we may get rid of them in the
> > future (with BiasedLocking removal).
> > Anyway, looks correct, too.
> >
> > Typo in comment: "regularily" => "regularly"
> >
> > Deoptimization::fetch_unroll_info_helper is the only place where
> > _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> > But I think we always go through it, so I can't see a memory leak or such
> kind
> > of issues.
> >
> > EscapeBarrier::deoptimize_objects: ResourceMark should use
> > calling_thread().
> >
> > You can use MutexLocker and MonitorLocker with Thread* to save the
> > Thread::current() call.
> >
> > I'd make set_objs_are_deoptimized static and remove it from the
> > EscapeBarrier interface because I think it shouldn't be used outside of
> > EscapeBarrier::deoptimize_objects.
> >
> > Typo in comment: "we must only deoptimize" => "we only have to
> > deoptimize"
> >
> > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> > barrier_active() is redundant. Implementation can get moved to hpp file.
> >
> > I'll get back to suspend flags, later.
> >
> > There are weird cases regarding _self_deoptimization_in_progress.
> > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C.
> C
> > can set _self_deoptimization_in_progress while A performs the handshake
> > for suspending C. I think this doesn't lead to errors, but it's probably not
> > desired.
> > I think it would be better to use only one "wait" call in
> > sync_and_suspend_one and sync_and_suspend_all.
> >
> > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > reduce thread state transitions, but that seems to be problematic because
> > ThreadBlockInVM destructor contains a safepoint check which we
> shouldn't
> > do while holding EscapeBarrier_lock. So no change request.
> >
> > Change in thread_added:
> > I think the sequence would be more comprehensive if we waited for
> > deopt_all_threads in Thread::start and all other places where a new thread
> > can run into Java code (e.g. JVMTI attach).
> > Your version makes new threads come up with suspend flag set. That looks
> > correct, too. Advantage is that you only have to change one place
> > (thread_added). It'll be interesting to see what it will look like when we use
> > async handshakes instead of suspend flags.
> > For now, I'm ok with your version.
> >
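To make the "new threads come up with suspend flag set" variant concrete, here is a minimal Java toy model. It is a sketch under stated assumptions, not the HotSpot implementation: ThreadAddedModel, ModelThread and all method names are hypothetical, and the object's own monitor stands in for EscapeBarrier_lock.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; names are hypothetical and the real VM logic is more involved.
final class ThreadAddedModel {

    /** Stand-in for a Java thread carrying an object-deoptimization suspend flag. */
    static final class ModelThread {
        private boolean objDeoptSuspendFlag;
    }

    private final List<ModelThread> threads = new ArrayList<>();
    private boolean deoptAllInProgress;

    /** A new thread inherits the suspend flag iff a deopt-all operation is running. */
    synchronized ModelThread threadAdded() {
        ModelThread t = new ModelThread();
        t.objDeoptSuspendFlag = deoptAllInProgress;
        threads.add(t);
        return t;
    }

    /** Start of a deopt-all operation: flag every known thread. */
    synchronized void beginDeoptAll() {
        deoptAllInProgress = true;
        for (ModelThread t : threads) {
            t.objDeoptSuspendFlag = true;
        }
    }

    /** End of the operation: clear all flags, including those of late arrivals. */
    synchronized void endDeoptAll() {
        deoptAllInProgress = false;
        for (ModelThread t : threads) {
            t.objDeoptSuspendFlag = false;
        }
    }

    synchronized boolean isFlagged(ModelThread t) {
        return t.objDeoptSuspendFlag;
    }

    public static void main(String[] args) {
        ThreadAddedModel model = new ThreadAddedModel();
        model.beginDeoptAll();
        ModelThread late = model.threadAdded();
        System.out.println("late thread flagged: " + model.isFlagged(late));
        model.endDeoptAll();
        System.out.println("late thread flagged after end: " + model.isFlagged(late));
    }
}
```

In this model only threadAdded() has to know about an ongoing operation, which mirrors the advantage named above: a single place to change.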
> > I'd only move MutexLocker ml(EscapeBarrier_lock...) after
> > if (!jt->is_hidden_from_external_view()).
> >
> > Having 4 different deoptimize_objects functions makes it a little hard to
> keep
> > an overview of which one is used for what.
> > Maybe adding suffixes would help a little bit, but I can also live with what
> you
> > have.
> > Implementation looks correct to me.
> >
> >
> > src/hotspot/share/runtime/deoptimization.hpp
> > Escape barriers and object deoptimization functions.
> > Typo in comment: "helt" => "held"
> >
> >
> > src/hotspot/share/runtime/globals.hpp
> > Addition of develop flag DeoptimizeObjectsALotInterval. Ok.
> >
> >
> > src/hotspot/share/runtime/interfaceSupport.cpp
> > InterfaceSupport::deoptimizeAllObjects() is only used for
> > DeoptimizeObjectsALot = 1.
> > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad
> > to have DeoptimizeObjectsALot = 1 in addition. Ok.
> >
> >
> > src/hotspot/share/runtime/interfaceSupport.inline.hpp
> > Addition of deoptimizeAllObjects. Ok.
> >
> >
> > src/hotspot/share/runtime/mutexLocker.cpp
> > src/hotspot/share/runtime/mutexLocker.hpp
> > Addition of EscapeBarrier_lock. Ok.
> >
> >
> > src/hotspot/share/runtime/objectMonitor.cpp
> > Make recursion count relock aware. Ok.
> >
> >
> > src/hotspot/share/runtime/stackValue.hpp
> > Better reinitialization in StackValue. Good.
> >
> >
> > src/hotspot/share/runtime/thread.cpp
> > src/hotspot/share/runtime/thread.hpp
> > src/hotspot/share/runtime/thread.inline.hpp
> > wait_for_object_deoptimization, suspend flag, deferred updates and test
> > feature to deoptimize objects.
> >
> > In the long term, we want to get rid of suspend flags, so it's not so nice to
> > introduce a new one. But I agree with Götz that it should be acceptable as
> > temporary solution until async handshakes are available (which takes more
> > time). So I'm ok with your change.
> >
> > You can use MutexLocker with Thread*.
> >
> > JVMTIDeferredUpdates: I agree with Robin. It'd be nice to move the class
> out
> > of thread.hpp.
> >
> >
> > src/hotspot/share/runtime/vframe.cpp
> > Added support for entry frame to new_vframe. Ok.
> >
> >
> > src/hotspot/share/runtime/vframe_hp.cpp
> > src/hotspot/share/runtime/vframe_hp.hpp
> >
> > I think code()->as_nmethod() in not_global_escape_in_scope() and
> > arg_escape() should better be under #ifdef ASSERT or inside the assert
> > statement (no need for code cache walking in product build).
> >
> > jvmtiDeferredLocalVariableSet::update_monitors:
> > Please add a comment explaining that owner referenced by original info
> may
> > be scalar replaced, but it is deoptimized in the vframe.
> >
> >
> > src/hotspot/share/utilities/macros.hpp
> > Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.
> >
> >
> > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
> > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
> > New test. Will review separately.
> >
> >
> > test/jdk/TEST.ROOT
> > Addition of vm.jvmci as required property. Ok.
> >
> >
> > test/jdk/com/sun/jdi/EATests.java
> > test/jdk/com/sun/jdi/EATestsJVMCI.java
> > New test. Will review separately.
> >
> >
> > test/lib/sun/hotspot/WhiteBox.java
> > Added isFrameDeoptimized to API. Ok.
> >
> >
> > That was it. Best regards,
> > Martin
> >
> >
> > > -----Original Message-----
> > > From: hotspot-compiler-dev <hotspot-compiler-dev-
> > > bounces at openjdk.java.net> On Behalf Of Reingruber, Richard
> > > Sent: Tuesday, 3 March 2020 21:23
> > > To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > > <goetz.lindenmaier at sap.com>; David Holmes
> > <david.holmes at oracle.com>;
> > > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > > dev at openjdk.java.net
> > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> > > Performance in the Presence of JVMTI Agents
> > >
> > > Hi Robbin,
> > >
> > > > > I understand that Robbin proposed to replace the usage of
> > > > > _suspend_flag with handshakes. Apparently, async handshakes
> > > > > are needed to do so. We have been waiting a while for removal
> > > > > of the _suspend_flag / introduction of async handshakes [2].
> > > > > What is the status here?
> > >
> > > > I have an old prototype which I would like to continue to work on.
> > > > So do not assume asynch handshakes will make 15.
> > > > Even if it would, I think there is a lot more investigation work to remove
> > > > _suspend_flag.
> > >
> > > Let us know, if we can be of any help to you and be it only testing.
> > >
> > > > >> Full:
> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> > >
> > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > > You can move both declaration and definition to that file, no need to
> > > clobber
> > > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> > >
> > > Will do.
> > >
> > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in
> > its
> > > own
> > > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> > >
> > > You are right. It shouldn't be declared in thread.hpp. I will look into that.
> > >
> > > > Note that we also think we may have a bug in deopt:
> > > > https://bugs.openjdk.java.net/browse/JDK-8238237
> > >
> > > > I think it would be best, if possible, to push after that is resolved.
> > >
> > > Sure.
> > >
> > > > Not even nearly a full review :)
> > >
> > > I know :)
> > >
> > > Anyways, thanks a lot,
> > > Richard.
> > >
> > >
> > > -----Original Message-----
> > > From: Robbin Ehn <robbin.ehn at oracle.com>
> > > Sent: Monday, March 2, 2020 11:17 AM
> > > To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber,
> > Richard
> > > <richard.reingruber at sap.com>; David Holmes
> > <david.holmes at oracle.com>;
> > > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > > dev at openjdk.java.net
> > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> Performance
> > > in the Presence of JVMTI Agents
> > >
> > > Hi,
> > >
> > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > > > Hi,
> > > >
> > > > I had a look at the progress of this change. Nothing
> > > > happened since Richard posted his update using more
> > > > handshakes [1].
> > > > But we (SAP) would appreciate a lot if this change could
> > > > be successfully reviewed and pushed.
> > > >
> > > > I think there is basic understanding that this
> > > > change is helpful. It fixes a number of issues with JVMTI,
> > > > and will deliver the same performance benefits as EA
> > > > does in current production mode for debugging scenarios.
> > > >
> > > > This is important for us as we run our VMs prepared
> > > > for debugging in production mode.
> > > >
> > > > I understand that Robbin proposed to replace the usage of
> > > > _suspend_flag with handshakes. Apparently, async handshakes
> > > > are needed to do so. We have been waiting a while for removal
> > > > of the _suspend_flag / introduction of async handshakes [2].
> > > > What is the status here?
> > >
> > > I have an old prototype which I would like to continue to work on.
> > > So do not assume asynch handshakes will make 15.
> > > Even if it would, I think there is a lot more investigation work to remove
> > > _suspend_flag.
> > >
> > > >
> > > > I think we should no longer wait, but proceed with
> > > > this change. We will look into removing the usage of
> > > > suspend_flag introduced here once it is possible to implement
> > > > it with handshakes.
> > >
> > > Yes, sure.
> > >
> > > >> Full:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> > >
> > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > You can move both declaration and definition to that file, no need to
> > clobber
> > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> > >
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't be in
> its
> > > own
> > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> > >
> > > Note that we also think we may have a bug in deopt:
> > > https://bugs.openjdk.java.net/browse/JDK-8238237
> > >
> > > I think it would be best, if possible, to push after that is resolved.
> > >
> > > Not even nearly a full review :)
> > >
> > > Thanks, Robbin
> > >
> > >
> > > >> Incremental:
> > > >>
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> > > >>
> > > >> I was not able to eliminate the additional suspend flag now. I'll take
> care
> > > of this
> > > >> as soon as the
> > > >> existing suspend-resume-mechanism is reworked.
> > > >>
> > > >> Testing:
> > > >>
> > > >> Nightly tests @SAP:
> > > >>
> > > >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015,
> > > Renaissance
> > > >> Suite, SAP specific tests
> > > >>    with fastdebug and release builds on all platforms
> > > >>
> > > >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x
> > > parallel
> > > >> for 24h
> > > >>
> > > >> Thanks, Richard.
> > > >>
> > > >>
> > > >> More details on the changes:
> > > >>
> > > >> * Hide DeoptimizeObjectsALotThread from external view.
> > > >>
> > > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> > > >>    It used to be _safepoint_check_sometimes, which will be eliminated
> > > sooner or
> > > >> later.
> > > >>    I added explicit thread state changes with ThreadBlockInVM to code
> > > paths
> > > >> where we can wait()
> > > >>    on EscapeBarrier_lock to become safepoint safe.
> > > >>
> > > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target
> > > threads
> > > >> instead of vm operation
> > > >>    VM_ThreadSuspendAllForObjDeopt.
> > > >>
> > > >> * Removed uses of Threads_lock. When adding a new thread we
> > suspend
> > > it iff
> > > >> EA optimizations are
> > > >>    being reverted. In the previous version we were waiting on
> > > Threads_lock
> > > >> while EA optimizations
> > > >>    were reverted. See EscapeBarrier::thread_added().
> > > >>
> > > >> * Made tests require Xmixed compilation mode.
> > > >>
> > > >> * Made tests agnostic regarding tiered compilation.
> > > >>    I.e. tc isn't disabled anymore, and the tests can be run with tc
> enabled
> > or
> > > >> disabled.
> > > >>
> > > >> * Exercising EATests.java as well with stress test options
> > > >> DeoptimizeObjectsALot*
> > > >>    Due to the non-deterministic deoptimizations some tests need to be
> > > skipped.
> > > >>    We do this to prevent bit-rot of the stress test code.
> > > >>
> > > >> * Executing EATests.java as well with graal if available. Driver for this is
> > > >>    EATestsJVMCI.java. Graal cannot pass all tests, because it does not
> > > provide all
> > > >> the new debug info
> > > >>    (namely not_global_escape_in_scope and arg_escape in
> > > scopeDesc.hpp).
> > > >>    And graal does not yet support the JVMTI operations force early
> > return
> > > and
> > > >> pop frame.
> > > >>
> > > >> * Removed tracing from new jdi tests in EATests.java. Too much trace
> > > output
> > > >> before the debugging
> > > >>    connection is established can cause deadlock because output buffers
> > fill
> > > up.
> > > >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> > > >>
> > > >> * Many copyright year changes and smaller clean-up changes of
> testing
> > > code
> > > >> (trailing white-space and
> > > >>    the like).
> > > >>
> > > >>
> > > >> -----Original Message-----
> > > >> From: David Holmes <david.holmes at oracle.com>
> > > >> Sent: Thursday, 19 December 2019 03:12
> > > >> To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-
> > > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> > > hotspot-
> > > >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> > > (vladimir.kozlov at oracle.com)
> > > >> <vladimir.kozlov at oracle.com>
> > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > Performance in
> > > >> the Presence of JVMTI Agents
> > > >>
> > > >> Hi Richard,
> > > >>
> > > >> I think my issue is with the way EliminateNestedLocks works so I'm
> going
> > > >> to look into that more deeply.
> > > >>
> > > >> Thanks for the explanations.
> > > >>
> > > >> David
> > > >>
> > > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> > > >>> Hi David,
> > > >>>
> > > >>>     > >    > Some further queries/concerns:
> > > >>>     > >    >
> > > >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>     > >    >
> > > >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> > > >>>     > >    >
> > > >>>     > >    > !   _recursions = save      // restore the old recursion count
> > > >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> > > >>>     > >    > increased by the deferred relock count
> > > >>>     > >    >
> > > >>>     > >    > what is the "deferred relock count"? I gather it relates to
> > > >>>     > >    >
> > > >>>     > >    > "The code was extended to be able to deoptimize objects of
> a
> > > >>>     > > frame that
> > > >>>     > >    > is not the top frame and to let another thread than the
> > owning
> > > >>>     > > thread do
> > > >>>     > >    > it."
> > > >>>     > >
> > > >>>     > > Yes, these relate. Currently EA based optimizations are reverted,
> > > when a
> > > >> compiled frame is
> > > >>>     > > replaced with corresponding interpreter frames. Part of this is
> > > relocking
> > > >> objects with eliminated
> > > >>>     > > locking. New with the enhancement is that we do this also just
> > > before
> > > >> object references are
> > > >>>     > > acquired through JVMTI. In this case we deoptimize also the
> > > owning
> > > >> compiled frame C and we
> > > >>>     > > register deoptimized objects as deferred updates. When control
> > > returns
> > > >> to C it gets deoptimized,
> > > >>>     > > we notice that objects are already deoptimized (reallocated and
> > > >> relocked), so we don't do it again
> > > >>>     > > (relocking twice would be incorrect of course). Deferred
> updates
> > > are
> > > >> copied into the new
> > > >>>     > > interpreter frames.
> > > >>>     > >
> > > >>>     > > Problem: relocking is not possible if the target thread T is
> waiting
> > > on the
> > > >> monitor that needs to
> > > >>>     > > be relocked. This happens only with non-local objects with
> > > >> EliminateNestedLocks. Instead relocking
> > > >>>     > > is deferred until T owns the monitor again. This is what the
> piece
> > of
> > > >> code above does.
> > > >>>     >
> > > >>>     >  Sorry I need some more detail here. How can you wait() on an
> > > object
> > > >>>     >  monitor if the object allocation and/or locking was optimised
> > away?
> > > And
> > > >>>     >  what is a "non-local object" in this context? Isn't EA restricted to
> > > >>>     >  thread-confined objects?
> > > >>>
> > > >>> "Non-local object" is an object that escapes its thread. The issue I'm
> > > >> addressing with the changes
> > > >>> in ObjectMonitor::wait are almost unrelated to EA. They are caused
> by
> > > >> EliminateNestedLocks, where C2
> > > >>> eliminates recursive locking of an already owned lock. The lock
> owning
> > > object
> > > >> exists on the heap, it
> > > >>> is locked and you can call wait() on it.
> > > >>>
> > > >>> EliminateLocks is the C2 option that controls lock elimination based
> on
> > > EA.
> > > >> Both optimizations have
> > > >>> in common that objects with eliminated locking need to be relocked
> > > when
> > > >> deoptimizing a frame,
> > > >>> i.e. when replacing a compiled frame with equivalent interpreter
> > > >>> frames. Deoptimization::relock_objects does that job for /all/
> > eliminated
> > > >> locks in scope. /All/ can
> > > >>> be a mix of eliminated nested locks and locks of not-escaping objects.
> > > >>>
> > > >>> New with the enhancement: I call relock_objects earlier, just before
> > > objects
> > > >> potentially
> > > >>> escape. But then later when the owning compiled frame gets
> > > deoptimized, I
> > > >> must not do it again:
> > > >>>
> > > >>> See call to EscapeBarrier::objs_are_deoptimized in
> > deoptimization.cpp:
> > > >>>
> > > >>>    373   if ((jvmci_enabled || ((DoEscapeAnalysis ||
> > > EliminateNestedLocks) &&
> > > >> EliminateLocks))
> > > >>>    374       && !EscapeBarrier::objs_are_deoptimized(thread,
> > > deoptee.id())) {
> > > >>>    375     bool unused;
> > > >>>    376     eliminate_locks(thread, chunk, realloc_failures, deoptee,
> > > exec_mode,
> > > >> unused);
> > > >>>    377   }
> > > >>>
> > > >>> Now when calling relock_objects early it is quite possible that I have
> to
> > > relock
> > > >> an object the
> > > >>> target thread currently waits for. Obviously I cannot relock in this
> case,
> > > >> instead I chose to
> > > >>> introduce relock_count_after_wait to JavaThread.
> > > >>>
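The idea of deferring the relock until the waiting thread owns the monitor again can be pictured with a small Java toy model. This is a hedged sketch, not HotSpot code: DeferredRelockModel and its members are hypothetical names, and a plain "hold count" is used here as a simplification of the monitor's real recursion bookkeeping.

```java
// Toy model only. It mirrors the shape of
//   _recursions = save + jt->get_and_reset_relock_count_after_wait()
// with invented names and a simplified hold count.
final class DeferredRelockModel {

    /** Stand-in for the per-thread deferred relock counter. */
    static final class ThreadState {
        private int relockCountAfterWait;

        void deferRelock() {
            relockCountAfterWait++;
        }

        int getAndResetRelockCountAfterWait() {
            int count = relockCountAfterWait;
            relockCountAfterWait = 0;
            return count;
        }
    }

    /** Stand-in for the monitor state around wait(). */
    static final class Monitor {
        private int holdCount;
        private int saved;

        Monitor(int holdCount) {
            this.holdCount = holdCount;
        }

        void beginWait() {
            saved = holdCount;   // remember the count visible before waiting
            holdCount = 0;       // the monitor is fully released while waiting
        }

        void endWait(ThreadState owner) {
            // restore the old count, increased by relocks deferred during the wait
            holdCount = saved + owner.getAndResetRelockCountAfterWait();
        }

        int holdCount() {
            return holdCount;
        }
    }

    /** Simulates one wait() during which some relocks had to be deferred. */
    static int holdCountAfterWait(int holdCountBeforeWait, int deferredRelocks) {
        ThreadState owner = new ThreadState();
        Monitor monitor = new Monitor(holdCountBeforeWait);
        monitor.beginWait();
        for (int i = 0; i < deferredRelocks; i++) {
            owner.deferRelock();   // relocking is impossible now, so only count it
        }
        monitor.endWait(owner);
        return monitor.holdCount();
    }

    public static void main(String[] args) {
        System.out.println(holdCountAfterWait(1, 2));
    }
}
```

With one visible acquisition and two eliminated nested locks deferred during the wait, the model ends with a hold count of three. Because the counter is reset when read, a later wait() without further deferrals leaves the count unchanged.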
> > > >>>     >  Is it just that some of the locking gets optimized away e.g.
> > > >>>     >
> > > >>>     >  synchronized(obj) {
> > > >>>     >     synchronized(obj) {
> > > >>>     >       synchronized(obj) {
> > > >>>     >         obj.wait();
> > > >>>     >       }
> > > >>>     >     }
> > > >>>     >  }
> > > >>>     >
> > > >>>     >  If this is reduced to a form as-if it were a single lock of the
> monitor
> > > >>>     >  (due to EA) and the wait() triggers a JVM TI event which leads to
> > the
> > > >>>     >  escape of "obj" then we need to reconstruct the true lock state,
> > and
> > > so
> > > >>>     >  when the wait() internally unblocks and reacquires the monitor it
> > > has to
> > > >>>     >  set the true recursion count to 3, not the 1 that it appeared to be
> > > when
> > > >>>     >  wait() was initially called. Is that the scenario?
> > > >>>
> > > >>> Kind of... except that the locking is not eliminated due to EA and
> there
> > is
> > > no
> > > >> JVM TI event
> > > >>> triggered by wait.
> > > >>>
> > > >>> Add
> > > >>>
> > > >>> LocalObject l1 = new LocalObject();
> > > >>>
> > > >>> in front of the synchronized blocks and assume a JVM TI agent
> acquires
> > l1.
> > > This
> > > >> triggers the code in
> > > >>> question.
> > > >>>
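The Java-level behaviour that has to be preserved in this scenario can be checked with a small standalone program. The sketch below is illustrative only (NestedWaitDemo is a made-up name) and says nothing about whether a JIT actually eliminates the nested locks here. It shows that wait() inside three nested synchronized blocks releases the monitor completely, so another thread can enter, and that the waiter owns the monitor again once wait() returns.

```java
// Illustrative demo only; the class and method names are made up for this sketch.
final class NestedWaitDemo {

    /**
     * Returns true if the waiting thread owns the monitor again after wait()
     * returns from three nested synchronized blocks.
     */
    static boolean run() {
        final Object obj = new Object();
        final boolean[] waiting = {false};
        final boolean[] notified = {false};
        final boolean[] ownedAfterWait = {false};

        Thread waiter = new Thread(() -> {
            synchronized (obj) {
                synchronized (obj) {
                    synchronized (obj) {
                        waiting[0] = true;
                        try {
                            while (!notified[0]) {
                                obj.wait(); // releases the monitor completely
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                        ownedAfterWait[0] = Thread.holdsLock(obj);
                    }
                }
            }
        });
        waiter.start();

        try {
            boolean done = false;
            while (!done) {
                // Entering this block while waiting[0] is true is only possible
                // because wait() gave up all three nested acquisitions.
                synchronized (obj) {
                    if (waiting[0]) {
                        notified[0] = true;
                        obj.notifyAll();
                        done = true;
                    }
                }
                if (!done) {
                    Thread.sleep(1);
                }
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return ownedAfterWait[0];
    }

    public static void main(String[] args) {
        System.out.println("monitor owned after wait(): " + run());
    }
}
```

After wait() returns, the waiter still has to leave all three synchronized blocks, which is why the restored lock state must account for every nested acquisition.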
> > > >>> See that relocking/reallocating is transactional. If it is done then for
> > /all/
> > > >> objects in scope and it is
> > > >>> done at most once. It wouldn't be quite so easy to split this in
> relocking
> > > of
> > > >> nested/EA-based
> > > >>> eliminated locks.
> > > >>>
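The "for all objects in scope, and at most once" property can be sketched as a guard keyed by a frame id. This is a toy Java model under stated assumptions, not the VM code: RelockOnceModel and its methods are hypothetical, and a set of ids stands in for whatever bookkeeping backs objs_are_deoptimized.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only; all names are invented for illustration.
final class RelockOnceModel {

    private final Set<Long> deoptimizedFrames = new HashSet<>();
    private int relockOperations;

    /** Models the check whether a frame's objects were already deoptimized. */
    boolean objsAreDeoptimized(long frameId) {
        return deoptimizedFrames.contains(frameId);
    }

    /**
     * Reallocates and relocks all objects of the frame as one unit.
     * Returns true if work was done, false if it had been done earlier.
     */
    boolean deoptimizeObjects(long frameId) {
        if (objsAreDeoptimized(frameId)) {
            return false;               // relocking twice would be incorrect
        }
        relockOperations++;             // covers every eliminated lock in scope
        deoptimizedFrames.add(frameId);
        return true;
    }

    int relockOperations() {
        return relockOperations;
    }

    public static void main(String[] args) {
        RelockOnceModel model = new RelockOnceModel();
        System.out.println(model.deoptimizeObjects(42L)); // early, before objects escape
        System.out.println(model.deoptimizeObjects(42L)); // later, at frame deoptimization
    }
}
```

The second call models the later deoptimization of the owning compiled frame, which finds the work already done and skips it.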
> > > >>>     >  If so I find this truly awful. Anyone using wait() in a realistic form
> > > >>>     >  requires a notification and so the object cannot be thread
> > confined.
> > > In
> > > >>>
> > > >>> It is not thread confined.
> > > >>>
> > > >>>     >  which case I would strongly argue that upon hitting the wait() the
> > > deopt
> > > >>>     >  should occur unconditionally and so the lock state is correct
> before
> > > we
> > > >>>     >  wait and so we don't need to mess with the recursion count
> > > internally
> > > >>>     >  when we reacquire the monitor.
> > > >>>     >
> > > >>>     > >
> > > >>>     > >    > which I don't like the sound of at all when it comes to
> > > ObjectMonitor
> > > >>>     > >    > state. So I'd like to understand in detail exactly what is going
> > on
> > > here
> > > >>>     > >    > and why.  This is a very intrusive change that seems to badly
> > > break
> > > >>>     > >    > encapsulation and impacts future changes to ObjectMonitor
> > > that are
> > > >> under
> > > >>>     > >    > investigation.
> > > >>>     > >
> > > >>>     > > I would not regard this as breaking encapsulation. Certainly not
> > > badly.
> > > >>>     > >
> > > >>>     > > I've added a property relock_count_after_wait to JavaThread.
> > The
> > > >> property is well
> > > >>>     > > encapsulated. Future ObjectMonitor implementations have to
> > deal
> > > with
> > > >> recursion too. They are free
> > > >>>     > > in choosing a way to do that as long as that property is taken
> into
> > > >> account. This is hardly a
> > > >>>     > > limitation.
> > > >>>     >
> > > >>>     >  I do think this badly breaks encapsulation as you have to add a
> > > callout
> > > >>>     >  from the guts of the ObjectMonitor code to reach into the thread
> > to
> > > get
> > > >>>     >  this lock count adjustment. I understand why you have had to do
> > > this but
> > > >>>     >  I would much rather see a change to the EA optimisation strategy
> > so
> > > that
> > > >>>     >  this is not needed.
> > > >>>     >
> > > >>>     > > Note also that the property is a straightforward extension of
> the
> > > >> existing concept of deferred
> > > >>>     > > local updates. It is embedded into the structure holding them.
> So
> > > not
> > > >> even the footprint of a
> > > >>>     > > JavaThread is enlarged if no deferred updates are generated.
> > > >>>     >
> > > >>>     > [...]
> > > >>>     >
> > > >>>     > >
> > > >>>     > > I'm actually duplicating the existing external suspend
> mechanism,
> > > >> because a thread can be
> > > >>>     > > suspended at most once. And hey, I don't like that either!
> But
> > it
> > > >> seems not unlikely that the
> > > >>>     > > duplicate can be removed together with the original and the
> new
> > > type
> > > >> of handshakes that will be
> > > >>>     > > used for thread suspend can be used for object deoptimization
> > > too. See
> > > >> today's discussion in
> > > >>>     > > JDK-8227745 [2].
> > > >>>     >
> > > >>>     >  I hope that discussion bears some fruit, at the moment it seems
> > not
> > > to
> > > >>>     >  be possible to use handshakes here. :(
> > > >>>     >
> > > >>>     >  The external suspend mechanism is a royal pain in the proverbial
> > > that we
> > > >>>     >  have to carefully live with. The idea that we're duplicating that
> for
> > > >>>     >  use in another fringe area of functionality does not thrill me at all.
> > > >>>     >
> > > >>>     >  To be clear, I understand the problem that exists and that you
> > wish
> > > to
> > > >>>     >  solve, but for the runtime parts I balk at the complexity cost of
> > > >>>     >  solving it.
> > > >>>
> > > >>> I know it's complex, but by far no rocket science.
> > > >>>
> > > >>> Also I find it hard to imagine another fix for JDK-8233915 besides
> > > changing
> > > >> the JVM TI specification.
> > > >>>
> > > >>> Thanks, Richard.
> > > >>>
> > > >>> -----Original Message-----
> > > >>> From: David Holmes <david.holmes at oracle.com>
> > > >>> Sent: Dienstag, 17. Dezember 2019 08:03
> > > >>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> serviceability-
> > > >> dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net;
> > > hotspot-
> > > >> runtime-dev at openjdk.java.net; Vladimir Kozlov
> > > (vladimir.kozlov at oracle.com)
> > > >> <vladimir.kozlov at oracle.com>
> > > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > Performance
> > > >> in the Presence of JVMTI Agents
> > > >>>
> > > >>> <resend as my mailer crashed during last send>
> > > >>>
> > > >>> David
> > > >>>
> > > >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> > > >>>> Hi Richard,
> > > >>>>
> > > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> > > >>>>> Hi David,
> > > >>>>>
> > > >>>>>      > Some further queries/concerns:
> > > >>>>>      >
> > > >>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>      >
> > > >>>>>      > Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>      >
> > > >>>>>      > !   _recursions = save      // restore the old recursion count
> > > >>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >>>>>      >
> > > >>>>>      > what is the "deferred relock count"? I gather it relates to
> > > >>>>>      >
> > > >>>>>      > "The code was extended to be able to deoptimize objects of a frame that
> > > >>>>>      > is not the top frame and to let another thread than the owning thread do
> > > >>>>>      > it."
> > > >>>>>
> > > >>>>> Yes, these relate. Currently EA based optimizations are reverted,
> > > when
> > > >>>>> a compiled frame is replaced
> > > >>>>> with corresponding interpreter frames. Part of this is relocking
> > > >>>>> objects with eliminated
> > > >>>>> locking. New with the enhancement is that we do this also just
> > before
> > > >>>>> object references are acquired
> > > >>>>> through JVMTI. In this case we deoptimize also the owning
> compiled
> > > >>>>> frame C and we register
> > > >>>>> deoptimized objects as deferred updates. When control returns to
> > C
> > > it
> > > >>>>> gets deoptimized, we notice
> > > >>>>> that objects are already deoptimized (reallocated and relocked), so
> > > we
> > > >>>>> don't do it again (relocking
> > > >>>>> twice would be incorrect of course). Deferred updates are copied
> > into
> > > >>>>> the new interpreter frames.
> > > >>>>>
> > > >>>>> Problem: relocking is not possible if the target thread T is waiting
> > > >>>>> on the monitor that needs to be
> > > >>>>> relocked. This happens only with non-local objects with
> > > >>>>> EliminateNestedLocks. Instead relocking is
> > > >>>>> deferred until T owns the monitor again. This is what the piece of
> > > >>>>> code above does.
> > > >>>>
> > > >>>> Sorry I need some more detail here. How can you wait() on an
> object
> > > >>>> monitor if the object allocation and/or locking was optimised away?
> > > And
> > > >>>> what is a "non-local object" in this context? Isn't EA restricted to
> > > >>>> thread-confined objects?
> > > >>>>
> > > >>>> Is it just that some of the locking gets optimized away e.g.
> > > >>>>
> > > >>>> synchronized(obj) {
> > > >>>>   synchronized(obj) {
> > > >>>>     synchronized(obj) {
> > > >>>>       obj.wait();
> > > >>>>     }
> > > >>>>   }
> > > >>>> }
> > > >>>>
> > > >>>> If this is reduced to a form as-if it were a single lock of the monitor
> > > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> > > >>>> escape of "obj" then we need to reconstruct the true lock state, and
> > so
> > > >>>> when the wait() internally unblocks and reacquires the monitor it
> has
> > to
> > > >>>> set the true recursion count to 3, not the 1 that it appeared to be
> > when
> > > >>>> wait() was initially called. Is that the scenario?
> > > >>>>
> > > >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> > > >>>> requires a notification and so the object cannot be thread confined.
> > In
> > > >>>> which case I would strongly argue that upon hitting the wait() the
> > > deopt
> > > >>>> should occur unconditionally and so the lock state is correct before
> > we
> > > >>>> wait and so we don't need to mess with the recursion count
> internally
> > > >>>> when we reacquire the monitor.
> > > >>>>
> > > >>>>>
> > > >>>>>      > which I don't like the sound of at all when it comes to ObjectMonitor
> > > >>>>>      > state. So I'd like to understand in detail exactly what is going on here
> > > >>>>>      > and why.  This is a very intrusive change that seems to badly break
> > > >>>>>      > encapsulation and impacts future changes to ObjectMonitor that are under
> > > >>>>>      > investigation.
> > > >>>>>
> > > >>>>> I would not regard this as breaking encapsulation. Certainly not
> > badly.
> > > >>>>>
> > > >>>>> I've added a property relock_count_after_wait to JavaThread. The
> > > >>>>> property is well
> > > >>>>> encapsulated. Future ObjectMonitor implementations have to deal
> > > with
> > > >>>>> recursion too. They are free in
> > > >>>>> choosing a way to do that as long as that property is taken into
> > > >>>>> account. This is hardly a
> > > >>>>> limitation.
> > > >>>>
> > > >>>> I do think this badly breaks encapsulation as you have to add a
> callout
> > > >>>> from the guts of the ObjectMonitor code to reach into the thread to
> > > get
> > > >>>> this lock count adjustment. I understand why you have had to do
> this
> > > but
> > > >>>> I would much rather see a change to the EA optimisation strategy so
> > > that
> > > >>>> this is not needed.
> > > >>>>
> > > >>>>> Note also that the property is a straight forward extension of the
> > > >>>>> existing concept of deferred
> > > >>>>> local updates. It is embedded into the structure holding them. So
> > not
> > > >>>>> even the footprint of a
> > > >>>>> JavaThread is enlarged if no deferred updates are generated.
> > > >>>>>
> > > >>>>>      > ---
> > > >>>>>      >
> > > >>>>>      > src/hotspot/share/runtime/thread.cpp
> > > >>>>>      >
> > > >>>>>      > Can you please explain why JavaThread::wait_for_object_deoptimization
> > > >>>>>      > has to be handcrafted in this way rather than using proper transitions.
> > > >>>>>      >
> > > >>>>>
> > > >>>>> I wrote wait_for_object_deoptimization taking
> > > >>>>> JavaThread::java_suspend_self_with_safepoint_check
> > > >>>>> as template. So in short: for the same reasons :)
> > > >>>>>
> > > >>>>> Threads reach both methods as part of thread state transitions,
> > > >>>>> therefore special handling is
> > > >>>>> required to change thread state on top of ongoing transitions.
> > > >>>>>
> > > >>>>>      > We got rid of "deopt suspend" some time ago and it is disturbing to see
> > > >>>>>      > it being added back (effectively). This seems like it may be something
> > > >>>>>      > that handshakes could be used for.
> > > >>>>>
> > > >>>>> Deopt suspend used to be something rather different with a
> similar
> > > >>>>> name[1]. It is not being added back.
> > > >>>>
> > > >>>> I stand corrected. Despite comments in the code to the contrary
> > > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a
> lot
> > of
> > > >>>> cleanup in this area 13 years ago :)
> > > >>>>
> > > >>>>>
> > > >>>>> I'm actually duplicating the existing external suspend mechanism,
> > > >>>>> because a thread can be suspended
> > > >>>>> at most once. And hey, I don't like that either! But it seems not
> > > >>>>> unlikely that the duplicate can
> > > >>>>> be removed together with the original and the new type of
> > > handshakes
> > > >>>>> that will be used for
> > > >>>>> thread suspend can be used for object deoptimization too. See
> > > today's
> > > >>>>> discussion in JDK-8227745 [2].
> > > >>>>
> > > >>>> I hope that discussion bears some fruit, at the moment it seems not
> > to
> > > >>>> be possible to use handshakes here. :(
> > > >>>>
> > > >>>> The external suspend mechanism is a royal pain in the proverbial
> that
> > > we
> > > >>>> have to carefully live with. The idea that we're duplicating that for
> > > >>>> use in another fringe area of functionality does not thrill me at all.
> > > >>>>
> > > >>>> To be clear, I understand the problem that exists and that you wish
> to
> > > >>>> solve, but for the runtime parts I balk at the complexity cost of
> > > >>>> solving it.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> David
> > > >>>> -----
> > > >>>>
> > > >>>>> Thanks, Richard.
> > > >>>>>
> > > >>>>> [1] Deopt suspend was something like an async. handshake for
> > > >>>>>     architectures with register windows, where patching the return pc
> > > >>>>>     for deoptimization of a compiled frame was racy if the owner thread
> > > >>>>>     was in native code. Instead a "deopt" suspend flag was set on
> > > >>>>>     which the thread patched its own frame upon return from native.
> > > >>>>>     So no thread was suspended. It got its name only from the name of
> > > >>>>>     the flags.
> > > >>>>>
> > > >>>>> [2] Discussion about using handshakes to sync. with the target
> > thread:
> > > >>>>>
> > > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> > > >>>>>
> > > >>>>>
> > > >>>>> -----Original Message-----
> > > >>>>> From: David Holmes <david.holmes at oracle.com>
> > > >>>>> Sent: Freitag, 13. Dezember 2019 00:56
> > > >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >>>>> serviceability-dev at openjdk.java.net;
> > > >>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>
> > > >>>>> Hi Richard,
> > > >>>>>
> > > >>>>> Some further queries/concerns:
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>
> > > >>>>> Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>
> > > >>>>> !   _recursions = save      // restore the old recursion count
> > > >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >>>>>
> > > >>>>> what is the "deferred relock count"? I gather it relates to
> > > >>>>>
> > > >>>>> "The code was extended to be able to deoptimize objects of a
> > frame
> > > that
> > > >>>>> is not the top frame and to let another thread than the owning
> > thread
> > > do
> > > >>>>> it."
> > > >>>>>
> > > >>>>> which I don't like the sound of at all when it comes to
> ObjectMonitor
> > > >>>>> state. So I'd like to understand in detail exactly what is going on
> here
> > > >>>>> and why.  This is a very intrusive change that seems to badly break
> > > >>>>> encapsulation and impacts future changes to ObjectMonitor that
> > are
> > > under
> > > >>>>> investigation.
> > > >>>>>
> > > >>>>> ---
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/thread.cpp
> > > >>>>>
> > > >>>>> Can you please explain why
> > > JavaThread::wait_for_object_deoptimization
> > > >>>>> has to be handcrafted in this way rather than using proper
> > transitions.
> > > >>>>>
> > > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing
> to
> > > see
> > > >>>>> it being added back (effectively). This seems like it may be
> > something
> > > >>>>> that handshakes could be used for.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> David
> > > >>>>> -----
> > > >>>>>
> > > >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> > > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> > > >>>>>>> Hi David,
> > > >>>>>>>
> > > >>>>>>>       > Most of the details here are in areas I can comment on in detail, but I
> > > >>>>>>>       > did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> Thanks for taking the time!
> > > >>>>>>
> > > >>>>>> Apologies the above should read:
> > > >>>>>>
> > > >>>>>> "Most of the details here are in areas I *can't* comment on in
> > detail
> > > >>>>>> ..."
> > > >>>>>>
> > > >>>>>> David
> > > >>>>>>
> > > >>>>>>>       > The only thing that jumped out at me is that I think the
> > > >>>>>>>       > DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>       >
> > > >>>>>>>       > +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Yes, it should. Will add the method like above.
> > > >>>>>>>
> > > >>>>>>>       > Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>>       > Without active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> > > >>>>>>> workload. I will add a minimal test
> > > >>>>>>> to keep it fresh.
> > > >>>>>>>
> > > >>>>>>>       > Also on the tests I don't understand your @requires clause:
> > > >>>>>>>       >
> > > >>>>>>>       >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>>       > (vm.opt.TieredCompilation != true))
> > > >>>>>>>       >
> > > >>>>>>>       > This seems to require that TieredCompilation is disabled, but tiered is
> > > >>>>>>>       > our normal mode of operation. ??
> > > >>>>>>>       >
> > > >>>>>>>
> > > >>>>>>> I removed the clause. I guess I wanted to target the tests
> towards
> > > the
> > > >>>>>>> code they are supposed to
> > > >>>>>>> test, and it's easier to analyze failures w/o tiered compilation
> and
> > > >>>>>>> with just one compiler thread.
> > > >>>>>>>
> > > >>>>>>> Additionally I will make use of
> > > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the
> > tests.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Richard.
> > > >>>>>>>
> > > >>>>>>> -----Original Message-----
> > > >>>>>>> From: David Holmes <david.holmes at oracle.com>
> > > >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> > > >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >>>>>>> serviceability-dev at openjdk.java.net;
> > > >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>>>
> > > >>>>>>> Hi Richard,
> > > >>>>>>>
> > > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> I would like to get reviews please for
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> > > >>>>>>>>
> > > >>>>>>>> Corresponding RFE:
> > > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> > > >>>>>>>>
> > > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> > > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> > > >>>>>>>>
> > > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing
> > > without
> > > >>>>>>>> issues (thanks!). In addition the
> > > >>>>>>>> change is being tested at SAP since I posted the first RFR some
> > > >>>>>>>> months ago.
> > > >>>>>>>>
> > > >>>>>>>> The intention of this enhancement is to benefit performance
> > wise
> > > from
> > > >>>>>>>> escape analysis even if JVMTI
> > > >>>>>>>> agents request capabilities that allow them to access local
> > variable
> > > >>>>>>>> values. E.g. if you start-up
> > > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,
> > > then
> > > >>>>>>>> escape analysis is disabled right
> > > >>>>>>>> from the beginning, well before a debugger attaches -- if ever
> > one
> > > >>>>>>>> should do so. With the
> > > >>>>>>>> enhancement, escape analysis will remain enabled until and
> > after
> > > a
> > > >>>>>>>> debugger attaches. EA based
> > > >>>>>>>> optimizations are reverted just before an agent acquires the
> > > >>>>>>>> reference to an object. In the JBS item
> > > >>>>>>>> you'll find more details.
> > > >>>>>>>
> > > >>>>>>> Most of the details here are in areas I can comment on in detail,
> > but
> > > I
> > > >>>>>>> did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> The only thing that jumped out at me is that I think the
> > > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>
> > > >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>> Without
> > > >>>>>>> active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> Also on the tests I don't understand your @requires clause:
> > > >>>>>>>
> > > >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>> (vm.opt.TieredCompilation != true))
> > > >>>>>>>
> > > >>>>>>> This seems to require that TieredCompilation is disabled, but
> > tiered
> > > is
> > > >>>>>>> our normal mode of operation. ??
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> David
> > > >>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Richard.
> > > >>>>>>>>
> > > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> > > >>>>>>>>
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>

From aph at redhat.com  Mon Jul 13 08:36:58 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 13 Jul 2020 09:36:58 +0100
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
Message-ID: <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>

On 13/07/2020 06:48, David Holmes wrote:
> Hi Thomas,
>
> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>
>> Can a compiler reorder system calls and stores? How would it determine
>> if this is safe to do?

I very much doubt it.

> A compiler can reorder anything it likes if it can determine it is safe
> to do so. :)

I'm fairly sure the compiler doesn't care about that!

>> I'd be surprised if Microsoft loosened up reordering since this would
>> mean existing software cannot just be recompiled for arm and expected to
>> work. But this is just a guess of course.
>
> It's an interesting point because I would expect there to be a lot of
> software written for Windows that contains assumptions of TSO that would
> in fact fail when run on Aarch64. I don't know if there are any special
> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
> (or build flags), that would allow for ease of migration.

There's no standard hardware mechanism that would do so.

I've been very surprised at how little software has broken on AArch64
because of memory ordering. Like you, I initially assumed that stuff
would break all over the place, but by and large it was OK. I know of
two reasons: firstly, programmers are pretty conservative and tend to
use simple and reliable mechanisms such as safe publication and
mutexes for inter-thread communication. But also, and maybe more
importantly, the kinds of reordering the hardware can do are not very
different from those compilers do. Therefore, anyone playing fast and
loose with TSO has probably already been bitten by the compiler.
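
For readers unfamiliar with the term, here is a minimal sketch of the
"safe publication" idiom mentioned above. It is illustrative only: the
class, field, and method names are hypothetical and do not come from
HotSpot or from this thread. The volatile write and read form a
happens-before edge, so the reader thread sees a fully constructed
object regardless of whether the hardware is TSO or weakly ordered.

```java
public class SafePublication {
    // Hypothetical payload type, used only for this illustration.
    static final class Config {
        final int port;
        Config(int port) { this.port = port; }
    }

    // The volatile write/read pair orders construction before use.
    private static volatile Config current;

    // Publishes a Config from the calling thread and returns the port
    // value observed by a separate reader thread.
    static int publishAndRead(int port) throws InterruptedException {
        current = null;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Config c;
            while ((c = current) == null) {
                Thread.onSpinWait(); // wait until the reference is published
            }
            seen[0] = c.port;
        });
        reader.start();
        current = new Config(port); // safe publication via volatile store
        reader.join();              // join() makes seen[0] visible here
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishAndRead(8080));
    }
}
```

Without the volatile modifier (or a mutex around both accesses) the
spin loop above would have no ordering guarantee on any platform, which
is the kind of code the compiler tends to break before the hardware does.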

> But unless all Windows software will run in such a mode there is a
> need for MS to document what the memory consistency properties of
> various APIs are (as POSIX does [1]).

Indeed. I would have thought it existed somewhere.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From daniel.daugherty at oracle.com  Mon Jul 13 13:47:36 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 09:47:36 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
Message-ID: <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>

Hi David,

Thanks for the review!

I need a second review folks... any takers?

Dan


On 7/12/20 10:57 PM, David Holmes wrote:
> Hi Dan,
>
> This all looks good to me.
>
> Thanks,
> David
> -----
>
> On 8/07/2020 5:51 pm, David Holmes wrote:
>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>> Ping! Any takers??? Code deletion should be really appealing here!!
>>
>> Sorry Dan didn't get to it before vacation. But if you can wait till 
>> Monday ...
>>
>> Cheers,
>> David
>>
>>> Dan
>>>
>>>
>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> It's time to remove the AsyncDeflateIdleMonitors option from JDK16. 
>>>> We can
>>>> also get rid of the safepoint based deflation mechanism since 
>>>> turning off
>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way 
>>>> left to
>>>> use it.
>>>>
>>>> This is marked as an "S/M" review because the number of 
>>>> touched/deleted
>>>> lines makes it a Medium review, but the number of touched/changed 
>>>> lines
>>>> (outside of the deletions) makes it a Small review. It's actually a 
>>>> pretty
>>>> fast read... :-)
>>>>
>>>> Here's the bug ID:
>>>>
>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>>                 based deflation mechanism
>>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>
>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>> jdk-16+4
>>>> plus a dozen or so changesets.
>>>>
>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>> there are
>>>> no regressions (and very few known failures). My inflation stress 
>>>> testing
>>>> is still in process. I had to restart that testing after a 
>>>> thunderstorm
>>>> related power failure took down my servers in Florida. Sigh...
>>>>
>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>
>>>> Dan
>>>


From goetz.lindenmaier at sap.com  Mon Jul 13 14:48:32 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 13 Jul 2020 14:48:32 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <3aee0e49-4583-cc7f-838a-df0000fbae4c@oracle.com>
 <AM4PR0202MB296437161E8548698BB6A33EEC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
Message-ID: <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi David, 

> Your extended message is only computed when there is no original message.
Hmm. I would say the extended message is only computed when
the NPE was raised by the runtime. So far it happens to never have a
message in these cases.
But these are two views of the same thing.

> You're concerned about this scenario:
> 
> catch (NullPointerException npe) {
>    String msg1 = npe.getMessage(); // gets extended NPE message
>    npe.setStackTrace(...);
>    String msg2 = npe.getMessage(); // gets null
> }
> 
> While I find it hard to imagine anyone doing this
Well, all the scenarios are quite artificial:
 - why would you call fillInStackTrace on an exception thrown by the VM?
 - why would you call setStackTrace at all?
> you can easily have
> specified that the extended message is only available with the original
> stacktrace, hence after a second call to fillInStackTrace, or a call to
> setStackTrace, then the message reverts to being empty.
The message is not meant to be a special thing that behaves differently
from other messages, e.g. sometimes being available and sometimes not.
It ended up being different because of requirements raised during the
review.

> To me that makes
> far more sense than having msg2 continue to report the extended info for
> the original stacktrace when it now has a new stacktrace.
> 
> I'm really not seeing why calling fillInStackTrace() a second time
> should be treated any differently to calling setStackTrace(). They
> should be handled consistently IMO.
But then you treat setStackTrace() on an NPE differently from setStackTrace()
on other exceptions.
The reason to treat fillInStackTrace differently is that we lose the information
needed to compute the message. This is not the case with setStackTrace().

A different solution, the one I would have proposed if I had not 
considered previous comments from reviews,  would be to just 
compute the message in the runtime in the call of fillInStackTrace 
before the old stack trace is lost and assign it to the message field.  
This way it would behave similar to all other exceptions. The message 
would just be there ... just that it's computed lazily.
The cost of the algorithm wouldn't harm that much as other costly
algorithms (walking the stack) are performed at this point, too.

> We are not talking about all exceptions only about your NPE extended
> error message.
Hmm, the inconsistency caused by the code you posted above 
holds for all exceptions.  If you fiddle with the stack trace, 
the message might become pointless.  Wrt. setStackTrace
they all behave the same.
Wrt. fillInStackTrace the message will be wrong. Only this 
needs to be fixed.
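
To illustrate the setStackTrace point for ordinary exceptions, here is
a small sketch. The class name and the fake stack frame are made up for
this example. It only shows the general Throwable behavior that an
explicitly supplied message is stored separately from the stack trace,
so replacing the trace leaves the message untouched; it does not
exercise the helpful NPE message code under review.

```java
public class StackTraceVsMessage {
    // Returns the message of an exception after its stack trace has been
    // replaced with an unrelated, hand-made frame.
    static String messageAfterSetStackTrace() {
        IllegalStateException e = new IllegalStateException("original message");
        e.setStackTrace(new StackTraceElement[] {
            // Hypothetical frame; nothing here refers to real code.
            new StackTraceElement("some.other.Clazz", "method", "Clazz.java", 42)
        });
        return e.getMessage();
    }

    public static void main(String[] args) {
        // The message still describes the original failure, although the
        // stack trace now points somewhere else.
        System.out.println(messageAfterSetStackTrace());
    }
}
```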

Best regards,
  Goetz.


> 
> David
> -----
> 
> > I implemented an example where wrong stack traces are
> > printed with LinkageError and NPE, modifying a jtreg test:
> > http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
> > See also the generated output added to a comment in the patch.
> > If the NPE message text was missing in the second printout, I think
> > this really would be unexpected.
> > Please note that the correct message is printed after messing
> > with the stack trace, it's the stack trace that is wrong.
> > (Not as with the problem I am fixing here where a wrong
> > message is printed.)
> >
> > Best regards,
> >    Goetz.
> >
> >
> >
> >>
> >>> I guess the normal usecase of setStackTrace is the other way around:
> >>> Change the message and throw a new exception with the existing
> >>> stack trace:
> >>>
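> >>> try {
> >>>     a.x;
> >>> } catch (NullPointerException e) {
> >>>     NullPointerException npe = new NullPointerException("My own error message");
> >>>     npe.setStackTrace(e.getStackTrace()); // setStackTrace returns void, so it cannot be chained
> >>>     throw npe;
> >>> }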
> >>>
> >>> And not taking an arbitrary stack trace and put it into an exception
> >>> with existing message.
> >>
> >> Interesting usage.
> >>
> >> Cheers,
> >> David
> >> -----
> >>
> >>> Best regards,
> >>>     Goetz.
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: David Holmes <david.holmes at oracle.com>
> >>>> Sent: Friday, July 3, 2020 9:30 AM
> >>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-
> >> mlv.fr'
> >>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> message
> >>>> after calling fillInStackTrace
> >>>>
> >>>> Hi Goetz,
> >>>>
> >>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> >>>>> Hi,
> >>>>>
> >>>>>> True. To ensure you process the original backtrace only you need to
> >> add
> >>>>>> synchronization in getMessage():
> >>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
> >>>>>
> >>>>> I added the volatile, too, but as I understand the synchronized
> >>>>> block brings sufficient memory barriers that this also works
> >>>>> without.
> >>>>
> >>>> No "volatile" needed, or wanted, when all access is within synchronized
> >>>> regions.
> >>>>
> >>>>>> To be honest the idea that someone would share an exception
> instance
> >>>> and
> >>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
> >>>>>> information about it just seems highly unrealistic.
> >>>>> Yes, contention here is quite unlikely, so it should not harm performance.
> >>>>
> >>>> Contention was not my concern at all. :)
> >>>>
> >>>>>> Though after looking at comments in the test I would also
> >>>>>> suggest that setStackTrace be updated:
> >>>>> The test shows that after setStackTrace still the correct message
> >>>>> is computed. This is because the algorithm uses Throwable::backtrace
> >>>>> and not Throwable::stacktrace.  Throwable::backtrace is not
> >>>>> affected by setStackTrace.
> >>>>> The behavior is just as with any exception. If you fiddle
> >>>>> with the stack trace, but don't adapt the message text,
> >>>>> the message might refer to other code than the stack trace
> >>>>> points to.
> >>>>
> >>>> But you can't adapt the message text - there is no setMessage! If the
> >>>> message is NULL and you call setStackTrace() then getMessage(), it
> makes
> >>>> no sense to return the extended error message that was associated
> with
> >>>> the original stack/backtrace.
> >>>>
> >>>> Cheers,
> >>>> David
> >>>>
> >>>>> Best regards,
> >>>>>      Goetz.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>> Sent: Friday, July 3, 2020 3:37 AM
> >>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> 'forax at univ-
> >>>> mlv.fr'
> >>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-
> dev
> >>>>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >> message
> >>>>>> after calling fillInStackTrace
> >>>>>>
> >>>>>> Hi Goetz,
> >>>>>>
> >>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> >>>>>>> Hi Remi,
> >>>>>>>
> >>>>>>> But how does volatile help?
> >>>>>>> I see that the test for numStackTracesFilledIn == 1 then always
> >>>>>>> gets the right value.
> >>>>>>> But the backtrace may not be changed until I read it in
> >>>>>>> getExtendedNPEMessage.  The other thread could change it after
> >>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
> >>>>>>
> >>>>>> True. To ensure you process the original backtrace only you need to
> >> add
> >>>>>> synchronization in getMessage():
> >>>>>>
> >>>>>>           public String getMessage() {
> >>>>>>               String message = super.getMessage();
> >>>>>>               // If the stack trace was changed the extended NPE algorithm
> >>>>>>               // will compute a wrong message.
> >>>>>> +         synchronized(this) {
> >>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
> >>>>>> !                 return getExtendedNPEMessage();
> >>>>>> !             }
> >>>>>> +         }
> >>>>>>               return message;
> >>>>>>           }
> >>>>>>
> >>>>>> To be honest the idea that someone would share an exception
> instance
> >>>> and
> >>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
> >>>>>> information about it just seems highly unrealistic. But the above fixes
> >>>>>> it simply. Though after looking at comments in the test I would also
> >>>>>> suggest that setStackTrace be updated:
> >>>>>>
> >>>>>>            synchronized (this) {
> >>>>>>                 if (this.stackTrace == null && // Immutable stack
> >>>>>>                     backtrace == null) // Test for out of protocol state
> >>>>>>                     return;
> >>>>>> +           numStackTracesFilledIn++;
> >>>>>>                 this.stackTrace = defensiveCopy;
> >>>>>>             }
> >>>>>>         }
> >>>>>>
> >>>>>> as that would seem to be another hole in the mechanism.
> >>>>>>
> >>>>>>> I want to vote again for the much more simple version
> >>>>>>> proposed in webrev 02:
> >>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
> >> NPE_fillInStackTrace-
> >>>>>> jdk15/02/
> >>>>>>
> >>>>>> I much prefer the latest version that recognises that only the original
> >>>>>> stack can be processed.
> >>>>>>
> >>>>>> In the test:
> >>>>>>
> >>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
> >>>>>>
> >>>>>> Two typos: crated  & implicilty
> >>>>>>
> >>>>>> Thanks,
> >>>>>> David
> >>>>>> -----
> >>>>>>
> >>>>>>
> >>>>>>> Its only drawback is that for this code:
> >>>>>>>       ex = null;
> >>>>>>>       ex.fillInStackTrace()
> >>>>>>> no message is created.
> >>>>>>>
> >>>>>>> I think this really is acceptable.
> >>>>>>>
> >>>>>>>
> >>>>>>> Remi, I didn't comment on this statement from a previous mail:
> >>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
> >> point.
> >>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is
> filled
> >>>> you
> >>>>>> don't
> >>>>>>>> compute any helpful message anyway.
> >>>>>>> The internal structure is no longer deleted when the stack trace
> >>>>>>> is filled. So the message can be computed later, too.
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>>       Goetz.
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> >>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
> >>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph
> >> Dreis
> >>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
> >> runtime-
> >>>>>>>> dev at openjdk.java.net>; David Holmes
> <david.holmes at oracle.com>
> >>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>> message
> >>>>>>>> after calling fillInStackTrace
> >>>>>>>>
> >>>>>>>> yes,
> >>>>>>>> it's what i was saying,
> >>>>>>>> given that a NPE can be thrown very early, before VarHandle is
> >>>> initialized,
> >>>>>> i
> >>>>>>>> believe that declaring numStackTracesFilledIn volatile is the best
> way
> >> to
> >>>>>>>> tackle that.
> >>>>>>>>
> >>>>>>>> Rémi
> >>>>>>>>
> >>>>>>>> ----- Original Message -----
> >>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> >>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
> "Christoph
> >>>>>> Dreis"
> >>>>>>>> <christoph.dreis at freenet.de>
> >>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-
> >> dev at openjdk.java.net>,
> >>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
> >>>>>>>>> <forax at univ-mlv.fr>
> >>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
> >>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >> message
> >>>>>>>> after calling fillInStackTrace
> >>>>>>>>
> >>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> >>>>>>>>>> Hi Christoph,
> >>>>>>>>>>
> >>>>>>>>>> I fixed the comment, thanks for pointing that out.
> >>>>>>>>>>
> >>>>>>>>> One other thing is that NPE::getMessage reads
> >> numStackTracesFilledIn
> >>>>>>>>> without synchronization.
> >>>>>>>>>
> >>>>>>>>> -Alan

From patricio.chilano.mateo at oracle.com  Mon Jul 13 16:05:09 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Mon, 13 Jul 2020 13:05:09 -0300
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
Message-ID: <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>

Hi Dan,

Changes look good to me!

In synchronizer.cpp we have this comment about ObjectMonitor lifecycle:

// Inflation unlinks monitors from om_list_globals._free_list or a per-thread
// free list and associates them with objects. Deflation -- which occurs at
// STW-time or asynchronously -- disassociates idle monitors from objects.
// Such scavenged monitors are returned to the om_list_globals._free_list.

With all the older code removed, are there still cases where we do 
deflations at safepoint?

Thanks!
Patricio
On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
> Hi David,
>
> Thanks for the review!
>
> I need a second review folks... any takers?
>
> Dan
>
>
> On 7/12/20 10:57 PM, David Holmes wrote:
>> Hi Dan,
>>
>> This all looks good to me.
>>
>> Thanks,
>> David
>> -----
>>
>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>> Ping! Any takers??? Code deletion should be really appealing here!!
>>>
>>> Sorry Dan didn't get to it before vacation. But if you can wait till 
>>> Monday ...
>>>
>>> Cheers,
>>> David
>>>
>>>> Dan
>>>>
>>>>
>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>> JDK16. We can
>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>> turning off
>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way 
>>>>> left to
>>>>> use it.
>>>>>
>>>>> This is marked as an "S/M" review because the number of 
>>>>> touched/deleted
>>>>> lines makes it a Medium review, but the number of touched/changed 
>>>>> lines
>>>>> (outside of the deletions) makes it a Small review. It's actually 
>>>>> a pretty
>>>>> fast read... :-)
>>>>>
>>>>> Here's the bug ID:
>>>>>
>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>>>                 based deflation mechanism
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>
>>>>> Here's the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>
>>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>>> jdk-16+4
>>>>> plus a dozen or so changesets.
>>>>>
>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>>> there are
>>>>> no regressions (and very few known failures). My inflation stress 
>>>>> testing
>>>>> is still in process. I had to restart that testing after a 
>>>>> thunderstorm
>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>
>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>
>>>>> Dan
>>>>
>


From daniel.daugherty at oracle.com  Mon Jul 13 16:08:37 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 12:08:37 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
Message-ID: <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>

On 7/13/20 12:05 PM, Patricio Chilano wrote:
> Hi Dan,
>
> Changes look good to me!

Thanks!


> In synchronizer.cpp we have this comment about ObjectMonitor lifecycle:
>
> // Inflation unlinks monitors from om_list_globals._free_list or a 
> per-thread
> // free list and associates them with objects. Deflation -- which 
> occurs at
> // STW-time or asynchronously -- disassociates idle monitors from 
> objects.
> // Such scavenged monitors are returned to the 
> om_list_globals._free_list.
>
> With all the older code removed, are there still cases where we do 
> deflations at safepoint?

Good catch! I need to adjust that comment. I'll look for others also.

Dan


>
> Thanks!
> Patricio
> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>> Hi David,
>>
>> Thanks for the review!
>>
>> I need a second review folks... any takers?
>>
>> Dan
>>
>>
>> On 7/12/20 10:57 PM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> This all looks good to me.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>> Ping! Any takers??? Code deletion should be really appealing here!!
>>>>
>>>> Sorry Dan didn't get to it before vacation. But if you can wait 
>>>> till Monday ...
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>> JDK16. We can
>>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>>> turning off
>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way 
>>>>>> left to
>>>>>> use it.
>>>>>>
>>>>>> This is marked as an "S/M" review because the number of 
>>>>>> touched/deleted
>>>>>> lines makes it a Medium review, but the number of touched/changed 
>>>>>> lines
>>>>>> (outside of the deletions) makes it a Small review. It's actually 
>>>>>> a pretty
>>>>>> fast read... :-)
>>>>>>
>>>>>> Here's the bug ID:
>>>>>>
>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>>>>                 based deflation mechanism
>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>
>>>>>> Here's the webrev URL:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>
>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>>>> jdk-16+4
>>>>>> plus a dozen or so changesets.
>>>>>>
>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>>>> there are
>>>>>> no regressions (and very few known failures). My inflation stress 
>>>>>> testing
>>>>>> is still in process. I had to restart that testing after a 
>>>>>> thunderstorm
>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>
>>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>>
>>>>>> Dan
>>>>>
>>
>


From igor.ignatyev at oracle.com  Mon Jul 13 17:16:43 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 13 Jul 2020 10:16:43 -0700
Subject: RFR(S) [15] : 8249032 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_sysdict tests
Message-ID: <CF6D1A88-7BDA-42E2-A478-F321EBC3A176@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
> 20 lines changed: 0 ins; 20 del; 0 mod; 

Hi all,

Could you please review the patch which removes the `FileInstaller . .` jtreg action from the :vmTestbase_nsk_sysdict tests?
From the main issue (8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. Some tests depend on this step, so we need to first identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.

None of the sysdict tests need FileInstaller, so the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/sysdict | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.

testing: :vmTestbase_nsk_sysdict on linux-x64
webrev: http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8249032

Thanks,
-- Igor

From coleen.phillimore at oracle.com  Mon Jul 13 17:21:02 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 13 Jul 2020 13:21:02 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
Message-ID: <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>


http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html
http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html

+#ifdef ASSERT
    void* prev = Atomic::load(&_owner);
- ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
+#endif
+ assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));

Just a nit but these patterns look really strange. Can you put the 
#endif on the other side of the assert?

http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html

if (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) {
  ...
+ ObjectSynchronizer::do_safepoint_work();


Why do we still have this to trigger the ServiceThread when we're going 
to only wait GuaranteedSafepointInterval before checking for monitor 
deflation anyway? Why have this in a safepoint cleanup task?

The code deletion is really nice.

Thanks,
Coleen


On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>> Hi Dan,
>>
>> Changes look good to me!
>
> Thanks!
>
>
>> In synchronizer.cpp we have this comment about ObjectMonitor lifecycle:
>>
>> // Inflation unlinks monitors from om_list_globals._free_list or a 
>> per-thread
>> // free list and associates them with objects. Deflation -- which 
>> occurs at
>> // STW-time or asynchronously -- disassociates idle monitors from 
>> objects.
>> // Such scavenged monitors are returned to the 
>> om_list_globals._free_list.
>>
>> With all the older code removed, are there still cases where we do 
>> deflations at safepoint?
>
> Good catch! I need to adjust that comment. I'll look for others also.
>
> Dan
>
>
>>
>> Thanks!
>> Patricio
>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>> Hi David,
>>>
>>> Thanks for the review!
>>>
>>> I need a second review folks... any takers?
>>>
>>> Dan
>>>
>>>
>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> This all looks good to me.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>> Ping! Any takers??? Code deletion should be really appealing here!!
>>>>>
>>>>> Sorry Dan didn't get to it before vacation. But if you can wait 
>>>>> till Monday ...
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>> JDK16. We can
>>>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>>>> turning off
>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only way 
>>>>>>> left to
>>>>>>> use it.
>>>>>>>
>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>> touched/deleted
>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>> touched/changed lines
>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>> actually a pretty
>>>>>>> fast read... :-)
>>>>>>>
>>>>>>> Here's the bug ID:
>>>>>>>
>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>>>>>                 based deflation mechanism
>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>
>>>>>>> Here's the webrev URL:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>
>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>>>>> jdk-16+4
>>>>>>> plus a dozen or so changesets.
>>>>>>>
>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>>>>> there are
>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>> stress testing
>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>> thunderstorm
>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>
>>>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>>>
>>>>>>> Dan
>>>>>>
>>>
>>
>


From igor.ignatyev at oracle.com  Mon Jul 13 17:32:19 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 13 Jul 2020 10:32:19 -0700
Subject: RFR [15] : 8249033 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_metaspace tests
Message-ID: <BAEA719A-AE50-4D41-9202-AB40EA2370A7@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00/
> 47 lines changed: 0 ins; 32 del; 15 mod; 

Hi all,

Could you please review the patch which removes the `FileInstaller . .` jtreg action from the :vmTestbase_vm_metaspace tests?
From the main issue (8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. Some tests depend on this step, so we need to first identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.

As none of these tests need FileInstaller, the patch is as simple as `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/metaspace/ | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.

testing: :vmTestbase_vm_metaspace on linux-x64
webrev: http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8249033

Thanks,
-- Igor

From daniel.daugherty at oracle.com  Mon Jul 13 18:49:54 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 14:49:54 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
Message-ID: <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>

Hi Coleen,

Thanks for jumping in on this code review!


On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>
>
> +#ifdef ASSERT
>    void* prev = Atomic::load(&_owner);
> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
> +#endif
> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));
>
> Just a nit but these patterns look really strange. Can you put the 
> #endif on the other side of the assert?

I can do that. In previous reviews of other fixes, I think I've had assert()
calls inside #ifdef ASSERT ... #endif blocks and have been asked to move
them out since it's redundant.

I'll plan to make that change, but I'll keep an eye open for complaints
from other reviewers before I push...
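
To make the layout under discussion concrete, here is a minimal,
self-contained sketch. It is not the HotSpot code: Monitor and its owner
field are hypothetical stand-ins, std::atomic replaces Atomic::load(), and
the standard NDEBUG / assert() pair stands in for HotSpot's ASSERT macro
and its assert(). It shows the requested variant, with the debug-only load
and the assert kept together inside one conditional block:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a monitor; only the owner field matters here.
struct Monitor {
  std::atomic<void*> owner{nullptr};

  // Replaces the owner. In debug builds it first checks that the previous
  // owner is the one the caller expects.
  void set_owner_from(void* old_value, void* new_value) {
    (void)old_value;  // avoids an unused-parameter warning in release builds
#ifndef NDEBUG  // assumption: plays the role of HotSpot's "#ifdef ASSERT"
    void* prev = owner.load();
    assert(prev == old_value && "unexpected prev owner");
#endif
    owner.store(new_value);
  }
};
```

Placing the #endif before the assert instead should behave the same,
because the assert expands to nothing in builds where the guarded load is
compiled out, so the choice is mostly about how the code reads.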


> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>
>
> if (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) {
>   ...
> + ObjectSynchronizer::do_safepoint_work();
>
>
> Why do we still have this to trigger the ServiceThread when we're 
> going to only wait GuaranteedSafepointInterval before checking for 
> monitor deflation anyway?

When a cleanup safepoint triggers an async deflation request, it
counts as a direct async deflation request so it is not subject
to the AsyncDeflationInterval limit. In other words, when a cleanup
safepoint asks, we honor the request regardless of when the most
recent async deflation request was finished.

When we do a periodic wake-up and check to see if an async deflation
is needed, we'll only honor that request if it has been longer than
AsyncDeflationInterval since we last completed an async deflation
cycle. So the periodic invocation mechanism has a limit so that we
don't swamp the ServiceThread.
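
The difference between the two request paths can be sketched roughly as
follows. This is a simplified model under stated assumptions, not the code
in synchronizer.cpp: DeflationPolicy and its members are hypothetical
names, times are plain millisecond counts, and interval_ms stands in for
AsyncDeflationInterval.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two ways an async deflation can be requested.
struct DeflationPolicy {
  int64_t interval_ms;        // stands in for AsyncDeflationInterval
  int64_t last_completed_ms;  // when the last async deflation cycle finished

  // A request raised by a cleanup safepoint counts as a direct request and
  // is honored regardless of how recently the last cycle finished.
  bool honor_direct_request() const { return true; }

  // A periodic wake-up is honored only when deflation is needed and more
  // than interval_ms has passed since the last completed cycle.
  bool honor_periodic_check(int64_t now_ms, bool deflation_needed) const {
    assert(now_ms >= last_completed_ms && "time must not run backwards");
    return deflation_needed && (now_ms - last_completed_ms) > interval_ms;
  }
};
```

In this model only the periodic path is rate limited, which matches the
intent of keeping the periodic mechanism from swamping the ServiceThread.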


> Why have this in a safepoint cleanup task?

I figured I would leave it because there are plans to get rid of the
safepoint cleanup mechanism and whoever does that work can deal with
any comments in the code that talk about cleanup safepoints and the
like. That person will also have to evaluate and deal with whether
the safepoint cleanup task removal causes unacceptable issues in the
system.


> The code deletion is really nice.

Thanks!

Dan


>
> Thanks,
> Coleen
>
>
> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>> Hi Dan,
>>>
>>> Changes look good to me!
>>
>> Thanks!
>>
>>
>>> In synchronizer.cpp we have this comment about ObjectMonitor lifecycle:
>>>
>>> // Inflation unlinks monitors from om_list_globals._free_list or a 
>>> per-thread
>>> // free list and associates them with objects. Deflation -- which 
>>> occurs at
>>> // STW-time or asynchronously -- disassociates idle monitors from 
>>> objects.
>>> // Such scavenged monitors are returned to the 
>>> om_list_globals._free_list.
>>>
>>> With all the older code removed, are there still cases where we do 
>>> deflations at safepoint?
>>
>> Good catch! I need to adjust that comment. I'll look for others also.
>>
>> Dan
>>
>>
>>>
>>> Thanks!
>>> Patricio
>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>> Hi David,
>>>>
>>>> Thanks for the review!
>>>>
>>>> I need a second review folks... any takers?
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>> Hi Dan,
>>>>>
>>>>> This all looks good to me.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>> Ping! Any takers??? Code deletion should be really appealing here!!
>>>>>>
>>>>>> Sorry Dan didn't get to it before vacation. But if you can wait 
>>>>>> till Monday ...
>>>>>>
>>>>>> Cheers,
>>>>>> David
>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>>> JDK16. We can
>>>>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>>>>> turning off
>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only 
>>>>>>>> way left to
>>>>>>>> use it.
>>>>>>>>
>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>> touched/deleted
>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>> touched/changed lines
>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>> actually a pretty
>>>>>>>> fast read... :-)
>>>>>>>>
>>>>>>>> Here's the bug ID:
>>>>>>>>
>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the safepoint
>>>>>>>>                 based deflation mechanism
>>>>>>>>     https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>
>>>>>>>> Here's the webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>
>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>>>>>> jdk-16+4
>>>>>>>> plus a dozen or so changesets.
>>>>>>>>
>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>>>>>> there are
>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>> stress testing
>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>> thunderstorm
>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>
>>>>
>>>
>>
>


From coleen.phillimore at oracle.com  Mon Jul 13 19:23:47 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 13 Jul 2020 15:23:47 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
Message-ID: <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>


On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
> Hi Coleen,
>
> Thanks for jumping in on this code review!
>
>
> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>
>>
>> +#ifdef ASSERT
>>    void* prev = Atomic::load(&_owner);
>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>> +#endif
>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));
>>
>> Just a nit but these patterns look really strange. Can you put the 
>> #endif on the other side of the assert?
>
> I can do that. In previous reviews of other fixes, I think I've had 
> assert()
> calls inside #ifdef ASSERT ... #endif blocks and have been asked to move
> them out since it's redundant.
>
> I'll plan to make that change, but I'll keep an eye open for complaints
> from other reviewers before I push...
>
>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>
>>
>> if (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) {
>>   ...
>> + ObjectSynchronizer::do_safepoint_work();
>>
>>
>> Why do we still have this to trigger the ServiceThread when we're 
>> going to only wait GuaranteedSafepointInterval before checking for 
>> monitor deflation anyway?
>
> When a cleanup safepoint triggers an async deflation request, it
> counts as a direct async deflation request so it is not subject
> to the AsyncDeflationInterval limit. In other words, when a cleanup
> safepoint asks, we honor the request regardless of when the most
> recent async deflation request was finished.
>
> When we do a periodic wake-up and check to see if an async deflation
> is needed, we'll only honor that request if it has been longer than
> AsyncDeflationInterval since we last completed an async deflation
> cycle. So the periodic invocation mechanism has a limit so that we
> don't swamp the ServiceThread.

Ok, I see it now, thanks! If we had a lot of safepoints, we would swamp 
the service thread also, but we don't have a lot of safepoints anymore.

>
>
>> Why have this in a safepoint cleanup task?
>
> I figured I would leave it because there are plans to get rid of the
> safepoint cleanup mechanism and whoever does that work can deal with
> any comments in the code that talk about cleanup safepoints and the
> like. That person will also have to evaluate and deal with whether
> the safepoint cleanup task removal causes unacceptable issues in the
> system.
>

I think we're still going to need safepoint cleanup for the other 
things, but ok, someone can evaluate this later. There's also this RFE, 
which I linked to this just now.

https://bugs.openjdk.java.net/browse/JDK-8227060

Thanks,
Coleen
>
>> The code deletion is really nice.
>
> Thanks!
>
> Dan
>
>
>>
>> Thanks,
>> Coleen
>>
>>
>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>> Hi Dan,
>>>>
>>>> Changes look good to me!
>>>
>>> Thanks!
>>>
>>>
>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>> lifecycle:
>>>>
>>>> // Inflation unlinks monitors from om_list_globals._free_list or a 
>>>> per-thread
>>>> // free list and associates them with objects. Deflation -- which 
>>>> occurs at
>>>> // STW-time or asynchronously -- disassociates idle monitors from 
>>>> objects.
>>>> // Such scavenged monitors are returned to the 
>>>> om_list_globals._free_list.
>>>>
>>>> With all the older code removed, are there still cases where we do 
>>>> deflations at safepoint?
>>>
>>> Good catch! I need to adjust that comment. I'll look for others also.
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Thanks!
>>>> Patricio
>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>> Hi David,
>>>>>
>>>>> Thanks for the review!
>>>>>
>>>>> I need a second review folks... any takers?
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> This all looks good to me.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>> here!!
>>>>>>>
>>>>>>> Sorry Dan didn't get to it before vacation. But if you can wait 
>>>>>>> till Monday ...
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>>>> JDK16. We can
>>>>>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>>>>>> turning off
>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only 
>>>>>>>>> way left to
>>>>>>>>> use it.
>>>>>>>>>
>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>> touched/deleted
>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>> touched/changed lines
>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>> actually a pretty
>>>>>>>>> fast read... :-)
>>>>>>>>>
>>>>>>>>> Here's the bug ID:
>>>>>>>>>
>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the
>>>>>>>>>     safepoint based deflation mechanism
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>
>>>>>>>>> Here's the webrev URL:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>
>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which is 
>>>>>>>>> jdk-16+4
>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>
>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 and 
>>>>>>>>> there are
>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>> stress testing
>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>> thunderstorm
>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>
>>>>>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>
>>>>>
>>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Mon Jul 13 19:43:13 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 15:43:13 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
 <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
Message-ID: <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>

Greetings,

I've made minor tweaks based on Patricio's and Coleen's CR0 reviews.
Of course, while I was looking for safepoint related comments that
needed tweaking, I ran across a "perm gen" comment that needed to
be fixed and a misspelling of NoSafepointVerifier. I fixed both...

Fixing the "#ifdef ASSERT ... #endif" blocking touched a few files
so I've gone ahead and generated new webrevs.

Full webrev:

http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.full/

Incremental webrev:

http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.inc/


May I please have at least one re-review?

Thanks, in advance, for any comments, questions or suggestions.

Dan


On 7/13/20 3:23 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
>> Hi Coleen,
>>
>> Thanks for jumping in on this code review!
>>
>>
>> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>>
>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>
>>>
>>> +#ifdef ASSERT
>>>    void* prev = Atomic::load(&_owner);
>>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>> +#endif
>>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));
>>>
>>> Just a nit but these patterns look really strange. Can you put the
>>> #endif on the other side of the assert?
>>
>> I can do that. In previous reviews of other fixes, I think I've had 
>> assert()
>> calls inside #ifdef ASSERT ... #endif blocks and have been asked to move
>> them out since it's redundant.
>>
>> I'll plan to make that change, but I'll keep an eye open for complaints
>> from other reviewers before I push...
>>
>>
>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>>
>>>
>>> if 
>>> (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) 
>>> { ... + ObjectSynchronizer::do_safepoint_work();
>>>
>>>
>>> Why do we still have this to trigger the ServiceThread when we're 
>>> going to only wait GuaranteedSafepointInterval before checking for 
>>> monitor deflation anyway?
>>
>> When a cleanup safepoint triggers an async deflation request, it
>> counts as a direct async deflation request so it is not subject
>> to the AsyncDeflationInterval limit. In other words, when a cleanup
>> safepoint asks, we honor the request regardless of when the most
>> recent async deflation request was finished.
>>
>> When we do a periodic wake-up and check to see if an async deflation
>> is needed, we'll only honor that request if it has been longer than
>> AsyncDeflationInterval since we last completed an async deflation
>> cycle. So the periodic invocation mechanism has a limit so that we
>> don't swamp the ServiceThread.
>
> Ok, I see it now, thanks! If we had a lot of safepoints, we would
> swamp the service thread also, but we don't have a lot of safepoints 
> anymore.
>
>>
>>
>>> Why have this in a safepoint cleanup task?
>>
>> I figured I would leave it because there are plans to get rid of the
>> safepoint cleanup mechanism and whoever does that work can deal with
>> any comments in the code that talk about cleanup safepoints and the
>> like. That person will also have to evaluate and deal with whether
>> the safepoint cleanup task removal causes unacceptable issues in the
>> system.
>>
>
> I think we're still going to need safepoint cleanup for the other 
> things, but ok, someone can evaluate this later. There's also this
> RFE, which I linked to this just now.
>
> https://bugs.openjdk.java.net/browse/JDK-8227060
>
> Thanks,
> Coleen
>>
>>> The code deletion is really nice.
>>
>> Thanks!
>>
>> Dan
>>
>>
>>>
>>> Thanks,
>>> Coleen
>>>
>>>
>>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>>> Hi Dan,
>>>>>
>>>>> Changes look good to me!
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>>> lifecycle:
>>>>>
>>>>> // Inflation unlinks monitors from om_list_globals._free_list or a 
>>>>> per-thread
>>>>> // free list and associates them with objects. Deflation -- which 
>>>>> occurs at
>>>>> // STW-time or asynchronously -- disassociates idle monitors from 
>>>>> objects.
>>>>> // Such scavenged monitors are returned to the 
>>>>> om_list_globals._free_list.
>>>>>
>>>>> With all the older code removed, are there still cases where we do 
>>>>> deflations at safepoint?
>>>>
>>>> Good catch! I need to adjust that comment. I'll look for others also.
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks!
>>>>> Patricio
>>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thanks for the review!
>>>>>>
>>>>>> I need a second review folks... any takers?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> This all looks good to me.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>>> here!!
>>>>>>>>
>>>>>>>> Sorry Dan didn't get to it before vacation. But if you can wait 
>>>>>>>> till Monday ...
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>>> Greetings,
>>>>>>>>>>
>>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>>>>> JDK16. We can
>>>>>>>>>> also get rid of the safepoint based deflation mechanism since 
>>>>>>>>>> turning off
>>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only 
>>>>>>>>>> way left to
>>>>>>>>>> use it.
>>>>>>>>>>
>>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>>> touched/deleted
>>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>>> touched/changed lines
>>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>>> actually a pretty
>>>>>>>>>> fast read... :-)
>>>>>>>>>>
>>>>>>>>>> Here's the bug ID:
>>>>>>>>>>
>>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the
>>>>>>>>>>     safepoint based deflation mechanism
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>>
>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>>
>>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which 
>>>>>>>>>> is jdk-16+4
>>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>>
>>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 
>>>>>>>>>> and there are
>>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>>> stress testing
>>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>>> thunderstorm
>>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>>
>>>>>>>>>> Thanks, in advance, for any comments, questions, or suggestions.
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From mandy.chung at oracle.com  Mon Jul 13 19:53:32 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 13 Jul 2020 12:53:32 -0700
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <c0e52106-bb54-fc52-08b3-652f1c9ac6ae@oracle.com>

Hi Goetz,

I dug up some history of setStackTrace.

> Well, all the scenarios are quite artificial:
>   - why would you call fillInStackTrace on an exception thrown by the VM?
>   - why would you call setStackTrace at all?

FYI on the history of `setStackTrace`: this API is designed for the
client side to fill in the stack trace, for example by concatenating the
server-side stack trace with the appropriate client-side stack trace (see
JDK-4010355).
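
The client/server use case mentioned above can be sketched roughly as
follows. This is an illustrative example only: the class
StackTraceSplice, its methods and the fake server.Service frame are
invented here and are not taken from JDK-4010355. It relies only on the
public Throwable.getStackTrace() and Throwable.setStackTrace() methods.

```java
import java.util.Arrays;

// Hypothetical helper: prepends frames reported by a remote peer to the
// local frames of an exception created on the client side.
final class StackTraceSplice {

    // Returns a new array holding the remote frames followed by the local ones.
    static StackTraceElement[] splice(StackTraceElement[] remote,
                                      StackTraceElement[] local) {
        StackTraceElement[] combined =
                Arrays.copyOf(remote, remote.length + local.length);
        System.arraycopy(local, 0, combined, remote.length, local.length);
        return combined;
    }

    // Creates a client-side exception whose trace starts with the remote frames.
    static RuntimeException withRemoteFrames(String message,
                                             StackTraceElement[] remoteFrames) {
        RuntimeException e = new RuntimeException(message); // local trace captured here
        e.setStackTrace(splice(remoteFrames, e.getStackTrace()));
        return e;
    }

    public static void main(String[] args) {
        StackTraceElement[] remote = {
            new StackTraceElement("server.Service", "handle", "Service.java", 42)
        };
        withRemoteFrames("remote call failed", remote).printStackTrace();
    }
}
```

After the call to setStackTrace, the trace no longer describes where
the exception object was created. That is why a message derived from
the original throw site could be misleading once the trace has been
replaced.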

As described in JEP 358, NPEs that are explicitly created and/or
explicitly thrown by programs running on the JVM are not subject to the
bytecode analysis. So it's reasonable to say that an NPE whose stack
trace is explicitly replaced via `fillInStackTrace` or `setStackTrace`
is likewise not subject to the bytecode analysis (rather than treating
only `fillInStackTrace` this way).

>> To me that makes
>> far more sense than having msg2 continue to report the extended info for
>> the original stacktrace when it now has a new stacktrace.
>>
>> I'm really not seeing why calling fillInStackTrace() a second time
>> should be treated any differently to calling setStackTrace(). They
>> should be handled consistently IMO.

I agree with David on this point. `fillInStackTrace` and
`setStackTrace` both replace this throwable's stack trace with a
different one, and I don't see why we want them to behave differently.

> But then you treat setStackTrace() differently from setStackTrace()
> with other exceptions.
> The reason to treat fillInStackTrace differently is that we lost information
> needed to compute it. This is not the case with setStackTrace().

This is because the backtrace is not reset to null, which is just an
artifact of the current implementation. A simple way to handle it is to
override NPE::setStackTrace to indicate that this NPE instance is not
subject to bytecode analysis.

> A different solution, the one I would have proposed if I had not
> considered previous comments from reviews,  would be to just
> compute the message in the runtime in the call of fillInStackTrace
> before the old stack trace is lost and assign it to the message field.
> This way it would behave similar to all other exceptions. The message
> would just be there ... just that it's computed lazily.
> The cost of the algorithm wouldn't harm that much as other costly
> algorithms (walking the stack) are performed at this point, too.
>

I expect that in the most common cases fillInStackTrace and setStackTrace
are called on explicitly created NPEs, which are not subject to bytecode
analysis. While it should be rare to see an NPE that has an extended
message but whose stack trace has been replaced, such a combination of
NPE message and stack trace would be quite confusing.

OTOH, it's okay with me if you want to replace 
NPE::numStackTracesFilledIn with a volatile NPE::message field instead 
such that it's assigned to the result returned from 
getExtendedNPEMessage() and reset to an empty string if the stack trace 
is overridden.
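
One possible reading of that alternative is sketched below. It is a
model only, not proposed JDK code: SketchNpe is a made-up subclass, the
Supplier stands in for the JDK-internal getExtendedNPEMessage(), and
the `constructed` flag is an assumption added here so that the initial
fillInStackTrace() call made by Throwable's constructor does not count
as an override.

```java
import java.util.function.Supplier;

// Hypothetical model of "a message field that is reset to an empty string
// when the stack trace is overridden". Not the real NullPointerException.
class SketchNpe extends NullPointerException {
    // Stand-in for the JDK-internal getExtendedNPEMessage().
    private final transient Supplier<String> extendedMessageSource;
    // null = not computed yet, "" = no extended message, otherwise the message.
    private transient volatile String extendedMessage;
    // Still false while Throwable's constructor calls fillInStackTrace().
    private transient boolean constructed;

    SketchNpe(Supplier<String> extendedMessageSource) {
        super();
        this.extendedMessageSource = extendedMessageSource;
        this.constructed = true;
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (constructed) {
            extendedMessage = "";   // trace replaced: drop the extended message
        }
        return super.fillInStackTrace();
    }

    @Override
    public synchronized void setStackTrace(StackTraceElement[] stackTrace) {
        extendedMessage = "";       // trace replaced: drop the extended message
        super.setStackTrace(stackTrace);
    }

    @Override
    public synchronized String getMessage() {
        String m = extendedMessage;
        if (m == null) {            // compute lazily on first use
            String computed = extendedMessageSource.get();
            m = (computed == null) ? "" : computed;
            extendedMessage = m;
        }
        return m.isEmpty() ? null : m;
    }

    public static void main(String[] args) {
        SketchNpe npe = new SketchNpe(() -> "Cannot invoke \"String.length()\"");
        System.out.println(npe.getMessage());   // extended message is available
        npe.setStackTrace(new StackTraceElement[0]);
        System.out.println(npe.getMessage());   // message was reset, prints null
    }
}
```

In this model both fillInStackTrace and setStackTrace are handled the
same way, which is the consistency argued for earlier in this message.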

Mandy

From coleen.phillimore at oracle.com  Mon Jul 13 19:56:12 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 13 Jul 2020 15:56:12 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
 <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
 <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
Message-ID: <b4d7b632-bbfc-be34-2710-3497f9572c13@oracle.com>


Looks good to me! There were so many variations of "perm gen" to look
for. Thanks for finding that one.
Coleen

On 7/13/20 3:43 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I've made minor tweaks based on Patricio's and Coleen's CR0 reviews.
> Of course, while I was looking for safepoint related comments that
> needed tweaking, I ran across a "perm gen" comment that needed to
> be fixed and a misspelling of NoSafepointVerifier. I fixed both...
>
> Fixing the "#ifdef ASSERT ... #endif" blocking touched a few files
> so I've gone ahead and generated new webrevs.
>
> Full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.full/
>
> Incremental webrev:
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.inc/
>
>
> May I please have at least one re-review?
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
>
> On 7/13/20 3:23 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
>>> Hi Coleen,
>>>
>>> Thanks for jumping in on this code review!
>>>
>>>
>>> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>>
>>>>
>>>> +#ifdef ASSERT
>>>>    void* prev = Atomic::load(&_owner);
>>>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>> +#endif
>>>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>>                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));
>>>>
>>>> Just a nit but these patterns look really strange. Can you put the
>>>> #endif on the other side of the assert?
>>>
>>> I can do that. In previous reviews of other fixes, I think I've had 
>>> assert()
>>> calls inside #ifdef ASSERT ... #endif blocks and have been asked to 
>>> move
>>> them out since it's redundant.
>>>
>>> I'll plan to make that change, but I'll keep an eye open for complaints
>>> from other reviewers before I push...
>>>
>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>>>
>>>>
>>>> if 
>>>> (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) 
>>>> { ... + ObjectSynchronizer::do_safepoint_work();
>>>>
>>>>
>>>> Why do we still have this to trigger the ServiceThread when we're 
>>>> going to only wait GuaranteedSafepointInterval before checking for 
>>>> monitor deflation anyway?
>>>
>>> When a cleanup safepoint triggers an async deflation request, it
>>> counts as a direct async deflation request so it is not subject
>>> to the AsyncDeflationInterval limit. In other words, when a cleanup
>>> safepoint asks, we honor the request regardless of when the most
>>> recent async deflation request was finished.
>>>
>>> When we do a periodic wake-up and check to see if an async deflation
>>> is needed, we'll only honor that request if it has been longer than
>>> AsyncDeflationInterval since we last completed an async deflation
>>> cycle. So the periodic invocation mechanism has a limit so that we
>>> don't swamp the ServiceThread.
>>
>> Ok, I see it now, thanks! If we had a lot of safepoints, we would
>> swamp the service thread also, but we don't have a lot of safepoints 
>> anymore.
>>
>>>
>>>
>>>> Why have this in a safepoint cleanup task?
>>>
>>> I figured I would leave it because there are plans to get rid of the
>>> safepoint cleanup mechanism and whoever does that work can deal with
>>> any comments in the code that talk about cleanup safepoints and the
>>> like. That person will also have to evaluate and deal with whether
>>> the safepoint cleanup task removal causes unacceptable issues in the
>>> system.
>>>
>>
>> I think we're still going to need safepoint cleanup for the other 
>> things, but ok, someone can evaluate this later. There's also this
>> RFE, which I linked to this just now.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8227060
>>
>> Thanks,
>> Coleen
>>>
>>>> The code deletion is really nice.
>>>
>>> Thanks!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> Changes look good to me!
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>>>> lifecycle:
>>>>>>
>>>>>> // Inflation unlinks monitors from om_list_globals._free_list or 
>>>>>> a per-thread
>>>>>> // free list and associates them with objects. Deflation -- which 
>>>>>> occurs at
>>>>>> // STW-time or asynchronously -- disassociates idle monitors from 
>>>>>> objects.
>>>>>> // Such scavenged monitors are returned to the 
>>>>>> om_list_globals._free_list.
>>>>>>
>>>>>> With all the older code removed, are there still cases where we 
>>>>>> do deflations at safepoint?
>>>>>
>>>>> Good catch! I need to adjust that comment. I'll look for others also.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Patricio
>>>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> I need a second review folks... any takers?
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> This all looks good to me.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>>>> here!!
>>>>>>>>>
>>>>>>>>> Sorry Dan didn't get to it before vacation. But if you can 
>>>>>>>>> wait till Monday ...
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>>>>>> JDK16. We can
>>>>>>>>>>> also get rid of the safepoint based deflation mechanism 
>>>>>>>>>>> since turning off
>>>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only 
>>>>>>>>>>> way left to
>>>>>>>>>>> use it.
>>>>>>>>>>>
>>>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>>>> touched/deleted
>>>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>>>> touched/changed lines
>>>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>>>> actually a pretty
>>>>>>>>>>> fast read... :-)
>>>>>>>>>>>
>>>>>>>>>>> Here's the bug ID:
>>>>>>>>>>>
>>>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and the
>>>>>>>>>>>     safepoint based deflation mechanism
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>>>
>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>>>
>>>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which 
>>>>>>>>>>> is jdk-16+4
>>>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>>>
>>>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 
>>>>>>>>>>> and there are
>>>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>>>> stress testing
>>>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>>>> thunderstorm
>>>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>>>
>>>>>>>>>>> Thanks, in advance, for any comments, questions, or 
>>>>>>>>>>> suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Mon Jul 13 19:57:26 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 15:57:26 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <b4d7b632-bbfc-be34-2710-3497f9572c13@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
 <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
 <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
 <b4d7b632-bbfc-be34-2710-3497f9572c13@oracle.com>
Message-ID: <80eabaee-231a-bb62-760b-be67d678e2af@oracle.com>

On 7/13/20 3:56 PM, coleen.phillimore at oracle.com wrote:
>
> Looks good to me!

Thanks for the fast re-review!


> There were so many variations of "perm gen" to look for. Thanks for
> finding that one.

No problem. Thanks for letting me fix it as part of this work.

Dan


> Coleen
>
> On 7/13/20 3:43 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've made minor tweaks based on Patricio's and Coleen's CR0 reviews.
>> Of course, while I was looking for safepoint related comments that
>> needed tweaking, I ran across a "perm gen" comment that needed to
>> be fixed and a misspelling of NoSafepointVerifier. I fixed both...
>>
>> Fixing the "#ifdef ASSERT ... #endif" blocking touched a few files
>> so I've gone ahead and generated new webrevs.
>>
>> Full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.full/
>>
>> Incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.inc/
>>
>>
>> May I please have at least one re-review?
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>>
>> On 7/13/20 3:23 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
>>>> Hi Coleen,
>>>>
>>>> Thanks for jumping in on this code review!
>>>>
>>>>
>>>> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>>>
>>>>>
>>>>> +#ifdef ASSERT
>>>>>    void* prev = Atomic::load(&_owner);
>>>>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>>> +#endif
>>>>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>>>                   ", expected=" INTPTR_FORMAT, p2i(prev), p2i(old_value));
>>>>>
>>>>> Just a nit but these patterns look really strange. Can you put
>>>>> the #endif on the other side of the assert?
>>>>
>>>> I can do that. In previous reviews of other fixes, I think I've had 
>>>> assert()
>>>> calls inside #ifdef ASSERT ... #endif blocks and have been asked to 
>>>> move
>>>> them out since it's redundant.
>>>>
>>>> I'll plan to make that change, but I'll keep an eye open for 
>>>> complaints
>>>> from other reviewers before I push...
>>>>
>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>>>>
>>>>>
>>>>> if 
>>>>> (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) 
>>>>> { ... + ObjectSynchronizer::do_safepoint_work();
>>>>>
>>>>>
>>>>> Why do we still have this to trigger the ServiceThread when we're 
>>>>> going to only wait GuaranteedSafepointInterval before checking for 
>>>>> monitor deflation anyway?
>>>>
>>>> When a cleanup safepoint triggers an async deflation request, it
>>>> counts as a direct async deflation request so it is not subject
>>>> to the AsyncDeflationInterval limit. In other words, when a cleanup
>>>> safepoint asks, we honor the request regardless of when the most
>>>> recent async deflation request was finished.
>>>>
>>>> When we do a periodic wake-up and check to see if an async deflation
>>>> is needed, we'll only honor that request if it has been longer than
>>>> AsyncDeflationInterval since we last completed an async deflation
>>>> cycle. So the periodic invocation mechanism has a limit so that we
>>>> don't swamp the ServiceThread.
>>>
>>> Ok, I see it now, thanks! If we had a lot of safepoints, we would
>>> swamp the service thread also, but we don't have a lot of safepoints 
>>> anymore.
>>>
>>>>
>>>>
>>>>> Why have this in a safepoint cleanup task?
>>>>
>>>> I figured I would leave it because there are plans to get rid of the
>>>> safepoint cleanup mechanism and whoever does that work can deal with
>>>> any comments in the code that talk about cleanup safepoints and the
>>>> like. That person will also have to evaluate and deal with whether
>>>> the safepoint cleanup task removal causes unacceptable issues in the
>>>> system.
>>>>
>>>
>>> I think we're still going to need safepoint cleanup for the other 
>>> things, but ok, someone can evaluate this later. There's also this 
>>> RFE, which I linked to this just now.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8227060
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>>> The code deletion is really nice.
>>>>
>>>> Thanks!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>>>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> Changes look good to me!
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>>>>> lifecycle:
>>>>>>>
>>>>>>> // Inflation unlinks monitors from om_list_globals._free_list or 
>>>>>>> a per-thread
>>>>>>> // free list and associates them with objects. Deflation -- 
>>>>>>> which occurs at
>>>>>>> // STW-time or asynchronously -- disassociates idle monitors 
>>>>>>> from objects.
>>>>>>> // Such scavenged monitors are returned to the 
>>>>>>> om_list_globals._free_list.
>>>>>>>
>>>>>>> With all the older code removed, are there still cases where we 
>>>>>>> do deflations at safepoint?
>>>>>>
>>>>>> Good catch! I need to adjust that comment. I'll look for others 
>>>>>> also.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Patricio
>>>>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Thanks for the review!
>>>>>>>>
>>>>>>>> I need a second review folks... any takers?
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>>>>> Hi Dan,
>>>>>>>>>
>>>>>>>>> This all looks good to me.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>>>>> here!!
>>>>>>>>>>
>>>>>>>>>> Sorry Dan didn't get to it before vacation. But if you can 
>>>>>>>>>> wait till Monday ...
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>
>>>>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option 
>>>>>>>>>>>> from JDK16. We can
>>>>>>>>>>>> also get rid of the safepoint based deflation mechanism 
>>>>>>>>>>>> since turning off
>>>>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the 
>>>>>>>>>>>> only way left to
>>>>>>>>>>>> use it.
>>>>>>>>>>>>
>>>>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>>>>> touched/deleted
>>>>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>>>>> touched/changed lines
>>>>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>>>>> actually a pretty
>>>>>>>>>>>> fast read... :-)
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the bug ID:
>>>>>>>>>>>>
>>>>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and 
>>>>>>>>>>>> the safepoint
>>>>>>>>>>>>                 based deflation mechanism
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>>>>
>>>>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which 
>>>>>>>>>>>> is jdk-16+4
>>>>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>>>>
>>>>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 
>>>>>>>>>>>> and there are
>>>>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>>>>> stress testing
>>>>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>>>>> thunderstorm
>>>>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, in advance, for any comments, questions, or 
>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From patricio.chilano.mateo at oracle.com  Mon Jul 13 20:17:44 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Mon, 13 Jul 2020 17:17:44 -0300
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
 <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
 <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
Message-ID: <b000024e-213a-0f68-ec71-80f25b06ea9a@oracle.com>

Looks good Dan! Thanks for the fixes.

Patricio
On 7/13/20 4:43 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I've made minor tweaks based on Patricio's and Coleen's CR0 reviews.
> Of course, while I was looking for safepoint related comments that
> needed tweaking, I ran across a "perm gen" comment that needed to
> be fixed and a misspelling of NoSafepointVerifier. I fixed both...
>
> Fixing the "#ifdef ASSERT ... #endif" blocking touched a few files
> so I've gone ahead and generated new webrevs.
>
> Full webrev:
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.full/
>
> Incremental webrev:
>
> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.inc/
>
>
> May I please have at least one re-review?
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
>
> On 7/13/20 3:23 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
>>> Hi Coleen,
>>>
>>> Thanks for jumping in on this code review!
>>>
>>>
>>> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>>
>>>>
>>>> +#ifdef ASSERT
>>>>    void* prev = Atomic::load(&_owner);
>>>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" 
>>>> INTPTR_FORMAT
>>>> +#endif
>>>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>>                   ", expected=" INTPTR_FORMAT, p2i(prev), 
>>>> p2i(old_value));
>>>>
>>>> Just a nit but these patterns look really strange. Can you put the 
>>>> #endif on the other side of the assert?
>>>
>>> I can do that. In previous reviews of other fixes, I think I've had 
>>> assert()
>>> calls inside #ifdef ASSERT ... #endif blocks and have been asked to 
>>> move
>>> them out since it's redundant.
>>>
>>> I'll plan to make that change, but I'll keep an eye open for complaints
>>> from other reviewers before I push...
>>>
>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>>>
>>>>
>>>> if 
>>>> (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) 
>>>> { ... + ObjectSynchronizer::do_safepoint_work();
>>>>
>>>>
>>>> Why do we still have this to trigger the ServiceThread when we're 
>>>> going to only wait GuaranteedSafepointInterval before checking for 
>>>> monitor deflation anyway?
>>>
>>> When a cleanup safepoint triggers an async deflation request, it
>>> counts as a direct async deflation request so it is not subject
>>> to the AsyncDeflationInterval limit. In other words, when a cleanup
>>> safepoint asks, we honor the request regardless of when the most
>>> recent async deflation request was finished.
>>>
>>> When we do a periodic wake-up and check to see if an async deflation
>>> is needed, we'll only honor that request if it has been longer than
>>> AsyncDeflationInterval since we last completed an async deflation
>>> cycle. So the periodic invocation mechanism has a limit so that we
>>> don't swamp the ServiceThread.
>>
>> Ok, I see it now, thanks! If we had a lot of safepoints, we would 
>> swamp the service thread also, but we don't have a lot of safepoints 
>> anymore.
>>
>>>
>>>
>>>> Why have this in a safepoint cleanup task?
>>>
>>> I figured I would leave it because there are plans to get rid of the
>>> safepoint cleanup mechanism and whoever does that work can deal with
>>> any comments in the code that talk about cleanup safepoints and the
>>> like. That person will also have to evaluate and deal with whether
>>> the safepoint cleanup task removal causes unacceptable issues in the
>>> system.
>>>
>>
>> I think we're still going to need safepoint cleanup for the other 
>> things, but ok, someone can evaluate this later. There's also this 
>> RFE, which I linked to this just now.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8227060
>>
>> Thanks,
>> Coleen
>>>
>>>> The code deletion is really nice.
>>>
>>> Thanks!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>>>> Hi Dan,
>>>>>>
>>>>>> Changes look good to me!
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>>>> lifecycle:
>>>>>>
>>>>>> // Inflation unlinks monitors from om_list_globals._free_list or 
>>>>>> a per-thread
>>>>>> // free list and associates them with objects. Deflation -- which 
>>>>>> occurs at
>>>>>> // STW-time or asynchronously -- disassociates idle monitors from 
>>>>>> objects.
>>>>>> // Such scavenged monitors are returned to the 
>>>>>> om_list_globals._free_list.
>>>>>>
>>>>>> With all the older code removed, are there still cases where we 
>>>>>> do deflations at safepoint?
>>>>>
>>>>> Good catch! I need to adjust that comment. I'll look for others also.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Patricio
>>>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> I need a second review folks... any takers?
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>>>> Hi Dan,
>>>>>>>>
>>>>>>>> This all looks good to me.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>>>> here!!
>>>>>>>>>
>>>>>>>>> Sorry Dan didn't get to it before vacation. But if you can 
>>>>>>>>> wait till Monday ...
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>> Greetings,
>>>>>>>>>>>
>>>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option from 
>>>>>>>>>>> JDK16. We can
>>>>>>>>>>> also get rid of the safepoint based deflation mechanism 
>>>>>>>>>>> since turning off
>>>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the only 
>>>>>>>>>>> way left to
>>>>>>>>>>> use it.
>>>>>>>>>>>
>>>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>>>> touched/deleted
>>>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>>>> touched/changed lines
>>>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>>>> actually a pretty
>>>>>>>>>>> fast read... :-)
>>>>>>>>>>>
>>>>>>>>>>> Here's the bug ID:
>>>>>>>>>>>
>>>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and 
>>>>>>>>>>> the safepoint
>>>>>>>>>>>                 based deflation mechanism
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>>>
>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>>>
>>>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which 
>>>>>>>>>>> is jdk-16+4
>>>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>>>
>>>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 
>>>>>>>>>>> and there are
>>>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>>>> stress testing
>>>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>>>> thunderstorm
>>>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>>>
>>>>>>>>>>> Thanks, in advance, for any comments, questions, or 
>>>>>>>>>>> suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From daniel.daugherty at oracle.com  Mon Jul 13 20:27:18 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 13 Jul 2020 16:27:18 -0400
Subject: RFR(S/M): 8246476: remove AsyncDeflateIdleMonitors option and the
 safepoint based deflation mechanism
In-Reply-To: <b000024e-213a-0f68-ec71-80f25b06ea9a@oracle.com>
References: <8669d9b6-7b4d-4e1e-9409-970c5464f09e@oracle.com>
 <ce69cec8-8b60-6466-0182-59ac6c5eb05a@oracle.com>
 <d9b1c717-10aa-78e4-a318-158e037f3f03@oracle.com>
 <ee8e34aa-5b7e-900d-7b36-f8a8dc95f767@oracle.com>
 <18f5a738-53a7-44ea-d72e-9da6115e1759@oracle.com>
 <9e7efbb1-84a2-eeec-8e2c-35672b08edb8@oracle.com>
 <4ba557fc-c501-ed52-590b-fa09f8f08d04@oracle.com>
 <19e28c69-25f6-130a-dd06-d4c2a8016309@oracle.com>
 <23cfce5b-7748-b5f4-c58f-d90211adacf3@oracle.com>
 <979d2a0c-ff48-9308-25f5-cdf2c4c512f4@oracle.com>
 <7b3809ab-72f6-170d-ea24-3641f92bbec4@oracle.com>
 <b000024e-213a-0f68-ec71-80f25b06ea9a@oracle.com>
Message-ID: <22b18dcb-b2fb-baa3-a473-4bbed229bf1c@oracle.com>

Thanks for the re-review!

Dan


On 7/13/20 4:17 PM, Patricio Chilano wrote:
> Looks good Dan! Thanks for the fixes.
>
> Patricio
> On 7/13/20 4:43 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've made minor tweaks based on Patricio's and Coleen's CR0 reviews.
>> Of course, while I was looking for safepoint related comments that
>> needed tweaking, I ran across a "perm gen" comment that needed to
>> be fixed and a misspelling of NoSafepointVerifier. I fixed both...
>>
>> Fixing the "#ifdef ASSERT ... #endif" blocking touched a few files
>> so I've gone ahead and generated new webrevs.
>>
>> Full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.full/
>>
>> Incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/1_for_jdk16.inc/
>>
>>
>> May I please have at least one re-review?
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>>
>> On 7/13/20 3:23 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/13/20 2:49 PM, Daniel D. Daugherty wrote:
>>>> Hi Coleen,
>>>>
>>>> Thanks for jumping in on this code review!
>>>>
>>>>
>>>> On 7/13/20 1:21 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/objectMonitor.inline.hpp.udiff.html 
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/synchronizer.cpp.udiff.html 
>>>>>
>>>>>
>>>>> +#ifdef ASSERT
>>>>>    void* prev = Atomic::load(&_owner);
>>>>> - ADIM_guarantee(prev == old_value, "unexpected prev owner=" 
>>>>> INTPTR_FORMAT
>>>>> +#endif
>>>>> + assert(prev == old_value, "unexpected prev owner=" INTPTR_FORMAT
>>>>>                   ", expected=" INTPTR_FORMAT, p2i(prev), 
>>>>> p2i(old_value));
>>>>>
>>>>> Just a nit but these patterns look really strange. Can you put 
>>>>> the #endif on the other side of the assert?
>>>>
>>>> I can do that. In previous reviews of other fixes, I think I've had 
>>>> assert()
>>>> calls inside #ifdef ASSERT ... #endif blocks and have been asked to 
>>>> move
>>>> them out since it's redundant.
>>>>
>>>> I'll plan to make that change, but I'll keep an eye open for 
>>>> complaints
>>>> from other reviewers before I push...
>>>>
>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/src/hotspot/share/runtime/safepoint.cpp.udiff.html 
>>>>>
>>>>>
>>>>> if 
>>>>> (_subtasks.try_claim_task(SafepointSynchronize::SAFEPOINT_CLEANUP_DEFLATE_MONITORS)) 
>>>>> { ... + ObjectSynchronizer::do_safepoint_work();
>>>>>
>>>>>
>>>>> Why do we still have this to trigger the ServiceThread when we're 
>>>>> going to only wait GuaranteedSafepointInterval before checking for 
>>>>> monitor deflation anyway?
>>>>
>>>> When a cleanup safepoint triggers an async deflation request, it
>>>> counts as a direct async deflation request so it is not subject
>>>> to the AsyncDeflationInterval limit. In other words, when a cleanup
>>>> safepoint asks, we honor the request regardless of when the most
>>>> recent async deflation request was finished.
>>>>
>>>> When we do a periodic wake-up and check to see if an async deflation
>>>> is needed, we'll only honor that request if it has been longer than
>>>> AsyncDeflationInterval since we last completed an async deflation
>>>> cycle. So the periodic invocation mechanism has a limit so that we
>>>> don't swamp the ServiceThread.
>>>
>>> Ok, I see it now, thanks! If we had a lot of safepoints, we would 
>>> swamp the service thread also, but we don't have a lot of safepoints 
>>> anymore.
>>>
>>>>
>>>>
>>>>> Why have this in a safepoint cleanup task?
>>>>
>>>> I figured I would leave it because there are plans to get rid of the
>>>> safepoint cleanup mechanism and whoever does that work can deal with
>>>> any comments in the code that talk about cleanup safepoints and the
>>>> like. That person will also have to evaluate and deal with whether
>>>> the safepoint cleanup task removal causes unacceptable issues in the
>>>> system.
>>>>
>>>
>>> I think we're still going to need safepoint cleanup for the other 
>>> things, but ok, someone can evaluate this later. There's also this 
>>> RFE, which I linked to this just now.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8227060
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>>> The code deletion is really nice.
>>>>
>>>> Thanks!
>>>>
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>
>>>>> On 7/13/20 12:08 PM, Daniel D. Daugherty wrote:
>>>>>> On 7/13/20 12:05 PM, Patricio Chilano wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> Changes look good to me!
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>>> In synchronizer.cpp we have this comment about ObjectMonitor 
>>>>>>> lifecycle:
>>>>>>>
>>>>>>> // Inflation unlinks monitors from om_list_globals._free_list or 
>>>>>>> a per-thread
>>>>>>> // free list and associates them with objects. Deflation -- 
>>>>>>> which occurs at
>>>>>>> // STW-time or asynchronously -- disassociates idle monitors 
>>>>>>> from objects.
>>>>>>> // Such scavenged monitors are returned to the 
>>>>>>> om_list_globals._free_list.
>>>>>>>
>>>>>>> With all the older code removed, are there still cases where we 
>>>>>>> do deflations at safepoint?
>>>>>>
>>>>>> Good catch! I need to adjust that comment. I'll look for others 
>>>>>> also.
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Patricio
>>>>>>> On 7/13/20 10:47 AM, Daniel D. Daugherty wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Thanks for the review!
>>>>>>>>
>>>>>>>> I need a second review folks... any takers?
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/12/20 10:57 PM, David Holmes wrote:
>>>>>>>>> Hi Dan,
>>>>>>>>>
>>>>>>>>> This all looks good to me.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>> On 8/07/2020 5:51 pm, David Holmes wrote:
>>>>>>>>>> On 8/07/2020 7:21 am, Daniel D. Daugherty wrote:
>>>>>>>>>>> Ping! Any takers??? Code deletion should be really appealing 
>>>>>>>>>>> here!!
>>>>>>>>>>
>>>>>>>>>> Sorry Dan didn't get to it before vacation. But if you can 
>>>>>>>>>> wait till Monday ...
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/6/20 12:35 PM, Daniel D. Daugherty wrote:
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>>
>>>>>>>>>>>> It's time to remove the AsyncDeflateIdleMonitors option 
>>>>>>>>>>>> from JDK16. We can
>>>>>>>>>>>> also get rid of the safepoint based deflation mechanism 
>>>>>>>>>>>> since turning off
>>>>>>>>>>>> async deflation (-XX:-AsyncDeflateIdleMonitors) was the 
>>>>>>>>>>>> only way left to
>>>>>>>>>>>> use it.
>>>>>>>>>>>>
>>>>>>>>>>>> This is marked as an "S/M" review because the number of 
>>>>>>>>>>>> touched/deleted
>>>>>>>>>>>> lines makes it a Medium review, but the number of 
>>>>>>>>>>>> touched/changed lines
>>>>>>>>>>>> (outside of the deletions) makes it a Small review. It's 
>>>>>>>>>>>> actually a pretty
>>>>>>>>>>>> fast read... :-)
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the bug ID:
>>>>>>>>>>>>
>>>>>>>>>>>>     JDK-8246476 remove AsyncDeflateIdleMonitors option and 
>>>>>>>>>>>> the safepoint
>>>>>>>>>>>>                 based deflation mechanism
>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246476
>>>>>>>>>>>>
>>>>>>>>>>>> Here's the webrev URL:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~dcubed/8246476-webrev/0_for_jdk16/
>>>>>>>>>>>>
>>>>>>>>>>>> The webrev is baselined on Thomas S's fix for 8248650 which 
>>>>>>>>>>>> is jdk-16+4
>>>>>>>>>>>> plus a dozen or so changesets.
>>>>>>>>>>>>
>>>>>>>>>>>> This change has been tested with Mach5 Tier[1-3],4,5,6,7,8 
>>>>>>>>>>>> and there are
>>>>>>>>>>>> no regressions (and very few known failures). My inflation 
>>>>>>>>>>>> stress testing
>>>>>>>>>>>> is still in process. I had to restart that testing after a 
>>>>>>>>>>>> thunderstorm
>>>>>>>>>>>> related power failure took down my servers in Florida. Sigh...
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, in advance, for any comments, questions, or 
>>>>>>>>>>>> suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Dan
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


From luhenry at microsoft.com  Mon Jul 13 21:55:24 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Mon, 13 Jul 2020 21:55:24 +0000
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
Message-ID: <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Thomas,

Thank you for your feedback!

Let me respond to some of the cases you mention.

> A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

Exactly.

> B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

Today, through SEH and RtlAddFunctionTable, we only get a very clear subset of exceptions: the ones triggered in the code cache. If an exception is triggered from a PC outside of this code cache, SEH will not find the handler we registered with RtlAddFunctionTable, and we'll simply _not_ call into HandleExceptionFromCodeCache (the handler we register with RtlAddFunctionTable). That can be trivially reproduced in the VEH by simply checking that the PC is between CodeCache::low_bound() and CodeCache::high_bound().

This is what you are mentioning with "we only can distinguish our crashes from their crashes via crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH".
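
To make the range check concrete, here is a rough sketch in portable C++ rather than real Win32 code. `CodeCacheBounds`, `pc_in_code_cache`, `filter_by_pc` and `CodeCacheHandler` are hypothetical names standing in for CodeCache::low_bound()/high_bound() and HandleExceptionFromCodeCache, and treating the range as half-open is my assumption. The two return codes carry the values the Win32 headers give EXCEPTION_CONTINUE_EXECUTION and EXCEPTION_CONTINUE_SEARCH.

```cpp
#include <cassert>
#include <cstdint>

// Same values as the Win32 macros; redefined under hypothetical names so
// that the sketch compiles without windows.h.
constexpr long kExceptionContinueExecution = -1;  // EXCEPTION_CONTINUE_EXECUTION
constexpr long kExceptionContinueSearch = 0;      // EXCEPTION_CONTINUE_SEARCH

// Hypothetical stand-in for CodeCache::low_bound() / CodeCache::high_bound().
struct CodeCacheBounds {
  std::uintptr_t low;
  std::uintptr_t high;
};

// True when pc lies in [low, high). The half-open range is an assumption.
inline bool pc_in_code_cache(const CodeCacheBounds& cc, std::uintptr_t pc) {
  assert(cc.low <= cc.high);
  return pc >= cc.low && pc < cc.high;
}

// Hypothetical stand-in for the existing code cache handler.
using CodeCacheHandler = long (*)(std::uintptr_t pc);

// Decision a vectored handler could make: exceptions raised outside the
// code cache are passed on untouched, everything else goes to the handler.
inline long filter_by_pc(const CodeCacheBounds& cc, std::uintptr_t pc,
                         CodeCacheHandler handle_in_code_cache) {
  if (!pc_in_code_cache(cc, pc)) {
    return kExceptionContinueSearch;  // not ours: let the next handler look
  }
  return handle_in_code_cache(pc);
}
```

In a real handler the PC would come from the exception's context record, and the early return is what keeps us out of the way of an application's own handlers.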

> With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Here, Unhandled Exception Handling predates VEH and it doesn't integrate chaining. The API is similar to signals on Linux/Unix: the last one to register has to make sure to save the previous one and to call/chain it accordingly.
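
The save-and-chain pattern could look roughly like the sketch below. It is a portable model, not the real API: `set_top_level_filter` only imitates the one property of SetUnhandledExceptionFilter that matters here (it returns the previously installed filter), the filter signature is simplified to take just an exception code, and every name in it is hypothetical.

```cpp
#include <cassert>

constexpr long kExceptionContinueSearch = 0;  // EXCEPTION_CONTINUE_SEARCH
constexpr long kExceptionExecuteHandler = 1;  // EXCEPTION_EXECUTE_HANDLER
constexpr unsigned long kAccessViolation = 0xC0000005UL;  // EXCEPTION_ACCESS_VIOLATION

// Simplified, hypothetical filter signature (the real one receives a
// pointer to the exception record and context).
using TopLevelFilter = long (*)(unsigned long code);

// Model of the single process-wide slot: installing a filter returns the
// previous one, and nothing else remembers it.
static TopLevelFilter g_top_level = nullptr;
inline TopLevelFilter set_top_level_filter(TopLevelFilter f) {
  TopLevelFilter previous = g_top_level;
  g_top_level = f;
  return previous;
}

static TopLevelFilter g_previous = nullptr;  // saved for chaining
static int g_reports_written = 0;            // stands in for a crash report

// Our filter: report what we recognise, chain everything else.
inline long crash_report_filter(unsigned long code) {
  if (code == kAccessViolation) {
    ++g_reports_written;
    return kExceptionExecuteHandler;
  }
  return g_previous != nullptr ? g_previous(code) : kExceptionContinueSearch;
}

inline void install_crash_report_filter() {
  TopLevelFilter previous = set_top_level_filter(crash_report_filter);
  assert(previous != crash_report_filter);  // installing twice would self-chain
  g_previous = previous;
}
```

Whoever registers after us has to do the same, otherwise our filter is silently dropped. That is the answer to "what happens if multiple parties call this?".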

> My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Nothing in the Win32 API stops you from mixing and matching VEH and SEH. If you want to do a `__try { val = *ptr; } __except (EXCEPTION_EXECUTE_HANDLER) { success = false; }` in some C++ code (in vm or native), nothing stops you from doing so. My understanding of the exception handler logic in the OpenJDK on Windows is that the accepted EXCEPTION_ACCESS_VIOLATION in java, vm, or native code is limited to a clear subset, and anything outside of these known cases is quickly treated as "an exception we cannot handle". SafeFetch is such a case, where the instructions potentially triggering the EXCEPTION_ACCESS_VIOLATION are matched by the exception handler.
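
As an illustration of that "known faulting instruction" matching, here is a minimal portable sketch. It is not the actual SafeFetch implementation: `FakeContext`, `KnownFault` and `handle_known_fault` are hypothetical, and the context is reduced to a single PC field.

```cpp
#include <cassert>
#include <cstdint>

constexpr long kExceptionContinueExecution = -1;  // EXCEPTION_CONTINUE_EXECUTION
constexpr long kExceptionContinueSearch = 0;      // EXCEPTION_CONTINUE_SEARCH

// Hypothetical, trimmed-down stand-in for the PC field of a Win32 CONTEXT.
struct FakeContext {
  std::uintptr_t pc;
};

// One instruction that is allowed to fault, and where to resume afterwards.
struct KnownFault {
  std::uintptr_t fault_pc;
  std::uintptr_t continuation_pc;
};

// If the faulting PC is one we expect, redirect execution to its
// continuation; otherwise leave the context alone and keep searching.
inline long handle_known_fault(const KnownFault* table, int count,
                               FakeContext* ctx) {
  assert(ctx != nullptr);
  for (int i = 0; i < count; ++i) {
    if (table[i].fault_pc == ctx->pc) {
      ctx->pc = table[i].continuation_pc;
      return kExceptionContinueExecution;
    }
  }
  return kExceptionContinueSearch;
}
```

Anything that does not match falls through unchanged, which is the "exception we cannot handle" path.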

--
Ludovic

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Saturday, July 11, 2020 23:08
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

sorry for the delay, and thanks for the extensive answer. Please find remarks inline.

On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

It seems that the problem you're describing stems from the current exception handler treating two cases: 1. any exception knowingly triggered by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy stubs, stackoverflow in Java code), and 2. exceptional cases leading to crashes (ex: uncaught C++ exception, an access violation in VM or native/external code, etc.). There is the same problem on Unix because there is only one system (signal handling) for both cases. Fortunately, Windows proposes different systems, each with its own advantages.

The order in which Windows invokes each of these systems is the following:
 1. Vectored Exception Handler registered with `AddVectoredExceptionHandler`
 2. Structured Exception Handler
 3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
 4. Unhandled Exception Handler

Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler registered with `RtlAddFunctionTable`. This handler deals with both the Java-triggered exceptions and any other exceptions.

Now, from the point of view of an external library or application embedding the JVM inside their own process, they still have all the above options to register an exception handler, irrespective of how Hotspot does it. This creates the following cases:
 - If the application uses VEH: they will (with Hotspot using SEH) be called _before_ Hotspot's exception handler and will then have to be aware that they may get exceptions unrelated to them and will have to ignore them accordingly
 - If the application uses SEH: they will only get exceptions related to their code area

If Hotspot is to use VEH, an exception would play out as follows:
 - If the application uses VEH and their registered handler executes _before_ Hotspot's one: same as above
 - If the application uses VEH and their registered handler executes _after_ Hotspot's one: Hotspot has to make sure that the exception was triggered by Hotspot and ignore it otherwise (a range check on the PC can be used here to emulate how it's done with RtlAddFunctionTable)
 - If the application uses SEH: the same as the case where the application's handler executes _after_ Hotspot's one

This all assumes that Hotspot's VEH handler doesn't trigger a crash report (VMError::report_and_die) on any exception it doesn't know how to handle. The simplest way to ensure that is simply _not_ to do it in Hotspot's VEH handler, and to do it by registering a Win32 Unhandled Exception Handler (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when no other exception handler treated the exception (by returning EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it means the application is "toast" and not in a runnable state anymore, which fits nicely with the purpose of the Hotspot crash report.


Okay, if I get this correctly:

Today:
  App uses VEH - they execute before us and have to handle this correctly (->A)
  App uses SEH - no interaction

With proposed switch:
  App uses VEH - they may or may not execute before us. If they come before us: (->A). If they come after us -> (B)
  App uses SEH -> (B)

A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

On Windows we do not know this (?); we can only distinguish our crashes from their crashes via the crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH. With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Okay this seems safe enough to try it at least.

My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Thank you,

Thomas

I hope this sheds some light on possible solutions ahead of us.

Thank you,

--
Ludovic

[1] https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Sunday, June 21, 2020 05:55
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi,

We at SAP had used VEH in our own Windows Itanium port and I dimly remember it being a source of problems. That was many years ago and I realize that it is not worth much, but it makes me a bit apprehensive of this change.

The main problem I see is that this will be an observable change in behavior.

We currently use SEH, so our error handler is guaranteed to be invoked only for exceptions from within our own code. With VEH we now follow the Unix way of things and suddenly our error handler becomes a global resource.

We will suddenly be invoked for crashes outside the VM, e.g. in foreign launcher code atop of us or in non-java side threads, which will generate whole new classes of hs-err files for crashes the VM is not responsible for. Which are then perceived as VM crashes and sent to us vendors instead of going to the right people. This is the way it works on Unix today, and it is a constant annoyance and increases our support workload.

We also may introduce new problems since suddenly we interfere with application exception handling. At the very least, we have to think up a scheme for signal chaining (both ways: VM->foreign code and foreign code->VM). For the first, we probably need some form of libjsig preloading, or some other way to divert signal handler instalment. That would also need cooperation from the application programmers and/or operators.

Matters are even more complicated, since foreign code may use SEH instead of VEH, so what happens if a JNI library below me wants to use SEH, does that still work?

I feel this should not be rushed. Even if it is considered "brittle", SEH has served us well; I do not recall many problems in the past aside from having to add the occasional __try/__except. Are there actual bugs we have to solve?

Lastly, personally I always found SEH quite a neat concept, and one of the few places where Windows was superior to Unix :)

Thanks, Thomas


On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hello,

First, some context and definitions:
- when talking about exceptions here, I'm talking about Win32 exceptions, which are equivalent to signals on Linux and other Unix systems; I am _not_ talking about Java exceptions.
- an explanation of an _exception filter_ can be found at https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019. There is only a limited concept of that in Java, with type-based exception filters (ex: `try { ... } catch (IOException ioe) { ... } catch (Throwable t) { ... }`).
- in Win32, there exist two exception handling mechanisms:
  - Structured Exception Handling: the historical one, based on `__try {} __except (...) {}`
  - Vectored Exception Handling: introduced in Windows XP / Windows Server 2003, much more similar to signals on Linux

These exception handling mechanisms are used to catch exceptions such as Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These exceptions are equivalent to signals on Linux and are therefore core to many mechanisms in the OpenJDK.

Today, the OpenJDK uses Structured Exception Handling to catch such exceptions, which creates several requirements. First, all code that might trigger an exception on purpose (like an Access Violation / SIGSEGV in the arraycopy stub) needs to be wrapped in a __try / __except. Because it's not feasible to wrap every single instance of such code, these __try / __except blocks are put in the topmost function of any thread started by the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable` is used to simulate the use of __try / __except for a specific code area. This function needs platform-specific code that generates a trampoline calling the exception filter declared in the runtime. It's also meant to be used as a one-to-one mapping with try / catch in user code, and not as a "catch all the exceptions in this code area". Third, Structured Exception Handling expects to be able to unwind the stack. However, because Hotspot doesn't guarantee the usage of the platform-specific ABI internally, the platform-specific unwinder might break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies on the assumption that Structured Exception Handling never tries to unwind the stack (which it would fail to do because of the different ABI) before calling the registered exception filter.

According to the Windows Kernel maintainers I discussed this with, this approach is highly discouraged and considered brittle, and the better solution is Vectored Exception Handling. Vectored Exception Handling is conceptually much more similar to signal / sigaction on Linux and other Unix systems. It will catch all exceptions happening across the process, and no __try / __except will be required. It also removes the requirement to call `RtlAddFunctionTable`. The exception filter then behaves like a signal handler, with the possibility to modify the registers at will, for example modifying the PC to step over an instruction after an expected Access Violation. Vectored Exception Handling is also already used for AOT code.

The changes can be found at http://cr.openjdk.java.net/~burban/ludovic_vecexc/. As I am not an author, I have not created a corresponding bug in JBS.

Thank you, and looking forward to your feedback!

--
Ludovic


From luhenry at microsoft.com  Tue Jul 14 01:28:03 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 14 Jul 2020 01:28:03 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>,
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
Message-ID: <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>

Hello,

> But if we are dealing with non-TSO races then it would be good to get
> some guidance from Microsoft as to the memory ordering properties of
> various API's to ensure that we are maintaining correct ordering. For
> example, in the destructor we have:
> 
>     lock_owner = 0;
>     // No lost wakeups, lock_event stays signaled until reset.
>     DWORD ret = SetEvent(lock_event);
> 
> but unless we are guaranteed that the store to lock_owner cannot be
> reordered by the compiler or the hardware, to appear to be after the
> SetEvent, then the logic is broken. Generally, because Windows only
> supported TSO systems, we have assumed that the compiler will not
> reorder code across these kind of API calls. But now we also need
> hardware guarantees.

I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.

As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.

Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.

I hope this helps to answer your questions.

[1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior

--
Ludovic
________________________________________
From: Andrew Haley <aph at redhat.com>
Sent: Monday, July 13, 2020 01:36
To: David Holmes; Thomas Stüfe
Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

On 13/07/2020 06:48, David Holmes wrote:
> Hi Thomas,
>
> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>
>> Can a compiler reorder system calls and stores? How would it determine
>> if this is safe to do?

I very much doubt it.

> A compiler can reorder anything it likes if it can determine it is safe
> to do so. :)

I'm fairly sure the compiler doesn't care about that!

>> I'd be surprised if Microsoft loosened up reordering since this would
>> mean existing software cannot just be recompiled for arm and expected to
>> work. But this is just a guess of course.
>
> It's an interesting point because I would expect there to be a lot of
> software written for Windows that contains assumptions of TSO that would
> in fact fail when run on Aarch64. I don't know if there are any special
> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
> (or build flags), that would allow for ease of migration.

There's no standard hardware mechanism that would do so.

I've been very surprised at how little software has broken on AArch64
because of memory ordering. Like you, I initially assumed that stuff
would break all over the place, but by and large it was OK. I know of
two reasons: firstly, programmers are pretty conservative and tend to
use simple and reliable mechanisms such as safe publication and
mutexes for inter-thread communication. But also, and maybe more
importantly, the kinds of reordering the hardware can do are not very
different from those compilers do. Therefore, anyone playing fast and
loose with TSO has probably already been bitten by the compiler.
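
To illustrate the "simple and reliable mechanisms" mentioned above, here is a minimal sketch of safe publication. It is written in Java rather than the C++ of ThreadCritical, and every name in it (SafePublicationDemo, Config, published, runOnce) is hypothetical and not taken from the patch under review. The volatile write orders the construction of the object before its publication, so a reader that observes the reference also observes the fully initialized field, on TSO and weakly ordered hardware alike.

```java
// Sketch only: hypothetical names, not code from the JDK or from the webrev.
class SafePublicationDemo {
    static final class Config {
        final int value;
        Config(int value) { this.value = value; }
    }

    // volatile provides the release/acquire ordering between writer and reader
    private static volatile Config published;

    static int runOnce() {
        published = null;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Config c;
            // Spin until the writer has published the object.
            while ((c = published) == null) {
                Thread.onSpinWait();
            }
            seen[0] = c.value;
        });
        reader.start();
        published = new Config(42); // construct first, then publish
        try {
            reader.join(); // join() makes the reader's store to seen[0] visible here
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

Without the volatile (or an equivalent lock), neither the compiler nor the hardware would be obliged to keep the two stores in order, which is the kind of assumption being discussed in this thread.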

> But unless all Windows software will run in such a mode there is a
> need for MS to document what the memory consistency properties of
> various APIs are (as POSIX does [1]).

Indeed. I would have thought it existed somewhere.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From david.holmes at oracle.com  Tue Jul 14 02:11:25 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 19:11:25 -0700 (PDT)
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>

Hi Goetz,

Okay ... if I understand your position correctly you are looking at this 
as if the extended message is created at the time the NPE is thrown, and 
it is an implementation detail that we actually determine it lazily. If 
it were eagerly determined then neither fillInStackTrace() nor 
setStackTrace() would make any difference to the message - just as with 
any other exception message.

However, the lazy determination of the message causes a problem with 
fillInStackTrace() because that call will destroy the original backtrace 
needed to produce the original message, and create an incorrect message. 
setStackTrace() does not have a similar problem because, simply by the 
way the current implementation works it doesn't touch the original 
backtrace.
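
As a small illustration of the setStackTrace() behaviour for an ordinary exception, the following sketch (the class and method names SetStackTraceDemo and describe are made up for this example) shows that replacing the stack trace changes what getStackTrace() reports but leaves getMessage() untouched:

```java
// Sketch only: demonstrates standard Throwable behaviour with hypothetical names.
class SetStackTraceDemo {
    // Returns "<message> @ <method name of the top frame>".
    static String describe(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();
        String top = frames.length == 0 ? "<none>" : frames[0].getMethodName();
        return t.getMessage() + " @ " + top;
    }

    public static void main(String[] args) {
        IllegalStateException e = new IllegalStateException("original message");
        System.out.println(describe(e)); // original message @ main

        // Replace the trace with an arbitrary, unrelated frame.
        e.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("Other", "elsewhere", "Other.java", 1)
        });
        System.out.println(describe(e)); // original message @ elsewhere
    }
}
```

The message survives while the trace now points somewhere else, which is the message/trace mismatch the two positions in this thread weigh differently.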

So you are proposing to only fix the bug that is evident in relation to 
fillInStackTrace() by no longer evaluating the extended message if 
fillInStackTrace() is called after the NPE was constructed.

But in doing so you break the illusion that the extended message acts 
as-if determined at construction time, because you now effectively clear 
it when fillInStackTrace is called.

My position was that if fillInStackTrace can be seen to clear it, then 
setStackTrace (which is logically somewhat equivalent) should also be 
seen to clear it.

Alternatively, add a new field to NPE to cache the extended error 
message, and explicitly evaluate the message if fillInStackTrace() is 
called. That will continue the illusion that the extended message was 
actually set at construction time. No changes needed to setStackTrace() 
as we can still lazily compute the extended message.

Something like:

// Cache for the lazily computed extended message.
private String extendedMessage;

public synchronized Throwable fillInStackTrace() {
     // Compute the message before the original backtrace is replaced.
     if (extendedMessage == null) {
         extendedMessage = getExtendedNPEMessage();
     }
     return super.fillInStackTrace();
}

public String getMessage() {
     String message = super.getMessage();
     synchronized(this) {
         if (message == null) {
             // This NPE should have an extended message.
             if (extendedMessage == null) {
                 extendedMessage = getExtendedNPEMessage();
             }
             message = extendedMessage;
         }
     }
     return message;
}
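
For anyone who wants to try the idea outside the JDK, here is a self-contained sketch of the same caching pattern. It is an approximation under stated assumptions, not the proposed patch: LazyDetailException is a hypothetical class, computeDetail() is a stand-in for getExtendedNPEMessage() that derives its text from the top frame of the current stack trace, and the `constructed` flag is my own addition to skip the fillInStackTrace() call made from Throwable's constructor.

```java
// Sketch only: hypothetical stand-in for the NPE extended-message logic.
class LazyDetailDemo {
    static class LazyDetailException extends RuntimeException {
        private boolean constructed;  // still false while Throwable's constructor runs
        private String cachedDetail;  // cache for the lazily computed detail

        LazyDetailException() {
            super();
            constructed = true;
        }

        // Stand-in for getExtendedNPEMessage(): depends on the current trace.
        private String computeDetail() {
            StackTraceElement[] frames = getStackTrace();
            return frames.length == 0 ? null : "raised in " + frames[0].getMethodName();
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Evaluate the detail before the original trace is overwritten.
            if (constructed && cachedDetail == null) {
                cachedDetail = computeDetail();
            }
            return super.fillInStackTrace();
        }

        @Override
        public synchronized String getMessage() {
            if (cachedDetail == null) {
                cachedDetail = computeDetail();
            }
            return cachedDetail;
        }
    }

    static LazyDetailException raise() {
        return new LazyDetailException();
    }

    static void refill(Throwable t) {
        t.fillInStackTrace();
    }

    public static void main(String[] args) {
        LazyDetailException e = raise();
        refill(e); // replaces the trace, but the detail was cached first
        System.out.println(e.getMessage()); // prints "raised in raise"
    }
}
```

Without the caching in fillInStackTrace(), the detail would be derived from the refilled trace instead of the original one, which is the wrong-message bug under discussion.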

Cheers,
David

On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
> Hi David,
> 
>> Your extended message is only computed when there is no original message.
> Hmm. I would say the extended message is only computed when
> the NPE was raised by the runtime. It happens to never have a
> message so far in these cases.
> But these are two views of the same thing.
> 
>> You're concerned about this scenario:
>>
>> catch (NullPointerException npe) {
>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>     npe.setStackTrace(...);
>>     String msg2 = npe.getMessage(); // gets null
>> }
>>
>> While I find it hard to imagine anyone doing this
> Well, all the scenarios are quite artificial:
>   - why would you call fillInStackTrace on an exception thrown by the VM?
>   - why would you call setStackTrace at all?
>> you can easily have
>> specified that the extended message is only available with the original
>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>> setStackTrace, then the message reverts to being empty.
> The message is not meant to be a special thing that behaves differently
> from other messages, such as sometimes being available and sometimes not.
> It ended up being different through requirements during the
> review.
> 
>> To me that makes
>> far more sense than having msg2 continue to report the extended info for
>> the original stacktrace when it now has a new stacktrace.
>>
>> I'm really not seeing why calling fillInStackTrace() a second time
>> should be treated any differently to calling setStackTrace(). They
>> should be handled consistently IMO.
> But then you treat setStackTrace() differently from setStackTrace()
> with other exceptions.
> The reason to treat fillInStackTrace differently is that we lost information
> needed to compute it. This is not the case with setStackTrace().
> 
> A different solution, the one I would have proposed if I had not
> considered previous comments from reviews,  would be to just
> compute the message in the runtime in the call of fillInStackTrace
> before the old stack trace is lost and assign it to the message field.
> This way it would behave similar to all other exceptions. The message
> would just be there ... just that it's computed lazily.
> The cost of the algorithm wouldn't harm that much as other costly
> algorithms (walking the stack) are performed at this point, too.
> 
>> We are not talking about all exceptions only about your NPE extended
>> error message.
> Hmm, the inconsistency caused by the code you posted above
> holds for all exceptions.  If you fiddle with the stack trace,
> the message might become pointless.  Wrt. setStackTrace
> they all behave the same.
> Wrt. fillInStackTrace the message will be wrong. Only this
> needs to be fixed.
> 
> Best regards,
>    Goetz.
> 
> 
>>
>> David
>> -----
>>
>>> I implemented an example where wrong stack traces are
>>> printed with LinkageError and NPE, modifying a jtreg test:
>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>> See also the generated output added to a comment in the patch.
>>> If the NPE message text was missing in the second printout, I think
>>> this really would be unexpected.
>>> Please note that the correct message is printed after messing
>>> with the stack trace, it's the stack trace that is wrong.
>>> (Not as with the problem I am fixing here where a wrong
>>> message is printed.)
>>>
>>> Best regards,
>>>     Goetz.
>>>
>>>
>>>
>>>>
>>>>> I guess the normal usecase of setStackTrace is the other way around:
>>>>> Change the message and throw a new exception with the existing
>>>>> stack trace:
>>>>>
>>>>> try {
>>>>>      a.x;
>>>>> } catch (NullPointerException e) {
>>>>>      NullPointerException npe = new NullPointerException("My own error message");
>>>>>      npe.setStackTrace(e.getStackTrace());
>>>>>      throw npe;
>>>>> }
>>>>>
>>>>> And not taking an arbitrary stack trace and put it into an exception
>>>>> with existing message.
>>>>
>>>> Interesting usage.
>>>>
>>>> Cheers,
>>>> David
>>>> -----
>>>>
>>>>> Best regards,
>>>>>      Goetz.
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr' <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>
>>>>>> Hi Goetz,
>>>>>>
>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> True. To ensure you process the original backtrace only you need to add
>>>>>>>> synchronization in getMessage():
>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>
>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>> without.
>>>>>>
>>>>>> No "volatile" needed, or wanted, when all access is within synchronized
>>>>>> regions.
>>>>>>
>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>> information about it just seems highly unrealistic.
>>>>>>> Yes, contention here is quite unlikely, so it should not harm performance.
>>>>>>
>>>>>> Contention was not my concern at all. :)
>>>>>>
>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>> suggest that setStackTrace be updated:
>>>>>>> The test shows that after setStackTrace still the correct message
>>>>>>> is computed. This is because the algorithm uses Throwable::backtrace
>>>>>>> and not Throwable::stacktrace.  Throwable::backtrace is not
>>>>>>> affected by setStackTrace.
>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>> the message might refer to other code than the stack trace
>>>>>>> points to.
>>>>>>
>>>>>> But you can't adapt the message text - there is no setMessage! If the
>>>>>> message is NULL and you call setStackTrace() then getMessage(), it makes
>>>>>> no sense to return the extended error message that was associated with
>>>>>> the original stack/backtrace.
>>>>>>
>>>>>> Cheers,
>>>>>> David
>>>>>>
>>>>>>> Best regards,
>>>>>>>       Goetz.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr' <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>
>>>>>>>> Hi Goetz,
>>>>>>>>
>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>> Hi Remi,
>>>>>>>>>
>>>>>>>>> But how does volatile help?
>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets always the
>>>>>>>>> right value.
>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>> getExtendedNPEMessage.  The other thread could change it after
>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>
>>>>>>>> True. To ensure you process the original backtrace only you need to add
>>>>>>>> synchronization in getMessage():
>>>>>>>>
>>>>>>>>            public String getMessage() {
>>>>>>>>                String message = super.getMessage();
>>>>>>>>                // If the stack trace was changed the extended NPE algorithm
>>>>>>>>                // will compute a wrong message.
>>>>>>>> +         synchronized(this) {
>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>> !             }
>>>>>>>> +         }
>>>>>>>>                return message;
>>>>>>>>            }
>>>>>>>>
>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>> information about it just seems highly unrealistic. But the above fixes
>>>>>>>> it simply. Though after looking at comments in the test I would also
>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>
>>>>>>>>             synchronized (this) {
>>>>>>>>                  if (this.stackTrace == null && // Immutable stack
>>>>>>>>                      backtrace == null) // Test for out of protocol state
>>>>>>>>                      return;
>>>>>>>> +           numStackTracesFilledIn++;
>>>>>>>>                  this.stackTrace = defensiveCopy;
>>>>>>>>              }
>>>>>>>>          }
>>>>>>>>
>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>
>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>> proposed in webrev 02:
>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>
>>>>>>>> I much prefer the latest version that recognises that only the original
>>>>>>>> stack can be processed.
>>>>>>>>
>>>>>>>> In the test:
>>>>>>>>
>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>
>>>>>>>> Two typos: crated  & implicilty
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>
>>>>>>>>> Its only drawback is that for this code:
>>>>>>>>>        ex = null;
>>>>>>>>>        ex.fillInStackTrace()
>>>>>>>>> no message is created.
>>>>>>>>>
>>>>>>>>> I think this really is acceptable.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is filled you don't
>>>>>>>>>> compute any helpful message anyway.
>>>>>>>>> The internal structure is no longer deleted when the stack trace
>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>        Goetz.
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>
>>>>>>>>>> yes,
>>>>>>>>>> it's what i was saying,
>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is initialized, i
>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile is the best way to
>>>>>>>>>> tackle that.
>>>>>>>>>>
>>>>>>>>>> Rémi
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
>>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>
>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>
>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>
>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>> without synchronization.
>>>>>>>>>>>
>>>>>>>>>>> -Alan

From david.holmes at oracle.com  Tue Jul 14 02:25:51 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 Jul 2020 12:25:51 +1000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>

Hi Ludovic,

On 14/07/2020 11:28 am, Ludovic Henry wrote:
> Hello,
> 
>> But if we are dealing with non-TSO races then it would be good to get
>> some guidance from Microsoft as to the memory ordering properties of
>> various API's to ensure that we are maintaining correct ordering. For
>> example, in the destructor we have:
>>
>>     lock_owner = 0;
>>     // No lost wakeups, lock_event stays signaled until reset.
>>     DWORD ret = SetEvent(lock_event);
>>
>> but unless we are guaranteed that the store to lock_owner cannot be
>> reordered by the compiler or the hardware, to appear to be after the
>> SetEvent, then the logic is broken. Generally, because Windows only
>> supported TSO systems, we have assumed that the compiler will not
>> reorder code across these kind of API calls. But now we also need
>> hardware guarantees.
> 
> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.

That is good to know. But this is something that Microsoft should be 
documenting explicitly - even if just a blanket statement that all 
syscalls (which are what exactly?) provide an implicit memory barrier 
(of what type exactly?).

> As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.

The /volatile:ms is the kind of build control I was wondering about. 
Thanks for the pointer.

> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.

jcstress tests will execute the native runtime code of course, but they 
won't be "stressing" it as such.

Cheers,
David
-----

> I hope this helps to answer your questions.
> 
> [1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior
> 
> --
> Ludovic
> ________________________________________
> From: Andrew Haley <aph at redhat.com>
> Sent: Monday, July 13, 2020 01:36
> To: David Holmes; Thomas Stüfe
> Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
> 
> On 13/07/2020 06:48, David Holmes wrote:
>> Hi Thomas,
>>
>> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>>
>>> Can a compiler reorder system calls and stores? How would it determine
>>> if this is safe to do?
> 
> I very much doubt it.
> 
>> A compiler can reorder anything it likes if it can determine it is safe
>> to do so. :)
> 
> I'm fairly sure the compiler doesn't care about that!
> 
>>> I'd be surprised if Microsoft loosened up reordering since this would
>>> mean existing software cannot just be recompiled for arm and expected to
>>> work. But this is just a guess of course.
>>
>> It's an interesting point because I would expect there to be a lot of
>> software written for Windows that contains assumptions of TSO that would
>> in fact fail when run on Aarch64. I don't know if there are any special
>> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
>> (or build flags), that would allow for ease of migration.
> 
> There's no standard hardware mechanism that would do so.
> 
> I've been very surprised at how little software has broken on AArch64
> because of memory ordering. Like you, I initially assumed that stuff
> would break all over the place, but by and large it was OK. I know of
> two reasons: firstly, programmers are pretty conservative and tend to
> use simple and reliable mechanisms such as safe publication and
> mutexes for inter-thread communication. But also, and maybe more
> importantly, the kinds of reordering the hardware can do are not very
> different from those compilers do. Therefore, anyone playing fast and
> loose with TSO has probably already been bitten by the compiler.
> 
>> But unless all Windows software will run in such a mode there is a
>> need for MS to document what the memory consistency properties of
>> various APIs are (as POSIX does [1]).
> 
> Indeed. I would have thought it existed somewhere.
> 
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> 

From david.holmes at oracle.com  Tue Jul 14 02:42:40 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 Jul 2020 12:42:40 +1000
Subject: RFR [15] : 8249029: clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_defmeth tests
In-Reply-To: <9EC87F8D-662E-44B6-9EA1-F798A74D54B8@oracle.com>
References: <9EC87F8D-662E-44B6-9EA1-F798A74D54B8@oracle.com>
Message-ID: <f17d41d6-56b4-453c-72aa-5f4aaf5b571b@oracle.com>

Looks good!

Thanks,
David

On 9/07/2020 5:43 am, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
>> 750 lines changed: 0 ins; 376 del; 374 mod;
> 
> Hi all,
> 
> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_defmeth tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> effectively, the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/vm/runtime/defmeth  | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`
> 
> testing: :vmTestbase_vm_defmeth on linux-x64
> webrev: http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249029
> 
> Thanks,
> -- Igor
> 

From david.holmes at oracle.com  Tue Jul 14 02:44:24 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 Jul 2020 12:44:24 +1000
Subject: RFR [15] : 8249033 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_metaspace tests
In-Reply-To: <BAEA719A-AE50-4D41-9202-AB40EA2370A7@oracle.com>
References: <BAEA719A-AE50-4D41-9202-AB40EA2370A7@oracle.com>
Message-ID: <b38faba0-05a6-e60c-acd7-d7bb4869ab11@oracle.com>

Looks good!

Thanks,
David

On 14/07/2020 3:32 am, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00/
>> 47 lines changed: 0 ins; 32 del; 15 mod;
> 
> Hi all,
> 
> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_metaspace tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> as none of these tests need FileInstaller, the patch is as simple as `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/metaspace/ | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
> 
> testing: :vmTestbase_vm_metaspace on linux-x64
> webrev: http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249033
> 
> Thanks,
> -- Igor
> 

From david.holmes at oracle.com  Tue Jul 14 03:41:07 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 Jul 2020 13:41:07 +1000
Subject: RFR(S) [15] : 8249032 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_sysdict tests
In-Reply-To: <CF6D1A88-7BDA-42E2-A478-F321EBC3A176@oracle.com>
References: <CF6D1A88-7BDA-42E2-A478-F321EBC3A176@oracle.com>
Message-ID: <8d3f9d7e-ac4e-6fe7-dd68-fb7bf4bc4b6f@oracle.com>

Looks good!

Thanks,
David

On 14/07/2020 3:16 am, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
>> 20 lines changed: 0 ins; 20 del; 0 mod;
> 
> Hi all,
> 
> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_nsk_sysdict tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> none of sysdict tests need FileInstaller, so the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/sysdict | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
> 
> testing: :vmTestbase_nsk_sysdict on linux-x64
> webrev: http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249032
> 
> Thanks,
> -- Igor
> 

From thomas.stuefe at gmail.com  Tue Jul 14 06:29:22 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 14 Jul 2020 08:29:22 +0200
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
 <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <CAA-vtUw4ZwOHitBtsJLVMMma1D+TVV02xzoWmwd1M2yg-S91DQ@mail.gmail.com>

Hi Ludovic,

On Mon, Jul 13, 2020 at 11:55 PM Ludovic Henry <luhenry at microsoft.com>
wrote:

> Hi Thomas,
>
> Thank you for your feedback!
>
> Let me answer on some of the cases you mention.
>
> > A) this case exists today. An app getting signals via VEH would have to
> willingly ignore signals for us to get them. This does not change, your
> patch would mean this happens less often, so I do not see a backward
> compatibility problem here.
>
> Exactly.
>
> > B) this is a new case. We would have to ignore signals not meant for us.
> Technically by just ignoring them. Distinguishing this is a bit difficult
> though. Note the subtle difference to Unix: there we have signal chaining,
> so an application which is really really interested in signals for its own
> purposes uses it (e.g. by preloading libjsig) and then we know its handler
> and hand over the signal.
>
> Today, through SEH and RtlAddFunctionTable, we only get a very clear
> subset of exceptions: the one triggered in the code cache. If an exception
> is triggered from a PC outside of this code cache, SEH will not get the
> handler we registered with RtlAddFunctionTable, and we'll simply _not_ call
> into HandleExceptionFromCodeCache (the handler we register with
> RtlAddFunctionTable). That can be trivially reproduced in the VEH by simply
> checking that the PC is between CodeCache::low_bound() and
> CodeCache::high_bound().
>
> This is what you are mentioning with "we only can distinguish our crashes
> from their crashes via crash pc, rejecting any crash not in our code
> (dynamic or static). Well, arguably this would be just how it is today with
> our code scoped via SEH".
>
>
Not sure we understand each other.

Today we get exceptions from two sides:
- via SEH, __try/__except, in threads attached to the VM. There the pc is
either us or third party code below us which did not bother setting up SEH
for themselves
- via RtlAddFunctionTable for the code cache, where we specify code cache
boundaries.

With VEH we would get all exceptions in the process. Including exceptions
from threads which have never seen the libjvm, or from caller code if the
hotspot is embedded somewhere.

Under Unix we handle all those crashes by writing hs-err crashlogs, even if
those crashes are not our responsibility. Unless user set up signal
chaining, where we hand over any crash signal to the chained handler (which
for the purpose of clear error reporting is also not perfect).

With VEH I get all exceptions, but have to decide on my own if an exception
should result in a hs-err file or be handed to the next exception handler. The
only way I can see is by examining the pc - iterate through all our
binaries and compare the pc with their text segments, and also check the
code cache.
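
To make that pc check concrete, here is a rough toy model in Java (a real handler would have to be C++ inside the VM). The class name `PcOwnership`, its methods and all addresses are invented for the example; a real implementation would take the ranges from the text segments of our loaded binaries and from the code cache bounds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: decides whether a faulting pc is "ours" by comparing it
// against half-open address ranges [low, high). Names and addresses are
// hypothetical and not taken from HotSpot.
final class PcOwnership {

    private static final class Range {
        final String name;
        final long low;   // inclusive
        final long high;  // exclusive

        Range(String name, long low, long high) {
            this.name = name;
            this.low = low;
            this.high = high;
        }

        boolean contains(long pc) {
            // Addresses are unsigned, so compare them as unsigned values.
            return Long.compareUnsigned(pc, low) >= 0
                && Long.compareUnsigned(pc, high) < 0;
        }
    }

    private final List<Range> ranges = new ArrayList<>();

    void addRange(String name, long low, long high) {
        if (Long.compareUnsigned(low, high) >= 0) {
            throw new IllegalArgumentException("empty range: " + name);
        }
        ranges.add(new Range(name, low, high));
    }

    // Returns the name of the range containing pc, or null for a foreign pc.
    String ownerOf(long pc) {
        for (Range range : ranges) {
            if (range.contains(pc)) {
                return range.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PcOwnership ours = new PcOwnership();
        ours.addRange("libjvm text segment", 0x7ff810000000L, 0x7ff810a00000L);
        ours.addRange("code cache", 0x020000000000L, 0x02000f000000L);

        long[] faultingPcs = { 0x7ff810001234L, 0x020000abcdefL, 0x7ff900000000L };
        for (long pc : faultingPcs) {
            String owner = ours.ownerOf(pc);
            System.out.printf("pc=0x%x -> %s%n", pc,
                owner != null ? "ours (" + owner + "), write hs-err"
                              : "foreign, hand to next handler");
        }
    }
}
```

A linear scan is enough for a sketch; with many modules a sorted structure would be the obvious refinement.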

I may miss something here.


> > With the added safety net of the unhandled exception filter (what
> happens if multiple parties call this?).
>
> Here, Unhandled Exception Handling predates VEH and it doesn't integrate
> chaining. The API is similar to signals on Linux/Unix: the last one to
> register has to make sure to save the previous one and to call/chain it
> accordingly.
>
> > My only very small personal gripe would be that I always liked how I can
> quickly use SEH to check if a pointer is valid without disturbing anyone.
> But within the hotspot at least I can just as well use SafeFetch.
>
> Nothing from the Win32 API stops you from mix-and-matching VEH and SEH. If
> you want to do a `__try { val = *ptr; } __except
> (EXCEPTION_EXECUTE_HANDLER) { success = false; }` in some C++ code (in vm
> or native), nothing stops you from doing so. My understanding of the
> exception handler logic in the OpenJDK on Windows is that the accepted
> EXCEPTION_ACCESS_VIOLATION in java, vm, or native code is limited to a
> clear subset, and anything outside of these known cases is quickly treated
> as "an exception we cannot handle". SafeFetch is such a case where the
> instructions potentially triggering the EXCEPTION_ACCESS_VIOLATION are
> matched against by the exception handler.
>
>
Well, in your example, VEH would have precedence and get the exception
first; in our handler we recognize the exception as not allowed, hence a
crash, and write a hs-err file. My success=false; handler would never
execute.
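
The ordering concern can be sketched with a small Java model. It assumes only the dispatch order quoted further down in this thread (VEH first, then SEH, then the unhandled exception filter) and reduces every handler to "claims the fault or passes it on". `DispatchOrderModel`, `Stage` and the fault strings are invented for the illustration; nothing here calls or reproduces a Win32 API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model of first-claim-wins dispatch. All names are hypothetical.
final class DispatchOrderModel {

    static final class Stage {
        final String name;
        final Predicate<String> claims; // true means this stage ends the dispatch

        Stage(String name, Predicate<String> claims) {
            this.name = name;
            this.claims = claims;
        }
    }

    // Offers the fault to each stage in order and returns the first claimant.
    static String dispatch(List<Stage> stages, String fault) {
        for (Stage stage : stages) {
            if (stage.claims.test(fault)) {
                return stage.name;
            }
        }
        return "nobody";
    }

    public static void main(String[] args) {
        String probeFault = "access violation in pointer probe";

        Stage sehProbe = new Stage("SEH __except of the probe", probeFault::equals);
        Stage lastResort = new Stage("unhandled exception filter", fault -> true);
        // Variant 1: the VEH handler treats every unknown fault as a crash.
        Stage eagerVeh = new Stage("VEH handler reporting every unknown fault", fault -> true);
        // Variant 2: the VEH handler claims only faults it recognises.
        Stage selectiveVeh = new Stage("VEH handler for known faults only", "safepoint poll"::equals);

        // The eager VEH handler wins, so the probe's __except never runs.
        System.out.println(dispatch(Arrays.asList(eagerVeh, sehProbe, lastResort), probeFault));
        // The selective VEH handler passes the fault on, so the probe sees it.
        System.out.println(dispatch(Arrays.asList(selectiveVeh, sehProbe, lastResort), probeFault));
        // A fault nobody recognises falls through to the last-resort filter.
        System.out.println(dispatch(Arrays.asList(selectiveVeh, sehProbe, lastResort), "fault in foreign code"));
    }
}
```

In this model the probe only keeps working if the VEH handler passes unknown faults on and leaves crash reporting to the last stage.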

But I admit this is really a minor point. I also dimly remember seeing some
win32 API to check pointers for readability, so maybe using SEH for these
things is not necessary anyway.

Thanks, Thomas

--
> Ludovic
>
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Saturday, July 11, 2020 23:08
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi Ludovic,
>
> sorry for the delay, and thanks for the extensive answer. Please find
> remarks inline.
>
> On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com>
> wrote:
> Hi Thomas,
>
> It seems that the problem you're describing stems from the current
> exception handler treating two cases: 1. any exception knowingly triggered
> by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy
> stubs, stackoverflow in Java code), and 2. exceptional cases leading to
> crashes (ex: uncaught C++ exception, an access violation in VM or
> native/external code, etc.). There is the same problem on Unix because
> there is only one system (signal handling) for both cases. Fortunately,
> Windows proposes different systems, each with its own advantages.
>
> The order in which Windows invokes each of these systems is the following:
>  1. Vectored Exception Handler registered with
> `AddVectoredExceptionHandler`
>  2. Structured Exception Handler
>  3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
>  4. Unhandled Exception Handler
>
> Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler
> registered with `RtlAddFunctionTable`. This handler does both the
> Java-triggered exceptions and any other exceptions.
>
> Now, from the point of view of an external library or application
> embedding the JVM inside their own process, they still have all the above
> options to register an exception handler, irrespective of how Hotspot does
> it. This creates the following cases:
>  - If the application uses VEH: they will (with Hotspot using SEH) be
> called _before_ Hotspot's exception handler and will then have to be aware
> that they may get exceptions unrelated to them and will have to ignore them
> accordingly
>  - If the application uses SEH: they will only get exceptions related to
> their code area
>
> If Hotspot is to use VEH, an exception would play out as follows:
>  - If the application uses VEH and their registered handler executes
> _before_ Hotspot's one: same as above
>  - If the application uses VEH and their registered handler executes
> _after_ Hotspot's one: Hotspot has to make sure that the exception was
> triggered by Hotspot and ignore them otherwise (a range check on the PC can
> be used here to emulate how it's done with RtlAddFunctionTable)
>  - If the application uses SEH: the same as the case where the
> application's handler executes _after_ Hotspot's one
>
> This all assumes that Hotspot's VEH handler doesn't trigger a crash report
> (VMError::report_and_die) on any exception it doesn't know how to handle.
> The simplest way to do that is simply _not_ to do it in Hotspot's VEH
> handler, and to do it by registering a Win32 Unhandled Exception Handler
> (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when
> no other exception handler treated the exception (by returning
> EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it
> means the application is "toast" and not in a runnable state anymore, which
> fits nicely with the purpose of the Hotspot crash report.
>
>
> Okay, if I get this correctly:
>
> Today:
>   App uses VEH - they execute before us and have to handle this correctly
> (->A)
>   App uses SEH - no interaction
>
> With proposed switch:
>   App uses VEH - they may or may not execute before us. If they come
> before us: (->A). If they come after us -> (B)
>   App uses SEH -> (B)
>
> A) this case exists today. An app getting signals via VEH would have to
> willingly ignore signals for us to get them. This does not change, your
> patch would mean this happens less often, so I do not see a backward
> compatibility problem here.
>
> B) this is a new case. We would have to ignore signals not meant for us.
> Technically by just ignoring them. Distinguishing this is a bit difficult
> though. Note the subtle difference to Unix: there we have signal chaining,
> so an application which is really really interested in signals for its own
> purposes uses it (e.g. by preloading libjsig) and then we know its handler
> and hand over the signal.
>
> On windows we do not know this (?), we only can distinguish our crashes
> from their crashes via crash pc, rejecting any crash not in our code
> (dynamic or static). Well, arguably this would be just how it is today with
> our code scoped via SEH. With the added safety net of the unhandled
> exception filter (what happens if multiple parties call this?).
>
> Okay this seems safe enough to try it at least.
>
> My only very small personal gripe would be that I always liked how I can
> quickly use SEH to check if a pointer is valid without disturbing anyone.
> But within the hotspot at least I can just as well use SafeFetch.
>
> Thank you,
>
> Thomas
>
> I hope this sheds some light on possible solutions ahead of us.
>
> Thank you,
>
> --
> Ludovic
>
> [1]
> https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Sunday, June 21, 2020 05:55
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi,
>
> We at SAP had used VEH in our own Windows Itanium port and I dimly
> remember it being a source of problems. That was many years ago and I
> realize that it is not worth much, but it makes me a bit apprehensive of this
> change.
>
> The main problem I see is that this will be an observable change in
> behavior.
>
> We currently use SEH, so our error handler is guaranteed to be invoked
> only for exceptions from within our own code. With VEH we now follow the
> Unix way of things and suddenly our error handler becomes a global resource.
>
> We will suddenly be invoked for crashes outside the VM, e.g. in foreign
> launcher code atop of us or in non-java side threads, which will generate
> whole new classes of hs-err files for crashes the VM is not responsible
> for. Which are then perceived as VM crashes and sent to us vendors instead
> of going to the right people. This is the way it works on Unix today, and
> it is a constant annoyance and increases our support workload.
>
> We also may introduce new problems since suddenly we interfere with
> application exception handling. At the very least, we have to think up a
> scheme for signal chaining (both ways: VM->foreign code and foreign
> code->VM). For the first, we probably need some form of libjsig preloading,
> or some other way to divert signal handler installation. That would also need
> cooperation from the application programmers and/or operators.
>
> Matters are even more complicated, since foreign code may use SEH instead
> of VEH, so what happens if a JNI library below me wants to use SEH, does
> that still work?
>
> I feel this should not be rushed. Even if considered "brittle", SEH has served
> us well, I do not recall many problems in the past aside from having to add
> the occasional __try/__except. Are there actual bugs we have to solve?
>
> Lastly, personally I always found SEH quite a neat concept, and one of the
> few places where Windows was superior to Unix :)
>
> Thanks, Thomas
>
>
> On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com>
> wrote:
> Hello,
>
> First, some context and definitions:
> - when talking about exceptions here, I'm talking about Win32 exceptions,
> which are equivalent to signals on Linux and other Unix; I am _not_ talking
> about Java exceptions.
> - an explanation of an _exception filter_ can be found at
> https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019
> There is only a limited concept of that in Java with type-based exception
> filter (ex: `try { ... } catch (IOException ioe) { ... } catch (Throwable
> t) { ... }`).
> - in Win32, there exist two exception handling mechanisms:
>   - Structured Exception Handling: the historical one, based on `__try {}
> __except (...) {}`
>   - Vectored Exception Handling: introduced in Windows XP / Windows Server
> 2003, much more similar to signals on Linux
>
> These exception handling mechanisms are used to catch any exceptions like
> Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These
> exceptions are equivalent to signals on Linux and are thus core to many
> mechanisms in the OpenJDK.
>
> Today, the OpenJDK uses Structured Exception Handling to catch such
> exceptions, creating several requirements. First, all code that might
> trigger an exception on purpose (like an Access Violation / SIGSEGV in the
> arraycopy stub), needs to be wrapped up in a __try / __except. Because it's
> not feasible to wrap every single instance of such code, these __try /
> __except are put at the top-most function of any thread started by
> the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable`
> is used to simulate the use of __try / __except for a specific code area.
> This function needs platform specific code with the generation of  a
> trampoline that calls the exception filter declared in the runtime. It's
> also meant to be used as a one to one mapping with try / catch in user
> code, and not as a "catch all the exceptions in this code area". Third,
> Structured Exception Handling expects to be able to unwind the stack.
> However, because Hotspot doesn't guarantee the usage of the
> platform-specific ABI internally, the platform-specific unwinder might
> break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies
> on the assumption that Structured Exception Handling never tries to unwind
> the stack (which it would fail to do because of the different ABI) before
> calling the registered exception filter.
>
> Discussing that with Windows Kernel maintainers, this approach is highly
> discouraged, considered brittle, and the better solution is Vectored
> Exception Handling. Vectored Exception Handling is conceptually much more
> similar to signal / sigaction on Linux and other Unix systems. It will
> catch all exceptions happening across the process, and no __try / __except
> will be required. It also removes the requirement to call
> `RtlAddFunctionTable`.  The exception filter then behaves like a signal
> handler with the possibility to modify the registers at will, modifying the
> PC to step over an instruction after an expected Access Violation for
> example. Vectored Exception Handling is also already used for AOT code.
>
> The changes can be found at
> http://cr.openjdk.java.net/~burban/ludovic_vecexc/
> As I am not an author, I have not created a corresponding bug in JBS.
>
> Thank you, and looking forward to your feedback!
>
> --
> Ludovic
>
>
>

From david.holmes at oracle.com  Tue Jul 14 11:55:16 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 14 Jul 2020 21:55:16 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
Message-ID: <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>

Correction ...

On 14/07/2020 12:11 pm, David Holmes wrote:
> Hi Goetz,
> 
> Okay ... if I understand your position correctly you are looking at this 
> as if the extended message is created at the time the NPE is thrown, and 
> it is an implementation detail that we actually determine it lazily. If 
> it were eagerly determined then neither fillInStackTrace() nor 
> setStackTrace() would make any difference to the message - just as with 
> any other exception message.
> 
> However, the lazy determination of the message causes a problem with 
> fillInStackTrace() because that call will destroy the original backtrace 
> needed to produce the original message, and create an incorrect message. 
> setStackTrace() does not have a similar problem because, simply by the 
> way the current implementation works it doesn't touch the original 
> backtrace.
> 
> So you are proposing to only fix the bug that is evident in relation to 
> fillInStackTrace() by no longer evaluating the extended message if 
> fillInStackTrace() is called after the NPE was constructed.
> 
> But in doing so you break the illusion that the extended message acts 
> as-if determined at construction time, because you now effectively clear 
> it when fillInStackTrace is called.
> 
> My position was that if fillInStackTrace can be seen to clear it, then 
> setStackTrace (which is logically somewhat equivalent) should also be 
> seen to clear it.
> 
> Alternatively, add a new field to NPE to cache the extended error 
> message, and explicitly evaluate the message if fillInStackTrace() is 
> called. That will continue the illusion that the extended message was 
> actually set at construction time. No changes needed to setStackTrace() 
> as we can still lazily compute the extended message.
> 
> Something like:
> 
> private String extendedMessage;
> 
> public synchronized Throwable fillInStackTrace() {
>     if (extendedMessage == null) {
>         extendedMessage = getExtendedNPEMessage();
>     }
>     return super.fillInStackTrace();
> }

Coleen pointed out to me that we can't do it like this because we need 
the initial fillInStackTrace to be fast and we want the extended message 
computed lazily. So it will still need a counter so we only do this on 
the second call.


  private String extendedMessage;
  private int fillInCount;

  public synchronized Throwable fillInStackTrace() {
      // The first call (count 0) comes from the constructor and stays fast;
      // the message is captured on the second call.
      if (extendedMessage == null && (fillInCount++ == 1)) {
          extendedMessage = getExtendedNPEMessage();
      }
      return super.fillInStackTrace();
  }

or something to that effect.
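
To check that the counter really skips the constructor's call, here is a self-contained sketch of the same idea. It uses a stand-in class, because getExtendedNPEMessage() is private to NullPointerException; `LazyMessageException`, `computeExtendedMessage()`, `raise()` and `refill()` are made up for the example, and the stand-in message is simply derived from the top frame of the current stack trace.

```java
// Stand-in for NullPointerException; all names here are hypothetical.
class LazyMessageException extends RuntimeException {
    // Deliberately no field initializers: Throwable's constructor calls
    // fillInStackTrace() before subclass initializers would run, and an
    // initializer would then overwrite what that first call stored.
    private String extendedMessage;
    private int fillInCount;

    @Override
    public synchronized Throwable fillInStackTrace() {
        // The constructor's call sees count 0 and stays cheap. The next call
        // is about to replace the original trace, so capture the message first.
        if (extendedMessage == null && fillInCount++ == 1) {
            extendedMessage = computeExtendedMessage();
        }
        return super.fillInStackTrace();
    }

    @Override
    public synchronized String getMessage() {
        String message = super.getMessage();
        if (message == null) {
            if (extendedMessage == null) {
                extendedMessage = computeExtendedMessage();
            }
            message = extendedMessage;
        }
        return message;
    }

    // Hypothetical replacement for getExtendedNPEMessage(): describes the
    // frame the exception currently points at.
    private String computeExtendedMessage() {
        StackTraceElement[] trace = getStackTrace();
        return trace.length == 0
            ? "raised at unknown location"
            : "raised in " + trace[0].getMethodName();
    }

    static LazyMessageException raise() {
        return new LazyMessageException();
    }

    static void refill(LazyMessageException e) {
        e.fillInStackTrace();
    }

    public static void main(String[] args) {
        LazyMessageException e = raise();
        refill(e); // second fillInStackTrace() call, made from another method
        // The message was captured before the trace was replaced, so it still
        // describes the original raise site while the trace itself is new.
        System.out.println("message:   " + e.getMessage());
        System.out.println("top frame: " + e.getStackTrace()[0].getMethodName());
    }
}
```

Calls after the second one no longer touch the counter because extendedMessage is already set, which is what keeps the message stable.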

David
-----

> public String getMessage() {
>     String message = super.getMessage();
>     synchronized(this) {
>         if (message == null) {
>             // This NPE should have an extended message.
>             if (extendedMessage == null) {
>                 extendedMessage = getExtendedNPEMessage();
>             }
>             message = extendedMessage;
>         }
>     }
>     return message;
> }
> 
> Cheers,
> David
> 
> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>> Hi David,
>>
>>> Your extended message is only computed when there is no original 
>>> message.
>> Hmm. I would say the extended message is only computed when
>> the NPE was raised by the runtime. It happens to never have a
>> message so far in these cases.
>> But these are two views of the same thing.
>>
>>> You're concerned about this scenario:
>>>
>>> catch (NullPointerException npe) {
>>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>>     npe.setStackTrace(...);
>>>     String msg2 = npe.getMessage(); // gets null
>>> }
>>>
>>> While I find it hard to imagine anyone doing this
>> Well, all the scenarios are quite artificial:
>>   - why would you call fillInStackTrace on an exception thrown by the VM?
>>   - why would you call setStackTrace at all?
>>> you can easily have
>>> specified that the extended message is only available with the original
>>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>>> setStackTrace, then the message reverts to being empty.
>> The message is not meant to be a special thing that behaves differently
>> from other messages, e.g. sometimes being available and sometimes not.
>> It ended up being different through requirements during the
>> review.
>>
>>> To me that makes
>>> far more sense than having msg2 continue to report the extended info for
>>> the original stacktrace when it now has a new stacktrace.
>>>
>>> I'm really not seeing why calling fillInStackTrace() a second time
>>> should be treated any differently to calling setStackTrace(). They
>>> should be handled consistently IMO.
>> But then you treat setStackTrace() differently from setStackTrace()
>> with other exceptions.
>> The reason to treat fillInStackTrace differently is that we lost 
>> information
>> needed to compute it. This is not the case with setStackTrace().
>>
>> A different solution, the one I would have proposed if I had not
>> considered previous comments from reviews, would be to just
>> compute the message in the runtime in the call of fillInStackTrace
>> before the old stack trace is lost and assign it to the message field.
>> This way it would behave similar to all other exceptions. The message
>> would just be there ... just that it's computed lazily.
>> The cost of the algorithm wouldn't harm that much as other costly
>> algorithms (walking the stack) are performed at this point, too.
>>
>>> We are not talking about all exceptions only about your NPE extended
>>> error message.
>> Hmm, the inconsistency caused by the code you posted above
>> holds for all exceptions. If you fiddle with the stack trace,
>> the message might become pointless. Wrt. setStackTrace
>> they all behave the same.
>> Wrt. fillInStackTrace the message will be wrong. Only this
>> needs to be fixed.
>>
>> Best regards,
>>   Goetz.
>>
>>
>>>
>>> David
>>> -----
>>>
>>>> I implemented an example where wrong stack traces are
>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>> See also the generated output added to a comment in the patch.
>>>> If the NPE message text was missing in the second printout, I think
>>>> this really would be unexpected.
>>>> Please note that the correct message is printed after messing
>>>> with the stack trace, it's the stack trace that is wrong.
>>>> (Not as with the problem I am fixing here where a wrong
>>>> message is printed.)
>>>>
>>>> Best regards,
>>>>   Goetz.
>>>>
>>>>
>>>>
>>>>>
>>>>>> I guess the normal usecase of setStackTrace is the other way around:
>>>>>> Change the message and throw a new exception with the existing
>>>>>> stack trace:
>>>>>>
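>>>>>> try {
>>>>>>     int v = a.x;
>>>>>> } catch (NullPointerException e) {
>>>>>>     // setStackTrace() returns void, so it cannot be chained onto the constructor.
>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>>>>     npe.setStackTrace(e.getStackTrace());
>>>>>>     throw npe;
>>>>>> }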
>>>>>>
>>>>>> And not taking an arbitrary stack trace and put it into an exception
>>>>>> with existing message.
>>>>>
>>>>> Interesting usage.
>>>>>
>>>>> Cheers,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Best regards,
>>>>>>   Goetz.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>> after calling fillInStackTrace
>>>>>>>
>>>>>>> Hi Goetz,
>>>>>>>
>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> True. To ensure you process the original backtrace only you need to add
>>>>>>>>> synchronization in getMessage():
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>
>>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>> without.
>>>>>>>
>>>>>>> No "volatile" needed, or wanted, when all access is within synchronized
>>>>>>> regions.
>>>>>>>
>>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>> information about it just seems highly unrealistic.
>>>>>>>> Yes, contention here is quite unlikely, so it should not harm performance.
>>>>>>>
>>>>>>> Contention was not my concern at all. :)
>>>>>>>
>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>> The test shows that after setStackTrace still the correct message
>>>>>>>> is computed. This is because the algorithm uses 
>>>>>>>> Throwable::backtrace
>>>>>>>> and not Throwable::stacktrace.? Throwable::backtrace is not
>>>>>>>> affected by setStackTrace.
>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>> points to.
>>>>>>>
>>>>>>> But you can't adapt the message text - there is no setMessage! If 
>>>>>>> the
>>>>>>> message is NULL and you call setStackTrace() then getMessage(), it
>>> makes
>>>>>>> no sense to return the extended error message that was associated
>>> with
>>>>>>> the original stack/backtrace.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>   Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>
>>>>>>>>> Hi Goetz,
>>>>>>>>>
>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi Remi,
>>>>>>>>>>
>>>>>>>>>> But how does volatile help?
>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets always the
>>>>>>>>>> right value.
>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>
>>>>>>>>> True. To ensure you process the original backtrace only you need to add
>>>>>>>>> synchronization in getMessage():
>>>>>>>>>
>>>>>>>>>            public String getMessage() {
>>>>>>>>>                String message = super.getMessage();
>>>>>>>>>                // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>                // will compute a wrong message.
>>>>>>>>> +         synchronized(this) {
>>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>>> !             }
>>>>>>>>> +         }
>>>>>>>>>                return message;
>>>>>>>>>            }
>>>>>>>>>
>>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>> information about it just seems highly unrealistic. But the above fixes
>>>>>>>>> it simply. Though after looking at comments in the test I would also
>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>
>>>>>>>>>             synchronized (this) {
>>>>>>>>>                  if (this.stackTrace == null && // Immutable stack
>>>>>>>>>                      backtrace == null) // Test for out of protocol state
>>>>>>>>>                      return;
>>>>>>>>> +           numStackTracesFilledIn++;
>>>>>>>>>                  this.stackTrace = defensiveCopy;
>>>>>>>>>              }
>>>>>>>>>          }
>>>>>>>>>
>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>
>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>
>>>>>>>>> I much prefer the latest version that recognises that only the original
>>>>>>>>> stack can be processed.
>>>>>>>>>
>>>>>>>>> In the test:
>>>>>>>>>
>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>
>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Its drawback is only that for this code:
>>>>>>>>>>        ex = null;
>>>>>>>>>>        ex.fillInStackTrace()
>>>>>>>>>> no message is created.
>>>>>>>>>>
>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is
>>>>>>>>>>> filled you don't compute any helpful message anyway.
>>>>>>>>>> The internal structure is no more deleted when the stack trace
>>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>   Goetz.
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis
>>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>; David Holmes
>>>>>>>>>>> <david.holmes at oracle.com>
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>> yes,
>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is initialized, i
>>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile is the best way to
>>>>>>>>>>> tackle that.
>>>>>>>>>>>
>>>>>>>>>>> Rémi
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis"
>>>>>>>>>>>> <christoph.dreis at freenet.de>
>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>
>>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>
>>>>>>>>>>>> -Alan

From goetz.lindenmaier at sap.com  Tue Jul 14 13:48:26 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 14 Jul 2020 13:48:26 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
Message-ID: <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>

Hi,

Yes, Coleen, you are right. We must preserve the lazy
computation, and also reduce overhead on discarded
exceptions.

And yes, we can do it with a counter:
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
but I would prefer placeholder strings: 
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
This way we need only one new field.

(I need two placeholders, because the getExtendedNPEMessage0()
sometimes returns null. If I write null into the extendedMessage field,
fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
time.)
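
For illustration, a minimal sketch of how two placeholder strings could drive
the lazy computation. The class and member names are hypothetical and the
control flow is an assumption; this is not the code in the webrev. The
constructor stands in for Throwable's constructor calling fillInStackTrace(),
and compute() stands in for getExtendedNPEMessage0().

```java
// Hypothetical stand-in for the NPE logic discussed above; not JDK source.
class PlaceholderSketch {
    // Distinct String objects, compared by identity, so they can never be
    // confused with a real message (not even one with the same characters).
    private static final String MUST_COMPUTE = new String("must compute");
    private static final String NOT_AVAILABLE = new String("not available");

    private String extendedMessage;  // the single new field
    private final String result;     // what the costly computation yields; may be null
    private int computations;        // demo only: counts costly computations

    PlaceholderSketch(String result) {
        this.result = result;
        fillInStackTrace();          // first call: only marks, stays cheap
    }

    synchronized void fillInStackTrace() {
        if (extendedMessage == null) {
            extendedMessage = MUST_COMPUTE;
        } else if (extendedMessage == MUST_COMPUTE) {
            // The original backtrace is about to be replaced: compute now.
            extendedMessage = compute();
        }
        // (a real implementation would capture the new backtrace here)
    }

    synchronized String getMessage() {
        if (extendedMessage == MUST_COMPUTE) {
            extendedMessage = compute();
        }
        return extendedMessage == NOT_AVAILABLE ? null : extendedMessage;
    }

    private String compute() {
        computations++;
        // Never store null: fillInStackTrace() would treat it as "first call".
        return result == null ? NOT_AVAILABLE : result;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        PlaceholderSketch e = new PlaceholderSketch(null);
        e.fillInStackTrace();
        e.fillInStackTrace();
        System.out.println(e.getMessage() + " after " + e.computations() + " computation(s)");
    }
}
```

The second placeholder records "computed, but nothing to report", so a later
fillInStackTrace() neither recomputes the message nor resets the field.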

With webrev 07 the overhead on discarded exceptions is basically the
same as with webrev 05: one additional field, one assignment in fillInStackTrace().

What do you think?

Best regards,
  Goetz.


> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Tuesday, July 14, 2020 1:55 PM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> <hotspot-runtime-dev at openjdk.java.net>
> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> after calling fillInStackTrace
> 
> Correction ...
> 
> On 14/07/2020 12:11 pm, David Holmes wrote:
> > Hi Goetz,
> >
> > Okay ... if I understand your position correctly you are looking at this
> > as if the extended message is created at the time the NPE is thrown, and
> > it is an implementation detail that we actually determine it lazily. If
> > it were eagerly determined then neither fillInStackTrace() nor
> > setStackTrace() would make any difference to the message - just as with
> > any other exception message.
> >
> > However, the lazy determination of the message causes a problem with
> > fillInStackTrace() because that call will destroy the original backtrace
> > needed to produce the original message, and create an incorrect message.
> > setStackTrace() does not have a similar problem because, simply by the
> > way the current implementation works it doesn't touch the original
> > backtrace.
> >
> > So you are proposing to only fix the bug that is evident in relation to
> > fillInStackTrace() by no longer evaluating the extended message if
> > fillInStackTrace() is called after the NPE was constructed.
> >
> > But in doing so you break the illusion that the extended message acts
> > as-if determined at construction time, because you now effectively clear
> > it when fillInStackTrace is called.
> >
> > My position was that if fillInStackTrace can be seen to clear it, then
> > setStackTrace (which is logically somewhat equivalent) should also be
> > seen to clear it.
> >
> > Alternatively, add a new field to NPE to cache the extended error
> > message, and explicitly evaluate the message if fillInStackTrace() is
> > called. That will continue the illusion that the extended message was
> > actually set at construction time. No changes needed to setStackTrace()
> > as we can still lazily compute the extended message.
> >
> > Something like:
> >
> > private String extendedMessage;
> >
> > public synchronized Throwable fillInStackTrace() {
> >      if (extendedMessage == null) {
> >          extendedMessage = getExtendedNPEMessage();
> >      }
> >      return super.fillInStackTrace();
> > }
> 
> Coleen pointed out to me that we can't do it like this because we need
> the initial fillInStackTrace to be fast and we want the extended message
> computed lazily. So it will still need a counter so we only do this on
> the second call.
> 
> 
>   private String extendedMessage;
>   private int fillInCount;
> 
>   public synchronized Throwable fillInStackTrace() {
>        if (extendedMessage == null && (fillInCount++ == 1)) {
>            extendedMessage = getExtendedNPEMessage();
>        }
>        return super.fillInStackTrace();
>   }
> 
> or something to that effect.
> 
> David
> -----
> 
> > public String getMessage() {
> >      String message = super.getMessage();
> >      synchronized(this) {
> >          if (message == null) {
> >              // This NPE should have an extended message.
> >              if (extendedMessage == null) {
> >                  extendedMessage = getExtendedNPEMessage();
> >              }
> >              message = extendedMessage;
> >          }
> >      }
> >      return message;
> > }
> >
> > Cheers,
> > David
> >
> > On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
> >> Hi David,
> >>
> >>> Your extended message is only computed when there is no original
> >>> message.
> >> Hmm. I would say the extended message is only computed when
> >> the NPE was raised by the runtime. It happens to never have a
> >> message so far in these cases.
> >> But this is two views of the same thing.
> >>
> >>> You're concerned about this scenario:
> >>>
> >>> catch (NullPointerException npe) {
> >>>     String msg1 = npe.getMessage(); // gets extended NPE message
> >>>     npe.setStackTrace(...);
> >>>     String msg2 = npe.getMessage(); // gets null
> >>> }
> >>>
> >>> While I find it hard to imagine anyone doing this
> >> Well, all the scenarios are quite artificial:
> >>   - why would you call fillInStackTrace on an exception thrown by the VM?
> >>   - why would you call setStackTrace at all?
> >>> you can easily have
> >>> specified that the extended message is only available with the original
> >>> stacktrace, hence after a second call to fillInStackTrace, or a call to
> >>> setStackTrace, then the message reverts to being empty.
> >> The message is not meant to be a special thing that behaves different
> >> from other messages. Like sometimes being available, sometimes not.
> >> It ended up being different through requirements during the
> >> review.
> >>
> >>> To me that makes
> >>> far more sense than having msg2 continue to report the extended info for
> >>> the original stacktrace when it now has a new stacktrace.
> >>>
> >>> I'm really not seeing why calling fillInstackTrace() a second time
> >>> should be treated any differently to calling setStackTrace(). They
> >>> should be handled consistently IMO.
> >> But then you treat setStackTrace() differently from setStackTrace()
> >> with other exceptions.
> >> The reason to treat fillInStackTrace differently is that we lost
> >> information needed to compute it. This is not the case with setStackTrace().
> >>
> >> A different solution, the one I would have proposed if I had not
> >> considered previous comments from reviews, would be to just
> >> compute the message in the runtime in the call of fillInStackTrace
> >> before the old stack trace is lost and assign it to the message field.
> >> This way it would behave similar to all other exceptions. The message
> >> would just be there ... just that it's computed lazily.
> >> The cost of the algorithm wouldn't harm that much as other costly
> >> algorithms (walking the stack) are performed at this point, too.
> >>
> >>> We are not talking about all exceptions only about your NPE extended
> >>> error message.
> >> Hmm, the inconsistency caused by the code you posted above
> >> holds for all exceptions. If you fiddle with the stack trace,
> >> the message might become pointless. Wrt. setStackTrace
> >> they all behave the same.
> >> Wrt. fillInStackTrace the message will be wrong. Only this
> >> needs to be fixed.
> >>
> >> Best regards,
> >>   Goetz.
> >>
> >>

From goetz.lindenmaier at sap.com  Tue Jul 14 14:53:40 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 14 Jul 2020 14:53:40 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <33FC716B-2934-4DDC-A968-842A9E69F40F@freenet.de>
 <AM4PR0202MB2964C64C51B620EC0931BCB6EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <595ffefe-61d3-5166-bcd7-cf2223047112@oracle.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
Message-ID: <VI1PR0202MB2975AAFC14DE44AA8CC0239FEC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>

Hi David, 

Sorry, so far I only responded to the "correction" mail.

Yes, you describe it exactly as I see it.
To the user it should not be visible that the message is computed
lazily. It should just feel like any other message. The lazy computation
is only meant to improve performance of discarded exceptions.
This is just the same as with the backtrace data structure.
No one ever sees this on the user side; the backtrace -> stacktrace
conversion is done internally on demand.
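
As a rough sketch of that on-demand pattern (hypothetical names; the int[]
merely stands in for the VM-internal backtrace and is not how HotSpot stores
it): the cheap internal form is captured eagerly, and the user-visible form is
built once, on first request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of on-demand conversion; not JDK source.
class LazyTraceSketch {
    private final int[] rawFrames;     // cheap internal form, captured eagerly
    private List<String> publicTrace;  // user-visible form, built on first request
    private int conversions;           // demo only: counts conversions

    LazyTraceSketch(int... rawFrames) {
        this.rawFrames = rawFrames.clone();
    }

    synchronized List<String> getTrace() {
        if (publicTrace == null) {
            conversions++;
            List<String> frames = new ArrayList<>();
            for (int id : rawFrames) {
                frames.add("frame#" + id);
            }
            publicTrace = Collections.unmodifiableList(frames);
        }
        return publicTrace;
    }

    synchronized int conversions() {
        return conversions;
    }

    public static void main(String[] args) {
        LazyTraceSketch t = new LazyTraceSketch(3, 1, 4);
        System.out.println(t.conversions()); // nothing converted yet
        System.out.println(t.getTrace());    // conversion happens here
        System.out.println(t.conversions());
    }
}
```

A caller that never asks for the trace pays only for the raw capture, which is
the point for discarded exceptions.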

The actual NPE message implementation does not always honor this
principle, but that is how I would have liked it.
If we go with the latest webrevs, it's according to this principle.

Best regards,
  Goetz.

> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Tuesday, July 14, 2020 4:11 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> <hotspot-runtime-dev at openjdk.java.net>
> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> after calling fillInStackTrace
> 
> Hi Goetz,
> 
> Okay ... if I understand your position correctly you are looking at this
> as if the extended message is created at the time the NPE is thrown, and
> it is an implementation detail that we actually determine it lazily. If
> it were eagerly determined then neither fillInStackTrace() nor
> setStackTrace() would make any difference to the message - just as with
> any other exception message.
> 
> However, the lazy determination of the message causes a problem with
> fillInStackTrace() because that call will destroy the original backtrace
> needed to produce the original message, and create an incorrect message.
> setStackTrace() does not have a similar problem because, simply by the
> way the current implementation works it doesn't touch the original
> backtrace.
> 
> So you are proposing to only fix the bug that is evident in relation to
> fillInStackTrace() by no longer evaluating the extended message if
> fillInStackTrace() is called after the NPE was constructed.
> 
> But in doing so you break the illusion that the extended message acts
> as-if determined at construction time, because you now effectively clear
> it when fillInStackTrace is called.
> 
> My position was that if fillInStackTrace can be seen to clear it, then
> setStackTrace (which is logically somewhat equivalent) should also be
> seen to clear it.
> 
> Alternatively, add a new field to NPE to cache the extended error
> message, and explicitly evaluate the message if fillInStackTrace() is
> called. That will continue the illusion that the extended message was
> actually set at construction time. No changes needed to setStackTrace()
> as we can still lazily compute the extended message.
> 
> Something like:
> 
> private String extendedMessage;
> 
> public synchronized Throwable fillInStackTrace() {
>      if (extendedMessage == null) {
>          extendedMessage = getExtendedNPEMessage();
>      }
>      return super.fillInStackTrace();
> }
> 
> public String getMessage() {
>      String message = super.getMessage();
>      synchronized(this) {
>          if (message == null) {
>              // This NPE should have an extended message.
>              if (extendedMessage == null) {
>                  extendedMessage = getExtendedNPEMessage();
>              }
>              message = extendedMessage;
>          }
>      }
>      return message;
> }
> 
> Cheers,
> David
> 
> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
> > Hi David,
> >
> >> Your extended message is only computed when there is no original
> >> message.
> > Hmm. I would say the extended message is only computed when
> > The NPE was raised by the runtime. It happens to never have a
> > message so far in these cases.
> > But these are two views of the same thing.
> >
> >> You're concerned about this scenario:
> >>
> >> catch (NullPointerException npe) {
> >>     String msg1 = npe.getMessage(); // gets extended NPE message
> >>     npe.setStackTrace(...);
> >>     String msg2 = npe.getMessage(); // gets null
> >> }
> >>
> >> While I find it hard to imagine anyone doing this
> > Well, all the scenarios are quite artificial:
> >   - why would you call fillInStackTrace on an exception thrown by the VM?
> >   - why would you call setStackTrace at all?
> >> you can easily have
> >> specified that the extended message is only available with the original
> >> stacktrace, hence after a second call to fillInStackTrace, or a call to
> >> setStackTrace, then the message reverts to being empty.
> > The message is not meant to be a special thing that behaves differently
> > from other messages, e.g. sometimes being available and sometimes not.
> > It ended up being different through requirements during the
> > review.
> >
> >> To me that makes
> >> far more sense than having msg2 continue to report the extended info for
> >> the original stacktrace when it now has a new stacktrace.
> >>
> >> I'm really not seeing why calling fillInStackTrace() a second time
> >> should be treated any differently to calling setStackTrace(). They
> >> should be handled consistently IMO.
> > But then you treat setStackTrace() differently from setStackTrace()
> > with other exceptions.
> > The reason to treat fillInStackTrace differently is that we lost information
> > needed to compute it. This is not the case with setStackTrace().
> >
> > A different solution, the one I would have proposed if I had not
> > considered previous comments from reviews,  would be to just
> > compute the message in the runtime in the call of fillInStackTrace
> > before the old stack trace is lost and assign it to the message field.
> > This way it would behave similar to all other exceptions. The message
> > would just be there ... just that it's computed lazily.
> > The cost of the algorithm wouldn't harm that much as other costly
> > algorithms (walking the stack) are performed at this point, too.
> >
> >> We are not talking about all exceptions only about your NPE extended
> >> error message.
> > Hmm, the inconsistency caused by the code you posted above
> > holds for all exceptions.  If you fiddle with the stack trace,
> > the message might become pointless.  Wrt. setStackTrace
> > they all behave the same.
> > Wrt. fillInStackTrace the message will be wrong. Only this
> > needs to be fixed.
> >
> > Best regards,
> >    Goetz.
> >
> >
> >>
> >> David
> >> -----
> >>
> >>> I implemented an example where wrong stack traces are
> >>> printed with LinkageError and NPE, modifying a jtreg test:
> >>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
> >>> See also the generated output added to a comment in the patch.
> >>> If the NPE message text was missing in the second printout, I think
> >>> this really would be unexpected.
> >>> Please note that the correct message is printed after messing
> >>> with the stack trace, it's the stack trace that is wrong.
> >>> (Not as with the problem I am fixing here where a wrong
> >>> message is printed.)
> >>>
> >>> Best regards,
> >>>     Goetz.
> >>>
> >>>
> >>>
> >>>>
> >>>>> I guess the normal usecase of setStackTrace is the other way around:
> >>>>> Change the message and throw a new exception with the existing
> >>>>> stack trace:
> >>>>>
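> >>>>> try {
> >>>>>      a.x;
> >>>>> } catch (NullPointerException e) {
> >>>>>      // setStackTrace() returns void, so it cannot be chained into the throw.
> >>>>>      NullPointerException npe = new NullPointerException("My own error message");
> >>>>>      npe.setStackTrace(e.getStackTrace());
> >>>>>      throw npe;
> >>>>> }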
> >>>>>
> >>>>> And not taking an arbitrary stack trace and put it into an exception
> >>>>> with existing message.
> >>>>
> >>>> Interesting usage.
> >>>>
> >>>> Cheers,
> >>>> David
> >>>> -----
> >>>>
> >>>>> Best regards,
> >>>>>      Goetz.
> >>>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>> Sent: Friday, July 3, 2020 9:30 AM
> >>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> 'forax at univ-
> >>>> mlv.fr'
> >>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-
> dev
> >>>>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >> message
> >>>>>> after calling fillInStackTrace
> >>>>>>
> >>>>>> Hi Goetz,
> >>>>>>
> >>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>>> True. To ensure you process the original backtrace only you need to
> >>>>>>>> add synchronization in getMessage():
> >>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
> >>>>>>>
> >>>>>>> I added the volatile, too, but as I understand the synchronized
> >>>>>>> block brings sufficient memory barriers that this also works
> >>>>>>> without.
> >>>>>>
> >>>>>> No "volatile" needed, or wanted, when all access is within synchronized
> >>>>>> regions.
> >>>>>>
> >>>>>>>> To be honest the idea that someone would share an exception instance
> >>>>>>>> and concurrently mutate it with fillInStackTrace() whilst printing out
> >>>>>>>> information about it just seems highly unrealistic.
> >>>>>>> Yes, contention here is quite unlikely, so it should not harm
> >>>>>>> performance.
> >>>>>>
> >>>>>> Contention was not my concern at all. :)
> >>>>>>
> >>>>>>>> Though after looking at comments in the test I would also
> >>>>>>>> suggest that setStackTrace be updated:
> >>>>>>> The test shows that the correct message is still computed after
> >>>>>>> setStackTrace. This is because the algorithm uses Throwable::backtrace
> >>>>>>> and not Throwable::stackTrace.  Throwable::backtrace is not
> >>>>>>> affected by setStackTrace.
> >>>>>>> The behavior is just as with any exception. If you fiddle
> >>>>>>> with the stack trace, but don't adapt the message text,
> >>>>>>> the message might refer to other code than the stack trace
> >>>>>>> points to.
> >>>>>>
> >>>>>> But you can't adapt the message text - there is no setMessage! If the
> >>>>>> message is NULL and you call setStackTrace() then getMessage(), it makes
> >>>>>> no sense to return the extended error message that was associated with
> >>>>>> the original stack/backtrace.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> David
> >>>>>>
> >>>>>>> Best regards,
> >>>>>>>       Goetz.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
> >>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >> 'forax at univ-
> >>>>>> mlv.fr'
> >>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-
> runtime-
> >> dev
> >>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>> message
> >>>>>>>> after calling fillInStackTrace
> >>>>>>>>
> >>>>>>>> Hi Goetz,
> >>>>>>>>
> >>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> >>>>>>>>> Hi Remi,
> >>>>>>>>>
> >>>>>>>>> But how does volatile help?
> >>>>>>>>> I see the test for numStackTracesFilledIn == 1 then always gets the
> >>>>>>>>> right value.
> >>>>>>>>> But the backtrace may not be changed until I read it in
> >>>>>>>>> getExtendedNPEMessage.  The other thread could change it after
> >>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
> >>>>>>>>
> >>>>>>>> True. To ensure you process the original backtrace only you need to
> >>>>>>>> add synchronization in getMessage():
> >>>>>>>>
> >>>>>>>>            public String getMessage() {
> >>>>>>>>                String message = super.getMessage();
> >>>>>>>>                // If the stack trace was changed the extended NPE algorithm
> >>>>>>>>                // will compute a wrong message.
> >>>>>>>> +         synchronized(this) {
> >>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
> >>>>>>>> !                 return getExtendedNPEMessage();
> >>>>>>>> !             }
> >>>>>>>> +         }
> >>>>>>>>                return message;
> >>>>>>>>            }
> >>>>>>>>
> >>>>>>>> To be honest the idea that someone would share an exception instance
> >>>>>>>> and concurrently mutate it with fillInStackTrace() whilst printing out
> >>>>>>>> information about it just seems highly unrealistic. But the above fixes
> >>>>>>>> it simply. Though after looking at comments in the test I would also
> >>>>>>>> suggest that setStackTrace be updated:
> >>>>>>>>
> >>>>>>>>             synchronized (this) {
> >>>>>>>>                  if (this.stackTrace == null && // Immutable stack
> >>>>>>>>                      backtrace == null) // Test for out of protocol state
> >>>>>>>>                      return;
> >>>>>>>> +           numStackTracesFilledIn++;
> >>>>>>>>                  this.stackTrace = defensiveCopy;
> >>>>>>>>              }
> >>>>>>>>          }
> >>>>>>>>
> >>>>>>>> as that would seem to be another hole in the mechanism.
> >>>>>>>>
> >>>>>>>>> I want to vote again for the much simpler version
> >>>>>>>>> proposed in webrev 02:
> >>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
> >>>>>>>>
> >>>>>>>> I much prefer the latest version that recognises that only the original
> >>>>>>>> stack can be processed.
> >>>>>>>>
> >>>>>>>> In the test:
> >>>>>>>>
> >>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
> >>>>>>>>
> >>>>>>>> Two typos: crated  & implicilty
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> David
> >>>>>>>> -----
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> Its drawback is only that for this code:
> >>>>>>>>>        ex = null;
> >>>>>>>>>        ex.fillInStackTrace()
> >>>>>>>>> no message is created.
> >>>>>>>>>
> >>>>>>>>> I think this really is acceptable.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
> >>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
> >>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is
> >>>>>>>>>> filled you don't compute any helpful message anyway.
> >>>>>>>>> The internal structure is no longer deleted when the stack trace
> >>>>>>>>> is filled. So the message can be computed later, too.
> >>>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>>        Goetz.
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> >>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
> >>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> Christoph
> >>>> Dreis
> >>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
> >>>> runtime-
> >>>>>>>>>> dev at openjdk.java.net>; David Holmes
> >> <david.holmes at oracle.com>
> >>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>> message
> >>>>>>>>>> after calling fillInStackTrace
> >>>>>>>>>>
> >>>>>>>>>> yes,
> >>>>>>>>>> it's what i was saying,
> >>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
> >>>>>>>>>> initialized, i believe that declaring numStackTracesFilledIn volatile
> >>>>>>>>>> is the best way to tackle that.
> >>>>>>>>>>
> >>>>>>>>>> Rémi
> >>>>>>>>>>
> >>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> >>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis"
> >>>>>>>>>>> <christoph.dreis at freenet.de>
> >>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
> >>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
> >>>>>>>>>>> <forax at univ-mlv.fr>
> >>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
> >>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> >>>>>>>>>>> after calling fillInStackTrace
> >>>>>>>>>>
> >>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>> Hi Christoph,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
> >>>>>>>>>>>>
> >>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
> >>>>>>>>>>> without synchronization.
> >>>>>>>>>>>
> >>>>>>>>>>> -Alan

From luhenry at microsoft.com  Tue Jul 14 16:34:49 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 14 Jul 2020 16:34:49 +0000
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <CAA-vtUw4ZwOHitBtsJLVMMma1D+TVV02xzoWmwd1M2yg-S91DQ@mail.gmail.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
 <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CAA-vtUw4ZwOHitBtsJLVMMma1D+TVV02xzoWmwd1M2yg-S91DQ@mail.gmail.com>
Message-ID: <MWHPR21MB0511FC402865B075019D00D7B0610@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Thomas,

This is where Windows exception handling and Unix/Linux signals differ. On Windows, you have VEH, SEH and Unhandled Exception Handling (I'll call it UEH here), while on Unix/Linux, you only have signals.

On Windows, by having this split, you can easily split your exception handling into 1. treating expected exceptions (EXCEPTION_ILLEGAL_INSTRUCTION on a deoptimization, EXCEPTION_ACCESS_VIOLATION in arraycopy stub, etc.), and 2. generating an hs_err file on an unexpected exception. You can do 1. with VEH and SEH, and 2. with UEH, and that's what I am proposing to do here.

Practically speaking, the existing `topLevelExceptionFilter` would be split into two: a `topLevelVectoredExceptionFilter` which would be passed to `AddVectoredExceptionHandler`, and a `topLevelUnhandledExceptionHandler` which would be passed to `SetUnhandledExceptionFilter`. This `topLevelUnhandledExceptionHandler` would contain (more or less) _only_ the `VMError::report_and_die`, and the `topLevelVectoredExceptionFilter` would contain _no_ `VMError::report_and_die` whatsoever.
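
As a rough illustration of the intended split, here is a small model of the dispatch decision. It is written in Java purely as a simulation and is not Win32 or Hotspot code: the class, the `Outcome` enum, the half-open code cache bounds and the `sehFramePresent` flag are all hypothetical simplifications. The only thing it is meant to show is that the vectored filter claims nothing outside the code cache and that the crash report happens only once nobody else has handled the exception.

```java
public class HandlerSplitSketch {
    // Hypothetical bounds standing in for CodeCache::low_bound()/high_bound().
    public static final long CODE_CACHE_LOW = 0x1000L;
    public static final long CODE_CACHE_HIGH = 0x2000L;

    public enum Outcome { HANDLED_BY_VEH, HANDLED_BY_SEH, CRASH_REPORT }

    // Models the vectored filter: it only claims faults whose pc lies in the
    // code cache (assumed half-open range) and never produces a crash report.
    public static boolean vectoredFilter(long pc) {
        return pc >= CODE_CACHE_LOW && pc < CODE_CACHE_HIGH;
    }

    // Models the order: vectored handler first, then an SEH frame (if any,
    // assumed here to handle the fault), and only last the unhandled filter.
    public static Outcome dispatch(long pc, boolean sehFramePresent) {
        if (vectoredFilter(pc)) {
            return Outcome.HANDLED_BY_VEH;
        }
        if (sehFramePresent) {
            return Outcome.HANDLED_BY_SEH;
        }
        return Outcome.CRASH_REPORT; // the place for VMError::report_and_die
    }

    public static void main(String[] args) {
        System.out.println(dispatch(0x1800L, false)); // fault in generated code
        System.out.println(dispatch(0x9000L, true));  // fault in code guarded by SEH
        System.out.println(dispatch(0x9000L, false)); // nobody handled it
    }
}
```

If the crash report were moved into `vectoredFilter`, the second case could never reach the SEH frame in this model.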

Keeping the `VMError::report_and_die` inside VEH would, like you say, completely kill any use of SEH, even in external libraries. That would be a breaking change, and is then, IMO, not acceptable.

Thanks,

--
Ludovic

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Monday, July 13, 2020 23:29
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

On Mon, Jul 13, 2020 at 11:55 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

Thank you for your feedback!

Let me answer on some of the cases you mention.

> A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

Exactly.

> B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

Today, through SEH and RtlAddFunctionTable, we only get a very clear subset of exceptions: the one triggered in the code cache. If an exception is triggered from a PC outside of this code cache, SEH will not get the handler we registered with RtlAddFunctionTable, and we'll simply _not_ call into HandleExceptionFromCodeCache (the handler we register with RtlAddFunctionTable). That can be trivially reproduced in the VEH by simply checking that the PC is between CodeCache::low_bound() and CodeCache::high_bound().

This is what you are mentioning with "we only can distinguish our crashes from their crashes via crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH".


Not sure we understand each other.

Today we get exceptions from two sides:
- via SEH, __try/__except, in threads attached to the VM. There the pc is either us or third party code below us which did not bother setting up SEH for themselves
- via RtlAddFunctionTable for the code cache, where we specify code cache boundaries.

With VEH we would get all exceptions in the process. Including exceptions from threads which have never seen the libjvm, or from caller code if the hotspot is embedded somewhere.

Under Unix we handle all those crashes by writing hs-err crashlogs, even if those crashes are not our responsibility. Unless user set up signal chaining, where we hand over any crash signal to the chained handler (which for the purpose of clear error reporting is also not perfect).

With VEH I get all exceptions, but have to decide on my own if an exception should result in a hs-err file or handed to the next exception handler. The only way I can see is by examining the pc - iterate through all our binaries and compare the pc with their text segments, and also check the code cache.

I may miss something here.

> With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Here, Unhandled Exception Handling predates VEH and it doesn't integrate chaining. The API is similar to signals on Linux/Unix: the last one to register has to make sure to save the previous one and to call/chain it accordingly.

> My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Nothing from the Win32 API stops you from mix-and-matching VEH and SEH. If you want to do a `__try { val = *ptr; } __except (EXCEPTION_EXECUTE_HANDLER) { success = false; }` in some C++ code (in vm or native), nothing stops you from doing so. My understanding of the exception handler logic in the OpenJDK on Windows is that the accepted EXCEPTION_ACCESS_VIOLATION in java, vm, or native code is limited to a clear subset, and anything outside of these known cases is quickly treated as "an exception we cannot handle". SafeFetch is such a case where the instructions potentially triggering the EXCEPTION_ACCESS_VIOLATION are matched against by the exception handler.


Well, in your example, VEH would have preference and get the exception first; in our handler we recognize the exception as not allowed, hence a crash, and write a hs-err file. My success=false; handler would never execute.

But I admit this is really a minor point. I also dimly remember seeing some win32 API to check pointers for readability, so maybe using SEH for these things is not necessary anyway.

Thanks, Thomas

--
Ludovic

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Saturday, July 11, 2020 23:08
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

sorry for the delay, and thanks for the extensive answer. Please find remarks inline.

On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

It seems that the problem you're describing stems from the current exception handler treating two cases: 1. any exception knowingly triggered by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy stubs, stackoverflow in Java code), and 2. exceptional cases leading to crashes (ex: uncaught C++ exception, an access violation in VM or native/external code, etc.). There is the same problem on Unix because there is only one system (signal handling) for both cases. Fortunately, Windows proposes different systems, each with its own advantages.

The order in which Windows invokes each of these systems is the following:
 1. Vectored Exception Handler registered with `AddVectoredExceptionHandler`
 2. Structured Exception Handler
 3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
 4. Unhandled Exception Handler

Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler registered with `RtlAddFunctionTable`. This handler does both the Java-triggered exceptions and any other exceptions.

Now, from the point of view of an external library or application embedding the JVM inside their own process, they still have all the above options to register an exception handler, irrespective of how Hotspot does it. This creates the following cases:
 - If the application uses VEH: they will (with Hotspot using SEH) be called _before_ Hotspot's exception handler and will then have to be aware that they may get exceptions unrelated to them and will have to ignore them accordingly
 - If the application uses SEH: they will only get exceptions related to their code area

If Hotspot is to use VEH, an exception would play out as follows:
 - If the application uses VEH and their registered handler executes _before_ Hotspot's one: same as above
 - If the application uses VEH and their registered handler executes _after_ Hotspot's one: Hotspot has to make sure that the exception was triggered by Hotspot and ignore it otherwise (a range check on the PC can be used here to emulate how it's done with RtlAddFunctionTable)
 - If the application uses SEH: the same case as to where the application's handler executes _after_ Hotspot's one

This all assumes that Hotspot's VEH handler doesn't trigger a crash report (VMError::report_and_die) on any exception it doesn't know how to handle. The simplest way to do that is simply _not_ to do it in Hotspot's VEH handler, and to do it by registering a Win32 Unhandled Exception Handler (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when no other exception handler treated the exception (by returning EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it means the application is "toast" and not in a runnable state anymore, which fits nicely with the purpose of the Hotspot crash report.


Okay, If I get this correctly:

Today:
  App uses VEH - they execute before us and have to handle this correctly (->A)
  App uses SEH - no interaction

With proposed switch:
  App uses VEH - they may or may not execute before us. If they come before us: (->A). If they come after us -> (B)
  App uses SEH -> (B)

A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

On windows we do not know this (?), we only can distinguish our crashes from their crashes via crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH. With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Okay this seems safe enough to try it at least.

My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Thank you,

Thomas

I hope this sheds some light on possible solutions ahead of us.

Thank you,

--
Ludovic

[1] https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Sunday, June 21, 2020 05:55
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi,

We at SAP had used VEH in our own Windows Itanium port and I dimly remember it being a source of problems. That was many years ago and I realize that it is not worth much, but it makes me a bit apprehensive of this change.

The main problem I see is that this will be an observable change in behavior.

We currently use SEH, so our error handler is guaranteed to be invoked only for exceptions from within our own code. With VEH we now follow the Unix way of things and suddenly our error handler becomes a global resource.

We will suddenly be invoked for crashes outside the VM, e.g. in foreign launcher code atop of us or in non-java side threads, which will generate whole new classes of hs-err files for crashes the VM is not responsible for. Which are then perceived as VM crashes and sent to us vendors instead of going to the right people. This is the way it works on Unix today, and it is a constant annoyance and increases our support workload.

We also may introduce new problems since suddenly we interfere with application exception handling. At the very least, we have to think up a scheme for signal chaining (both ways: VM->foreign code and foreign code->VM). For the first, we probably need some form of libjsig preloading, or some other way to divert signal handler instalment. That would also need cooperation from the application programmers and/or operators.

Matters are even more complicated, since foreign code may use SEH instead of VEH, so what happens if a JNI library below me wants to use SEH, does that still work?

I feel this should not be rushed. Even if considered "brittle", SEH has served us well; I do not recall many problems in the past aside from having to add the occasional __try/__except. Are there actual bugs we have to solve?

Lastly, personally I always found SEH quite a neat concept, and one of the few places where Windows was superior to Unix :)

Thanks, Thomas


On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hello,

First, some context and definitions:
- when talking about exception here, I'm talking about Win32 exception which are equivalent to signals on Linux and other Unix, I am _not_ talking about Java exceptions.
- an explanation of an _exception filter_ can be found at https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019. Java has only a limited form of this concept, in its type-based exception filters (e.g. `try { ... } catch (IOException ioe) { ... } catch (Throwable t) { ... }`).
- in Win32, there are two exception handling mechanisms:
  - Structured Exception Handling: the historical one, based on `__try {} __except (...) {}`
  - Vectored Exception Handling: introduced in Windows XP / Windows Server 2003, much more similar to signals on Linux
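
For the Java side of the comparison above, here is a minimal sketch of type-based filtering. The class and method names (`CatchFilterDemo`, `classify`) are made up for illustration. The only "filter" Java offers is the static type of each catch clause, checked in order:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical demo class; not part of the OpenJDK change under discussion.
final class CatchFilterDemo {
    // Throws the given exception and reports which catch clause handled it.
    static String classify(Exception toThrow) {
        try {
            throw toThrow;
        } catch (FileNotFoundException e) {
            return "missing file";   // most specific type is tried first
        } catch (IOException e) {
            return "I/O";            // any other IOException
        } catch (Throwable t) {
            return "other";          // catch-all, comparable to a filter that always accepts
        }
    }
}
```

Unlike a Win32 exception filter, a catch clause cannot run arbitrary code to decide whether to handle the exception; the decision is made on the exception's type alone.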

These exception handling mechanisms are used to catch exceptions such as Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These exceptions are the equivalent of signals on Linux and are therefore core to many mechanisms in the OpenJDK.

Today, the OpenJDK uses Structured Exception Handling to catch such exceptions, which creates several requirements. First, all code that might trigger an exception on purpose (like an Access Violation / SIGSEGV in the arraycopy stub) needs to be wrapped in a `__try` / `__except`. Because it's not feasible to wrap every single instance of such code, these `__try` / `__except` blocks are put in the topmost function of every thread started by the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable` is used to simulate the use of `__try` / `__except` for a specific code area. This function needs platform-specific code that generates a trampoline calling the exception filter declared in the runtime. It's also meant to be used as a one-to-one mapping with try / catch in user code, and not as a "catch all the exceptions in this code area". Third, Structured Exception Handling expects to be able to unwind the stack. However, because Hotspot doesn't guarantee the usage of the platform-specific ABI internally, the platform-specific unwinder might break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies on the assumption that Structured Exception Handling never tries to unwind the stack (which it would fail to do because of the different ABI) before calling the registered exception filter.

According to the Windows Kernel maintainers we discussed this with, this approach is highly discouraged and considered brittle, and the better solution is Vectored Exception Handling. Vectored Exception Handling is conceptually much more similar to signal / sigaction on Linux and other Unix systems. It will catch all exceptions happening across the process, and no `__try` / `__except` will be required. It also removes the requirement to call `RtlAddFunctionTable`. The exception filter then behaves like a signal handler and can modify the registers at will, for example modifying the PC to step over an instruction after an expected Access Violation. Vectored Exception Handling is also already used for AOT code.

The changes can be found at http://cr.openjdk.java.net/~burban/ludovic_vecexc/. As I am not an author, I have not created a corresponding bug in JBS.

Thank you, and looking forward to your feedback!

--
Ludovic


From forax at univ-mlv.fr  Tue Jul 14 17:30:34 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 14 Jul 2020 19:30:34 +0200 (CEST)
Subject: [15] RFR: 8248476: No helpful NullPointerException message
 after calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <1277163916.614880.1594747834571.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>
> To: "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>, "Alan Bateman" <Alan.Bateman at oracle.com>
> Cc: "Christoph Dreis" <christoph.dreis at freenet.de>, "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>
> Sent: Tuesday, July 14, 2020 15:48:26
> Subject: RE: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace

> Hi,
> 
> Yes, Coleen, you are right. We must preserve the lazy
> computation, and also reduce overhead on discarded
> exceptions.
> 
> And yes, we can do it with a counter:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
> but I would prefer placeholder strings:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> This way we need only one new field.
> 
> (I need two placeholders, because the getExtendedNPEMessage0()
> sometimes returns null. If I write null into the extendedMessage field,
> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> time.)
> 
> With webrev 07 the overhead on discarded exceptions is basically the
> same as with webrev 05: one additional field, one assignment in
> fillInStackTrace().
> 
> What do you think?

Hi Goetz,
this is a review for the v07,
the static final fields should be in uppercase to make the code more readable,
you need to use new String("1") because "1" may be a valid string, i also think that "1" and "2" are not explicit enough, using something like "MUST_COMPUTE_EXTENDED_NPE_MESSAGE" seems better IMO.

i don't think you need to declare extendedMessage volatile, it is only accessed inside a synchronized block on this.

in getMessage, you can use an early return to simplify the code shape:
  synchronized(this) {
    if (extendedMessage == mustComputeExtendedNPEMessage) {
       // Only the original stack trace was filled in. Message will
       // compute correctly.
       return extendedMessage = getExtendedNPEMessage();   // <-- HERE
    }
    ...
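
A minimal sketch of the placeholder idea, to show why the sentinel has to be created with `new String(...)` and compared by identity. This is not the webrev code: `SentinelDemo` and `computeMessage()` are made-up stand-ins (the latter for `getExtendedNPEMessage()`), and only one placeholder is modelled, so the case where the computation returns null is left out.

```java
// Hypothetical stand-in for the NPE logic under review; not the JDK source.
final class SentinelDemo {
    // new String(...) creates a distinct object, so no string literal or
    // user-supplied string can ever be == to this placeholder.
    private static final String MUST_COMPUTE =
            new String("MUST_COMPUTE_EXTENDED_NPE_MESSAGE");

    private String extendedMessage = MUST_COMPUTE;
    private int computations = 0;

    // Assumed replacement for the native message computation.
    private String computeMessage() {
        computations++;
        return "Cannot read field \"x\" because \"a\" is null";
    }

    synchronized String getMessage() {
        if (extendedMessage == MUST_COMPUTE) {       // identity check, not equals()
            return extendedMessage = computeMessage(); // early return, computed once
        }
        return extendedMessage;
    }

    int computations() {
        return computations;
    }

    // A literal with the same characters is equal but not identical.
    static boolean literalIsSameObject() {
        String literal = "MUST_COMPUTE_EXTENDED_NPE_MESSAGE";
        return literal == MUST_COMPUTE;
    }
}
```

With a plain literal as the placeholder, a real message made of the same characters could be the very same interned object, and the identity check would then misfire.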

> 
> Best regards,
>  Goetz.

regards,
Rémi

> 
> 
> 
> 
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Tuesday, July 14, 2020 1:55 PM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>> 
>> Correction ...
>> 
>> On 14/07/2020 12:11 pm, David Holmes wrote:
>> > Hi Goetz,
>> >
>> > Okay ... if I understand your position correctly you are looking at this
>> > as if the extended message is created at the time the NPE is thrown, and
>> > it is an implementation detail that we actually determine it lazily. If
>> > it were eagerly determined then neither fillInStackTrace() nor
>> > setStackTrace() would make any difference to the message - just as with
>> > any other exception message.
>> >
>> > However, the lazy determination of the message causes a problem with
>> > fillInStackTrace() because that call will destroy the original backtrace
>> > needed to produce the original message, and create an incorrect message.
>> > setStackTrace() does not have a similar problem because, simply by the
>> > way the current implementation works it doesn't touch the original
>> > backtrace.
>> >
>> > So you are proposing to only fix the bug that is evident in relation to
>> > fillInStackTrace() by no longer evaluating the extended message if
>> > fillInStackTrace() is called after the NPE was constructed.
>> >
>> > But in doing so you break the illusion that the extended message acts
>> > as-if determined at construction time, because you now effectively clear
>> > it when fillInStackTrace is called.
>> >
>> > My position was that if fillInStackTrace can be seen to clear it, then
>> > setStackTrace (which is logically somewhat equivalent) should also be
>> > seen to clear it.
>> >
>> > Alternatively, add a new field to NPE to cache the extended error
>> > message, and explicitly evaluate the message if fillInStackTrace() is
>> > called. That will continue the illusion that the extended message was
>> > actually set at construction time. No changes needed to setStackTrace()
>> > as we can still lazily compute the extended message.
>> >
>> > Something like:
>> >
>> > private String extendedMessage;
>> >
>> > public synchronized Throwable fillInStackTrace() {
>> >      if (extendedMessage == null) {
>> >          extendedMessage = getExtendedNPEMessage();
>> >      }
>> >      return super.fillInStackTrace();
>> > }
>> 
>> Coleen pointed out to me that we can't do it like this because we need
>> the initial fillInStackTrace to be fast and we want the extended message
>> computed lazily. So it will still need a counter so we only do this on
>> the second call.
>> 
>> 
>>   private String extendedMessage;
>>   private int fillInCount;
>> 
>>   public synchronized Throwable fillInStackTrace() {
>>        if (extendedMessage == null && (fillInCount++ == 1)) {
>>            extendedMessage = getExtendedNPEMessage();
>>        }
>>        return super.fillInStackTrace();
>>   }
>> 
>> or something to that effect.
>> 
>> David
>> -----
>> 
>> > public String getMessage() {
>> >      String message = super.getMessage();
>> >      synchronized(this) {
>> >          if (message == null) {
>> >              // This NPE should have an extended message.
>> >              if (extendedMessage == null) {
>> >                  extendedMessage = getExtendedNPEMessage();
>> >              }
>> >              message = extendedMessage;
>> >          }
>> >      }
>> >      return message;
>> > }
>> >
>> > Cheers,
>> > David
>> >
>> > On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>> >> Hi David,
>> >>
>> >>> Your extended message is only computed when there is no original
>> >>> message.
>> >> Hmm. I would say the extended message is only computed when
>> >> The NPE was raised by the runtime. It happens to never have a
>> >> message so far in these cases.
>> >> But these are two views of the same thing.
>> >>
>> >>> You're concerned about this scenario:
>> >>>
>> >>> catch (NullPointerException npe) {
>> >>>     String msg1 = npe.getMessage(); // gets extended NPE message
>> >>>     npe.setStackTrace(...);
>> >>>     String msg2 = npe.getMessage(); // gets null
>> >>> }
>> >>>
>> >>> While I find it hard to imagine anyone doing this
>> >> Well, all the scenarios are quite artificial:
>> >>   - why would you call fillInStackTrace on an exception thrown by the VM?
>> >>   - why would you call setStackTrace at all?
>> >>> you can easily have
>> >>> specified that the extended message is only available with the original
>> >>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>> >>> setStackTrace, then the message reverts to being empty.
>> >> The message is not meant to be a special thing that behaves differently
>> >> from other messages, e.g. sometimes available, sometimes not.
>> >> It ended up being different through requirements during the
>> >> review.
>> >>
>> >>> To me that makes
>> >>> far more sense than having msg2 continue to report the extended info
>> for
>> >>> the original stacktrace when it now has a new stacktrace.
>> >>>
>> >>> I'm really not seeing why calling fillInStackTrace() a second time
>> >>> should be treated any differently to calling setStackTrace(). They
>> >>> should be handled consistently IMO.
>> >> But then you treat setStackTrace() differently from setStackTrace()
>> >> with other exceptions.
>> >> The reason to treat fillInStackTrace differently is that we lost
>> >> information
>> >> needed to compute it. This is not the case with setStackTrace().
>> >>
>> >> A different solution, the one I would have proposed if I had not
>> >> considered previous comments from reviews, would be to just
>> >> compute the message in the runtime in the call of fillInStackTrace
>> >> before the old stack trace is lost and assign it to the message field.
>> >> This way it would behave similar to all other exceptions. The message
>> >> would just be there ... just that it's computed lazily.
>> >> The cost of the algorithm wouldn't harm that much as other costly
>> >> algorithms (walking the stack) are performed at this point, too.
>> >>
>> >>> We are not talking about all exceptions only about your NPE extended
>> >>> error message.
>> >> Hmm, the inconsistency caused by the code you posted above
>> >> holds for all exceptions. If you fiddle with the stack trace,
>> >> the message might become pointless. Wrt. setStackTrace
>> >> they all behave the same.
>> >> Wrt. fillInStackTrace the message will be wrong. Only this
>> >> needs to be fixed.
>> >>
>> >> Best regards,
>> >>    Goetz.
>> >>
>> >>
>> >>>
>> >>> David
>> >>> -----
>> >>>
>> >>>> I implemented an example where wrong stack traces are
>> >>>> printed with LinkageError and NPE, modifying a jtreg test:
>> >>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
>> NPE_fillInStackTrace-
>> >>> jdk15/05/mess_with_exceptions.patch
>> >>>> See also the generated output added to a comment in the patch.
>> >>>> If the NPE message text was missing in the second printout, I think
>> >>>> this really would be unexpected.
>> >>>> Please note that the correct message is printed after messing
>> >>>> with the stack trace, it's the stack trace that is wrong.
>> >>>> (Not as with the problem I am fixing here where a wrong
>> >>>> message is printed.)
>> >>>>
>> >>>> Best regards,
>> >>>>     Goetz.
>> >>>>
>> >>>>
>> >>>>
>> >>>>>
>> >>>>>> I guess the normal usecase of setStackTrace is the other way around:
>> >>>>>> Change the message and throw a new exception with the existing
>> >>>>>> stack trace:
>> >>>>>>
>> >>>>>> try {
>> >>>>>>      int v = a.x;
>> >>>>>> } catch (NullPointerException e) {
>> >>>>>>      NullPointerException npe = new NullPointerException("My own error message");
>> >>>>>>      npe.setStackTrace(e.getStackTrace());
>> >>>>>>      throw npe;
>> >>>>>> }
>> >>>>>>
>> >>>>>> And not taking an arbitrary stack trace and put it into an exception
>> >>>>>> with existing message.
>> >>>>>
>> >>>>> Interesting usage.
>> >>>>>
>> >>>>> Cheers,
>> >>>>> David
>> >>>>> -----
>> >>>>>
>> >>>>>> Best regards,
>> >>>>>>      Goetz.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>> -----Original Message-----
>> >>>>>>> From: David Holmes <david.holmes at oracle.com>
>> >>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>> >>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>> 'forax at univ-
>> >>>>> mlv.fr'
>> >>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> >>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>> >>>>>>> hotspot-runtime-dev
>> >>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>> >>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>> >>> message
>> >>>>>>> after calling fillInStackTrace
>> >>>>>>>
>> >>>>>>> Hi Goetz,
>> >>>>>>>
>> >>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>>> True. To ensure you process the original backtrace only you
>> >>>>>>>>> need to
>> >>>>> add
>> >>>>>>>>> synchronization in getMessage():
>> >>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
>> >>> NPE_fillInStackTrace-
>> >>>>>>> jdk15/05/
>> >>>>>>>>
>> >>>>>>>> I added the volatile, too, but as I understand the synchronized
>> >>>>>>>> block brings sufficient memory barriers that this also works
>> >>>>>>>> without.
>> >>>>>>>
>> >>>>>>> No "volatile" needed, or wanted, when all access is within
>> >>>>>>> synchronized
>> >>>>>>> regions.
>> >>>>>>>
>> >>>>>>>>> To be honest the idea that someone would share an exception
>> >>> instance
>> >>>>>>> and
>> >>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>> >>>>>>>>> information about it just seems highly unrealistic.
>> >>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>> >>> performance
>> >>>>>>>
>> >>>>>>> Contention was not my concern at all. :)
>> >>>>>>>
>> >>>>>>>>> Though after looking at comments in the test I would also
>> >>>>>>>>> suggest that setStackTrace be updated:
>> >>>>>>>> The test shows that after setStackTrace still the correct message
>> >>>>>>>> is computed. This is because the algorithm uses
>> >>>>>>>> Throwable::backtrace
>> >>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
>> >>>>>>>> affected by setStackTrace.
>> >>>>>>>> The behavior is just as with any exception. If you fiddle
>> >>>>>>>> with the stack trace, but don't adapt the message text,
>> >>>>>>>> the message might refer to other code than the stack trace
>> >>>>>>>> points to.
>> >>>>>>>
>> >>>>>>> But you can't adapt the message text - there is no setMessage! If
>> >>>>>>> the
>> >>>>>>> message is NULL and you call setStackTrace() then getMessage(), it
>> >>> makes
>> >>>>>>> no sense to return the extended error message that was associated
>> >>> with
>> >>>>>>> the original stack/backtrace.
>> >>>>>>>
>> >>>>>>> Cheers,
>> >>>>>>> David
>> >>>>>>>
>> >>>>>>>> Best regards,
>> >>>>>>>>       Goetz.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> -----Original Message-----
>> >>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>> >>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>> >>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>> >>> 'forax at univ-
>> >>>>>>> mlv.fr'
>> >>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> >>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-
>> runtime-
>> >>> dev
>> >>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>> >>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>> >>>>> message
>> >>>>>>>>> after calling fillInStackTrace
>> >>>>>>>>>
>> >>>>>>>>> Hi Goetz,
>> >>>>>>>>>
>> >>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>> >>>>>>>>>> Hi Remi,
>> >>>>>>>>>>
>> >>>>>>>>>> But how does volatile help?
>> >>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets
>> >>>>>>>>>> always the
>> >>>>>>>>>> right value.
>> >>>>>>>>>> But the backtrace may not be changed until I read it in
>> >>>>>>>>>> getExtendedNPEMessage. The other thread could change it
>> after
>> >>>>>>>>>> checking numStackTracesFilledIn and before I read the
>> backtrace.
>> >>>>>>>>>
>> >>>>>>>>> True. To ensure you process the original backtrace only you
>> >>>>>>>>> need to
>> >>>>> add
>> >>>>>>>>> synchronization in getMessage():
>> >>>>>>>>>
>> >>>>>>>>>            public String getMessage() {
>> >>>>>>>>>                String message = super.getMessage();
>> >>>>>>>>>                // If the stack trace was changed the extended NPE algorithm
>> >>>>>>>>>                // will compute a wrong message.
>> >>>>>>>>> +              synchronized(this) {
>> >>>>>>>>> !                  if (message == null && numStackTracesFilledIn == 1) {
>> >>>>>>>>> !                      return getExtendedNPEMessage();
>> >>>>>>>>> !                  }
>> >>>>>>>>> +              }
>> >>>>>>>>>                return message;
>> >>>>>>>>>            }
>> >>>>>>>>>
>> >>>>>>>>> To be honest the idea that someone would share an exception
>> >>> instance
>> >>>>>>> and
>> >>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>> >>>>>>>>> information about it just seems highly unrealistic. But the
>> >>>>>>>>> above fixes
>> >>>>>>>>> it simply. Though after looking at comments in the test I would
>> >>>>>>>>> also
>> >>>>>>>>> suggest that setStackTrace be updated:
>> >>>>>>>>>
>> >>>>>>>>>             synchronized (this) {
>> >>>>>>>>>                 if (this.stackTrace == null && // Immutable stack
>> >>>>>>>>>                     backtrace == null) // Test for out of protocol state
>> >>>>>>>>>                     return;
>> >>>>>>>>> +               numStackTracesFilledIn++;
>> >>>>>>>>>                 this.stackTrace = defensiveCopy;
>> >>>>>>>>>             }
>> >>>>>>>>>         }
>> >>>>>>>>>
>> >>>>>>>>> as that would seem to be another hole in the mechanism.
>> >>>>>>>>>
>> >>>>>>>>>> I want to vote again for the much more simple version
>> >>>>>>>>>> proposed in webrev 02:
>> >>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-
>> >>>>> NPE_fillInStackTrace-
>> >>>>>>>>> jdk15/02/
>> >>>>>>>>>
>> >>>>>>>>> I much prefer the latest version that recognises that only the
>> >>>>>>>>> original
>> >>>>>>>>> stack can be processed.
>> >>>>>>>>>
>> >>>>>>>>> In the test:
>> >>>>>>>>>
>> >>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>> >>>>>>>>>
>> >>>>>>>>> Two typos: crated & implicilty
>> >>>>>>>>>
>> >>>>>>>>> Thanks,
>> >>>>>>>>> David
>> >>>>>>>>> -----
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>> Its only drawback is that for this code:
>> >>>>>>>>>>        ex = null;
>> >>>>>>>>>>        ex.fillInStackTrace()
>> >>>>>>>>>> no message is created.
>> >>>>>>>>>>
>> >>>>>>>>>> I think this really is acceptable.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>> >>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at
>> some
>> >>>>> point.
>> >>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
>> >>>>>>>>>>> trace is
>> >>> filled
>> >>>>>>> you
>> >>>>>>>>> don't
>> >>>>>>>>>>> compute any helpful message anyway.
>> >>>>>>>>>> The internal structure is no longer deleted when the stack trace
>> >>>>>>>>>> is filled. So the message can be computed later, too.
>> >>>>>>>>>>
>> >>>>>>>>>> Best regards,
>> >>>>>>>>>>        Goetz.
>> >>>>>>>>>>
>> >>>>>>>>>>> -----Original Message-----
>> >>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>> >>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>> >>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>> >>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>> Christoph
>> >>>>> Dreis
>> >>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
>> >>>>> runtime-
>> >>>>>>>>>>> dev at openjdk.java.net>; David Holmes
>> >>> <david.holmes at oracle.com>
>> >>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>> NullPointerException
>> >>>>>>> message
>> >>>>>>>>>>> after calling fillInStackTrace
>> >>>>>>>>>>>
>> >>>>>>>>>>> yes,
>> >>>>>>>>>>> it's what i was saying,
>> >>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>> >>>>>>> initialized,
>> >>>>>>>>> i
>> >>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile is the
>> >>>>>>>>>>> best
>> >>> way
>> >>>>> to
>> >>>>>>>>>>> tackle that.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Rémi
>> >>>>>>>>>>>
>> >>>>>>>>>>> ----- Original Message -----
>> >>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>> >>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>> >>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
>> >>>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
>> >>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>> >>>>>>>>>>>
>> >>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>> >>>>>>>>>>>>> Hi Christoph,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>> One other thing is that NPE::getMessage reads
>> >>>>> numStackTracesFilledIn
>> >>>>>>>>>>>> without synchronization.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> -Alan

From mandy.chung at oracle.com  Tue Jul 14 18:08:39 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 14 Jul 2020 11:08:39 -0700
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <f8b4cbf4-9dc2-0d9d-552d-0dc6f16a0868@oracle.com>

Hi Goetz,

I'm okay with this solution: it behaves as if the NPE instance were created 
with the extended message, so it will always print the same message.

As fillInStackTrace is always called from the constructor that will set 
Throwable::backtrace to non-null, it's simpler to add a package-private 
Throwable::isBackTraceFilled that returns backtrace != null;

This way you only need one sentinel value to indicate no extended 
message, and that could be an empty string, which is simpler. As Remi 
noted, the extendedMessage field does not have to be volatile.

Something like this or a similar revision to webrev.07:

diff --git a/src/java.base/share/classes/java/lang/NullPointerException.java b/src/java.base/share/classes/java/lang/NullPointerException.java
--- a/src/java.base/share/classes/java/lang/NullPointerException.java
+++ b/src/java.base/share/classes/java/lang/NullPointerException.java
@@ -70,6 +70,23 @@
         super(s);
     }

+    private transient String extendedMessage;
+
+    /**
+     * {@inheritDoc}
+     */
+    public synchronized Throwable fillInStackTrace() {
+        // If the stack trace is changed the extended NPE algorithm
+        // will compute a wrong message. So compute it beforehand.
+        if (isBackTraceFilled() && extendedMessage == null) {
+            String msg = getExtendedNPEMessage();
+            if (msg == null) {
+                extendedMessage = "";
+            }
+        }
+        return super.fillInStackTrace();
+    }
+
     /**
      * Returns the detail message string of this throwable.
      *
@@ -89,7 +106,16 @@
     public String getMessage() {
         String message = super.getMessage();
         if (message == null) {
-            return getExtendedNPEMessage();
+            synchronized(this) {
+                if (extendedMessage == null) {
+                    // Only the original stack trace was filled in. Message will
+                    // compute correctly.
+                    message = getExtendedNPEMessage();
+                    extendedMessage = message != null ? message : "";
+                } else {
+                    message = extendedMessage.isEmpty() ? null : extendedMessage;
+                }
+            }
         }
         return message;
     }
diff --git a/src/java.base/share/classes/java/lang/Throwable.java b/src/java.base/share/classes/java/lang/Throwable.java
--- a/src/java.base/share/classes/java/lang/Throwable.java
+++ b/src/java.base/share/classes/java/lang/Throwable.java
@@ -478,6 +478,10 @@
         this.cause = t;
     }

+    /* package-private */ boolean isBackTraceFilled() {
+        return backtrace != null;
+    }
+
     /**
      * Returns a short description of this throwable.
      * The result is the concatenation of:
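
The caching scheme can also be modelled outside the JDK, as a rough sketch under stated assumptions. `CachedExtendedMessage` and `beforeRefill()` are made-up names, and the `Supplier` stands in for `getExtendedNPEMessage()`, which may return null. Unlike the fillInStackTrace hunk as posted, which assigns the field only when the result is null, the sketch stores the computed message in both cases, on the assumption that this is the intent.

```java
import java.util.function.Supplier;

// Hypothetical model of the single-sentinel cache; not the JDK source.
final class CachedExtendedMessage {
    private final Supplier<String> compute;
    // null = not computed yet, "" = computed but no message available
    private String extendedMessage;

    CachedExtendedMessage(Supplier<String> compute) {
        this.compute = compute;
    }

    // Models the second fillInStackTrace() call: compute the message
    // before the original backtrace would be overwritten.
    synchronized void beforeRefill() {
        if (extendedMessage == null) {
            String msg = compute.get();
            extendedMessage = (msg != null) ? msg : "";
        }
    }

    // Models getMessage() for an NPE without an explicit message.
    synchronized String getMessage() {
        if (extendedMessage == null) {
            String msg = compute.get();
            extendedMessage = (msg != null) ? msg : "";
        }
        return extendedMessage.isEmpty() ? null : extendedMessage;
    }
}
```

The empty string works as the "nothing to report" marker here because a real extended message is assumed never to be empty.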

Mandy

On 7/14/20 6:48 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> Yes, Coleen, you are right. We must preserve the lazy
> computation, and also reduce overhead on discarded
> exceptions.
>
> And yes, we can do it with a counter:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
> but I would prefer placeholder strings:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> This way we need only one new field.
>
> (I need two placeholders, because the getExtendedNPEMessage0()
> sometimes returns null. If I write null into the extendedMessage field,
> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> time.)
>
> With webrev 07 the overhead on discarded exceptions is basically the
> same as with webrev 05: one additional field, one assignment in fillInStackTrace().
>
> What do you think?
>
> Best regards,
>    Goetz.
>
>
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Tuesday, July 14, 2020 1:55 PM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>> Correction ...
>>
>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>> Hi Goetz,
>>>
>>> Okay ... if I understand your position correctly you are looking at this
>>> as if the extended message is created at the time the NPE is thrown, and
>>> it is an implementation detail that we actually determine it lazily. If
>>> it were eagerly determined then neither fillInStackTrace() nor
>>> setStackTrace() would make any difference to the message - just as with
>>> any other exception message.
>>>
>>> However, the lazy determination of the message causes a problem with
>>> fillInStackTrace() because that call will destroy the original backtrace
>>> needed to produce the original message, and create an incorrect message.
>>> setStackTrace() does not have a similar problem because, simply by the
>>> way the current implementation works it doesn't touch the original
>>> backtrace.
>>>
>>> So you are proposing to only fix the bug that is evident in relation to
>>> fillInStackTrace() by no longer evaluating the extended message if
>>> fillInStackTrace() is called after the NPE was constructed.
>>>
>>> But in doing so you break the illusion that the extended message acts
>>> as-if determined at construction time, because you now effectively clear
>>> it when fillInStackTrace is called.
>>>
>>> My position was that if fillInStackTrace can be seen to clear it, then
>>> setStackTrace (which is logically somewhat equivalent) should also be
>>> seen to clear it.
>>>
>>> Alternatively, add a new field to NPE to cache the extended error
>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>> called. That will continue the illusion that the extended message was
>>> actually set at construction time. No changes needed to setStackTrace()
>>> as we can still lazily compute the extended message.
>>>
>>> Something like:
>>>
>>> private String extendedMessage;
>>>
>>> public synchronized Throwable fillInStackTrace() {
>>>     if (extendedMessage == null) {
>>>         extendedMessage = getExtendedNPEMessage();
>>>     }
>>>     return super.fillInStackTrace();
>>> }
>> Coleen pointed out to me that we can't do it like this because we need
>> the initial fillInStackTrace to be fast and we want the extended message
>> computed lazily. So it will still need a counter so we only do this on
>> the second call.
>>
>>
>>    private String extendedMessage;
>>    private int fillInCount;
>>
>>    public synchronized Throwable fillInStackTrace() {
>>         if (extendedMessage == null && (fillInCount++ == 1)) {
>>             extendedMessage = getExtendedNPEMessage();
>>         }
>>         return super.fillInStackTrace();
>>    }
>>
>> or something to that effect.
>>
>> David
>> -----
>>
>>> public String getMessage() {
>>>     String message = super.getMessage();
>>>     synchronized(this) {
>>>         if (message == null) {
>>>             // This NPE should have an extended message.
>>>             if (extendedMessage == null) {
>>>                 extendedMessage = getExtendedNPEMessage();
>>>             }
>>>             message = extendedMessage;
>>>         }
>>>     }
>>>     return message;
>>> }
>>>
>>> Cheers,
>>> David
>>>
>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>> Hi David,
>>>>
>>>>> Your extended message is only computed when there is no original
>>>>> message.
>>>> Hmm. I would say the extended message is only computed when
>>>> the NPE was raised by the runtime. It happens to never have a
>>>> message so far in these cases.
>>>> But these are two views of the same thing.
>>>>
>>>>> You're concerned about this scenario:
>>>>>
>>>>> catch (NullPointerException npe) {
>>>>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>     npe.setStackTrace(...);
>>>>>     String msg2 = npe.getMessage(); // gets null
>>>>> }
>>>>>
>>>>> While I find it hard to imagine anyone doing this
>>>> Well, all the scenarios are quite artificial:
>>>>   - why would you call fillInStackTrace on an exception thrown by the VM?
>>>>   - why would you call setStackTrace at all?
>>>>> you can easily have
>>>>> specified that the extended message is only available with the original
>>>>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>>>>> setStackTrace, then the message reverts to being empty.
>>>> The message is not meant to be a special thing that behaves differently
>>>> from other messages, e.g. sometimes available, sometimes not.
>>>> It ended up being different through requirements during the
>>>> review.
>>>>
>>>>> To me that makes
>>>>> far more sense than having msg2 continue to report the extended info for
>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>
>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>> should be treated any differently to calling setStackTrace(). They
>>>>> should be handled consistently IMO.
>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>> with other exceptions.
>>>> The reason to treat fillInStackTrace differently is that we lost
>>>> information needed to compute it. This is not the case with setStackTrace().
>>>>
>>>> A different solution, the one I would have proposed if I had not
>>>> considered previous comments from reviews, would be to just
>>>> compute the message in the runtime in the call of fillInStackTrace
>>>> before the old stack trace is lost and assign it to the message field.
>>>> This way it would behave similar to all other exceptions. The message
>>>> would just be there ... just that it's computed lazily.
>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>> algorithms (walking the stack) are performed at this point, too.
>>>>
>>>>> We are not talking about all exceptions only about your NPE extended
>>>>> error message.
>>>> Hmm, the inconsistency caused by the code you posted above
>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>> the message might become pointless. Wrt. setStackTrace
>>>> they all behave the same.
>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>> needs to be fixed.
>>>>
>>>> Best regards,
>>>>    Goetz.
>>>>
>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> I implemented an example where wrong stack traces are
>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>> See also the generated output added to a comment in the patch.
>>>>>> If the NPE message text was missing in the second printout, I think
>>>>>> this really would be unexpected.
>>>>>> Please note that the correct message is printed after messing
>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>> message is printed.)
>>>>>>
>>>>>> Best regards,
>>>>>>    Goetz.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> I guess the normal usecase of setStackTrace is the other way around:
>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>> stack trace:
>>>>>>>>
>>>>>>>> try {
>>>>>>>>     var v = a.x;
>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>     npe.setStackTrace(e.getStackTrace());
>>>>>>>>     throw npe;
>>>>>>>> }
>>>>>>>>
>>>>>>>> And not taking an arbitrary stack trace and put it into an exception
>>>>>>>> with existing message.
>>>>>>> Interesting usage.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>    Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>
>>>>>>>>> Hi Goetz,
>>>>>>>>>
>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>> without.
>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>> synchronized
>>>>>>>>> regions.
>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>>>> information about it just seems highly unrealistic.
>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm performance.
>>>>>>>>>
>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>
>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>> The test shows that after setStackTrace still the correct message
>>>>>>>>>> is computed. This is because the algorithm uses Throwable::backtrace
>>>>>>>>>> and not Throwable::stackTrace. Throwable::backtrace is not
>>>>>>>>>> affected by setStackTrace.
>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>> points to.
>>>>>>>>> But you can't adapt the message text - there is no setMessage! If the
>>>>>>>>> message is null and you call setStackTrace() then getMessage(), it makes
>>>>>>>>> no sense to return the extended error message that was associated with
>>>>>>>>> the original stack/backtrace.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>    Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>
>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>
>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then always gets the
>>>>>>>>>>>> right value.
>>>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>>
>>>>>>>>>>>       public String getMessage() {
>>>>>>>>>>>           String message = super.getMessage();
>>>>>>>>>>>           // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>           // will compute a wrong message.
>>>>>>>>>>> +         synchronized(this) {
>>>>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>>>>> !             }
>>>>>>>>>>> +         }
>>>>>>>>>>>           return message;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>>>> information about it just seems highly unrealistic. But the above fixes
>>>>>>>>>>> it simply. Though after looking at comments in the test I would also
>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>
>>>>>>>>>>>           synchronized (this) {
>>>>>>>>>>>               if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>                   backtrace == null) // Test for out of protocol state
>>>>>>>>>>>                   return;
>>>>>>>>>>> +             numStackTracesFilledIn++;
>>>>>>>>>>>               this.stackTrace = defensiveCopy;
>>>>>>>>>>>           }
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>
>>>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>
>>>>>>>>>>> I much prefer the latest version that recognises that only the original
>>>>>>>>>>> stack can be processed.
>>>>>>>>>>>
>>>>>>>>>>> In the test:
>>>>>>>>>>>
>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>>>
>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Its drawback is only that for this code:
>>>>>>>>>>>>         ex = null;
>>>>>>>>>>>>         ex.fillInStackTrace()
>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>
>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is
>>>>>>>>>>>>> filled you don't compute any helpful message anyway.
>>>>>>>>>>>> The internal structure is no longer deleted when the stack trace
>>>>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>    Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis
>>>>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> yes,
>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is initialized,
>>>>>>>>>>>>> i believe that declaring numStackTracesFilledIn volatile is the best way to
>>>>>>>>>>>>> tackle that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Alan


From igor.ignatyev at oracle.com  Tue Jul 14 18:18:12 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 14 Jul 2020 11:18:12 -0700
Subject: RFR [15] : 8249029: clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_defmeth tests
In-Reply-To: <f17d41d6-56b4-453c-72aa-5f4aaf5b571b@oracle.com>
References: <9EC87F8D-662E-44B6-9EA1-F798A74D54B8@oracle.com>
 <f17d41d6-56b4-453c-72aa-5f4aaf5b571b@oracle.com>
Message-ID: <71188264-9C72-49ED-A588-B697E2BF334F@oracle.com>

Thanks David,

pushed to jdk15.

-- Igor

> On Jul 13, 2020, at 7:42 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Looks good!
> 
> Thanks,
> David
> 
> On 9/07/2020 5:43 am, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
>>> 750 lines changed: 0 ins; 376 del; 374 mod;
>> Hi all,
>> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_defmeth tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> effectively, the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/vm/runtime/defmeth  | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`
>> testing: :vmTestbase_vm_defmeth on linux-x64
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249029/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249029
>> Thanks,
>> -- Igor


From igor.ignatyev at oracle.com  Tue Jul 14 18:18:06 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 14 Jul 2020 11:18:06 -0700
Subject: RFR(S) [15] : 8249032 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_sysdict tests
In-Reply-To: <8d3f9d7e-ac4e-6fe7-dd68-fb7bf4bc4b6f@oracle.com>
References: <CF6D1A88-7BDA-42E2-A478-F321EBC3A176@oracle.com>
 <8d3f9d7e-ac4e-6fe7-dd68-fb7bf4bc4b6f@oracle.com>
Message-ID: <2578CE1D-64B3-4629-8849-B896EE945A60@oracle.com>

Thanks David,

pushed to jdk15.

-- Igor

> On Jul 13, 2020, at 8:41 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Looks good!
> 
> Thanks,
> David
> 
> On 14/07/2020 3:16 am, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
>>> 20 lines changed: 0 ins; 20 del; 0 mod;
>> Hi all,
>> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_nsk_sysdict tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> none of sysdict tests need FileInstaller, so the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/sysdict | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
>> testing: :vmTestbase_nsk_sysdict on linux-x64
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249032/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249032
>> Thanks,
>> -- Igor


From igor.ignatyev at oracle.com  Tue Jul 14 18:18:20 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 14 Jul 2020 11:18:20 -0700
Subject: RFR [15] : 8249033 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_metaspace tests
In-Reply-To: <b38faba0-05a6-e60c-acd7-d7bb4869ab11@oracle.com>
References: <BAEA719A-AE50-4D41-9202-AB40EA2370A7@oracle.com>
 <b38faba0-05a6-e60c-acd7-d7bb4869ab11@oracle.com>
Message-ID: <B596EDF2-72DB-4C1B-9266-62CDAD2539FF@oracle.com>

Thanks David,

pushed to jdk15.

-- Igor

> On Jul 13, 2020, at 7:44 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Looks good!
> 
> Thanks,
> David
> 
> On 14/07/2020 3:32 am, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00/
>>> 47 lines changed: 0 ins; 32 del; 15 mod;
>> Hi all,
>> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_metaspace tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> as none of these tests need FileInstaller, the patch is as simple as `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/metaspace/ | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
>> testing: :vmTestbase_vm_metaspace on linux-x64
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249033/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249033
>> Thanks,
>> -- Igor


From coleen.phillimore at oracle.com  Tue Jul 14 19:55:15 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 14 Jul 2020 15:55:15 -0400
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>


Goetz and all,

I have to admit, the version with the counter (06) is more intuitive to
me. It would be even better if it were a boolean. I don't think an
extra 32 bits in an NPE Throwable matters considering the backtrace is a
lot bigger. The NPE Throwable in general shouldn't be a long-lived
object, and there shouldn't be thousands of them.

There seemed to be disagreement on the issue of the message not matching
the stack trace if the code calls setStackTrace(). To me it doesn't seem
like setStackTrace() should be treated the same as fillInStackTrace() at
all, but this latest patch maintains the status quo. If you want to
explore this further, I think you should file a separate RFE, and fix
the reported bug with this patch.

So if I get a vote, I'd pick 06.
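
To make the counter idea concrete, here is a minimal sketch that uses a boolean, as suggested above, instead of an int. It is not taken from webrev 06: LazyMessageException, initialFillDone, computeExtendedMessage() and computeCalls() are hypothetical names, and computeExtendedMessage() merely stands in for getExtendedNPEMessage(), which in the JDK needs the original backtrace.

```java
// Hypothetical stand-in for NullPointerException; none of this is JDK code.
class LazyMessageException extends RuntimeException {

    // Deliberately no field initializers: Throwable's constructor already
    // calls fillInStackTrace(), and initializers would run afterwards and
    // reset whatever that first call recorded.
    private boolean initialFillDone;
    private String extendedMessage;
    private int computeCalls;           // demonstration only

    // Stand-in for the expensive message computation.
    private String computeExtendedMessage() {
        computeCalls++;
        return "extended message (computation #" + computeCalls + ")";
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (initialFillDone && extendedMessage == null) {
            // Second or later call: capture the message before the
            // original backtrace is replaced.
            extendedMessage = computeExtendedMessage();
        }
        initialFillDone = true;         // the first call (from the constructor) stays cheap
        return super.fillInStackTrace();
    }

    @Override
    public synchronized String getMessage() {
        String message = super.getMessage();
        if (message == null) {
            if (extendedMessage == null) {
                extendedMessage = computeExtendedMessage();  // lazy path
            }
            message = extendedMessage;
        }
        return message;
    }

    int computeCalls() {
        return computeCalls;
    }

    public static void main(String[] args) {
        LazyMessageException e = new LazyMessageException();
        System.out.println("computations after construction: " + e.computeCalls());
        e.fillInStackTrace();
        System.out.println("computations after second fill:  " + e.computeCalls());
        System.out.println(e.getMessage());
    }
}
```

The sketch assumes the computation never returns null. As noted in the quoted mail below, getExtendedNPEMessage0() sometimes does, so a real implementation needs some extra state to tell "not computed yet" apart from "computed, nothing to report".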

Thanks,
Coleen

On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> Yes, Coleen, you are right. We must preserve the lazy
> computation, and also reduce overhead on discarded
> exceptions.
>
> And yes, we can do it with a counter:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
> but I would prefer placeholder strings:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> This way we need only one new field.
>
> (I need two placeholders, because the getExtendedNPEMessage0()
> sometimes returns null. If I write null into the extendedMessage field,
> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> time.)
>
> With webrev 07 the overhead on discarded exceptions is basically the
> same as with webrev 05: one additional field, one assignment in fillInStackTrace().
>
> What do you think?
>
> Best regards,
>    Goetz.
>
>
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Tuesday, July 14, 2020 1:55 PM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>> Correction ...
>>
>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>> Hi Goetz,
>>>
>>> Okay ... if I understand your position correctly you are looking at this
>>> as if the extended message is created at the time the NPE is thrown, and
>>> it is an implementation detail that we actually determine it lazily. If
>>> it were eagerly determined then neither fillInStackTrace() nor
>>> setStackTrace() would make any difference to the message - just as with
>>> any other exception message.
>>>
>>> However, the lazy determination of the message causes a problem with
>>> fillInStackTrace() because that call will destroy the original backtrace
>>> needed to produce the original message, and create an incorrect message.
>>> setStackTrace() does not have a similar problem because, simply by the
>>> way the current implementation works it doesn't touch the original
>>> backtrace.
>>>
>>> So you are proposing to only fix the bug that is evident in relation to
>>> fillInStackTrace() by no longer evaluating the extended message if
>>> fillInStackTrace() is called after the NPE was constructed.
>>>
>>> But in doing so you break the illusion that the extended message acts
>>> as-if determined at construction time, because you now effectively clear
>>> it when fillInStackTrace is called.
>>>
>>> My position was that if fillInStackTrace can be seen to clear it, then
>>> setStackTrace (which is logically somewhat equivalent) should also be
>>> seen to clear it.
>>>
>>> Alternatively, add a new field to NPE to cache the extended error
>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>> called. That will continue the illusion that the extended message was
>>> actually set at construction time. No changes needed to setStackTrace()
>>> as we can still lazily compute the extended message.
>>>
>>> Something like:
>>>
>>> private String extendedMessage;
>>>
>>> public synchronized Throwable fillInStackTrace() {
>>>     if (extendedMessage == null) {
>>>         extendedMessage = getExtendedNPEMessage();
>>>     }
>>>     return super.fillInStackTrace();
>>> }
>> Coleen pointed out to me that we can't do it like this because we need
>> the initial fillInStackTrace to be fast and we want the extended message
>> computed lazily. So it will still need a counter so we only do this on
>> the second call.
>>
>>
>>    private String extendedMessage;
>>    private int fillInCount;
>>
>>    public synchronized Throwable fillInStackTrace() {
>>         if (extendedMessage == null && (fillInCount++ == 1)) {
>>             extendedMessage = getExtendedNPEMessage();
>>         }
>>         return super.fillInStackTrace();
>>    }
>>
>> or something to that effect.
>>
>> David
>> -----
>>
>>> public String getMessage() {
>>>     String message = super.getMessage();
>>>     synchronized(this) {
>>>         if (message == null) {
>>>             // This NPE should have an extended message.
>>>             if (extendedMessage == null) {
>>>                 extendedMessage = getExtendedNPEMessage();
>>>             }
>>>             message = extendedMessage;
>>>         }
>>>     }
>>>     return message;
>>> }
>>>
>>> Cheers,
>>> David
>>>
>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>> Hi David,
>>>>
>>>>> Your extended message is only computed when there is no original
>>>>> message.
>>>> Hmm. I would say the extended message is only computed when
>>>> the NPE was raised by the runtime. It happens to never have a
>>>> message so far in these cases.
>>>> But these are two views of the same thing.
>>>>
>>>>> You're concerned about this scenario:
>>>>>
>>>>> catch (NullPointerException npe) {
>>>>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>     npe.setStackTrace(...);
>>>>>     String msg2 = npe.getMessage(); // gets null
>>>>> }
>>>>>
>>>>> While I find it hard to imagine anyone doing this
>>>> Well, all the scenarios are quite artificial:
>>>>   - why would you call fillInStackTrace on an exception thrown by the VM?
>>>>   - why would you call setStackTrace at all?
>>>>> you can easily have
>>>>> specified that the extended message is only available with the original
>>>>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>>>>> setStackTrace, then the message reverts to being empty.
>>>> The message is not meant to be a special thing that behaves differently
>>>> from other messages, e.g. sometimes available, sometimes not.
>>>> It ended up being different through requirements during the
>>>> review.
>>>>
>>>>> To me that makes
>>>>> far more sense than having msg2 continue to report the extended info for
>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>
>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>> should be treated any differently to calling setStackTrace(). They
>>>>> should be handled consistently IMO.
>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>> with other exceptions.
>>>> The reason to treat fillInStackTrace differently is that we lost
>>>> information needed to compute it. This is not the case with setStackTrace().
>>>>
>>>> A different solution, the one I would have proposed if I had not
>>>> considered previous comments from reviews, would be to just
>>>> compute the message in the runtime in the call of fillInStackTrace
>>>> before the old stack trace is lost and assign it to the message field.
>>>> This way it would behave similar to all other exceptions. The message
>>>> would just be there ... just that it's computed lazily.
>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>> algorithms (walking the stack) are performed at this point, too.
>>>>
>>>>> We are not talking about all exceptions only about your NPE extended
>>>>> error message.
>>>> Hmm, the inconsistency caused by the code you posted above
>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>> the message might become pointless. Wrt. setStackTrace
>>>> they all behave the same.
>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>> needs to be fixed.
>>>>
>>>> Best regards,
>>>>    Goetz.
>>>>
>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> I implemented an example where wrong stack traces are
>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>> See also the generated output added to a comment in the patch.
>>>>>> If the NPE message text was missing in the second printout, I think
>>>>>> this really would be unexpected.
>>>>>> Please note that the correct message is printed after messing
>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>> message is printed.)
>>>>>>
>>>>>> Best regards,
>>>>>>     Goetz.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>> I guess the normal usecase of setStackTrace is the other way around:
>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>> stack trace:
>>>>>>>>
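>>>>>>>> try {
>>>>>>>>      int x = a.x;  // a is null here
>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>      // setStackTrace() returns void, so it cannot be chained into the throw
>>>>>>>>      NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>      npe.setStackTrace(e.getStackTrace());
>>>>>>>>      throw npe;
>>>>>>>> }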
>>>>>>>>
>>>>>>>> And not taking an arbitrary stack trace and put it into an exception
>>>>>>>> with existing message.
>>>>>>> Interesting usage.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>      Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>
>>>>>>>>> Hi Goetz,
>>>>>>>>>
>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> True. To ensure you process the original backtrace only, you
>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>> without.
>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>> synchronized regions.
>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>> performance.
>>>>>>>>>
>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>
>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>> The test shows that the correct message is still computed after
>>>>>>>>>> setStackTrace. This is because the algorithm uses
>>>>>>>>>> Throwable::backtrace and not Throwable::stacktrace.
>>>>>>>>>> Throwable::backtrace is not affected by setStackTrace.
>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>> points to.
>>>>>>>>> But you can't adapt the message text - there is no setMessage! If
>>>>>>>>> the message is NULL and you call setStackTrace() then getMessage(),
>>>>>>>>> it makes no sense to return the extended error message that was
>>>>>>>>> associated with the original stack/backtrace.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>       Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
>>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>
>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>
>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>> I see that the test for numStackTracesFilledIn == 1 then always
>>>>>>>>>>>> gets the right value.
>>>>>>>>>>>> But the backtrace must not be changed until I read it in
>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>> True. To ensure you process the original backtrace only, you
>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>>
>>>>>>>>>>>            public String getMessage() {
>>>>>>>>>>>                String message = super.getMessage();
>>>>>>>>>>>                // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>                // will compute a wrong message.
>>>>>>>>>>> +              synchronized(this) {
>>>>>>>>>>> !                  if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>> !                      return getExtendedNPEMessage();
>>>>>>>>>>> !                  }
>>>>>>>>>>> +              }
>>>>>>>>>>>                return message;
>>>>>>>>>>>            }
>>>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>>> But the above fixes it simply. Though after looking at comments in
>>>>>>>>>>> the test I would also suggest that setStackTrace be updated:
>>>>>>>>>>>
>>>>>>>>>>>            synchronized (this) {
>>>>>>>>>>>                if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>                    backtrace == null) // Test for out of protocol state
>>>>>>>>>>>                    return;
>>>>>>>>>>> +              numStackTracesFilledIn++;
>>>>>>>>>>>                this.stackTrace = defensiveCopy;
>>>>>>>>>>>            }
>>>>>>>>>>>        }
>>>>>>>>>>>
>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>
>>>>>>>>>>>> I want to vote again for the much simpler version
>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>
>>>>>>>>>>> I much prefer the latest version that recognises that only the
>>>>>>>>>>> original
>>>>>>>>>>> stack can be processed.
>>>>>>>>>>>
>>>>>>>>>>> In the test:
>>>>>>>>>>>
>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>>>
>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Its only drawback is that for this code:
>>>>>>>>>>>>        ex = null;
>>>>>>>>>>>>        ex.fillInStackTrace()
>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>
>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
>>>>>>>>>>>>>> point.
>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
>>>>>>>>>>>>> trace is filled you don't compute any helpful message anyway.
>>>>>>>>>>>> The internal structure is no longer deleted when the stack trace
>>>>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>        Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>> Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>;
>>>>>>>>>>>>> David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> yes,
>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>>>>>>>>>>>>> initialized, i believe that declaring numStackTracesFilledIn
>>>>>>>>>>>>> volatile is the best way to tackle that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
>>>>>>>>>>>>>> "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads
>>>>>>>>>>>>>> numStackTracesFilledIn without synchronization.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Alan
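
For reference, the synchronization pattern discussed in this thread can be
tried out in isolation. The following is only a minimal sketch, not the JDK
code: SketchException, computeExtendedMessage() and SketchDemo are
hypothetical names, and only the counter name numStackTracesFilledIn is
taken from the webrev discussion. It assumes the counter is bumped in
fillInStackTrace() and that the extended message is only valid while the
counter is exactly 1.

```java
// Illustrative stand-in for the pattern discussed above; this is NOT the
// JDK's NullPointerException. All names except numStackTracesFilledIn are
// hypothetical.
class SketchException extends RuntimeException {
    // No initializer on purpose: Throwable's constructor calls
    // fillInStackTrace() before any subclass field initializer would run,
    // so an explicit "= 0" would wipe out the first increment.
    private int numStackTracesFilledIn;

    SketchException() { super(); }
    SketchException(String message) { super(message); }

    @Override
    public synchronized Throwable fillInStackTrace() {
        numStackTracesFilledIn++;
        return super.fillInStackTrace();
    }

    @Override
    public String getMessage() {
        String message = super.getMessage();
        // Read the counter and compute the message under the same lock that
        // fillInStackTrace() holds, so the backtrace cannot change in between.
        synchronized (this) {
            if (message == null && numStackTracesFilledIn == 1) {
                return computeExtendedMessage();
            }
        }
        return message;
    }

    // Placeholder for the real backtrace-based computation.
    private String computeExtendedMessage() {
        return "extended message for original backtrace";
    }
}

class SketchDemo {
    public static void main(String[] args) {
        SketchException e = new SketchException();
        System.out.println(e.getMessage()); // extended message is available
        e.fillInStackTrace();               // original backtrace is gone now
        System.out.println(e.getMessage()); // falls back to the plain message
    }
}
```

Because every access to the counter happens inside a synchronized region,
the field does not need to be volatile in this sketch.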


From thomas.stuefe at gmail.com  Tue Jul 14 20:14:13 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 14 Jul 2020 22:14:13 +0200
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <MWHPR21MB0511FC402865B075019D00D7B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
 <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw4ZwOHitBtsJLVMMma1D+TVV02xzoWmwd1M2yg-S91DQ@mail.gmail.com>
 <MWHPR21MB0511FC402865B075019D00D7B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <CAA-vtUwAQVDgBFa6hXdWaKezrXB09UR5ci4tqaXAz4zcSgL=4w@mail.gmail.com>

Hi Ludovic,

Okay, I get it now. This sounds good.

The only backward incompatibility I still see arises if third-party code
uses UEH: today, their handlers are ignored if our SEH handler gets
called first; with your proposed solution, what happens depends on who
calls SetUnhandledExceptionFilter() last.

In fact we may now compete for the UnhandledExceptionFilter with the third
party app, like we do on Unix for the signal handler. Well, like on Unix,
we could add a check to the periodic CheckJNI-triggered code to check if
our handler is still in place.
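
To make the "who calls last" point concrete, here is a toy model in plain
Java. It does not call Win32 at all: FilterSlot, Filter, SlotDemo and
installChained() are hypothetical names, and the only property borrowed from
the real API is that setting a new filter hands back the previous one, so
chaining is the responsibility of whoever registers last.

```java
// Toy model only: plain Java objects standing in for the process-wide
// unhandled exception filter slot. Nothing here calls Win32.
interface Filter {
    String handle(String exception);
}

class FilterSlot {
    private Filter current;

    // Like SetUnhandledExceptionFilter, setting a filter returns the old one.
    synchronized Filter set(Filter next) {
        Filter previous = current;
        current = next;
        return previous;
    }

    synchronized Filter get() {
        return current;
    }

    synchronized String dispatch(String exception) {
        return current == null ? "default: " + exception : current.handle(exception);
    }
}

class SlotDemo {
    // A well-behaved late registrant: handles its own exceptions and chains
    // everything else to whatever filter was installed before it.
    static Filter installChained(FilterSlot slot, String prefix) {
        final Filter[] previous = new Filter[1];
        Filter own = ex -> ex.startsWith(prefix)
                ? "third party handled " + ex
                : (previous[0] != null ? previous[0].handle(ex) : "default: " + ex);
        previous[0] = slot.set(own);
        return own;
    }

    public static void main(String[] args) {
        FilterSlot slot = new FilterSlot();
        Filter vmFilter = ex -> "hs_err for " + ex;
        slot.set(vmFilter);

        installChained(slot, "app:");
        System.out.println(slot.dispatch("app: bad pointer"));
        System.out.println(slot.dispatch("vm: crash"));

        // The periodic check mentioned above would notice the replacement.
        System.out.println("VM filter still installed: " + (slot.get() == vmFilter));
    }
}
```

If the late registrant does not chain, the VM filter is never reached, which
is the case a periodic check could at least detect and report.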

But I am convinced now, this really seems better. I think we do not need
RtlAddFunctionTable anymore, since VEH would work for exceptions from
dynamically generated code too, yes?
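
As I understand the proposed split, it could be modelled roughly like this.
Again a toy in plain Java, not Win32 code: DispatchModel, the address range
and the boolean flags are made up for illustration, vectored continue
handlers are left out, and "expected" stands for whatever checks the real
handler would do (safepoint polls, arraycopy stubs, SafeFetch and so on).

```java
// Toy model of the dispatch order: vectored handler first, then SEH frames,
// then the unhandled exception filter. Not Win32 code.
enum Disposition { CONTINUE_EXECUTION, CONTINUE_SEARCH }

class DispatchModel {
    // Hypothetical code cache bounds, standing in for
    // CodeCache::low_bound() and CodeCache::high_bound().
    static final long CODE_CACHE_LOW = 0x1000L;
    static final long CODE_CACHE_HIGH = 0x2000L;

    // The VM's vectored handler only claims exceptions it expects and that
    // were raised inside the code cache; it never writes a crash report.
    static Disposition vectoredHandler(long pc, boolean expected) {
        boolean inCodeCache = pc >= CODE_CACHE_LOW && pc < CODE_CACHE_HIGH;
        return (inCodeCache && expected)
                ? Disposition.CONTINUE_EXECUTION
                : Disposition.CONTINUE_SEARCH;
    }

    static String dispatch(long pc, boolean expected, boolean sehFrameHandles) {
        if (vectoredHandler(pc, expected) == Disposition.CONTINUE_EXECUTION) {
            return "handled by VM vectored handler";
        }
        if (sehFrameHandles) {
            return "handled by SEH frame";
        }
        // Only reached when nobody else took the exception.
        return "unhandled filter: write hs_err";
    }

    public static void main(String[] args) {
        System.out.println(dispatch(0x1800L, true, false));
        System.out.println(dispatch(0x5000L, false, true));
        System.out.println(dispatch(0x5000L, false, false));
    }
}
```

In this model a third-party __try/__except below us still gets its turn,
because the vectored handler passes on everything it does not recognize.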

Cheers, Thomas


On Tue, Jul 14, 2020 at 6:34 PM Ludovic Henry <luhenry at microsoft.com> wrote:

> Hi Thomas,
>
> This is where Windows exception handling and Unix/Linux signals differ. On
> Windows, you have VEH, SEH and Unhandled Exception Handling (I'll call it
> UEH here), while on Unix/Linux, you only have signals.
>
> On Windows, by having this split, you can easily split your exception
> handling into 1. treating expected exceptions
> (EXCEPTION_ILLEGAL_INSTRUCTION on a deoptimization,
> EXCEPTION_ACCESS_VIOLATION in arraycopy stub, etc.), and 2. generating an
> hs_err file on an unexpected exception. You can do 1. with VEH and SEH, and
> 2. with UEH, and that's what I am proposing to do here.
>
> Practically speaking, the existing `topLevelExceptionFilter` would be
> split into two: a `topLevelVectoredExceptionFilter` which would be passed
> to `AddVectoredExceptionHandler`, and a `topLevelUnhandledExceptionHandler`
> which would be passed to `SetUnhandledExceptionFilter`. This
> `topLevelUnhandledExceptionHandler` would contain (more or less) _only_ the
> `VMError::report_and_die`, and the `topLevelVectoredExceptionFilter` would
> contain _no_ `VMError::report_and_die` whatsoever.
>
> Keeping the `VMError::report_and_die` inside VEH would, like you say,
> completely kill any use of SEH, even in external libraries. That would be a
> breaking change, and is then, IMO, not acceptable.
>
> Thanks,
>
> --
> Ludovic
>
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Monday, July 13, 2020 23:29
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi Ludovic,
>
> On Mon, Jul 13, 2020 at 11:55 PM Ludovic Henry <luhenry at microsoft.com> wrote:
> Hi Thomas,
>
> Thank you for your feedback!
>
> Let me answer on some of the cases you mention.
>
> > A) this case exists today. An app getting signals via VEH would have to
> willingly ignore signals for us to get them. This does not change, your
> patch would mean this happens less often, so I do not see a backward
> compatibility problem here.
>
> Exactly.
>
> > B) this is a new case. We would have to ignore signals not meant for us.
> Technically by just ignoring them. Distinguishing this is a bit difficult
> though. Note the subtle difference to Unix: there we have signal chaining,
> so an application which is really really interested in signals for its own
> purposes uses it (e.g. by preloading libjsig) and then we know its handler
> and hand over the signal.
>
> Today, through SEH and RtlAddFunctionTable, we only get a very clear
> subset of exceptions: the one triggered in the code cache. If an exception
> is triggered from a PC outside of this code cache, SEH will not get the
> handler we registered with RtlAddFunctionTable, and we'll simply _not_ call
> into HandleExceptionFromCodeCache (the handler we register with
> RtlAddFunctionTable). That can be trivially reproduced in the VEH by simply
> checking that the PC is between CodeCache::low_bound() and
> CodeCache::high_bound().
>
> This is what you are mentioning with "we only can distinguish our crashes
> from their crashes via crash pc, rejecting any crash not in our code
> (dynamic or static). Well, arguably this would be just how it is today with
> our code scoped via SEH".
>
>
> Not sure we understand each other.
>
> Today we get exceptions from two sides:
> - via SEH, __try/__except, in threads attached to the VM. There the pc is
> either us or third party code below us which did not bother setting up SEH
> for themselves
> - via RtlAddFunctionTable for the code cache, where we specify code cache
> boundaries.
>
> With VEH we would get all exceptions in the process. Including exceptions
> from threads which have never seen the libjvm, or from caller code if the
> hotspot is embedded somewhere.
>
> Under Unix we handle all those crashes by writing hs-err crashlogs, even
> if those crashes are not our responsibility, unless the user has set up
> signal chaining, in which case we hand over any crash signal to the chained
> handler (which for the purpose of clear error reporting is also not perfect).
>
> With VEH I get all exceptions, but have to decide on my own if an
> exception should result in a hs-err file or handed to the next exception
> handler. The only way I can see is by examining the pc - iterate through
> all our binaries and compare the pc with their text segments, and also
> check the code cache.
>
> I may miss something here.
>
> > With the added safety net of the unhandled exception filter (what
> happens if multiple parties call this?).
>
> Here, Unhandled Exception Handling predates VEH and it doesn't integrate
> chaining. The API is similar to signals on Linux/Unix: the last one to
> register has to make sure to save the previous one and to call/chain it
> accordingly.
>
> > My only very small personal gripe would be that I always liked how I can
> quickly use SEH to check if a pointer is valid without disturbing anyone.
> But within the hotspot at least I can just as well use SafeFetch.
>
> Nothing from the Win32 API stops you from mix-and-matching VEH and SEH. If
> you want to do a `__try { val = *ptr; } __except
> (EXCEPTION_EXECUTE_HANDLER) { success = false; }` in some C++ code (in vm
> or native), nothing stops you from doing so. My understanding of the
> exception handler logic in the OpenJDK on Windows is that the accepted
> EXCEPTION_ACCESS_VIOLATION in java, vm, or native code is limited to a
> clear subset, and anything outside of these known cases is quickly treated
> as "an exception we cannot handle". SafeFetch is such a case where the
> instructions potentially triggering the EXCEPTION_ACCESS_VIOLATION are
> matched against by the exception handler.
>
>
> Well, in your example, VEH would have preference and get the exception
> first; in our handler we recognize the exception as not allowed, hence a
> crash, and write a hs-err file. My success=false; handler would never
> execute.
>
> But I admit this is really a minor point. I also dimly remember seeing
> some win32 API to check pointers for readability, so maybe using SEH for
> these things is not necessary anyway.
>
> Thanks, Thomas
>
> --
> Ludovic
>
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Saturday, July 11, 2020 23:08
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi Ludovic,
>
> sorry for the delay, and thanks for the extensive answer. Please find
> remarks inline.
>
> On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com> wrote:
> Hi Thomas,
>
> It seems that the problem you're describing stems from the current
> exception handler treating two cases: 1. any exception knowingly triggered
> by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy
> stubs, stackoverflow in Java code), and 2. exceptional cases leading to
> crashes (ex: uncaught C++ exception, an access violation in VM or
> native/external code, etc.). There is the same problem on Unix because
> there is only one system (signal handling) for both cases. Fortunately,
> Windows proposes different systems, each with its own advantages.
>
> The order in which Windows invokes each of these systems is the following:
>  1. Vectored Exception Handler registered with
> `AddVectoredExceptionHandler`
>  2. Structured Exception Handler
>  3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
>  4. Unhandled Exception Handler
>
> Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler
> registered with `RtlAddFunctionTable`. This handler does both the
> Java-triggered exceptions and any other exceptions.
>
> Now, from the point of view of an external library or application
> embedding the JVM inside their own process, they still have all the above
> options to register an exception handler, irrespective of how Hotspot does
> it. This creates the following cases:
>  - If the application uses VEH: they will (with Hotspot using SEH) be
> called _before_ Hotspot's exception handler and will then have to be aware
> that they may get exceptions unrelated to them and will have to ignore them
> accordingly
>  - If the application uses SEH: they will only get exceptions related to
> their code area
>
> If Hotspot is to use VEH, an exception would play out as follows:
>  - If the application uses VEH and their registered handler executes
> _before_ Hotspot's one: same as above
>  - If the application uses VEH and their registered handler executes
> _after_ Hotspot's one: Hotspot has to make sure that the exception was
> triggered by Hotspot and ignore it otherwise (a range check on the PC can
> be used here to emulate how it's done with RtlAddFunctionTable)
>  - If the application uses SEH: the same case as where the
> application's handler executes _after_ Hotspot's one
>
> This all assumes that Hotspot's VEH handler doesn't trigger a crash report
> (VMError::report_and_die) on any exception it doesn't know how to handle.
> The simplest way to do that is simply _not_ to do it in Hotspot's VEH
> handler, and to do it by registering a Win32 Unhandled Exception Handler
> (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when
> no other exception handler treated the exception (by returning
> EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it
> means the application is "toast" and not in a runnable state anymore, which
> fits nicely with the purpose of the Hotspot crash report.
>
>
> Okay, if I get this correctly:
>
> Today:
>   App uses VEH - they execute before us and have to handle this correctly
> (->A)
>   App uses SEH - no interaction
>
> With proposed switch:
>   App uses VEH - they may or may not execute before us. If they come
> before us: (->A). If they come after us -> (B)
>   App uses SEH -> (B)
>
> A) this case exists today. An app getting signals via VEH would have to
> willingly ignore signals for us to get them. This does not change, your
> patch would mean this happens less often, so I do not see a backward
> compatibility problem here.
>
> B) this is a new case. We would have to ignore signals not meant for us.
> Technically by just ignoring them. Distinguishing this is a bit difficult
> though. Note the subtle difference to Unix: there we have signal chaining,
> so an application which is really really interested in signals for its own
> purposes uses it (e.g. by preloading libjsig) and then we know its handler
> and hand over the signal.
>
> On Windows we do not know this (?), we only can distinguish our crashes
> from their crashes via crash pc, rejecting any crash not in our code
> (dynamic or static). Well, arguably this would be just how it is today with
> our code scoped via SEH. With the added safety net of the unhandled
> exception filter (what happens if multiple parties call this?).
>
> Okay this seems safe enough to try it at least.
>
> My only very small personal gripe would be that I always liked how I can
> quickly use SEH to check if a pointer is valid without disturbing anyone.
> But within the hotspot at least I can just as well use SafeFetch.
>
> Thank you,
>
> Thomas
>
> I hope this sheds some light on possible solutions ahead of us.
>
> Thank you,
>
> --
> Ludovic
>
> [1]
> https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
> ________________________________________
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Sunday, June 21, 2020 05:55
> To: Ludovic Henry
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR(S): Use Vectored Exception Handling on Windows
>
> Hi,
>
> We at SAP had used VEH in our own Windows Itanium port and I dimly
> remember it being a source of problems. That is many years ago and I
> realize that it is not worth much, but it makes me a bit apprehensive of this
> change.
>
> The main problem I see is that this will be an observable change in
> behavior.
>
> We currently use SEH, so our error handler is guaranteed to be invoked
> only for exceptions from within our own code. With VEH we now follow the
> Unix way of things and suddenly our error handler becomes a global resource.
>
> We will suddenly be invoked for crashes outside the VM, e.g. in foreign
> launcher code atop of us or in non-java side threads, which will generate
> whole new classes of hs-err files for crashes the VM is not responsible
> for. Which are then perceived as VM crashes and sent to us vendors instead
> of going to the right people. This is the way it works on Unix today, and
> it is a constant annoyance and increases our support workload.
>
> We also may introduce new problems since suddenly we interfere with
> application exception handling. At the very least, we have to think up a
> scheme for signal chaining (both ways: VM->foreign code and foreign
> code->VM). For the first, we probably need some form of libjsig preloading,
> or some other way to divert signal handler installation. That would also need
> cooperation from the application programmers and/or operators.
>
> Matters are even more complicated, since foreign code may use SEH instead
> of VEH, so what happens if a JNI library below me wants to use SEH, does
> that still work?
>
> I feel this should not be rushed. Even if considered "brittle", SEH has served
> us well, I do not recall many problems in the past aside from having to add
> the occasional __try/__except. Are there actual bugs we have to solve?
>
> Lastly, personally I always found SEH quite a neat concept, and one of the
> few places where Windows was superior to Unix :)
>
> Thanks, Thomas
>
>
> On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com> wrote:
> Hello,
>
> First, some context and definitions:
> - when talking about exceptions here, I'm talking about Win32 exceptions,
> which are equivalent to signals on Linux and other Unix systems; I am _not_
> talking about Java exceptions.
> - an explanation of an _exception filter_ can be found at
> https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019
> There is only a limited concept of that in Java with type-based exception
> filters (ex: `try { ... } catch (IOException ioe) { ... } catch (Throwable
> t) { ... }`).
> - in Win32, there exist two exception handling mechanisms:
>   - Structured Exception Handling: the historical one, based on `__try {}
> __except (...) {}`
>   - Vectored Exception Handling: introduced in Windows XP / Windows Server
> 2003, much more similar to signals on Linux
>
> These exception handling mechanisms are used to catch any exceptions like
> Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These
> exceptions are equivalent to signals on Linux and are therefore core to many
> mechanisms in the OpenJDK.
>
> Today, the OpenJDK uses Structured Exception Handling to catch such
> exceptions, creating several requirements. First, all code that might
> trigger an exception on purpose (like an Access Violation / SIGSEGV in the
> arraycopy stub) needs to be wrapped in a __try / __except. Because it's
> not feasible to wrap every single instance of such code, these __try /
> __except are put at the topmost function of any thread started by
> the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable`
> is used to simulate the use of __try / __except for a specific code area.
> This function needs platform specific code with the generation of a
> trampoline that calls the exception filter declared in the runtime. It's
> also meant to be used as a one to one mapping with try / catch in user
> code, and not as a "catch all the exceptions in this code area". Third,
> Structured Exception Handling expects to be able to unwind the stack.
> However, because Hotspot doesn't guarantee the usage of the
> platform-specific ABI internally, the platform-specific unwinder might
> break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies
> on the assumption that Structured Exception Handling never tries to unwind
> the stack (which it would fail to do because of the different ABI) before
> calling the registered exception filter.
>
> When we discussed this with Windows Kernel maintainers, they strongly
> discouraged this approach as brittle and recommended Vectored Exception
> Handling as the better solution. Vectored Exception Handling is
> conceptually much more similar to signal / sigaction on Linux and other
> Unix systems. It will catch all exceptions happening across the process,
> and no __try / __except will be required. It also removes the requirement
> to call `RtlAddFunctionTable`. The exception filter then behaves like a signal
> handler with the possibility to modify the registers at will, modifying the
> PC to step over an instruction after an expected Access Violation for
> example. Vectored Exception Handling is also already used for AOT code.
>
> The changes can be found at
> http://cr.openjdk.java.net/~burban/ludovic_vecexc/.
> As I am not an author, I have not created a corresponding bug in JBS.
>
> Thank you, and looking forward to your feedback!
>
> --
> Ludovic
>
>
>

From mandy.chung at oracle.com  Tue Jul 14 20:17:11 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Tue, 14 Jul 2020 13:17:11 -0700
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
Message-ID: <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>

fillInStackTrace and setStackTrace replace the stack trace of an NPE
instance. Therefore I think both should behave consistently for any NPE
instance, with or without an explicit message.
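
As a small illustration of that first point (a sketch, not part of any
webrev): both methods replace what getStackTrace() reports, and neither
touches an explicitly supplied message. The class and helper names below
are made up for the example; only the Throwable and StackTraceElement
calls are real JDK API.

```java
// Illustration only: StackTraceReplacementDemo, create(), refill() and
// replace() are hypothetical names; only the java.lang calls are real.
class StackTraceReplacementDemo {

    // An NPE with an explicit message, so no extended message is involved.
    static NullPointerException create() {
        return new NullPointerException("explicit message");
    }

    // Re-captures the stack trace at this call site.
    static void refill(Throwable t) {
        t.fillInStackTrace();
    }

    // Installs a caller-supplied, single-frame stack trace.
    static void replace(Throwable t) {
        t.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("Synthetic", "synthetic", "Synthetic.java", 1)
        });
    }

    // Name of the method in the top frame of the recorded stack trace.
    static String topFrame(Throwable t) {
        return t.getStackTrace()[0].getMethodName();
    }

    public static void main(String[] args) {
        NullPointerException npe = create();
        System.out.println(topFrame(npe) + " / " + npe.getMessage());
        refill(npe);   // top frame now comes from refill()
        System.out.println(topFrame(npe) + " / " + npe.getMessage());
        replace(npe);  // top frame now is the synthetic element
        System.out.println(topFrame(npe) + " / " + npe.getMessage());
    }
}
```

The explicit message stays the same across all three printouts, which is
the behaviour the extended message is meant to mimic.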

For webrev.06/webrev.07, this would behave as if the NPE was created with an
extended message which cannot be altered once constructed. I expect
that it'd be rare to see an NPE instance thrown by the VM (not explicitly
constructed) whose stack trace is replaced. So I'm fine with this
approach.
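
For readers who have not opened the webrevs, the placeholder idea that
Goetz describes for webrev.07 in the quoted mail can be modelled roughly
as follows. This is a toy model under stated assumptions, not the webrev's
code: PlaceholderModel, MUST_COMPUTE, NO_MESSAGE, fillIn() and compute()
are hypothetical names, a plain string stands in for Throwable's internal
backtrace, and the rule that analysis "fails" for backtraces starting with
"?" is invented purely to exercise the null case.

```java
// Toy model only; none of these names come from the JDK or the webrev.
class PlaceholderModel {
    // Unique sentinel instances, compared by identity (==), never by content.
    private static final String MUST_COMPUTE = new String("must compute");
    private static final String NO_MESSAGE = new String("no message");

    private String backtrace;        // stand-in for Throwable's backtrace
    private String extendedMessage;  // the single new field; null until first fillIn()

    PlaceholderModel(String origin) {
        fillIn(origin);              // like Throwable's constructor
    }

    synchronized void fillIn(String newBacktrace) {
        if (extendedMessage == null) {
            // First call: one cheap assignment, nothing is computed.
            extendedMessage = MUST_COMPUTE;
        } else if (extendedMessage == MUST_COMPUTE) {
            // The original backtrace is about to be lost: compute now.
            extendedMessage = compute();
        }
        backtrace = newBacktrace;
    }

    synchronized String getMessage() {
        if (extendedMessage == MUST_COMPUTE) {
            extendedMessage = compute();
        }
        return extendedMessage == NO_MESSAGE ? null : extendedMessage;
    }

    // Stand-in for the VM's analysis, which may fail and yield no message.
    private String compute() {
        if (backtrace.startsWith("?")) {
            return NO_MESSAGE;       // second placeholder instead of null
        }
        return "null dereference at " + backtrace;
    }

    public static void main(String[] args) {
        PlaceholderModel m = new PlaceholderModel("A.f");
        m.fillIn("B.g");
        System.out.println(m.getMessage());
    }
}
```

The second placeholder is what keeps a failed computation from being
mistaken for "never filled in": if compute() stored null, the next
fillIn() would reset the field to MUST_COMPUTE and a later getMessage()
would describe the wrong backtrace.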

webrev.06 is okay, although I think checking Throwable::backtrace != null is
clearer, as I suggested.

Mandy

On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
>
> Goetz and all,
>
> I have to admit, the version with the counter, 06, is more intuitive to
> me. It would be even better if it was a boolean. I don't think an
> extra 32 bits in an NPE Throwable matters considering the backtrace is
> a lot bigger. The NPE Throwable in general shouldn't be a long-lived
> object, and there shouldn't be thousands of them.
>
> There seemed to be disagreement on the issue of the message not
> matching the stack trace if the code calls setStackTrace(). To me it
> doesn't seem like it should be treated the same as fillInStackTrace()
> at all, but this latest patch maintains the status quo. If you want to
> explore this further, I think you should file a separate RFE, and fix
> the reported bug with this patch.
>
> So if I get a vote, I'd pick 06.
>
> Thanks,
> Coleen
>
> On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
>> Hi,
>>
>> Yes, Coleen, you are right. We must preserve the lazy
>> computation, and also reduce overhead on discarded
>> exceptions.
>>
>> And yes, we can do it with a counter:
>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/ 
>>
>> but I would prefer placeholder strings:
>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/ 
>>
>> This way we need only one new field.
>>
>> (I need two placeholders, because the getExtendedNPEMessage0()
>> sometimes returns null. If I write null into the extendedMessage field,
>> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
>> time.)
>>
>> With webrev 07 the overhead on discarded exceptions is basically the
>> same as with webrev 05: one additional field, one assignment in 
>> fillInStackTrace().
>>
>> What do you think?
>>
>> Best regards,
>>   Goetz.
>>
>>
>>
>>
>>> -----Original Message-----
>>> From: David Holmes <david.holmes at oracle.com>
>>> Sent: Tuesday, July 14, 2020 1:55 PM
>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>> <hotspot-runtime-dev at openjdk.java.net>
>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>> after calling fillInStackTrace
>>>
>>> Correction ...
>>>
>>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>>> Hi Goetz,
>>>>
>>>> Okay ... if I understand your position correctly you are looking at this
>>>> as if the extended message is created at the time the NPE is thrown, and
>>>> it is an implementation detail that we actually determine it lazily. If
>>>> it were eagerly determined then neither fillInStackTrace() nor
>>>> setStackTrace() would make any difference to the message - just as with
>>>> any other exception message.
>>>>
>>>> However, the lazy determination of the message causes a problem with
>>>> fillInStackTrace() because that call will destroy the original backtrace
>>>> needed to produce the original message, and create an incorrect message.
>>>> setStackTrace() does not have a similar problem because, simply by the
>>>> way the current implementation works, it doesn't touch the original
>>>> backtrace.
>>>>
>>>> So you are proposing to only fix the bug that is evident in relation to
>>>> fillInStackTrace() by no longer evaluating the extended message if
>>>> fillInStackTrace() is called after the NPE was constructed.
>>>>
>>>> But in doing so you break the illusion that the extended message acts
>>>> as-if determined at construction time, because you now effectively clear
>>>> it when fillInStackTrace is called.
>>>>
>>>> My position was that if fillInStackTrace can be seen to clear it, then
>>>> setStackTrace (which is logically somewhat equivalent) should also be
>>>> seen to clear it.
>>>>
>>>> Alternatively, add a new field to NPE to cache the extended error
>>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>>> called. That will continue the illusion that the extended message was
>>>> actually set at construction time. No changes needed to setStackTrace()
>>>> as we can still lazily compute the extended message.
>>>>
>>>> Something like:
>>>>
>>>> private String extendedMessage;
>>>>
>>>> public synchronized Throwable fillInStackTrace() {
>>>>     if (extendedMessage == null) {
>>>>         extendedMessage = getExtendedNPEMessage();
>>>>     }
>>>>     return super.fillInStackTrace();
>>>> }
>>> Coleen pointed out to me that we can't do it like this because we need
>>> the initial fillInStackTrace to be fast and we want the extended message
>>> computed lazily. So it will still need a counter so we only do this on
>>> the second call.
>>>
>>>
>>>    private String extendedMessage;
>>>    private int fillInCount;
>>>
>>>    public synchronized Throwable fillInStackTrace() {
>>>         if (extendedMessage == null && (fillInCount++ == 1)) {
>>>             extendedMessage = getExtendedNPEMessage();
>>>         }
>>>         return super.fillInStackTrace();
>>>    }
>>>
>>> or something to that effect.
>>>
>>> David
>>> -----
>>>
>>>> public String getMessage() {
>>>>     String message = super.getMessage();
>>>>     synchronized(this) {
>>>>         if (message == null) {
>>>>             // This NPE should have an extended message.
>>>>             if (extendedMessage == null) {
>>>>                 extendedMessage = getExtendedNPEMessage();
>>>>             }
>>>>             message = extendedMessage;
>>>>         }
>>>>     }
>>>>     return message;
>>>> }
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>>> Hi David,
>>>>>
>>>>>> Your extended message is only computed when there is no original
>>>>>> message.
>>>>> Hmm. I would say the extended message is only computed when
>>>>> the NPE was raised by the runtime. It happens to never have a
>>>>> message so far in these cases.
>>>>> But these are two views of the same thing.
>>>>>
>>>>>> You're concerned about this scenario:
>>>>>>
>>>>>> catch (NullPointerException npe) {
>>>>>>      String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>>      npe.setStackTrace(...);
>>>>>>      String msg2 = npe.getMessage(); // gets null
>>>>>> }
>>>>>>
>>>>>> While I find it hard to imagine anyone doing this
>>>>> Well, all the scenarios are quite artificial:
>>>>>    - why would you call fillInStackTrace on an exception thrown by the VM?
>>>>>    - why would you call setStackTrace at all?
>>>>>> you can easily have
>>>>>> specified that the extended message is only available with the original
>>>>>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>>>>>> setStackTrace, then the message reverts to being empty.
>>>>> The message is not meant to be a special thing that behaves differently
>>>>> from other messages, like sometimes being available and sometimes not.
>>>>> It ended up being different through requirements during the
>>>>> review.
>>>>>
>>>>>> To me that makes
>>>>>> far more sense than having msg2 continue to report the extended info for
>>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>>
>>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>>> should be treated any differently to calling setStackTrace(). They
>>>>>> should be handled consistently IMO.
>>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>>> with other exceptions.
>>>>> The reason to treat fillInStackTrace differently is that we lost
>>>>> information needed to compute the message. This is not the case with
>>>>> setStackTrace().
>>>>>
>>>>> A different solution, the one I would have proposed if I had not
>>>>> considered previous comments from reviews, would be to just
>>>>> compute the message in the runtime in the call of fillInStackTrace
>>>>> before the old stack trace is lost and assign it to the message field.
>>>>> This way it would behave similar to all other exceptions. The message
>>>>> would just be there ... just that it's computed lazily.
>>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>>> algorithms (walking the stack) are performed at this point, too.
>>>>>
>>>>>> We are not talking about all exceptions only about your NPE extended
>>>>>> error message.
>>>>> Hmm, the inconsistency caused by the code you posted above
>>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>>> the message might become pointless. Wrt. setStackTrace
>>>>> they all behave the same.
>>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>>> needs to be fixed.
>>>>>
>>>>> Best regards,
>>>>>     Goetz.
>>>>>
>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> I implemented an example where wrong stack traces are
>>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>>> See also the generated output added to a comment in the patch.
>>>>>>> If the NPE message text was missing in the second printout, I think
>>>>>>> this really would be unexpected.
>>>>>>> Please note that the correct message is printed after messing
>>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>>> message is printed.)
>>>>>>>
>>>>>>> Best regards,
>>>>>>>      Goetz.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>> I guess the normal use case of setStackTrace is the other way around:
>>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>>> stack trace:
>>>>>>>>>
>>>>>>>>> try {
>>>>>>>>>       Object x = a.x;  // a is null
>>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>>       NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>>       // setStackTrace returns void, so it cannot be chained into the throw
>>>>>>>>>       npe.setStackTrace(e.getStackTrace());
>>>>>>>>>       throw npe;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> And not taking an arbitrary stack trace and putting it into an
>>>>>>>>> exception with an existing message.
>>>>>>>> Interesting usage.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>       Goetz.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>
>>>>>>>>>> Hi Goetz,
>>>>>>>>>>
>>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>>> without.
>>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>>> synchronized regions.
>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>>> performance.
>>>>>>>>>>
>>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>>
>>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>> The test shows that after setStackTrace the correct message
>>>>>>>>>>> is still computed. This is because the algorithm uses
>>>>>>>>>>> Throwable::backtrace and not Throwable::stackTrace.
>>>>>>>>>>> Throwable::backtrace is not affected by setStackTrace.
>>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>>> points to.
>>>>>>>>>> But you can't adapt the message text - there is no setMessage! If
>>>>>>>>>> the message is null and you call setStackTrace() then getMessage(),
>>>>>>>>>> it makes no sense to return the extended error message that was
>>>>>>>>>> associated with the original stack/backtrace.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>        Goetz.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then always
>>>>>>>>>>>>> gets the right value.
>>>>>>>>>>>>> But the backtrace must not be changed until I read it in
>>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>>> I check numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>>>
>>>>>>>>>>>>             public String getMessage() {
>>>>>>>>>>>>                 String message = super.getMessage();
>>>>>>>>>>>>                 // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>>                 // will compute a wrong message.
>>>>>>>>>>>> +         synchronized(this) {
>>>>>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>>>>>> !             }
>>>>>>>>>>>> +         }
>>>>>>>>>>>>                 return message;
>>>>>>>>>>>>             }
>>>>>>>>>>>>
>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>>>> But the above fixes it simply. Though after looking at comments
>>>>>>>>>>>> in the test I would also suggest that setStackTrace be updated:
>>>>>>>>>>>>
>>>>>>>>>>>>              synchronized (this) {
>>>>>>>>>>>>                   if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>>                       backtrace == null) // Test for out of protocol state
>>>>>>>>>>>>                       return;
>>>>>>>>>>>> +           numStackTracesFilledIn++;
>>>>>>>>>>>>                   this.stackTrace = defensiveCopy;
>>>>>>>>>>>>              }
>>>>>>>>>>>>          }
>>>>>>>>>>>>
>>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>>
>>>>>>>>>>>>> I want to vote again for the much simpler version
>>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>>
>>>>>>>>>>>> I much prefer the latest version that recognises that only the
>>>>>>>>>>>> original stack can be processed.
>>>>>>>>>>>>
>>>>>>>>>>>> In the test:
>>>>>>>>>>>>
>>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>>>>
>>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>>> -----
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Its only drawback is that for this code:
>>>>>>>>>>>>>         ex = null;
>>>>>>>>>>>>>         ex.fillInStackTrace()
>>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at
>>>>>>>>>>>>>>> some point.
>>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
>>>>>>>>>>>>>> trace is filled you don't
>>>>>>>>>>>>>> compute any helpful message anyway.
>>>>>>>>>>>>> The internal structure is no longer deleted when the stack
>>>>>>>>>>>>> trace is filled. So the message can be computed later, too.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>         Goetz.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis
>>>>>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>; David Holmes
>>>>>>>>>>>>>> <david.holmes at oracle.com>
>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> yes,
>>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>>>>>>>>>>>>>> initialized, i believe that declaring numStackTracesFilledIn
>>>>>>>>>>>>>> volatile is the best way to tackle that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis"
>>>>>>>>>>>>>>> <christoph.dreis at freenet.de>
>>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
>>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Alan
>


From coleen.phillimore at oracle.com  Tue Jul 14 20:27:05 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 14 Jul 2020 16:27:05 -0400
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
Message-ID: <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>


On 7/14/20 4:17 PM, Mandy Chung wrote:
> fillInStackTrace and setStackTrace replace the stack trace of an NPE
> instance. Therefore I think both should behave consistently for any
> NPE instance, with or without an explicit message.
>
> For webrev.06/webrev.07, this would behave as if the NPE was created with
> an extended message which cannot be altered once constructed. I
> expect that it'd be rare to see an NPE instance thrown by the VM (not
> explicitly constructed) whose stack trace is replaced. So I'm
> fine with this approach.
>
> webrev.06 is okay, although I think checking Throwable::backtrace != null
> is clearer, as I suggested.

I like that version 06 isolates knowledge to NullPointerException.java
and doesn't have to know what the expected value of backtrace is in the
super class. I maintain my vote for 06.
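
To make the counter idea concrete for anyone skimming the thread, here is
a toy model in the spirit of David's corrected snippet quoted below. It is
a sketch under stated assumptions and not the code from webrev 06:
CounterModel, fillIn(), describe() and the computations field are
hypothetical names, and a plain string stands in for the backtrace kept
by Throwable. All state lives in the subclass-like model itself, with no
check on the value of the super class's backtrace.

```java
// Toy model only; none of these names come from the JDK or the webrev.
class CounterModel {
    private String backtrace;        // stand-in for Throwable's backtrace
    private String extendedMessage;  // cached once computed
    private int fillInCount;         // how often fillIn() has run
    int computations;                // exposed for the demonstration only

    CounterModel(String origin) {
        fillIn(origin);              // like Throwable's constructor
    }

    synchronized void fillIn(String newBacktrace) {
        // Second call: the original backtrace is about to be lost, so
        // compute the message now. The first call stays cheap.
        if (fillInCount == 1 && extendedMessage == null) {
            extendedMessage = describe();
        }
        fillInCount++;
        backtrace = newBacktrace;
    }

    synchronized String getMessage() {
        // Lazy path: only the original backtrace has been filled in.
        if (extendedMessage == null && fillInCount == 1) {
            extendedMessage = describe();
        }
        return extendedMessage;
    }

    // Stand-in for the VM's (comparatively costly) message computation.
    private String describe() {
        computations++;
        return "null dereference at " + backtrace;
    }

    public static void main(String[] args) {
        CounterModel m = new CounterModel("A.f");
        m.fillIn("B.g");
        System.out.println(m.getMessage() + " (computed " + m.computations + "x)");
    }
}
```

An exception that is created and discarded never pays for describe(), and
one whose trace is refilled keeps the message that belongs to the original
backtrace.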

Thanks, I was trying to understand the fillInStackTrace vs. 
setStackTrace issue, and your description makes sense to me.

Coleen

>
> Mandy
>
> On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
>>
>> Goetz and all,
>>
>> I have to admit, the version with the counter, 06, is more intuitive to
>> me. It would be even better if it was a boolean. I don't think an
>> extra 32 bits in an NPE Throwable matters considering the backtrace
>> is a lot bigger. The NPE Throwable in general shouldn't be a
>> long-lived object, and there shouldn't be thousands of them.
>>
>> There seemed to be disagreement on the issue of the message not
>> matching the stack trace if the code calls setStackTrace(). To me it
>> doesn't seem like it should be treated the same as fillInStackTrace()
>> at all, but this latest patch maintains the status quo. If you want
>> to explore this further, I think you should file a separate RFE, and
>> fix the reported bug with this patch.
>>
>> So if I get a vote, I'd pick 06.
>>
>> Thanks,
>> Coleen
>>
>> On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> Yes, Coleen, you are right. We must preserve the lazy
>>> computation, and also reduce overhead on discarded
>>> exceptions.
>>>
>>> And yes, we can do it with a counter:
>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/ 
>>>
>>> but I would prefer placeholder strings:
>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/ 
>>>
>>> This way we need only one new field.
>>>
>>> (I need two placeholders, because the getExtendedNPEMessage0()
>>> sometimes returns null. If I write null into the extendedMessage field,
>>> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
>>> time.)
>>>
>>> With webrev 07 the overhead on discarded exceptions is basically the
>>> same as with webrev 05: one additional field, one assignment in 
>>> fillInStackTrace().
>>>
>>> What do you think?
>>>
>>> Best regards,
>>>   Goetz.
>>>
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: David Holmes <david.holmes at oracle.com>
>>>> Sent: Tuesday, July 14, 2020 1:55 PM
>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>>>> after calling fillInStackTrace
>>>>
>>>> Correction ...
>>>>
>>>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>>>> Hi Goetz,
>>>>>
>>>>> Okay ... if I understand your position correctly you are looking at this
>>>>> as if the extended message is created at the time the NPE is thrown, and
>>>>> it is an implementation detail that we actually determine it lazily. If
>>>>> it were eagerly determined then neither fillInStackTrace() nor
>>>>> setStackTrace() would make any difference to the message - just as with
>>>>> any other exception message.
>>>>>
>>>>> However, the lazy determination of the message causes a problem with
>>>>> fillInStackTrace() because that call will destroy the original backtrace
>>>>> needed to produce the original message, and create an incorrect message.
>>>>> setStackTrace() does not have a similar problem because, simply by the
>>>>> way the current implementation works, it doesn't touch the original
>>>>> backtrace.
>>>>>
>>>>> So you are proposing to only fix the bug that is evident in relation to
>>>>> fillInStackTrace() by no longer evaluating the extended message if
>>>>> fillInStackTrace() is called after the NPE was constructed.
>>>>>
>>>>> But in doing so you break the illusion that the extended message acts
>>>>> as-if determined at construction time, because you now effectively clear
>>>>> it when fillInStackTrace is called.
>>>>>
>>>>> My position was that if fillInStackTrace can be seen to clear it, then
>>>>> setStackTrace (which is logically somewhat equivalent) should also be
>>>>> seen to clear it.
>>>>>
>>>>> Alternatively, add a new field to NPE to cache the extended error
>>>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>>>> called. That will continue the illusion that the extended message was
>>>>> actually set at construction time. No changes needed to setStackTrace()
>>>>> as we can still lazily compute the extended message.
>>>>>
>>>>> Something like:
>>>>>
>>>>> private String extendedMessage;
>>>>>
>>>>> public synchronized Throwable fillInStackTrace() {
>>>>>     if (extendedMessage == null) {
>>>>>         extendedMessage = getExtendedNPEMessage();
>>>>>     }
>>>>>     return super.fillInStackTrace();
>>>>> }
>>>> Coleen pointed out to me that we can't do it like this because we need
>>>> the initial fillInStackTrace to be fast and we want the extended 
>>>> message
>>>> computed lazily. So it will still need a counter so we only do this on
>>>> the second call.
>>>>
>>>>
>>>>    private String extendedMessage;
>>>>    private int fillInCount;
>>>>
>>>>    public synchronized Throwable fillInStackTrace() {
>>>>        if (extendedMessage == null && (fillInCount++ == 1)) {
>>>>            extendedMessage = getExtendedNPEMessage();
>>>>        }
>>>>        return super.fillInStackTrace();
>>>>    }
>>>>
>>>> or something to that effect.
>>>>
>>>> David
>>>> -----
>>>>
>>>>> public String getMessage() {
>>>>>     String message = super.getMessage();
>>>>>     synchronized(this) {
>>>>>         if (message == null) {
>>>>>             // This NPE should have an extended message.
>>>>>             if (extendedMessage == null) {
>>>>>                 extendedMessage = getExtendedNPEMessage();
>>>>>             }
>>>>>             message = extendedMessage;
>>>>>         }
>>>>>     }
>>>>>     return message;
>>>>> }
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>>>> Hi David,
>>>>>>
>>>>>>> Your extended message is only computed when there is no original
>>>>>>> message.
>>>>>> Hmm. I would say the extended message is only computed when
>>>>>> The NPE was raised by the runtime. It happens to never have a
>>>>>> message so far in these cases.
>>>>>> But these are two views of the same thing.
>>>>>>
>>>>>>> You're concerned about this scenario:
>>>>>>>
>>>>>>> catch (NullPointerException npe) {
>>>>>>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>>>     npe.setStackTrace(...);
>>>>>>>     String msg2 = npe.getMessage(); // gets null
>>>>>>> }
>>>>>>>
>>>>>>> While I find it hard to imagine anyone doing this
>>>>>> Well, all the scenarios are quite artificial:
>>>>>>   - why would you call fillInStackTrace on an exception thrown by the VM?
>>>>>>   - why would you call setStackTrace at all?
>>>>>>> you can easily have
>>>>>>> specified that the extended message is only available with the 
>>>>>>> original
>>>>>>> stacktrace, hence after a second call to fillInStackTrace, or a 
>>>>>>> call to
>>>>>>> setStackTrace, then the message reverts to being empty.
>>>>>> The message is not meant to be a special thing that behaves
>>>>>> differently from other messages, e.g. sometimes available and
>>>>>> sometimes not.
>>>>>> It ended up being different through requirements during the
>>>>>> review.
>>>>>>
>>>>>>> To me that makes
>>>>>>> far more sense than having msg2 continue to report the extended 
>>>>>>> info
>>>> for
>>>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>>>
>>>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>>>> should be treated any differently to calling setStackTrace(). They
>>>>>>> should be handled consistently IMO.
>>>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>>>> with other exceptions.
>>>>>> The reason to treat fillInStackTrace differently is that we lost
>>>>>> information
>>>>>> needed to compute it. This is not the case with setStackTrace().
>>>>>>
>>>>>> A different solution, the one I would have proposed if I had not
>>>>>> considered previous comments from reviews, would be to just
>>>>>> compute the message in the runtime in the call of fillInStackTrace
>>>>>> before the old stack trace is lost and assign it to the message 
>>>>>> field.
>>>>>> This way it would behave similar to all other exceptions. The 
>>>>>> message
>>>>>> would just be there ... just that it's computed lazily.
>>>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>>>> algorithms (walking the stack) are performed at this point, too.
>>>>>>
>>>>>>> We are not talking about all exceptions only about your NPE 
>>>>>>> extended
>>>>>>> error message.
>>>>>> Hmm, the inconsistency caused by the code you posted above
>>>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>>>> the message might become pointless. Wrt. setStackTrace
>>>>>> they all behave the same.
>>>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>>>> needs to be fixed.
>>>>>>
>>>>>> Best regards,
>>>>>>    Goetz.
>>>>>>
>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> I implemented an example where wrong stack traces are
>>>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>>>> See also the generated output added to a comment in the patch.
>>>>>>>> If the NPE message text was missing in the second printout, I think
>>>>>>>> this really would be unexpected.
>>>>>>>> Please note that the correct message is printed after messing
>>>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>>>> message is printed.)
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>    Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>> I guess the normal usecase of setStackTrace is the other way 
>>>>>>>>>> around:
>>>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>>>> stack trace:
>>>>>>>>>>
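>>>>>>>>>> try {
>>>>>>>>>>     a.x;
>>>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>>>     // setStackTrace() returns void, so it cannot be chained into the throw.
>>>>>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>>>     npe.setStackTrace(e.getStackTrace());
>>>>>>>>>>     throw npe;
>>>>>>>>>> }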
>>>>>>>>>>
>>>>>>>>>> And not taking an arbitrary stack trace and put it into an 
>>>>>>>>>> exception
>>>>>>>>>> with existing message.
>>>>>>>>> Interesting usage.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>    Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>> 'forax at univ-
>>>>>>>>> mlv.fr'
>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>> hotspot-runtime-dev
>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>> message
>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>
>>>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>> need to
>>>>>>>>> add
>>>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>>>> I added the volatile, too, but as I understand the 
>>>>>>>>>>>> synchronized
>>>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>>>> without.
>>>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>>>> synchronized
>>>>>>>>>>> regions.
>>>>>>>>>>>
>>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>> instance
>>>>>>>>>>> and
>>>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst 
>>>>>>>>>>>>> printing out
>>>>>>>>>>>>> information about it just seems highly unrealistic.
>>>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>>>> performance.
>>>>>>>>>>>
>>>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>>>
>>>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>> The test shows that after setStackTrace still the correct 
>>>>>>>>>>>> message
>>>>>>>>>>>> is computed. This is because the algorithm uses
>>>>>>>>>>>> Throwable::backtrace
>>>>>>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
>>>>>>>>>>>> affected by setStackTrace.
>>>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>>>> points to.
>>>>>>>>>>> But you can't adapt the message text - there is no 
>>>>>>>>>>> setMessage! If
>>>>>>>>>>> the
>>>>>>>>>>> message is NULL and you call setStackTrace() then 
>>>>>>>>>>> getMessage(), it
>>>>>>> makes
>>>>>>>>>>> no sense to return the extended error message that was 
>>>>>>>>>>> associated
>>>>>>> with
>>>>>>>>>>> the original stack/backtrace.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>    Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>> 'forax at univ-
>>>>>>>>>>> mlv.fr'
>>>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-
>>>> runtime-
>>>>>>> dev
>>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful 
>>>>>>>>>>>>> NullPointerException
>>>>>>>>> message
>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets
>>>>>>>>>>>>>> always the
>>>>>>>>>>>>>> right value.
>>>>>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>> need to
>>>>>>>>> add
>>>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>>>>>
>>>>>>>>>>>>>      public String getMessage() {
>>>>>>>>>>>>>          String message = super.getMessage();
>>>>>>>>>>>>>          // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>>>          // will compute a wrong message.
>>>>>>>>>>>>> +        synchronized(this) {
>>>>>>>>>>>>> !            if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>>>> !                return getExtendedNPEMessage();
>>>>>>>>>>>>> !            }
>>>>>>>>>>>>> +        }
>>>>>>>>>>>>>          return message;
>>>>>>>>>>>>>      }
>>>>>>>>>>>>>
>>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>> instance
>>>>>>>>>>> and
>>>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst 
>>>>>>>>>>>>> printing out
>>>>>>>>>>>>> information about it just seems highly unrealistic. But the
>>>>>>>>>>>>> above fixes
>>>>>>>>>>>>> it simply. Though after looking at comments in the test I 
>>>>>>>>>>>>> would
>>>>>>>>>>>>> also
>>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>>>
>>>>>>>>>>>>>          synchronized (this) {
>>>>>>>>>>>>>              if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>>>                  backtrace == null) // Test for out of protocol state
>>>>>>>>>>>>>                  return;
>>>>>>>>>>>>> +            numStackTracesFilledIn++;
>>>>>>>>>>>>>              this.stackTrace = defensiveCopy;
>>>>>>>>>>>>>          }
>>>>>>>>>>>>>      }
>>>>>>>>>>>>>
>>>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>>>
>>>>>>>>>>>>> I much prefer the latest version that recognises that only 
>>>>>>>>>>>>> the
>>>>>>>>>>>>> original
>>>>>>>>>>>>> stack can be processed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the test:
>>>>>>>>>>>>>
>>>>>>>>>>>>> +???????? // This holds for explicitly crated NPEs, but also
>>>>>>>>>>>>> for implicilty
>>>>>>>>>>>>>
>>>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>>> -----
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Its only drawback is that for this code:
>>>>>>>>>>>>>>     ex = null;
>>>>>>>>>>>>>>     ex.fillInStackTrace()
>>>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous 
>>>>>>>>>>>>>> mail:
>>>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
>>>>>>>>>>>>>>>> point.
>>>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
>>>>>>>>>>>>>>> trace is filled you don't compute any helpful message anyway.
>>>>>>>>>>>>>> The internal structure is no longer deleted when the stack trace
>>>>>>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>    Goetz.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>> Christoph
>>>>>>>>> Dreis
>>>>>>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-
>>>>>>>>> runtime-
>>>>>>>>>>>>>>> dev at openjdk.java.net>; David Holmes
>>>>>>> <david.holmes at oracle.com>
>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>>>> NullPointerException
>>>>>>>>>>> message
>>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> yes,
>>>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>>>>>>>>>>>>>>> initialized, i believe that declaring numStackTracesFilledIn
>>>>>>>>>>>>>>> volatile is the best way to tackle that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
>>>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads
>>>>>>>>> numStackTracesFilledIn
>>>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Alan
>>
>


From david.holmes at oracle.com  Tue Jul 14 23:09:42 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Jul 2020 09:09:42 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <1190140638.736731.1593715932799.JavaMail.zimbra@u-pem.fr>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
Message-ID: <6ea1c0f8-c133-b7e4-b03c-7d3fa517a0c6@oracle.com>

Hi Goetz,

On 14/07/2020 11:48 pm, Lindenmaier, Goetz wrote:
> Hi,
> 
> Yes, Coleen, you are right. We must preserve the lazy
> computation, and also reduce overhead on discarded
> exceptions.
> 
> And yes, we can do it with a counter:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/

This is my preferred approach. It would be nice if we could use a 
boolean instead of a counter, but I think we have a ternary state we 
need to track. The counter approach could be made more state-based (and 
avoid the theoretical overflow problem) as follows:

      public synchronized Throwable fillInStackTrace() {
          if (numStackTracesFilledIn == 0) {
              numStackTracesFilledIn = 1;
          } else if (numStackTracesFilledIn == 1) {
              // If the stack trace is changed the extended NPE algorithm
              // will compute a wrong message. So compute it beforehand.
              extendedMessage = getExtendedNPEMessage();
              numStackTracesFilledIn = 2;
          }
          return super.fillInStackTrace();
      }

Note that neither new field needs to be volatile, as Remi pointed out.
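
A minimal, self-contained sketch of the three-state idea, for illustration only. It is not the webrev code: SketchNpe, backtraceId, computeExtendedMessage() and computationCount() are hypothetical stand-ins (the real getExtendedNPEMessage() is VM-backed and reads the internal backtrace), and the getMessage() shown here is only an assumption about how the cached field would be consumed.

```java
public class LazyNpeMessageSketch {

    /** Hypothetical stand-in for NullPointerException; not the JDK class. */
    public static class SketchNpe extends RuntimeException {
        // No field initializers on purpose: Throwable's constructor calls
        // fillInStackTrace() before subclass initializers would run, and an
        // initializer would reset the state recorded by that first call.
        private int numStackTracesFilledIn; // 0 = none, 1 = original, 2 = replaced
        private String extendedMessage;     // cached message, null until computed
        private int backtraceId;            // assumption: models the VM backtrace
        private int computations;           // how often the costly step ran

        @Override
        public synchronized Throwable fillInStackTrace() {
            if (numStackTracesFilledIn == 0) {
                numStackTracesFilledIn = 1;  // first call: stay cheap
            } else if (numStackTracesFilledIn == 1) {
                // The original backtrace is about to be lost, so compute now.
                extendedMessage = computeExtendedMessage();
                numStackTracesFilledIn = 2;
            }
            backtraceId++;                   // models overwriting the backtrace
            return super.fillInStackTrace();
        }

        @Override
        public synchronized String getMessage() {
            String message = super.getMessage();
            if (message == null) {
                // Lazy path: only valid while the original backtrace exists.
                if (extendedMessage == null && numStackTracesFilledIn == 1) {
                    extendedMessage = computeExtendedMessage();
                }
                message = extendedMessage;
            }
            return message;
        }

        // Hypothetical replacement for the VM-backed getExtendedNPEMessage().
        private String computeExtendedMessage() {
            computations++;
            return "derived from backtrace #" + backtraceId;
        }

        public synchronized int computationCount() {
            return computations;
        }
    }

    public static void main(String[] args) {
        SketchNpe discarded = new SketchNpe(); // never asked for a message
        SketchNpe refilled = new SketchNpe();
        refilled.fillInStackTrace();           // second call computes the message
        refilled.fillInStackTrace();           // third call changes nothing
        System.out.println(discarded.computationCount());
        System.out.println(refilled.getMessage());
    }
}
```

Run as written, this prints 0 for the discarded exception and "derived from backtrace #1" for the refilled one, i.e. the message still refers to the original backtrace. Both fields are only touched inside synchronized methods, which is why the sketch does not declare them volatile.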

> but I would prefer placeholder strings:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> This way we need only one new field.

I found the string version awkward to read, sorry.

I don't think the extra field for the counter is a concern here.

Thanks,
David
-----

> (I need two placeholders, because the getExtendedNPEMessage0()
> sometimes returns null. If I write null into the extendedMessage field,
> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> time.)
> 
> With webrev 07 the overhead on discarded exceptions is basically the
> same as with webrev 05: one additional field, one assignment in fillInStackTrace().
> 
> What do you think?
> 
> Best regards,
>    Goetz.
> 
> 
> 
> 
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Tuesday, July 14, 2020 1:55 PM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-runtime-dev at openjdk.java.net>
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>> Correction ...
>>
>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>> Hi Goetz,
>>>
>>> Okay ... if I understand your position correctly you are looking at this
>>> as if the extended message is created at the time the NPE is thrown, and
>>> it is an implementation detail that we actually determine it lazily. If
>>> it were eagerly determined then neither fillInStackTrace() nor
>>> setStackTrace() would make any difference to the message - just as with
>>> any other exception message.
>>>
>>> However, the lazy determination of the message causes a problem with
>>> fillInStackTrace() because that call will destroy the original backtrace
>>> needed to produce the original message, and create an incorrect message.
>>> setStackTrace() does not have a similar problem because, simply by the
>>> way the current implementation works it doesn't touch the original
>>> backtrace.
>>>
>>> So you are proposing to only fix the bug that is evident in relation to
>>> fillInStackTrace() by no longer evaluating the extended message if
>>> fillInStackTrace() is called after the NPE was constructed.
>>>
>>> But in doing so you break the illusion that the extended message acts
>>> as-if determined at construction time, because you now effectively clear
>>> it when fillInStackTrace is called.
>>>
>>> My position was that if fillInStackTrace can be seen to clear it, then
>>> setStackTrace (which is logically somewhat equivalent) should also be
>>> seen to clear it.
>>>
>>> Alternatively, add a new field to NPE to cache the extended error
>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>> called. That will continue the illusion that the extended message was
>>> actually set at construction time. No changes needed to setStackTrace()
>>> as we can still lazily compute the extended message.
>>>
>>> Something like:
>>>
>>> private String extendedMessage;
>>>
>>> public synchronized Throwable fillInStackTrace() {
>>>     if (extendedMessage == null) {
>>>         extendedMessage = getExtendedNPEMessage();
>>>     }
>>>     return super.fillInStackTrace();
>>> }
>>
>> Coleen pointed out to me that we can't do it like this because we need
>> the initial fillInStackTrace to be fast and we want the extended message
>> computed lazily. So it will still need a counter so we only do this on
>> the second call.
>>
>>
>>    private String extendedMessage;
>>    private int fillInCount;
>>
>>    public synchronized Throwable fillInStackTrace() {
>>         if (extendedMessage == null && (fillInCount++ == 1)) {
>>             extendedMessage = getExtendedNPEMessage();
>>         }
>>         return super.fillInStackTrace();
>>    }
>>
>> or something to that effect.
>>
>> David
>> -----
>>
>>> public String getMessage() {
>>>     String message = super.getMessage();
>>>     synchronized(this) {
>>>         if (message == null) {
>>>             // This NPE should have an extended message.
>>>             if (extendedMessage == null) {
>>>                 extendedMessage = getExtendedNPEMessage();
>>>             }
>>>             message = extendedMessage;
>>>         }
>>>     }
>>>     return message;
>>> }
>>>
>>> Cheers,
>>> David
>>>
>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>> Hi David,
>>>>
>>>>> Your extended message is only computed when there is no original
>>>>> message.
>>>> Hmm. I would say the extended message is only computed when
>>>> The NPE was raised by the runtime. It happens to never have a
>>>> message so far in these cases.
>>>> But these are two views of the same thing.
>>>>
>>>>> You're concerned about this scenario:
>>>>>
>>>>> catch (NullPointerException npe) {
>>>>>     String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>     npe.setStackTrace(...);
>>>>>     String msg2 = npe.getMessage(); // gets null
>>>>> }
>>>>>
>>>>> While I find it hard to imagine anyone doing this
>>>> Well, all the scenarios are quite artificial:
>>>>   - why would you call fillInStackTrace on an exception thrown by the VM?
>>>>   - why would you call setStackTrace at all?
>>>>> you can easily have
>>>>> specified that the extended message is only available with the original
>>>>> stacktrace, hence after a second call to fillInStackTrace, or a call to
>>>>> setStackTrace, then the message reverts to being empty.
>>>> The message is not meant to be a special thing that behaves differently
>>>> from other messages, e.g. sometimes available and sometimes not.
>>>> It ended up being different through requirements during the
>>>> review.
>>>>
>>>>> To me that makes
>>>>> far more sense than having msg2 continue to report the extended info
>> for
>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>
>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>> should be treated any differently to calling setStackTrace(). They
>>>>> should be handled consistently IMO.
>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>> with other exceptions.
>>>> The reason to treat fillInStackTrace differently is that we lost
>>>> information
>>>> needed to compute it. This is not the case with setStackTrace().
>>>>
>>>> A different solution, the one I would have proposed if I had not
>>>> considered previous comments from reviews, would be to just
>>>> compute the message in the runtime in the call of fillInStackTrace
>>>> before the old stack trace is lost and assign it to the message field.
>>>> This way it would behave similar to all other exceptions. The message
>>>> would just be there ... just that it's computed lazily.
>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>> algorithms (walking the stack) are performed at this point, too.
>>>>
>>>>> We are not talking about all exceptions only about your NPE extended
>>>>> error message.
>>>> Hmm, the inconsistency caused by the code you posted above
>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>> the message might become pointless. Wrt. setStackTrace
>>>> they all behave the same.
>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>> needs to be fixed.
>>>>
>>>> Best regards,
>>>>    Goetz.
>>>>
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> I implemented an example where wrong stack traces are
>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>> See also the generated output added to a comment in the patch.
>>>>>> If the NPE message text was missing in the second printout, I think
>>>>>> this really would be unexpected.
>>>>>> Please note that the correct message is printed after messing
>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>> message is printed.)
>>>>>>
>>>>>> Best regards,
>>>>>>    Goetz.
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I guess the normal usecase of setStackTrace is the other way around:
>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>> stack trace:
>>>>>>>>
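>>>>>>>> try {
>>>>>>>>     a.x;
>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>     // setStackTrace() returns void, so it cannot be chained into the throw.
>>>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>     npe.setStackTrace(e.getStackTrace());
>>>>>>>>     throw npe;
>>>>>>>> }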
>>>>>>>>
>>>>>>>> And not taking an arbitrary stack trace and put it into an exception
>>>>>>>> with existing message.
>>>>>>>
>>>>>>> Interesting usage.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>    Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>> 'forax at univ-
>>>>>>> mlv.fr'
>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>> hotspot-runtime-dev
>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>> message
>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>
>>>>>>>>> Hi Goetz,
>>>>>>>>>
>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>> need to
>>>>>>> add
>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>>
>>>>>>>>>> I added the volatile, too, but as I understand the synchronized
>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>> without.
>>>>>>>>>
>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>> synchronized
>>>>>>>>> regions.
>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>> instance
>>>>>>>>> and
>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>>>> information about it just seems highly unrealistic.
>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>> performance.
>>>>>>>>>
>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>
>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>> The test shows that after setStackTrace still the correct message
>>>>>>>>>> is computed. This is because the algorithm uses
>>>>>>>>>> Throwable::backtrace
>>>>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
>>>>>>>>>> affected by setStackTrace.
>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>> points to.
>>>>>>>>>
>>>>>>>>> But you can't adapt the message text - there is no setMessage! If
>>>>>>>>> the
>>>>>>>>> message is NULL and you call setStackTrace() then getMessage(), it
>>>>> makes
>>>>>>>>> no sense to return the extended error message that was associated
>>>>> with
>>>>>>>>> the original stack/backtrace.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>    Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>> 'forax at univ-
>>>>>>>>> mlv.fr'
>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-
>> runtime-
>>>>> dev
>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>> message
>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>
>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>
>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>
>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets always the
>>>>>>>>>>>> right value.
>>>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>>
>>>>>>>>>>> True. To ensure you process the original backtrace only you need to add
>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>>>
>>>>>>>>>>>       public String getMessage() {
>>>>>>>>>>>           String message = super.getMessage();
>>>>>>>>>>>           // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>           // will compute a wrong message.
>>>>>>>>>>> +         synchronized(this) {
>>>>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>>>>> !             }
>>>>>>>>>>> +         }
>>>>>>>>>>>           return message;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> To be honest the idea that someone would share an exception instance and
>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst printing out
>>>>>>>>>>> information about it just seems highly unrealistic. But the above fixes
>>>>>>>>>>> it simply. Though after looking at comments in the test I would also
>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>
>>>>>>>>>>>        synchronized (this) {
>>>>>>>>>>>            if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>                backtrace == null) // Test for out of protocol state
>>>>>>>>>>>                return;
>>>>>>>>>>> +          numStackTracesFilledIn++;
>>>>>>>>>>>            this.stackTrace = defensiveCopy;
>>>>>>>>>>>        }
>>>>>>>>>>>    }
>>>>>>>>>>>
>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>
>>>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>
>>>>>>>>>>> I much prefer the latest version that recognises that only the original
>>>>>>>>>>> stack can be processed.
>>>>>>>>>>>
>>>>>>>>>>> In the test:
>>>>>>>>>>>
>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>>>
>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Its drawback is only that for this code:
>>>>>>>>>>>>     ex = null;
>>>>>>>>>>>>     ex.fillInStackTrace()
>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>
>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous mail:
>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some point.
>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack trace is filled you don't
>>>>>>>>>>>>> compute any helpful message anyway.
>>>>>>>>>>>> The internal structure is no more deleted when the stack trace
>>>>>>>>>>>> is filled. So the message can be computed later, too.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>   Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>; David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> yes,
>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is initialized, i
>>>>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile is the best way to
>>>>>>>>>>>>> tackle that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>
>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>, "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>, "David Holmes" <david.holmes at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Alan

From luhenry at microsoft.com  Wed Jul 15 00:44:13 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 15 Jul 2020 00:44:13 +0000
Subject: RFR(S): Use Vectored Exception Handling on Windows
In-Reply-To: <CAA-vtUwAQVDgBFa6hXdWaKezrXB09UR5ci4tqaXAz4zcSgL=4w@mail.gmail.com>
References: <MWHPR21MB0511A8150D4CAEBF3181E61EB0980@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw1nEo_o4ayQBv=MJcKFCTXfvY2ThNL1x9evcvT7fuYyg@mail.gmail.com>
 <MWHPR21MB0511F8E1132F81170290209FB0920@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUzh65R01wHTW9-ObQZ7j0vNWjp_RuYivOrpGHoJNtyNgw@mail.gmail.com>
 <MWHPR21MB05117E4D1CBC613EF52991AEB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <CAA-vtUw4ZwOHitBtsJLVMMma1D+TVV02xzoWmwd1M2yg-S91DQ@mail.gmail.com>
 <MWHPR21MB0511FC402865B075019D00D7B0610@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CAA-vtUwAQVDgBFa6hXdWaKezrXB09UR5ci4tqaXAz4zcSgL=4w@mail.gmail.com>
Message-ID: <MWHPR21MB0511062B6308E01170842BB9B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>

> The only backward incompatibility I still see is when third-party code uses UEH: today, their handlers would be ignored if our SEH handler gets called first; with your proposed solution, what happens depends on who calls SetUnhandledExceptionFilter() last.

It is true that the order of the UEH depends on the order in which SetUnhandledExceptionFilter is called, but that is already the case with SEH (if you are deeper in the stack) and with signals on Linux/Unix. Moreover, just like signals, the registered UEH needs to manually take care of chaining with the previously registered UEH.

> Well, like on Unix, we could add a check to the periodic CheckJNI-triggered code to check if our handler is still in place.

Interesting idea. From looking at os::Linux::check_signal_handler, it only seems to _check_ whether the JVM handlers are still registered, without re-registering them, correct?

> But I am convinced now, this really seems better. I think we do not need RtlAddFunctionTable anymore, since VEH would work for exceptions from dynamically generated code too, yes?

Yes, VEH is called for all exceptions, wherever they are triggered from, similarly to Linux/Unix signals.

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Tuesday, July 14, 2020 13:14
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

Okay, I get it now. This sounds good.

The only backward incompatibility I still see is when third-party code uses UEH: today, their handlers would be ignored if our SEH handler gets called first; with your proposed solution, what happens depends on who calls SetUnhandledExceptionFilter() last.

In fact we may now compete for the UnhandledExceptionFilter with the third party app, like we do on Unix for the signal handler. Well, like on Unix, we could add a check to the periodic CheckJNI-triggered code to check if our handler is still in place.

But I am convinced now, this really seems better. I think we do not need RtlAddFunctionTable anymore, since VEH would work for exceptions from dynamically generated code too, yes?

Cheers, Thomas


On Tue, Jul 14, 2020 at 6:34 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

This is where Windows exception handling and Unix/Linux signals differ. On Windows, you have VEH, SEH and Unhandled Exception Handling (I'll call it UEH here), while on Unix/Linux, you only have signals.

On Windows, by having this split, you can easily split your exception handling into 1. treating expected exceptions (EXCEPTION_ILLEGAL_INSTRUCTION on a deoptimization, EXCEPTION_ACCESS_VIOLATION in arraycopy stub, etc.), and 2. generating an hs_err file on an unexpected exception. You can do 1. with VEH and SEH, and 2. with UEH, and that's what I am proposing to do here.

Practically speaking, the existing `topLevelExceptionFilter` would be split into two: a `topLevelVectoredExceptionFilter` which would be passed to `AddVectoredExceptionHandler`, and a `topLevelUnhandledExceptionHandler` which would be passed to `SetUnhandledExceptionFilter`. This `topLevelUnhandledExceptionHandler` would contain (more or less) _only_ the `VMError::report_and_die`, and the `topLevelVectoredExceptionFilter` would contain _no_ `VMError::report_and_die` whatsoever.
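
As a rough illustration of that split, here is a portable C++ sketch. It is a model under stated assumptions, not HotSpot code: `FilterResult`, `ExceptionInfo`, `ExpectedPredicate`, `vectored_filter` and `unhandled_handler` are hypothetical names, the enum only mirrors the numeric values of the Win32 filter results, and a boolean flag stands in for `VMError::report_and_die`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the numeric values of the Win32 filter results so the sketch
// builds without <windows.h>.
enum FilterResult {
  kContinueExecution = -1,  // EXCEPTION_CONTINUE_EXECUTION
  kContinueSearch = 0,      // EXCEPTION_CONTINUE_SEARCH
  kExecuteHandler = 1       // EXCEPTION_EXECUTE_HANDLER
};

// Win32 exception codes used in the examples from this thread.
constexpr std::uint32_t kAccessViolation = 0xC0000005u;     // EXCEPTION_ACCESS_VIOLATION
constexpr std::uint32_t kIllegalInstruction = 0xC000001Du;  // EXCEPTION_ILLEGAL_INSTRUCTION

// Hypothetical, simplified exception description (not EXCEPTION_POINTERS).
struct ExceptionInfo {
  std::uint32_t code;
  std::uintptr_t pc;
};

// Hypothetical hook deciding whether the VM expects this exception.
using ExpectedPredicate = bool (*)(const ExceptionInfo&);

// Models the vectored half: resume on expected exceptions, pass everything
// else along, and never produce a crash report here.
inline FilterResult vectored_filter(const ExceptionInfo& info,
                                    ExpectedPredicate is_expected) {
  assert(is_expected != nullptr);
  return is_expected(info) ? kContinueExecution : kContinueSearch;
}

// Models the unhandled half: only reached when nobody handled the exception,
// so this is the one place where the report is written.
inline FilterResult unhandled_handler(const ExceptionInfo&, bool& report_written) {
  report_written = true;  // stands in for VMError::report_and_die
  return kExecuteHandler;
}
```

The point of the design is visible in the return values: the vectored half only ever answers "resume" or "not mine", so an SEH handler further down still gets its turn before the unhandled half runs.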

Keeping the `VMError::report_and_die` inside VEH would, like you say, completely kill any use of SEH, even in external libraries. That would be a breaking change, and is then, IMO, not acceptable.

Thanks,

--
Ludovic

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Monday, July 13, 2020 23:29
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

On Mon, Jul 13, 2020 at 11:55 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

Thank you for your feedback!

Let me answer on some of the cases you mention.

> A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

Exactly.

> B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

Today, through SEH and RtlAddFunctionTable, we only get a very clear subset of exceptions: the ones triggered in the code cache. If an exception is triggered from a PC outside of this code cache, SEH will not get the handler we registered with RtlAddFunctionTable, and we'll simply _not_ call into HandleExceptionFromCodeCache (the handler we register with RtlAddFunctionTable). That can be trivially reproduced in the VEH by simply checking that the PC is between CodeCache::low_bound() and CodeCache::high_bound().
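
A minimal sketch of that PC check, written as portable C++ so it can be tried outside the VM. Everything here is hypothetical naming (`CodeRange`, `contains`, `vectored_gate`); the two bounds stand in for `CodeCache::low_bound()` and `CodeCache::high_bound()`, and treating the upper bound as exclusive is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the numeric values of the Win32 filter results.
enum FilterResult { kContinueExecution = -1, kContinueSearch = 0 };

// Half-open address range [low, high); the exclusive upper bound is an
// assumption made for this sketch.
struct CodeRange {
  std::uintptr_t low;
  std::uintptr_t high;
};

inline bool contains(const CodeRange& range, std::uintptr_t pc) {
  assert(range.low <= range.high);
  return pc >= range.low && pc < range.high;
}

// Vectored-handler-shaped gate: exceptions raised outside the range are
// passed on untouched; only in-range ones reach the supplied handler.
template <typename InRangeHandler>
FilterResult vectored_gate(const CodeRange& code_cache, std::uintptr_t pc,
                           InRangeHandler handle_in_range) {
  if (!contains(code_cache, pc)) {
    return kContinueSearch;
  }
  return handle_in_range(pc);
}
```

With a gate like this in front, the vectored handler sees the same subset of exceptions that the RtlAddFunctionTable registration selects today.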

This is what you are mentioning with "we only can distinguish our crashes from their crashes via crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH".


Not sure we understand each other.

Today we get exceptions from two sides:
- via SEH, __try/__except, in threads attached to the VM. There the pc is either us or third party code below us which did not bother setting up SEH for themselves
- via RtlAddFunctionTable for the code cache, where we specify code cache boundaries.

With VEH we would get all exceptions in the process. Including exceptions from threads which have never seen the libjvm, or from caller code if the hotspot is embedded somewhere.

Under Unix we handle all those crashes by writing hs-err crashlogs, even if those crashes are not our responsibility. Unless user set up signal chaining, where we hand over any crash signal to the chained handler (which for the purpose of clear error reporting is also not perfect).

With VEH I get all exceptions, but have to decide on my own if an exception should result in a hs-err file or handed to the next exception handler. The only way I can see is by examining the pc - iterate through all our binaries and compare the pc with their text segments, and also check the code cache.

I may miss something here.

> With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Here, Unhandled Exception Handling predates VEH and it doesn't integrate chaining. The API is similar to signals on Linux/Unix: the last one to register has to make sure to save the previous one and to call/chain it accordingly.
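
To make the "save the previous one and chain it" rule concrete, here is a small portable C++ model. It is only a sketch under assumptions: `UnhandledFilterSlot` is a hypothetical stand-in for the single process-wide filter slot, `chain` and `install_chained` are made-up helpers, and the only Win32 behaviour it relies on is that `SetUnhandledExceptionFilter` returns the previously installed filter.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Mirrors the numeric values of the Win32 filter results.
enum FilterResult { kContinueExecution = -1, kContinueSearch = 0, kExecuteHandler = 1 };

struct ExceptionInfo {
  std::uint32_t code;
};

using UnhandledFilter = std::function<FilterResult(const ExceptionInfo&)>;

// Models the one process-wide slot: installing a filter replaces the
// current one and hands the old one back to the caller.
class UnhandledFilterSlot {
 public:
  UnhandledFilter set(UnhandledFilter filter) {
    std::swap(current_, filter);
    return filter;  // now holds the previously installed filter (may be empty)
  }

  FilterResult dispatch(const ExceptionInfo& info) const {
    return current_ ? current_(info) : kContinueSearch;
  }

 private:
  UnhandledFilter current_;
};

// Wraps `own` so that it forwards to `previous` whenever `own` declines.
inline UnhandledFilter chain(UnhandledFilter own, UnhandledFilter previous) {
  assert(own);
  return [own, previous](const ExceptionInfo& info) {
    FilterResult result = own(info);
    if (result == kContinueSearch && previous) {
      return previous(info);
    }
    return result;
  };
}

// Installs `own` while keeping whatever was registered before reachable.
inline void install_chained(UnhandledFilterSlot& slot, UnhandledFilter own) {
  UnhandledFilter previous = slot.set(nullptr);
  slot.set(chain(std::move(own), std::move(previous)));
}
```

A party that installs its filter without chaining simply cuts everyone registered earlier out of the path, which is the competition for the slot discussed in this thread.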

> My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Nothing from the Win32 API stops you from mix-and-matching VEH and SEH. If you want to do a `__try { val = *ptr; } __except (EXCEPTION_EXECUTE_HANDLER) { success = false; }` in some C++ code (in vm or native), nothing stops you from doing so. My understanding of the exception handler logic in the OpenJDK on Windows is that the accepted EXCEPTION_ACCESS_VIOLATION in java, vm, or native code is limited to a clear subset, and anything outside of these known cases is quickly treated as "an exception we cannot handle". SafeFetch is such a case where the instructions potentially triggering the EXCEPTION_ACCESS_VIOLATION are matched against by the exception handler.


Well, in your example, VEH would have preference and get the exception first; in our handler we recognize the exception as not allowed, hence a crash, and write a hs-err file. My success=false; handler would never execute.

But I admit this is really a minor point. I also dimly remember seeing some win32 API to check pointers for readability, so maybe using SEH for these things is not necessary anyway.

Thanks, Thomas

--
Ludovic

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Saturday, July 11, 2020 23:08
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi Ludovic,

sorry for the delay, and thanks for the extensive answer. Please find remarks inline.

On Fri, Jun 26, 2020 at 12:11 AM Ludovic Henry <luhenry at microsoft.com> wrote:
Hi Thomas,

It seems that the problem you're describing stems from the current exception handler treating two cases: 1. any exception knowingly triggered by Java code and treated by HotSpot (ex: safepoint-polling, arraycopy stubs, stackoverflow in Java code), and 2. exceptional cases leading to crashes (ex: uncaught C++ exception, an access violation in VM or native/external code, etc.). There is the same problem on Unix because there is only one system (signal handling) for both cases. Fortunately, Windows proposes different systems, each with its own advantages.

The order in which Windows invokes each of these systems is the following:
 1. Vectored Exception Handler registered with `AddVectoredExceptionHandler`
 2. Structured Exception Handler
 3. Vectored Exception Handler registered with `AddVectoredContinueHandler`
 4. Unhandled Exception Handler

Today, Hotspot on x86/x86_64 catches the exception at 2. via a handler registered with `RtlAddFunctionTable`. This handler does both the Java-triggered exceptions and any other exceptions.

Now, from the point of view of an external library or application embedding the JVM inside their own process, they still have all the above options to register an exception handler, irrespective of how Hotspot does it. This creates the following cases:
 - If the application uses VEH: they will (with Hotspot using SEH) be called _before_ Hotspot's exception handler and will then have to be aware that they may get exceptions unrelated to them and will have to ignore them accordingly
 - If the application uses SEH: they will only get exceptions related to their code area

If Hotspot is to use VEH, an exception would play out as follows:
 - If the application uses VEH and their registered handler executes _before_ Hotspot's one: same as above
 - If the application uses VEH and their registered handler executes _after_ Hotspot's one: Hotspot has to make sure that the exception was triggered by Hotspot and ignore it otherwise (a range check on the PC can be used here to emulate how it's done with RtlAddFunctionTable)
 - If the application uses SEH: the same case as to where the application's handler executes _after_ Hotspot's one

This all assumes that Hotspot's VEH handler doesn't trigger a crash report (VMError::report_and_die) on any exception it doesn't know how to handle. The simplest way to do that is simply _not_ to do it in Hotspot's VEH handler, and to do it by registering a Win32 Unhandled Exception Handler (with SetUnhandledExceptionFilter [1]). This handler is _only_ called when no other exception handler treated the exception (by returning EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_EXECUTE_HANDLER). Invoking it means the application is "toast" and not in a runnable state anymore, which fits nicely with the purpose of the Hotspot crash report.


Okay, if I get this correctly:

Today:
  App uses VEH - they execute before us and have to handle this correctly (->A)
  App uses SEH - no interaction

With proposed switch:
  App uses VEH - they may or may not execute before us. If they come before us: (->A). If they come after us -> (B)
  App uses SEH -> (B)

A) this case exists today. An app getting signals via VEH would have to willingly ignore signals for us to get them. This does not change, your patch would mean this happens less often, so I do not see a backward compatibility problem here.

B) this is a new case. We would have to ignore signals not meant for us. Technically by just ignoring them. Distinguishing this is a bit difficult though. Note the subtle difference to Unix: there we have signal chaining, so an application which is really really interested in signals for its own purposes uses it (e.g. by preloading libjsig) and then we know its handler and hand over the signal.

On windows we do not know this (?), we only can distinguish our crashes from their crashes via crash pc, rejecting any crash not in our code (dynamic or static). Well, arguably this would be just how it is today with our code scoped via SEH. With the added safety net of the unhandled exception filter (what happens if multiple parties call this?).

Okay this seems safe enough to try it at least.

My only very small personal gripe would be that I always liked how I can quickly use SEH to check if a pointer is valid without disturbing anyone. But within the hotspot at least I can just as well use SafeFetch.

Thank you,

Thomas

I hope this sheds some light on possible solutions ahead of us.

Thank you,

--
Ludovic

[1] https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-setunhandledexceptionfilter
________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Sunday, June 21, 2020 05:55
To: Ludovic Henry
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(S): Use Vectored Exception Handling on Windows

Hi,

We at SAP had used VEH in our own Windows Itanium port and I dimly remember it being a source of problems. That was many years ago and I realize that it is not worth much, but it makes me a bit apprehensive of this change.

The main problem I see is that this will be an observable change in behavior.

We currently use SEH, so our error handler is guaranteed to be invoked only for exceptions from within our own code. With VEH we now follow the Unix way of things and suddenly our error handler becomes a global resource.

We will suddenly be invoked for crashes outside the VM, e.g. in foreign launcher code atop of us or in non-java side threads, which will generate whole new classes of hs-err files for crashes the VM is not responsible for. Which are then perceived as VM crashes and sent to us vendors instead of going to the right people. This is the way it works on Unix today, and it is a constant annoyance and increases our support workload.

We also may introduce new problems since suddenly we interfere with application exception handling. At the very least, we have to think up a scheme for signal chaining (both ways: VM->foreign code and foreign code->VM). For the first, we probably need some form of libjsig preloading, or some other way to divert signal handler instalment. That would also need cooperation from the application programmers and/or operators.

Matters are even more complicated, since foreign code may use SEH instead of VEH, so what happens if a JNI library below me wants to use SEH, does that still work?

I feel this should not be rushed. Even if it is considered "brittle", SEH has served us well; I do not recall many problems in the past aside from having to add the occasional __try/__except. Are there actual bugs we have to solve?

Lastly, personally I always found SEH quite a neat concept, and one of the few places where Windows was superior to Unix :)

Thanks, Thomas


On Fri, Jun 19, 2020 at 5:23 PM Ludovic Henry <luhenry at microsoft.com> wrote:
Hello,

First, some context and definitions:
- when talking about exception here, I'm talking about Win32 exception which are equivalent to signals on Linux and other Unix, I am _not_ talking about Java exceptions.
- an explanation of an _exception filter_ can be found at https://docs.microsoft.com/en-us/cpp/cpp/writing-an-exception-filter?view=vs-2019<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681351738&sdata=wgUUIy8alY%2Fv8%2B5komkjEiwkzbVe4VLmqqYMoqyDAU4%3D&reserved=0><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C7c845a5f11314c645d5f08d827bf468a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303049767263161&sdata=7LKO5ISpYpdDKMysIeYx%2BT6B3o9uFNaY%2FDB924Sr6Vo%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681351738&sdata=wgUUIy8alY%2Fv8%2B5komkjEiwkzbVe4VLmqqYMoqyDAU4%3D&reserved=0>><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C3a2bd46b66be4f6824b108d82629ffb7%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637301309117682378&sdata=LAIuT%2F0l9W1anQUurSRprjzrtAgRo%2F3SjiAHAUvm%2FDs%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681361732&sdata=4EJJEoSxkqy4PATIbmYFdJ%2BvKdHBRs%2BJTGjLAWCJA9A%3D&reserved=0><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%
2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C7c845a5f11314c645d5f08d827bf468a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303049767263161&sdata=7LKO5ISpYpdDKMysIeYx%2BT6B3o9uFNaY%2FDB924Sr6Vo%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681361732&sdata=4EJJEoSxkqy4PATIbmYFdJ%2BvKdHBRs%2BJTGjLAWCJA9A%3D&reserved=0>>><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Cd552fedab47f45c6fe9808d815e2758f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637283409665642403&sdata=fjcrwcQYAg3TstTSO2YHKziszwlusbYV6uUXINydD1E%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681371725&sdata=%2FNipjMVe55D5pK%2BQDvYlBB5kjIsaxYaTpaleG6Dl5W8%3D&reserved=0><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C7c845a5f11314c645d5f08d827bf468a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303049767273154&sdata=88xdAtISIFDd52eRNLpr%2BJ8UNHdmXd6oZvdwsEygbZU%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63730
3544681371725&sdata=%2FNipjMVe55D5pK%2BQDvYlBB5kjIsaxYaTpaleG6Dl5W8%3D&reserved=0>><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C3a2bd46b66be4f6824b108d82629ffb7%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637301309117682378&sdata=LAIuT%2F0l9W1anQUurSRprjzrtAgRo%2F3SjiAHAUvm%2FDs%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681381719&sdata=XU09gfOGMCJ7dKkg6PSlZP3JVM0SOFmWGKPQQWzUsow%3D&reserved=0><https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7C7c845a5f11314c645d5f08d827bf468a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303049767273154&sdata=88xdAtISIFDd52eRNLpr%2BJ8UNHdmXd6oZvdwsEygbZU%3D&reserved=0<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.microsoft.com%2Fen-us%2Fcpp%2Fcpp%2Fwriting-an-exception-filter%3Fview%3Dvs-2019&data=02%7C01%7Cluhenry%40microsoft.com%7Ca93ec069e4af4e886cf508d828328173%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637303544681381719&sdata=XU09gfOGMCJ7dKkg6PSlZP3JVM0SOFmWGKPQQWzUsow%3D&reserved=0>>>>. There is only a limited concept of that in Java with type-based exception filter (ex: `try { ... } catch (IOException ioe) { ... } catch (Throwable t) { ... }`).
- in Win32, there exist two exception handling mechanisms:
  - Structured Exception Handling: the historical one, based on `__try {} __except (...) {}`
  - Vectored Exception Handling: introduced in Windows XP / Windows Server 2003, much more similar to signals on Linux
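
For comparison, the type-based filtering available on the Java side can be sketched as follows. This is only an illustration: the class and method names (`CatchFilterSketch`, `classify`) are made up for this example and do not appear in the OpenJDK sources.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class CatchFilterSketch {
    // Hypothetical helper: throws the given exception and reports which
    // catch clause was selected. The selection is purely by static type
    // of the catch parameter; there is no arbitrary filter expression.
    static String classify(Exception toThrow) {
        try {
            throw toThrow;
        } catch (IOException ioe) {
            // Matches IOException and all of its subclasses.
            return "io";
        } catch (Throwable t) {
            // Catch-all for everything else.
            return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new FileNotFoundException("missing")));
        System.out.println(classify(new IllegalStateException("oops")));
    }
}
```

The only "filter" Java offers is the declared type of each catch clause, tried in order, which is why the concept is much more limited than a Win32 exception filter.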

These exception handling mechanisms are used to catch exceptions such as Access Violation, Stack Overflow, Divide by Zero, Overflow, and more. These exceptions are the equivalent of signals on Linux and are therefore core to many mechanisms in the OpenJDK.

Today, the OpenJDK uses Structured Exception Handling to catch such exceptions, which creates several requirements. First, all code that might trigger an exception on purpose (like an Access Violation / SIGSEGV in the arraycopy stub) needs to be wrapped in a __try / __except. Because it's not feasible to wrap every single instance of such code, these __try / __except blocks are put in the top-level function of any thread started by the runtime. Second, for code generated by Hotspot, `RtlAddFunctionTable` is used to simulate the use of __try / __except for a specific code area. This function needs platform-specific code to generate a trampoline that calls the exception filter declared in the runtime. It's also meant to be used as a one-to-one mapping with try / catch in user code, and not as a "catch all the exceptions in this code area". Third, Structured Exception Handling expects to be able to unwind the stack. However, because Hotspot doesn't guarantee the usage of the platform-specific ABI internally, the platform-specific unwinder might break. Hotspot's usage of `RtlAddFunctionTable` for the code cache relies on the assumption that Structured Exception Handling never tries to unwind the stack (which it would fail to do because of the different ABI) before calling the registered exception filter.

Having discussed this with Windows Kernel maintainers, I understand that this approach is highly discouraged and considered brittle, and that the better solution is Vectored Exception Handling. Vectored Exception Handling is conceptually much more similar to signal / sigaction on Linux and other Unix systems. It will catch all exceptions happening across the process, and no __try / __except will be required. It also removes the requirement to call `RtlAddFunctionTable`. The exception filter then behaves like a signal handler, with the possibility to modify the registers at will, for example modifying the PC to step over an instruction after an expected Access Violation. Vectored Exception Handling is also already used for AOT code.

The changes can be found at http://cr.openjdk.java.net/~burban/ludovic_vecexc/. As I am not an author, I have not created a corresponding bug in JBS.

Thank you, and looking forward to your feedback!

--
Ludovic


From daniel.daugherty at oracle.com  Wed Jul 15 02:30:34 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Tue, 14 Jul 2020 22:30:34 -0400
Subject: [15] RFR(S): 8246676: monitor list lock operations need more fencing
 and 8247280
Message-ID: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>

Greetings,

These fixes are targeted for JDK15 and I would like to push both of these
fixes before the RDP2 cutoff on Thursday.

I have a JDK15 fix ready for a couple of related ObjectMonitor bug fixes:

     JDK-8246676 monitor list lock operations need more fencing
     https://bugs.openjdk.java.net/browse/JDK-8246676

     JDK-8247280 more fencing needed in async deflation for non-TSO machines
     https://bugs.openjdk.java.net/browse/JDK-8247280

The fix for JDK-8246676 has been through three rounds of preliminary
code review with David H.; Erik O. and Robbin participated in the first
preliminary code review round. Mostly comment changes or backouts of
code changes to return to the baseline code were done in the second and
third preliminary rounds.

The fix for JDK-8247280 has been through two rounds of preliminary
code review with David H. The bug fix itself was suggested by Erik O. so
he's likely on-board with my implementation of the fix. :-)

Many thanks to David H., Erik O. and Robbin for their many emails on
these topics and for reviewing the preliminary webrevs.

Here are the two webrevs:

http://cr.openjdk.java.net/~dcubed/8247280-webrev/0-for-jdk15/

http://cr.openjdk.java.net/~dcubed/8246676-webrev/0-for-jdk15/

The project is currently baselined on jdk-15+30 and has gone through
Mach5 Tier[1-3],4,5,6,7,8 testing with no regressions. I've also run
my inflation stress kit on Linux-X64 and macOSX without any regressions.

Thanks, in advance, for any comments, questions or suggestions.

Dan

From david.holmes at oracle.com  Wed Jul 15 06:46:26 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Jul 2020 16:46:26 +1000
Subject: [15] RFR(S): 8246676: monitor list lock operations need more
 fencing and 8247280
In-Reply-To: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>
References: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>
Message-ID: <4d874d07-2c43-ebc4-7d2a-a8fa15e914b1@oracle.com>

Hi Dan,

On 15/07/2020 12:30 pm, Daniel D. Daugherty wrote:
> Greetings,
> 
> These fixes are targeted for JDK15 and I would like to push both of these
> fixes before the RDP2 cutoff on Thursday.
> 
> I have a JDK15 fix ready for a couple of related ObjectMonitor bug fixes:
> 
>      JDK-8246676 monitor list lock operations need more fencing
>      https://bugs.openjdk.java.net/browse/JDK-8246676
> 
>      JDK-8247280 more fencing needed in async deflation for non-TSO machines
>      https://bugs.openjdk.java.net/browse/JDK-8247280
> 
> The fix for JDK-8246676 has been through three rounds of preliminary
> code review with David H.; Erik O. and Robbin participated in the first
> preliminary code review round. Mostly comment changes or backouts of
> code changes to return to the baseline code were done in the second and
> third preliminary rounds.
> 
> The fix for JDK-8247280 has been through two rounds of preliminary
> code review with David H. The bug fix itself was suggested by Erik O. so
> he's likely on-board with my implementation of the fix. :-)
> 
> Many thanks to David H., Erik O. and Robbin for their many emails on
> these topics and for reviewing the preliminary webrevs.
> 
> Here are the two webrevs:
> 
> http://cr.openjdk.java.net/~dcubed/8247280-webrev/0-for-jdk15/

My only follow up here is the proper fix for using Atomics with enums. 
The patch is below and I've tested it with tiers 1-3 (link sent separately).

> http://cr.openjdk.java.net/~dcubed/8246676-webrev/0-for-jdk15/

Also looks good.

> The project is currently baselined on jdk-15+30 and has gone through
> Mach5 Tier[1-3],4,5,6,7,8 testing with no regressions. I've also run
> my inflation stress kit on Linux-X64 and macOSX without any regressions.
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan

Thanks for working through all the details on this.

David
-----

diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.hpp
--- a/src/hotspot/share/runtime/objectMonitor.hpp
+++ b/src/hotspot/share/runtime/objectMonitor.hpp
@@ -27,6 +27,7 @@

  #include "memory/allocation.hpp"
  #include "memory/padded.hpp"
+#include "metaprogramming/isRegisteredEnum.hpp"
  #include "oops/markWord.hpp"
  #include "runtime/os.hpp"
  #include "runtime/park.hpp"
@@ -372,4 +373,7 @@
    void      install_displaced_markword_in_object(const oop obj);
  };

+// Register for atomic operations.
+template<> struct IsRegisteredEnum<ObjectMonitor::AllocationState> : public TrueType {};
+
  #endif // SHARE_RUNTIME_OBJECTMONITOR_HPP
diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.inline.hpp
--- a/src/hotspot/share/runtime/objectMonitor.inline.hpp
+++ b/src/hotspot/share/runtime/objectMonitor.inline.hpp
@@ -196,7 +196,7 @@
  }

  inline void ObjectMonitor::release_set_allocation_state(ObjectMonitor::AllocationState s) {
-  Atomic::release_store((int*)&_allocation_state, (int)s);
+  Atomic::release_store(&_allocation_state, s);
  }

  inline void ObjectMonitor::set_allocation_state(ObjectMonitor::AllocationState s) {
@@ -208,7 +208,7 @@
  }

  inline ObjectMonitor::AllocationState ObjectMonitor::allocation_state_acquire() const {
-  return (AllocationState)Atomic::load_acquire((int*)&_allocation_state);
+  return Atomic::load_acquire(&_allocation_state);
  }

  inline bool ObjectMonitor::is_free() const {

From goetz.lindenmaier at sap.com  Wed Jul 15 08:27:54 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 15 Jul 2020 08:27:54 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <1277163916.614880.1594747834571.JavaMail.zimbra@u-pem.fr>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <1277163916.614880.1594747834571.JavaMail.zimbra@u-pem.fr>
Message-ID: <AM4PR0202MB296465DFF92249DAD77B1385EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Remi,

Thanks for looking at the change.

> the static final fields should be in uppercase to make the code 
> more readable,
Fixed.

> you need to use new String("1") because "1" may be a valid string, 
Why that? I only use the values in this code, they are private, and the
field is private.
They just need to be different from each other, and different from all
potential results of getExtendedNPEMessage().

> i also think
> that "1" and "2" are not explicit enough, using something like
> "MUST_COMPUTE_EXTENTED_NPE_MESSAGE" seems better IMO.
Changed. I thought a short string saves memory ...
 
> i don't think you need to declare extendedMessage volatile, it is only
> accessed inside a synchronized block on this.
Fixed.
 
> in getMessage, you can use a early return to simplify the code shape
>   synchronized(this) {
>     if (extendedMessage == mustComputeExtendedNPEMessage) {
>        // Only the original stack trace was filled in. Message will
>        // compute correctly.
>        return extendedMessage = getExtendedNPEMessage();   // <-- HERE
I need to check for NO_EXTENDED_MESSAGE before returning. 
Otherwise I would return the placeholder itself ("2", now "NO_EXTENDED_MESSAGE").
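
To make the placeholder discussion concrete, here is a minimal sketch of identity-compared sentinel strings. It is not the code from the webrev: the class `PlaceholderSketch`, the field `computed`, and `computeExtendedMessage()` are hypothetical stand-ins (the latter for `getExtendedNPEMessage()`, which may return null), and the sentinels use `new String(...)` as Remi suggested so that they can never be identical to an interned literal.

```java
class PlaceholderSketch {
    // Sentinels are compared with ==, so each needs a unique identity.
    // new String(...) guarantees that no interned literal is the same object.
    private static final String MUST_COMPUTE_EXTENDED_NPE_MESSAGE =
            new String("MUST_COMPUTE_EXTENDED_NPE_MESSAGE");
    private static final String NO_EXTENDED_MESSAGE =
            new String("NO_EXTENDED_MESSAGE");

    private String extendedMessage = MUST_COMPUTE_EXTENDED_NPE_MESSAGE;

    // Hypothetical stand-in for the result of the native computation.
    private final String computed;

    PlaceholderSketch(String computed) {
        this.computed = computed;
    }

    private String computeExtendedMessage() {
        return computed; // may be null
    }

    public synchronized String getMessage() {
        if (extendedMessage == MUST_COMPUTE_EXTENDED_NPE_MESSAGE) {
            String m = computeExtendedMessage();
            // Remember "computed, but nothing to report" separately from
            // "not computed yet", so the computation runs at most once.
            extendedMessage = (m == null) ? NO_EXTENDED_MESSAGE : m;
        }
        // Never leak the placeholder to the caller.
        return (extendedMessage == NO_EXTENDED_MESSAGE) ? null : extendedMessage;
    }

    public static void main(String[] args) {
        System.out.println(new PlaceholderSketch("Cannot invoke \"a.b()\"").getMessage());
        System.out.println(new PlaceholderSketch(null).getMessage());
    }
}
```

With this shape, the check for NO_EXTENDED_MESSAGE has to happen on every return path, which is why a plain early return of the freshly computed value is not enough.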

I made a new webrev, but it seems I need to follow webrev 06...
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07.2/

Best regards,
  Goetz.


From christoph.dreis at freenet.de  Wed Jul 15 09:08:42 2020
From: christoph.dreis at freenet.de (Christoph Dreis)
Date: Wed, 15 Jul 2020 11:08:42 +0200
Subject: Performance of instanceof with interfaces is multiple times slower
 than with classes
Message-ID: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>

Hi,

please forgive me if this is a stupid question or a known problem.

I was working on something that involved an instanceof check and needed to change it slightly.
I was surprised to see that the performance difference between an interface and a class is quite big.

E.g. consider the following benchmark:

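import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)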
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class MyBenchmark {

	interface TestInterface {}
	private static class TestClass {}
	private static class AnotherClass {}

	@State(Scope.Thread)
	public static class BenchmarkState {
		private Object clazz = new MyBenchmark.TestClass();
	}

	@Benchmark
	public boolean testInstanceOfInterface(BenchmarkState state) {
		return state.clazz instanceof TestInterface;
	}

	@Benchmark
	public boolean testInstanceOfClass(BenchmarkState state) {
		return state.clazz instanceof AnotherClass;
	}
}

Benchmark                                                Mode  Cnt   Score    Error   Units
MyBenchmark.testInstanceOfClass                          avgt   10   2,085 ±  0,179   ns/op
MyBenchmark.testInstanceOfInterface                      avgt   10  18,783 ±  0,595   ns/op

I was surprised to see that the interface variant is so much slower.
Both checks should return false and there is no big hierarchy that needs to be walked up/down.

Could you enlighten me what the cause for this is and maybe point me to the code where this is done?
Is this maybe even a bug/regression? Can we maybe do something to improve the interface case?

Thanks in advance,
Cheers,
Christoph


From goetz.lindenmaier at sap.com  Wed Jul 15 09:16:19 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 15 Jul 2020 09:16:19 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <VI1PR0202MB2975A3C28CC10E7BC1910FA9EC6D0@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
Message-ID: <AM4PR0202MB29649738B827D5369DA3B2F9EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Mandy,
> fillInStackTrace and setStackTrace replace the stack trace of a NPE
> instance. Therefore I think both should behave consistently for any NPE
> instances with and without an explicit message.
Thanks.
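
For readers following along, the two operations under discussion can be exercised with a small sketch like the one below. The class `StackTraceReplaceSketch` and its helper methods are hypothetical names used only for illustration. What `getMessage()` prints at the end depends on the JDK version and on which webrev is applied, so no output is claimed for it.

```java
class StackTraceReplaceSketch {
    // Let the VM raise an NPE (no explicit message) and hand it back.
    static NullPointerException provoke() {
        try {
            Object o = null;
            o.hashCode();
        } catch (NullPointerException npe) {
            return npe;
        }
        throw new IllegalStateException("unreachable");
    }

    // Replaces the internal backtrace with the current stack.
    static void refill(Throwable t) {
        t.fillInStackTrace();
    }

    // Replaces only the StackTraceElement[] view of the stack trace.
    static void clear(Throwable t) {
        t.setStackTrace(new StackTraceElement[0]);
    }

    public static void main(String[] args) {
        NullPointerException npe = provoke();
        System.out.println("original top frame: "
                + npe.getStackTrace()[0].getMethodName());
        refill(npe);
        System.out.println("top frame after fillInStackTrace: "
                + npe.getStackTrace()[0].getMethodName());
        clear(npe);
        System.out.println("frames after setStackTrace: "
                + npe.getStackTrace().length);
        System.out.println("message: " + npe.getMessage());
    }
}
```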

> For webrev.06/webrev.07, this would behave as if NPE was created with an
> extended message which cannot be altered once constructed.  I expect
> that it'd be rare to see NPE instance thrown by VM (not explicitly
> constructed) but whose stack trace is replaced.  So I'm fine with this
> approach.
Yes, thanks.
 
> webrev.06 is okay while I think checking Throwable::backtrace != null is
> clearer as I suggested.
Yes, this would exactly check what I want to know. 
But it requires leaking information about the implementation details
of the backtrace field from Throwable. I don't think that is 
a good idea.

Please see also my replies to Remi (webrev 07.2) and Coleen and David
(webrev 06.2).

Best regards,
  Goetz.

http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07.2/

> 
> Mandy
> 
> On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
> >
> > Goetz and all,
> >
> > I have to admit, the version with the counter 06 is more intuitive to
> > me.  It would be even better if it was a boolean.  I don't think an
> > extra 32 bits in an NPE Throwable matters considering the backtrace is
> > a lot bigger.  The NPE Throwable in general shouldn't be a long lived
> > object, and there shouldn't be thousands of them.
> >
> > There seemed to be disagreement on the issue of the message not
> > matching the stack trace if the code calls setStackTrace(). It doesn't
> > seem like it should be the same at all to fillInStackTrace() to me,
> > but this latest patch maintains the status quo.  If you want to
> > explore this further, I think you should file a separate RFE, and fix
> > the reported bug with this patch.
> >
> > So if I get a vote, I'd pick 06.
> >
> > Thanks,
> > Coleen
> >
> > On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
> >> Hi,
> >>
> >> Yes, Coleen, you are right. We must preserve the lazy
> >> computation, and also reduce overhead on discarded
> >> exceptions.
> >>
> >> And yes, we can do it with a counter:
> >> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
> >>
> >> but I would prefer placeholder strings:
> >> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> >>
> >> This way we need only one new field.
> >>
> >> (I need two placeholders, because the getExtendedNPEMessage0()
> >> sometimes returns null. If I write null into the extendedMessage field,
> >> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> >> time.)
> >>
> >> With webrev 07 the overhead on discarded exceptions is basically the
> >> same as with webrev 05: one additional field, one assignment in
> >> fillInStackTrace().
> >>
> >> What do you think?
> >>
> >> Best regards,
> >>    Goetz.
> >>
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: David Holmes <david.holmes at oracle.com>
> >>> Sent: Tuesday, July 14, 2020 1:55 PM
> >>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
> >>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >>> <hotspot-runtime-dev at openjdk.java.net>
> >>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> >>> after calling fillInStackTrace
> >>>
> >>> Correction ...
> >>>
> >>> On 14/07/2020 12:11 pm, David Holmes wrote:
> >>>> Hi Goetz,
> >>>>
> >>>> Okay ... if I understand your position correctly you are looking at this
> >>>> as if the extended message is created at the time the NPE is thrown, and
> >>>> it is an implementation detail that we actually determine it lazily. If
> >>>> it were eagerly determined then neither fillInStackTrace() nor
> >>>> setStackTrace() would make any difference to the message - just as with
> >>>> any other exception message.
> >>>>
> >>>> However, the lazy determination of the message causes a problem with
> >>>> fillInStackTrace() because that call will destroy the original
> >>>> backtrace
> >>>> needed to produce the original message, and create an incorrect
> >>>> message.
> >>>> setStackTrace() does not have a similar problem because, simply by the
> >>>> way the current implementation works it doesn't touch the original
> >>>> backtrace.
> >>>>
> >>>> So you are proposing to only fix the bug that is evident in
> >>>> relation to
> >>>> fillInStackTrace() by no longer evaluating the extended message if
> >>>> fillInStackTrace() is called after the NPE was constructed.
> >>>>
> >>>> But in doing so you break the illusion that the extended message acts
> >>>> as-if determined at construction time, because you now effectively
> >>>> clear
> >>>> it when fillInStackTrace is called.
> >>>>
> >>>> My position was that if fillInStackTrace can be seen to clear it, then
> >>>> setStackTrace (which is logically somewhat equivalent) should also be
> >>>> seen to clear it.
> >>>>
> >>>> Alternatively, add a new field to NPE to cache the extended error
> >>>> message, and explicitly evaluate the message if fillInStackTrace() is
> >>>> called. That will continue the illusion that the extended message was
> >>>> actually set at construction time. No changes needed to
> >>>> setStackTrace()
> >>>> as we can still lazily compute the extended message.
> >>>>
> >>>> Something like:
> >>>>
> >>>> private String extendedMessage;
> >>>>
> >>>> public synchronized Throwable fillInStackTrace() {
> >>>>     if (extendedMessage == null) {
> >>>>         extendedMessage = getExtendedNPEMessage();
> >>>>     }
> >>>>     return super.fillInStackTrace();
> >>>> }
> >>> Coleen pointed out to me that we can't do it like this because we need
> >>> the initial fillInStackTrace to be fast and we want the extended message
> >>> computed lazily. So it will still need a counter so we only do this on
> >>> the second call.
> >>>
> >>>
> >>>    private String extendedMessage;
> >>>    private int fillInCount;
> >>>
> >>>    public synchronized Throwable fillInStackTrace() {
> >>>        if (extendedMessage == null && (fillInCount++ == 1)) {
> >>>            extendedMessage = getExtendedNPEMessage();
> >>>        }
> >>>        return super.fillInStackTrace();
> >>>    }
> >>>
> >>> or something to that effect.
> >>>
> >>> David
> >>> -----
> >>>
> >>>> public String getMessage() {
> >>>>     String message = super.getMessage();
> >>>>     synchronized(this) {
> >>>>         if (message == null) {
> >>>>             // This NPE should have an extended message.
> >>>>             if (extendedMessage == null) {
> >>>>                 extendedMessage = getExtendedNPEMessage();
> >>>>             }
> >>>>             message = extendedMessage;
> >>>>         }
> >>>>     }
> >>>>     return message;
> >>>> }
> >>>>
> >>>> Cheers,
> >>>> David
> >>>>
> >>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>> Your extended message is only computed when there is no original
> >>>>>> message.
> >>>>> Hmm. I would say the extended message is only computed when
> >>>>> the NPE was raised by the runtime. It happens to never have a
> >>>>> message so far in these cases.
> >>>>> But these are two views of the same thing :)
> >>>>>
> >>>>>> You're concerned about this scenario:
> >>>>>>
> >>>>>> catch (NullPointerException npe) {
> >>>>>>      String msg1 = npe.getMessage(); // gets extended NPE message
> >>>>>>      npe.setStackTrace(...);
> >>>>>>      String msg2 = npe.getMessage(); // gets null
> >>>>>> }
> >>>>>>
> >>>>>> While I find it hard to imagine anyone doing this
> >>>>> Well, all the scenarios are quite artificial:
> >>>>>    - why would you call fillInStackTrace on an exception thrown by the VM?
> >>>>>    - why would you call setStackTrace at all?
> >>>>>> you can easily have
> >>>>>> specified that the extended message is only available with the
> >>>>>> original
> >>>>>> stacktrace, hence after a second call to fillInStackTrace, or a
> >>>>>> call to
> >>>>>> setStackTrace, then the message reverts to being empty.
> >>>>> The message is not meant to be a special thing that behaves differently
> >>>>> from other messages, e.g. sometimes being available and sometimes not.
> >>>>> It ended up being different through requirements during the
> >>>>> review.
> >>>>>
> >>>>>> To me that makes
> >>>>>> far more sense than having msg2 continue to report the extended info for
> >>>>>> the original stacktrace when it now has a new stacktrace.
> >>>>>>
> >>>>>> I'm really not seeing why calling fillInStackTrace() a second time
> >>>>>> should be treated any differently to calling setStackTrace(). They
> >>>>>> should be handled consistently IMO.
> >>>>> But then you treat setStackTrace() differently from setStackTrace()
> >>>>> with other exceptions.
> >>>>> The reason to treat fillInStackTrace differently is that we lost
> >>>>> information
> >>>>> needed to compute it. This is not the case with setStackTrace().
> >>>>>
> >>>>> A different solution, the one I would have proposed if I had not
> >>>>> considered previous comments from reviews, would be to just
> >>>>> compute the message in the runtime in the call of fillInStackTrace
> >>>>> before the old stack trace is lost and assign it to the message
> >>>>> field.
> >>>>> This way it would behave similar to all other exceptions. The message
> >>>>> would just be there ... just that it's computed lazily.
> >>>>> The cost of the algorithm wouldn't harm that much as other costly
> >>>>> algorithms (walking the stack) are performed at this point, too.
> >>>>>
> >>>>>> We are not talking about all exceptions only about your NPE extended
> >>>>>> error message.
> >>>>> Hmm, the inconsistency caused by the code you posted above
> >>>>> holds for all exceptions.  If you fiddle with the stack trace,
> >>>>> the message might become pointless.  Wrt. setStackTrace
> >>>>> they all behave the same.
> >>>>> Wrt. fillInStackTrace the message will be wrong. Only this
> >>>>> needs to be fixed.
> >>>>>
> >>>>> Best regards,
> >>>>>     Goetz.
> >>>>>
> >>>>>
> >>>>>> David
> >>>>>> -----
> >>>>>>
> >>>>>>> I implemented an example where wrong stack traces are
> >>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
> >>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
> >>>>>>> See also the generated output added to a comment in the patch.
> >>>>>>> If the NPE message text was missing in the second printout, I think
> >>>>>>> this really would be unexpected.
> >>>>>>> Please note that the correct message is printed after messing
> >>>>>>> with the stack trace, it's the stack trace that is wrong.
> >>>>>>> (Not as with the problem I am fixing here where a wrong
> >>>>>>> message is printed.)
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>>      Goetz.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>> I guess the normal usecase of setStackTrace is the other way
> >>>>>>>>> around:
> >>>>>>>>> Change the message and throw a new exception with the existing
> >>>>>>>>> stack trace:
> >>>>>>>>>
> >>>>>>>>> try {
> >>>>>>>>>       a.x();
> >>>>>>>>> } catch (NullPointerException e) {
> >>>>>>>>>       NullPointerException npe = new NullPointerException("My own error message");
> >>>>>>>>>       npe.setStackTrace(e.getStackTrace());
> >>>>>>>>>       throw npe;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> And not taking an arbitrary stack trace and put it into an
> >>>>>>>>> exception
> >>>>>>>>> with existing message.
> >>>>>>>> Interesting usage.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> David
> >>>>>>>> -----
> >>>>>>>>
> >>>>>>>>> Best regards,
> >>>>>>>>>       Goetz.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
> >>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'forax at univ-mlv.fr'
> >>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> >>>>>>>>>> after calling fillInStackTrace
> >>>>>>>>>>
> >>>>>>>>>> Hi Goetz,
> >>>>>>>>>>
> >>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>>> True. To ensure you process the original backtrace only you
> >>>>>>>>>>>> need to add synchronization in getMessage():
> >>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
> >>>>>>>>>>> I added the volatile, too, but as I understand the synchronized
> >>>>>>>>>>> block brings sufficient memory barriers that this also works
> >>>>>>>>>>> without.
> >>>>>>>>>> No "volatile" needed, or wanted, when all access is within
> >>>>>>>>>> synchronized
> >>>>>>>>>> regions.
> >>>>>>>>>>
> >>>>>>>>>>>> To be honest the idea that someone would share an exception
> >>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace()
> >>>>>>>>>>>> whilst printing out information about it just seems highly
> >>>>>>>>>>>> unrealistic.
> >>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
> >>>>>>>>>>> performance.
> >>>>>>>>>>
> >>>>>>>>>> Contention was not my concern at all. :)
> >>>>>>>>>>
> >>>>>>>>>>>> Though after looking at comments in the test I would also
> >>>>>>>>>>>> suggest that setStackTrace be updated:
> >>>>>>>>>>> The test shows that after setStackTrace still the correct
> >>>>>>>>>>> message
> >>>>>>>>>>> is computed. This is because the algorithm uses
> >>>>>>>>>>> Throwable::backtrace
> >>>>>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
> >>>>>>>>>>> affected by setStackTrace.
> >>>>>>>>>>> The behavior is just as with any exception. If you fiddle
> >>>>>>>>>>> with the stack trace, but don't adapt the message text,
> >>>>>>>>>>> the message might refer to other code than the stack trace
> >>>>>>>>>>> points to.
> >>>>>>>>>> But you can't adapt the message text - there is no setMessage!
> >>>>>>>>>> If the message is NULL and you call setStackTrace() then
> >>>>>>>>>> getMessage(), it makes no sense to return the extended error
> >>>>>>>>>> message that was associated with the original stack/backtrace.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> David
> >>>>>>>>>>
> >>>>>>>>>>> Best regards,
> >>>>>>>>>>>   Goetz.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
> >>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
> >>>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
> >>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
> >>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Goetz,
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>> Hi Remi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But how does volatile help?
> >>>>>>>>>>>>> I see that the test for numStackTracesFilledIn == 1 then
> >>>>>>>>>>>>> always gets the right value.
> >>>>>>>>>>>>> But the backtrace must not be changed until I read it in
> >>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it
> >>>>>>>>>>>>> after I check numStackTracesFilledIn and before I read the
> >>>>>>>>>>>>> backtrace.
> >>>>>>>>>>>> True. To ensure you process the original backtrace only you
> >>>>>>>>>>>> need to add synchronization in getMessage():
> >>>>>>>>>>>>
> >>>>>>>>>>>>      public String getMessage() {
> >>>>>>>>>>>>          String message = super.getMessage();
> >>>>>>>>>>>>          // If the stack trace was changed the extended NPE algorithm
> >>>>>>>>>>>>          // will compute a wrong message.
> >>>>>>>>>>>> +        synchronized(this) {
> >>>>>>>>>>>> !            if (message == null && numStackTracesFilledIn == 1) {
> >>>>>>>>>>>> !                return getExtendedNPEMessage();
> >>>>>>>>>>>> !            }
> >>>>>>>>>>>> +        }
> >>>>>>>>>>>>          return message;
> >>>>>>>>>>>>      }
> >>>>>>>>>>>>
> >>>>>>>>>>>> To be honest the idea that someone would share an exception
> >>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace()
> >>>>>>>>>>>> whilst printing out information about it just seems highly
> >>>>>>>>>>>> unrealistic. But the above fixes it simply. Though after
> >>>>>>>>>>>> looking at comments in the test I would also suggest that
> >>>>>>>>>>>> setStackTrace be updated:
> >>>>>>>>>>>>
> >>>>>>>>>>>>          synchronized (this) {
> >>>>>>>>>>>>              if (this.stackTrace == null && // Immutable stack
> >>>>>>>>>>>>                  backtrace == null) // Test for out of protocol state
> >>>>>>>>>>>>                  return;
> >>>>>>>>>>>> +            numStackTracesFilledIn++;
> >>>>>>>>>>>>              this.stackTrace = defensiveCopy;
> >>>>>>>>>>>>          }
> >>>>>>>>>>>>      }
> >>>>>>>>>>>>
> >>>>>>>>>>>> as that would seem to be another hole in the mechanism.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> I want to vote again for the much simpler version
> >>>>>>>>>>>>> proposed in webrev 02:
> >>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
> >>>>>>>>>>>>
> >>>>>>>>>>>> I much prefer the latest version that recognises that only the
> >>>>>>>>>>>> original stack can be processed.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In the test:
> >>>>>>>>>>>>
> >>>>>>>>>>>> +        // This holds for explicitly crated NPEs, but also for implicilty
> >>>>>>>>>>>>
> >>>>>>>>>>>> Two typos: crated & implicilty
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> David
> >>>>>>>>>>>> -----
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Its drawback is only that for this code:
> >>>>>>>>>>>>>     ex = null;
> >>>>>>>>>>>>>     ex.fillInStackTrace();
> >>>>>>>>>>>>> no message is created.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think this really is acceptable.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous
> >>>>>>>>>>>>> mail:
> >>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at
> >>>>>>>>>>>>>>> some point.
> >>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
> >>>>>>>>>>>>>> trace is filled you don't compute any helpful message anyway.
> >>>>>>>>>>>>> The internal structure is no longer deleted when the stack
> >>>>>>>>>>>>> trace is filled. So the message can be computed later, too.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>   Goetz.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> >>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
> >>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>>>>>> Christoph Dreis <christoph.dreis at freenet.de>;
> >>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>;
> >>>>>>>>>>>>>> David Holmes <david.holmes at oracle.com>
> >>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> yes,
> >>>>>>>>>>>>>> it's what i was saying,
> >>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle
> >>>>>>>>>>>>>> is initialized, i believe that declaring
> >>>>>>>>>>>>>> numStackTracesFilledIn volatile is the best way to tackle
> >>>>>>>>>>>>>> that.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Rémi
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> >>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
> >>>>>>>>>>>>>>> "Christoph Dreis" <christoph.dreis at freenet.de>
> >>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
> >>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>,
> >>>>>>>>>>>>>>> "Remi Forax" <forax at univ-mlv.fr>
> >>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
> >>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>>>>> Hi Christoph,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads
> >>>>>>>>>>>>>>> numStackTracesFilledIn without synchronization.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -Alan
> >


From goetz.lindenmaier at sap.com  Wed Jul 15 09:23:43 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 15 Jul 2020 09:23:43 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <5fc7c1d8-6a6c-d9e8-7c44-6c7ebab80e23@oracle.com>
 <AM4PR0202MB296498B685EC138E99283495EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
 <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
Message-ID: <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi everybody,

First of all, thanks for all the feedback!

I updated webrev 06 to 06.2:
http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/

- removed volatile
- renamed the field so it's obvious it's a ternary state, 
  documented the states and implemented it as proposed by
  David
- added a state change in getMessage(). This avoids repeated
  calls to the native method in case it returns null.
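
A minimal sketch of the three-state handling listed above. It is not the
code in the webrev: LazyMessageException, computeExtendedMessage() and
computeCalls are hypothetical stand-ins (the latter two for the native
method and for demonstration), and the state field name is an assumption.

```java
// Hypothetical stand-in for NullPointerException; names are illustrative.
class LazyMessageException extends RuntimeException {
    private static final int NO_BACKTRACE = 0;              // nothing filled in yet
    private static final int MUST_COMPUTE_LAZY_MESSAGE = 1; // original backtrace present
    private static final int MESSAGE_COMPUTED = 2;          // result cached, may be null

    // No initializers on purpose: Throwable's constructor calls
    // fillInStackTrace() before subclass field initializers would run.
    private int extendedMessageState;
    private String extendedMessage;
    int computeCalls; // demonstration only

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (extendedMessageState == NO_BACKTRACE) {
            // First call, made from the constructor: stay cheap.
            extendedMessageState = MUST_COMPUTE_LAZY_MESSAGE;
        } else if (extendedMessageState == MUST_COMPUTE_LAZY_MESSAGE) {
            // The original backtrace is about to be replaced: compute now.
            extendedMessage = computeExtendedMessage();
            extendedMessageState = MESSAGE_COMPUTED;
        }
        return super.fillInStackTrace();
    }

    @Override
    public String getMessage() {
        String message = super.getMessage();
        if (message != null) {
            return message;
        }
        synchronized (this) {
            if (extendedMessageState == MUST_COMPUTE_LAZY_MESSAGE) {
                extendedMessage = computeExtendedMessage();
                // State change even for a null result, so the
                // computation is not repeated on the next call.
                extendedMessageState = MESSAGE_COMPUTED;
            }
            return extendedMessage;
        }
    }

    // Stand-in for the native message computation; returns null here.
    private String computeExtendedMessage() {
        computeCalls++;
        return null;
    }
}

class LazyMessageDemo {
    public static void main(String[] args) {
        LazyMessageException e = new LazyMessageException();
        e.getMessage();
        e.getMessage();
        System.out.println("computations: " + e.computeCalls);
    }
}
```

The point of the third state is that a null result is remembered like any
other result, so neither getMessage() nor a later fillInStackTrace()
triggers the computation again.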

Should I use finals for the states? Or enums?
  private final int NO_BACKTRACE = 0; 
  private final int MUST_COMPUTE_LAZY_MESSAGE = 1; 
  private final int MESSAGE_COMPUTED = 2;
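
For comparison, a rough sketch of what the enum variant might look like.
MessageState and EnumStateException are hypothetical names, not from the
webrev. One thing to keep in mind with either variant: declared without
'static', the constants above are per-instance fields. And with an enum
the field starts out as null rather than as the first constant, so null
has to stand for "no backtrace" here.

```java
// Hypothetical names; only the two non-initial states need constants,
// because the field's default value (null) serves as NO_BACKTRACE.
enum MessageState { MUST_COMPUTE_LAZY_MESSAGE, MESSAGE_COMPUTED }

class EnumStateException extends RuntimeException {
    // No initializer on purpose: Throwable's constructor calls
    // fillInStackTrace() before subclass field initializers run, so an
    // initializer would overwrite the state set during construction.
    private MessageState state;

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (state == null) {
            state = MessageState.MUST_COMPUTE_LAZY_MESSAGE;
        } else if (state == MessageState.MUST_COMPUTE_LAZY_MESSAGE) {
            // The extended message would be computed here.
            state = MessageState.MESSAGE_COMPUTED;
        }
        return super.fillInStackTrace();
    }

    synchronized MessageState state() {
        return state;
    }
}
```

With int constants the default value 0 lines up with the initial state
without any initializer, which is one argument for plain static finals.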

Best regards,
  Goetz.
  

> -----Original Message-----
> From: hotspot-runtime-dev <hotspot-runtime-dev-retn at openjdk.java.net>
> On Behalf Of coleen.phillimore at oracle.com
> Sent: Tuesday, July 14, 2020 10:27 PM
> To: Mandy Chung <mandy.chung at oracle.com>
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
> after calling fillInStackTrace
> 
> 
> 
> On 7/14/20 4:17 PM, Mandy Chung wrote:
> > fillInStackTrace and setStackTrace replace the stack trace of a NPE
> > instance. Therefore I think both should behave consistently for any
> > NPE instances with and without an explicit message.
> >
> > For webrev.06/webrev.07, this would behave as if NPE was created with
> > an extended message which cannot be altered once constructed.  I
> > expect that it'd be rare to see NPE instance thrown by VM (not
> > explicitly constructed) but whose stack trace is replaced.  So I'm
> > fine with this approach.
> >
> > webrev.06 is okay while I think checking Throwable::backtrace != null
> > is clearer as I suggested.
> 
> I like that version 06 isolates knowledge to NullPointerException.java
> and doesn't have to know what the expected value of backtrace is in the
> super class.  I maintain my vote for 06.
> 
> Thanks, I was trying to understand the fillInStackTrace vs.
> setStackTrace issue, and your description makes sense to me.
> 
> Coleen
> 
> >
> > Mandy
> >
> > On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
> >>
> >> Goetz and all,
> >>
> >> I have to admit, the version with the counter 06 is more intuitive to
> >> me.  It would be even better if it was a boolean. I don't think an
> >> extra 32 bits in an NPE Throwable matters considering the backtrace
> >> is a lot bigger.  The NPE Throwable in general shouldn't be a long
> >> lived object, and there shouldn't be thousands of them.
> >>
> >> There seemed to be disagreement on the issue of the message not
> >> matching the stack trace if the code calls setStackTrace(). To me it
> >> doesn't seem like it should be treated the same as fillInStackTrace()
> >> at all, but this latest patch maintains the status quo.  If you want
> >> to explore this further, I think you should file a separate RFE, and
> >> fix the reported bug with this patch.
> >>
> >> So if I get a vote, I'd pick 06.
> >>
> >> Thanks,
> >> Coleen
> >>
> >> On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
> >>> Hi,
> >>>
> >>> Yes, Coleen, you are right. We must preserve the lazy
> >>> computation, and also reduce overhead on discarded
> >>> exceptions.
> >>>
> >>> And yes, we can do it with a counter:
> >>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
> >>>
> >>> but I would prefer placeholder strings:
> >>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
> >>>
> >>> This way we need only one new field.
> >>>
> >>> (I need two placeholders, because the getExtendedNPEMessage0()
> >>> sometimes returns null. If I write null into the extendedMessage field,
> >>> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
> >>> time.)
> >>>
> >>> With webrev 07 the overhead on discarded exceptions is basically the
> >>> same as with webrev 05: one additional field, one assignment in
> >>> fillInStackTrace().
> >>>
> >>> What do you think?
> >>>
> >>> Best regards,
> >>>   Goetz.
> >>>
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: David Holmes <david.holmes at oracle.com>
> >>>> Sent: Tuesday, July 14, 2020 1:55 PM
> >>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>> 'forax at univ-mlv.fr'
> >>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
> >>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
> >>>> <hotspot-runtime-dev at openjdk.java.net>
> >>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>> message
> >>>> after calling fillInStackTrace
> >>>>
> >>>> Correction ...
> >>>>
> >>>> On 14/07/2020 12:11 pm, David Holmes wrote:
> >>>>> Hi Goetz,
> >>>>>
> >>>>> Okay ... if I understand your position correctly you are looking at
> >>>>> this as if the extended message is created at the time the NPE is
> >>>>> thrown, and it is an implementation detail that we actually
> >>>>> determine it lazily. If it were eagerly determined then neither
> >>>>> fillInStackTrace() nor setStackTrace() would make any difference to
> >>>>> the message - just as with any other exception message.
> >>>>>
> >>>>> However, the lazy determination of the message causes a problem with
> >>>>> fillInStackTrace() because that call will destroy the original
> >>>>> backtrace needed to produce the original message, and create an
> >>>>> incorrect message. setStackTrace() does not have a similar problem
> >>>>> because, simply by the way the current implementation works, it
> >>>>> doesn't touch the original backtrace.
> >>>>>
> >>>>> So you are proposing to only fix the bug that is evident in relation
> >>>>> to fillInStackTrace() by no longer evaluating the extended message
> >>>>> if fillInStackTrace() is called after the NPE was constructed.
> >>>>>
> >>>>> But in doing so you break the illusion that the extended message
> >>>>> acts as-if determined at construction time, because you now
> >>>>> effectively clear it when fillInStackTrace is called.
> >>>>>
> >>>>> My position was that if fillInStackTrace can be seen to clear it,
> >>>>> then setStackTrace (which is logically somewhat equivalent) should
> >>>>> also be seen to clear it.
> >>>>>
> >>>>> Alternatively, add a new field to NPE to cache the extended error
> >>>>> message, and explicitly evaluate the message if fillInStackTrace()
> >>>>> is called. That will continue the illusion that the extended message
> >>>>> was actually set at construction time. No changes needed to
> >>>>> setStackTrace() as we can still lazily compute the extended message.
> >>>>>
> >>>>> Something like:
> >>>>>
> >>>>> private String extendedMessage;
> >>>>>
> >>>>> public synchronized Throwable fillInStackTrace() {
> >>>>>     if (extendedMessage == null) {
> >>>>>         extendedMessage = getExtendedNPEMessage();
> >>>>>     }
> >>>>>     return super.fillInStackTrace();
> >>>>> }
> >>>> Coleen pointed out to me that we can't do it like this because we need
> >>>> the initial fillInStackTrace to be fast and we want the extended
> >>>> message computed lazily. So it will still need a counter so we only do
> >>>> this on the second call.
> >>>>
> >>>>
> >>>>    private String extendedMessage;
> >>>>    private int fillInCount;
> >>>>
> >>>>    public synchronized Throwable fillInStackTrace() {
> >>>>        if (extendedMessage == null && (fillInCount++ == 1)) {
> >>>>            extendedMessage = getExtendedNPEMessage();
> >>>>        }
> >>>>        return super.fillInStackTrace();
> >>>>    }
> >>>>
> >>>> or something to that effect.
> >>>>
> >>>> David
> >>>> -----
> >>>>
> >>>>> public String getMessage() {
> >>>>>     String message = super.getMessage();
> >>>>>     synchronized(this) {
> >>>>>         if (message == null) {
> >>>>>             // This NPE should have an extended message.
> >>>>>             if (extendedMessage == null) {
> >>>>>                 extendedMessage = getExtendedNPEMessage();
> >>>>>             }
> >>>>>             message = extendedMessage;
> >>>>>         }
> >>>>>     }
> >>>>>     return message;
> >>>>> }
> >>>>>
> >>>>> Cheers,
> >>>>> David
> >>>>>
> >>>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
> >>>>>> Hi David,
> >>>>>>
> >>>>>>> Your extended message is only computed when there is no original
> >>>>>>> message.
> >>>>>> Hmm. I would say the extended message is only computed when
> >>>>>> the NPE was raised by the runtime. It happens to never have a
> >>>>>> message so far in these cases.
> >>>>>> But these are two views of the same thing.
> >>>>>>
> >>>>>>> You're concerned about this scenario:
> >>>>>>>
> >>>>>>> catch (NullPointerException npe) {
> >>>>>>>     String msg1 = npe.getMessage(); // gets extended NPE message
> >>>>>>>     npe.setStackTrace(...);
> >>>>>>>     String msg2 = npe.getMessage(); // gets null
> >>>>>>> }
> >>>>>>>
> >>>>>>> While I find it hard to imagine anyone doing this
> >>>>>> Well, all the scenarios are quite artificial:
> >>>>>>   - why would you call fillInStackTrace on an exception thrown
> >>>>>>     by the VM?
> >>>>>>   - why would you call setStackTrace at all?
> >>>>>>> you can easily have
> >>>>>>> specified that the extended message is only available with the
> >>>>>>> original stacktrace, hence after a second call to
> >>>>>>> fillInStackTrace, or a call to setStackTrace, then the message
> >>>>>>> reverts to being empty.
> >>>>>> The message is not meant to be a special thing that behaves
> >>>>>> differently from other messages, like sometimes being available
> >>>>>> and sometimes not. It ended up being different through
> >>>>>> requirements during the review.
> >>>>>>
> >>>>>>> To me that makes
> >>>>>>> far more sense than having msg2 continue to report the extended
> >>>>>>> info for the original stacktrace when it now has a new stacktrace.
> >>>>>>>
> >>>>>>> I'm really not seeing why calling fillInStackTrace() a second time
> >>>>>>> should be treated any differently to calling setStackTrace(). They
> >>>>>>> should be handled consistently IMO.
> >>>>>> But then you treat setStackTrace() differently from setStackTrace()
> >>>>>> with other exceptions.
> >>>>>> The reason to treat fillInStackTrace differently is that we lost
> >>>>>> information
> >>>>>> needed to compute it. This is not the case with setStackTrace().
> >>>>>>
> >>>>>> A different solution, the one I would have proposed if I had not
> >>>>>> considered previous comments from reviews, would be to just
> >>>>>> compute the message in the runtime in the call of fillInStackTrace
> >>>>>> before the old stack trace is lost and assign it to the message
> >>>>>> field.
> >>>>>> This way it would behave similar to all other exceptions. The
> >>>>>> message
> >>>>>> would just be there ... just that it's computed lazily.
> >>>>>> The cost of the algorithm wouldn't harm that much as other costly
> >>>>>> algorithms (walking the stack) are performed at this point, too.
> >>>>>>
> >>>>>>> We are not talking about all exceptions only about your NPE
> >>>>>>> extended
> >>>>>>> error message.
> >>>>>> Hmm, the inconsistency caused by the code you posted above
> >>>>>> holds for all exceptions.  If you fiddle with the stack trace,
> >>>>>> the message might become pointless.  Wrt. setStackTrace
> >>>>>> they all behave the same.
> >>>>>> Wrt. fillInStackTrace the message will be wrong. Only this
> >>>>>> needs to be fixed.
> >>>>>>
> >>>>>> Best regards,
> >>>>>>   Goetz.
> >>>>>>
> >>>>>>
> >>>>>>> David
> >>>>>>> -----
> >>>>>>>
> >>>>>>>> I implemented an example where wrong stack traces are
> >>>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
> >>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
> >>>>>>>> See also the generated output added to a comment in the patch.
> >>>>>>>> If the NPE message text was missing in the second printout, I
> >>>>>>>> think this really would be unexpected.
> >>>>>>>> Please note that the correct message is printed after messing
> >>>>>>>> with the stack trace, it's the stack trace that is wrong.
> >>>>>>>> (Not as with the problem I am fixing here where a wrong
> >>>>>>>> message is printed.)
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>>   Goetz.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>> I guess the normal use case of setStackTrace is the other way
> >>>>>>>>>> around: change the message and throw a new exception with the
> >>>>>>>>>> existing stack trace:
> >>>>>>>>>>
> >>>>>>>>>> try {
> >>>>>>>>>>     Object v = a.x;  // NPE if a is null
> >>>>>>>>>> } catch (NullPointerException e) {
> >>>>>>>>>>     // setStackTrace returns void, so it cannot be chained in a throw
> >>>>>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
> >>>>>>>>>>     npe.setStackTrace(e.getStackTrace());
> >>>>>>>>>>     throw npe;
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> And not taking an arbitrary stack trace and putting it into an
> >>>>>>>>>> exception with an existing message.
> >>>>>>>>> Interesting usage.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> David
> >>>>>>>>> -----
> >>>>>>>>>
> >>>>>>>>>> Best regards,
> >>>>>>>>>>   Goetz.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
> >>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
> >>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
> >>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
> >>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Goetz,
> >>>>>>>>>>>
> >>>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>>> True. To ensure you process the original backtrace only you
> >>>>>>>>>>>>> need to add synchronization in getMessage():
> >>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
> >>>>>>>>>>>> I added the volatile, too, but as I understand the
> >>>>>>>>>>>> synchronized
> >>>>>>>>>>>> block brings sufficient memory barriers that this also works
> >>>>>>>>>>>> without.
> >>>>>>>>>>> No "volatile" needed, or wanted, when all access is within
> >>>>>>>>>>> synchronized
> >>>>>>>>>>> regions.
> >>>>>>>>>>>
> >>>>>>>>>>>>> To be honest the idea that someone would share an exception
> >>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace()
> >>>>>>>>>>>>> whilst printing out information about it just seems highly
> >>>>>>>>>>>>> unrealistic.
> >>>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
> >>>>>>>>>>>> performance.
> >>>>>>>>>>>
> >>>>>>>>>>> Contention was not my concern at all. :)
> >>>>>>>>>>>
> >>>>>>>>>>>>> Though after looking at comments in the test I would also
> >>>>>>>>>>>>> suggest that setStackTrace be updated:
> >>>>>>>>>>>> The test shows that after setStackTrace still the correct
> >>>>>>>>>>>> message
> >>>>>>>>>>>> is computed. This is because the algorithm uses
> >>>>>>>>>>>> Throwable::backtrace
> >>>>>>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
> >>>>>>>>>>>> affected by setStackTrace.
> >>>>>>>>>>>> The behavior is just as with any exception. If you fiddle
> >>>>>>>>>>>> with the stack trace, but don't adapt the message text,
> >>>>>>>>>>>> the message might refer to other code than the stack trace
> >>>>>>>>>>>> points to.
> >>>>>>>>>>> But you can't adapt the message text - there is no setMessage!
> >>>>>>>>>>> If the message is NULL and you call setStackTrace() then
> >>>>>>>>>>> getMessage(), it makes no sense to return the extended error
> >>>>>>>>>>> message that was associated with the original stack/backtrace.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> David
> >>>>>>>>>>>
> >>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>   Goetz.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
> >>>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
> >>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
> >>>>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
> >>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
> >>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Goetz,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>>> Hi Remi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> But how does volatile help?
> >>>>>>>>>>>>>> I see that the test for numStackTracesFilledIn == 1 then
> >>>>>>>>>>>>>> always gets the right value.
> >>>>>>>>>>>>>> But the backtrace must not be changed until I read it in
> >>>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it
> >>>>>>>>>>>>>> after I check numStackTracesFilledIn and before I read the
> >>>>>>>>>>>>>> backtrace.
> >>>>>>>>>>>>> True. To ensure you process the original backtrace only you
> >>>>>>>>>>>>> need to add synchronization in getMessage():
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>      public String getMessage() {
> >>>>>>>>>>>>>          String message = super.getMessage();
> >>>>>>>>>>>>>          // If the stack trace was changed the extended NPE algorithm
> >>>>>>>>>>>>>          // will compute a wrong message.
> >>>>>>>>>>>>> +        synchronized(this) {
> >>>>>>>>>>>>> !            if (message == null && numStackTracesFilledIn == 1) {
> >>>>>>>>>>>>> !                return getExtendedNPEMessage();
> >>>>>>>>>>>>> !            }
> >>>>>>>>>>>>> +        }
> >>>>>>>>>>>>>          return message;
> >>>>>>>>>>>>>      }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> To be honest the idea that someone would share an exception
> >>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace()
> >>>>>>>>>>>>> whilst printing out information about it just seems highly
> >>>>>>>>>>>>> unrealistic. But the above fixes it simply. Though after
> >>>>>>>>>>>>> looking at comments in the test I would also suggest that
> >>>>>>>>>>>>> setStackTrace be updated:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>          synchronized (this) {
> >>>>>>>>>>>>>              if (this.stackTrace == null && // Immutable stack
> >>>>>>>>>>>>>                  backtrace == null) // Test for out of protocol state
> >>>>>>>>>>>>>                  return;
> >>>>>>>>>>>>> +            numStackTracesFilledIn++;
> >>>>>>>>>>>>>              this.stackTrace = defensiveCopy;
> >>>>>>>>>>>>>          }
> >>>>>>>>>>>>>      }
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> as that would seem to be another hole in the mechanism.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> I want to vote again for the much simpler version
> >>>>>>>>>>>>>> proposed in webrev 02:
> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I much prefer the latest version that recognises that only
> >>>>>>>>>>>>> the original stack can be processed.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In the test:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> +        // This holds for explicitly crated NPEs, but also for implicilty
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Two typos: crated & implicilty
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> David
> >>>>>>>>>>>>> -----
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Its drawback is only that for this code:
> >>>>>>>>>>>>>>     ex = null;
> >>>>>>>>>>>>>>     ex.fillInStackTrace();
> >>>>>>>>>>>>>> no message is created.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think this really is acceptable.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous
> >>>>>>>>>>>>>> mail:
> >>>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at
> >>>>>>>>>>>>>>>> some point.
> >>>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java
> >>>>>>>>>>>>>>> stack trace is filled you don't compute any helpful
> >>>>>>>>>>>>>>> message anyway.
> >>>>>>>>>>>>>> The internal structure is no longer deleted when the stack
> >>>>>>>>>>>>>> trace is filled. So the message can be computed later, too.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>   Goetz.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
> >>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
> >>>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
> >>>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> >>>>>>>>>>>>>>> Christoph Dreis <christoph.dreis at freenet.de>;
> >>>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>;
> >>>>>>>>>>>>>>> David Holmes <david.holmes at oracle.com>
> >>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
> >>>>>>>>>>>>>>> message after calling fillInStackTrace
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> yes,
> >>>>>>>>>>>>>>> it's what i was saying,
> >>>>>>>>>>>>>>> given that a NPE can be thrown very early, before
> >>>>>>>>>>>>>>> VarHandle is
> >>>>>>>>>>> initialized,
> >>>>>>>>>>>>> i
> >>>>>>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile
> >>>>>>>>>>>>>>> is the
> >>>>>>>>>>>>>>> best
> >>>>>>> way
> >>>>>>>>> to
> >>>>>>>>>>>>>>> tackle that.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Rémi
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
> >>>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
> >>>>>>> "Christoph
> >>>>>>>>>>>>> Dreis"
> >>>>>>>>>>>>>>> <christoph.dreis at freenet.de>
> >>>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-
> >>>>>>>>> dev at openjdk.java.net>,
> >>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi
> Forax"
> >>>>>>>>>>>>>>>> <forax at univ-mlv.fr>
> >>>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
> >>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
> >>>> NullPointerException
> >>>>>>>>> message
> >>>>>>>>>>>>>>> after calling fillInStackTrace
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
> >>>>>>>>>>>>>>>>> Hi Christoph,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads
> >>>>>>>>> numStackTracesFilledIn
> >>>>>>>>>>>>>>>> without synchronization.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -Alan
> >>
> >


From coleen.phillimore at oracle.com  Wed Jul 15 11:05:39 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 15 Jul 2020 07:05:39 -0400
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
 <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
 <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <48e82acb-f4b4-d39b-0d2f-53683ba57643@oracle.com>


To me, 06.2 is perfect. I'd missed that there were three states in the
first 06, and this makes it clear to me.

On 7/15/20 5:23 AM, Lindenmaier, Goetz wrote:
> Hi everybody,
>
> First of all, thanks for all the feedback!
>
> I updated webrev 06 to 06.2:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/
>
> - removed volatile
> - renamed the field so it's obvious it's a ternary state,
>    documented the states and implemented it as proposed by
>    David
> - added a state change in getMessage(). This avoids repeated
>    calls to the native method in case it returns null.
>
> Should I use finals for the states? Or enums?
>    private final int NO_BACKTRACE = 0;
>    private final int MUST_COMPUTE_LAZY_MESSAGE = 1;
>    private final int MESSAGE_COMPUTED = 2;

If there were more than 3 states, I'd say it was worth big noisy 
uppercase letters, but all of this is local and limited and the comment 
says it all.
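
A minimal sketch of the three-state idea, for readers following along. This
is not the code from webrev 06.2: the class name LazyMessageException, the
helper computeExtendedMessage() and the computeCalls counter are hypothetical
stand-ins (the real NPE code calls a native method), and the states are plain
ints documented by a comment.

```java
// Hypothetical stand-in for the pattern discussed in this thread;
// not the code from the webrev.
class LazyMessageException extends RuntimeException {
    // 0: no backtrace filled in yet, no message computed.
    // 1: original backtrace filled in, message can be computed lazily.
    // 2: message computed (the result may be null).
    // Deliberately no initializer: Throwable's constructor calls
    // fillInStackTrace() before this class's field initializers would run.
    private int extendedMessageState;
    private String extendedMessage;

    // Counts calls to the expensive step, for demonstration only.
    int computeCalls;

    // Stand-in for the native message computation (assumption).
    private String computeExtendedMessage() {
        computeCalls++;
        return "computed from the original backtrace";
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (extendedMessageState == 0) {
            // First call, made from Throwable's constructor: stay lazy.
            extendedMessageState = 1;
        } else if (extendedMessageState == 1) {
            // The original backtrace is about to be replaced, so compute
            // the message now while it is still available.
            extendedMessage = computeExtendedMessage();
            extendedMessageState = 2;
        }
        return super.fillInStackTrace();
    }

    @Override
    public String getMessage() {
        String message = super.getMessage();
        if (message != null) {
            return message;
        }
        synchronized (this) {
            if (extendedMessageState == 1) {
                extendedMessage = computeExtendedMessage();
                // Moving to state 2 avoids repeated computation,
                // even if the computed message is null.
                extendedMessageState = 2;
            }
            return extendedMessage;
        }
    }

    public static void main(String[] args) {
        LazyMessageException e = new LazyMessageException();
        e.fillInStackTrace();               // second fill: message computed first
        System.out.println(e.getMessage());
        System.out.println(e.computeCalls);
    }
}
```

In this sketch the expensive step runs at most once per exception, and an
exception that is created and discarded never pays for it.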

Reviewed by me.
Thanks,
Coleen
>
> Best regards,
>    Goetz.
>    
>
>> -----Original Message-----
>> From: hotspot-runtime-dev <hotspot-runtime-dev-retn at openjdk.java.net>
>> On Behalf Of coleen.phillimore at oracle.com
>> Sent: Tuesday, July 14, 2020 10:27 PM
>> To: Mandy Chung <mandy.chung at oracle.com>
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>>
>>
>> On 7/14/20 4:17 PM, Mandy Chung wrote:
>>> fillInStackTrace and setStackTrace replace the stack trace of a NPE
>>> instance. Therefore I think both should behave consistently for any
>>> NPE instances with and without an explicit message.
>>>
>>> For webrev.06/webrev.07, this would behave as if NPE was created with
>>> an extended message which cannot be altered once constructed.  I
>>> expect that it'd be rare to see NPE instance thrown by VM (not
>>> explicitly constructed) but whose stack trace is replaced.  So I'm
>>> fine with this approach.
>>>
>>> webrev.06 is okay while I think checking Throwable::backtrace != null
>>> is clearer as I suggested.
>> I like that version 06 isolates knowledge to NullPointerException.java
>> and doesn't have to know what the expected value of backtrace is in the
>> super class.  I maintain my vote for 06.
>>
>> Thanks, I was trying to understand the fillInStackTrace vs.
>> setStackTrace issue, and your description makes sense to me.
>>
>> Coleen
>>
>>> Mandy
>>>
>>> On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
>>>> Goetz and all,
>>>>
>>>> I have to admit, the version with the counter 06 is more intuitive to
>>>> me.  It would be even better if it was a boolean. I don't think an
>>>> extra 32 bits in an NPE Throwable matters considering the backtrace
>>>> is a lot bigger.  The NPE Throwable in general shouldn't be a long
>>>> lived object, and there shouldn't be thousands of them.
>>>>
>>>> There seemed to be disagreement on the issue of the message not
>>>> matching the stack trace if the code calls setStackTrace(). It
>>>> doesn't seem like it should be the same at all to fillInStackTrace()
>>>> to me, but this latest patch maintains the status quo.  If you want
>>>> to explore this further, I think you should file a separate RFE, and
>>>> fix the reported bug with this patch.
>>>>
>>>> So if I get a vote, I'd pick 06.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
>>>>> Hi,
>>>>>
>>>>> Yes, Coleen, you are right. We must preserve the lazy
>>>>> computation, and also reduce overhead on discarded
>>>>> exceptions.
>>>>>
>>>>> And yes, we can do it with a counter:
>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
>>>>> but I would prefer placeholder strings:
>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
>>>>> This way we need only one new field.
>>>>>
>>>>> (I need two placeholders, because the getExtendedNPEMessage0()
>>>>> sometimes returns null. If I write null into the extendedMessage field,
>>>>> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
>>>>> time.)
>>>>>
>>>>> With webrev 07 the overhead on discarded exceptions is basically the
>>>>> same as with webrev 05: one additional field, one assignment in
>>>>> fillInStackTrace().
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Best regards,
>>>>>    Goetz.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Tuesday, July 14, 2020 1:55 PM
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> 'forax at univ-mlv.fr'
>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>> message
>>>>>> after calling fillInStackTrace
>>>>>>
>>>>>> Correction ...
>>>>>>
>>>>>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>>>>>> Hi Goetz,
>>>>>>>
>>>>>>> Okay ... if I understand your position correctly you are looking
>>>>>>> at this
>>>>>>> as if the extended message is created at the time the NPE is
>>>>>>> thrown, and
>>>>>>> it is an implementation detail that we actually determine it
>>>>>>> lazily. If
>>>>>>> it were eagerly determined then neither fillInstacktrace() nor
>>>>>>> setStackTrace() would make any difference to the message - just as
>>>>>>> with
>>>>>>> any other exception message.
>>>>>>>
>>>>>>> However, the lazy determination of the message causes a problem
>> with
>>>>>>> fillInStackTrace() because that call will destroy the original
>>>>>>> backtrace
>>>>>>> needed to produce the original message, and create an incorrect
>>>>>>> message.
>>>>>>> setStackTrace() does not have a similar problem because, simply by
>>>>>>> the
>>>>>>> way the current implementation works it doesn't touch the original
>>>>>>> backtrace.
>>>>>>>
>>>>>>> So you are proposing to only fix the bug that is evident in
>>>>>>> relation to
>>>>>>> fillInStackTrace() by no longer evaluating the extended message if
>>>>>>> fillInStackTrace() is called after the NPE was constructed.
>>>>>>>
>>>>>>> But in doing so you break the illusion that the extended message acts
>>>>>>> as-if determined at construction time, because you now effectively
>>>>>>> clear
>>>>>>> it when fillInStackTrace is called.
>>>>>>>
>>>>>>> My position was that if fillInStackTrace can be seen to clear it,
>>>>>>> then
>>>>>>> setStackTrace (which is logically somewhat equivalent) should also be
>>>>>>> seen to clear it.
>>>>>>>
>>>>>>> Alternatively, add a new field to NPE to cache the extended error
>>>>>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>>>>>> called. That will continue the illusion that the extended message was
>>>>>>> actually set at construction time. No changes needed to
>>>>>>> setStackTrace()
>>>>>>> as we can still lazily compute the extended message.
>>>>>>>
>>>>>>> Something like:
>>>>>>>
>>>>>>> private String extendedMessage;
>>>>>>>
>>>>>>> public synchronized Throwable fillInStackTrace() {
>>>>>>>     if (extendedMessage == null) {
>>>>>>>         extendedMessage = getExtendedNPEMessage();
>>>>>>>     }
>>>>>>>     return super.fillInStackTrace();
>>>>>>> }
>>>>>> Coleen pointed out to me that we can't do it like this because we need
>>>>>> the initial fillInStacktrace to be fast and we want the extended
>>>>>> message
>>>>>> computed lazily. So it will still need a counter so we only do this on
>>>>>> the second call.
>>>>>>
>>>>>>
>>>>>>    private String extendedMessage;
>>>>>>    private int fillInCount;
>>>>>>
>>>>>>    public synchronized Throwable fillInStackTrace() {
>>>>>>        if (extendedMessage == null && (fillInCount++ == 1)) {
>>>>>>            extendedMessage = getExtendedNPEMessage();
>>>>>>        }
>>>>>>        return super.fillInStackTrace();
>>>>>>    }
>>>>>>
>>>>>> or something to that effect.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> public String getMessage() {
>>>>>>>     String message = super.getMessage();
>>>>>>>     synchronized(this) {
>>>>>>>         if (message == null) {
>>>>>>>             // This NPE should have an extended message.
>>>>>>>             if (extendedMessage == null) {
>>>>>>>                 extendedMessage = getExtendedNPEMessage();
>>>>>>>             }
>>>>>>>             message = extendedMessage;
>>>>>>>         }
>>>>>>>     }
>>>>>>>     return message;
>>>>>>> }
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>>> Your extended message is only computed when there is no original
>>>>>>>>> message.
>>>>>>>> Hmm. I would say the extended message is only computed when
>>>>>>>> the NPE was raised by the runtime. It happens to never have a
>>>>>>>> message so far in these cases.
>>>>>>>> But these are two views of the same thing.
>>>>>>>>
>>>>>>>>> You're concerned about this scenario:
>>>>>>>>>
>>>>>>>>> catch (NullPointerException npe) {
>>>>>>>>>      String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>>>>>      npe.setStackTrace(...);
>>>>>>>>>      String msg2 = npe.getMessage(); // gets null
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> While I find it hard to imagine anyone doing this
>>>>>>>> Well, all the scenarios are quite artificial:
>>>>>>>>    - why would you call fillInStackTrace on an exception thrown
>>>>>>>>      by the VM?
>>>>>>>>    - why would you call setStackTrace at all?
>>>>>>>>> you can easily have
>>>>>>>>> specified that the extended message is only available with the
>>>>>>>>> original
>>>>>>>>> stacktrace, hence after a second call to fillInStackTrace, or a
>>>>>>>>> call to
>>>>>>>>> setStackTrace, then the message reverts to being empty.
>>>>>>>> The message is not meant to be a special thing that behaves
>>>>>>>> differently from other messages, such as being available only
>>>>>>>> some of the time.
>>>>>>>> It ended up being different through requirements during the
>>>>>>>> review.
>>>>>>>>
>>>>>>>>> To me that makes
>>>>>>>>> far more sense than having msg2 continue to report the extended
>>>>>>>>> info
>>>>>> for
>>>>>>>>> the original stacktrace when it now has a new stacktrace.
>>>>>>>>>
>>>>>>>>> I'm really not seeing why calling fillInstackTrace() a second time
>>>>>>>>> should be treated any differently to calling setStackTrace(). They
>>>>>>>>> should be handled consistently IMO.
>>>>>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>>>>>> with other exceptions.
>>>>>>>> The reason to treat fillInStackTrace differently is that we lost
>>>>>>>> information
>>>>>>>> needed to compute it. This is not the case with setStackTrace().
>>>>>>>>
>>>>>>>> A different solution, the one I would have proposed if I had not
>>>>>>>> considered previous comments from reviews, would be to just
>>>>>>>> compute the message in the runtime in the call of fillInStackTrace
>>>>>>>> before the old stack trace is lost and assign it to the message
>>>>>>>> field.
>>>>>>>> This way it would behave similar to all other exceptions. The
>>>>>>>> message
>>>>>>>> would just be there ... just that it's computed lazily.
>>>>>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>>>>>> algorithms (walking the stack) are performed at this point, too.
>>>>>>>>
>>>>>>>>> We are not talking about all exceptions only about your NPE
>>>>>>>>> extended
>>>>>>>>> error message.
>>>>>>>> Hmm, the inconsistency caused by the code you posted above
>>>>>>>> holds for all exceptions.  If you fiddle with the stack trace,
>>>>>>>> the message might become pointless.  Wrt. setStackTrace
>>>>>>>> they all behave the same.
>>>>>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>>>>>> needs to be fixed.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>     Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> I implemented an example where wrong stack traces are
>>>>>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>>>>>> See also the generated output added to a comment in the patch.
>>>>>>>>>> If the NPE message text was missing in the second printout, I
>>>>>>>>>> think
>>>>>>>>>> this really would be unexpected.
>>>>>>>>>> Please note that the correct message is printed after messing
>>>>>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>>>>>> message is printed.)
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>      Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> I guess the normal use case of setStackTrace is the other way
>>>>>>>>>>>> around:
>>>>>>>>>>>> Change the message and throw a new exception with the
>>>>>>>>>>>> existing stack trace:
>>>>>>>>>>>>
>>>>>>>>>>>> try {
>>>>>>>>>>>>     Object o = a.x;  // throws NPE if a is null
>>>>>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>>>>>     // setStackTrace returns void, so it cannot be chained into the throw.
>>>>>>>>>>>>     NullPointerException npe = new NullPointerException("My own error message");
>>>>>>>>>>>>     npe.setStackTrace(e.getStackTrace());
>>>>>>>>>>>>     throw npe;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> And not taking an arbitrary stack trace and put it into an
>>>>>>>>>>>> exception
>>>>>>>>>>>> with existing message.
>>>>>>>>>>> Interesting usage.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> 'forax at univ-
>>>>>>>>>>> mlv.fr'
>>>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman
>> <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>>>> hotspot-runtime-dev
>>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>> NullPointerException
>>>>>>>>> message
>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>>>> need to
>>>>>>>>>>> add
>>>>>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>>>>>> I added the volatile, too, but as I understand the
>>>>>>>>>>>>>> synchronized
>>>>>>>>>>>>>> block brings sufficient memory barriers that this also works
>>>>>>>>>>>>>> without.
>>>>>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>>>>>> synchronized
>>>>>>>>>>>>> regions.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To be honest the idea that someone would share an
>> exception
>>>>>>>>> instance
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>>>>> printing out
>>>>>>>>>>>>>>> information about it just seems highly unrealistic.
>>>>>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>>>>>> performance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>>>> The test shows that after setStackTrace still the correct
>>>>>>>>>>>>>> message
>>>>>>>>>>>>>> is computed. This is because the algorithm uses
>>>>>>>>>>>>>> Throwable::backtrace
>>>>>>>>>>>>>> and not Throwable::stacktrace. Throwable::backtrace is not
>>>>>>>>>>>>>> affected by setStackTrace.
>>>>>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>>>>>> points to.
>>>>>>>>>>>>> But you can't adapt the message text - there is no
>>>>>>>>>>>>> setMessage! If
>>>>>>>>>>>>> the
>>>>>>>>>>>>> message is NULL and you call setStackTrace() then
>>>>>>>>>>>>> getMessage(), it
>>>>>>>>> makes
>>>>>>>>>>>>> no sense to return the extended error message that was
>>>>>>>>>>>>> associated
>>>>>>>>> with
>>>>>>>>>>>>> the original stack/backtrace.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>        Goetz.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>> 'forax at univ-
>>>>>>>>>>>>> mlv.fr'
>>>>>>>>>>>>>>> <forax at univ-mlv.fr>; Alan Bateman
>> <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-
>>>>>> runtime-
>>>>>>>>> dev
>>>>>>>>>>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>>>>>>>>>>>>>>> NullPointerException
>>>>>>>>>>> message
>>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets
>>>>>>>>>>>>>>>> always the
>>>>>>>>>>>>>>>> right value.
>>>>>>>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>>>>>>>> getExtendedNPEMessage.  The other thread could change it
>>>>>>>>>>>>>>>> after checking numStackTracesFilledIn and before I read the
>>>>>>>>>>>>>>>> backtrace.
>>>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>>>> need to
>>>>>>>>>>> add
>>>>>>>>>>>>>>> synchronization in getMessage():
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       public String getMessage() {
>>>>>>>>>>>>>>>           String message = super.getMessage();
>>>>>>>>>>>>>>>           // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>>>>>           // will compute a wrong message.
>>>>>>>>>>>>>>> +         synchronized(this) {
>>>>>>>>>>>>>>> !             if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>>>>>> !                 return getExtendedNPEMessage();
>>>>>>>>>>>>>>> !             }
>>>>>>>>>>>>>>> +         }
>>>>>>>>>>>>>>>           return message;
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To be honest the idea that someone would share an
>> exception
>>>>>>>>> instance
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>>>>> printing out
>>>>>>>>>>>>>>> information about it just seems highly unrealistic. But the
>>>>>>>>>>>>>>> above fixes
>>>>>>>>>>>>>>> it simply. Though after looking at comments in the test I
>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>           synchronized (this) {
>>>>>>>>>>>>>>>               if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>>>>>                   backtrace == null) // Test for out of protocol state
>>>>>>>>>>>>>>>                   return;
>>>>>>>>>>>>>>> +             numStackTracesFilledIn++;
>>>>>>>>>>>>>>>               this.stackTrace = defensiveCopy;
>>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I much prefer the latest version that recognises that only
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> original
>>>>>>>>>>>>>>> stack can be processed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the test:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also
>>>>>>>>>>>>>>> for implicilty
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Two typos: crated? & implicilty
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Its drawback is only that for this code:
>>>>>>>>>>>>>>>>         ex = null;
>>>>>>>>>>>>>>>>         ex.fillInStackTrace();
>>>>>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous
>>>>>>>>>>>>>>>> mail:
>>>>>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace
>> at
>>>>>> some
>>>>>>>>>>> point.
>>>>>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java
>>>>>>>>>>>>>>>>> stack
>>>>>>>>>>>>>>>>> trace is
>>>>>>>>> filled
>>>>>>>>>>>>> you
>>>>>>>>>>>>>>> don't
>>>>>>>>>>>>>>>>> compute any helpful message anyway.
>>>>>>>>>>>>>>>> The internal structure is no longer deleted when the stack
>>>>>>>>>>>>>>>> trace is filled. So the message can be computed later, too.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>    Goetz.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> Christoph
>>>>>>>>>>> Dreis
>>>>>>>>>>>>>>>>> <christoph.dreis at freenet.de>; hotspot-runtime-dev
>> <hotspot-
>>>>>>>>>>> runtime-
>>>>>>>>>>>>>>>>> dev at openjdk.java.net>; David Holmes
>>>>>>>>> <david.holmes at oracle.com>
>>>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>>>>>> NullPointerException
>>>>>>>>>>>>> message
>>>>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> yes,
>>>>>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>>>>>> given that a NPE can be thrown very early, before
>>>>>>>>>>>>>>>>> VarHandle is
>>>>>>>>>>>>> initialized,
>>>>>>>>>>>>>>> i
>>>>>>>>>>>>>>>>> believe that declaring numStackTracesFilledIn volatile
>>>>>>>>>>>>>>>>> is the
>>>>>>>>>>>>>>>>> best
>>>>>>>>> way
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> tackle that.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
>>>>>>>>> "Christoph
>>>>>>>>>>>>>>> Dreis"
>>>>>>>>>>>>>>>>> <christoph.dreis at freenet.de>
>>>>>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-
>>>>>>>>>>> dev at openjdk.java.net>,
>>>>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi
>> Forax"
>>>>>>>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>>>>>>>> Sent: Thursday, 2 July 2020 20:47:31
>>>>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful
>>>>>> NullPointerException
>>>>>>>>>>> message
>>>>>>>>>>>>>>>>> after calling fillInStackTrace
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads
>>>>>>>>>>> numStackTracesFilledIn
>>>>>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Alan


From david.holmes at oracle.com  Wed Jul 15 11:36:49 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Jul 2020 21:36:49 +1000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
 <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
 <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <a23558d5-4e5b-4d82-be9d-16988458e6c4@oracle.com>

Hi Goetz,

On 15/07/2020 7:23 pm, Lindenmaier, Goetz wrote:
> Hi everybody,
> 
> First of all, thanks for all the feedback!
> 
> I updated webrev 06 to 06.2:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/
> 
> - removed volatile
> - renamed the field so it's obvious it's a ternary state,
>    documented the states and implemented it as proposed by
>    David
> - added a state change in getMessage(). This avoids repeated
>    calls to the native method in case it returns null.
> 
> Should I use finals for the states? Or enums?
>    private final int NO_BACKTRACE = 0;
>    private final int MUST_COMPUTE_LAZY_MESSAGE = 1;
>    private final int MESSAGE_COMPUTED = 2;

Static finals work okay for me - though if the state variable is still
a "counter" you don't need symbolic names for the states. But this is
good to go for me.
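
One detail that applies to either choice: Throwable's constructor calls
fillInStackTrace() before the subclass's field initializers run, so a state
field has to rely on its default value rather than on an explicit
initializer. A small sketch of that effect follows; the class names
StateKeptException, StateResetException and StateFieldDemo are hypothetical
and nothing here is taken from the webrev.

```java
// Hypothetical classes for illustration only.
class StateKeptException extends RuntimeException {
    static final int NO_BACKTRACE = 0;
    static final int MUST_COMPUTE_LAZY_MESSAGE = 1;

    // No initializer: the value written by fillInStackTrace() while
    // Throwable's constructor is running survives construction.
    int state;

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (state == NO_BACKTRACE) {
            state = MUST_COMPUTE_LAZY_MESSAGE;
        }
        return super.fillInStackTrace();
    }
}

class StateResetException extends RuntimeException {
    static final int NO_BACKTRACE = 0;
    static final int MUST_COMPUTE_LAZY_MESSAGE = 1;

    // Explicit initializer: it runs after the superclass constructor has
    // returned and overwrites the value set in fillInStackTrace().
    int state = NO_BACKTRACE;

    @Override
    public synchronized Throwable fillInStackTrace() {
        if (state == NO_BACKTRACE) {
            state = MUST_COMPUTE_LAZY_MESSAGE;
        }
        return super.fillInStackTrace();
    }
}

class StateFieldDemo {
    public static void main(String[] args) {
        System.out.println(new StateKeptException().state);   // prints 1
        System.out.println(new StateResetException().state);  // prints 0
    }
}
```

An enum-typed field would start out as null for the same reason, so an enum
variant would need to treat null as the initial state.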

Thanks,
David

> Best regards,
>    Goetz.
>    
> 
>> -----Original Message-----
>> From: hotspot-runtime-dev <hotspot-runtime-dev-retn at openjdk.java.net>
>> On Behalf Of coleen.phillimore at oracle.com
>> Sent: Tuesday, July 14, 2020 10:27 PM
>> To: Mandy Chung <mandy.chung at oracle.com>
>> Cc: hotspot-runtime-dev at openjdk.java.net
>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message
>> after calling fillInStackTrace
>>
>>
>>
>> On 7/14/20 4:17 PM, Mandy Chung wrote:
>>> fillInStackTrace and setStackTrace replace the stack trace of a NPE
>>> instance. Therefore I think both should behave consistently for any
>>> NPE instances with and without an explicit message.
>>>
>>> For webrev.06/webrev.07, this would behave as if NPE was created with
>>> an extended message which cannot be altered once constructed.  I
>>> expect that it'd be rare to see NPE instance thrown by VM (not
>>> explicitly constructed) but whose stack trace is replaced.  So I'm
>>> fine with this approach.
>>>
>>> webrev.06 is okay while I think checking Throwable::backtrace != null
>>> is clearer as I suggested.
>>
>> I like that version 06 isolates knowledge to NullPointerException.java
>> and doesn't have to know what the expected value of backtrace is in the
>> super class.  I maintain my vote for 06.
>>
>> Thanks, I was trying to understand the fillInStackTrace vs.
>> setStackTrace issue, and your description makes sense to me.
>>
>> Coleen
>>
>>>
>>> Mandy
>>>
>>> On 7/14/20 12:55 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Goetz and all,
>>>>
>>>> I have to admit, the version with the counter 06 is more intuitive to
>>>> me.  It would be even better if it was a boolean. I don't think an
>>>> extra 32 bits in an NPE Throwable matters considering the backtrace
>>>> is a lot bigger.  The NPE Throwable in general shouldn't be a long
>>>> lived object, and there shouldn't be thousands of them.
>>>>
>>>> There seemed to be disagreement on the issue of the message not
>>>> matching the stack trace if the code calls setStackTrace(). It
>>>> doesn't seem like it should be the same at all to fillInStackTrace()
>>>> to me, but this latest patch maintains the status quo.  If you want
>>>> to explore this further, I think you should file a separate RFE, and
>>>> fix the reported bug with this patch.
>>>>
>>>> So if I get a vote, I'd pick 06.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 7/14/20 9:48 AM, Lindenmaier, Goetz wrote:
>>>>> Hi,
>>>>>
>>>>> Yes, Coleen, you are right. We must preserve the lazy
>>>>> computation, and also reduce overhead on discarded
>>>>> exceptions.
>>>>>
>>>>> And yes, we can do it with a counter:
>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06/
>>>>>
>>>>> but I would prefer placeholder strings:
>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/07/
>>>>>
>>>>> This way we need only one new field.
>>>>>
>>>>> (I need two placeholders, because the getExtendedNPEMessage0()
>>>>> sometimes returns null. If I write null into the extendedMessage field,
>>>>> fillInStackTrace sets it to mustComputeExtendedNPEMessage a second
>>>>> time.)
>>>>>
>>>>> With webrev 07 the overhead on discarded exceptions is basically the
>>>>> same as with webrev 05: one additional field, one assignment in
>>>>> fillInStackTrace().
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Best regards,
>>>>>    Goetz.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>> Sent: Tuesday, July 14, 2020 1:55 PM
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>> 'forax at univ-mlv.fr'
>>>>>> <forax at univ-mlv.fr>; Alan Bateman <Alan.Bateman at oracle.com>
>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>; hotspot-runtime-dev
>>>>>> <hotspot-runtime-dev at openjdk.java.net>
>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>> message
>>>>>> after calling fillInStackTrace
>>>>>>
>>>>>> Correction ...
>>>>>>
>>>>>> On 14/07/2020 12:11 pm, David Holmes wrote:
>>>>>>> Hi Goetz,
>>>>>>>
>>>>>>> Okay ... if I understand your position correctly you are looking at
>>>>>>> this as if the extended message is created at the time the NPE is
>>>>>>> thrown, and it is an implementation detail that we actually determine
>>>>>>> it lazily. If it were eagerly determined then neither
>>>>>>> fillInStackTrace() nor setStackTrace() would make any difference to
>>>>>>> the message - just as with any other exception message.
>>>>>>>
>>>>>>> However, the lazy determination of the message causes a problem with
>>>>>>> fillInStackTrace() because that call will destroy the original
>>>>>>> backtrace needed to produce the original message, and create an
>>>>>>> incorrect message. setStackTrace() does not have a similar problem
>>>>>>> because, simply by the way the current implementation works, it
>>>>>>> doesn't touch the original backtrace.
>>>>>>>
>>>>>>> So you are proposing to only fix the bug that is evident in relation
>>>>>>> to fillInStackTrace() by no longer evaluating the extended message if
>>>>>>> fillInStackTrace() is called after the NPE was constructed.
>>>>>>>
>>>>>>> But in doing so you break the illusion that the extended message acts
>>>>>>> as-if determined at construction time, because you now effectively
>>>>>>> clear it when fillInStackTrace is called.
>>>>>>>
>>>>>>> My position was that if fillInStackTrace can be seen to clear it,
>>>>>>> then setStackTrace (which is logically somewhat equivalent) should
>>>>>>> also be seen to clear it.
>>>>>>>
>>>>>>> Alternatively, add a new field to NPE to cache the extended error
>>>>>>> message, and explicitly evaluate the message if fillInStackTrace() is
>>>>>>> called. That will continue the illusion that the extended message was
>>>>>>> actually set at construction time. No changes needed to
>>>>>>> setStackTrace() as we can still lazily compute the extended message.
>>>>>>>
>>>>>>> Something like:
>>>>>>>
>>>>>>> private String extendedMessage;
>>>>>>>
>>>>>>> public synchronized Throwable fillInStackTrace() {
>>>>>>>      if (extendedMessage == null) {
>>>>>>>          extendedMessage = getExtendedNPEMessage();
>>>>>>>      }
>>>>>>>      return super.fillInStackTrace();
>>>>>>> }
>>>>>> Coleen pointed out to me that we can't do it like this because we need
>>>>>> the initial fillInStackTrace to be fast and we want the extended
>>>>>> message computed lazily. So it will still need a counter so we only do
>>>>>> this on the second call.
>>>>>>
>>>>>>
>>>>>>     private String extendedMessage;
>>>>>>     private int fillInCount;
>>>>>>
>>>>>>     public synchronized Throwable fillInStackTrace() {
>>>>>>         if (extendedMessage == null && (fillInCount++ == 1)) {
>>>>>>             extendedMessage = getExtendedNPEMessage();
>>>>>>         }
>>>>>>         return super.fillInStackTrace();
>>>>>>     }
>>>>>>
>>>>>> or something to that effect.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> public String getMessage() {
>>>>>>>      String message = super.getMessage();
>>>>>>>      synchronized(this) {
>>>>>>>          if (message == null) {
>>>>>>>              // This NPE should have an extended message.
>>>>>>>              if (extendedMessage == null) {
>>>>>>>                  extendedMessage = getExtendedNPEMessage();
>>>>>>>              }
>>>>>>>              message = extendedMessage;
>>>>>>>          }
>>>>>>>      }
>>>>>>>      return message;
>>>>>>> }
>>>>>>>
>>>>>>> Cheers,
>>>>>>> David
>>>>>>>
>>>>>>> On 14/07/2020 12:48 am, Lindenmaier, Goetz wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>>> Your extended message is only computed when there is no original
>>>>>>>>> message.
>>>>>>>> Hmm. I would say the extended message is only computed when
>>>>>>>> the NPE was raised by the runtime. It happens to never have a
>>>>>>>> message so far in these cases.
>>>>>>>> But these are two views of the same thing.
>>>>>>>>
>>>>>>>>> You're concerned about this scenario:
>>>>>>>>>
>>>>>>>>> catch (NullPointerException npe) {
>>>>>>>>>      String msg1 = npe.getMessage(); // gets extended NPE message
>>>>>>>>>      npe.setStackTrace(...);
>>>>>>>>>      String msg2 = npe.getMessage(); // gets null
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> While I find it hard to imagine anyone doing this
>>>>>>>> Well, all the scenarios are quite artificial:
>>>>>>>>    - why would you call fillInStackTrace on an exception thrown
>>>>>>>>      by the VM?
>>>>>>>>    - why would you call setStackTrace at all?
>>>>>>>>> you can easily have
>>>>>>>>> specified that the extended message is only available with the
>>>>>>>>> original stacktrace, hence after a second call to
>>>>>>>>> fillInStackTrace, or a call to setStackTrace, then the message
>>>>>>>>> reverts to being empty.
>>>>>>>> The message is not meant to be a special thing that behaves
>>>>>>>> differently from other messages, such as sometimes being available
>>>>>>>> and sometimes not. It ended up being different through requirements
>>>>>>>> during the review.
>>>>>>>>
>>>>>>>>> To me that makes
>>>>>>>>> far more sense than having msg2 continue to report the extended
>>>>>>>>> info for the original stacktrace when it now has a new stacktrace.
>>>>>>>>>
>>>>>>>>> I'm really not seeing why calling fillInStackTrace() a second time
>>>>>>>>> should be treated any differently to calling setStackTrace(). They
>>>>>>>>> should be handled consistently IMO.
>>>>>>>> But then you treat setStackTrace() differently from setStackTrace()
>>>>>>>> with other exceptions.
>>>>>>>> The reason to treat fillInStackTrace differently is that we lost
>>>>>>>> information needed to compute it. This is not the case with
>>>>>>>> setStackTrace().
>>>>>>>>
>>>>>>>> A different solution, the one I would have proposed if I had not
>>>>>>>> considered previous comments from reviews, would be to just
>>>>>>>> compute the message in the runtime in the call of fillInStackTrace
>>>>>>>> before the old stack trace is lost and assign it to the message
>>>>>>>> field.
>>>>>>>> This way it would behave similar to all other exceptions. The
>>>>>>>> message would just be there ... just that it's computed lazily.
>>>>>>>> The cost of the algorithm wouldn't harm that much as other costly
>>>>>>>> algorithms (walking the stack) are performed at this point, too.
>>>>>>>>
>>>>>>>>> We are not talking about all exceptions only about your NPE
>>>>>>>>> extended error message.
>>>>>>>> Hmm, the inconsistency caused by the code you posted above
>>>>>>>> holds for all exceptions. If you fiddle with the stack trace,
>>>>>>>> the message might become pointless. Wrt. setStackTrace
>>>>>>>> they all behave the same.
>>>>>>>> Wrt. fillInStackTrace the message will be wrong. Only this
>>>>>>>> needs to be fixed.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>     Goetz.
>>>>>>>>
>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> I implemented an example where wrong stack traces are
>>>>>>>>>> printed with LinkageError and NPE, modifying a jtreg test:
>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/mess_with_exceptions.patch
>>>>>>>>>> See also the generated output added to a comment in the patch.
>>>>>>>>>> If the NPE message text was missing in the second printout, I
>>>>>>>>>> think this really would be unexpected.
>>>>>>>>>> Please note that the correct message is printed after messing
>>>>>>>>>> with the stack trace, it's the stack trace that is wrong.
>>>>>>>>>> (Not as with the problem I am fixing here where a wrong
>>>>>>>>>> message is printed.)
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>      Goetz.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> I guess the normal usecase of setStackTrace is the other way
>>>>>>>>>>>> around:
>>>>>>>>>>>> Change the message and throw a new exception with the existing
>>>>>>>>>>>> stack trace:
>>>>>>>>>>>>
>>>>>>>>>>>> try {
>>>>>>>>>>>>       a.x;
>>>>>>>>>>>> } catch (NullPointerException e) {
>>>>>>>>>>>>       NullPointerException npe =
>>>>>>>>>>>>           new NullPointerException("My own error message");
>>>>>>>>>>>>       // setStackTrace() returns void, so it cannot be chained.
>>>>>>>>>>>>       npe.setStackTrace(e.getStackTrace());
>>>>>>>>>>>>       throw npe;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> And not taking an arbitrary stack trace and put it into an
>>>>>>>>>>>> exception with existing message.
>>>>>>>>>>> Interesting usage.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> David
>>>>>>>>>>> -----
>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>       Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>> Sent: Friday, July 3, 2020 9:30 AM
>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
>>>>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/07/2020 4:32 pm, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/05/
>>>>>>>>>>>>>> I added the volatile, too, but as I understand the
>>>>>>>>>>>>>> synchronized block brings sufficient memory barriers that this
>>>>>>>>>>>>>> also works without.
>>>>>>>>>>>>> No "volatile" needed, or wanted, when all access is within
>>>>>>>>>>>>> synchronized regions.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>>>>>> Yes, contention here is quite unlikely, so it should not harm
>>>>>>>>>>>>>> performance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Contention was not my concern at all. :)
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Though after looking at comments in the test I would also
>>>>>>>>>>>>>>> suggest that setStackTrace be updated:
>>>>>>>>>>>>>> The test shows that after setStackTrace still the correct
>>>>>>>>>>>>>> message is computed. This is because the algorithm uses
>>>>>>>>>>>>>> Throwable::backtrace and not Throwable::stacktrace.
>>>>>>>>>>>>>> Throwable::backtrace is not affected by setStackTrace.
>>>>>>>>>>>>>> The behavior is just as with any exception. If you fiddle
>>>>>>>>>>>>>> with the stack trace, but don't adapt the message text,
>>>>>>>>>>>>>> the message might refer to other code than the stack trace
>>>>>>>>>>>>>> points to.
>>>>>>>>>>>>> But you can't adapt the message text - there is no setMessage! If
>>>>>>>>>>>>> the message is NULL and you call setStackTrace() then getMessage(),
>>>>>>>>>>>>> it makes no sense to return the extended error message that was
>>>>>>>>>>>>> associated with the original stack/backtrace.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>        Goetz.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>>>> Sent: Friday, July 3, 2020 3:37 AM
>>>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>>>> 'forax at univ-mlv.fr' <forax at univ-mlv.fr>;
>>>>>>>>>>>>>>> Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>> Cc: Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Goetz,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 3/07/2020 5:30 am, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>> Hi Remi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But how does volatile help?
>>>>>>>>>>>>>>>> I see the test for numStackTracesFilledIn == 1 then gets
>>>>>>>>>>>>>>>> always the right value.
>>>>>>>>>>>>>>>> But the backtrace may not be changed until I read it in
>>>>>>>>>>>>>>>> getExtendedNPEMessage. The other thread could change it after
>>>>>>>>>>>>>>>> checking numStackTracesFilledIn and before I read the backtrace.
>>>>>>>>>>>>>>> True. To ensure you process the original backtrace only you
>>>>>>>>>>>>>>> need to add synchronization in getMessage():
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   public String getMessage() {
>>>>>>>>>>>>>>>       String message = super.getMessage();
>>>>>>>>>>>>>>>       // If the stack trace was changed the extended NPE algorithm
>>>>>>>>>>>>>>>       // will compute a wrong message.
>>>>>>>>>>>>>>> +     synchronized(this) {
>>>>>>>>>>>>>>> !         if (message == null && numStackTracesFilledIn == 1) {
>>>>>>>>>>>>>>> !             return getExtendedNPEMessage();
>>>>>>>>>>>>>>> !         }
>>>>>>>>>>>>>>> +     }
>>>>>>>>>>>>>>>       return message;
>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To be honest the idea that someone would share an exception
>>>>>>>>>>>>>>> instance and concurrently mutate it with fillInStackTrace() whilst
>>>>>>>>>>>>>>> printing out information about it just seems highly unrealistic.
>>>>>>>>>>>>>>> But the above fixes it simply. Though after looking at comments in
>>>>>>>>>>>>>>> the test I would also suggest that setStackTrace be updated:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       synchronized (this) {
>>>>>>>>>>>>>>>           if (this.stackTrace == null && // Immutable stack
>>>>>>>>>>>>>>>               backtrace == null) // Test for out of protocol state
>>>>>>>>>>>>>>>               return;
>>>>>>>>>>>>>>> +         numStackTracesFilledIn++;
>>>>>>>>>>>>>>>           this.stackTrace = defensiveCopy;
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as that would seem to be another hole in the mechanism.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I want to vote again for the much more simple version
>>>>>>>>>>>>>>>> proposed in webrev 02:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/02/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I much prefer the latest version that recognises that only the
>>>>>>>>>>>>>>> original stack can be processed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the test:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +         // This holds for explicitly crated NPEs, but also for implicilty
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Two typos: crated & implicilty
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Its drawback is only that for this code:
>>>>>>>>>>>>>>>>         ex = null;
>>>>>>>>>>>>>>>>         ex.fillInStackTrace()
>>>>>>>>>>>>>>>> no message is created.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think this really is acceptable.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Remi, I didn't comment on this statement from a previous
>>>>>>>>>>>>>>>> mail:
>>>>>>>>>>>>>>>>>> Hmm, Throwable.stackTrace is used for the stack trace at some
>>>>>>>>>>>>>>>>>> point.
>>>>>>>>>>>>>>>>> yes, it contains the Java stack trace, but if the Java stack
>>>>>>>>>>>>>>>>> trace is filled you don't compute any helpful message anyway.
>>>>>>>>>>>>>>>> The internal structure is no longer deleted when the stack
>>>>>>>>>>>>>>>> trace is filled. So the message can be computed later, too.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>         Goetz.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>> From: forax at univ-mlv.fr <forax at univ-mlv.fr>
>>>>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 8:52 PM
>>>>>>>>>>>>>>>>> To: Alan Bateman <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>>>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>>>>>>>>>>>>>>> Christoph Dreis <christoph.dreis at freenet.de>;
>>>>>>>>>>>>>>>>> hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>;
>>>>>>>>>>>>>>>>> David Holmes <david.holmes at oracle.com>
>>>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> yes,
>>>>>>>>>>>>>>>>> it's what i was saying,
>>>>>>>>>>>>>>>>> given that a NPE can be thrown very early, before VarHandle is
>>>>>>>>>>>>>>>>> initialized, i believe that declaring numStackTracesFilledIn
>>>>>>>>>>>>>>>>> volatile is the best way to tackle that.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Rémi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>>> From: "Alan Bateman" <Alan.Bateman at oracle.com>
>>>>>>>>>>>>>>>>>> To: "Goetz Lindenmaier" <goetz.lindenmaier at sap.com>,
>>>>>>>>>>>>>>>>>> "Christoph Dreis" <christoph.dreis at freenet.de>
>>>>>>>>>>>>>>>>>> Cc: "hotspot-runtime-dev" <hotspot-runtime-dev at openjdk.java.net>,
>>>>>>>>>>>>>>>>>> "David Holmes" <david.holmes at oracle.com>, "Remi Forax"
>>>>>>>>>>>>>>>>>> <forax at univ-mlv.fr>
>>>>>>>>>>>>>>>>>> Sent: Thursday, July 2, 2020 20:47:31
>>>>>>>>>>>>>>>>>> Subject: Re: [15] RFR: 8248476: No helpful NullPointerException
>>>>>>>>>>>>>>>>>> message after calling fillInStackTrace
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 02/07/2020 17:45, Lindenmaier, Goetz wrote:
>>>>>>>>>>>>>>>>>>> Hi Christoph,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I fixed the comment, thanks for pointing that out.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One other thing is that NPE::getMessage reads numStackTracesFilledIn
>>>>>>>>>>>>>>>>>> without synchronization.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -Alan
>>>>
>>>
> 
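
To make the counter idea discussed in this thread concrete, here is a minimal, self-contained sketch. It is not the code from webrev.06 or webrev.07: LazyMessageException, computeExtendedMessage() and the "raised in ..." text are hypothetical stand-ins, because the VM's getExtendedNPEMessage() cannot be called from user code. The sketch assumes that the first fillInStackTrace() call comes from the Throwable constructor and must stay cheap, and that any later call is about to replace the original trace.

```java
// Hypothetical model of the counter approach; not the actual NPE code.
final class LazyMessageException extends RuntimeException {
    // No initializers on purpose: the super constructor calls
    // fillInStackTrace() before this class's initializers would run.
    private int fillInCount;
    private String extendedMessage;
    private boolean computed;

    // Stand-in for the VM's message algorithm: it may only be used
    // while the original stack trace is still in place.
    private String computeExtendedMessage() {
        StackTraceElement[] trace = getStackTrace();
        return trace.length == 0 ? null : "raised in " + trace[0].getMethodName();
    }

    @Override
    public synchronized Throwable fillInStackTrace() {
        // First call (from the constructor) only counts. On the second
        // call, cache the message before the original trace is lost.
        if (fillInCount++ == 1 && !computed) {
            extendedMessage = computeExtendedMessage();
            computed = true;
        }
        return super.fillInStackTrace();
    }

    @Override
    public synchronized String getMessage() {
        // Lazy path: if nothing was cached yet, the original trace is intact.
        if (!computed) {
            extendedMessage = computeExtendedMessage();
            computed = true;
        }
        return extendedMessage;
    }

    static LazyMessageException raise() {
        return new LazyMessageException();
    }

    static void refill(Throwable t) {
        t.fillInStackTrace();
    }

    public static void main(String[] args) {
        LazyMessageException e = raise();
        refill(e); // replaces the trace, but the message was cached first
        System.out.println(e.getMessage());
        System.out.println(e.getStackTrace()[0].getMethodName());
    }
}
```

The separate `computed` flag plays the role of the second placeholder mentioned for webrev.07: the stand-in computation may legitimately return null, and a null check alone would then trigger a second computation against the wrong trace.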

From luhenry at microsoft.com  Wed Jul 15 13:15:14 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 15 Jul 2020 13:15:14 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
Message-ID: <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi David,

Thanks for your feedback.

> can we use __int32 for clarity rather than "long"?

The Win32 API explicitly uses `long`, and I made sure for these `DEFINE_` macros to use the type used in the declaration of the API. If you are ok with the difference, I'm happy to change that to __int32.

I've uploaded the new webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.01

(I've also moved the previous webrevs to their respective webrev.00 folders).

Thank you,

--
Ludovic
________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Sunday, July 12, 2020 19:43
To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

Hi Ludovic,

On 9/07/2020 11:55 pm, Ludovic Henry wrote:
> Hello,
>
> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
>
> The three areas I have found are:
> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
> - Exception handling: there is some code repetition which, even if functional, is subpar.
> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
>
> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.

Thanks for doing the split!

As a general comment can you please ensure that the Oracle copyright
second year is updated to 2020. Thanks.

Overall these cleanups look good. Thanks for providing them.

> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
> Webrevs:
> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/

Love this cleanup! Great to see all the stubroutines go for x86.

src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp

Please delete this entire (archaic) comment block.

  // The following alternative implementations are needed because
  // Windows 95 doesn't support (some of) the corresponding Windows NT
  // calls. Furthermore, these versions allow inlining in the caller.
  // (More precisely: The documentation for InterlockedExchange says
  // it is supported for Windows 95. However, when single-stepping
  // through the assembly code we cannot step into the routine and
  // when looking at the routine address we see only garbage code.
  // Better safe then sorry!). Was bug 7/31/98 (gri).
  //
  // Performance note: On uniprocessors, the 'lock' prefixes are not
  // necessary (and expensive). We should generate separate cases if
  // this becomes a performance problem.

In this (and elsewhere):

  DEFINE_STUB_ADD(4, long,    InterlockedAdd)
  DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)

can we use __int32 for clarity rather than "long"?

> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/

Looks good!

> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/

Looks good!

Thanks,
David
-----

> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
>
> Thank you,
>
> --
> Ludovic
>

From david.holmes at oracle.com  Wed Jul 15 13:32:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 15 Jul 2020 23:32:37 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>

Hi Ludovic,

On 15/07/2020 11:15 pm, Ludovic Henry wrote:
> Hi David,
> 
> Thanks for your feedback.
> 
>> can we use __int32 for clarity rather than "long"?
> 
> The Win32 API explicitly uses `long`, and I made sure for these `DEFINE_` macros to use the type used in the declaration of the API. If you are ok with the difference, I'm happy to change that to __int32.

I prefer to see __int32 in our code as "long" can be quite ambiguous
depending on the reader (and something we are trying to eradicate from
shared code - not everyone is aware of the LLP64 programming model versus
LP64).

Thanks,
David
-----

> I've uploaded the new webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.01
> 
> (I've also moved the previous webrevs to their respective webrev.00 folders).
> 
> Thank you,
> 
> --
> Ludovic
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Sunday, July 12, 2020 19:43
> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
> Cc: openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
> 
> Hi Ludovic,
> 
> On 9/07/2020 11:55 pm, Ludovic Henry wrote:
>> Hello,
>>
>> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
>>
>> The three areas I have found are:
>> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
>> - Exception handling: there is some code repetition which, even if functional, is subpar.
>> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
>>
>> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.
> 
> Thanks for doing the split!
> 
> As a general comment can you please ensure that the Oracle copyright
> second year is updated to 2020. Thanks.
> 
> Overall these cleanups look good. Thanks for providing them.
> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
>> Webrevs:
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/
> 
> Love this cleanup! Great to see all the stubroutines go for x86.
> 
> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
> 
> Please delete this entire (archaic) comment block.
> 
>   // The following alternative implementations are needed because
>   // Windows 95 doesn't support (some of) the corresponding Windows NT
>   // calls. Furthermore, these versions allow inlining in the caller.
>   // (More precisely: The documentation for InterlockedExchange says
>   // it is supported for Windows 95. However, when single-stepping
>   // through the assembly code we cannot step into the routine and
>   // when looking at the routine address we see only garbage code.
>   // Better safe then sorry!). Was bug 7/31/98 (gri).
>   //
>   // Performance note: On uniprocessors, the 'lock' prefixes are not
>   // necessary (and expensive). We should generate separate cases if
>   // this becomes a performance problem.
> 
> In this (and elsewhere):
> 
>   DEFINE_STUB_ADD(4, long,    InterlockedAdd)
>   DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)
> 
> can we use __int32 for clarity rather than "long"?
> 
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
> 
> Looks good!
> 
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
> 
> Looks good!
> 
> Thanks,
> David
> -----
> 
>> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
>>
>> Thank you,
>>
>> --
>> Ludovic
>>

From daniel.daugherty at oracle.com  Wed Jul 15 14:23:30 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 15 Jul 2020 10:23:30 -0400
Subject: [15] RFR(S): 8246676: monitor list lock operations need more
 fencing and 8247280
In-Reply-To: <4d874d07-2c43-ebc4-7d2a-a8fa15e914b1@oracle.com>
References: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>
 <4d874d07-2c43-ebc4-7d2a-a8fa15e914b1@oracle.com>
Message-ID: <ffbedf68-4795-805e-bbf7-bf136c289a8d@oracle.com>

On 7/15/20 2:46 AM, David Holmes wrote:
> Hi Dan,
>
> On 15/07/2020 12:30 pm, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> These fixes are targeted for JDK15 and I would like to push both of 
>> these
>> fixes before the RDP2 cutoff on Thursday.
>>
>> I have a JDK15 fix ready for a couple of related ObjectMonitor bug 
>> fixes:
>>
>>      JDK-8246676 monitor list lock operations need more fencing
>>      https://bugs.openjdk.java.net/browse/JDK-8246676
>>
>>      JDK-8247280 more fencing needed in async deflation for non-TSO machines
>>      https://bugs.openjdk.java.net/browse/JDK-8247280
>>
>> The fix for JDK-8246676 has been through three rounds of preliminary
>> code review with David H.; Erik O. and Robbin participated in the first
>> preliminary code review round. Mostly comment changes or backouts of
>> code changes to return to the baseline code were done in the second and
>> third preliminary rounds.
>>
>> The fix for JDK-8247280 has been through two rounds of preliminary
>> code review with David H. The bug fix itself was suggested by Erik O. so
>> he's likely on-board with my implementation of the fix. :-)
>>
>> Many thanks to David H., Erik O. and Robbin for their many emails on
>> these topics and for reviewing the preliminary webrevs.
>>
>> Here are the two webrevs:
>>
>> http://cr.openjdk.java.net/~dcubed/8247280-webrev/0-for-jdk15/
>
> My only follow up here is the proper fix for using Atomics with enums. 
> The patch is below and I've tested it with tiers 1-3 (link sent 
> separately).

Thanks! I appreciate the help with the metaprogramming stuff...


>
>> http://cr.openjdk.java.net/~dcubed/8246676-webrev/0-for-jdk15/
>
> Also looks good.

Thanks!


>
>> The project is currently baselined on jdk-15+30 and has gone through
>> Mach5 Tier[1-3],4,5,6,7,8 testing with no regressions. I've also run
>> my inflation stress kit on Linux-X64 and macOSX without any regressions.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>
> Thanks for working through all the details on this.

Thank you for the many emails, discussions and reviews. This memory
consistency stuff is hard to get right.


>
> David
> -----
>
> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.hpp
> --- a/src/hotspot/share/runtime/objectMonitor.hpp
> +++ b/src/hotspot/share/runtime/objectMonitor.hpp
> @@ -27,6 +27,7 @@
>
>  #include "memory/allocation.hpp"
>  #include "memory/padded.hpp"
> +#include "metaprogramming/isRegisteredEnum.hpp"
>  #include "oops/markWord.hpp"
>  #include "runtime/os.hpp"
>  #include "runtime/park.hpp"
> @@ -372,4 +373,7 @@
>     void      install_displaced_markword_in_object(const oop obj);
>  };
>
> +// Register for atomic operations.
> +template<> struct IsRegisteredEnum<ObjectMonitor::AllocationState> : public TrueType {};
> +
>  #endif // SHARE_RUNTIME_OBJECTMONITOR_HPP
> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.inline.hpp
> --- a/src/hotspot/share/runtime/objectMonitor.inline.hpp
> +++ b/src/hotspot/share/runtime/objectMonitor.inline.hpp
> @@ -196,7 +196,7 @@
>  }
>
>  inline void ObjectMonitor::release_set_allocation_state(ObjectMonitor::AllocationState s) {
> -  Atomic::release_store((int*)&_allocation_state, (int)s);
> +  Atomic::release_store(&_allocation_state, s);
>  }
>
>  inline void ObjectMonitor::set_allocation_state(ObjectMonitor::AllocationState s) {
> @@ -208,7 +208,7 @@
>  }
>
>  inline ObjectMonitor::AllocationState ObjectMonitor::allocation_state_acquire() const {
> -  return (AllocationState)Atomic::load_acquire((int*)&_allocation_state);
> +  return Atomic::load_acquire(&_allocation_state);
>  }
>
>  inline bool ObjectMonitor::is_free() const {

That ended up being much simpler than I imagined late last night... :-)

Thanks for coding this up and for running it through a Mach5 Tier[1-3].
I very much appreciate it.
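
For readers following along, the idea behind the patch can be sketched outside
of HotSpot roughly as below. This is only an illustration under assumptions:
IsRegisteredEnumSketch, AllocationStateSketch, release_store_enum and
load_acquire_enum are made-up names, and std::atomic stands in for HotSpot's
Atomic class. It shows how a one-line trait specialization can let atomic
helpers accept an enum directly, so call sites need no int casts.

```cpp
#include <atomic>
#include <type_traits>

// Hypothetical stand-in for HotSpot's IsRegisteredEnum trait: false by
// default, specialized to true for enums that may be used atomically.
template <typename E>
struct IsRegisteredEnumSketch : std::false_type {};

// Illustrative enum; the enumerator names are assumptions.
enum class AllocationStateSketch : int { Free, New, Old };

// Register the enum, mirroring the one-line specialization in the patch.
template <>
struct IsRegisteredEnumSketch<AllocationStateSketch> : std::true_type {};

// Release store that takes the enum as-is. Unregistered enums are rejected
// at compile time instead of being cast to int at each call site.
template <typename E>
void release_store_enum(std::atomic<E>* dest, E value) {
  static_assert(IsRegisteredEnumSketch<E>::value,
                "enum is not registered for atomic use");
  dest->store(value, std::memory_order_release);
}

// Acquire load that returns the enum type directly.
template <typename E>
E load_acquire_enum(const std::atomic<E>* src) {
  static_assert(IsRegisteredEnumSketch<E>::value,
                "enum is not registered for atomic use");
  return src->load(std::memory_order_acquire);
}
```

With this in place a call such as release_store_enum(&state, AllocationStateSketch::Old)
compiles only for registered enums, which is the property the real trait provides.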

Dan


From jamsheed.c.m at oracle.com  Wed Jul 15 15:55:44 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Wed, 15 Jul 2020 21:25:44 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock* below
 than low_mark"
Message-ID: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>

Hi,

Async handling at method entry needs to be aware of synchronization 
(i.e. whether it is doing async handling before or after the lock is 
acquired).

This is required because the exception handler relies on this info for 
unlocking. The async handling code never handled this special condition; 
it worked most of the time because we were using biased locking, which 
got disabled by [1].

There was one other issue reported around the same time [2]. This issue 
got triggered in the test case by [3]: a back-to-back extra safepoint 
after suspend, and TLH for ThreadDeath. So in this setup both the 
PopFrame request and the Thread.Stop request happened together for the 
test scenario, and it reached Java method entry with pending_exception set.

I have done a partial fix for the issue, mainly to handle production 
mode crash failures (the "do not unlock" flag related ones).

Fix detail:

1) I save and restore the "do not unlock" flag in async handling.

2) Return for a floating pending exception in some cases (PopFrame, early 
return related). This is a debug (JVMTI) feature, and a floating exception 
can get cleaned just like that in the present compiler request and deopt code.
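
As an aside for readers, point 1) follows the usual save/restore pattern. A
minimal sketch is below, assuming a hypothetical per-thread flag; ThreadSketch,
DoNotUnlockFlagSaver and handle_async_exception_sketch are illustrative names
and are not taken from the webrev, which may implement this differently.

```cpp
// Hypothetical stand-in for the per-thread state; the field name is
// illustrative only.
struct ThreadSketch {
  bool do_not_unlock_if_synchronized = false;
};

// RAII helper: remembers the flag on entry and puts it back on exit, so
// whatever the async handling does in between cannot leak out.
class DoNotUnlockFlagSaver {
 public:
  explicit DoNotUnlockFlagSaver(ThreadSketch* thread)
      : thread_(thread), saved_(thread->do_not_unlock_if_synchronized) {}
  ~DoNotUnlockFlagSaver() { thread_->do_not_unlock_if_synchronized = saved_; }

  DoNotUnlockFlagSaver(const DoNotUnlockFlagSaver&) = delete;
  DoNotUnlockFlagSaver& operator=(const DoNotUnlockFlagSaver&) = delete;

 private:
  ThreadSketch* thread_;
  bool saved_;
};

// Illustrative async-handling path: the body may clobber the flag, but the
// caller sees the value it had on entry.
inline void handle_async_exception_sketch(ThreadSketch* thread) {
  DoNotUnlockFlagSaver saver(thread);
  // Stands in for handling code that touches the flag.
  thread->do_not_unlock_if_synchronized = false;
}
```

The point of the guard is that the exception handler which runs later still
sees the flag value that matches the lock state at method entry.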

webrev: http://cr.openjdk.java.net/~jcm/8246381/webrev.02/

There are more problems in these code areas; for example, we clear all 
exceptions in the compilation request path (interpreter, C1), as well as 
in the deoptimization path.

All these un-handled cases will be separately handled by 
https://bugs.openjdk.java.net/browse/JDK-8249451

Request for review.

Best regards,

Jamsheed

[1] https://bugs.openjdk.java.net/browse/JDK-8231264

[2] https://bugs.openjdk.java.net/browse/JDK-8246727

[3] https://bugs.openjdk.java.net/browse/JDK-8221207


From mandy.chung at oracle.com  Wed Jul 15 16:05:00 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 15 Jul 2020 09:05:00 -0700
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
 <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
 <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <8a8c885f-11be-ce4b-65a1-06332465887a@oracle.com>


On 7/15/20 2:23 AM, Lindenmaier, Goetz wrote:
> Hi everybody,
>
> First of all, thanks for all the feedback!
>
> I updated webrev 06 to 06.2:
> http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/

Looks okay with me.

A typo in the test - should be "expectedMessage"

1307 String excpectedMessage =


Mandy


From luhenry at microsoft.com  Wed Jul 15 17:00:02 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 15 Jul 2020 17:00:02 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
Message-ID: <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi David,

>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
>
> That is good to know. But this is something that Microsoft should be
> documenting explicitly - even if just a blanket statement that all
> syscalls (which are what exactly?) provide an implicit memory barrier
> (of what type exactly?).

I don't think we can assume SetEvent has a barrier just because it is a syscall (even though syscalls do guarantee a barrier); it's more that SetEvent is the equivalent of sem_post. And if you could not assume that sem_post or SetEvent guarantees a memory barrier (full, or at least store_release), then you could not trust any standard locking mechanism (what's the point of synchronizing if the CPU can load and store outside of the critical section?).

>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We run successfully jcstress in `tough` mode.
>
> jcstress tests will execute the native runtime code of course, but they
> won't be "stressing" it as such.

Makes sense, thanks for the clarification.

--
Ludovic

I agree with you on the value of more explicit documentation, and I'll go look for it. If it doesn't exist, I'll put in a request to have it documented somewhere on docs.microsoft.com. In the meantime, it is safe to assume that SetEvent contains a memory barrier that has at least store_release semantics. Similarly, WaitForSingleObject and WaitForMultipleObjects have at least a load_acquire memory barrier, and are also syscalls (actually guaranteeing a full memory barrier).
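
To make the ordering requirement concrete, here is a portable sketch of the
pattern under discussion. It is an assumption-laden illustration, not the
ThreadCritical code: HandoffSketch and its members are made-up names, and a
std::atomic<bool> stands in for the Win32 event, with the release store playing
the role of SetEvent and the acquire load the role of WaitForSingleObject.

```cpp
#include <atomic>

struct HandoffSketch {
  int lock_owner = 1;                       // plain, non-atomic data
  std::atomic<bool> event_signaled{false};  // stand-in for the event object

  // Unlock side: clear the owner, then signal. The release store keeps the
  // write to lock_owner from being reordered after the signal.
  void unlock() {
    lock_owner = 0;
    event_signaled.store(true, std::memory_order_release);
  }

  // Waiter side: the acquire load pairs with the release store, so once the
  // signal is observed the cleared lock_owner is guaranteed to be visible.
  int wait_then_read_owner() {
    while (!event_signaled.load(std::memory_order_acquire)) {
      // Spin until signaled; a real waiter would block instead.
    }
    return lock_owner;
  }
};
```

If the signal had only relaxed ordering, a waiter on a non-TSO machine could
observe the signal and still read the stale owner, which is the failure mode
raised earlier in the thread.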

________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Monday, July 13, 2020 19:25
To: Ludovic Henry; Andrew Haley; Thomas Stüfe
Cc: Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

Hi Ludovic,

On 14/07/2020 11:28 am, Ludovic Henry wrote:
> Hello,
>
>> But if we are dealing with non-TSO races then it would be good to get
>> some guidance from Microsoft as to the memory ordering properties of
>> various API's to ensure that we are maintaining correct ordering. For
>> example, in the destructor we have:
>>
>> 81     lock_owner = 0;
>> 82     // No lost wakeups, lock_event stays signaled until reset.
>> 83     DWORD ret = SetEvent(lock_event);
>>
>> but unless we are guaranteed that the store to lock_owner cannot be
>> reordered by the compiler or the hardware, to appear to be after the
>> SetEvent, then the logic is broken. Generally, because Windows only
>> supported TSO systems, we have assumed that the compiler will not
>> reorder code across these kind of API calls. But now we also need
>> hardware guarantees.
>
> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.

That is good to know. But this is something that Microsoft should be
documenting explicitly - even if just a blanket statement that all
syscalls (which are what exactly?) provide an implicit memory barrier
(of what type exactly?).

> As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.

The /volatile:ms is the kind of build control I was wondering about.
Thanks for the pointer.

> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We run successfully jcstress in `tough` mode.

jcstress tests will execute the native runtime code of course, but they
won't be "stressing" it as such.

Cheers,
David
-----

> I hope this helps to answer your questions.
>
> [1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior
>
> --
> Ludovic
> ________________________________________
> From: Andrew Haley <aph at redhat.com>
> Sent: Monday, July 13, 2020 01:36
> To: David Holmes; Thomas Stüfe
> Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>
> On 13/07/2020 06:48, David Holmes wrote:
>> Hi Thomas,
>>
>> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>>
>>> Can a compiler reorder system calls and stores? How would it determine
>>> if this is safe to do?
>
> I very much doubt it.
>
>> A compiler can reorder anything it likes if it can determine it is safe
>> to do so. :)
>
> I'm fairly sure the compiler doesn't care about that!
>
>>> I'd be surprised if Microsoft loosened up reordering since this would
>>> mean existing software cannot just be recompiled for arm and expected to
>>> work. But this is just a guess of course.
>>
>> It's an interesting point because I would expect there to be a lot of
>> software written for Windows that contains assumptions of TSO that would
>> in fact fail when run on Aarch64. I don't know if there are any special
>> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
>> (or build flags), that would allow for ease of migration.
>
> There's no standard hardware mechanism that would do so.
>
> I've been very surprised at how little software has broken on AArch64
> because of memory ordering. Like you, I initially assumed that stuff
> would break all over the place, but by and large it was OK. I know of
> two reasons: firstly, programmers are pretty conservative and tend to
> use simple and reliable mechanisms such as safe publication and
> mutexes for inter-thread communication. But also, and maybe more
> importantly, the kinds of reordering the hardware can do are not very
> different from those compilers do. Therefore, anyone playing fast and
> loose with TSO has probably already been bitten by the compiler.
>
>> But unless all Windows software will run in such a mode there is a
>> need for MS to document what the memory consistency properties of
>> various APIs are (as POSIX does [1]).
>
> Indeed. I would have thought it existed somewhere.
>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com/>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>

From goetz.lindenmaier at sap.com  Wed Jul 15 17:37:55 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 15 Jul 2020 17:37:55 +0000
Subject: [15] RFR: 8248476: No helpful NullPointerException message after
 calling fillInStackTrace
In-Reply-To: <8a8c885f-11be-ce4b-65a1-06332465887a@oracle.com>
References: <AM4PR0202MB29643CE0B82EBDCA0A2A2B2FEC6E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <61209ad8-57a6-192b-5ca5-6fca15b461a6@oracle.com>
 <AM4PR0202MB29649EA69F833E69D0AC3B92EC6A0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <062c1265-1934-ac54-21c9-523ab22f2ced@oracle.com>
 <AM4PR0202MB2964A0A39711380E35196212EC670@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <18ea868c-70b1-2dc1-b830-39a709097668@oracle.com>
 <AM4PR0202MB2964DD165E4F899C5DBA05C8EC600@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <6ae11939-eb50-b517-f45e-72359d0287b9@oracle.com>
 <e3d4159d-346b-617d-ef86-9b3ae103ad0a@oracle.com>
 <VI1PR0202MB2975343DABBB1A19737FFA25EC610@VI1PR0202MB2975.eurprd02.prod.outlook.com>
 <bc8c6c71-b0d9-44cd-2c32-0270156f7ae6@oracle.com>
 <8e17f341-f8fd-a2cd-ca6c-2117bda2b9fb@oracle.com>
 <4d27afc7-d070-6ef9-3d44-8f3b4d35a611@oracle.com>
 <AM4PR0202MB29644E2CE13B6BB91D65BD23EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <8a8c885f-11be-ce4b-65a1-06332465887a@oracle.com>
Message-ID: <AM4PR0202MB29648520C1C80F110DF5D3A8EC7E0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi everybody,

I pushed 06.2. I made no further changes, except for the typo.
I would have preferred a further round of review in case of the
Symbolic names, but with the deadline I thought I better push
the existing webrev.

Thanks again!

Best regards,
  Goetz.

From: Mandy Chung <mandy.chung at oracle.com>
Sent: Wednesday, July 15, 2020 6:05 PM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'coleen.phillimore at oracle.com' <coleen.phillimore at oracle.com>; David Holmes <david.holmes at oracle.com>; 'forax at univ-mlv.fr' <forax at univ-mlv.fr>
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: [15] RFR: 8248476: No helpful NullPointerException message after calling fillInStackTrace


On 7/15/20 2:23 AM, Lindenmaier, Goetz wrote:

Hi everybody,


First of all, thanks for all the feedback!


I updated webrev 06 to 06.2:

http://cr.openjdk.java.net/~goetz/wr20/8248476-NPE_fillInStackTrace-jdk15/06.2/

Looks okay with me.

A typo in the test - should be "expectedMessage"

1307         String excpectedMessage =

Mandy

From luhenry at microsoft.com  Wed Jul 15 18:29:17 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 15 Jul 2020 18:29:17 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
Message-ID: <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi David,

I tried using `__int32` and `int` in place of `long`, but MSVC complains about type differences in the parameters passed to the `Interlocked*` functions.

```
C:\git\jdk\src\hotspot\os_cpu\windows_x86\atomic_windows_x86.hpp(103): error C2665: '_InterlockedCompareExchange': none of the 4 overloads could convert all the argument types
C:\git\jdk\build\devkit\10\include\um\winbase.h(9501): note: could be 'unsigned __int64 _InterlockedCompareExchange(volatile unsigned __int64 *,unsigned __int64,unsigned __int64)'
C:\git\jdk\build\devkit\10\include\um\winbase.h(9488): note: or       'unsigned long _InterlockedCompareExchange(volatile unsigned long *,unsigned long,unsigned long)'
C:\git\jdk\build\devkit\10\include\um\winbase.h(9477): note: or       'unsigned int _InterlockedCompareExchange(volatile unsigned int *,unsigned int,unsigned int)'
[.....]
```

To clarify the use of `long` over `__int32` or `int`, I've instead added a comment (see http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html).

Another complementary solution is to call the `DEFINE_*` macros, not with a numerical constant as the first argument, but with the value returned by the sizeof of the second argument. For example, we can have the following for cmpxchg:

```
DEFINE_STUB_CMPXCHG(sizeof(char),     char,    _InterlockedCompareExchange8) // Use the intrinsic as InterlockedCompareExchange8 does not exist
DEFINE_STUB_CMPXCHG(sizeof(long),     long,    InterlockedCompareExchange)
DEFINE_STUB_CMPXCHG(sizeof(__int64), __int64, InterlockedCompareExchange64)
```

In that case, we can even do away with the first argument altogether, like the following:

```
#define DEFINE_STUB_CMPXCHG(StubName, StubType)                            \
  template<>                                                               \
  template<typename T>                                                     \
  inline T Atomic::PlatformCmpxchg<sizeof(StubType)>::operator()(T volatile* dest, \
                                                         T compare_value,  \
                                                         T exchange_value, \
                                                         atomic_memory_order order) const { \
    STATIC_ASSERT(sizeof(StubType) == sizeof(T));                          \
    return PrimitiveConversions::cast<T>(                                  \
      StubName(reinterpret_cast<StubType volatile *>(dest),                \
               PrimitiveConversions::cast<StubType>(exchange_value),       \
               PrimitiveConversions::cast<StubType>(compare_value)));      \
  }

DEFINE_STUB_CMPXCHG(_InterlockedCompareExchange8, char) // Use the intrinsic as InterlockedCompareExchange8 does not exist
DEFINE_STUB_CMPXCHG(InterlockedCompareExchange,   long)
DEFINE_STUB_CMPXCHG(InterlockedCompareExchange64, __int64)

#undef DEFINE_STUB_CMPXCHG
```

That makes it very clear that the type is for the Interlocked* function, not the source data type (like int32_t/int64_t or jint/jlong).
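
For anyone who wants to try the `sizeof(StubType)` idea away from Windows, here is a rough portable sketch. It rests on several assumptions: the `stub_cmpxchg*` functions are hypothetical stand-ins for the `Interlocked*` calls, built on the GCC/Clang `__atomic_compare_exchange_n` builtin; `bit_cast_sketch` stands in for `PrimitiveConversions::cast`; and `PlatformCmpxchgSketch` / `cmpxchg_sketch` are made-up names, with the memory order argument left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Bit-preserving conversion between same-sized types.
template <typename To, typename From>
inline To bit_cast_sketch(From from) {
  static_assert(sizeof(To) == sizeof(From), "size mismatch");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Shared helper: takes (dest, exchange, comparand) like the Interlocked*
// functions and returns the value previously stored at dest.
template <typename S>
inline S cmpxchg_builtin(S volatile* dest, S exchange, S comparand) {
  // On failure the builtin overwrites comparand with the current value; on
  // success comparand already equals the old value.
  __atomic_compare_exchange_n(dest, &comparand, exchange, false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return comparand;
}

// Hypothetical per-width stubs, one per operand type.
inline std::int8_t stub_cmpxchg8(std::int8_t volatile* d, std::int8_t x, std::int8_t c) {
  return cmpxchg_builtin(d, x, c);
}
inline std::int32_t stub_cmpxchg32(std::int32_t volatile* d, std::int32_t x, std::int32_t c) {
  return cmpxchg_builtin(d, x, c);
}
inline std::int64_t stub_cmpxchg64(std::int64_t volatile* d, std::int64_t x, std::int64_t c) {
  return cmpxchg_builtin(d, x, c);
}

// One specialization per operand size, selected by sizeof.
template <std::size_t ByteSize>
struct PlatformCmpxchgSketch {
  template <typename T>
  T operator()(T volatile* dest, T compare_value, T exchange_value) const;
};

#define DEFINE_STUB_CMPXCHG(StubName, StubType)                              \
  template <>                                                                \
  template <typename T>                                                      \
  inline T PlatformCmpxchgSketch<sizeof(StubType)>::operator()(              \
      T volatile* dest, T compare_value, T exchange_value) const {           \
    static_assert(sizeof(StubType) == sizeof(T), "operand size mismatch");   \
    return bit_cast_sketch<T>(                                               \
        StubName(reinterpret_cast<StubType volatile*>(dest),                 \
                 bit_cast_sketch<StubType>(exchange_value),                  \
                 bit_cast_sketch<StubType>(compare_value)));                 \
  }

DEFINE_STUB_CMPXCHG(stub_cmpxchg8,  std::int8_t)
DEFINE_STUB_CMPXCHG(stub_cmpxchg32, std::int32_t)
DEFINE_STUB_CMPXCHG(stub_cmpxchg64, std::int64_t)

#undef DEFINE_STUB_CMPXCHG

// Front end: picks the specialization from the size of the caller's type.
template <typename T>
inline T cmpxchg_sketch(T volatile* dest, T compare_value, T exchange_value) {
  return PlatformCmpxchgSketch<sizeof(T)>()(dest, compare_value, exchange_value);
}
```

The stub type appears only inside the macro invocation, so the caller's type (for example `jint`) and the API's type (`long` on Windows) never have to be the same, only the same size.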

I uploaded updated webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/ and http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.03/, with the former not containing this `sizeof(StubType)` change, and the latter containing it.

Thank you!

________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, July 15, 2020 06:32
To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

Hi Ludovic,

On 15/07/2020 11:15 pm, Ludovic Henry wrote:
> Hi David,
>
> Thanks for your feedback.
>
>> can we use __int32 for clarity rather than "long"?
>
> The Win32 API explicitly uses `long`, and I made sure for these `DEFINE_` macros to use the type used in the declaration of the API. If you are ok with the difference, I'm happy to change that to __int32.

I prefer to see __int32 in our code as "long" can be quite ambiguous
depending on the reader (and something we are trying to eradicate from
shared code - not everyone is aware of the LLP64 programming model versus
LP64).

Thanks,
David
-----

> I've uploaded the new webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.01
>
> (I've also moved the previous webrevs to their respective webrev.00 folders).
>
> Thank you,
>
> --
> Ludovic
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Sunday, July 12, 2020 19:43
> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
> Cc: openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>
> Hi Ludovic,
>
> On 9/07/2020 11:55 pm, Ludovic Henry wrote:
>> Hello,
>>
>> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
>>
>> The three areas I have found are:
>> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
>> - Exception handling: there is some code repetition which, even if functional, is subpar.
>> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
>>
>> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.
>
> Thanks for doing the split!
>
> As a general comment can you please ensure that the Oracle copyright
> second year is updated to 2020. Thanks.
>
> Overall these cleanups look good. Thanks for providing them.
>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
>> Webrevs:
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/
>
> Love this cleanup! Great to see all the stubroutines go for x86.
>
> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>
> Please delete this entire (archaic) comment block.
>
>    42 // The following alternative implementations are needed because
>    43 // Windows 95 doesn't support (some of) the corresponding Windows NT
>    44 // calls. Furthermore, these versions allow inlining in the caller.
>    45 // (More precisely: The documentation for InterlockedExchange says
>    46 // it is supported for Windows 95. However, when single-stepping
>    47 // through the assembly code we cannot step into the routine and
>    48 // when looking at the routine address we see only garbage code.
>    49 // Better safe then sorry!). Was bug 7/31/98 (gri).
>    50 //
>    51 // Performance note: On uniprocessors, the 'lock' prefixes are not
>    52 // necessary (and expensive). We should generate separate cases if
>    53 // this becomes a performance problem.
>
> In this (and elsewhere):
>
>    80 DEFINE_STUB_ADD(4, long,    InterlockedAdd)
>    81 DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)
>
> can we use __int32 for clarity rather than "long"?
>
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>
> Looks good!
>
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
>
> Looks good!
>
> Thanks,
> David
> -----
>
>> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
>>
>> Thank you,
>>
>> --
>> Ludovic
>>

From patricio.chilano.mateo at oracle.com  Wed Jul 15 20:50:20 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Wed, 15 Jul 2020 17:50:20 -0300
Subject: [15] RFR(S): 8246676: monitor list lock operations need more
 fencing and 8247280
In-Reply-To: <ffbedf68-4795-805e-bbf7-bf136c289a8d@oracle.com>
References: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>
 <4d874d07-2c43-ebc4-7d2a-a8fa15e914b1@oracle.com>
 <ffbedf68-4795-805e-bbf7-bf136c289a8d@oracle.com>
Message-ID: <556250c1-6175-174b-56af-eee90eedf9b4@oracle.com>

Hi Dan,

Changes look good to me. Thanks for the fixes.

Patricio
On 7/15/20 11:23 AM, Daniel D. Daugherty wrote:
> On 7/15/20 2:46 AM, David Holmes wrote:
>> Hi Dan,
>>
>> On 15/07/2020 12:30 pm, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> These fixes are targeted for JDK15 and I would like to push both of 
>>> these
>>> fixes before the RDP2 cutoff on Thursday.
>>>
>>> I have a JDK15 fix ready for a couple of related ObjectMonitor bug 
>>> fixes:
>>>
>>>      JDK-8246676 monitor list lock operations need more fencing
>>>      https://bugs.openjdk.java.net/browse/JDK-8246676
>>>
>>>      JDK-8247280 more fencing needed in async deflation for non-TSO machines
>>>      https://bugs.openjdk.java.net/browse/JDK-8247280
>>>
>>> The fix for JDK-8246676 has been through three rounds of preliminary
>>> code review with David H.; Erik O. and Robbin participated in the first
>>> preliminary code review round. Mostly comment changes or backouts of
>>> code changes to return to the baseline code were done in the second and
>>> third preliminary rounds.
>>>
>>> The fix for JDK-8247280 has been through two rounds of preliminary
>>> code review with David H. The bug fix itself was suggested by Erik 
>>> O. so
>>> he's likely on-board with my implementation of the fix. :-)
>>>
>>> Many thanks to David H., Erik O. and Robbin for their many emails on
>>> these topics and for reviewing the preliminary webrevs.
>>>
>>> Here are the two webrevs:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8247280-webrev/0-for-jdk15/
>>
>> My only follow up here is the proper fix for using Atomics with 
>> enums. The patch is below and I've tested it with tiers 1-3 (link 
>> sent separately).
>
> Thanks! I appreciate the help with the metaprogramming stuff...
>
>
>>
>>> http://cr.openjdk.java.net/~dcubed/8246676-webrev/0-for-jdk15/
>>
>> Also looks good.
>
> Thanks!
>
>
>>
>>> The project is currently baselined on jdk-15+30 and has gone through
>>> Mach5 Tier[1-3],4,5,6,7,8 testing with no regressions. I've also run
>>> my inflation stress kit on Linux-X64 and macOSX without any 
>>> regressions.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>
>> Thanks for working through all the details on this.
>
> Thank you for the many emails, discussions and reviews. This memory
> consistency stuff is hard to get right.
>
>
>>
>> David
>> -----
>>
>> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.hpp
>> --- a/src/hotspot/share/runtime/objectMonitor.hpp
>> +++ b/src/hotspot/share/runtime/objectMonitor.hpp
>> @@ -27,6 +27,7 @@
>>
>>  #include "memory/allocation.hpp"
>>  #include "memory/padded.hpp"
>> +#include "metaprogramming/isRegisteredEnum.hpp"
>>  #include "oops/markWord.hpp"
>>  #include "runtime/os.hpp"
>>  #include "runtime/park.hpp"
>> @@ -372,4 +373,7 @@
>>     void      install_displaced_markword_in_object(const oop obj);
>>  };
>>
>> +// Register for atomic operations.
>> +template<> struct IsRegisteredEnum<ObjectMonitor::AllocationState> : public TrueType {};
>> +
>>  #endif // SHARE_RUNTIME_OBJECTMONITOR_HPP
>> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.inline.hpp
>> --- a/src/hotspot/share/runtime/objectMonitor.inline.hpp
>> +++ b/src/hotspot/share/runtime/objectMonitor.inline.hpp
>> @@ -196,7 +196,7 @@
>>  }
>>
>>  inline void ObjectMonitor::release_set_allocation_state(ObjectMonitor::AllocationState s) {
>> -  Atomic::release_store((int*)&_allocation_state, (int)s);
>> +  Atomic::release_store(&_allocation_state, s);
>>  }
>>
>>  inline void ObjectMonitor::set_allocation_state(ObjectMonitor::AllocationState s) {
>> @@ -208,7 +208,7 @@
>>  }
>>
>>  inline ObjectMonitor::AllocationState ObjectMonitor::allocation_state_acquire() const {
>> -  return (AllocationState)Atomic::load_acquire((int*)&_allocation_state);
>> +  return Atomic::load_acquire(&_allocation_state);
>>  }
>>
>>  inline bool ObjectMonitor::is_free() const {
>
> That ended up being much simpler than I imagined late last night... :-)
>
> Thanks for coding this up and for running it through a Mach5 Tier[1-3].
> I very much appreciate it.
>
> Dan
>


From daniel.daugherty at oracle.com  Wed Jul 15 20:55:41 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 15 Jul 2020 16:55:41 -0400
Subject: [15] RFR(S): 8246676: monitor list lock operations need more
 fencing and 8247280
In-Reply-To: <556250c1-6175-174b-56af-eee90eedf9b4@oracle.com>
References: <096dfe66-cc4c-1d83-e876-914937d2f87e@oracle.com>
 <4d874d07-2c43-ebc4-7d2a-a8fa15e914b1@oracle.com>
 <ffbedf68-4795-805e-bbf7-bf136c289a8d@oracle.com>
 <556250c1-6175-174b-56af-eee90eedf9b4@oracle.com>
Message-ID: <fa9d3d32-af6d-68ce-59b0-b1cf1bb9c508@oracle.com>

Patricio,

Thanks for the review!

Dan


On 7/15/20 4:50 PM, Patricio Chilano wrote:
> Hi Dan,
>
> Changes look good to me. Thanks for the fixes.
>
> Patricio
> On 7/15/20 11:23 AM, Daniel D. Daugherty wrote:
>> On 7/15/20 2:46 AM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 15/07/2020 12:30 pm, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> These fixes are targeted for JDK15 and I would like to push both of 
>>>> these
>>>> fixes before the RDP2 cutoff on Thursday.
>>>>
>>>> I have a JDK15 fix ready for a couple of related ObjectMonitor bug 
>>>> fixes:
>>>>
>>>>      JDK-8246676 monitor list lock operations need more fencing
>>>>      https://bugs.openjdk.java.net/browse/JDK-8246676
>>>>
>>>>      JDK-8247280 more fencing needed in async deflation for non-TSO machines
>>>>      https://bugs.openjdk.java.net/browse/JDK-8247280
>>>>
>>>> The fix for JDK-8246676 has been through three rounds of preliminary
>>>> code review with David H.; Erik O. and Robbin participated in the 
>>>> first
>>>> preliminary code review round. Mostly comment changes or backouts of
>>>> code changes to return to the baseline code were done in the second 
>>>> and
>>>> third preliminary rounds.
>>>>
>>>> The fix for JDK-8247280 has been through two rounds of preliminary
>>>> code review with David H. The bug fix itself was suggested by Erik 
>>>> O. so
>>>> he's likely on-board with my implementation of the fix. :-)
>>>>
>>>> Many thanks to David H., Erik O. and Robbin for their many emails on
>>>> these topics and for reviewing the preliminary webrevs.
>>>>
>>>> Here are the two webrevs:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8247280-webrev/0-for-jdk15/
>>>
>>> My only follow up here is the proper fix for using Atomics with 
>>> enums. The patch is below and I've tested it with tiers 1-3 (link 
>>> sent separately).
>>
>> Thanks! I appreciate the help with the metaprogramming stuff...
>>
>>
>>>
>>>> http://cr.openjdk.java.net/~dcubed/8246676-webrev/0-for-jdk15/
>>>
>>> Also looks good.
>>
>> Thanks!
>>
>>
>>>
>>>> The project is currently baselined on jdk-15+30 and has gone through
>>>> Mach5 Tier[1-3],4,5,6,7,8 testing with no regressions. I've also run
>>>> my inflation stress kit on Linux-X64 and macOSX without any 
>>>> regressions.
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>>
>>> Thanks for working through all the details on this.
>>
>> Thank you for the many emails, discussions and reviews. This memory
>> consistency stuff is hard to get right.
>>
>>
>>>
>>> David
>>> -----
>>>
>>> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.hpp
>>> --- a/src/hotspot/share/runtime/objectMonitor.hpp
>>> +++ b/src/hotspot/share/runtime/objectMonitor.hpp
>>> @@ -27,6 +27,7 @@
>>>
>>>  #include "memory/allocation.hpp"
>>>  #include "memory/padded.hpp"
>>> +#include "metaprogramming/isRegisteredEnum.hpp"
>>>  #include "oops/markWord.hpp"
>>>  #include "runtime/os.hpp"
>>>  #include "runtime/park.hpp"
>>> @@ -372,4 +373,7 @@
>>>     void      install_displaced_markword_in_object(const oop obj);
>>>  };
>>>
>>> +// Register for atomic operations.
>>> +template<> struct IsRegisteredEnum<ObjectMonitor::AllocationState> : public TrueType {};
>>> +
>>>  #endif // SHARE_RUNTIME_OBJECTMONITOR_HPP
>>> diff -r 3bef86e53c51 src/hotspot/share/runtime/objectMonitor.inline.hpp
>>> --- a/src/hotspot/share/runtime/objectMonitor.inline.hpp
>>> +++ b/src/hotspot/share/runtime/objectMonitor.inline.hpp
>>> @@ -196,7 +196,7 @@
>>>  }
>>>
>>>  inline void ObjectMonitor::release_set_allocation_state(ObjectMonitor::AllocationState s) {
>>> -  Atomic::release_store((int*)&_allocation_state, (int)s);
>>> +  Atomic::release_store(&_allocation_state, s);
>>>  }
>>>
>>>  inline void ObjectMonitor::set_allocation_state(ObjectMonitor::AllocationState s) {
>>> @@ -208,7 +208,7 @@
>>>  }
>>>
>>>  inline ObjectMonitor::AllocationState ObjectMonitor::allocation_state_acquire() const {
>>> -  return (AllocationState)Atomic::load_acquire((int*)&_allocation_state);
>>> +  return Atomic::load_acquire(&_allocation_state);
>>>  }
>>>
>>>  inline bool ObjectMonitor::is_free() const {
>>
>> That ended up being much simpler than I imagined late last night... :-)
>>
>> Thanks for coding this up and for running it through a Mach5 Tier[1-3].
>> I very much appreciate it.
>>
>> Dan
>>
>


From jamsheed.c.m at oracle.com  Wed Jul 15 22:16:11 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 03:46:11 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
Message-ID: <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>

(Thank you Dean, adding the serviceability team as this issue involves the 
JVMTI PopFrame and EarlyReturn features)

JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381

(testing: mach5, tier1-5 links in JBS)

Best regards,

Jamsheed

On 15/07/2020 21:25, Jamsheed C M wrote:
>
> Hi,
>
> Async handling at method entry requires it to be aware of 
> synchronization(like whether it is doing async handling before lock 
> acquire or after)
>
> This is required as exception handler rely on this info for 
> unlocking. Async handling code never had this special condition
> handled and it worked most of the time as we were using biased locking 
> which got disabled by [1]
>
> There was one other issue reported in similar time[2]. This issue got 
> triggered in test case by [3], back to back extra safepoint after 
> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
> request and Thread.Stop request happened together for the test 
> scenario and it reached java method entry with pending_exception set.
>
> I have done a partial fix for the issue, mainly to handle production 
> mode crash failures(do not unlock flag related ones)
>
> Fix detail:
>
> 1) I save restore the "do not unlock" flag in async handling.
>
> 2) Return for floating pending exception for some cases(PopFrame, 
> Early return related). This is debug(JVMTI) feature and floating 
> exception can get cleaned just like that in present compiler request 
> and deopt code.
>
> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>
> There are more problems in these code areas, like we clear all 
> exceptions in compilation request path(interpreter,c1), as well as 
> deoptimization path.
>
> All these un-handled cases will be separately handled by 
> https://bugs.openjdk.java.net/browse/JDK-8249451
>
> Request for review.
>
> Best regards,
>
> Jamsheed
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>
> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>
> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>

From david.holmes at oracle.com  Wed Jul 15 23:50:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Jul 2020 09:50:35 +1000
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
Message-ID: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>

Hi Jamsheed,

On 16/07/2020 8:16 am, Jamsheed C M wrote:
> (Thank you Dean, adding serviceability team as this issue involves JVMTI 
> features PopFrame, EarlyReturn features)

It is not at all obvious how your proposed fix impacts the JVM TI features.

> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
> 
> (testing: mach5, tier1-5 links in JBS)
> 
> Best regards,
> 
> Jamsheed
> 
> On 15/07/2020 21:25, Jamsheed C M wrote:
>>
>> Hi,
>>
>> Async handling at method entry requires it to be aware of 
>> synchronization(like whether it is doing async handling before lock 
>> acquire or after)
>>
>> This is required as exception handler rely on this info for 
>> unlocking. Async handling code never had this special condition
>> handled and it worked most of the time as we were using biased locking 
>> which got disabled by [1]
>>
>> There was one other issue reported in similar time[2]. This issue got 
>> triggered in test case by [3], back to back extra safepoint after 
>> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
>> request and Thread.Stop request happened together for the test 
>> scenario and it reached java method entry with pending_exception set.
>>
>> I have done a partial fix for the issue, mainly to handle production 
>> mode crash failures(do not unlock flag related ones)
>>
>> Fix detail:
>>
>> 1) I save restore the "do not unlock" flag in async handling.

Sorry but you completely changed the fix compared to what we discussed 
and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
determine that this save/restore of the "do not unlock flag" is actually 
correct and valid!

>>
>> 2) Return for floating pending exception for some cases(PopFrame, 
>> Early return related). This is debug(JVMTI) feature and floating 
>> exception can get cleaned just like that in present compiler request 
>> and deopt code.

What part of the change addresses this?

Thanks,
David
-----

>>
>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>
>> There are more problems in these code areas, like we clear all 
>> exceptions in compilation request path(interpreter,c1), as well as 
>> deoptimization path.
>>
>> All these un-handled cases will be separately handled by 
>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>
>> Request for review.
>>
>> Best regards,
>>
>> Jamsheed
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>
>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>
>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>

From jamsheed.c.m at oracle.com  Thu Jul 16 00:01:21 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 05:31:21 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <973c4e4c-ed0e-7152-8387-28243a3ac275@oracle.com>

Hi David,

On 16/07/2020 05:20, David Holmes wrote:
> Hi Jamsheed,
>
> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>> (Thank you Dean, adding serviceability team as this issue involves 
>> JVMTI features PopFrame, EarlyReturn features)
>
> It is not at all obvious how your proposed fix impacts the JVM TI 
> features.

Yes, the proposed fix doesn't. The fix doesn't attempt to address the JVMTI 
feature related issues.

I added the serviceability team just to keep everyone in the loop.

Best regards,

Jamsheed
>
>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>
>> (testing: mach5, tier1-5 links in JBS)
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking. Async handling code never had this special condition
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue 
>>> got triggered in test case by [3], back to back extra safepoint 
>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>> PopFrame request and Thread.Stop request happened together for the 
>>> test scenario and it reached java method entry with 
>>> pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
>
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is 
> actually correct and valid!
>
>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
>
> What part of the change addresses this?
>
> Thanks,
> David
> -----
>
>>>
>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>
>>> There are more problems in these code areas, like we clear all 
>>> exceptions in compilation request path(interpreter,c1), as well as 
>>> deoptimization path.
>>>
>>> All these un-handled cases will be separately handled by 
>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>
>>> Request for review.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>

From jamsheed.c.m at oracle.com  Thu Jul 16 00:37:25 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 06:07:25 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <122f8079-958c-acdf-bb60-3934729a313a@oracle.com>

Hi David,

On 16/07/2020 05:20, David Holmes wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking. Async handling code never had this special condition
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue 
>>> got triggered in test case by [3], back to back extra safepoint 
>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>> PopFrame request and Thread.Stop request happened together for the 
>>> test scenario and it reached java method entry with 
>>> pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
>
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is 
> actually correct and valid!

I tried changing JRT_ENTRY to JRT_ENTRY_NOASYNC, but unfortunately that 
made some tests fail (logs in JBS). I didn't investigate it in detail, but 
what I presume is that pending_async_exception is set in those failing 
scenarios and, as we have disabled async handling in some prominent code 
paths, the exception is never delivered.

>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
>
> What part of the change addresses this?

It doesn't address this issue completely, as that requires other changes 
in the compilation request path (c1, interpreter) and deopt.

I just made changes to the interpreter part (the compilation request 
path), which fixes the interpreter part partially.

  JRT_ENTRY(nmethod*,
            InterpreterRuntime::frequency_counter_overflow_inner(JavaThread* thread, address branch_bcp))
+   if (HAS_PENDING_EXCEPTION) {
+     return NULL;
+   }

  JRT_ENTRY(void, InterpreterRuntime::profile_method(JavaThread* thread))
+   if (HAS_PENDING_EXCEPTION) {
+     return;
+   }

Best regards

Jamsheed

>
> Thanks,
> David
> ----- 

From david.holmes at oracle.com  Thu Jul 16 01:07:33 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Jul 2020 11:07:33 +1000
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>

Hi Jamsheed,

tl;dr version: fix looks good. Thanks for working through things with me 
on this one.

Long version ... for the sake of other reviewers (and myself) I'm going 
to walk through the problem scenario and how the fix addresses it, 
because the bug report is long and confusing and touches on a number of 
different issues with async exception handling.

We are dealing with the code generated for Java method entry, and in 
particular for a synchronized Java method. We do a lot of things in the 
entry code before we actually lock the monitor and jump to the Java 
method. Some of those things include method profiling and the counter 
overflow check for the JIT. If an exception is thrown at this point, the 
logic to remove the activation would unlock the monitor - which we 
haven't actually locked yet! So we have the 
do_not_unlock_if_synchronized flag which is stored in the current 
JavaThread. We set that flag true so that if any exceptions result in 
activation removal, the removal logic won't try to unlock the monitor. 
Once we're ready to lock the monitor we set the flag back to false (note 
there is an implicit assumption here that monitor locking can never 
raise an exception).

The problem arises with async exceptions, or more specifically the async 
exception that is raised due to an "unsafe access error". This is where 
a memory-mapped ByteBuffer causes an access violation (SEGV) due to a 
bad pointer. The signal handler simply sets a flag to indicate we 
encountered an "unsafe access error", adjusts the BCI to the next 
instruction and allows execution to proceed at the next instruction. It 
is then expected that the runtime will "soon" notice this pending unsafe 
access error and create and throw the InternalError instance that 
indicates the ByteBuffer operation failed. This requires executing Java 
code.

One of the places that checks for that pending unsafe access error is in 
the destructor of the JRT_ENTRY wrapper that is used for the method 
profiling and counter overflow checking. This occurs whilst the 
do_not_unlock_if_synchronized flag is true, so the resulting 
InternalError won't result in an attempt to unlock the not-locked monitor.

The problem is that creating the InternalError executes Java code - it 
calls constructors, which call methods etc. And some of those methods 
are synchronized. So the method entry logic for such a call will set 
do_not_unlock_if_synchronized to true, perform all the preamble related 
to the call, then set do_not_unlock_if_synchronized to false, lock the 
monitor and make the call. When construction completes the InternalError 
is thrown and we remove the activation for the method we had originally 
started to call. But now the do_not_unlock_if_synchronized flag has been 
reset to false by the nested Java method call, so we do in fact try to 
unlock a monitor that was never locked, and things break.

This nesting problem is well known and we have a mechanism for dealing 
with it - the UnlockFlagSaver. The actual logic executed for profiling 
methods and doing the counter overflow check contains the requisite 
UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
async exception is processed in the JRT_ENTRY wrapper, which is outside 
the scope of those UnlockFlagSaver helpers and so they don't help in 
this case.

So the fix is to "simply" move the UnlockFlagSaver deeper into the call 
stack to the code that actually does the async exception processing:

  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
+   // May be we are at method entry and requires to save do not unlock flag.
+   UnlockFlagSaver fs(this);

so now after the InternalError has been created and thrown we will 
restore the original value of the do_not_unlock_if_synchronized flag 
(true) and so the InternalError will not cause activation removal to 
attempt to unlock the not-locked monitor.

The scope of the UnlockFlagSaver could be narrowed to the actual logic 
for processing the unsafe access error, but it seems fine at method scope.
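
To make the nesting problem and the effect of the fix concrete, here is a 
minimal standalone sketch. It is not HotSpot code: ToyThread, 
ToyUnlockFlagSaver and the toy_* functions are made-up names that only 
model the flag handling described above, under the assumption that the 
saver simply records the flag on construction and restores it on 
destruction.

```cpp
#include <cassert>

// Toy stand-in for the per-thread state; not the real JavaThread.
struct ToyThread {
  bool do_not_unlock_if_synchronized = false;
};

// RAII helper modelled on the UnlockFlagSaver idea (hypothetical):
// remember the flag on construction, put it back on destruction.
class ToyUnlockFlagSaver {
  ToyThread* _thread;
  bool _saved;
 public:
  explicit ToyUnlockFlagSaver(ToyThread* t)
    : _thread(t), _saved(t->do_not_unlock_if_synchronized) {}
  ~ToyUnlockFlagSaver() { _thread->do_not_unlock_if_synchronized = _saved; }
  ToyUnlockFlagSaver(const ToyUnlockFlagSaver&) = delete;
  ToyUnlockFlagSaver& operator=(const ToyUnlockFlagSaver&) = delete;
};

// Entry sequence of a nested synchronized method: the flag is true while
// the preamble runs and is cleared just before the monitor is locked.
void toy_nested_synchronized_entry(ToyThread* t) {
  t->do_not_unlock_if_synchronized = true;   // preamble, monitor not locked
  t->do_not_unlock_if_synchronized = false;  // about to lock the monitor
}

// Models async exception processing: constructing the exception runs a
// nested synchronized method. Returns the flag value that the outer
// activation removal would observe afterwards.
bool toy_handle_async_exception(ToyThread* t, bool use_saver) {
  if (use_saver) {
    ToyUnlockFlagSaver fs(t);          // restored when this block exits
    toy_nested_synchronized_entry(t);
  } else {
    toy_nested_synchronized_entry(t);  // flag is left clobbered
  }
  return t->do_not_unlock_if_synchronized;
}
```

With the outer method's flag set to true beforehand, 
toy_handle_async_exception(&t, false) returns false (the clobbered value 
that leads to the bogus unlock), while toy_handle_async_exception(&t, true) 
returns true, which is what the outer activation removal needs to see.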

A second fix is that the overflow counter check had an assertion that it 
was not executed with any pending exceptions. That assumption turned out 
to be false, for reasons I can't fully explain, but it again appears to 
relate to a pending async exception being installed prior to the method 
call, and seems related to the two referenced JVM TI functions. The simple 
solution here is to delete the assertion, check for pending exceptions on 
entry to the code and just return immediately. The JRT_ENTRY destructor 
will see the pending exception and propagate it.
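
The shape of that second change can be sketched the same way. Again this 
is only an illustration under stated assumptions: SketchThread, 
SketchEntryWrapper and sketch_counter_overflow are hypothetical names, and 
the wrapper's destructor merely records that it saw the pending exception 
rather than doing any real propagation.

```cpp
#include <cassert>

// Toy thread state: a pending exception is modelled as a non-null name.
struct SketchThread {
  const char* pending_exception = nullptr;
  bool propagated = false;
};

// Stand-in for the entry wrapper: its destructor notices a pending
// exception on the way out and "propagates" it (here: sets a marker).
class SketchEntryWrapper {
  SketchThread* _thread;
 public:
  explicit SketchEntryWrapper(SketchThread* t) : _thread(t) {}
  ~SketchEntryWrapper() {
    if (_thread->pending_exception != nullptr) {
      _thread->propagated = true;
    }
  }
};

// Instead of asserting that no exception is pending, check on entry and
// return immediately. Returns true only if the overflow work was done.
bool sketch_counter_overflow(SketchThread* t) {
  SketchEntryWrapper wrapper(t);
  if (t->pending_exception != nullptr) {
    return false;  // skip the work; the wrapper handles the exception
  }
  // ... the counter overflow handling would go here ...
  return true;
}
```

A thread with no pending exception does the work and nothing is 
propagated; a thread that arrives with one skips the work and leaves the 
exception to the wrapper.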

Cheers,
David

On 16/07/2020 9:50 am, David Holmes wrote:
> Hi Jamsheed,
> 
> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>> (Thank you Dean, adding serviceability team as this issue involves 
>> JVMTI features PopFrame, EarlyReturn features)
> 
> It is not at all obvious how your proposed fix impacts the JVM TI features.
> 
>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>
>> (testing: mach5, tier1-5 links in JBS)
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking. Async handling code never had this special condition
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue got 
>>> triggered in test case by [3], back to back extra safepoint after 
>>> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
>>> request and Thread.Stop request happened together for the test 
>>> scenario and it reached java method entry with pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
> 
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is actually 
> correct and valid!
> 
>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
> 
> What part of the change addresses this?
> 
> Thanks,
> David
> -----
> 
>>>
>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>
>>> There are more problems in these code areas, like we clear all 
>>> exceptions in compilation request path(interpreter,c1), as well as 
>>> deoptimization path.
>>>
>>> All these un-handled cases will be separately handled by 
>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>
>>> Request for review.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>

From jamsheed.c.m at oracle.com  Thu Jul 16 02:03:31 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 07:33:31 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
Message-ID: <e547e782-dfba-7984-75ee-1df9e2e80fd9@oracle.com>

Hi David,

On 16/07/2020 06:37, David Holmes wrote:
> Hi Jamsheed,
>
> tl;dr version: fix looks good. Thanks for working through things with 
> me on this one.
>
> Long version ... for the sake of other reviewers (and myself) I'm 
> going to walk through the problem scenario and how the fix addresses 
> it, because the bug report is long and confusing and touches on a 
> number of different issues with async exception handling.
>
> We are dealing with the code generated for Java method entry, and in 
> particular for a synchronized Java method. We do a lot of things in 
> the entry code before we actually lock the monitor and jump to the 
> Java method. Some of those things include method profiling and the 
> counter overflow check for the JIT. If an exception is thrown at this 
> point, the logic to remove the activation would unlock the monitor - 
> which we haven't actually locked yet! So we have the 
> do_not_unlock_if_synchronized flag which is stored in the current 
> JavaThread. We set that flag true so that if any exceptions result in 
> activation removal, the removal logic won't try to unlock the monitor. 
> Once we're ready to lock the monitor we set the flag back to false 
> (note there is an implicit assumption here that monitor locking can 
> never raise an exception).
>
> The problem arises with async exceptions, or more specifically the 
> async exception that is raised due to an "unsafe access error". This 
> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
> due to a bad pointer. The signal handler simply sets a flag to 
> indicate we encountered an "unsafe access error", adjusts the BCI to 
> the next instruction and allows execution to proceed at the next 
> instruction. It is then expected that the runtime will "soon" notice 
> this pending unsafe access error and create and throw the 
> InternalError instance that indicates the ByteBuffer operation failed. 
> This requires executing Java code.
>
> One of the places that checks for that pending unsafe access error is 
> in the destructor of the JRT_ENTRY wrapper that is used for the method 
> profiling and counter overflow checking. This occurs whilst the 
> do_not_unlock_if_synchronized flag is true, so the resulting 
> InternalError won't result in an attempt to unlock the not-locked 
> monitor.
>
> The problem is that creating the InternalError executes Java code - it 
> calls constructors, which call methods etc. And some of those methods 
> are synchronized. So the method entry logic for such a call will set 
> do_not_unlock_if_synchronized to true, perform all the preamble 
> related to the call, then set do_not_unlock_if_synchronized to false, 
> lock the monitor and make the call. When construction completes the 
> InternalError is thrown and we remove the activation for the method we 
> had originally started to call. But now the 
> do_not_unlock_if_synchronized flag has been reset to false by the 
> nested Java method call, so we do in fact try to unlock a monitor that 
> was never locked, and things break.
>
> This nesting problem is well known and we have a mechanism for dealing 
> with it - the UnlockFlagSaver. The actual logic executed for profiling 
> methods and doing the counter overflow check contains the requisite 
> UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
> async exception is processed in the JRT_ENTRY wrapper, which is 
> outside the scope of those UnlockFlagSaver helpers and so they don't 
> help in this case.
>
> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
> call stack to the code that actually does the async exception processing:
>
>   void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
> +   // May be we are at method entry and requires to save do not unlock flag.
> +   UnlockFlagSaver fs(this);
>
> so now after the InternalError has been created and thrown we will 
> restore the original value of the do_not_unlock_if_synchronized flag 
> (true) and so the InternalError will not cause activation removal to 
> attempt to unlock the not-locked monitor.
>
> The scope of the UnlockFlagSaver could be narrowed to the actual logic 
> for processing the unsafe access error, but it seems fine at method 
> scope.
>
> A second fix is that the overflow counter check had an assertion that 
> it was not executed with any pending exceptions. But that turned out 
> to be false for reasons I can't fully explain, but it again appears to 
> relate to a pending async exception being installed prior to the 
> method call - and seems related to the two referenced JVM TI 
> functions. The simple solution here is to delete the assertion and to 
> check for pending exceptions on entry to the code and just return 
> immediately. The JRT_ENTRY destructor will see the pending exception 
> and propagate it.

Thanks a lot for the opportunity, for all the help, and for putting a 
detailed description of the problem here.

Best regards,

Jamsheed

>
> Cheers,
> David
>
> On 16/07/2020 9:50 am, David Holmes wrote:
>> Hi Jamsheed,
>>
>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>> (Thank you Dean, adding serviceability team as this issue involves 
>>> JVMTI features PopFrame, EarlyReturn features)
>>
>> It is not at all obvious how your proposed fix impacts the JVM TI 
>> features.
>>
>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>
>>> (testing: mach5, tier1-5 links in JBS)
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>
>>>> Hi,
>>>>
>>>> Async handling at method entry requires it to be aware of 
>>>> synchronization(like whether it is doing async handling before lock 
>>>> acquire or after)
>>>>
>>>> This is required as the exception handlers rely on this info for 
>>>> unlocking. Async handling code never had this special condition 
>>>> handled, and it worked most of the time as we were using biased 
>>>> locking, which got disabled by [1].
>>>>
>>>> There was one other issue reported in similar time[2]. This issue 
>>>> got triggered in test case by [3], back to back extra safepoint 
>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>> PopFrame request and Thread.Stop request happened together for the 
>>>> test scenario and it reached java method entry with 
>>>> pending_exception set.
>>>>
>>>> I have done a partial fix for the issue, mainly to handle 
>>>> production mode crash failures(do not unlock flag related ones)
>>>>
>>>> Fix detail:
>>>>
>>>> 1) I save restore the "do not unlock" flag in async handling.
>>
>> Sorry but you completely changed the fix compared to what we 
>> discussed and what I pre-reviewed! What happened to changing from 
>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>> and effort to determine that this save/restore of the "do not unlock 
>> flag" is actually correct and valid!
>>
>>>>
>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>> exception can get cleaned just like that in present compiler 
>>>> request and deopt code.
>>
>> What part of the change addresses this?
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>> webrev: http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>
>>>> There are more problems in these code areas, like we clear all 
>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>> deoptimization path.
>>>>
>>>> All these un-handled cases will be separately handled by 
>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>
>>>> Request for review.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>

From david.holmes at oracle.com  Thu Jul 16 02:30:30 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Jul 2020 12:30:30 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>

Hi Ludovic,

On 16/07/2020 4:29 am, Ludovic Henry wrote:
> Hi David,
> 
> I gave a try to using `__int32` and `int` in place of `long`, but MSVC complains of type differences in the parameters passed to the `Interlocked*` functions.
> 
> ```
> C:\git\jdk\src\hotspot\os_cpu\windows_x86\atomic_windows_x86.hpp(103): error C2665: '_InterlockedCompareExchange': none of the 4
> overloads could convert all the argument types
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9501): note: could be 'unsigned __int64 _InterlockedCompareExchange(volatile unsigned __int64 *,unsigned __int64,unsigned __int64)'
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9488): note: or       'unsigned long _InterlockedCompareExchange(volatile unsigned long *,unsigned long,unsigned long)'
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9477): note: or       'unsigned int _InterlockedCompareExchange(volatile unsigned int *,unsigned int,unsigned int)'
> [.....]
> ```

That's a shame - and somewhat surprising to me. But so be it.
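
For anyone else puzzled by that error, here is a minimal sketch of the 
likely cause, under the assumption that it comes down to overload 
resolution: `int` and `long` are distinct types in C++ even where they 
have the same size, so a `volatile int*` will not convert to a 
`volatile long*`. The `compare_exchange_stub` functions are hypothetical, 
non-atomic stand-ins and not the Win32 declarations.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for an overloaded Interlocked-style API.
// They are plain, non-atomic functions used only to show overload selection.
inline long compare_exchange_stub(volatile long* dest, long exchange, long comparand) {
  long old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}

inline long long compare_exchange_stub(volatile long long* dest, long long exchange,
                                       long long comparand) {
  long long old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}

// int and long are different types regardless of their sizes on a platform.
static_assert(!std::is_same<int, long>::value, "int and long are distinct types");

inline bool demo_long_overload() {
  volatile long value = 1;
  long old = compare_exchange_stub(&value, 2L, 1L);  // picks the 'long' overload
  assert(old == 1);
  // volatile int other = 1;
  // compare_exchange_stub(&other, 2, 1);  // would not compile: no matching overload
  return value == 2;
}
```

Calling `demo_long_overload()` returns true; uncommenting the `int` 
lines gives a "no matching overload" style error. Note that `long` is 
32 bits under LLP64 (64-bit Windows) but 64 bits under LP64, which is 
why the spelling of the type matters to readers.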

> To clarify the use of `long` over `__int32` or `int`, I've instead added a comment (see http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html).
> 
> Another complementary solution is to call the `DEFINE_*` macros, not with a numerical constant as the first argument, but with the value returned by the sizeof of the second argument. For example, we can have the following for cmpxchg:
> 
> ```
> DEFINE_STUB_CMPXCHG(sizeof(char),     char,    _InterlockedCompareExchange8) // Use the intrinsic as InterlockedCompareExchange8 does not exist
> DEFINE_STUB_CMPXCHG(sizeof(long),     long,    InterlockedCompareExchange)
> DEFINE_STUB_CMPXCHG(sizeof(__int64), __int64, InterlockedCompareExchange64)
> ```
> 
> In that case, we can even do away with the first argument altogether, like the following:
> 
> ```
> #define DEFINE_STUB_CMPXCHG(StubName, StubType)                            \
>    template<>                                                               \
>    template<typename T>                                                     \
>    inline T Atomic::PlatformCmpxchg<sizeof(StubType)>::operator()(T volatile* dest, \
>                                                           T compare_value,  \
>                                                           T exchange_value, \
>                                                           atomic_memory_order order) const { \
>      STATIC_ASSERT(sizeof(StubType) == sizeof(T));                          \
>      return PrimitiveConversions::cast<T>(                                  \
>        StubName(reinterpret_cast<StubType volatile *>(dest),                \
>                 PrimitiveConversions::cast<StubType>(exchange_value),       \
>                 PrimitiveConversions::cast<StubType>(compare_value)));      \
>    }
> 
> DEFINE_STUB_CMPXCHG(_InterlockedCompareExchange8, char) // Use the intrinsic as InterlockedCompareExchange8 does not exist
> DEFINE_STUB_CMPXCHG(InterlockedCompareExchange,   long)
> DEFINE_STUB_CMPXCHG(InterlockedCompareExchange64, __int64)
> 
> #undef DEFINE_STUB_CMPXCHG
> ```
> 
> That makes it very clear that the type is for the Interlocked* function, not the source data type (like int32_t/int64_t or jint/jlong).
> 
> I uploaded updated webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/ and http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.03/, with the former not containing this `sizeof(StubType)` change, and the latter containing it.

I'm not very good with templates so I've asked Kim Barrett if he can 
take a look at this aspect.
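
To make the template aspect concrete, here is a simplified sketch of 
the "size derived from the stub type" idea. It is not the webrev code: 
`SketchCmpxchg`, `DEFINE_SKETCH_CMPXCHG` and the `stub_cmpxchg*` 
functions are hypothetical names, the stubs are non-atomic, and a class 
template specialization is used instead of specializing a member of 
Atomic::PlatformCmpxchg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, non-atomic stand-ins for the Interlocked* functions.
// Argument order mirrors the quoted macro: (dest, exchange, comparand).
inline std::int32_t stub_cmpxchg32(volatile std::int32_t* dest, std::int32_t exchange,
                                   std::int32_t comparand) {
  std::int32_t old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}

inline std::int64_t stub_cmpxchg64(volatile std::int64_t* dest, std::int64_t exchange,
                                   std::int64_t comparand) {
  std::int64_t old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}

// Primary template, selected by operand size in bytes; only specializations exist.
template <std::size_t byte_size>
struct SketchCmpxchg;

// The specialization key is sizeof(StubType), so no numeric size is spelled out.
#define DEFINE_SKETCH_CMPXCHG(StubName, StubType)                              \
  template <>                                                                  \
  struct SketchCmpxchg<sizeof(StubType)> {                                     \
    template <typename T>                                                      \
    T operator()(T volatile* dest, T compare_value, T exchange_value) const {  \
      static_assert(sizeof(StubType) == sizeof(T), "size mismatch");           \
      return static_cast<T>(                                                   \
          StubName(reinterpret_cast<StubType volatile*>(dest),                 \
                   static_cast<StubType>(exchange_value),                      \
                   static_cast<StubType>(compare_value)));                     \
    }                                                                          \
  };

DEFINE_SKETCH_CMPXCHG(stub_cmpxchg32, std::int32_t)
DEFINE_SKETCH_CMPXCHG(stub_cmpxchg64, std::int64_t)

#undef DEFINE_SKETCH_CMPXCHG

inline bool demo_sized_cmpxchg() {
  std::uint32_t value = 5;
  // The 4-byte specialization is chosen from the size of the operand.
  std::uint32_t old =
      SketchCmpxchg<sizeof(value)>()(&value, std::uint32_t(5), std::uint32_t(9));
  assert(old == 5);
  return value == 9;
}
```

One thing the sketch makes visible: two stub types of the same size 
would produce two specializations with the same key, which is a 
redefinition error, so the stub types have to differ in size.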

As a style nit can you realign the parameters for those declarations you 
modified e.g.

  59   inline D Atomic::PlatformAdd<sizeof(StubType)>::add_and_fetch(D volatile* dest, \
  60                                                         I add_value,      \
  61                                                         atomic_memory_order order) const { \


  76   inline T Atomic::PlatformXchg<sizeof(StubType)>::operator()(T volatile* dest, \
  77                                                       T exchange_value, \
  78                                                       atomic_memory_order order) const { \

etc.

Thanks,
David
-----

> Thank you!
> 
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Wednesday, July 15, 2020 06:32
> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
> Cc: openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
> 
> Hi Ludovic,
> 
> On 15/07/2020 11:15 pm, Ludovic Henry wrote:
>> Hi David,
>>
>> Thanks for your feedback.
>>
>>> can we use __int32 for clarity rather than "long"?
>>
>> The Win32 API explicitly uses `long`, and I made sure for these `DEFINE_` macros to use the type used in the declaration of the API. If you are ok with the difference, I'm happy to change that to __int32.
> 
> I prefer to see __int32 in our code as "long" can be quite ambiguous
> depending on the reader (and something we are trying to eradicate from
> shared code - not everyone is aware of the LLP64 programming model versus
> LP64).
> 
> Thanks,
> David
> -----
> 
>> I've uploaded the new webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.01
>>
>> (I've also moved the previous webrevs to their respective webrev.00 folders).
>>
>> Thank you,
>>
>> --
>> Ludovic
>> ________________________________________
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Sunday, July 12, 2020 19:43
>> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
>> Cc: openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>> Hi Ludovic,
>>
>> On 9/07/2020 11:55 pm, Ludovic Henry wrote:
>>> Hello,
>>>
>>> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
>>>
>>> The three areas I have found are:
>>> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
>>> - Exception handling: there is some code repetition which, even if functional, is subpar.
>>> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
>>>
>>> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.
>>
>> Thanks for doing the split!
>>
>> As a general comment can you please ensure that the Oracle copyright
>> second year is updated to 2020. Thanks.
>>
>> Overall these cleanups look good. Thanks for providing them.
>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
>>> Webrevs:
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/
>>
>> Love this cleanup! Great to see all the stubroutines go for x86.
>>
>> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>>
>> Please delete this entire (archaic) comment block.
>>
>>     42 // The following alternative implementations are needed because
>>     43 // Windows 95 doesn't support (some of) the corresponding Windows NT
>>     44 // calls. Furthermore, these versions allow inlining in the caller.
>>     45 // (More precisely: The documentation for InterlockedExchange says
>>     46 // it is supported for Windows 95. However, when single-stepping
>>     47 // through the assembly code we cannot step into the routine and
>>     48 // when looking at the routine address we see only garbage code.
>>     49 // Better safe then sorry!). Was bug 7/31/98 (gri).
>>     50 //
>>     51 // Performance note: On uniprocessors, the 'lock' prefixes are not
>>     52 // necessary (and expensive). We should generate separate cases if
>>     53 // this becomes a performance problem.
>>
>> In this (and elsewhere):
>>
>>     80 DEFINE_STUB_ADD(4, long,    InterlockedAdd)
>>     81 DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)
>>
>> can we use __int32 for clarity rather than "long"?
>>
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>>
>> Looks good!
>>
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
>>
>> Looks good!
>>
>> Thanks,
>> David
>> -----
>>
>>> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
>>>
>>> Thank you,
>>>
>>> --
>>> Ludovic
>>>

From jamsheed.c.m at oracle.com  Thu Jul 16 07:00:18 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 12:30:18 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
Message-ID: <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>

Hi all,

could i get another review?

Best regards,

Jamsheed

On 16/07/2020 06:37, David Holmes wrote:
> Hi Jamsheed,
>
> tl;dr version: fix looks good. Thanks for working through things with 
> me on this one.
>
> Long version ... for the sake of other reviewers (and myself) I'm 
> going to walk through the problem scenario and how the fix addresses 
> it, because the bug report is long and confusing and touches on a 
> number of different issues with async exception handling.
>
> We are dealing with the code generated for Java method entry, and in 
> particular for a synchronized Java method. We do a lot of things in 
> the entry code before we actually lock the monitor and jump to the 
> Java method. Some of those things include method profiling and the 
> counter overflow check for the JIT. If an exception is thrown at this 
> point, the logic to remove the activation would unlock the monitor - 
> which we haven't actually locked yet! So we have the 
> do_not_unlock_if_synchronized flag which is stored in the current 
> JavaThread. We set that flag true so that if any exceptions result in 
> activation removal, the removal logic won't try to unlock the monitor. 
> Once we're ready to lock the monitor we set the flag back to false 
> (note there is an implicit assumption here that monitor locking can 
> never raise an exception).
>
> The problem arises with async exceptions, or more specifically the 
> async exception that is raised due to an "unsafe access error". This 
> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
> due to a bad pointer. The signal handler simply sets a flag to 
> indicate we encountered an "unsafe access error", adjusts the BCI to 
> the next instruction and allows execution to proceed at the next 
> instruction. It is then expected that the runtime will "soon" notice 
> this pending unsafe access error and create and throw the 
> InternalError instance that indicates the ByteBuffer operation failed. 
> This requires executing Java code.
>
> One of the places that checks for that pending unsafe access error is 
> in the destructor of the JRT_ENTRY wrapper that is used for the method 
> profiling and counter overflow checking. This occurs whilst the 
> do_not_unlock_if_synchronized flag is true, so the resulting 
> InternalError won't result in an attempt to unlock the not-locked 
> monitor.
>
> The problem is that creating the InternalError executes Java code - it 
> calls constructors, which call methods etc. And some of those methods 
> are synchronized. So the method entry logic for such a call will set 
> do_not_unlock_if_synchronized to true, perform all the preamble 
> related to the call, then set do_not_unlock_if_synchronized to false, 
> lock the monitor and make the call. When construction completes the 
> InternalError is thrown and we remove the activation for the method we 
> had originally started to call. But now the 
> do_not_unlock_if_synchronized flag has been reset to false by the 
> nested Java method call, so we do in fact try to unlock a monitor that 
> was never locked, and things break.
>
> This nesting problem is well known and we have a mechanism for dealing 
> with it - the UnlockFlagSaver. The actual logic executed for profiling 
> methods and doing the counter overflow check contains the requisite 
> UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
> async exception is processed in the JRT_ENTRY wrapper, which is 
> outside the scope of those UnlockFlagSaver helpers and so they don't 
> help in this case.
>
> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
> call stack to the code that actually does the async exception processing:
>
>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
> +   // May be we are at method entry and requires to save do not unlock flag.
> +   UnlockFlagSaver fs(this);
>
> so now after the InternalError has been created and thrown we will 
> restore the original value of the do_not_unlock_if_synchronized flag 
> (true) and so the InternalError will not cause activation removal to 
> attempt to unlock the not-locked monitor.
>
> The scope of the UnlockFlagSaver could be narrowed to the actual logic 
> for processing the unsafe access error, but it seems fine at method 
> scope.
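
The save/restore idea described above can be modelled with a small 
scoped helper. This is only a sketch under stated assumptions, not the 
HotSpot code: `ThreadModel`, `FlagSaverSketch`, 
`nested_synchronized_entry` and `flag_after_async_handling` are 
hypothetical names, and the "nested Java call" is reduced to setting 
and clearing a bool.

```cpp
#include <cassert>

// Hypothetical model of the per-thread flag; not the HotSpot JavaThread class.
struct ThreadModel {
  bool do_not_unlock_if_synchronized = false;
};

// Scoped saver: remembers the flag on construction and restores it on destruction.
class FlagSaverSketch {
 public:
  explicit FlagSaverSketch(ThreadModel* t)
      : thread_(t), saved_(t->do_not_unlock_if_synchronized) {}
  ~FlagSaverSketch() { thread_->do_not_unlock_if_synchronized = saved_; }
  FlagSaverSketch(const FlagSaverSketch&) = delete;
  FlagSaverSketch& operator=(const FlagSaverSketch&) = delete;

 private:
  ThreadModel* thread_;
  bool saved_;
};

// Models entry to a nested synchronized method: the flag is set for the
// preamble and cleared again just before the monitor would be locked.
inline void nested_synchronized_entry(ThreadModel* t) {
  assert(t != nullptr);
  t->do_not_unlock_if_synchronized = true;
  // ... preamble work (profiling, counter checks) would run here ...
  t->do_not_unlock_if_synchronized = false;
}

// Models async exception processing that ends up running a nested call,
// and reports the flag value the outer method entry sees afterwards.
inline bool flag_after_async_handling(ThreadModel* t, bool use_saver) {
  if (use_saver) {
    FlagSaverSketch saver(t);
    nested_synchronized_entry(t);
  } else {
    nested_synchronized_entry(t);
  }
  return t->do_not_unlock_if_synchronized;
}
```

With the outer flag set to true beforehand, the call without the saver 
leaves the flag false (the broken case), while the call with the saver 
leaves it true, so activation removal in the outer method would not try 
to unlock.
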
>
> A second fix is that the overflow counter check had an assertion that 
> it was not executed with any pending exceptions. But that turned out 
> to be false for reasons I can't fully explain, but it again appears to 
> relate to a pending async exception being installed prior to the 
> method call - and seems related to the two referenced JVM TI 
> functions. The simple solution here is to delete the assertion and to 
> check for pending exceptions on entry to the code and just return 
> immediately. The JRT_ENTRY destructor will see the pending exception 
> and propagate it.
>
> Cheers,
> David
>
> On 16/07/2020 9:50 am, David Holmes wrote:
>> Hi Jamsheed,
>>
>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>> (Thank you Dean, adding serviceability team as this issue involves 
>>> JVMTI features PopFrame, EarlyReturn features)
>>
>> It is not at all obvious how your proposed fix impacts the JVM TI 
>> features.
>>
>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>
>>> (testing: mach5, tier1-5 links in JBS)
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>
>>>> Hi,
>>>>
>>>> Async handling at method entry requires it to be aware of 
>>>> synchronization(like whether it is doing async handling before lock 
>>>> acquire or after)
>>>>
>>>> This is required as the exception handlers rely on this info for 
>>>> unlocking. Async handling code never had this special condition 
>>>> handled, and it worked most of the time as we were using biased 
>>>> locking, which got disabled by [1].
>>>>
>>>> There was one other issue reported in similar time[2]. This issue 
>>>> got triggered in test case by [3], back to back extra safepoint 
>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>> PopFrame request and Thread.Stop request happened together for the 
>>>> test scenario and it reached java method entry with 
>>>> pending_exception set.
>>>>
>>>> I have done a partial fix for the issue, mainly to handle 
>>>> production mode crash failures(do not unlock flag related ones)
>>>>
>>>> Fix detail:
>>>>
>>>> 1) I save restore the "do not unlock" flag in async handling.
>>
>> Sorry but you completely changed the fix compared to what we 
>> discussed and what I pre-reviewed! What happened to changing from 
>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>> and effort to determine that this save/restore of the "do not unlock 
>> flag" is actually correct and valid!
>>
>>>>
>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>> exception can get cleaned just like that in present compiler 
>>>> request and deopt code.
>>
>> What part of the change addresses this?
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>> webrev: http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>
>>>> There are more problems in these code areas, like we clear all 
>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>> deoptimization path.
>>>>
>>>> All these un-handled cases will be separately handled by 
>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>
>>>> Request for review.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>

From coleen.phillimore at oracle.com  Thu Jul 16 14:43:17 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 16 Jul 2020 10:43:17 -0400
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
 <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
Message-ID: <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>


Thanks to David's description of the problem and the fix, this makes 
sense to me now.
I don't like it and we should revisit async exceptions for all the other 
problems it causes, but this change looks safe and good.

thanks,
Coleen

On 7/16/20 3:00 AM, Jamsheed C M wrote:
> Hi all,
>
> could i get another review?
>
> Best regards,
>
> Jamsheed
>
> On 16/07/2020 06:37, David Holmes wrote:
>> Hi Jamsheed,
>>
>> tl;dr version: fix looks good. Thanks for working through things with 
>> me on this one.
>>
>> Long version ... for the sake of other reviewers (and myself) I'm 
>> going to walk through the problem scenario and how the fix addresses 
>> it, because the bug report is long and confusing and touches on a 
>> number of different issues with async exception handling.
>>
>> We are dealing with the code generated for Java method entry, and in 
>> particular for a synchronized Java method. We do a lot of things in 
>> the entry code before we actually lock the monitor and jump to the 
>> Java method. Some of those things include method profiling and the 
>> counter overflow check for the JIT. If an exception is thrown at this 
>> point, the logic to remove the activation would unlock the monitor - 
>> which we haven't actually locked yet! So we have the 
>> do_not_unlock_if_synchronized flag which is stored in the current 
>> JavaThread. We set that flag true so that if any exceptions result in 
>> activation removal, the removal logic won't try to unlock the 
>> monitor. Once we're ready to lock the monitor we set the flag back to 
>> false (note there is an implicit assumption here that monitor locking 
>> can never raise an exception).
>>
>> The problem arises with async exceptions, or more specifically the 
>> async exception that is raised due to an "unsafe access error". This 
>> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
>> due to a bad pointer. The signal handler simply sets a flag to 
>> indicate we encountered an "unsafe access error", adjusts the BCI to 
>> the next instruction and allows execution to proceed at the next 
>> instruction. It is then expected that the runtime will "soon" notice 
>> this pending unsafe access error and create and throw the 
>> InternalError instance that indicates the ByteBuffer operation 
>> failed. This requires executing Java code.
>>
>> One of the places that checks for that pending unsafe access error is 
>> in the destructor of the JRT_ENTRY wrapper that is used for the 
>> method profiling and counter overflow checking. This occurs whilst 
>> the do_not_unlock_if_synchronized flag is true, so the resulting 
>> InternalError won't result in an attempt to unlock the not-locked 
>> monitor.
>>
>> The problem is that creating the InternalError executes Java code - 
>> it calls constructors, which call methods etc. And some of those 
>> methods are synchronized. So the method entry logic for such a call 
>> will set do_not_unlock_if_synchronized to true, perform all the 
>> preamble related to the call, then set do_not_unlock_if_synchronized 
>> to false, lock the monitor and make the call. When construction 
>> completes the InternalError is thrown and we remove the activation 
>> for the method we had originally started to call. But now the 
>> do_not_unlock_if_synchronized flag has been reset to false by the 
>> nested Java method call, so we do in fact try to unlock a monitor 
>> that was never locked, and things break.
>>
>> This nesting problem is well known and we have a mechanism for 
>> dealing with it - the UnlockFlagSaver. The actual logic executed for 
>> profiling methods and doing the counter overflow check contains the 
>> requisite UnlockFlagSaver to avoid the problem just outlined. 
>> Unfortunately the async exception is processed in the JRT_ENTRY 
>> wrapper, which is outside the scope of those UnlockFlagSaver helpers 
>> and so they don't help in this case.
>>
>> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
>> call stack to the code that actually does the async exception 
>> processing:
>>
>>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
>> +   // May be we are at method entry and requires to save do not unlock flag.
>> +   UnlockFlagSaver fs(this);
>>
>> so now after the InternalError has been created and thrown we will 
>> restore the original value of the do_not_unlock_if_synchronized flag 
>> (true) and so the InternalError will not cause activation removal to 
>> attempt to unlock the not-locked monitor.
>>
>> The scope of the UnlockFlagSaver could be narrowed to the actual 
>> logic for processing the unsafe access error, but it seems fine at 
>> method scope.
>>
>> A second fix is that the overflow counter check had an assertion that 
>> it was not executed with any pending exceptions. But that turned out 
>> to be false for reasons I can't fully explain, but it again appears 
>> to relate to a pending async exception being installed prior to the 
>> method call - and seems related to the two referenced JVM TI 
>> functions. The simple solution here is to delete the assertion and to 
>> check for pending exceptions on entry to the code and just return 
>> immediately. The JRT_ENTRY destructor will see the pending exception 
>> and propagate it.
>>
>> Cheers,
>> David
>>
>> On 16/07/2020 9:50 am, David Holmes wrote:
>>> Hi Jamsheed,
>>>
>>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>>> (Thank you Dean, adding serviceability team as this issue involves 
>>>> JVMTI features PopFrame, EarlyReturn features)
>>>
>>> It is not at all obvious how your proposed fix impacts the JVM TI 
>>> features.
>>>
>>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>>
>>>> (testing: mach5, tier1-5 links in JBS)
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Async handling at method entry requires it to be aware of 
>>>>> synchronization(like whether it is doing async handling before 
>>>>> lock acquire or after)
>>>>>
>>>>> This is required as the exception handlers rely on this info for 
>>>>> unlocking. Async handling code never had this special condition 
>>>>> handled, and it worked most of the time as we were using biased 
>>>>> locking, which got disabled by [1].
>>>>>
>>>>> There was one other issue reported in similar time[2]. This issue 
>>>>> got triggered in test case by [3], back to back extra safepoint 
>>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>>> PopFrame request and Thread.Stop request happened together for the 
>>>>> test scenario and it reached java method entry with 
>>>>> pending_exception set.
>>>>>
>>>>> I have done a partial fix for the issue, mainly to handle 
>>>>> production mode crash failures(do not unlock flag related ones)
>>>>>
>>>>> Fix detail:
>>>>>
>>>>> 1) I save restore the "do not unlock" flag in async handling.
>>>
>>> Sorry but you completely changed the fix compared to what we 
>>> discussed and what I pre-reviewed! What happened to changing from 
>>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>>> and effort to determine that this save/restore of the "do not unlock 
>>> flag" is actually correct and valid!
>>>
>>>>>
>>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>>> exception can get cleaned just like that in present compiler 
>>>>> request and deopt code.
>>>
>>> What part of the change addresses this?
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>>
>>>>> There are more problems in these code areas, like we clear all 
>>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>>> deoptimization path.
>>>>>
>>>>> All these un-handled cases will be separately handled by 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>>
>>>>> Request for review.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8231264
>>>>>
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>>
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>>


From jamsheed.c.m at oracle.com  Thu Jul 16 14:49:48 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 20:19:48 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
 <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
 <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>
Message-ID: <24043cec-b3f2-bfa0-fd66-f2fcedc4be27@oracle.com>

Hi Coleen,

Thank you for the review.

Best regards,

Jamsheed

On 16/07/2020 20:13, coleen.phillimore at oracle.com wrote:
>
> Thanks to David's description of the problem and the fix, this makes 
> sense to me now.
> I don't like it and we should revisit async exceptions for all the 
> other problems it causes, but this change looks safe and good.
>
> thanks,
> Coleen
>
> On 7/16/20 3:00 AM, Jamsheed C M wrote:
>> Hi all,
>>
>> Could I get another review?
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 16/07/2020 06:37, David Holmes wrote:
>>> Hi Jamsheed,
>>>
>>> tl;dr version: fix looks good. Thanks for working through things 
>>> with me on this one.
>>>
>>> Long version ... for the sake of other reviewers (and myself) I'm 
>>> going to walk through the problem scenario and how the fix addresses 
>>> it, because the bug report is long and confusing and touches on a 
>>> number of different issues with async exception handling.
>>>
>>> We are dealing with the code generated for Java method entry, and in 
>>> particular for a synchronized Java method. We do a lot of things in 
>>> the entry code before we actually lock the monitor and jump to the 
>>> Java method. Some of those things include method profiling and the 
>>> counter overflow check for the JIT. If an exception is thrown at 
>>> this point, the logic to remove the activation would unlock the 
>>> monitor - which we haven't actually locked yet! So we have the 
>>> do_not_unlock_if_synchronized flag which is stored in the current 
>>> JavaThread. We set that flag true so that if any exceptions result 
>>> in activation removal, the removal logic won't try to unlock the 
>>> monitor. Once we're ready to lock the monitor we set the flag back 
>>> to false (note there is an implicit assumption here that monitor 
>>> locking can never raise an exception).
>>>
>>> The problem arises with async exceptions, or more specifically the 
>>> async exception that is raised due to an "unsafe access error". This 
>>> is where a memory-mapped ByteBuffer causes an access violation 
>>> (SEGV) due to a bad pointer. The signal handler simply sets a flag 
>>> to indicate we encountered an "unsafe access error", adjusts the BCI 
>>> to the next instruction and allows execution to proceed at the next 
>>> instruction. It is then expected that the runtime will "soon" notice 
>>> this pending unsafe access error and create and throw the 
>>> InternalError instance that indicates the ByteBuffer operation 
>>> failed. This requires executing Java code.
>>>
>>> One of the places that checks for that pending unsafe access error 
>>> is in the destructor of the JRT_ENTRY wrapper that is used for the 
>>> method profiling and counter overflow checking. This occurs whilst 
>>> the do_not_unlock_if_synchronized flag is true, so the resulting 
>>> InternalError won't result in an attempt to unlock the not-locked 
>>> monitor.
>>>
>>> The problem is that creating the InternalError executes Java code - 
>>> it calls constructors, which call methods etc. And some of those 
>>> methods are synchronized. So the method entry logic for such a call 
>>> will set do_not_unlock_if_synchronized to true, perform all the 
>>> preamble related to the call, then set do_not_unlock_if_synchronized 
>>> to false, lock the monitor and make the call. When construction 
>>> completes the InternalError is thrown and we remove the activation 
>>> for the method we had originally started to call. But now the 
>>> do_not_unlock_if_synchronized flag has been reset to false by the 
>>> nested Java method call, so we do in fact try to unlock a monitor 
>>> that was never locked, and things break.
>>>
>>> This nesting problem is well known and we have a mechanism for 
>>> dealing with it - the UnlockFlagSaver. The actual logic executed for 
>>> profiling methods and doing the counter overflow check contains the 
>>> requisite UnlockFlagSaver to avoid the problem just outlined. 
>>> Unfortunately the async exception is processed in the JRT_ENTRY 
>>> wrapper, which is outside the scope of those UnlockFlagSaver helpers 
>>> and so they don't help in this case.
>>>
>>> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
>>> call stack to the code that actually does the async exception 
>>> processing:
>>>
>>>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
>>> +   // May be we are at method entry and requires to save do not unlock flag.
>>> +   UnlockFlagSaver fs(this);
>>>
>>> so now after the InternalError has been created and thrown we will 
>>> restore the original value of the do_not_unlock_if_synchronized flag 
>>> (true) and so the InternalError will not cause activation removal 
>>> to attempt to unlock the not-locked monitor.
>>>
>>> The scope of the UnlockFlagSaver could be narrowed to the actual 
>>> logic for processing the unsafe access error, but it seems fine at 
>>> method scope.
>>>
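The nesting problem and the effect of the saver can be modelled in a few lines of C++. This is only a sketch under stated assumptions: FakeThread, FlagSaver and the helper functions are hypothetical stand-ins, not the HotSpot JavaThread or UnlockFlagSaver code, and the nested synchronized call is reduced to the two flag writes described above.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread: only the flag under discussion.
struct FakeThread {
  bool do_not_unlock_if_synchronized = false;
};

// RAII helper modelled on the UnlockFlagSaver idea: remember the flag on
// construction, clear it for the nested work, restore it on destruction.
class FlagSaver {
  FakeThread* _thread;
  bool _saved;
 public:
  explicit FlagSaver(FakeThread* t)
      : _thread(t), _saved(t->do_not_unlock_if_synchronized) {
    _thread->do_not_unlock_if_synchronized = false;
  }
  ~FlagSaver() { _thread->do_not_unlock_if_synchronized = _saved; }
};

// Models entry into a nested synchronized Java method: the entry code sets
// the flag, runs its preamble, then clears the flag just before locking.
void nested_synchronized_call(FakeThread* t) {
  t->do_not_unlock_if_synchronized = true;
  // ... profiling, counter overflow check ...
  t->do_not_unlock_if_synchronized = false;
  // ... lock monitor, run method, unlock monitor ...
}

// Models async exception processing that runs Java code (the InternalError
// constructors) while the outer method entry is still in progress.
void handle_async_exception(FakeThread* t, bool use_saver) {
  if (use_saver) {
    FlagSaver fs(t);
    nested_synchronized_call(t);
  } else {
    nested_synchronized_call(t);
  }
}

// Returns the flag value that activation removal of the outer method sees.
bool flag_seen_by_activation_removal(bool use_saver) {
  FakeThread t;
  t.do_not_unlock_if_synchronized = true;  // outer entry: monitor not locked yet
  handle_async_exception(&t, use_saver);
  return t.do_not_unlock_if_synchronized;
}
```

Without the saver, activation removal for the outer method sees the flag as false and would try to unlock a monitor that was never locked; with it, the value set by the outer method entry survives the nested call.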
>>> A second fix is that the overflow counter check had an assertion 
>>> that it was not executed with any pending exceptions. But that 
>>> turned out to be false for reasons I can't fully explain, but it 
>>> again appears to relate to a pending async exception being installed 
>>> prior to the method call - and seems related to the two referenced 
>>> JVM TI functions. The simple solution here is to delete the 
>>> assertion and to check for pending exceptions on entry to the code 
>>> and just return immediately. The JRT_ENTRY destructor will see the 
>>> pending exception and propagate it.
>>>
>>> Cheers,
>>> David
>>>
>>> On 16/07/2020 9:50 am, David Holmes wrote:
>>>> Hi Jamsheed,
>>>>
>>>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>>>> (Thank you Dean, adding serviceability team as this issue involves 
>>>>> JVMTI features PopFrame, EarlyReturn features)
>>>>
>>>> It is not at all obvious how your proposed fix impacts the JVM TI 
>>>> features.
>>>>
>>>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>>>
>>>>> (testing: mach5, tier1-5 links in JBS)
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Async handling at method entry needs to be aware of 
>>>>>> synchronization (i.e. whether it is doing async handling before 
>>>>>> lock acquire or after).
>>>>>>
>>>>>> This is required as the exception handler relies on this info for 
>>>>>> unlocking. Async handling code never had this special condition 
>>>>>> handled and it worked most of the time as we were using biased 
>>>>>> locking, which got disabled by [1].
>>>>>>
>>>>>> There was one other issue reported in similar time[2]. This issue 
>>>>>> got triggered in test case by [3], back to back extra safepoint 
>>>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>>>> PopFrame request and Thread.Stop request happened together for 
>>>>>> the test scenario and it reached java method entry with 
>>>>>> pending_exception set.
>>>>>>
>>>>>> I have done a partial fix for the issue, mainly to handle 
>>>>>> production mode crash failures(do not unlock flag related ones)
>>>>>>
>>>>>> Fix detail:
>>>>>>
>>>>>> 1) I save and restore the "do not unlock" flag in async handling.
>>>>
>>>> Sorry but you completely changed the fix compared to what we 
>>>> discussed and what I pre-reviewed! What happened to changing from 
>>>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of 
>>>> time and effort to determine that this save/restore of the "do not 
>>>> unlock flag" is actually correct and valid!
>>>>
>>>>>>
>>>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>>>> exception can get cleaned just like that in present compiler 
>>>>>> request and deopt code.
>>>>
>>>> What part of the change addresses this?
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>>
>>>>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>>>
>>>>>> There are more problems in these code areas, like we clear all 
>>>>>> exceptions in compilation request path(interpreter,c1), as well 
>>>>>> as deoptimization path.
>>>>>>
>>>>>> All these un-handled cases will be separately handled by 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>>>
>>>>>> Request for review.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>>>>
>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>>>
>>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>>>
>

From luhenry at microsoft.com  Thu Jul 16 15:39:18 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 16 Jul 2020 15:39:18 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
Message-ID: <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi David,

> I'm not very good with templates so I've asked Kim Barrett if he can
> take a look at this aspect.

Sounds good, looking forward to his review.

I'm updating the webrev at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.03 with the proper realignment of the parameters.

Thank you,

________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Wednesday, July 15, 2020 19:30
To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

Hi Ludovic,

On 16/07/2020 4:29 am, Ludovic Henry wrote:
> Hi David,
>
> I gave a try to using `__int32` and `int` in place of `long`, but MSVC complains of type differences in the parameters passed to the `Interlocked*` functions.
>
> ```
> C:\git\jdk\src\hotspot\os_cpu\windows_x86\atomic_windows_x86.hpp(103): error C2665: '_InterlockedCompareExchange': none of the 4
> overloads could convert all the argument types
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9501): note: could be 'unsigned __int64 _InterlockedCompareExchange(volatile unsigned __int64 *,unsigned __int64,unsigned __int64)'
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9488): note: or       'unsigned long _InterlockedCompareExchange(volatile unsigned long *,unsigned long,unsigned long)'
> C:\git\jdk\build\devkit\10\include\um\winbase.h(9477): note: or       'unsigned int _InterlockedCompareExchange(volatile unsigned int *,unsigned int,unsigned int)'
> [.....]
> ```

That's a shame - and somewhat surprising to me. But so be it.

> To clarify the use of `long` over `__int32` or `int`, I've instead added a comment (see http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp.udiff.html).
>
> Another complementary solution is to call the `DEFINE_*` macros, not with a numerical constant as the first argument, but with the value returned by the sizeof of the second argument. For example, we can have the following for cmpxchg:
>
> ```
> DEFINE_STUB_CMPXCHG(sizeof(char),     char,    _InterlockedCompareExchange8) // Use the intrinsic as InterlockedCompareExchange8 does not exist
> DEFINE_STUB_CMPXCHG(sizeof(long),     long,    InterlockedCompareExchange)
> DEFINE_STUB_CMPXCHG(sizeof(__int64), __int64, InterlockedCompareExchange64)
> ```
>
> In that case, we can even do away with the first argument altogether, like the following:
>
> ```
> #define DEFINE_STUB_CMPXCHG(StubName, StubType)                            \
>    template<>                                                               \
>    template<typename T>                                                     \
>    inline T Atomic::PlatformCmpxchg<sizeof(StubType)>::operator()(T volatile* dest, \
>                                                           T compare_value,  \
>                                                           T exchange_value, \
>                                                           atomic_memory_order order) const { \
>      STATIC_ASSERT(sizeof(StubType) == sizeof(T));                          \
>      return PrimitiveConversions::cast<T>(                                  \
>        StubName(reinterpret_cast<StubType volatile *>(dest),                \
>                 PrimitiveConversions::cast<StubType>(exchange_value),       \
>                 PrimitiveConversions::cast<StubType>(compare_value)));      \
>    }
>
> DEFINE_STUB_CMPXCHG(_InterlockedCompareExchange8, char) // Use the intrinsic as InterlockedCompareExchange8 does not exist
> DEFINE_STUB_CMPXCHG(InterlockedCompareExchange,   long)
> DEFINE_STUB_CMPXCHG(InterlockedCompareExchange64, __int64)
>
> #undef DEFINE_STUB_CMPXCHG
> ```
>
> That makes it very clear that the type is for the Interlocked* function, not the source data type (like int32_t/int64_t or jint/jlong).
>
> I uploaded updated webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.02/ and http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.03/, with the former not containing this `sizeof(StubType)` change, and the latter containing it.
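
The size-keyed dispatch quoted above can be tried outside HotSpot with a portable sketch. Assumptions: mock_cas8/32/64 are hypothetical, non-atomic stand-ins for the Interlocked* functions (same argument order: destination, exchange, comparand), bit_cast_same_size stands in for PrimitiveConversions::cast, the memory-order parameter is omitted, and fixed-width types replace long/__int64 so the three specializations stay distinct on LP64 platforms as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Non-atomic stand-ins for the platform stubs (illustration only).
inline char mock_cas8(char volatile* dest, char exchange, char comparand) {
  char old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}
inline std::int32_t mock_cas32(std::int32_t volatile* dest,
                               std::int32_t exchange, std::int32_t comparand) {
  std::int32_t old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}
inline std::int64_t mock_cas64(std::int64_t volatile* dest,
                               std::int64_t exchange, std::int64_t comparand) {
  std::int64_t old = *dest;
  if (old == comparand) *dest = exchange;
  return old;
}

// Value-preserving conversion between two types of identical size.
template <typename To, typename From>
To bit_cast_same_size(From from) {
  static_assert(sizeof(To) == sizeof(From), "size mismatch");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Primary template is only declared; each size gets a specialization.
template <std::size_t N> struct PlatformCmpxchg;

// The specialization key is derived from the stub's own type, so the macro
// needs no separate numeric size argument.
#define DEFINE_STUB_CMPXCHG(StubName, StubType)                               \
  template <>                                                                 \
  struct PlatformCmpxchg<sizeof(StubType)> {                                  \
    template <typename T>                                                     \
    T operator()(T volatile* dest, T compare_value, T exchange_value) const { \
      static_assert(sizeof(StubType) == sizeof(T), "size mismatch");          \
      return bit_cast_same_size<T>(                                           \
          StubName(reinterpret_cast<StubType volatile*>(dest),                \
                   bit_cast_same_size<StubType>(exchange_value),              \
                   bit_cast_same_size<StubType>(compare_value)));             \
    }                                                                         \
  };

DEFINE_STUB_CMPXCHG(mock_cas8,  char)
DEFINE_STUB_CMPXCHG(mock_cas32, std::int32_t)
DEFINE_STUB_CMPXCHG(mock_cas64, std::int64_t)

#undef DEFINE_STUB_CMPXCHG

// Front end: picks the specialization from the size of the caller's type.
template <typename T>
T cmpxchg(T volatile* dest, T compare_value, T exchange_value) {
  return PlatformCmpxchg<sizeof(T)>()(dest, compare_value, exchange_value);
}
```

A caller working with uint32_t or int64_t lands on the 4-byte or 8-byte stub purely through sizeof(T), which is the property the macro change is meant to make explicit.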

I'm not very good with templates so I've asked Kim Barrett if he can
take a look at this aspect.

As a style nit can you realign the parameters for those declarations you
modified e.g.

  59   inline D Atomic::PlatformAdd<sizeof(StubType)>::add_and_fetch(D volatile* dest, \
  60                                                         I add_value,      \
  61                                                         atomic_memory_order order) const { \


  76   inline T Atomic::PlatformXchg<sizeof(StubType)>::operator()(T volatile* dest, \
  77                                                       T exchange_value, \
  78                                                       atomic_memory_order order) const { \

etc.

Thanks,
David
-----

> Thank you!
>
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Wednesday, July 15, 2020 06:32
> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
> Cc: openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>
> Hi Ludovic,
>
> On 15/07/2020 11:15 pm, Ludovic Henry wrote:
>> Hi David,
>>
>> Thanks for your feedback.
>>
>>> can we use __int32 for clarity rather than "long"?
>>
>> The Win32 API explicitly uses `long`, and I made sure for these `DEFINE_` macros to use the type used in the declaration of the API. If you are ok with the difference, I'm happy to change that to __int32.
>
> I prefer to see __int32 in our code as "long" can be quite ambiguous
> depending on the reader (and something we are trying to eradicate from
> shared code - not everyone is aware of the LLP64 programming model versus
> LP64).
>
> Thanks,
> David
> -----
>
>> I've uploaded the new webrevs at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.01
>>
>> (I've also moved the previous webrevs to their respective webrev.00 folders).
>>
>> Thank you,
>>
>> --
>> Ludovic
>> ________________________________________
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Sunday, July 12, 2020 19:43
>> To: Ludovic Henry; hotspot-runtime-dev at openjdk.java.net
>> Cc: openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>> Hi Ludovic,
>>
>> On 9/07/2020 11:55 pm, Ludovic Henry wrote:
>>> Hello,
>>>
>>> As part of adding support for Windows-AArch64, I've had the opportunity to read through most of the Windows-x86 code. In doing so, I found some code that I think can be simplified and made easier to read and maintain.
>>>
>>> The three areas I have found are:
>>> - Atomics: Hotspot doesn't make use of existing intrinsics provided by MSVC and Win32, even ones available since Windows XP.
>>> - Exception handling: there is some code repetition which, even if functional, is subpar.
>>> - Frames: we can use the existing os::fetch_frame_from_context to simplify the code and reduce frame parsing logic duplication.
>>>
>>> I've split the webrevs along the above lines, making each simpler to review. I'm also hosting these webrevs on Bernhard Urban's CR as I currently do not have authorship. I'll also work with him to update the description of the JBS.
>>
>> Thanks for doing the split!
>>
>> As a general comment can you please ensure that the Oracle copyright
>> second year is updated to 2020. Thanks.
>>
>> Overall these cleanups look good. Thanks for providing them.
>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248817
>>> Webrevs:
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/
>>
>> Love this cleanup! Great to see all the stubroutines go for x86.
>>
>> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>>
>> Please delete this entire (archaic) comment block.
>>
>>     42 // The following alternative implementations are needed because
>>     43 // Windows 95 doesn't support (some of) the corresponding Windows NT
>>     44 // calls. Furthermore, these versions allow inlining in the caller.
>>     45 // (More precisely: The documentation for InterlockedExchange says
>>     46 // it is supported for Windows 95. However, when single-stepping
>>     47 // through the assembly code we cannot step into the routine and
>>     48 // when looking at the routine address we see only garbage code.
>>     49 // Better safe then sorry!). Was bug 7/31/98 (gri).
>>     50 //
>>     51 // Performance note: On uniprocessors, the 'lock' prefixes are not
>>     52 // necessary (and expensive). We should generate separate cases if
>>     53 // this becomes a performance problem.
>>
>> In this (and elsewhere):
>>
>>     80 DEFINE_STUB_ADD(4, long,    InterlockedAdd)
>>     81 DEFINE_STUB_ADD(8, __int64, InterlockedAdd64)
>>
>> can we use __int32 for clarity rather than "long"?
>>
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>>
>> Looks good!
>>
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
>>
>> Looks good!
>>
>> Thanks,
>> David
>> -----
>>
>>> Tests: jtreg:hotspot:tier, jtreg:jdk:tier1, jtreg:jdk:tier2, jtreg:langtools on Windows-x86 and Windows-x86_64, no regressions.
>>>
>>> Thank you,
>>>
>>> --
>>> Ludovic
>>>

From goetz.lindenmaier at sap.com  Thu Jul 16 16:30:23 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 16 Jul 2020 16:30:23 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard, 

I'll answer the obvious things in this mail now.
I'll go through the code thoroughly again and write 
a review of my findings thereafter.

> So here is the new webrev.6
> 
> Webrev.6:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/
Thanks for the incremental webrev, it's helpful!
 
> I spent most of the time running a microbenchmark [1] I wrote to answer
> questions from your review. At first I had trouble with variance in the
> results until I found out it was due to the NUMA architecture of the
> server I used. After that I noticed that there was a performance
> regression of about 5% even at low agent activity. I finally found out
> that it was due to the implementation of
> JavaThread::wait_for_object_deoptimization() which is called by the
> target of the JVMTI operation to self suspend for object deoptimization.
> I fixed this by adding limited spinning before calling wait() on the
> monitor.
> 
> The delta includes many changes in comments, renaming of names, etc. So
> I'd like to summarize
> functional changes:
> 
> * Collected all the code for the testing feature DeoptimizeObjectsALot in
> compileBroker.cpp and reworked it.
Thanks, this makes it much more compact.

>   With DeoptimizeObjectsALot enabled internal threads are started that
> deoptimize frames and
>   objects. The number of threads started are given with
> DeoptimizeObjectsALotThreadCountAll and
>   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> threads whereas the
>   latter operates on a single thread selected round robin.
> 
>   I removed the mode where deoptimizations were performed at every nth
> exit from the runtime. I never used it.

Do I get it right? You have an n:1 and an n:all test scenario.
 n:1:   n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
 n:all: n threads deoptimize all Java threads where n = DOALThreadCountAll?

> * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> execute it always independently
>   of is_thread_fully_suspended().
Is this also a performance optimization?

> * Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found
> this testing with DeoptimizeObjectsALot.
Ok.

> * Added EscapeBarrier::thread_removed().
Ok.

> * EscapeBarrier constructors: barriers can now be entirely disabled by
> disabling DoEscapeAnalysis.
>   This effectively disables the enhancement.
Good!

> * JavaThread::wait_for_object_deoptimization():
>   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> safepoint check! This
>     caused issues with not walkable stacks with DeoptimizeObjectsALot.
OK. As I understand, there was one safepoint check in the old version, 
now there is one in each iteration.  I assume this is intended, right?

>   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> microbenchmark [1]
Ok.  Nice improvement, nice catch!
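
The "limited spinning before wait()" idea can be sketched as follows. This is an illustration under assumptions, not the code from the webrev: SuspendGate, resume() and wait_until_resumed() are hypothetical names, and a std::mutex plus std::condition_variable stand in for the HotSpot monitor.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// A suspended thread first polls the flag a bounded number of times and
// only falls back to blocking on the condition variable if that fails.
class SuspendGate {
  std::atomic<bool> _suspended{true};
  std::mutex _m;
  std::condition_variable _cv;

 public:
  // Called by the requesting thread once object deoptimization is done.
  void resume() {
    {
      std::lock_guard<std::mutex> guard(_m);
      _suspended.store(false, std::memory_order_release);
    }
    _cv.notify_all();
  }

  // Called by the target thread. Returns true if the resume was observed
  // while spinning, false if the blocking path had to be taken.
  bool wait_until_resumed(int spin_limit) {
    for (int i = 0; i < spin_limit; i++) {
      if (!_suspended.load(std::memory_order_acquire)) {
        return true;  // cheap path: no monitor traffic at all
      }
      std::this_thread::yield();
    }
    std::unique_lock<std::mutex> lock(_m);
    _cv.wait(lock, [this] {
      return !_suspended.load(std::memory_order_acquire);
    });
    return false;
  }
};
```

When the suspension is short, as with frequent JVMTI calls, the spinning path avoids the cost of parking and waking the thread, which is plausibly where the regression at low agent activity came from.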

> 
> I refer to some more changes answering your questions and comments inline
> below.
> 
> Thanks,
> Richard.
> 
> [1] Microbenchmark:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 


> > I understand you annotate at safepoints where the escape analysis
> > finds out that an object is "better" than global escape.
> > These are the cases where the analysis identifies optimization
> > opportunities. These annotations are then used to deoptimize
> > frames and the objects referenced by them.
> > Doesn't this overestimate the optimized
> > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > out.
> 
> Yes, the implementation is conservative, but it is comparatively simple and
> the additional debug
> info is just 2 flags per safepoint. 
Thanks. It also helped that you explained to me offline that 
there are more optimizations than only lock elimination and scalar
replacement done based on the ea information.
The ea refines the IR graph, which allows follow-up optimizations 
that cannot easily be traced back to the escaping objects or 
the call sites where they do not escape. 
Thus, if there are non-global escaping objects, you have to 
deoptimize the frame.
Did I repeat that correctly?
With this understanding, a number of my proposed renamings/comments
are obsolete.


> On the other hand, those JVMTI operations
> that really trigger
> deoptimizations are expected to be comparatively infrequent such that
> switching to the interpreter
> for a few microseconds will hardly have an effect.
That sounds reasonable.

> I've done microbenchmarking to check this.
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 
> I found that in the worst case performance can be impacted by 10%. If the
> agent is extremely active
> and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> millisecond or more often,
> then the performance impact can be 30%. But I would think that this is not
> realistic. These calls
> are issued in interactive sessions to analyze deadlocks.
Ok. 
 
> We could get more precise deoptimizations by adding a third flag per
> safepoint for ea-local objects
> among the owned monitors. This would help improve the worst case in the
> benchmark. But I'm not
> convinced, if it is worth it.
> 
> Refer to the README.txt of the microbenchmark for a more detailed
> discussion.
 
> > pcDesc.hpp
> >
> > I would like to see some documentation of the methods. 
> Done. I didn't take your text, though, because I only noticed it after writing
> my own. Let me know if you are not ok with it.
That's fine. My texts were only proposals, you as author know better
what goes on anyways.

> > scopeDesc.cpp
> >
> >   Besides refactoring, this copies the escape info from pcDesc to
> >   scopeDesc and adds accessors. Trivial.
> >
> >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> >   These are opto terms, but scopeDesc is a shared data structure
> >   that does not depend on a specific compiler.
> >   Please explain what is going on without using these terms.
> 
> Actually these are not too opto specific terms. They are used in the paper
> referenced in
> escape.hpp. Also you can easily google them. I'd rather keep the comments
> as they are.
Hmm, I'm not really happy with this, as the papers are also
for the compiler community, and probably not familiar to 
others that work with HotSpot.
But stay with your terms if you think it makes it clearer.
Anyway, now that I understand why you use conservative
information (see above), the descriptions I had in mind are not precise.

> > callnode.hpp
> >
> > You add functionality to annotate callnodes with escape information
> > This is carried through code generation to final output where it is
> > added to the compiled methods meta information.
> >
> > At Safepoints in general jvmti can access
> >   - Objects that were scalar replaced. They must be reallocated.
> >     (Flag EliminateAllocations)
> >   - Objects that should be locked but are not because they never
> >     escape the thread. They need to be relocked.
> >
> > At calls, Objects where locks have been removed escape to callees.
> > We must persist this information so that if jvmti accesses the
> > object in a callee, we can determine by looking at the caller that
> > it needs to be relocked.
> 
> Note that the ea-optimization need not be at the current location; it can
> also follow when control returns to the caller. Lock elimination isn't the
> only relevant optimization.
Yes, I understood now, see above. Thanks for explaining.
> Accesses to instance
> members or array elements can be optimized as well.
You mean the compiler can/will ignore volatile or memory ordering
requirements for non-escaping objects? Sounds reasonable to do.

> > // Returns true if at least one of the arguments to the call is an oop
> > // that does not escape globally.
> > bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {
> 
> IMHO the method names are descriptive and don't need the comments. But I
> give in :) (only replaced
> "oop" with "object")
Thanks. Yes, object is better than oop.

> You are right, it is not correct how flags are checked. Especially if only
> running with the JVMCI compiler.
>
> I changed Deoptimization::deoptimize_objects_internal() to make
> reallocation and relocking dependent
> on similar checks as in Deoptimization::fetch_unroll_info_helper().
> Furthermore EscapeBarriers are
> conditionally activated depending on the following (see EscapeBarrier ctors):
> 
> JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> COMPILER2_PRESENT(|| DoEscapeAnalysis)
> 
> So the enhancement can be practically completely disabled by disabling
> DoEscapeAnalysis, which is
> what C2 currently does if JVMTI capabilities that allow access to local
> references are taken.
Thanks for fixing. 

> I went for the latter.
> 
> > In fetch_unroll_info_helper, I don't understand why you need
> >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > for eliminated locks, but not for scalar replaced objects?
> 
> In short reallocation is idempotent, relocking is not.
> 
> Without the enhancement Deoptimization::realloc_objects() can already be
> called more than once for a frame:
> 
> First call in materializeVirtualObjects() (also iterateFrames()).
> 
> Second (indirect) call in fetch_unroll_info_helper().
> 
> The objects from the first call are saved as jvmti deferred updates when
> realloc_objects()
> returns. Note that there is no relationship to jvmti. The thing in common is
> that updates cannot be
> directly installed into a compiled frame, it is necessary to deoptimize the
> frame and defer the
> updates until the compiled frame gets replaced. Every time the vframes
> corresponding to the owner
> frame are iterated, they get the deferred updates. So in
> fetch_unroll_info_helper() the
> GrowableArray<compiledVFrame*>* chunk reference them too. All
> references to the objects created by
> the second (indirect) call to realloc_objects() are never used, because
> compiledVFrame accessors to
> locals, expressions, and monitors override them with the deferred updates.
> The objects become
> unreachable and get gc'ed.
OK, so repeatedly computed vFrames always have the first version of 
reallocated objects by construction, so it need not be handled here.
But also due to construction, objects might be allocated just to be
discarded.
 
> materializeVirtualObjects() does not bother with relocking.
> deoptimize_objects_internal(), which is
> introduced by the enhancement, does relock objects, after all the lock
> elimination becomes illegal 
> with the change in escape state. Relocking twice does not work, so the
> enhancement avoids it by
> checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> 
> Note that materializeVirtualObjects() can be called more than once and will
> always return the very
> same objects, even though it calls realloc_objects() again.
Ok.
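
The difference between the two operations can be sketched in standalone
C++. This is only an illustration under stated assumptions: Object, Frame,
realloc_object() and relock_object() are hypothetical stand-ins and not
HotSpot code; the flag merely models the role of
EscapeBarrier::objs_are_deoptimized().

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration; these types do not exist in HotSpot.
struct Object {
  int lock_count = 0;
};

struct Frame {
  std::shared_ptr<Object> deferred;   // first reallocated object, kept as deferred update
  bool objs_are_deoptimized = false;  // models EscapeBarrier::objs_are_deoptimized()
};

// Reallocation is idempotent from the caller's view: each call allocates,
// but only the first object is ever visible through the frame.
std::shared_ptr<Object> realloc_object(Frame& f) {
  auto fresh = std::make_shared<Object>();
  if (!f.deferred) {
    f.deferred = fresh;
  }
  return f.deferred;  // a later 'fresh' is dropped and reclaimed
}

// Relocking is not idempotent, so it is guarded by the per-frame flag.
void relock_object(Frame& f) {
  if (f.objs_are_deoptimized) {
    return;  // already relocked once; a second relock would be wrong
  }
  realloc_object(f)->lock_count++;
  f.objs_are_deoptimized = true;
}
```

Calling realloc_object() twice yields the same object, while the guard
keeps lock_count at 1 even if relock_object() is called repeatedly.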


> > I would guess it is because the eliminated locks can be applied to
> > argEscape, but scalar replacement only to noescape objects?
> > I.e. it might have been done before?
> >
> > But why isn't this the case for eliminate_allocations?
> > deoptimize_objects_internal does both unconditionally,
> > so both can happen to inner frames, right?
> 
> Sorry, I don't quite understand. Hope the explanation above helps.
Yes.  I was guessing wrong :)

> >   I like if boolean operators are at the beginning of broken lines,
> >   but I think hotspot convention is to have them at the end.
> Ok, fixed.
Thanks.

> 
> > Code will get much more simple if BiasedLocking is removed.
> >
> > EscapeBarrier:: ...
> >
> > (This class maybe would qualify for a file of its own.)
> >
> > deoptimize_objects()
> > I would mention escape analysis only as side remark.  Also, as I understand,
> > there is only one frame at given depth?
> > // Deoptimize frames with optimized objects. This can be omitted locks and
> > // objects not allocated but replaced by scalars. In C2, these optimizations
> > // are based on escape analysis.
> > // Up to depth, deoptimize frames with any optimized objects.
> > // From depth to entry_frame, deoptimize only frames that
> > // pass optimized objects to their callees.
> > (First part similar for the comment above
> > EscapeBarrier::deoptimize_objects_internal().)
> 
> I've reworked the comment. Let me know if you still think it needs to be
> improved.
Good now, thanks (maybe break the long line ...)


> > What is the check (cur_depth <= depth) good for? Can you
> > ever walk past entry_frame?
> 
> Yes (assuming you mean the outer while-statement), there are java frames
> beyond the entry frame if a
> native method calls java methods again. So we visit all frames up to the given
> depth and from there
> we continue to the entry frame. It is not necessary to continue beyond that
> entry frame, because
> escape analysis assumes that arguments to native functions escape globally.
> 
> Example: Let the java stack look like this:
> 
> +---------+
> | Frame A |
> +---------+
> | Frame N |
> +---------+
> | Frame B |
> +---------+ <- top of stack
> 
> Where java method A calls native method N and N calls java method B.
> 
> Very simplified the native stack will look like this
> 
> +-------------------------+
> | Frame of JIT Compiled A |
> +-------------------------+
> | Frame N                 |
> +-------------------------+
> | Entry Frame             |
> +-------------------------+
> | Frame B                 |
> +-------------------------+ <- top of stack
> 
> The entry frame is an activation of the call stub, which is a small assembler
> routine that
> translates from the native calling convention to the java calling convention.
> 
> There cannot be any ArgEscape that is passed to B (see above), therefore we
> can stop the stackwalk
> at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A,
> as it is directly
> accessed. 
Ok, thanks, nice explanation!!
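
The walk described above can be sketched in standalone C++. This is a
minimal model, not the code from the webrev: Kind, Frame and
frames_to_inspect() are hypothetical names, and it assumes that depth counts
Java-level frames (including native method frames) from the top of the
stack, starting at 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical frame model for illustration only.
enum class Kind { Java, Native, Entry };

struct Frame {
  std::string name;
  Kind kind;
};

// Returns the names of the frames that are inspected: all frames up to
// 'depth', then onward until the next entry frame is reached.
std::vector<std::string> frames_to_inspect(const std::vector<Frame>& top_first,
                                           int depth) {
  std::vector<std::string> visited;
  int cur_depth = 1;
  for (const Frame& f : top_first) {
    if (cur_depth > depth && f.kind == Kind::Entry) {
      break;  // arguments to native callers are assumed to escape globally
    }
    if (f.kind != Kind::Entry) {
      visited.push_back(f.name);
      ++cur_depth;
    }
  }
  return visited;
}
```

For the stack from the example (B, entry frame, N, A from top to bottom),
depth 1 stops at the entry frame after B, while depth 3 continues through N
to A.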

> > Isn't vf->is_compiled_frame() prerequisite that "Move to next physical
> > frame" is needed? You could move it into the other check.
> > If so, similar for deoptimize_objects_all_threads().
> 
> Only compiledVFrame require moving to the /top/ frame. Fixed.
Thanks, this looks better.

> > Synchronization: looks good. I think others had a look at this before.
> >
> > EscapeBarrier::deoptimize_objects_internal()
> >   The method name is misleading, it is not used by
> >   deoptimize_objects().
> >   Also, a method with the same name is in Deoptimization.
> >   Proposal: deoptimize_objects_thread() ?
> 
> Sorry, but I don't see, why it would be misleading.
> What would be the meaning of 'deoptimize_objects_thread'? I don't
> understand that name.
1. I have no idea why it's called "_internal". Because it is private?
   By the name, I would expect that EscapeBarrier::deoptimize_objects()
   calls it for some internal tasks. But it does not.
2. My proposal: deoptimize_objects_all_threads() iterates all threads 
and calls deoptimize_objects(_one)_thread(thread) for each of these.
That's how I would have named it. 
But no bike shedding, if you don't see what I mean it's not obvious.


> > C1 stubs: this really shows you tested all configurations, great!
> >
> >
> > mutexLocker: ok.
> > objectMonitor.cpp: ok
> > stackValue.hpp   Is this missing clearing a bug?
> 
> In short: that change is not needed anymore. I'll remove it again.
Good. Thanks for the details.

> > Renaming deferred_locals to deferred_updates is good, as well as
> > adding a datastructure for it.
> > (Adding this data structure might be a breakout, too.)
> >
> > good.
> >
> > thread.cpp
> >
> > good.
> >
> > vframe.cpp
> >
> > Is this a bug in existing code?
> > Makes sense.
> 
> Depends on your definition of bug. There are no references to
> vframe::is_entry_frame() in the
> existing code. I would think it is a bug.
So it is :)

> 
> >
> > vframe_hp.hpp
> > (What stands _hp for? helper? The file should be named
> > compiledVFrame ...)
> >
> > not_global_escape_in_scope() ...
> > Again, you mention escape analysis here. Comments above hold, too.
> 
> I think it is the right name, because it is meaningful and simple.
Ok, accepted ... given my understandings from above.

> 
> > You introduce JvmtiDeferredUpdates. Good.
> >
> > vframe_hp.cpp
> >
> > Changes for JvmtiDeferredUpdates, escape state accessors,
> >
> > line 422:
> > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> >
> >
> > macros.hpp
> >   Good.
> >
> >
> > Test coding
> > ============
> >
> > compileBroker.h|cpp
> >
> > You introduce a third class of threads handled here and
> > add a new flag to distinguish it. Before, the two kinds
> > of threads were distinguished implicitly by passing in
> > a compiler for compiler threads.
> > The new thread kind is only used for testing in debug.
> >
> > make_thread:
> > You could assert (comp != NULL...) to assure previous
> > conditions.
> 
> I replaced the if-statements with a switch-statement, made sure all enum-
> elements are covered, and
> added the assertion you suggested.
> 
> > line 989 indentation broken
> 
> You are referring to this block I assume:
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> 
>  976   if (MethodFlushing) {
>  977     // Initialize the sweeper thread
>  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
>  979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
>  981   }
>  982
>  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
>  984   if (DeoptimizeObjectsALot == 2) {
>  985     // Initialize and start the object deoptimizer threads
>  986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
>  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
>  988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
>  990     }
>  991   }
>  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> 
> I cannot really see broken indentation here. Am I looking at the wrong
> location?
I don't have the source version I reviewed last time any more, so 
I can't check. But maybe an artefact from patching ... if there were
tabs jcheck would have told you, so that's not it. No problem.

Best regards,
  Goetz.

From kim.barrett at oracle.com  Thu Jul 16 18:20:26 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 16 Jul 2020 14:20:26 -0400
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>

> On Jul 16, 2020, at 11:39 AM, Ludovic Henry <luhenry at microsoft.com> wrote:
> 
> Hi David,
> 
>> I'm not very good with templates so I've asked Kim Barrett if he can
>> take a look at this aspect.
> 
> Sounds good, looking forward for his review.

atomic_windows_x86.hpp:  
I think retaining the "stub" nomenclature is misleading; "stub" has a
somewhat specific meaning in low-level HotSpot code.  "intrinsic"
might be a better choice.

atomic_windows_x86.hpp:  
I'm guessing that as a followup, as part of the aarch64 port, these
will probably be changed to dispatch on the memory order to choose the
appropriately ordered intrinsics.  But what's being proposed seems
sufficient for now.

atomic_windows_x86.hpp:  
I noticed that HotSpot's cmpxchg argument order is different than the
MSVC intrinsic's argument order.  That might be worthy of a comment.
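
To make the ordering difference concrete, here is a portable sketch. It is
not the code from the webrev: msvc_style_cmpxchg() and
hotspot_style_cmpxchg() are hypothetical names, and std::atomic stands in
for the real intrinsic. It assumes the MSVC order is (destination, exchange,
comparand) and the HotSpot order is (dest, compare_value, exchange_value).

```cpp
#include <atomic>
#include <cassert>

// Stand-in with the MSVC intrinsic's parameter order:
// (destination, exchange, comparand). Returns the value found in *dest.
long msvc_style_cmpxchg(std::atomic<long>* dest, long exchange, long comparand) {
  long expected = comparand;
  dest->compare_exchange_strong(expected, exchange);
  // On success 'expected' is unchanged (the old value); on failure it has
  // been updated to the value actually found. Either way it is the old value.
  return expected;
}

// Wrapper with the HotSpot-style order: (dest, compare_value, exchange_value).
long hotspot_style_cmpxchg(std::atomic<long>* dest,
                           long compare_value,
                           long exchange_value) {
  // Note: the last two arguments are swapped relative to the callee.
  return msvc_style_cmpxchg(dest, exchange_value, compare_value);
}
```

Because both middle and last parameters have the same type, a mix-up would
compile silently, which is why a comment at the call site seems worthwhile.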

> I'm updating the webrev at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.03 with the proper realignment of the parameters.

That parameter alignment problem doesn't seem to have been fixed.

Other than these minor nits, this change looks good to me.

Nice to see the stub functions go away.


From luhenry at microsoft.com  Thu Jul 16 18:27:00 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 16 Jul 2020 18:27:00 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
Message-ID: <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Kim,

> atomic_windows_x86.hpp:
> I think retaining the "stub" nomenclature is misleading; "stub" has a
> somewhat specific meaning in low-level HotSpot code.  "intrinsic"
> might be a better choice.

Ok, let me rename that to IntrinsicName and IntrinsicType.

> atomic_windows_x86.hpp:
> I'm guessing that as a followup, as part of the aarch64 port, these
> will probably be changed to dispatch on the memory order to choose the
> appropriately ordered intrinsics.  But what's being proposed seems
> sufficient for now.

This code will not be shared directly with aarch64, even though it is very similar. As part of landing the Windows-AArch64 changes, we can imagine merging the two into src/hotspot/os/windows/atomic_windows.hpp for example.

> atomic_windows_x86.hpp:
> I noticed that HotSpot's cmpxchg argument order is different than the
> MSVC intrinsic's argument order.  That might be worthy of a comment.

Good point, let me add a comment.

> That parameter alignment problem doesn't seem to have been fixed.

Bernhard (whose CR I'm using) hasn't had a chance to upload it yet, I'll let you know once it's done.

> Nice to see the stub functions go away.

It does indeed seem to simplify the code drastically, so I'm happy to participate in that as well :)

Thank you,



From luhenry at microsoft.com  Thu Jul 16 22:00:37 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 16 Jul 2020 22:00:37 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>,
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>

I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04



From david.holmes at oracle.com  Thu Jul 16 23:11:46 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 17 Jul 2020 09:11:46 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <bc6c8f7a-d817-78b4-4d81-4960403d2f2b@oracle.com>

Latest changes look good to me!

Thanks,
David

On 17/07/2020 8:00 am, Ludovic Henry wrote:
> I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04

From kim.barrett at oracle.com  Fri Jul 17 01:43:38 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 16 Jul 2020 21:43:38 -0400
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>

> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
> 
> I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04

The change from "StubName" => "IntrinsicName" made the indenting of
arguments in the calls no longer lined up normally. Line 65, line 82,
and lines 104-5 are now abnormally indented.

Other than that, looks good.  I don't need another webrev for a fix of
the indentation.


From goetz.lindenmaier at sap.com  Fri Jul 17 12:30:40 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 17 Jul 2020 12:30:40 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write
> a review of my findings thereafter.
As promised, a detailed walk-through, but without any major findings:

c1_IR.hpp: ok
ci_Env.h|cpp: ok
compiledMethod.cpp, nmethod.cpp: ok
debugInfoRec.h|cpp: ok
scopeDesc.h|cpp ok

compileBroker.h|cpp: 
Maybe a bit of documentation how and why you start 
the threads? I had expected there are two test
scenarios run after each other, but now I understand 'Single'
and 'All' run simultaneously.  Well, this really is a stress test!
Also good that the two variants of deoptimization are
stressed against each other.
Besides that really nice it's all in one place.

rootResolver.cpp: ok
jvmciCodeInstaller.cpp: ok

c2compiler.cpp: The essence of this change! Just one line :)
Great!

callnode.hpp ok
escape.h|cpp ok
macro.cpp 
I was not that happy with the names saying not_global_escape
and similar. I now agree you have to use the terms of the escape
analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still
not happy with the 'not' in the term, I always try to expand the name to
some sentence with a negated verb, but it makes no sense.
For example, "has_not_global_escape_in_scope" expands to 
"Hasn't a global escape in its scope." in my thinking, which makes 
no sense. You probably mean
"Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
in its scope."

C2 is using the word "non" in this context, e.g., here 
alloc->is_non_escaping.

'non' obviously negates the adjective 'global';
non-global or even nonglobal is an English term I find on the 
net. 
So what about "has_non_global_escape_in_scope?"

matcher.cpp ok

output.cpp:1071
Please break the long line.

jvmtiCodeBlobEvents.cpp ok

jvmtiEnv.cpp
MaxJavaStackTraceDepth is only documented to affect
the exceptions stack trace depth, not to limit jvmti 
operations. Therefore I wondered why it is used here. 
None of your business, but the flag should
document this in globals.hpp, too.  
Does jvmti specify that the same limits are used ...?
ok on your side.

jvmtiEnvBase.cpp  ok
jvmtiImpl.h|cpp  ok
jvmtiTagMap.cpp ok
whitebox.cpp ok

deoptimization.cpp

line 177: Please break line
line 246, 281: Please break line
1578, 1583, 1589, 1632, 1649, 1651 Break line

1651: You use 'non'-terms, too: non-escaping :)

2805, 2929, 2946ff, break lines

deoptimization.hpp

158, 174, 176 ... I would break lines too, but here you are in
good company :)

globals.hpp ok
mutexLocker.h|cpp ok
objectMonitor.cpp ok

thread.cpp 

2631 typo: sapfepont --> safepoint

thread.hpp ok
thread.inline.hpp ok
vframe.cpp ok
vframe_hp.cpp   458ff break lines
vframe_hp.hpp ok
macros.hpp ok
TEST.ROOT ok
WhiteBox.java ok

IterateHeapWithEscapeAnalysisEnabled.java

line 415:
msg("wait until target thread has set testMethod_result");
while (testMethod_result == 0) {
    Thread.sleep(50);
}
Might the test run into timeouts at this place?
The field is volatile, i.e. it will be reloaded
in each iteration. But will dontinline_testMethod
write it back to main memory in time?

libIterateHeapWithEscapeAnalysisEnabled.c ok

EATests.java

This is a very elaborate test.
I found a row of test cases illustrating issues
we talked about before. Really helpful!

1311: Typo: materialize -> materialized

1640: setting local variable i triggers always deoptimization
  --> setting local variable i always triggers deoptimization

2176: dontinline_calee --> dontinline_callee
2510: poping --> popping  ... but I'm not sure here.

https://www.urbandictionary.com/define.php?term=poping
poping
Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
;)
Don't do it!

EATestsJVMTI.java

I think you can just copy this test description into the other
test. You can have two @test comments, they will be treated
as separate tests.  The @requires will be evaluated accordingly.
For an example see 
test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
which has two different compile setups for the test class (-g).

so, that's it for reading code ...


Some general remarks, maybe a bit picky ...:
I think you could use fewer commas ',' in comments.
As I understand it, you need a comma if the subordinate
clause is at the beginning, but not if it is at
the end:
  If Corona is over, I go to the office.
but
  I go to the office if Corona is over.
I think the same holds for 'because', 'while' etc.
E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
vframe_hp.hpp 104ff

Also, I like full sentences in comments.  
Especially for me as a non-native speaker, this makes
things much clearer. I.e., I try to make it
a real sentence, with articles, a capital letter at the
start and a period at the end, whenever there is a subject
and a verb in the first place.
E.g., jvmtiEnvBase.cpp:1327
In many places, your comments read really 
well but some are quite abbreviated I think.

E.g. thread.cpp:2601 is an example where a simple
'a' helps a lot.
"Single deoptimization is typically very short."
I would add 'A': "A single deoptimization is typically very short (fast?)."
Another meaning of the comment I first considered is this:
"Single deoptimization is typically very short, all_threads deoptimization takes longer"
having in mind the functions
EscapeBarrier::deoptimize_objects_all_threads()
and
EscapeBarrier::deoptimize_objects() doing a single thread.
German with its compound nouns is helpful here :)

Einzeldeoptimierung <--> eine einzelne Deoptimierung

Best regards,
  Goetz.


From luhenry at microsoft.com  Fri Jul 17 18:26:21 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 17 Jul 2020 18:26:21 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
Message-ID: <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Kim,

I've updated the webrev at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 with these spacing fixes.

________________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: Thursday, July 16, 2020 18:43
To: Ludovic Henry
Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
>
> I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04

The change from "StubName" => "IntrinsicName" made the indenting of
arguments in the calls no longer lined up normally. Line 65, line 82,
and lines 104-5 are now abnormally indented.

Other than that, looks good.  I don't need another webrev for a fix of
the indentation.


From gerard.ziemski at oracle.com  Fri Jul 17 19:19:58 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Fri, 17 Jul 2020 14:19:58 -0500
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash log
 file
In-Reply-To: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
Message-ID: <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>

Hi all,

Please review this small fix that adds the OS version and the OS build 
number to the hs_err_pidXXX.log output in the "Summary" section for the Mac 
platform (it's easier for developers to use than the Darwin kernel 
version that we display right now).

This is how things used to look:


--------------- S U M M A R Y ------------

Command Line: Crasher

Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
Darwin 19.5.0
Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds (0d 0h 
0m 1s)


And this is what the "Summary" section looks like with the proposed change:


--------------- S U M M A R Y ------------

Command Line: Crasher

Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
Darwin 19.5.0, macOS 10.15.5 (19F101)
Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds (0d 0h 
0m 0s)


bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
testing Mach5 hs_tier1,2,3,4,5 in progress


cheers


From gerard.ziemski at oracle.com  Fri Jul 17 19:21:57 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Fri, 17 Jul 2020 14:21:57 -0500
Subject: RFR (M) 8237727: Mac: after we handle a crash, Apple's crash reporter
 is left with incorrect state
In-Reply-To: <274ED2F0-38D1-48AE-AD84-75775D4A57AE@me.com>
References: <274ED2F0-38D1-48AE-AD84-75775D4A57AE@me.com>
Message-ID: <b90eabcd-3807-0f84-ca05-4e2474cf88ea@oracle.com>


Hi all,

Please review this enhancement, which changes how we handle a crash on 
macOS, so that the native macOS CrashReporter can create its own crash 
report alongside ours, with correct crash signal and frame.

Normally, after we handle a crash, we terminate the process with either 
exit() or abort():

#1 - When we terminate with abort() (the case when the core dump is 
enabled, i.e. -XX:+CreateCoredumpOnCrash), macOS CrashReporter doesn't 
see the original crash that we handled, but only sees the abort, which 
it correctly but confusingly reports as the termination reason.

#2 - When we terminate with exit() (the case when the core dump is 
disabled, i.e. -XX:-CreateCoredumpOnCrash), macOS CrashReporter doesn't 
see the crash and does not generate a report at all.


With this proposed fix we handle the crash as usual, but then instead of 
aborting/exiting, we allow the process to crash again, which allows the 
macOS CrashReporter to generate its crash log with the correct exception 
type and termination signal, showing the actual frame that crashed, in 
all cases (regardless of whether the core dump is enabled or disabled).
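
One common way to "crash again" after handling a signal is sketched below. This is only an illustration under assumptions, not the code in the webrev: the names crash_handler, install_crash_handler and run_crashing_child are hypothetical. The handler reports, restores the default disposition, and returns, so the faulting instruction runs again and the process dies with the original signal.

```cpp
#include <csignal>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handler: report the crash, then restore the default
// disposition and return so that the faulting instruction re-executes.
static void crash_handler(int sig, siginfo_t* /*info*/, void* /*ucontext*/) {
  // Only async-signal-safe calls belong here; write() is one of them.
  const char msg[] = "crash handled, letting the process crash again\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
  (void)ignored;

  struct sigaction dfl;
  memset(&dfl, 0, sizeof(dfl));
  dfl.sa_handler = SIG_DFL;
  sigemptyset(&dfl.sa_mask);
  sigaction(sig, &dfl, nullptr);
  // Returning restarts the faulting instruction, which now takes the
  // default action (termination with the original signal).
}

static void install_crash_handler() {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = crash_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, nullptr);
  sigaction(SIGBUS, &sa, nullptr);
}

// Crashes in a child process and returns the signal that terminated it,
// or -1 if the child did not die from a signal.
int run_crashing_child() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    install_crash_handler();
    volatile int* volatile p = nullptr;  // volatile so the store is not optimized away
    *p = 42;                             // faults; handler runs; store faults again
    _exit(0);                            // not reached
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With this pattern the parent observes termination by SIGSEGV rather than SIGABRT or a normal exit, which is the property the proposal relies on.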

Before, the CrashReporter would only indicate the (abort) exception type 
with no termination signal:


Exception Type:        EXC_BAD_ACCESS (SIGABRT)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Note:        EXC_CORPSE_NOTIFY


But now we get (correct) exception type and termination signal:


Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [1497]


In addition, instead of a frame like this:


Thread 2 Crashed:
0   libsystem_kernel.dylib 0x00007fff6ec4633a __pthread_kill + 10
1   libsystem_pthread.dylib 0x00007fff6ed02e60 pthread_kill + 430
2   libsystem_c.dylib 0x00007fff6ebcd808 abort + 120
3   libjvm.dylib 0x00000001037c6da1 os::abort(bool, void*, void const*) + 49 (os_bsd.cpp:1069)
4   libjvm.dylib 0x0000000103a9a249 VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long) + 3017 (vmError.cpp:1639)
5   libjvm.dylib 0x0000000103a99655 VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*, char const*, ...) + 149 (vmError.cpp:1315)
6   libjvm.dylib 0x0000000103a9a341 VMError::report_and_die(Thread*, unsigned int, unsigned char*, void*, void*) + 33 (vmError.cpp:1322)
7   libjvm.dylib 0x00000001037cbdfa JVM_handle_bsd_signal + 618 (os_bsd_x86.cpp:763)
8   libjvm.dylib 0x00000001037c8ce9 signalHandler(int, __siginfo*, void*) + 89 (os_bsd.cpp:2589)
9   libsystem_platform.dylib 0x00007fff6ecf75fd _sigtramp + 29
10  ??? 000000000000000000 0 + 0
11  libjvm.dylib 0x0000000103a4aa51 Unsafe_PutInt(JNIEnv_*, _jobject*, _jobject*, long, int) + 241 (unsafe.cpp:313)
12  ??? 0x00000001060baade 0 + 4396395230
13  ??? 0x00000001060b207e 0 + 4396359806
14  ??? 0x00000001060b207e 0 + 4396359806
15  ??? 0x00000001060b207e 0 + 4396359806
16  ??? 0x00000001060a89ca 0 + 4396321226
17  libjvm.dylib 0x0000000103225792 JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 1426 (javaCalls.cpp:430)
18  libjvm.dylib 0x00000001032e1ee1 jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) + 417 (jni.cpp:975)
19  libjvm.dylib 0x00000001032e9605 jni_CallStaticVoidMethod + 645 (jni.cpp:1830)
20  libjli.dylib 0x0000000101122d3f JavaMain + 2495 (java.c:556)
21  libjli.dylib 0x00000001011254a9 ThreadJavaMain + 9 (java_md_macosx.m:720)
22  libsystem_pthread.dylib 0x00007fff6ed03109 _pthread_start + 148
23  libsystem_pthread.dylib 0x00007fff6ecfeb8b thread_start + 15


CrashReporter will now show:


Thread 2 Crashed:
0   libjvm.dylib 0x0000000108034415 MemoryAccess<int>::put(int) + 205 (unsafe.cpp:233)
1   libjvm.dylib 0x00000001080249a0 Unsafe_PutInt(JNIEnv_*, _jobject*, _jobject*, long, int) + 206 (unsafe.cpp:313)
2   ??? 0x0000000111ad9ade 0 + 4591557342
3   ??? 0x0000000111ad107e 0 + 4591521918
4   ??? 0x0000000111ad107e 0 + 4591521918
5   ??? 0x0000000111ad107e 0 + 4591521918
6   ??? 0x0000000111ac79ca 0 + 4591483338
7   libjvm.dylib 0x0000000107f37656 JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) + 1006 (javaCalls.cpp:430)
8   libjvm.dylib 0x0000000108077b7b jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) + 260 (jni.cpp:975)
9   libjvm.dylib 0x000000010807dffc jni_CallStaticVoidMethod + 529 (jni.cpp:1830)
10  libjli.dylib 0x0000000104452d3f JavaMain + 2495 (java.c:556)
11  libjli.dylib 0x00000001044554a9 ThreadJavaMain + 9 (java_md_macosx.m:720)
12  libsystem_pthread.dylib 0x00007fff6ed03109 _pthread_start + 148
13  libsystem_pthread.dylib 0x00007fff6ecfeb8b thread_start + 15


This correctly identifies the top frame that caused the crash (instead 
of "??? 000000000000000000 0 + 0").

Also, in our hs_err_pidX log we now show an approximate location of 
the crash log report produced by the CrashReporter, e.g.:


---------------  S Y S T E M  ---------------

...

CrashReporter log: /Users/gerard/Library/Logs/DiagnosticReports/java_*

END.


Lastly, having macOS CrashReporter produce its crash log, in addition to 
our own hs_err_pidX log, is valuable because it shows back traces for 
all the native threads (ours only shows the crashed one) as well as line 
offsets to the files (ours shows binary offsets). It's also nice to 
validate that the info in our crash log is correct.

For full before and after CrashReporter logs please see the bug.

bug link at https://bugs.openjdk.java.net/browse/JDK-8237727
open webrev at http://cr.openjdk.java.net/~gziemski/8237727_rev1
testing: passes Mach5 hs_tier1,2,3,4,5

cheers


From thomas.stuefe at gmail.com  Fri Jul 17 21:19:43 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 17 Jul 2020 23:19:43 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>

Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will
allocate a line buffer for you and return it. That buffer will be as large
as the first line you happen to read. You then pass that same buffer into
getline to fetch the next lines, but what if those are longer than the
first?

But anyway, it would be better to pass in a simple caller-provided,
stack-allocated buffer, since this function is called at crash time and
the C heap could be corrupted.
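
A minimal sketch of the stack-buffer variant, assuming a cpuinfo-style text file. The function name read_microcode_revision and the buffer size of 256 are assumptions for illustration, not taken from the patch:

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: scans a cpuinfo-style file for the first line that
// contains "microcode" and parses the hex value after ':'.
// Returns 0 if the file cannot be opened or no such line is found.
unsigned int read_microcode_revision(const char* path) {
  unsigned int result = 0;
  FILE* fp = fopen(path, "r");
  if (fp == nullptr) return 0;

  // Fixed-size buffer on the stack, so reading lines does not allocate.
  // Lines longer than the buffer are simply delivered in several pieces.
  char line[256];
  while (fgets(line, sizeof(line), fp) != nullptr) {
    if (strstr(line, "microcode") != nullptr) {
      const char* rev = strchr(line, ':');
      // "%x" skips leading whitespace and accepts an optional "0x" prefix.
      if (rev != nullptr) sscanf(rev + 1, "%x", &result);
      break;
    }
  }
  fclose(fp);
  return result;
}
```

Note that fopen() may still allocate internally for the stream; a fully heap-free version would have to use open() and read() directly.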

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Hello,
>
> could you please review the patch
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>
> This patch adds the microcode version for different OSes, which may be useful
> in the issue resolution process.
>
>
>
> The reported microcode version for different OSes looks as follows:
>
>
>
> Linux (RHEL7.7):
>
> # cat hs_err_pid251046.log |grep microc
>
> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt, clwb
>
>
>
> Windows (Win10, v1809):
>
> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt
>
>
>
> MacOS (Darwin):
>
> $ cat hs_err_pid95187.log |grep microc
>
> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
> fma, clflush, clflushopt
>
>
>
> Thanks, Vladimir
>
>
>   Thanks, Vladimir
>
>

From thomas.stuefe at gmail.com  Fri Jul 17 21:26:16 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 17 Jul 2020 23:26:16 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
Message-ID: <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>

On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Vladimir,
>
> I think this would be more suited to hotspot-runtime.
>
>
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>
> +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
>
> +    while ((read = getline(&line, &len, fp)) != -1) {
> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> +        char* rev = strchr(line, ':');
> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> +        break;
> +      }
> +    }
> +    free(line);
>
> Not sure this works as intended. At the first call to getline() it will
> allocate a line buffer for you and return it. That buffer will be as large
> as the first line you happen to read. You then pass that same buffer into
> getline to fetch the next lines, but what if those are longer than the
> first?
>
>
Forget that point, getline calls realloc() on the line buffer to resize it,
so this should be okay.
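
The resizing behaviour can be seen in a small sketch like the following. The function name longest_line_with_getline is hypothetical. It reuses one buffer for every line and relies on getline() to grow it:

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>

// Reads every line of `path` through a single getline() buffer and returns
// the length of the longest line (newline included), or -1 if the file
// cannot be opened.
long longest_line_with_getline(const char* path) {
  FILE* fp = fopen(path, "r");
  if (fp == nullptr) return -1;

  char* line = nullptr;  // getline() allocates the buffer on first use
  size_t cap = 0;        // buffer capacity, updated by getline()
  long longest = 0;
  ssize_t n;
  while ((n = getline(&line, &cap, fp)) != -1) {
    // `n` is the number of characters read. `cap` is the capacity of the
    // buffer, which getline() enlarges with realloc() whenever needed.
    if (n > longest) longest = n;
  }
  free(line);  // the buffer came from the C library, so free() releases it
  fclose(fp);
  return longest;
}
```

One detail worth noting: the second argument is the buffer capacity, not the line length, so a test such as `len > 10` in the quoted loop checks the capacity; the return value is what gives the number of characters read.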

Thanks, Thomas



From thomas.stuefe at gmail.com  Fri Jul 17 22:02:29 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 18 Jul 2020 00:02:29 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>

Hi Vladimir,

On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> >  +#if defined(IA32) || defined(AMD64)
> >
> > Is that not synonymous with x86?
>
> This pattern was copied from the method "print_model_name_and_flags" (file
> os/linux/os_linux.cpp).
>
> This method also reads the "/proc/cpuinfo" file, and I reused it as a
> "template" for the new method.
>
> It is better to use one pattern to work with exactly the same file, but in
> general you are right.
>
> The X86 is defined in the file ./share/utilities/macros.hpp as:
>
> #if defined(IA32) || defined(AMD64)
>
> #define X86
>
> #define X86_ONLY(code) code
>
> #define NOT_X86(code)
>
>
>
> The question here: could I delete these "ifdefs", since this method should
> work on x86 only?
>
>
>

os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp
is shared among all architectures.

So, in the former you do not need to exclude non-x86 architectures.

Cheers, Thomas



From vladimir.kozlov at oracle.com  Fri Jul 17 23:03:20 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:03:20 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
Message-ID: <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>

I updated subject to our formal review request format (JDK version, RFE's id and subject).

I moved RFE to runtime group as Thomas said:

https://bugs.openjdk.java.net/browse/JDK-8249672

Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:

#  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
# V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb

V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
V  [libjvm.so+0xcb2895]   init_globals()+0x55
V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3


Regards,
Vladimir K


From john.r.rose at oracle.com  Fri Jul 17 23:13:24 2020
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 17 Jul 2020 16:13:24 -0700
Subject: Performance of instanceof with interfaces is multiple times
 slower than with classes
In-Reply-To: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
References: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
Message-ID: <6A5EB60C-88C2-4160-9C55-A82B8FA95ECF@oracle.com>

On Jul 15, 2020, at 2:08 AM, Christoph Dreis <christoph.dreis at freenet.de> wrote:
> 
> Could you enlighten me what the cause for this is and maybe point me to the code where this is done?

A microbenchmark like that doesn't demonstrate much.

The JVM optimizations which are designed for real applications may or may not apply to this code.  They will certainly apply differently in real-life application execution.  In real life, class hierarchies are more complex, and so the JVM expects to do more work telling types apart; at that point differences between single inheritance and multiple inheritance require differing algorithms with different complexities and costs.  Also, in real life, inputs to such hot paths are not just all one type (as in your micro), which is another reason you get into a different set of costs and complexities.
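
As a rough illustration of why the two cases can cost different amounts, here is a toy model. It is not HotSpot's actual data layout; TypeInfo, init_class, is_subclass_of and implements are all hypothetical names. With single inheritance a type sits at a fixed depth in its superclass chain, so one indexed load answers the question; an interface has no fixed position, so the sketch falls back to a scan.

```cpp
#include <vector>

// Toy model only. Assumes class hierarchies shallower than kMaxDepth.
struct TypeInfo {
  static const int kMaxDepth = 8;
  const TypeInfo* display[kMaxDepth] = {};  // superclass chain, indexed by depth
  int depth = 0;                            // this type's own slot in the display
  std::vector<const TypeInfo*> interfaces;  // all implemented interfaces, flattened
};

// Sets up `t` as a class extending `super` (null for a root class).
void init_class(TypeInfo& t, const TypeInfo* super) {
  if (super != nullptr) {
    for (int i = 0; i <= super->depth; i++) t.display[i] = super->display[i];
    t.depth = super->depth + 1;
    t.interfaces = super->interfaces;  // inherited interfaces
  }
  t.display[t.depth] = &t;
}

// Single inheritance: one comparison and one load, regardless of how many
// types exist.
bool is_subclass_of(const TypeInfo& obj, const TypeInfo& target) {
  return target.depth <= obj.depth && obj.display[target.depth] == &target;
}

// Multiple inheritance: no fixed slot to look at, so scan the list.
bool implements(const TypeInfo& obj, const TypeInfo& iface) {
  for (const TypeInfo* i : obj.interfaces) {
    if (i == &iface) return true;
  }
  return false;
}
```

Real implementations add caches and other shortcuts on top of schemes like this, which is part of why a microbenchmark with a single input type says little about typical behaviour.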

If this were related to a real-world application, or a realistic benchmark, I might get assembly code for the hot paths in the two methods, and see what the CPU is doing that makes a difference in speed.  Then I might meditate on what decisions the JVM made to choose that code, and see if there is an improvement possible.

You ask "point me to the code" which suggests you haven't looked at the code yet.  When you do (and I hope you do!) you will find that there are many, many algorithms and decisions in the JVM that affect the treatment of runtime type tests.  You could start by grepping around for "subtype" (case insensitive) in the *.[ch]pp and *.ad files under src/hotspot.

> Is this maybe even a bug/regression? Can we maybe do something to improve the interface case?

I will stick my neck out to say that microbenchmarks alone by definition never demonstrate performance regressions or bugs.

Yes, we can always do something to improve it.  Improving this micro could involve attempting to treat trivial interface hierarchies like trivial class hierarchies:  But who would this benefit?  Our investment of effort is surely driven by a hope to spend limited effort to get the most benefit on real applications.

-- John


From vladimir.kozlov at oracle.com  Fri Jul 17 23:17:00 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:17:00 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
Message-ID: <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>

I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1],
does not know about it. You need C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.

Someone from Runtime may suggest what is best for this case.
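
To illustrate the allocator mismatch described above, here is a minimal sketch, not the webrev code. The names read_microcode_revision and microcode_from_text are hypothetical, and the loop checks the length returned by getline() rather than the buffer capacity. The only point is that a buffer obtained from getline() is released with the raw ::free().

```cpp
#include <stdio.h>      // getline(), fmemopen(), sscanf() (POSIX)
#include <stdlib.h>     // free()
#include <string.h>     // strstr(), strchr(), strlen()
#include <sys/types.h>  // ssize_t

// Hypothetical helper: scan a cpuinfo-style stream for the first
// "microcode" line and return its hex value, or 0 if none is found.
static unsigned read_microcode_revision(FILE* fp) {
  char* line = NULL;  // getline() allocates and grows this via malloc()/realloc()
  size_t cap = 0;
  ssize_t n;
  unsigned result = 0;
  while ((n = getline(&line, &cap, fp)) != -1) {
    if (n > 10 && strstr(line, "microcode") != NULL) {
      const char* rev = strchr(line, ':');
      if (rev != NULL) sscanf(rev + 1, "%x", &result);
      break;
    }
  }
  // Raw libc free on purpose: the buffer came from libc, not from os::malloc(),
  // so a guarded allocator's free must not be used on it.
  ::free(line);
  return result;
}

// Demo-only wrapper (hypothetical): run the parser over an in-memory string.
static unsigned microcode_from_text(const char* text) {
  FILE* fp = fmemopen(const_cast<char*>(text), strlen(text), "r");
  if (fp == NULL) return 0;
  unsigned rev = read_microcode_revision(fp);
  fclose(fp);
  return rev;
}
```

In real use the stream would come from fopen("/proc/cpuinfo", "r"); the in-memory wrapper is there only so the sketch can be tried without that file.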

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
> 
> I moved RFE to runtime group as Thomas said:
> 
> https://bugs.openjdk.java.net/browse/JDK-8249672
> 
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
> 
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> 
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> 
> 
> Regards,
> Vladimir K
> 
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method "print_model_name_and_flags" (file
>>> os/linux/os_linux.cpp).
>>>
>>> This method also reads the "/proc/cpuinfo" file and I reused it as a
>>> "template" for the new method.
>>>
>>> It is better to use one pattern to work with exactly the same file but in
>>> general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these "ifdefs", since this method should
>>> work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp
>> is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html 
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it will
>>> allocate a line buffer for you and return it. That buffer will be as large
>>> as the first line you happen to read. You then pass that same buffer into
>>> getline to fetch the next lines, but what if those are longer than the
>>> first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to resize
>>> it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided buffer in -
>>> stack allocated. Since this function is called at crash time and the C heap
>>> could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes that may be useful
>>> in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks like:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
>>> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
>>> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>>> fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
>>> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
>>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>>> fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
>>> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
>>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
>>> fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From vladimir.kozlov at oracle.com  Fri Jul 17 23:24:07 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:24:07 -0700
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <ce56b6b9-2498-4050-eeef-6ee7facc6be0@oracle.com>

I forked a new e-mail thread with the correct subject line:

[16] RFR(S) 8249672: Include microcode revision in features_string on x86

Let's continue the discussion there. There is an issue with the changes in os_linux_x86.cpp.

Regards,
Vladimir K

On 7/17/20 3:52 PM, Ivanov, Vladimir A wrote:
> Thanks for your comment.
> The updated patch available as http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.01/
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 3:02 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: add microcode version to the hs_err files
> 
> Hi Vladimir,
> 
> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
>>   +#if defined(IA32) || defined(AMD64)
>>
>> Is that not synonymous with x86?
> This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
> This method also reads the "/proc/cpuinfo" file and I reused it as a "template" for the new method.
> It is better to use one pattern to work with exactly the same file but in general you are right.
> The X86 is defined in the file ./share/utilities/macros.hpp as:
> #if defined(IA32) || defined(AMD64)
> #define X86
> #define X86_ONLY(code) code
> #define NOT_X86(code)
> 
> The question here: could I delete these "ifdefs", since this method should work on x86 only?
> 
> 
> os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp is shared among all architectures.
> 
> So, in the former you do not need to exclude non-x86 architectures.
> 
> Cheers, Thomas
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 2:26 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: add microcode version to the hs_err files
> 
> 
> 
> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> Hi Vladimir,
> 
> I think this would be more suited to hotspot-runtime.
> 
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> 
> +#if defined(IA32) || defined(AMD64)
> 
> Is that not synonymous with x86?
> 
> +    while ((read = getline(&line, &len, fp)) != -1) {
> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> +        char* rev = strchr(line, ':');
> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> +        break;
> +      }
> +    }
> +    free(line);
> 
> Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?
> 
> 
> Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.
> 
> Thanks, Thomas
> 
> But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.
> 
> Cheers, Thomas
> 
> 
> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
> Hello,
> 
> could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> 
> This patch adds the microcode version for different OSes that may be useful in the issue resolution process.
> 
> 
> 
> The reported microcode version for different OSes looks like:
> 
> 
> 
> Linux (RHEL7.7):
> 
> # cat hs_err_pid251046.log |grep microc
> 
> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> 
> 
> 
> Windows (Win10, v1809):
> 
> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> 
> 
> 
> MacOS (Darwin):
> 
> $ cat hs_err_pid95187.log |grep microc
> 
> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> 
> 
> 
> Thanks, Vladimir
> 
> 
>    Thanks, Vladimir
> 

From suenaga at oss.nttdata.com  Sat Jul 18 00:19:16 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 18 Jul 2020 09:19:16 +0900
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
Message-ID: <4dfca747-833e-5ee7-0e7f-36b4958d464b@oss.nttdata.com>

Hi Gerard,

I cannot review it because I do not have a Mac and am not familiar with it, but I have some comments.


   - You set the OS name in `os` with strncpy(), but could you use a #define for it instead? For example:
       #ifdef __APPLE__
       #define OSNAME "Darwin"
       #elif defined __OpenBSD__
       #define OSNAME "OpenBSD"
       #else
       #define OSNAME "BSD"
       #endif

         :

        snprintf(buf, buflen, OSNAME " %s, macOS %s", release, osproductversion);


   - You can replace the strncpy() call with a direct write of '\0':
       strncpy(release, "", sizeof(release));  ->  release[0] = '\0';
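
A small sketch of the two suggestions above, under stated assumptions: os_summary is a hypothetical helper (not a function from the webrev), the 128-byte buffer size is arbitrary, and the format string takes only the release and product version because the OS name comes from the OSNAME macro.

```cpp
#include <stdio.h>   // snprintf()
#include <string>

// OS name picked at compile time instead of being copied with strncpy().
#if defined(__APPLE__)
#define OSNAME "Darwin"
#elif defined(__OpenBSD__)
#define OSNAME "OpenBSD"
#else
#define OSNAME "BSD"
#endif

// Hypothetical helper: builds "<OSNAME> <release>, macOS <version>".
static std::string os_summary(const char* release, const char* osproductversion) {
  char buf[128];
  buf[0] = '\0';  // empty string via a single write, no strncpy(buf, "", ...)
  // Adjacent string literals are concatenated, so OSNAME becomes part of the format.
  snprintf(buf, sizeof(buf), OSNAME " %s, macOS %s", release, osproductversion);
  return std::string(buf);
}
```

Note that strncpy(release, "", sizeof(release)) zero-fills the whole array while release[0] = '\0' writes one byte; for code that only treats the array as a C string the two are interchangeable.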


Thanks,

Yasumasa


On 2020/07/18 4:19, gerard ziemski wrote:
> Hi all,
> 
> Please review this small fix that adds the OS version and the OS build number to the hs_err_pidXXX.log output in the "Summary" section for the Mac platform (it's easier for developers to use than the Darwin kernel version that we display right now).
> 
> This is how things used to look:
> 
> 
> --------------- S U M M A R Y ------------
> 
> Command Line: Crasher
> 
> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, Darwin 19.5.0
> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds (0d 0h 0m 1s)
> 
> 
> And this is how the "Summary" section looks with the proposed change:
> 
> 
> --------------- S U M M A R Y ------------
> 
> Command Line: Crasher
> 
> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, Darwin 19.5.0, macOS 10.15.5 (19F101)
> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds (0d 0h 0m 0s)
> 
> 
> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
> testing Mach5 hs_tier1,2,3,4,5 in progress
> 
> 
> cheers
> 

From thomas.stuefe at gmail.com  Sat Jul 18 04:41:33 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 18 Jul 2020 06:41:33 +0200
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in
an os::malloc() allocated buffer to getline, since if it were to resize it
would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use
c-heap memory at all, since this function is used during crash reporting in
the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that
getline won't realloc()) I would pass a stack allocated buffer into getline.
If not, I would not use getline() at all but rewrite this, probably using
fgets().
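
A rough sketch of the fgets() variant, assuming a 2048-byte stack buffer is enough for the line of interest (the size and the function names are illustrative, not taken from the webrev):

```cpp
#include <stdio.h>   // fgets(), fmemopen(), sscanf() (POSIX)
#include <string.h>  // strstr(), strchr(), strlen()

// Hypothetical helper: same scan as the getline() loop, but every line is
// read into a fixed stack buffer, so this function itself never calls
// malloc(), realloc() or free().
static unsigned read_microcode_revision_fgets(FILE* fp) {
  char line[2048];  // assumed upper bound; longer lines arrive in several chunks
  unsigned result = 0;
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (strstr(line, "microcode") != NULL) {
      const char* rev = strchr(line, ':');
      if (rev != NULL) sscanf(rev + 1, "%x", &result);
      break;
    }
  }
  return result;
}

// Demo-only wrapper (hypothetical): run the parser over an in-memory string.
static unsigned microcode_from_text_fgets(const char* text) {
  FILE* fp = fmemopen(const_cast<char*>(text), strlen(text), "r");
  if (fp == NULL) return 0;
  unsigned rev = read_microcode_revision_fgets(fp);
  fclose(fp);
  return rev;
}
```

This only removes the line-buffer allocation; the stdio stream may still allocate its own buffer internally, so a fully heap-free reader would have to use raw open()/read() instead.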

Cheers, Thomas


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Thanks, I expected the C's functions here. Let's wait a little bit for
> Runtime team and update work with buffer.
>
>  Thanks, Vladimir
>
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
> I think the issue is that the 'line' buffer is allocated by libc getline(), and
> os::free(), which is a HotSpot function [1], does not know about it. You need
> C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.
>
> Someone from Runtime may suggest what is the best for this case.
>
> Thanks,
> Vladimir K
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
>
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> > I updated subject to our formal review request format (JDK version,
> RFE's id and subject).
> >
> > I moved RFE to runtime group as Thomas said:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8249672
> >
> > Submitted tier1 testing to build on all our supported platforms. And
> debug builds on linux failed:
> >
> > #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718 # V
> > [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*)
> > const+0xeb
> >
> > V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*)
> > const+0xeb V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a V
> > [libjvm.so+0x13cd30b]  os::free(void*)+0x5b V  [libjvm.so+0x13e5598]
> > os::cpu_microcode_revision()+0xc8 V  [libjvm.so+0x17d314c]
> > VM_Version::get_processor_features()+0x76c
> > V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d V
> > [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26 V  [libjvm.so+0xcb2895]
> > init_globals()+0x55 V  [libjvm.so+0x16dde63]
> > Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> >
> >
> > Regards,
> > Vladimir K
> >
> > On 7/17/20 3:02 PM, Thomas Stüfe wrote:
> >> Hi Vladimir,
> >>
> >> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
> >> vladimir.a.ivanov at intel.com> wrote:
> >>
> >>>>   +#if defined(IA32) || defined(AMD64)
> >>>>
> >>>> Is that not synonymous with x86?
> >>>
> >>> This pattern was copied from the method "print_model_name_and_flags"
> >>> (file os/linux/os_linux.cpp).
> >>>
> >>> This method also reads the "/proc/cpuinfo" file and I reused it as a
> >>> "template" for the new method.
> >>>
> >>> It is better to use one pattern to work with exactly the same file but
> >>> in general you are right.
> >>>
> >>> The X86 is defined in the file ./share/utilities/macros.hpp as:
> >>>
> >>> #if defined(IA32) || defined(AMD64)
> >>>
> >>> #define X86
> >>>
> >>> #define X86_ONLY(code) code
> >>>
> >>> #define NOT_X86(code)
> >>>
> >>>
> >>>
> >>> The question here: could I delete these "ifdefs", since this method
> >>> should work on x86 only?
> >>>
> >>>
> >>>
> >>
> >> os_linux_x86.cpp is compiled for x86 platforms only, whereas
> >> os_linux.cpp is shared among all architectures.
> >>
> >> So, in the former you do not need to exclude non-x86 architectures.
> >>
> >> Cheers, Thomas
> >>
> >>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>
> >>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> >>> *Sent:* Friday, July 17, 2020 2:26 PM
> >>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
> >>> runtime <hotspot-runtime-dev at openjdk.java.net>
> >>> *Cc:* hotspot-compiler-dev at openjdk.java.net
> >>> *Subject:* Re: add microcode version to the hs_err files
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
> >>> <thomas.stuefe at gmail.com>
> >>> wrote:
> >>>
> >>> Hi Vladimir,
> >>>
> >>>
> >>>
> >>> I think this would be more suited to hotspot-runtime.
> >>>
> >>>
> >>>
> >>>
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> >>>
> >>>
> >>>
> >>> +#if defined(IA32) || defined(AMD64)
> >>>
> >>> Is that not synonymous with x86?
> >>>
> >>>
> >>>
> >>> +    while ((read = getline(&line, &len, fp)) != -1) {
> >>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> >>> +        char* rev = strchr(line, ':');
> >>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> >>> +        break;
> >>> +      }
> >>> +    }
> >>> +    free(line);
> >>>
> >>>
> >>>
> >>> Not sure this works as intended. At the first call to getline() it
> >>> will allocate a line buffer for you and return it. That buffer will
> >>> be as large as the first line you happen to read. You then pass that
> >>> same buffer into getline to fetch the next lines, but what if those
> >>> are longer than the first?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Forget that point, getline calls realloc() on the line buffer to
> >>> resize it, so this should be okay.
> >>>
> >>>
> >>>
> >>> Thanks, Thomas
> >>>
> >>>
> >>>
> >>> But anyway it would be better to pass a simple caller provided
> >>> buffer in - stack allocated. Since this function is called at crash
> >>> time and the C heap could be corrupted.
> >>>
> >>>
> >>>
> >>> Cheers, Thomas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> >>> vladimir.a.ivanov at intel.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> could you please review the patch
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>>
> >>> This patch adds the microcode version for different OSes that may be
> >>> useful in the issue resolution process.
> >>>
> >>>
> >>>
> >>> The reported microcode version for different OSes looks like:
> >>>
> >>>
> >>>
> >>> Linux (RHEL7.7):
> >>>
> >>> # cat hs_err_pid251046.log |grep microc
> >>>
> >>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
> >>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
> >>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
> >>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
> >>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> >>>
> >>>
> >>>
> >>> Windows (Win10, v1809):
> >>>
> >>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
> >>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
> >>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> MacOS (Darwin):
> >>>
> >>> $ cat hs_err_pid95187.log |grep microc
> >>>
> >>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
> >>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
> >>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>    Thanks, Vladimir
> >>>
> >>>
>

From thomas.stuefe at gmail.com  Sat Jul 18 05:24:45 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 18 Jul 2020 07:24:45 +0200
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>

Oh, sorry, you are right :(

I was under the assumption you wanted to call os::cpu_microcode_revision()
directly from within VMError::report(). During initialization using c-heap
like this should not be a problem and you can forget about 9/10ths of what
I wrote, sorry.

In that case your original variant is fine, my only suggestion would be to
clearly mark the free as ::free() with a comment to prevent someone from
correcting it to os::free.

Thank you,

Thomas


On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Hi,
>
> it seems this info is created during the initialization phase. Is that correct?
> Collecting or parsing common info at the crash point is usually not a good idea.
> During initialization, using the c-heap is not a problem.
>
> The "::free" works OK here. At least the tier1 tests produce the same results for
> patched and non-patched builds. But these tests do not generate a real case for
> hs_err files.
>
> It looks like a 2k byte array is enough for one CPU record from the cpuinfo
> file. I will update the code to use a local buffer.
>
>
>
> Thanks, Vladimir
>
>
>
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* Friday, July 17, 2020 9:42 PM
> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> *Cc:* Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <
> hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> *Subject:* Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
>
>
> Hi,
>
>
>
> yes, you must use the raw free here (for the same reason we cannot pass in
> an os::malloc() allocated buffer to getline, since if it were to resize it
> would use raw ::realloc() internally and crash the same way).
>
>
>
> But as I wrote in my first mail to the original thread, I would not use
> c-heap memory at all, since this function is used during crash reporting in
> the signal handler and the c-heap may be corrupted.
>
>
>
> If the max line length of /proc/cpuinfo can be reliably predicted (so that
> getline won't realloc()) I would pass a stack allocated buffer into getline.
> If not, I would not use getline() at all but rewrite this, probably using
> fgets().
>
>
>
> Cheers, Thomas
>
>
>
>
>
>
>
>
>
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com> wrote:
>
> Thanks, I expected the C's functions here. Let's wait a little bit for
> Runtime team and update work with buffer.
>
>  Thanks, Vladimir
>
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
> I think the issue is that the 'line' buffer is allocated by libc getline(), and
> os::free(), which is a HotSpot function [1], does not know about it. You need
> C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.
>
> Someone from Runtime may suggest what is the best for this case.
>
> Thanks,
> Vladimir K
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
>
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> > I updated subject to our formal review request format (JDK version,
> RFE's id and subject).
> >
> > I moved RFE to runtime group as Thomas said:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8249672
> >
> > Submitted tier1 testing to build on all our supported platforms. And
> debug builds on linux failed:
> >
> > #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718 # V
> > [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*)
> > const+0xeb
> >
> > V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*)
> > const+0xeb V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a V
> > [libjvm.so+0x13cd30b]  os::free(void*)+0x5b V  [libjvm.so+0x13e5598]
> > os::cpu_microcode_revision()+0xc8 V  [libjvm.so+0x17d314c]
> > VM_Version::get_processor_features()+0x76c
> > V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d V
> > [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26 V  [libjvm.so+0xcb2895]
> > init_globals()+0x55 V  [libjvm.so+0x16dde63]
> > Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> >
> >
> > Regards,
> > Vladimir K
> >
> > On 7/17/20 3:02 PM, Thomas Stüfe wrote:
> >> Hi Vladimir,
> >>
> >> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
> >> vladimir.a.ivanov at intel.com> wrote:
> >>
> >>>>   +#if defined(IA32) || defined(AMD64)
> >>>>
> >>>> Is that not synonymous with x86?
> >>>
> >>> This pattern was copied from the method "print_model_name_and_flags"
> >>> (file os/linux/os_linux.cpp).
> >>>
> >>> This method also reads the "/proc/cpuinfo" file and I reused it as a
> >>> "template" for the new method.
> >>>
> >>> It is better to use one pattern to work with exactly the same file but
> >>> in general you are right.
> >>>
> >>> The X86 is defined in the file ./share/utilities/macros.hpp as:
> >>>
> >>> #if defined(IA32) || defined(AMD64)
> >>>
> >>> #define X86
> >>>
> >>> #define X86_ONLY(code) code
> >>>
> >>> #define NOT_X86(code)
> >>>
> >>>
> >>>
> >>> The question here: could I delete these "ifdefs", since this method
> >>> should work on x86 only?
> >>>
> >>>
> >>>
> >>
> >> os_linux_x86.cpp is compiled for x86 platforms only, whereas
> >> os_linux.cpp is shared among all architectures.
> >>
> >> So, in the former you do not need to exclude non-x86 architectures.
> >>
> >> Cheers, Thomas
> >>
> >>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>
> >>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> >>> *Sent:* Friday, July 17, 2020 2:26 PM
> >>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
> >>> runtime <hotspot-runtime-dev at openjdk.java.net>
> >>> *Cc:* hotspot-compiler-dev at openjdk.java.net
> >>> *Subject:* Re: add microcode version to the hs_err files
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
> >>> <thomas.stuefe at gmail.com>
> >>> wrote:
> >>>
> >>> Hi Vladimir,
> >>>
> >>>
> >>>
> >>> I think this would be more suited to hotspot-runtime.
> >>>
> >>>
> >>>
> >>>
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> >>>
> >>>
> >>>
> >>> +#if defined(IA32) || defined(AMD64)
> >>>
> >>> Is that not synonymous with x86?
> >>>
> >>>
> >>>
> >>> +    while ((read = getline(&line, &len, fp)) != -1) {
> >>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> >>> +        char* rev = strchr(line, ':');
> >>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> >>> +        break;
> >>> +      }
> >>> +    }
> >>> +    free(line);
> >>>
> >>>
> >>>
> >>> Not sure this works as intended. At the first call to getline() it
> >>> will allocate a line buffer for you and return it. That buffer will
> >>> be as large as the first line you happen to read. You then pass that
> >>> same buffer into getline to fetch the next lines, but what if those
> >>> are longer than the first?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Forget that point, getline calls realloc() on the line buffer to
> >>> resize it, so this should be okay.
> >>>
> >>>
> >>>
> >>> Thanks, Thomas
> >>>
> >>>
> >>>
> >>> But anyway it would be better to pass a simple caller provided
> >>> buffer in - stack allocated. Since this function is called at crash
> >>> time and the C heap could be corrupted.
> >>>
> >>>
> >>>
> >>> Cheers, Thomas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> >>> vladimir.a.ivanov at intel.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> could you please review the patch
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>>
> >>> This patch adds the microcode version for different OSes, which may be
> >>> useful in the issue resolution process.
> >>>
> >>>
> >>>
> >>> The reported microcode version for different OSes looks as follows:
> >>>
> >>>
> >>>
> >>> Linux (RHEL7.7):
> >>>
> >>> # cat hs_err_pid251046.log |grep microc
> >>>
> >>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
> >>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
> >>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
> >>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
> >>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> >>>
> >>>
> >>>
> >>> Windows (Win10, v1809):
> >>>
> >>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
> >>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
> >>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> MacOS (Darwin):
> >>>
> >>> $ cat hs_err_pid95187.log |grep microc
> >>>
> >>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
> >>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
> >>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>    Thanks, Vladimir
> >>>
> >>>
>
>
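
A minimal sketch of the stack-buffer suggestion quoted above (avoid getline() at crash time, because the C heap could be corrupted). This is not the code from the webrev: the helper name read_microcode_revision and the 256-byte buffer size are assumptions made for illustration. It reads a cpuinfo-style stream with fgets() into a fixed stack buffer, so no allocation happens while reporting.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper (not the webrev's code): scans a cpuinfo-style stream
// for the first line that starts with "microcode" and parses the hex value
// after the ':'. It uses a fixed stack buffer with fgets(), so it never
// allocates from the C heap. Returns 0 if no such line is found or the value
// cannot be parsed.
static unsigned int read_microcode_revision(FILE* fp) {
  char line[256];            // assumed size; longer lines arrive in several chunks
  unsigned int result = 0;
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (strncmp(line, "microcode", 9) == 0) {
      const char* rev = strchr(line, ':');
      // "%x" skips leading whitespace and accepts an optional "0x" prefix.
      if (rev == NULL || sscanf(rev + 1, "%x", &result) != 1) {
        result = 0;
      }
      break;
    }
  }
  return result;
}
```

A caller would open "/proc/cpuinfo" with fopen(), pass the stream in, and close it afterwards. Matching only at the start of the buffer (instead of strstr() anywhere in the line) is a design choice of this sketch, so that a chunk from the middle of a very long line is less likely to be mistaken for the microcode line.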

From christoph.dreis at freenet.de  Sat Jul 18 07:24:30 2020
From: christoph.dreis at freenet.de (Christoph Dreis)
Date: Sat, 18 Jul 2020 09:24:30 +0200
Subject: Performance of instanceof with interfaces is multiple times
 slower than with classes
In-Reply-To: <6A5EB60C-88C2-4160-9C55-A82B8FA95ECF@oracle.com>
References: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
 <6A5EB60C-88C2-4160-9C55-A82B8FA95ECF@oracle.com>
Message-ID: <CF1D4F09-047B-4777-8EB6-D8E8B248A427@freenet.de>

Hi John,

thanks for your answer. I think I need to elaborate a bit more on the background of this question.

I was looking at Set.copyOf() and noticed that it copies the collection into a HashSet if it isn't an immutable collection.
Yet, this doesn't seem to be necessary in case the passed collection is already a Set, which I think is not that uncommon.
So I changed it slightly to the following:

    static <E> Set<E> copyOf(Collection<? extends E> coll) {
        if (coll instanceof ImmutableCollections.AbstractImmutableSet) {
            return (Set<E>)coll;
        } else if (coll instanceof Set) { // that is the new bit
            return (Set<E>)Set.of(coll.toArray());
        } else {
            return (Set<E>)Set.of(new HashSet<>(coll).toArray());
        }
    }

Doing that showed the wanted performance boost for Sets. When passed a Collections.emptySet() this seems to make it allocation free even.
But for other inputs, like a List, it degraded drastically.

I tried to benchmark this change with the following benchmark:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class MyBenchmark {

	@State(Scope.Thread)
	public static class BenchmarkState {
		private Collection<String> emptySet = Collections.emptySet();
		private Collection<String> emptyList = Collections.emptyList();
	}
	@Benchmark
	public Set<String> testCopyEmptySet(BenchmarkState state) {
		return Set.copyOf(state.emptySet);
	}

	@Benchmark
	public Set<String> testCopyEmptyList(BenchmarkState state) {
		return Set.copyOf(state.emptyList);
	}
}

Which showed the following results

Before
Benchmark                                                       Mode  Cnt     Score     Error   Units
MyBenchmark.testCopyEmptyList                                   avgt   10     5,436 ±   0,286   ns/op
MyBenchmark.testCopyEmptyList:·gc.alloc.rate.norm               avgt   10    48,002 ±   0,001    B/op
MyBenchmark.testCopyEmptySet                                    avgt   10     5,319 ±   0,472   ns/op
MyBenchmark.testCopyEmptySet:·gc.alloc.rate.norm                avgt   10    48,002 ±   0,001    B/op

After
Benchmark                                                       Mode  Cnt     Score     Error   Units
MyBenchmark.testCopyEmptyList                                   avgt   10    24,835 ±   0,621   ns/op
MyBenchmark.testCopyEmptyList:·gc.alloc.rate.norm               avgt   10    48,004 ±   0,001    B/op
MyBenchmark.testCopyEmptySet                                    avgt   10     2,494 ±   0,179   ns/op
MyBenchmark.testCopyEmptySet:·gc.alloc.rate.norm                avgt   10    ≈ 10⁻?              B/op

I was surprised to see such a big difference, so I played around with it.

My first suspect was inlining problems with the new solution, but that didn't turn out to be true.
I ended up pinning it down to the instanceof check, because when I changed the additional instanceof check to AbstractSet for a test I saw no regression for the list case and still saw my improvement for the empty set.

All I was trying to do with the provided benchmark in my first mail was to isolate the "problem" and to showcase that this seems to be independent from hierarchy, which exists in the collection API test above. In retrospect, this didn't help here, which I'm sorry about.

> You ask "point me to the code" which suggests you haven't looked at the code yet.  When you do (and I hope you do!) you will find that there are many, many algorithms and decisions in the JVM that affect the treatment of runtime type tests.  You could start by grepping around for "subtype" (case insensitive) in the *.[ch]pp and *.ad files under src/hotspot.

I have in fact looked at the code already before the first mail and found emit_typecheck_helper in src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp, which I think is involved - please correct me if I'm wrong.
And MacroAssembler::check_klass_subtype_fast_path in src/hotspot/cpu/x86/macroAssembler_x86.cpp etc. Depending on the architecture (x86, aarch etc.) of course.

Like you said, there are many things involved and I was really hoping for a good starting point from an experienced developer, or a rough explanation of what might be involved in the example. I understand now that there doesn't seem to be a "simple" explanation. Unfortunately, the C++ side of things is not as well documented as the Java side, and that sometimes makes it relatively complicated for people like me, who aren't familiar with the code, to follow the flows. I can assure you: I do want to look at code, hence the - admittedly clumsy - question.

I didn't do a great job of explaining that I looked upfront before asking, so that's on me.

Nonetheless, I hope this sheds some light on the background of this and I hope the collection case earlier justifies as a more real-life example.

Cheers,
Christoph


From: John Rose <john.r.rose at oracle.com>
Date: Saturday, 18 July 2020 at 01:13
To: Christoph Dreis <christoph.dreis at freenet.de>
Cc: hotspot-runtime-dev <hotspot-runtime-dev at openjdk.java.net>
Subject: Re: Performance of instanceof with interfaces is multiple times slower than with classes

On Jul 15, 2020, at 2:08 AM, Christoph Dreis <mailto:christoph.dreis at freenet.de> wrote:

Could you enlighten me what the cause for this is and maybe point me to the code where this is done?

A microbenchmark like that doesn't demonstrate much.

The JVM optimizations which are designed for real applications may or may not apply to this code.  They will certainly apply differently in real-life application execution.  In real life, class hierarchies are more complex, and so the JVM expects to do more work telling types apart; at that point differences between single inheritance and multiple inheritance require differing algorithms with different complexities and costs.  Also, in real life, inputs to such hot paths are not just all one type (as in your micro), which is another reason you get into a different set of costs and complexities.

If this were related to a real-world application, or a realistic benchmark, I might get assembly code for the hot paths in the two methods, and see what the CPU is doing that makes a difference in speed.  Then I might meditate on what decisions the JVM made to choose that code, and see if there is an improvement possible.

You ask "point me to the code" which suggests you haven't looked at the code yet.  When you do (and I hope you do!) you will find that there are many, many algorithms and decisions in the JVM that affect the treatment of runtime type tests.  You could start by grepping around for "subtype" (case insensitive) in the *.[ch]pp and *.ad files under src/hotspot.

Is this maybe even a bug/regression? Can we maybe do something to improve the interface case?

I will stick my neck out to say that microbenchmarks alone by definition never demonstrate performance regressions or bugs.

Yes, we can always do something to improve it.  Improving this micro could involve attempting to treat trivial interface hierarchies like trivial class hierarchies:  But who would this benefit?  Our investment of effort is surely driven by a hope to spend limited effort to get the most benefit on real applications.

- John


From aph at redhat.com  Sat Jul 18 13:00:58 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 18 Jul 2020 14:00:58 +0100
Subject: Performance of instanceof with interfaces is multiple times
 slower than with classes
In-Reply-To: <6A5EB60C-88C2-4160-9C55-A82B8FA95ECF@oracle.com>
References: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
 <6A5EB60C-88C2-4160-9C55-A82B8FA95ECF@oracle.com>
Message-ID: <0c6adbe9-1795-e220-42c3-6d318125ffff@redhat.com>

On 18/07/2020 00:13, John Rose wrote:
> Yes, we can always do something to improve it.  Improving this micro could involve attempting to treat trivial interface hierarchies like trivial class hierarchies:  But who would this benefit?

Personally speaking, I don't like counter-intuitive performance
gotchas. I know it's been quite a long time since high-level-language
constructs mapped simply onto machine instructions, but even so
predictability is nice to have.

And, just for some healthy competition, we had O(1) interface dispatch
and instanceof in GCJ during the last century. Just sayin'.  :-)

Mind you, that O(1) might not have been very fast...

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Sat Jul 18 13:11:22 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 18 Jul 2020 14:11:22 +0100
Subject: Performance of instanceof with interfaces is multiple times
 slower than with classes
In-Reply-To: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
References: <D14A6EE0-DE31-43CF-8FA2-FF484F121BD2@freenet.de>
Message-ID: <f1bcf5e7-aca1-7b2a-52c0-2cce2c1b586b@redhat.com>

On 15/07/2020 10:08, Christoph Dreis wrote:
> Benchmark                                                Mode  Cnt   Score    Error   Units
> MyBenchmark.testInstanceOfClass                          avgt   10   2,085 ±  0,179   ns/op
> MyBenchmark.testInstanceOfInterface                      avgt   10  18,783 ±  0,595   ns/op
>
> I was surprised to see that the interface variant is so much slower.
> Both checks should return false and there is no big hierarchy that needs to be walked up/down.
>
> Could you enlighten me what the cause for this is and maybe point me to the code where this is done?
> Is this maybe even a bug/regression? Can we maybe do something to improve the interface case?

You need to keep in mind that JMH testing of such extremely small
intervals of time is difficult to do. A single load from L1 cache has
a latency of four or five cycles, so about 1 - 1.3ns. Your
testInstanceOfClass test takes essentially no time at all: that 2ns is
the cost of Blackhole.consume, as you'll see if you try

    @Benchmark
    public boolean testEmptyMethod(BenchmarkState state) {
        return true;
    }

I get:

Benchmark                       Mode  Cnt   Score   Error  Units
Invoke.testEmptyMethod          avgt    3   5.172 ± 0.019  ns/op
Invoke.testInstanceOfClass      avgt    3   5.175 ± 0.023  ns/op
Invoke.testInstanceOfInterface  avgt    3  13.370 ± 0.303  ns/op


Your microbenchmark here is too micro.  :-)

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From david.holmes at oracle.com  Mon Jul 20 01:06:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 11:06:44 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
Message-ID: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/

This is a simple cleanup that touches files across a number of VM areas 
- hence the cross-post.

Whilst working on a different JNI fix I noticed that in most cases in 
jni.cpp we were using the following form of make_local:

JNIHandles::make_local(env, obj);

and what that form does is first extract the thread from the JNIEnv:

JavaThread* thread = JavaThread::thread_from_jni_environment(env);
return thread->active_handles()->allocate_handle(obj);

but there is also another, faster, variant for when you already have the 
"thread":

jobject JNIHandles::make_local(Thread* thread, oop obj) {
   return thread->active_handles()->allocate_handle(obj);
}

When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, 
UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:

     JavaThread* thread=JavaThread::thread_from_jni_environment(env);

and further defined:

     Thread* THREAD = thread;

so we always already have direct access to the "thread" available (or 
indirect via TRAPS), and in fact we can end up removing the 
make_local(JNIEnv* env, oop obj) variant altogether.

Along the way I spotted some related issues with unnecessary use of 
Thread::current() when it is already available from TRAPS, and some 
other cases where we extracted the JNIEnv from a thread only to later 
extract the thread from the JNIEnv.

Testing: tiers 1 - 3

Thanks,
David
-----
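
To make the difference between the two make_local forms concrete, here is a self-contained sketch. Every type in it is a simplified stand-in and not HotSpot's real declaration; in particular, the back-pointer inside the mock JNIEnv_ and the fixed 16-slot handle block are assumptions made only so the example compiles on its own. It shows that the JNIEnv form performs an extra lookup to recover the thread and then does exactly what the Thread form does.

```cpp
#include <cstddef>

// Simplified stand-ins, NOT HotSpot's real types.
typedef void* oop;
typedef void* jobject;

struct JNIHandleBlock {
  oop slots[16];
  int top = 0;
  // Stores obj in the next free slot and returns the slot address as the handle.
  jobject allocate_handle(oop obj) {
    if (top == 16) return NULL;          // mock block is full
    slots[top] = obj;
    return &slots[top++];
  }
};

struct Thread {
  JNIHandleBlock handles;
  JNIHandleBlock* active_handles() { return &handles; }
};

struct JavaThread;
struct JNIEnv_ { JavaThread* owner; };   // mock: a back-pointer, assumed for illustration

struct JavaThread : Thread {
  JNIEnv_ env;
  JavaThread() { env.owner = this; }
  JNIEnv_* jni_environment() { return &env; }
  static JavaThread* thread_from_jni_environment(JNIEnv_* e) { return e->owner; }
};

struct JNIHandles {
  // Form 1: recover the thread from the JNIEnv first, then allocate.
  static jobject make_local(JNIEnv_* env, oop obj) {
    JavaThread* thread = JavaThread::thread_from_jni_environment(env);
    return thread->active_handles()->allocate_handle(obj);
  }
  // Form 2: the caller already has the thread, so allocate directly.
  static jobject make_local(Thread* thread, oop obj) {
    return thread->active_handles()->allocate_handle(obj);
  }
};
```

With a JavaThread t, the calls JNIHandles::make_local(t.jni_environment(), p) and JNIHandles::make_local(&t, p) both end up in the same handle block; the second simply skips the lookup that an entry wrapper would already have done.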

From david.holmes at oracle.com  Mon Jul 20 04:15:56 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 14:15:56 +1000
Subject: RFR (M) 8237727: Mac: after we handle a crash, Apple's crash
 reporter is left with incorrect state
In-Reply-To: <b90eabcd-3807-0f84-ca05-4e2474cf88ea@oracle.com>
References: <274ED2F0-38D1-48AE-AD84-75775D4A57AE@me.com>
 <b90eabcd-3807-0f84-ca05-4e2474cf88ea@oracle.com>
Message-ID: <7312aea2-cadc-ba88-e847-3baf00d1f971@oracle.com>

Hi Gerard,

On 18/07/2020 5:21 am, gerard ziemski wrote:
> 
> Hi all,
> 
> Please review this enhancement, which changes how we handle a crash on 
> macOS, so that the native macOS CrashReporter can create its own crash 
> report alongside ours, with correct crash signal and frame.

So ... we have a flag, UseOSErrorReporting, to control whether the VM 
handles error reporting or whether it lets the OS handle things. The VM 
doesn't directly interact with OS error reporting but obviously if that 
OS reporting differs whether ::exit or ::abort is called then the VM 
does affect that. So my initial question is:

Should the user not just set UseOSErrorReporting if they want the 
CrashReporter to have full control, and indeed will it work correctly if 
the user sets that?

I can see some motivation for having the VM handle things and get a nice 
hs_err file, whilst also having the OS handle things and get, e.g., 
stacktraces of all threads (yes I've often wished for that myself!).

That said there are practical implications of this change. Primarily 
from a testing perspective IIUC currently if we disable core dumps then 
the CrashReporter will not be involved at all, but with this change it 
will be - correct? In which case do we now risk filling up test machine 
disks with these error logs?

My own feeling here is that we may want this for the abort case but if 
we do an exit (because the user doesn't want a core dump) then we've 
chosen to treat this failure as not-a-crash, and so it is appropriate 
for the OS crash reporter to not be involved.

Cheers,
David
-----

> Normally after we handle a crash we terminate the process with either 
> exit() or abort():
> 
> #1 - When we terminate with abort() (the case when the core dump is 
> enabled, i.e. -XX:+CreateCoredumpOnCrash) macOS CrashReporter doesn't 
> see the original crash that we handled, but only sees the abort, which 
> it correctly, but confusingly, reports as the termination reason.
> 
> #2 - When we terminate with exit() (the case when the core dump is 
> disabled, i.e. -XX:-CreateCoredumpOnCrash) macOS CrashReporter doesn't 
> see the crash and does not generate a report at all.
> 
> 
> With this proposed fix we handle the crash as usual, but then instead of 
> aborting/exiting, we allow the process to crash again, which allows the 
> macOS CrashReporter to generate its crash log with correct exception 
> type and termination signal, showing the actual frame that crashed, in 
> all cases (regardless of whether the core dump is enabled or disabled).
> 
> Before, the CrashReporter would only indicate the (abort) exception type 
> with no termination signal:
> 
> 
> Exception Type:        EXC_BAD_ACCESS (SIGABRT)
> Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000008
> Exception Note:        EXC_CORPSE_NOTIFY
> 
> 
> But now we get (correct) exception type and termination signal:
> 
> 
> Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
> Exception Note:        EXC_CORPSE_NOTIFY
> 
> Termination Signal:    Segmentation fault: 11
> Termination Reason:    Namespace SIGNAL, Code 0xb
> Terminating Process:   exc handler [1497]
> 
> 
> In addition, instead of a frame like this:
> 
> 
> Thread 2 Crashed:
> 0   libsystem_kernel.dylib 0x00007fff6ec4633a __pthread_kill + 10
> 1   libsystem_pthread.dylib 0x00007fff6ed02e60 pthread_kill + 430
> 2   libsystem_c.dylib 0x00007fff6ebcd808 abort + 120
> 3   libjvm.dylib 0x00000001037c6da1 os::abort(bool, void*, void const*) 
> + 49 (os_bsd.cpp:1069)
> 4   libjvm.dylib 0x0000000103a9a249 VMError::report_and_die(int, char 
> const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, 
> void*, char const*, int, unsigned long) + 3017 (vmError.cpp:1639)
> 5   libjvm.dylib 0x0000000103a99655 VMError::report_and_die(Thread*, 
> unsigned int, unsigned char*, void*, void*, char const*, ...) + 149 
> (vmError.cpp:1315)
> 6   libjvm.dylib 0x0000000103a9a341 VMError::report_and_die(Thread*, 
> unsigned int, unsigned char*, void*, void*) + 33 (vmError.cpp:1322)
> 7   libjvm.dylib 0x00000001037cbdfa JVM_handle_bsd_signal + 618 
> (os_bsd_x86.cpp:763)
> 8   libjvm.dylib 0x00000001037c8ce9 signalHandler(int, __siginfo*, 
> void*) + 89 (os_bsd.cpp:2589)
> 9   libsystem_platform.dylib 0x00007fff6ecf75fd _sigtramp + 29
> 10  ??? 000000000000000000 0 + 0
> 11  libjvm.dylib 0x0000000103a4aa51 Unsafe_PutInt(JNIEnv_*, _jobject*, 
> _jobject*, long, int) + 241 (unsafe.cpp:313)
> 12  ??? 0x00000001060baade 0 + 4396395230
> 13  ??? 0x00000001060b207e 0 + 4396359806
> 14  ??? 0x00000001060b207e 0 + 4396359806
> 15  ??? 0x00000001060b207e 0 + 4396359806
> 16  ??? 0x00000001060a89ca 0 + 4396321226
> 17  libjvm.dylib 0x0000000103225792 JavaCalls::call_helper(JavaValue*, 
> methodHandle const&, JavaCallArguments*, Thread*) + 1426 
> (javaCalls.cpp:430)
> 18  libjvm.dylib 0x00000001032e1ee1 jni_invoke_static(JNIEnv_*, 
> JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, 
> Thread*) + 417 (jni.cpp:975)
> 19  libjvm.dylib 0x00000001032e9605 jni_CallStaticVoidMethod + 645 
> (jni.cpp:1830)
> 20  libjli.dylib 0x0000000101122d3f JavaMain + 2495 (java.c:556)
> 21  libjli.dylib 0x00000001011254a9 ThreadJavaMain + 9 
> (java_md_macosx.m:720)
> 22  libsystem_pthread.dylib 0x00007fff6ed03109 _pthread_start + 148
> 23  libsystem_pthread.dylib 0x00007fff6ecfeb8b thread_start + 15
> 
> 
> CrashReporter will now show:
> 
> 
> Thread 2 Crashed:
> 0   libjvm.dylib 0x0000000108034415 MemoryAccess<int>::put(int) + 205 
> (unsafe.cpp:233)
> 1   libjvm.dylib 0x00000001080249a0 Unsafe_PutInt(JNIEnv_*, _jobject*, 
> _jobject*, long, int) + 206 (unsafe.cpp:313)
> 2   ??? 0x0000000111ad9ade 0 + 4591557342
> 3   ??? 0x0000000111ad107e 0 + 4591521918
> 4   ??? 0x0000000111ad107e 0 + 4591521918
> 5   ??? 0x0000000111ad107e 0 + 4591521918
> 6   ??? 0x0000000111ac79ca 0 + 4591483338
> 7   libjvm.dylib 0x0000000107f37656 JavaCalls::call_helper(JavaValue*, 
> methodHandle const&, JavaCallArguments*, Thread*) + 1006 
> (javaCalls.cpp:430)
> 8   libjvm.dylib 0x0000000108077b7b jni_invoke_static(JNIEnv_*, 
> JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, 
> Thread*) + 260 (jni.cpp:975)
> 9   libjvm.dylib 0x000000010807dffc jni_CallStaticVoidMethod + 529 
> (jni.cpp:1830)
> 10  libjli.dylib 0x0000000104452d3f JavaMain + 2495 (java.c:556)
> 11  libjli.dylib 0x00000001044554a9 ThreadJavaMain + 9 
> (java_md_macosx.m:720)
> 12  libsystem_pthread.dylib 0x00007fff6ed03109 _pthread_start + 148
> 13  libsystem_pthread.dylib 0x00007fff6ecfeb8b thread_start + 15
> 
> 
> Which correctly identifies the top frame that caused the crash (instead 
> of "??? 000000000000000000 0 + 0").
> 
> Also, in our hs_err_pidX log we now show an approximate location of 
> the crash log report produced by the CrashReporter, e.g.:
> 
> 
> ---------------  S Y S T E M  ---------------
> 
> ...
> 
> CrashReporter log: /Users/gerard/Library/Logs/DiagnosticReports/java_*
> 
> END.
> 
> 
> Lastly, having macOS CrashReporter produce its crash log, in addition to 
> our own hs_err_pidX log, is valuable because it shows back traces for 
> all the native threads (ours only shows the crashed one) as well as line 
> offsets to the files (ours shows binary offsets). It's also nice to 
> validate that the info in our crash log is correct.
> 
> For full before and after CrashReporter logs please see the bug.
> 
> bug link at https://bugs.openjdk.java.net/browse/JDK-8237727
> open webrev at http://cr.openjdk.java.net/~gziemski/8237727_rev1
> testing: passes Mach5 hs_tier1,2,3,4,5
> 
> cheers
> 
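
The effect described in the quoted proposal (the process dying from the original signal instead of from abort() or exit()) can be illustrated with the generic POSIX "restore the default action and re-raise" pattern. This is a swapped-in, well-known technique used only to show what an outside observer sees; it is not claimed to be the mechanism used in the webrev. The function names are hypothetical, and raise(SIGSEGV) stands in for a real fault.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler: do the in-process reporting, then restore the default action and
// re-raise, so the process terminates with the original signal.
static void report_then_reraise(int sig) {
  static const char msg[] = "crash handled, re-raising\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);  // async-signal-safe
  (void)ignored;
  signal(sig, SIG_DFL);  // back to the default disposition
  raise(sig);            // stays pending while the handler runs; delivered on return
}

// Forks a child that "crashes" with SIGSEGV and returns the signal that
// terminated it as seen by the parent, or -1 if it was not killed by a signal.
static int crash_child_and_get_termination_signal() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    struct sigaction sa = {};
    sa.sa_handler = report_then_reraise;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, NULL);
    raise(SIGSEGV);      // stand-in for a real fault
    _exit(0);            // not reached if the re-raise terminates the child
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The parent here plays the role of the external crash reporter: it observes SIGSEGV as the termination signal, whereas a handler that ended with abort() would make it observe SIGABRT, and one that ended with exit() would show no signal at all.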

From david.holmes at oracle.com  Mon Jul 20 04:16:49 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 14:16:49 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
Message-ID: <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>

Subject line got truncated by accident ...

On 20/07/2020 11:06 am, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
> 
> This is a simple cleanup that touches files across a number of VM areas 
> - hence the cross-post.
> 
> Whilst working on a different JNI fix I noticed that in most cases in 
> jni.cpp we were using the following form of make_local:
> 
> JNIHandles::make_local(env, obj);
> 
> and what that form does is first extract the thread from the JNIEnv:
> 
> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
> return thread->active_handles()->allocate_handle(obj);
> 
> but there is also another, faster, variant for when you already have the 
> "thread":
> 
> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>    return thread->active_handles()->allocate_handle(obj);
> }
> 
> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, 
> UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
> 
>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
> 
> and further defined:
> 
>      Thread* THREAD = thread;
> 
> so we always already have direct access to the "thread" available (or 
> indirect via TRAPS), and in fact we can end up removing the 
> make_local(JNIEnv* env, oop obj) variant altogether.
> 
> Along the way I spotted some related issues with unnecessary use of 
> Thread::current() when it is already available from TRAPS, and some 
> other cases where we extracted the JNIEnv from a thread only to later 
> extract the thread from the JNIEnv.
> 
> Testing: tiers 1 - 3
> 
> Thanks,
> David
> -----

From david.holmes at oracle.com  Mon Jul 20 04:37:46 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 14:37:46 +1000
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
Message-ID: <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>

Hi Gerard,

On 18/07/2020 5:19 am, gerard ziemski wrote:
> Hi all,
> 
> Please review this small fix that adds the OS version and the OS build 
> number to the hs_err_pidXXX.log output in the "Summary" section for Mac 
> platform (it's easier to use for developers than the Darwin kernel 
> version that we display right now).
> 
> This is how things used to look:
> 
> 
> --------------- S U M M A R Y ------------
> 
> Command Line: Crasher
> 
> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
> Darwin 19.5.0
> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds (0d 0h 
> 0m 1s)
> 
> 
> And this is how the "Summary" section looks with the proposed change:
> 
> 
> --------------- S U M M A R Y ------------
> 
> Command Line: Crasher
> 
> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
> Darwin 19.5.0, macOS 10.15.5 (19F101)
> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds (0d 0h 
> 0m 0s)
> 
> 
> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
> testing Mach5 hs_tier1,2,3,4,5 in progress

Just to be clear, the changes prior to:

1555 #ifdef __APPLE__

are just fixing up existing indentation errors - correct?

The actual change seems okay, just one query:

1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };

I couldn't find KERN_OSVERSION documented for sysctl - is it a "recent" 
addition?

Thanks,
David

> 
> cheers
> 

From kim.barrett at oracle.com  Mon Jul 20 05:22:49 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 20 Jul 2020 01:22:49 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
Message-ID: <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>

> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Subject line got truncated by accident ...
> 
> On 20/07/2020 11:06 am, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of make_local:
>> JNIHandles::make_local(env, obj);
>> and what that form does is first extract the thread from the JNIEnv:
>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>> return thread->active_handles()->allocate_handle(obj);
>> but there is also another, faster, variant for when you already have the "thread":
>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>   return thread->active_handles()->allocate_handle(obj);
>> }
>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
>>     JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>> and further defined:
>>     Thread* THREAD = thread;
>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up removing the make_local(JNIEnv* env, oop obj) variant altogether.
>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from the JNIEnv.
>> Testing: tiers 1 - 3
>> Thanks,
>> David
>> -----

------------------------------------------------------------------------------
src/hotspot/share/classfile/javaClasses.cpp
 439     JNIEnv *env = thread->jni_environment();

Since env is no longer used on the next line, move this down to where
it is used, at line 444.

------------------------------------------------------------------------------
src/hotspot/share/classfile/verifier.cpp
 299   JNIEnv *env = thread->jni_environment();

env now seems to only be used at line 320.  Move this closer.

------------------------------------------------------------------------------
src/hotspot/share/prims/jni.cpp
 743     result = JNIHandles::make_local(THREAD, result_handle());

jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
previously it just used "thread". Maybe this change shouldn't be made?
Or can the other uses be changed to THREAD for consistency?

------------------------------------------------------------------------------
src/hotspot/share/prims/jvm.cpp

The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
instead of "THREAD", even though other places nearby are using
"THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
easily avoidable.

------------------------------------------------------------------------------


From david.holmes at oracle.com  Mon Jul 20 05:53:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 15:53:37 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
Message-ID: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>

Hi Kim,

Thanks for looking at this.

Updated webrev at:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

On 20/07/2020 3:22 pm, Kim Barrett wrote:
>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Subject line got truncated by accident ...
>>
>> On 20/07/2020 11:06 am, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of make_local:
>>> JNIHandles::make_local(env, obj);
>>> and what that form does is first extract the thread from the JNIEnv:
>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>> return thread->active_handles()->allocate_handle(obj);
>>> but there is also another, faster, variant for when you already have the "thread":
>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>    return thread->active_handles()->allocate_handle(obj);
>>> }
>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>> and further defined:
>>>      Thread* THREAD = thread;
>>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up removing the make_local(JNIEnv* env, oop obj) variant altogether.
>>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from the JNIEnv.
>>> Testing: tiers 1 - 3
>>> Thanks,
>>> David
>>> -----
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/classfile/javaClasses.cpp
>   439     JNIEnv *env = thread->jni_environment();
> 
> Since env is no longer used on the next line, move this down to where
> it is used, at line 444.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/classfile/verifier.cpp
>   299   JNIEnv *env = thread->jni_environment();
> 
> env now seems to only be used at line 320.  Move this closer.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/prims/jni.cpp
>   743     result = JNIHandles::make_local(THREAD, result_handle());
> 
> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
> previously it just used "thread". Maybe this change shouldn't be made?
> Or can the other uses be changed to THREAD for consistency?

"thread" and "THREAD" are interchangeable for anything expecting a 
"Thread*" (and somewhat surprisingly a number of API's that only work 
for JavaThreads actually take a Thread*. :( ). I had a choice between 
trying to be file-wide consistent with the make_local calls, versus 
local-code consistent, and used THREAD as it is available in both 
JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
"thread" for local consistency.

> ------------------------------------------------------------------------------
> src/hotspot/share/prims/jvm.cpp
> 
> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
> instead of "THREAD", even though other places nearby are using
> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
> easily avoidable.

Everything that uses THREAD in a JVM_ENTRY method can be changed to use 
"thread" instead. But I'm not sure it's a consistency worth pursuing at 
least as part of these changes (there are likely similar issues with 
most of the touched files).
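
To make the thread/THREAD point concrete, here is a minimal, self-contained
C++ model of the two make_local variants and of the two names an entry
wrapper puts in scope. Everything in it (HandleBlock, JniEnv, the int-valued
"oop", the back-pointer used to find the owning thread) is a simplified
stand-in assumed for illustration; it is not HotSpot's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a JNI handle block; an "oop" is just an int here.
struct HandleBlock {
  std::deque<int> slots;  // deque keeps element addresses stable on push_back
  const int* allocate_handle(int obj) {
    slots.push_back(obj);
    return &slots.back();
  }
};

struct JavaThread;
// Hypothetical stand-in for JNIEnv; this model uses a plain back-pointer
// to locate the owning thread (an assumption of the sketch).
struct JniEnv { JavaThread* owner; };

struct Thread {
  HandleBlock handles;
  HandleBlock* active_handles() { return &handles; }
};

struct JavaThread : Thread {
  JniEnv env{this};
  JniEnv* jni_environment() { return &env; }
  static JavaThread* thread_from_jni_environment(JniEnv* e) { return e->owner; }
};

// Variant that takes the thread directly.
const int* make_local(Thread* thread, int obj) {
  return thread->active_handles()->allocate_handle(obj);
}

// Variant that takes the env: it must first recover the thread.
const int* make_local(JniEnv* env, int obj) {
  JavaThread* thread = JavaThread::thread_from_jni_environment(env);
  return thread->active_handles()->allocate_handle(obj);
}

// Models the body of an entry wrapper, where both names are already in scope.
std::size_t entry_body(JniEnv* env) {
  JavaThread* thread = JavaThread::thread_from_jni_environment(env);
  Thread* THREAD = thread;
  make_local(THREAD, 1);  // exact match on Thread*
  make_local(thread, 2);  // JavaThread* converts implicitly to Thread*
  return thread->active_handles()->slots.size();
}
```

In this model both calls in entry_body resolve to the Thread* overload, so
the env-taking overload is never needed once the wrapper has extracted the
thread.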

Thanks,
David

> ------------------------------------------------------------------------------
> 

From kim.barrett at oracle.com  Mon Jul 20 06:15:13 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 20 Jul 2020 02:15:13 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>

> On Jul 20, 2020, at 1:53 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Hi Kim,
> 
> Thanks for looking at this.
> 
> Updated webrev at:
> 
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

Looks good.

> 
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>> src/hotspot/share/prims/jni.cpp
>>  743     result = JNIHandles::make_local(THREAD, result_handle());
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
> 
> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of API's that only work for JavaThreads actually take a Thread*. :( ). I had choice between trying to be file-wide consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY and via TRAPS. But I can certainly make a local change to "thread" for local consistency.

I don't feel strongly either way.  It just struck me as a little odd to have the mix in close proximity,
especially since I think consistently using either one might work in this function.  But being consistent
about make_local usage has something to be said for it too.

>> src/hotspot/share/prims/jvm.cpp
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
> 
> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched files).

Yeah, it's not really obvious whether to use THREAD or thread in some cases.
But I agree that addressing any inconsistencies there is mostly out of scope for
this change.


From david.holmes at oracle.com  Mon Jul 20 07:53:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 17:53:48 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>
Message-ID: <6e0d9af0-92f0-1eba-fc0a-22eebf008fe0@oracle.com>

Thanks Kim!

David

On 20/07/2020 4:15 pm, Kim Barrett wrote:
>> On Jul 20, 2020, at 1:53 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> Looks good.
> 
>>
>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>> src/hotspot/share/prims/jni.cpp
>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>> previously it just used "thread". Maybe this change shouldn't be made?
>>> Or can the other uses be changed to THREAD for consistency?
>>
>> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of API's that only work for JavaThreads actually take a Thread*. :( ). I had choice between trying to be file-wide consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY and via TRAPS. But I can certainly make a local change to "thread" for local consistency.
> 
> I don't feel strongly either way.  It just struck me as a little odd to have the mix in close proximity,
> especially since I think consistently using either one might work in this function.  But being consistent
> about make_local usage has something to be said for it too.
> 
>>> src/hotspot/share/prims/jvm.cpp
>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>> instead of "THREAD", even though other places nearby are using
>>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>>> easily avoidable.
>>
>> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched files).
> 
> Yeah, it's not really obvious whether to use THREAD or thread in some cases.
> But I agree that addressing any inconsistencies there is mostly out of scope for
> this change.
> 

From tobias.hartmann at oracle.com  Mon Jul 20 08:23:58 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 10:23:58 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
Message-ID: <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>

Hi,

On 08.07.20 10:26, Liu, Xin wrote:
> ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.
> 
> I validate at that time, and the JVM will crash if the input does not satisfy the guarantee() statement.

I don't think a guarantee should be used here, i.e. the VM shouldn't crash but we should exit
gracefully with an error message. Isn't it possible to piggy-back on the error mechanism in
DirectivesParser?

> I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".
> 
> HotSpot probably already has similar code that I failed to find. Please let me know and I will remove mine.

I would just use name_and_sig_as_C_string().

jvmFlagConstraintList.cpp:180/181
- Wrong indentation

jvmFlagConstraintsCompiler.cpp:388/400
- Maybe change the error message to "Unrecognized intrinsic detected in DisableIntrinsic [...]"

Best regards,
Tobias

From thomas.stuefe at gmail.com  Mon Jul 20 10:59:46 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Jul 2020 12:59:46 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
Message-ID: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>

Hi,

could I please have reviews for this very trivial patch?

I found that gtest ignores invalid jvm options randomly since it relies on
an uninitialized variable.
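
As a rough illustration of this class of bug (not the actual gtest launcher
code), the sketch below uses a hypothetical InitArgs struct that mirrors the
fields of JNI's JavaVMInitArgs. The helper names make_args and
rejects_unknown_option are made up for the example. The point is only that
the ignoreUnrecognized field has to be set explicitly; if it is left
uninitialized, any non-zero garbage makes the VM ignore bad options.

```cpp
#include <cassert>

// Hypothetical stand-in mirroring the fields of JNI's JavaVMInitArgs.
struct InitArgs {
  int version;
  int nOptions;
  const char** options;
  unsigned char ignoreUnrecognized;  // jboolean in real JNI
};

// Models the decision made for an unknown option: it is rejected only
// when ignoreUnrecognized is exactly zero.
bool rejects_unknown_option(const InitArgs& args) {
  return args.ignoreUnrecognized == 0;
}

// One way to build the arguments so that no field is left indeterminate.
InitArgs make_args(const char** options, int count) {
  InitArgs args{};               // value-initialization zeroes every field
  args.version = 0x00010008;     // numeric value of JNI_VERSION_1_8
  args.nOptions = count;
  args.options = options;
  args.ignoreUnrecognized = 0;   // explicit JNI_FALSE: bad options are errors
  return args;
}
```

With the field set this way, an invalid option is always reported instead
of depending on whatever happened to be on the stack.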

issue: https://bugs.openjdk.java.net/browse/JDK-8249748
webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/

Thanks, Thomas

From shade at redhat.com  Mon Jul 20 11:07:36 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 20 Jul 2020 13:07:36 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
Message-ID: <14e2a1b2-b3e9-2458-9f7b-2a6168d18ac9@redhat.com>

On 7/20/20 12:59 PM, Thomas Stüfe wrote:
> I found that gtest ignores invalid jvm options randomly since it relies on
> an uninitialized variable.
> 
> issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/

Ouch. Looks good.

Since this is what InitializeJVM is doing, it looks trivial to me too.

-- 
Thanks,
-Aleksey


From thomas.stuefe at gmail.com  Mon Jul 20 11:23:15 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Jul 2020 13:23:15 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <14e2a1b2-b3e9-2458-9f7b-2a6168d18ac9@redhat.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <14e2a1b2-b3e9-2458-9f7b-2a6168d18ac9@redhat.com>
Message-ID: <CAA-vtUx70xe9kZDJdJOt+_+xVQSX6KeqiepqLvy9ckh=WaPULA@mail.gmail.com>

Thanks Aleksey.

On Mon, Jul 20, 2020 at 1:07 PM Aleksey Shipilev <shade at redhat.com> wrote:

> On 7/20/20 12:59 PM, Thomas Stüfe wrote:
> > I found that gtest ignores invalid jvm options randomly since it relies
> on
> > an uninitialized variable.
> >
> > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> > webrev:
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
>
> Ouch. Looks good.
>
> Since this is what InitializeJVM is doing, it looks trivial to me too.
>
> --
> Thanks,
> -Aleksey
>
>

From david.holmes at oracle.com  Mon Jul 20 12:54:12 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 22:54:12 +1000
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
Message-ID: <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>

Hi Thomas,

On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> Hi,
> 
> could I please have reviews for this very trivial patch?
> 
> I found that gtest ignores invalid jvm options randomly since it relies on
> an uninitialized variable.

Is it really random? I would have expected it to be basically always 
non-zero and hence always ignore unknown options. I'm not at all clear 
if running of the gtests might rely on always ignoring 
unexpected/unknown flags - does it have the capability to distinguish 
product and non-product test runs?

I think this needs wider review from people familiar with how our gtest 
tests are run. (I have no idea - I never use it.)

Thanks,
David

> issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> 
> Thanks, Thomas
> 

From thomas.stuefe at gmail.com  Mon Jul 20 14:17:39 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Jul 2020 16:17:39 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
Message-ID: <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>

Hi David,


On Mon, Jul 20, 2020 at 2:54 PM David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> > Hi,
> >
> > could I please have reviews for this very trivial patch?
> >
> > I found that gtest ignores invalid jvm options randomly since it relies
> on
> > an uninitialized variable.
>
> Is it really random? I would have expected it to be basically always
> non-zero and hence always ignore unknown options.


I remember seeing cmdline option errors from gtest, and since this coding
is very old, this must be either my failing memory or the random content of
the underlying uninitialized stack memory.


> I'm not at all clear
> if running of the gtests might rely on always ignoring
> unexpected/unknown flags -


It does not. googletest fishes its own arguments from the initial argument
vector and passes the rest off to the JVM. So the JVM ignoring or not
ignoring arguments will not change anything.


> does it have the capability to distinguish
> product and non-product test runs?
>

Gtests are hotspot coding and there are sections running with #ifdef
ASSERT, though I am not sure why that would matter?


>
> I think this needs wider review from people familiar with how our gtest
> tests are run. (I have no idea - I never use it.)
>

I am quite familiar with it since I use it almost daily. I really depend on
it being able to interpret JVM options.

In fact I am a bit dismayed by this bug, since I write tons of tests for
JEP387 and was feeling very smug about the tests running through in all my
test scenarios, only to find that for days I had kept running the same -
default - scenario over and over again :( No, that really should be fixed.

Gtestlauncher is called as part of the jtreg tests by the GtestWrapper, but
that does not pass any options to it.

But seriously, if there are tests which pass options to it, they probably
want those options to do something in the jvm, so ignoring them silently is
not good.

Thanks, Thomas


> Thanks,
> David
>
> > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> > webrev:
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> >
> > Thanks, Thomas
> >
>

From daniel.daugherty at oracle.com  Mon Jul 20 17:07:10 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 20 Jul 2020 13:07:10 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>

On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

I like this cleanup very much!


src/hotspot/share/classfile/javaClasses.cpp
    No comments.

src/hotspot/share/classfile/verifier.cpp
    L298:   JavaThread* thread = (JavaThread*)THREAD;
    L307:   ResourceMark rm(THREAD);
        Since we've gone to the trouble of creating the 'thread' variable,
        I would prefer it to be used instead of THREAD where possible.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
    L1021:   HandleMark hm;
        Can this be 'hm(THREAD)'? (Not your problem, but while you're
        in that file?)

src/hotspot/share/prims/jni.cpp
    No comments.

src/hotspot/share/prims/jvm.cpp
    L140:   ResourceMark rm;
        Can this be 'rm(THREAD)'? (Not your problem, but while you're
        in that file?)

    L611:   Handle stackStream_h(THREAD, JNIHandles::resolve_non_null(stackStream));
    L617:   objArrayHandle frames_array_h(THREAD, fa);
    L626:   return JNIHandles::make_local(THREAD, result);
        Since we've gone to the trouble of creating the 'jt' variable,
        I would prefer it to be used instead of THREAD where possible.

    L767:   vframeStream vfst(thread);
    L788:         return (jclass) JNIHandles::make_local(THREAD, m->method_holder()->java_mirror());
        Can we use 'thread' on L788? (preferred)
        Can we use 'THREAD' on L767? (less preferred)

    L949:   ResourceMark rm(THREAD);
    L951:   Handle class_loader (THREAD, JNIHandles::resolve(loader));
    L955:                            THREAD);
    L957:   Handle protection_domain (THREAD, JNIHandles::resolve(pd));
    L968:   return (jclass) JNIHandles::make_local(THREAD, k->java_mirror());
        Since we've gone to the trouble of creating the 'jt' variable,
        I would prefer it to be used instead of THREAD where possible.

    L986:   JavaThread* jt = (JavaThread*) THREAD;
        This 'jt' is unused and can be deleted (Not your problem, but
        while you're in that file?)

    L1154:   while (*p != '\0') {
    L1155:       if (*p == '.') {
    L1156:           *p = '/';
    L1157:       }
    L1158:       p++;
        Nit - the indents are wrong on L1155-58. (Not your problem, but
        while you're in that file?)

    L1389:   ResourceMark rm(THREAD);
    L1446:     return JNIHandles::make_local(THREAD, result);
    L1460:   return JNIHandles::make_local(THREAD, result);
        Can we use 'thread' on L1389? (preferred) And then the line you
        touched could also be 'thread' and we'll be consistent in this
        function...

    L3287:   oop jthread = thread->threadObj();
    L3288:   assert (thread != NULL, "no current thread!");
        I think the assert is wrong. It should be:

            assert(jthread != NULL, "no current thread!");

        If 'thread == NULL', then we would have crashed at L3287.
        Also notice that I deleted the extra ' ' before '('. (Not
        your problem, but while you're in that file?)

    L3289:   return JNIHandles::make_local(THREAD, jthread);
        Can you use 'thread' instead of 'THREAD' here for consistency?

    L3681:     method_handle = Handle(THREAD, JNIHandles::resolve(method));
    L3682:     Handle receiver(THREAD, JNIHandles::resolve(obj));
    L3683:     objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
    L3685:     jobject res = JNIHandles::make_local(THREAD, result);
        Can you use 'thread' instead of 'THREAD' here for consistency?

    L3705:   objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
    L3707:   jobject res = JNIHandles::make_local(THREAD, result);
        Can you use 'thread' instead of 'THREAD' here for consistency?

src/hotspot/share/prims/methodHandles.cpp
    No comments.

src/hotspot/share/prims/methodHandles.hpp
    No comments.

src/hotspot/share/prims/unsafe.cpp
    No comments.

src/hotspot/share/prims/whitebox.cpp
    No comments.

src/hotspot/share/runtime/jniHandles.cpp
    No comments.

src/hotspot/share/runtime/jniHandles.hpp
    No comments.

src/hotspot/share/services/management.cpp
    No comments.


None of my comments above are "must do". If you choose to make the
changes, a new webrev isn't required, but would be useful for a
sanity check.
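
For the L3287/L3288 assert comment, the ordering issue can be shown with a
small stand-alone sketch. The types ThreadObj and JavaThreadModel and the
function current_thread_object are hypothetical stand-ins, and the standard
assert macro is used in place of HotSpot's two-argument assert. The idea is
that a null check is only useful before the pointer is dereferenced, so the
check after the call should test the returned object instead.

```cpp
#include <cassert>

// Hypothetical stand-ins for the thread and its Java-level thread object.
struct ThreadObj { int id; };

struct JavaThreadModel {
  ThreadObj* obj;
  ThreadObj* threadObj() const { return obj; }
};

ThreadObj* current_thread_object(JavaThreadModel* thread) {
  // Checking 'thread' only makes sense here, before it is dereferenced.
  assert(thread != nullptr && "no thread passed in");
  ThreadObj* jthread = thread->threadObj();
  // After the call, the value worth checking is the result.
  assert(jthread != nullptr && "no current thread!");
  return jthread;
}
```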

Thumbs up.

Dan


>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From vladimir.kozlov at oracle.com  Mon Jul 20 20:20:26 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 13:20:26 -0700
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <5e298ff3-6dc1-c4fa-4545-1fc26d7379b5@oracle.com>

Hi David,

Changes look good.

On 7/20/20 10:07 AM, Daniel D. Daugherty wrote:
 > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
 >      L1021:   HandleMark hm;
 > Can this be 'hm(THREAD)'? (Not your problem, but while you're in that file?)

There are several cases like this in jvmciCompilerToVM.cpp and maybe in other places.
I think it should be done as a separate cleanup.

Thanks,
Vladimir

On 7/19/20 10:53 PM, David Holmes wrote:
> Hi Kim,
> 
> Thanks for looking at this.
> 
> Updated webrev at:
> 
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of 
>>>> make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted 
>>>> the thread from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up 
>>>> removing the make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available 
>>>> from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from 
>>>> the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
> 
> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of 
> API's that only work for JavaThreads actually take a Thread*. :( ). I had choice between trying to be file-wide 
> consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY 
> and via TRAPS. But I can certainly make a local change to "thread" for local consistency.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
> 
> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a 
> consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched 
> files).
> 
> Thanks,
> David
> 
>> ------------------------------------------------------------------------------
>>

From david.holmes at oracle.com  Mon Jul 20 22:06:05 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 08:06:05 +1000
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
Message-ID: <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>

Hi Thomas,

On 21/07/2020 12:17 am, Thomas Stüfe wrote:
> Hi David,
> 
> 
> On Mon, Jul 20, 2020 at 2:54 PM David Holmes <david.holmes at oracle.com 
> <mailto:david.holmes at oracle.com>> wrote:
> 
>     Hi Thomas,
> 
>     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
>      > Hi,
>      >
>      > could I please have reviews for this very trivial patch?
>      >
>      > I found that gtest ignores invalid jvm options randomly since it
>     relies on
>      > an uninitialized variable.
> 
>     Is it really random? I would have expected it to be basically always
>     non-zero and hence always ignore unknown options. 
> 
> 
> I remember seeing cmdline option errors from gtest, and since this 
> coding is very old, this must be either my failing memory or the random 
> content of the underlying uninitialized stack memory.

Yes but there is only one value of that uninitialized memory (zero) that 
would cause unrecognised options to not be ignored.

>     I'm not at all clear
>     if running of the gtests might rely on always ignoring
>     unexpected/unknown flags - 
> 
> 
> It does not. googletest fishes its own arguments from the initial 
> argument vector and passes the rest off to the JVM. So the JVM ignoring 
> or not ignoring arguments will not change anything.
> 
>     does it have the capability to distinguish
>     product and non-product test runs?
> 
> 
> Gtests are hotspot coding and there are sections running with #ifdef 
> ASSERT, though I am not sure why that would matter?

For the same reason we sometimes need to @require that a VM is a debug 
VM with a jtreg test - because the flag passed in is a non-product flag. 
Also when we run test suites we often pass 
-XX:+IgnoreUnrecognisedVMOptions, again so that non-product flags don't 
cause a failure with release bits.

> 
>     I think this needs wider review from people familiar with how our gtest
>     tests are run. (I have no idea - I never use it.)
> 
> 
> I am quite familiar with it since I use it almost daily. I really depend 
> on it being able to interpret JVM options.
> 
> In fact I am a bit dismayed by this bug since I write tons of tests for 
> JEP387 and was feeling very smug about the tests running through in all 
> my test scenarios, only to find that for days I have been running the same 
> - default - scenario over and over again :( No, that really should be fixed.

I don't understand what you mean. This setting should only affect what 
happens with unrecognised "bad" arguments (as per your subject).

> Gtestlauncher is called as part of the jtreg tests by the GtestWrapper, 
> but that does not pass any options to it.

Pardon my ignorance but how does one specify options for a gtest then?

> But seriously, if there are tests which pass options to it, they 
> probably want those options to do something in the jvm, so ignoring them 
> silently is not good.

Again only "bad" options are ignored - which should only impact use of 
non-product flags with product bits.

David
-----

> Thanks, Thomas
> 
> 
>     Thanks,
>     David
> 
>      > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
>      > webrev:
>      >
>     http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
>      >
>      > Thanks, Thomas
>      >
> 
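
[Editorial sketch, not part of the patch under review.] The point made above, that zero is the only value of the flag for which unrecognised options are rejected, can be illustrated with a small stand-in. InitArgsSketch, startup_proceeds and make_args are hypothetical names used only for illustration; the struct models just the one field relevant here and is not the real JNI declaration. The sketch assumes the fix is simply to make sure the flag has a defined value before the VM is created.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VM init-args struct; only the pieces needed
// for this discussion are modelled.
struct InitArgsSketch {
  int32_t nOptions;
  uint8_t ignoreUnrecognized;  // any non-zero value means "ignore bad options"
};

// Returns true if startup would proceed for a single option.
// A recognised option is always fine; an unrecognised one is only
// tolerated when the ignore flag is non-zero.
bool startup_proceeds(const InitArgsSketch& args, bool option_recognized) {
  assert(args.nOptions >= 0);
  return option_recognized || args.ignoreUnrecognized != 0;
}

// Value-initialization zeroes every member, so the flag is reliably 0
// instead of whatever happened to be on the stack.
InitArgsSketch make_args() {
  InitArgsSketch args{};
  return args;
}
```

With the flag reliably zero, a bad option is reported rather than silently dropped; any other byte value in that field would hide it.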

From vladimir.kozlov at oracle.com  Mon Jul 20 22:37:11 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 15:37:11 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
 <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>

Looks good.

Passed my tier1 testing.

Thanks,
Vladimir

On 7/20/20 10:12 AM, Ivanov, Vladimir A wrote:
> Hi,
> The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
> It uses "fgets" instead of "getline" so that it works with local memory.
> The tier1 tests passed on the release and fastdebug builds on Linux and on fastdebug builds on macOS.
> Testing results are the same for patched and non-patched builds.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 10:25 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> Oh, sorry, you are right :(
> 
> I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.
> 
> In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.
> 
> Thank you,
> 
> Thomas
> 
> 
> 
> On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Hi,
> It seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, use of the C heap is not a problem.
> "::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds, but these tests do not exercise the real case for hs_err files.
> It looks like a 2K byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
> Sent: Friday, July 17, 2020 9:42 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> Hi,
> 
> yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).
> 
> But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.
> 
> If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().
> 
> Cheers, Thomas
> 
> 
> 
> 
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Thanks, I expected the C functions here. Let's wait a little for the Runtime team and then update the buffer handling.
> 
>   Thanks, Vladimir
> 
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free() or use HS's os::malloc() to allocate the 'line' buffer.
> 
> Someone from Runtime may suggest what is the best for this case.
> 
> Thanks,
> Vladimir K
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
> 
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
>> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>>
>> I moved RFE to runtime group as Thomas said:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8249672
>>
>> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>>
>> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
>> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>>
>> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
>> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
>> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
>> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
>> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
>> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
>> V  [libjvm.so+0xcb2895]  init_globals()+0x55
>> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>>
>>
>> Regards,
>> Vladimir K
>>
>> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>>> Hi Vladimir,
>>>
>>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>
>>>>>    +#if defined(IA32) || defined(AMD64)
>>>>>
>>>>> Is that not synonymous with x86?
>>>>
>>>> This pattern was copied from the method "print_model_name_and_flags"
>>>> (file os/linux/os_linux.cpp).
>>>>
>>>> This method also reads the "/proc/cpuinfo" file and I reused it as a
>>>> "template" for the new method.
>>>>
>>>> It is better to use one pattern to work with exactly same file but
>>>> in general you are right.
>>>>
>>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>>
>>>> #if defined(IA32) || defined(AMD64)
>>>>
>>>> #define X86
>>>>
>>>> #define X86_ONLY(code) code
>>>>
>>>> #define NOT_X86(code)
>>>>
>>>>
>>>>
>>>> The question here: could I delete these "ifdefs", since this method
>>>> should work on x86 only?
>>>>
>>>>
>>>>
>>>
>>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>>> os_linux.cpp is shared among all architectures.
>>>
>>> So, in the former you do not need to exclude non-x86 architectures.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>
>>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; Hotspot dev
>>>> runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>
>>>> *Cc:* hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
>>>> *Subject:* Re: add microcode version to the hs_err files
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>>
>>>>
>>>> I think this would be more suited to hotspot-runtime.
>>>>
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>>
>>>>
>>>>
>>>> +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>>
>>>>
>>>>
>>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>>> +        char* rev = strchr(line, ':');
>>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>>> +        break;
>>>> +      }
>>>> +    }
>>>> +    free(line);
>>>>
>>>>
>>>>
>>>> Not sure this works as intended. At the first call to getline() it
>>>> will allocate a line buffer for you and return it. That buffer will
>>>> be as large as the first line you happen to read. You then pass that
>>>> same buffer into getline to fetch the next lines, but what if those
>>>> are longer than the first?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Forget that point, getline calls realloc() on the line buffer to
>>>> resize it, so this should be okay.
>>>>
>>>>
>>>>
>>>> Thanks, Thomas
>>>>
>>>>
>>>>
>>>> But anyway it would be better to pass a simple caller provided
>>>> buffer in - stack allocated. Since this function is called at crash
>>>> time and the C heap could be corrupted.
>>>>
>>>>
>>>>
>>>> Cheers, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> could you please review the patch
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>>
>>>> This patch adds the microcode version for different OSes, which may be
>>>> useful in the issue resolution process.
>>>>
>>>>
>>>>
>>>> The reported microcode version for different OSes looks like:
>>>>
>>>>
>>>>
>>>> Linux (RHEL7.7):
>>>>
>>>> # cat hs_err_pid251046.log |grep microc
>>>>
>>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>>
>>>>
>>>>
>>>> Windows (Win10, v1809):
>>>>
>>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> MacOS (Darwin):
>>>>
>>>> $ cat hs_err_pid95187.log |grep microc
>>>>
>>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>     Thanks, Vladimir
>>>>
>>>>
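
[Editorial sketch, not the code from the webrev.] The fgets()-based approach discussed in this thread, with a caller-owned stack buffer so that neither malloc/realloc nor any flavour of free() is involved, might look roughly like the following. The function name microcode_revision_from and the choice to pass in an already opened stream are assumptions made for illustration; the 2K buffer size follows the estimate given above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Scans a cpuinfo-style stream for the first "microcode : 0x..." line and
// returns the parsed revision, or 0 if no such line is found.
uint32_t microcode_revision_from(FILE* fp) {
  assert(fp != nullptr);
  char line[2048];  // stack buffer: nothing here touches the C heap
  uint32_t result = 0;
  while (fgets(line, sizeof(line), fp) != nullptr) {
    if (strstr(line, "microcode") != nullptr) {
      const char* rev = strchr(line, ':');
      unsigned int value = 0;
      // "%x" accepts an optional "0x" prefix and skips leading whitespace.
      if (rev != nullptr && sscanf(rev + 1, "%x", &value) == 1) {
        result = value;
      }
      break;  // only the first matching line is of interest
    }
  }
  return result;
}
```

Because fgets() never resizes its buffer, the os::free() versus ::free() question does not arise; an over-long line is simply read in several pieces.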

From ioi.lam at oracle.com  Tue Jul 21 00:12:31 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Jul 2020 17:12:31 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes _body[0]
Message-ID: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>

Hi please review this very simple fix:

diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
--- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
+++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
@@ -51,8 +51,11 @@
 Symbol::Symbol(const u1* name, int length, int refcount) {
   _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
   _length = length;
-  _body[0] = 0;  // in case length == 0
   memcpy(_body, name, length);
+  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
+  // contain random values, which will only be read by Symbol::identity_hash(),
+  // which would tolerate such randomness. These values never change during the lifetime
+  // of the Symbol.
 }


Passed hs tiers 1/2. Running tiers 3/4 now.

Thanks
- Ioi

From david.holmes at oracle.com  Tue Jul 21 02:36:46 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 12:36:46 +1000
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
Message-ID: <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>

Hi Ioi,

On 21/07/2020 10:12 am, Ioi Lam wrote:
> Hi please review this very simple fix:
> 
> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
> @@ -51,8 +51,11 @@
>  Symbol::Symbol(const u1* name, int length, int refcount) {
>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), 
> refcount);
>    _length = length;
> -  _body[0] = 0;  // in case length == 0
>    memcpy(_body, name, length);
> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
> uninitialized and may

Can we ever have a Symbol of length zero?

If the Symbol name is length 1 then surely _body[0] is initialized to 
the single character of that name?

The change seems harmless given a zero length symbol is meaningless, but 
the commentary just confuses things to me.

Thanks,
David
-----

> +  // contain random values, which will only be read by 
> Symbol::identity_hash(),
> +  // which would tolerate such randomness. These values never change 
> during the lifetime
> +  // of the Symbol.
>  }
> 
> 
> Passed hs tiers 1/2. Running tiers 3/4 now.
> 
> Thanks
> - Ioi

From ioi.lam at oracle.com  Tue Jul 21 02:50:49 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Jul 2020 19:50:49 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
Message-ID: <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>


On 7/20/20 7:36 PM, David Holmes wrote:
> Hi Ioi,
>
> On 21/07/2020 10:12 am, Ioi Lam wrote:
>> Hi please review this very simple fix:
>>
>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 
>> -0700
>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 
>> -0700
>> @@ -51,8 +51,11 @@
>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), 
>> refcount);
>>    _length = length;
>> -  _body[0] = 0;  // in case length == 0
>>    memcpy(_body, name, length);
>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>> uninitialized and may
>
> Can we ever have a Symbol of length zero?
>
> If the Symbol name is length 1 then surely _body[0] is initialized to 
> the single character of that name?
>
> The change seems harmless given a zero length symbol is meaningless, 
> but the commentary just confuses things to me.
>

Hi David,

We can have a valid Symbol of length 0. All UTF8 constants in 
classfiles, including the literal string "", are represented as Symbols.

How about

// Random, uninitialized values may appear in _body[0] and _body[1]
// for Symbols of length 0 and 1. These random values never change during
// the lifetime of the Symbol, and are read only by Symbol::identity_hash(),
// which would tolerate such randomness.


Thanks
- Ioi
> Thanks,
> David
> -----
>
>> +  // contain random values, which will only be read by 
>> Symbol::identity_hash(),
>> +  // which would tolerate such randomness. These values never change 
>> during the lifetime
>> +  // of the Symbol.
>>  }
>>
>>
>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>
>> Thanks
>> - Ioi


From david.holmes at oracle.com  Tue Jul 21 02:57:49 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 12:57:49 +1000
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
 <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>
Message-ID: <9894fafd-93bb-1c35-9b92-11f5f682eda2@oracle.com>

Hi Ioi,

On 21/07/2020 12:50 pm, Ioi Lam wrote:
> 
> 
> On 7/20/20 7:36 PM, David Holmes wrote:
>> Hi Ioi,
>>
>> On 21/07/2020 10:12 am, Ioi Lam wrote:
>>> Hi please review this very simple fix:
>>>
>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 
>>> -0700
>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 
>>> -0700
>>> @@ -51,8 +51,11 @@
>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), 
>>> refcount);
>>>    _length = length;
>>> -  _body[0] = 0;  // in case length == 0
>>>    memcpy(_body, name, length);
>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>>> uninitialized and may
>>
>> Can we ever have a Symbol of length zero?
>>
>> If the Symbol name is length 1 then surely _body[0] is initialized to 
>> the single character of that name?
>>
>> The change seems harmless given a zero length symbol is meaningless, 
>> but the commentary just confuses things to me.
>>
> 
> Hi David,
> 
> We can have a valid Symbol of length 0. All UTF8 constants in 
> classfiles, including the literal string "", are represented as Symbols.

Ah I see.
> How about
> 
> // Random, uninitialized values may appear in _body[0] and _body[1]
> // for Symbols of length 0 and 1. These random values never change during
> // the lifetime of the Symbol, and are read only by 
> Symbol::identity_hash(),

What if as_quoted_ascii() were called on the zero-length symbol? That 
reads _body[0] unconditionally. The same for use of base(). ??

Thanks,
David
-----

> // which would tolerate such randomness.
> 
> 
> Thanks
> - Ioi
>> Thanks,
>> David
>> -----
>>
>>> +  // contain random values, which will only be read by 
>>> Symbol::identity_hash(),
>>> +  // which would tolerate such randomness. These values never change 
>>> during the lifetime
>>> +  // of the Symbol.
>>>  }
>>>
>>>
>>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>>
>>> Thanks
>>> - Ioi
> 

From ioi.lam at oracle.com  Tue Jul 21 03:26:51 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Jul 2020 20:26:51 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <9894fafd-93bb-1c35-9b92-11f5f682eda2@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
 <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>
 <9894fafd-93bb-1c35-9b92-11f5f682eda2@oracle.com>
Message-ID: <6ee76d0a-4ca3-1ae7-f47f-e403ce90e969@oracle.com>


On 7/20/20 7:57 PM, David Holmes wrote:
> Hi Ioi,
>
> On 21/07/2020 12:50 pm, Ioi Lam wrote:
>>
>>
>> On 7/20/20 7:36 PM, David Holmes wrote:
>>> Hi Ioi,
>>>
>>> On 21/07/2020 10:12 am, Ioi Lam wrote:
>>>> Hi please review this very simple fix:
>>>>
>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 
>>>> -0700
>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 
>>>> -0700
>>>> @@ -51,8 +51,11 @@
>>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>    _hash_and_refcount = 
>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>    _length = length;
>>>> -  _body[0] = 0;  // in case length == 0
>>>>    memcpy(_body, name, length);
>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>>>> uninitialized and may
>>>
>>> Can we ever have a Symbol of length zero?
>>>
>>> If the Symbol name is length 1 then surely _body[0] is initialized 
>>> to the single character of that name?
>>>
>>> The change seems harmless given a zero length symbol is meaningless, 
>>> but the commentary just confuses things to me.
>>>
>>
>> Hi David,
>>
>> We can have a valid Symbol of length 0. All UTF8 constants in 
>> classfiles, including the literal string "", are represented as Symbols.
>
> Ah I see.
>> How about
>>
>> // Random, uninitialized values may appear in _body[0] and _body[1]
>> // for Symbols of length 0 and 1. These random values never change 
>> during
>> // the lifetime of the Symbol, and are read only by 
>> Symbol::identity_hash(),
>
> What if as_quoted_ascii() were called on the zero-length symbol? That 
> reads _body[0] unconditionally. The same for use of base(). ??

Hi David,

as_quoted_ascii() and base() do not read the contents of _body[0]. They 
take the address of _body[0]:

char* Symbol::as_quoted_ascii() const {
  const char *ptr = (const char *)&_body[0];
  int quoted_length = UTF8::quoted_ascii_length(ptr, utf8_length());
  char* result = NEW_RESOURCE_ARRAY(char, quoted_length + 1);
  UTF8::as_quoted_ascii(ptr, utf8_length(), result, quoted_length + 1);
  return result;
}

  const u1* base() const { return &_body[0]; }

Thanks
- Ioi
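
[Editorial sketch, not HotSpot code.] The distinction drawn above, between taking the address of _body[0] and reading its contents, can be shown with a simplified stand-in. SymbolSketch and count_byte are hypothetical names, and the fixed two-byte body is an assumption made only to keep the example small; the real Symbol layout differs.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for a length-prefixed symbol with an inline body.
struct SymbolSketch {
  unsigned short _length;
  unsigned char  _body[2];  // deliberately left uninitialized beyond _length

  SymbolSketch(const unsigned char* name, int length)
      : _length((unsigned short)length) {
    assert(name != nullptr && length >= 0 && length <= 2);
    memcpy(_body, name, length);  // copies nothing when length == 0
  }

  // Takes the address of the first body byte; does not read it.
  const unsigned char* base() const { return &_body[0]; }
  int utf8_length() const { return _length; }
};

// Counts bytes equal to c. Every index off base() is bounded by
// utf8_length(), so bytes that were never initialized are never read.
int count_byte(const SymbolSketch& s, unsigned char c) {
  int n = 0;
  const unsigned char* p = s.base();
  for (int i = 0; i < s.utf8_length(); i++) {
    if (p[i] == c) n++;
  }
  return n;
}
```

For a zero-length symbol the loop body never runs, so the random content of _body[0] is irrelevant to such a caller.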


>
> Thanks,
> David
> -----
>
>> // which would tolerate such randomness.
>>
>>
>> Thanks
>> - Ioi
>>> Thanks,
>>> David
>>> -----
>>>
>>>> +  // contain random values, which will only be read by 
>>>> Symbol::identity_hash(),
>>>> +  // which would tolerate such randomness. These values never 
>>>> change during the lifetime
>>>> +  // of the Symbol.
>>>>  }
>>>>
>>>>
>>>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>>>
>>>> Thanks
>>>> - Ioi
>>


From david.holmes at oracle.com  Tue Jul 21 03:59:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 13:59:37 +1000
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <6ee76d0a-4ca3-1ae7-f47f-e403ce90e969@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
 <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>
 <9894fafd-93bb-1c35-9b92-11f5f682eda2@oracle.com>
 <6ee76d0a-4ca3-1ae7-f47f-e403ce90e969@oracle.com>
Message-ID: <7ee62f02-c354-8a9c-96ee-4d020292c435@oracle.com>

On 21/07/2020 1:26 pm, Ioi Lam wrote:
> On 7/20/20 7:57 PM, David Holmes wrote:
>> Hi Ioi,
>>
>> On 21/07/2020 12:50 pm, Ioi Lam wrote:
>>>
>>>
>>> On 7/20/20 7:36 PM, David Holmes wrote:
>>>> Hi Ioi,
>>>>
>>>> On 21/07/2020 10:12 am, Ioi Lam wrote:
>>>>> Hi please review this very simple fix:
>>>>>
>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 
>>>>> -0700
>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 
>>>>> -0700
>>>>> @@ -51,8 +51,11 @@
>>>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>    _hash_and_refcount = 
>>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>>    _length = length;
>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>    memcpy(_body, name, length);
>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>>>>> uninitialized and may
>>>>
>>>> Can we ever have a Symbol of length zero?
>>>>
>>>> If the Symbol name is length 1 then surely _body[0] is initialized 
>>>> to the single character of that name?
>>>>
>>>> The change seems harmless given a zero length symbol is meaningless, 
>>>> but the commentary just confuses things to me.
>>>>
>>>
>>> Hi David,
>>>
>>> We can have a valid Symbol of length 0. All UTF8 constants in 
>>> classfiles, including the literal string "", are represented as Symbols.
>>
>> Ah I see.
>>> How about
>>>
>>> // Random, uninitialized values may appear in _body[0] and _body[1]
>>> // for Symbols of length 0 and 1. These random values never change 
>>> during
>>> // the lifetime of the Symbol, and are read only by 
>>> Symbol::identity_hash(),
>>
>> What if as_quoted_ascii() were called on the zero-length symbol? That 
>> reads _body[0] unconditionally. The same for use of base(). ??
> 
> Hi David,
> 
> as_quoted_ascii() and base() do not read the contents of _body[0]. They 
> take the address of _body[0]:
> 
> char* Symbol::as_quoted_ascii() const {
>   const char *ptr = (const char *)&_body[0];

Yep misread that one - sorry.

>   int quoted_length = UTF8::quoted_ascii_length(ptr, utf8_length());
>   char* result = NEW_RESOURCE_ARRAY(char, quoted_length + 1);
>   UTF8::as_quoted_ascii(ptr, utf8_length(), result, quoted_length + 1);
>   return result;
> }
> 
>   const u1* base() const { return &_body[0]; }

Sure but what about the callers of base()? Any indexing off base() 
should be guarded by a length() check of course, but it's hard to verify 
that.

How about:

// the lifetime of the Symbol, and should only be read by 
Symbol::identity_hash(),

Thanks,
David
-----

> Thanks
> - Ioi
> 
> 
> 
>>
>> Thanks,
>> David
>> -----
>>
>>> // which would tolerate such randomness.
>>>
>>>
>>> Thanks
>>> - Ioi
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> +  // contain random values, which will only be read by 
>>>>> Symbol::identity_hash(),
>>>>> +  // which would tolerate such randomness. These values never 
>>>>> change during the lifetime
>>>>> +  // of the Symbol.
>>>>>  }
>>>>>
>>>>>
>>>>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>
> 

From david.holmes at oracle.com  Tue Jul 21 04:02:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 14:02:37 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <5e298ff3-6dc1-c4fa-4545-1fc26d7379b5@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <5e298ff3-6dc1-c4fa-4545-1fc26d7379b5@oracle.com>
Message-ID: <ed6b90a1-82da-fb54-af1f-c7db761e2273@oracle.com>

Hi Vladimir,

On 21/07/2020 6:20 am, Vladimir Kozlov wrote:
> Hi David,
> 
> Changes look good.

Thanks for taking a look!

> On 7/20/20 10:07 AM, Daniel D. Daugherty wrote:
>  > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
 >      L1021:   HandleMark hm;
>  > Can this be 'hm(THREAD)'? (Not your problem, but while you're in that 
> file?)
> 
> There are several cases like this in jvmciCompilerToVM.cpp and may be in 
> other places.
> I think it should be done as separate clean up.

I think so too :) otherwise this just keeps growing and growing.

Thanks,
David
-----

> Thanks,
> Vladimir
> 
> On 7/19/20 10:53 PM, David Holmes wrote:
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>>
>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>>> wrote:
>>>>
>>>> Subject line got truncated by accident ...
>>>>
>>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>>> This is a simple cleanup that touches files across a number of VM 
>>>>> areas - hence the cross-post.
>>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>>> in jni.cpp we were using the following form of make_local:
>>>>> JNIHandles::make_local(env, obj);
>>>>> and what that form does is first extract the thread from the JNIEnv:
>>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>>> return thread->active_handles()->allocate_handle(obj);
>>>>> but there is also another, faster, variant for when you already 
>>>>> have the "thread":
>>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>>    return thread->active_handles()->allocate_handle(obj);
>>>>> }
>>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>>> from the JNIEnv:
>>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>>> and further defined:
>>>>>      Thread* THREAD = thread;
>>>>> so we always already have direct access to the "thread" available 
>>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>>> Along the way I spotted some related issues with unnecessary use of 
>>>>> Thread::current() when it is already available from TRAPS, and some 
>>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>>> later extract the thread from the JNIEnv.
>>>>> Testing: tiers 1 - 3
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/classfile/javaClasses.cpp
>>>   439     JNIEnv *env = thread->jni_environment();
>>>
>>> Since env is no longer used on the next line, move this down to where
>>> it is used, at line 444.
>>
>> Fixed.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/classfile/verifier.cpp
>>>   299   JNIEnv *env = thread->jni_environment();
>>>
>>> env now seems to only be used at line 320.  Move this closer.
>>
>> Fixed.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/prims/jni.cpp
>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>>
>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>> previously it just used "thread". Maybe this change shouldn't be made?
>>> Or can the other uses be changed to THREAD for consistency?
>>
>> "thread" and "THREAD" are interchangeable for anything expecting a 
>> "Thread*" (and somewhat surprisingly a number of APIs that only work 
>> for JavaThreads actually take a Thread*. :( ). I had a choice between 
>> trying to be file-wide consistent with the make_local calls, versus 
>> local-code consistent, and used THREAD as it is available in both 
>> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
>> "thread" for local consistency.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/prims/jvm.cpp
>>>
>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>> instead of "THREAD", even though other places nearby are using
>>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>>> easily avoidable.
>>
>> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
>> use "thread" instead. But I'm not sure it's a consistency worth 
>> pursuing at least as part of these changes (there are likely similar 
>> issues with most of the touched files).
>>
>> Thanks,
>> David
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
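
The difference between the two make_local forms described above can be
sketched with toy types. This is only an illustration, not HotSpot code:
HandleBlock, JNIEnv_, the int-based "handles" and the env_lookups counter
are all hypothetical stand-ins, and the real fast variant takes a Thread*
rather than a JavaThread*.

```cpp
#include <cassert>

// Toy stand-ins for HotSpot types; every name here is hypothetical.
struct HandleBlock {
  int allocated = 0;
  // Pretend to allocate a local handle and return its index.
  int allocate_handle(int obj) { (void)obj; return allocated++; }
};

struct JavaThread;

struct JNIEnv_ {
  JavaThread* owner;  // back-pointer used only by this sketch
};

struct JavaThread {
  HandleBlock handles;
  JNIEnv_ env{this};
  int env_lookups = 0;  // counts env -> thread conversions

  static JavaThread* thread_from_jni_environment(JNIEnv_* env) {
    env->owner->env_lookups++;
    return env->owner;
  }
  HandleBlock* active_handles() { return &handles; }
  JNIEnv_* jni_environment() { return &env; }
};

// Slower form: has to recover the thread from the env first.
int make_local(JNIEnv_* env, int obj) {
  JavaThread* thread = JavaThread::thread_from_jni_environment(env);
  return thread->active_handles()->allocate_handle(obj);
}

// Faster form: the caller already holds the thread.
int make_local(JavaThread* thread, int obj) {
  assert(thread != nullptr);
  return thread->active_handles()->allocate_handle(obj);
}
```

Inside an entry wrapper that has already derived "thread" from the env,
calling the second overload skips the extra conversion, which is the
point of the cleanup.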

From fweimer at redhat.com  Tue Jul 21 06:12:41 2020
From: fweimer at redhat.com (Florian Weimer)
Date: Tue, 21 Jul 2020 08:12:41 +0200
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com> (Ioi Lam's
 message of "Mon, 20 Jul 2020 17:12:31 -0700")
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
Message-ID: <87blk9tmzq.fsf@oldenburg2.str.redhat.com>

* Ioi Lam:

> Hi please review this very simple fix:
>
> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
> @@ -51,8 +51,11 @@
>  Symbol::Symbol(const u1* name, int length, int refcount) {
>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(),
> refcount);
>    _length = length;
> -  _body[0] = 0;  // in case length == 0
>    memcpy(_body, name, length);
> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are
> uninitialized and may
> +  // contain random values, which will only be read by
> Symbol::identity_hash(),
> +  // which would tolerate such randomness. These values never change
> during the lifetime
> +  // of the Symbol.
>  }

Won't this still trip memory debuggers?  Symbol::identity_hash() implies
that the result is eventually used in a conditional operation (a hash
comparison perhaps).  If it's possible one day to run Hotspot under
valgrind, this would result in false positives.

Thanks,
Florian


From ioi.lam at oracle.com  Tue Jul 21 06:14:00 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Jul 2020 23:14:00 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <7ee62f02-c354-8a9c-96ee-4d020292c435@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <582bfa41-c9e9-de73-d985-385d0c1ce9ae@oracle.com>
 <4ef694cd-adc3-ed1b-9835-b7ce9cf90b9b@oracle.com>
 <9894fafd-93bb-1c35-9b92-11f5f682eda2@oracle.com>
 <6ee76d0a-4ca3-1ae7-f47f-e403ce90e969@oracle.com>
 <7ee62f02-c354-8a9c-96ee-4d020292c435@oracle.com>
Message-ID: <410bbb95-64ec-0167-fed0-71da6a4c534e@oracle.com>


On 7/20/20 8:59 PM, David Holmes wrote:
> On 21/07/2020 1:26 pm, Ioi Lam wrote:
>> On 7/20/20 7:57 PM, David Holmes wrote:
>>> Hi Ioi,
>>>
>>> On 21/07/2020 12:50 pm, Ioi Lam wrote:
>>>>
>>>>
>>>> On 7/20/20 7:36 PM, David Holmes wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> On 21/07/2020 10:12 am, Ioi Lam wrote:
>>>>>> Hi please review this very simple fix:
>>>>>>
>>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 
>>>>>> 2020 -0700
>>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 
>>>>>> 2020 -0700
>>>>>> @@ -51,8 +51,11 @@
>>>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>>     _hash_and_refcount = 
>>>>>> pack_hash_and_refcount((short)os::random(), refcount);
>>>>>>     _length = length;
>>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>>     memcpy(_body, name, length);
>>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are 
>>>>>> uninitialized and may
>>>>>
>>>>> Can we ever have a Symbol of length zero?
>>>>>
>>>>> If the Symbol name is length 1 then surely _body[0] is initialized 
>>>>> to the single character of that name?
>>>>>
>>>>> The change seems harmless given a zero length symbol is 
>>>>> meaningless, but the commentary just confuses things to me.
>>>>>
>>>>
>>>> Hi David,
>>>>
>>>> We can have a valid Symbol of length 0. All UTF8 constants in 
>>>> classfiles, including the literal string "", are represented as 
>>>> Symbols.
>>>
>>> Ah I see.
>>>> How about
>>>>
>>>> // Random, uninitialized values may appear in _body[0] and _body[1]
>>>> // for Symbols of length 0 and 1. These random values never change 
>>>> during
>>>> // the lifetime of the Symbol, and are read only by 
>>>> Symbol::identity_hash(),
>>>
>>> What if as_quoted_ascii() were called on the zero-length symbol? 
>>> That reads _body[0] unconditionally. The same for use of base(). ??
>>
>> Hi David,
>>
>> as_quoted_ascii() and base() do not read the contents of _body[0]. 
>> They take the address of _body[0]:
>>
>> char* Symbol::as_quoted_ascii() const {
>>    const char *ptr = (const char *)&_body[0];
>
> Yep misread that one - sorry.
>
>>    int quoted_length = UTF8::quoted_ascii_length(ptr, utf8_length());
>>    char* result = NEW_RESOURCE_ARRAY(char, quoted_length + 1);
>>    UTF8::as_quoted_ascii(ptr, utf8_length(), result, quoted_length + 1);
>>    return result;
>> }
>>
>>    const u1* base() const { return &_body[0]; }
>
> Sure but what about the callers of base()? Any indexing off base() 
> should be guarded by a length() check of course, but it's hard to 
> verify that.

I think we are pretty safe -- _body[] is not zero-terminated, so if we 
have code that uses base() and goes out of bound, it will read from the 
next object (or debug padding) and will surely have caused some tests to 
fail.
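
The "guard any indexing off base() with a length() check" rule mentioned
above can be sketched as follows. ToySymbol and first_byte_or are
hypothetical names for this illustration only; the toy stores its bytes
in a std::vector rather than in a trailing _body[] array as the real
Symbol does.

```cpp
#include <cassert>
#include <vector>

// Toy model of a length-prefixed, non-zero-terminated symbol.
class ToySymbol {
  int _length;
  std::vector<unsigned char> _body;
public:
  ToySymbol(const unsigned char* name, int length)
    : _length(length), _body(name, name + length) {}
  int length() const { return _length; }
  // Returns the address of the first byte; does not read the contents.
  const unsigned char* base() const { return _body.data(); }
};

// Index off base() only after checking length(), so the body bytes of
// an empty symbol are never read.
int first_byte_or(const ToySymbol& s, int fallback) {
  assert(s.length() >= 0);
  return s.length() > 0 ? s.base()[0] : fallback;
}
```

For a zero-length symbol the helper returns the fallback without ever
dereferencing base(), which is the property a caller of base() needs to
preserve.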
>
> How about:
>
> // the lifetime of the Symbol, and should only be read by 
> Symbol::identity_hash(),
>

OK, I'll change the comment as you suggested.

Thanks
- Ioi

> Thanks,
> David
> -----
>
>> Thanks
>> - Ioi
>>
>>
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> // which would tolerate such randomness.
>>>>
>>>>
>>>> Thanks
>>>> - Ioi
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> +  // contain random values, which will only be read by 
>>>>>> Symbol::identity_hash(),
>>>>>> +  // which would tolerate such randomness. These values never 
>>>>>> change during the lifetime
>>>>>> +  // of the Symbol.
>>>>>>   }
>>>>>>
>>>>>>
>>>>>> Passed hs tiers 1/2. Running tiers 3/4 now.
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>
>>


From thomas.stuefe at gmail.com  Tue Jul 21 06:16:06 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 21 Jul 2020 08:16:06 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
Message-ID: <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>

Hi David,

On Tue, Jul 21, 2020 at 12:06 AM David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> On 21/07/2020 12:17 am, Thomas Stüfe wrote:
> > Hi David,
> >
> >
> > On Mon, Jul 20, 2020 at 2:54 PM David Holmes <david.holmes at oracle.com
> > <mailto:david.holmes at oracle.com>> wrote:
> >
> >     Hi Thomas,
> >
> >     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> >      > Hi,
> >      >
> >      > could I please have reviews for this very trivial patch?
> >      >
> >      > I found that gtest ignores invalid jvm options randomly since it
> >     relies on
> >      > an uninitialized variable.
> >
> >     Is it really random? I would have expected it to be basically always
> >     non-zero and hence always ignore unknown options.
> >
> >
> > I remember seeing cmdline option errors from gtest, and since this
> > coding is very old, this must be either my failing memory or the random
> > content of the underlying uninitialized stack memory.
>
> Yes but there is only one value of that uninitialized memory (zero) that
> would cause unrecognised options to not be ignored.
>
>
So you are saying that it was in the past more probable that options were
ignored than that they were heeded.

I am saying that they were occasionally not ignored. If that were bad we
would have seen sporadic errors in the past. Especially as this is a day
zero bug, there since integration of gtests in 2016.


> >     I'm not at all clear
> >     if running of the gtests might rely on always ignoring
> >     unexpected/unknown flags -
> >
> >
> > It does not. googletest fishes its own arguments from the initial
> > argument vector and passes the rest off to the JVM. So the JVM ignoring
> > or not ignoring arguments will not change anything.
> >
> >     does it have the capability to distinguish
> >     product and non-product test runs?
> >
> >
> > Gtests are hotspot coding and there are sections running with #ifdef
> > ASSERT, though I am not sure why that would matter?
>
> For the same reason we sometimes need to @require that a VM is a debug
> VM with a jtreg test - because the flag passed in is a non-product flag.
> Also when we run test suites we often pass
> -XX:+IgnoreUnrecognisedVMOptions, again so that non-product flags don't
> cause a failure with release bits.
>
>
Okay, I understand.


> >
> >     I think this needs wider review from people familiar with how our
> gtest
> >     tests are run. (I have no idea - I never use it.)
> >
> >
> > I am quite familiar with it since I use it almost daily. I really depend
> > on it being able to interpret JVM options.
> >
> > In fact I am a bit dismayed by this bug since I write tons of tests for
> > JEP387 and was feeling very smug about the tests running through in all
> > my test scenarios, only to find that since days I keep running the same
> > - default - scenario over and over again :( No, that should be really
> fixed.
>
> I don't understand what you mean. This setting should only affect what
> happens with unrecognised "bad" arguments (as per your subject).
>
>
It hides errors. Let's say I want to run metaspace gtests with a policy
different from the standard, I'd specify

gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none

but if I mistype the option - which happened to me - the VM ignores it
silently and the tests run in default settings, all being green. Which gave
me a false feeling of security until much later, when I found during
debugging that the setting had been ignored.

The correct behaviour would have been for gtestLauncher to give me an
"Unrecognized VM option" error right away.

> Gtestlauncher is called as part of the jtreg tests by the GtestWrapper,
> > but that does not pass any options to it.
>
> Pardon my ignorance but how does one specify options for a gtest then?
>
>
You just pass them on the command line. One can intermix them freely with
the googletest options and those of the gtestlauncher itself.
For example:

./hotspot/variant-server/libjvm/gtest/gtestLauncher -Xlog:metaspace*
-jdk:./images/jdk/  --gtest_filter=metaspace.*
-XX:MetaspaceReclaimStrategy=none

-jdk gets consumed by our gtestLauncher itself
-any one of the gtest_... options get eaten by the googletest framework
-Xlog and the -XX options are the remaining VM options and get handed to
the VM
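
The three-way split described above can be sketched roughly like this.
It is only an illustration of the idea, not the gtestLauncher source:
the ArgOwner enum, both function names and the exact prefix matching are
assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ArgOwner { Launcher, GoogleTest, VM };

// Classify one command-line argument by its prefix (assumed prefixes:
// "-jdk" for the launcher, "--gtest_" for googletest).
ArgOwner classify_arg(const std::string& arg) {
  if (arg.compare(0, 4, "-jdk") == 0) return ArgOwner::Launcher;
  if (arg.compare(0, 8, "--gtest_") == 0) return ArgOwner::GoogleTest;
  return ArgOwner::VM;  // everything else is handed to the VM
}

// Collect only the arguments that would reach the VM, in order.
std::vector<std::string> vm_args(const std::vector<std::string>& argv) {
  std::vector<std::string> out;
  for (const std::string& a : argv) {
    if (classify_arg(a) == ArgOwner::VM) out.push_back(a);
  }
  return out;
}
```

With the example command line, only -Xlog:metaspace* and
-XX:MetaspaceReclaimStrategy=none are left for the VM, so a mistyped -XX
option is exactly the kind of argument the VM would have to reject.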


> > But seriously, if there are tests which pass options to it, they
> > probably want those options to do something in the jvm, so ignoring them
> > silently is not good.
>
> Again only "bad" options are ignored - which should only impact use of
> non-product flags with product bits.
>
>
Same reasoning as above.


> David
> -----
>
>
My reasoning is this:

launcher should not ignore unrecognized VM options because whoever
specifies them expects them to arrive in the VM and do something. In case
this patch shakes loose a scenario where someone specified the wrong option
he/she should look at that case. Either it was wrong to specify the option,
or more probably it was a mistype like it happened to me.

As for how that could happen: ATM I see three invocations of gtestLauncher
in our sources.

- from make files when one runs tests via make, make/RunTests.gmk. There,
we can specify VM options via command line similar to what I do above.
Though I am not sure this even works: the associated make variable
GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway, this case is to my
knowledge only executed manually and so the same reasoning applies: whoever
does this probably does not want a mistyped VM option silently ignored.
- from the one jtreg test which runs the gtests. It does not specify any VM
options.
- from a vscode IDE integration script which does not either

Of course I do not know what Oracle does internally and whether they rely
on bad options being ignored in some scripts somewhere.

--

So this is a bit of an impasse. Who do you suggest should look at this? I
am not sure how to proceed.

Cheers, Thomas


> > Thanks, Thomas
> >
> >
> >     Thanks,
> >     David
> >
> >      > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> >      > webrev:
> >      >
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> >      >
> >      > Thanks, Thomas
> >      >
> >
>

From ioi.lam at oracle.com  Tue Jul 21 06:24:58 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Jul 2020 23:24:58 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
Message-ID: <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>


On 7/20/20 11:12 PM, Florian Weimer wrote:
> * Ioi Lam:
>
>> Hi please review this very simple fix:
>>
>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>> @@ -51,8 +51,11 @@
>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>     _hash_and_refcount = pack_hash_and_refcount((short)os::random(),
>> refcount);
>>     _length = length;
>> -  _body[0] = 0;  // in case length == 0
>>     memcpy(_body, name, length);
>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are
>> uninitialized and may
>> +  // contain random values, which will only be read by
>> Symbol::identity_hash(),
>> +  // which would tolerate such randomness. These values never change
>> during the lifetime
>> +  // of the Symbol.
>>   }
> Won't this still trip memory debuggers?  Symbol::identity_hash() implies
> that the result is eventually used in a conditional operation (a hash
> comparison perhaps).  If it's possible one day to run Hotspot under
> valgrind, this would result in false positives.

Are you saying that valgrind will modify uninitialized memory 
periodically after the constructor has returned, and thus will cause 
Symbol::identity_hash() to return a different value?

Without my patch, _body[1] is uninitialized for Symbols whose length is 
0 or 1. We have not heard of any issues related to valgrind and 
Symbol::identity_hash().

In fact, looking at the code history, the setting of "_body[0] = 0" in 
Symbol::Symbol was introduced only recently (Feb 2020):

http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55

I'll check with Lois who added the code to see the reason for doing it.


Thanks
- Ioi

> Thanks,
> Florian
>


From david.holmes at oracle.com  Tue Jul 21 07:27:46 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 17:27:46 +1000
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
Message-ID: <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>

Hi Thomas,

I'm running this through our tier 1-3 testing to see if it exposes any 
issue. If not then let's proceed unless someone else chimes in. I'll also 
flag this RFR internally.

To be clear my concern was that under the current uninitialized code it 
would more often act as

args.ignoreUnrecognized = JNI_TRUE;

than

args.ignoreUnrecognized = JNI_FALSE;

and so your change may perturb existing testing regimes.
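
To illustrate the two behaviours being compared: a minimal sketch,
assuming the launcher fills in a JavaVMInitArgs-like struct on the
stack. The Toy* types below only mirror the shape of the JNI invocation
types so the example stands alone (a real launcher would include
<jni.h>), and make_init_args is a hypothetical helper, not the patch
under review.

```cpp
#include <cassert>

// Hypothetical stand-ins mirroring the JNI invocation structs.
typedef unsigned char toy_jboolean;
const toy_jboolean kToyFalse = 0;
const toy_jboolean kToyTrue = 1;

struct ToyVMOption {
  char* optionString;
  void* extraInfo;
};

struct ToyVMInitArgs {
  int version;
  int nOptions;
  ToyVMOption* options;
  toy_jboolean ignoreUnrecognized;
};

// Build the init args with every field assigned, so the behaviour never
// depends on whatever happened to be on the stack beforehand.
ToyVMInitArgs make_init_args(ToyVMOption* options, int n) {
  assert(n >= 0);
  ToyVMInitArgs args = ToyVMInitArgs();  // value-initialize: all zero
  args.version = 0x00010008;             // value of JNI_VERSION_1_8
  args.nOptions = n;
  args.options = options;
  args.ignoreUnrecognized = kToyFalse;   // reject unknown VM options
  return args;
}
```

If the field is left unassigned, any non-zero leftover byte acts like
JNI_TRUE, which is why the uninitialized code would mostly have ignored
unknown options.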

Sorry to belabour this.

Thanks,
David
-----

On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
> Hi David,
> 
> On Tue, Jul 21, 2020 at 12:06 AM David Holmes <david.holmes at oracle.com 
> <mailto:david.holmes at oracle.com>> wrote:
> 
>     Hi Thomas,
> 
>     On 21/07/2020 12:17 am, Thomas Stüfe wrote:
>      > Hi David,
>      >
>      >
>      > On Mon, Jul 20, 2020 at 2:54 PM David Holmes
>     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>      > <mailto:david.holmes at oracle.com
>     <mailto:david.holmes at oracle.com>>> wrote:
>      >
>      >     Hi Thomas,
>      >
>      >     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
>      >      > Hi,
>      >      >
>      >      > could I please have reviews for this very trivial patch?
>      >      >
>      >      > I found that gtest ignores invalid jvm options randomly
>     since it
>      >     relies on
>      >      > an uninitialized variable.
>      >
>      >     Is it really random? I would have expected it to be basically
>     always
>      >     non-zero and hence always ignore unknown options.
>      >
>      >
>      > I remember seeing cmdline option errors from gtest, and since this
>      > coding is very old, this must be either my failing memory or the
>     random
>      > content of the underlying uninitialized stack memory.
> 
>     Yes but there is only one value of that uninitialized memory (zero)
>     that
>     would cause unrecognised options to not be ignored.
> 
> 
> So you are saying that it was in the past more probable that options 
> were ignored than that they were heeded.
> 
> I am saying that they were occasionally not ignored. If that were bad we 
> would have seen sporadic errors in the past. Especially as this is a day 
> zero bug, there since integration of gtests in 2016.
> 
>      >     I'm not at all clear
>      >     if running of the gtests might rely on always ignoring
>      >     unexpected/unknown flags -
>      >
>      >
>      > It does not. googletest fishes its own arguments from the initial
>      > argument vector and passes the rest off to the JVM. So the JVM
>     ignoring
>      > or not ignoring arguments will not change anything.
>      >
>      >     does it have the capability to distinguish
>      >     product and non-product test runs?
>      >
>      >
>      > Gtests are hotspot coding and there are sections running with #ifdef
>      > ASSERT, though I am not sure why that would matter?
> 
>     For the same reason we sometimes need to @require that a VM is a debug
>     VM with a jtreg test - because the flag passed in is a non-product
>     flag.
>     Also when we run test suites we often pass
>     -XX:+IgnoreUnrecognisedVMOptions, again so that non-product flags don't
>     cause a failure with release bits.
> 
> 
> Okay, I understand.
> 
>      >
>      >     I think this needs wider review from people familiar with how
>     our gtest
>      >     tests are run. (I have no idea - I never use it.)
>      >
>      >
>      > I am quite familiar with it since I use it almost daily. I really
>     depend
>      > on it being able to interpret JVM options.
>      >
>      > In fact I am a bit dismayed by this bug since I write tons of
>     tests for
>      > JEP387 and was feeling very smug about the tests running through
>     in all
>      > my test scenarios, only to find that since days I keep running
>     the same
>      > - default - scenario over and over again :( No, that should be
>     really fixed.
> 
>     I don't understand what you mean. This setting should only affect what
>     happens with unrecognised "bad" arguments (as per your subject).
> 
> 
> It hides errors. Let's say I want to run metaspace gtests with a policy 
> different from the standard, I'd specify
> 
> gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
> 
> but if I mistype the option - which happened to me - the VM ignores it 
> silently and the tests run in default settings, all being green. Which 
> gave me a false feeling of security until much later, when I found 
> during debugging that the setting had been ignored.
> The correct behaviour would have been for gtestLauncher to give me an 
> "Unrecognized VM option" error right away.
> 
>      > Gtestlauncher is called as part of the jtreg tests by the
>     GtestWrapper,
>      > but that does not pass any options to it.
> 
>     Pardon my ignorance but how does one specify options for a gtest then?
> 
> 
> You just pass them on the command line. One can intermix them freely 
> with the googletest options and those of the gtestlauncher itself.
> For example:
> 
> ./hotspot/variant-server/libjvm/gtest/gtestLauncher -Xlog:metaspace* 
> -jdk:./images/jdk/  --gtest_filter=metaspace.* 
> -XX:MetaspaceReclaimStrategy=none
> 
> -jdk gets consumed by our gtestLauncher itself
> -any one of the gtest_... options get eaten by the googletest framework
> -Xlog and the -XX options are the remaining VM options and get handed to 
> the VM
> 
>      > But seriously, if there are tests which pass options to it, they
>      > probably want those options to do something in the jvm, so
>     ignoring them
>      > silently is not good.
> 
>     Again only "bad" options are ignored - which should only impact use of
>     non-product flags with product bits.
> 
> 
> Same reasoning as above.
> 
>     David
>     -----
> 
> 
> My reasoning is this:
> 
> launcher should not ignore unrecognized VM options because whoever 
> specifies them expects them to arrive in the VM and do something. In 
> case this patch shakes loose a scenario where someone specified the 
> wrong option he/she should look at that case. Either it was wrong to 
> specify the option, or more probably it was a mistype like it happened 
> to me.
> 
> As for how could that happen. ATM I see in our sources three invocations 
> of gtestLauncher.
> 
> - from make files when one runs tests via make, make/RunTests.gmk. 
> There, we can specify VM options via command line similar to what I do 
> above. Though I am not sure this even works: the associated make 
> variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway, this 
> case is to my knowledge only executed manually and so the same reasoning 
> applies: whoever does this probably does not want a mistyped VM option 
> silently ignored.
> - from the one jtreg test which runs the gtests. It does not specify any 
> VM options.
> - from a vscode IDE integration script which does not either
> 
> Of course I do not know what Oracle does internally and whether they 
> rely on bad options being ignored in some scripts somewhere.
> 
> --
> 
> So this is a bit of an impasse. Who do you suggest should look at this? 
> I am not sure how to proceed.
> 
> Cheers, Thomas
> 
>      > Thanks, Thomas
>      >
>      >
>      >     Thanks,
>      >     David
>      >
>      >      > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
>      >      > webrev:
>      >      >
>      >
>     http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
>      >      >
>      >      > Thanks, Thomas
>      >      >
>      >
> 

From thomas.stuefe at gmail.com  Tue Jul 21 08:35:09 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 21 Jul 2020 10:35:09 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
 <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
Message-ID: <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>

On Tue, Jul 21, 2020 at 9:32 AM David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> I'm running this through our tier 1-3 testing to see if it exposes any
> issue. If not then let's proceed unless someone else chimes in. I'll also
> flag this RFR internally.
>

Thank you David!


>
> To be clear my concern was that under the current uninitialized code it
> would more often act as
>
> args.ignoreUnrecognized = JNI_TRUE;
>
> than
>
> args.ignoreUnrecognized = JNI_FALSE;
>
> and so your change may perturb existing testing regimes.
>
>
I understand.


> Sorry to belabour this.
>
>
No problem at all, that is why we do Reviews.

Thanks Thomas

> Thanks,
> David
> -----
>
> On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
> > Hi David,
> >
> > On Tue, Jul 21, 2020 at 12:06 AM David Holmes <david.holmes at oracle.com
> > <mailto:david.holmes at oracle.com>> wrote:
> >
> >     Hi Thomas,
> >
> >     On 21/07/2020 12:17 am, Thomas Stüfe wrote:
> >      > Hi David,
> >      >
> >      >
> >      > On Mon, Jul 20, 2020 at 2:54 PM David Holmes
> >     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> >      > <mailto:david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>>> wrote:
> >      >
> >      >     Hi Thomas,
> >      >
> >      >     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> >      >      > Hi,
> >      >      >
> >      >      > could I please have reviews for this very trivial patch?
> >      >      >
> >      >      > I found that gtest ignores invalid jvm options randomly
> >     since it
> >      >     relies on
> >      >      > an uninitialized variable.
> >      >
> >      >     Is it really random? I would have expected it to be basically
> >     always
> >      >     non-zero and hence always ignore unknown options.
> >      >
> >      >
> >      > I remember seeing cmdline option errors from gtest, and since this
> >      > coding is very old, this must be either my failing memory or the
> >     random
> >      > content of the underlying uninitialized stack memory.
> >
> >     Yes but there is only one value of that uninitialized memory (zero)
> >     that
> >     would cause unrecognised options to not be ignored.
> >
> >
> > So you are saying that it was in the past more probable that options
> > were ignored than that they were heeded.
> >
> > I am saying that they were occasionally not ignored. If that were bad we
> > would have seen sporadic errors in the past. Especially as this is a day
> > zero bug, there since integration of gtests in 2016.
> >
> >      >     I'm not at all clear
> >      >     if running of the gtests might rely on always ignoring
> >      >     unexpected/unknown flags -
> >      >
> >      >
> >      > It does not. googletest fishes its own arguments from the initial
> >      > argument vector and passes the rest off to the JVM. So the JVM
> >     ignoring
> >      > or not ignoring arguments will not change anything.
> >      >
> >      >     does it have the capability to distinguish
> >      >     product and non-product test runs?
> >      >
> >      >
> >      > Gtests are hotspot coding and there are sections running with
> #ifdef
> >      > ASSERT, though I am not sure why that would matter?
> >
> >     For the same reason we sometimes need to @require that a VM is a
> debug
> >     VM with a jtreg test - because the flag passed in is a non-product
> >     flag.
> >     Also when we run test suites we often pass
> >     -XX:+IgnoreUnrecognisedVMOptions, again so that non-product flags
> don't
> >     cause a failure with release bits.
> >
> >
> > Okay, I understand.
> >
> >      >
> >      >     I think this needs wider review from people familiar with how
> >      >     our gtest tests are run. (I have no idea - I never use it.)
> >      >
> >      >
> >      > I am quite familiar with it since I use it almost daily. I really
> >      > depend on it being able to interpret JVM options.
> >      >
> >      > In fact I am a bit dismayed by this bug since I write tons of
> >      > tests for JEP387 and was feeling very smug about the tests running
> >      > through in all my test scenarios, only to find that for days I have
> >      > kept running the same - default - scenario over and over again :(
> >      > No, that should really be fixed.
> >
> >     I don't understand what you mean. This setting should only affect
> >     what happens with unrecognised "bad" arguments (as per your subject).
> >
> >
> > It hides errors. Let's say I want to run metaspace gtests with a policy
> > different from the standard, I'd specify
> >
> > gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
> >
> > but if I mistype the option - which happened to me - the VM ignores it
> > silently and the tests run in default settings, all being green. Which
> > gave me a false feeling of security until, much later, I found
> > during debugging that the setting had been ignored.
> > The correct behaviour would have been for gtestLauncher to give me an
> > "Unrecognized VM option" error right away.
> >
> >      > Gtestlauncher is called as part of the jtreg tests by the
> >      > GtestWrapper, but that does not pass any options to it.
> >
> >     Pardon my ignorance but how does one specify options for a gtest
> >     then?
> >
> >
> > You just pass them on the command line. One can intermix them freely
> > with the googletest options and those of the gtestlauncher itself.
> > For example:
> >
> > ./hotspot/variant-server/libjvm/gtest/gtestLauncher -Xlog:metaspace*
> > -jdk:./images/jdk/  --gtest_filter=metaspace.*
> > -XX:MetaspaceReclaimStrategy=none
> >
> > - -jdk gets consumed by our gtestLauncher itself
> > - any one of the gtest_... options gets eaten by the googletest framework
> > - -Xlog and the -XX options are the remaining VM options and get handed
> >   to the VM
> >
> >      > But seriously, if there are tests which pass options to it, they
> >      > probably want those options to do something in the jvm, so
> >      > ignoring them silently is not good.
> >
> >     Again only "bad" options are ignored - which should only impact use
> >     of non-product flags with product bits.
> >
> >
> > Same reasoning as above.
> >
> >     David
> >     -----
> >
> >
> > My reasoning is this:
> >
> > launcher should not ignore unrecognized VM options because whoever
> > specifies them expects them to arrive in the VM and do something. In
> > case this patch shakes loose a scenario where someone specified the
> > wrong option he/she should look at that case. Either it was wrong to
> > specify the option, or more probably it was a mistype like it happened
> > to me.
> >
> > As for how could that happen. ATM I see in our sources three invocations
> > of gtestLauncher.
> >
> > - from make files when one runs tests via make, make/RunTests.gmk.
> > There, we can specify VM options via command line similar to what I do
> > above. Though I am not sure this even works: the associated make
> > variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway, this
> > case is to my knowledge only executed manually and so the same reasoning
> > applies: whoever does this probably does not want a mistyped VM option
> > silently ignored.
> > - from the one jtreg test which runs the gtests. It does not specify any
> > VM options.
> > - from a vscode IDE integration script which does not either
> >
> > Of course I do not know what Oracle does internally and whether they
> > rely on bad options being ignored in some scripts somewhere.
> >
> > --
> >
> > So this is a bit of an impasse. Who do you suggest should look at this?
> > I am not sure how to proceed.
> >
> > Cheers, Thomas
> >
> >      > Thanks, Thomas
> >      >
> >      >
> >      >     Thanks,
> >      >     David
> >      >
> >      >      > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> >      >      > webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> >      >      >
> >      >      > Thanks, Thomas
> >      >      >
> >      >
> >
>

From fweimer at redhat.com  Tue Jul 21 10:43:17 2020
From: fweimer at redhat.com (Florian Weimer)
Date: Tue, 21 Jul 2020 12:43:17 +0200
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com> (Ioi Lam's
 message of "Mon, 20 Jul 2020 23:24:58 -0700")
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
Message-ID: <87a6ztrvwa.fsf@oldenburg2.str.redhat.com>

* Ioi Lam:

> On 7/20/20 11:12 PM, Florian Weimer wrote:
>> * Ioi Lam:
>>
>>> Hi please review this very simple fix:
>>>
>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>> @@ -51,8 +51,11 @@
>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>    _length = length;
>>> -  _body[0] = 0;  // in case length == 0
>>>    memcpy(_body, name, length);
>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>> +  // of the Symbol.
>>>  }
>> Won't this still trip memory debuggers?  Symbol::identity_hash() implies
>> that the result is eventually used in a conditional operation (a hash
>> comparison perhaps).  If it's possible one day to run Hotspot under
>> valgrind, this would result in false positives.
>
> Are you saying that valgrind will modify uninitialized memory
> periodically after the constructor has returned, and thus will cause
> Symbol::identity_hash() to return a different value?

No, it flags conditional branches that (indirectly) depend on
uninitialized memory.  But the chosen program path is deterministic.
Intentionally uninitialized values add clutter to the diagnostic output.

Things break for real if the compiler can observe that uninitialized
memory is read and treats this as an indicator that this execution path
is not taken, although compiler behavior varies in this area,
especially for byte accesses.

Thanks,
Florian
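
The concern above can be illustrated with a minimal C++ sketch. MiniSymbol,
its two-byte body and same_hash() are hypothetical stand-ins and not
HotSpot's Symbol class; the sketch keeps the defensive initialization, so
every byte the hash reads has a defined value.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a symbol with a short inline body; this is
// not HotSpot's Symbol class, only an illustration of the pattern.
struct MiniSymbol {
  std::uint16_t length;
  std::uint8_t  body[2];

  MiniSymbol(const std::uint8_t* name, std::uint16_t len) : length(len) {
    // Defensive variant: give both bytes a defined value first, so that
    // symbols of length 0 or 1 never expose indeterminate memory.
    body[0] = 0;
    body[1] = 0;
    const std::size_t n = len < 2 ? len : 2;
    if (n > 0) {
      std::memcpy(body, name, n);
    }
  }

  // Reads body[0] and body[1] unconditionally, regardless of length.
  std::uint32_t identity_hash() const {
    return (static_cast<std::uint32_t>(body[0]) << 8) | body[1];
  }
};

// A branch that depends on the hash. If body[] were left uninitialized,
// a checker such as valgrind's memcheck would report this comparison as
// a conditional jump depending on uninitialised values.
inline bool same_hash(const MiniSymbol& a, const MiniSymbol& b) {
  if (a.identity_hash() == b.identity_hash()) {
    return true;
  }
  return false;
}
```

Dropping the two zeroing stores would leave the program path deterministic
for a given memory image, but the comparison in same_hash() would then
depend on bytes that were never written, which is the kind of diagnostic
clutter described above.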


From david.holmes at oracle.com  Tue Jul 21 12:09:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Jul 2020 22:09:37 +1000
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
 <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
 <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>
Message-ID: <b94578c8-cdb2-0fb5-2fa0-6af7efd5a518@oracle.com>

On 21/07/2020 6:35 pm, Thomas Stüfe wrote:
> 
> 
> On Tue, Jul 21, 2020 at 9:32 AM David Holmes <david.holmes at oracle.com> wrote:
> 
>     Hi Thomas,
> 
>     I'm running this through our tier 1-3 testing to see if it exposes any
>     issue. If not then let's proceed unless someone else chimes in. I'll
>     also flag this RFR internally.
> 
> 
> Thank you David!

Tests all clear.

Cheers,
David

> 
>     To be clear my concern was that under the current uninitialized code it
>     would more often act as
> 
>     args.ignoreUnrecognized = JNI_TRUE;
> 
>     than
> 
>     args.ignoreUnrecognized = JNI_FALSE;
> 
>     and so your change may perturb existing testing regimes.
> 
> 
> I understand.
> 
>     Sorry to belabour this.
> 
> 
> No problem at all, that is why we do Reviews.
> Thanks Thomas
> 
>     Thanks,
>     David
>     -----
> 
>     On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
>      > Hi David,
>      >
>      > On Tue, Jul 21, 2020 at 12:06 AM David Holmes
>      > <david.holmes at oracle.com> wrote:
>      >
>      >     Hi Thomas,
>      >
>      >     On 21/07/2020 12:17 am, Thomas Stüfe wrote:
>      >      > Hi David,
>      >      >
>      >      >
>      >      > On Mon, Jul 20, 2020 at 2:54 PM David Holmes
>      >      > <david.holmes at oracle.com> wrote:
>      >      >
>      >      >     Hi Thomas,
>      >      >
>      >      >     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
>      >      >      > Hi,
>      >      >      >
>      >      >      > could I please have reviews for this very trivial
>      >      >      > patch?
>      >      >      >
>      >      >      > I found that gtest ignores invalid jvm options randomly
>      >      >      > since it relies on an uninitialized variable.
>      >      >
>      >      >     Is it really random? I would have expected it to be
>      >      >     basically always non-zero and hence always ignore
>      >      >     unknown options.
>      >      >
>      >      >
>      >      > I remember seeing cmdline option errors from gtest, and
>      >      > since this coding is very old, this must be either my
>      >      > failing memory or the random content of the underlying
>      >      > uninitialized stack memory.
>      >
>      >     Yes but there is only one value of that uninitialized memory
>      >     (zero) that would cause unrecognised options to not be ignored.
>      >
>      >
>      > So you are saying that it was in the past more probable that options
>      > were ignored than that they were heeded.
>      >
>      > I am saying that they were occasionally not ignored. If that were
>     bad we
>      > would have seen sporadic errors in the past. Especially as this
>     is a day
>      > zero bug, there since integration of gtests in 2016.
>      >
>      >      >     I'm not at all clear
>      >      >     if running of the gtests might rely on always ignoring
>      >      >     unexpected/unknown flags -
>      >      >
>      >      >
>      >      > It does not. googletest fishes its own arguments from the
>      >      > initial argument vector and passes the rest off to the JVM.
>      >      > So the JVM ignoring or not ignoring arguments will not
>      >      > change anything.
>      >      >
>      >      >     does it have the capability to distinguish
>      >      >     product and non-product test runs?
>      >      >
>      >      >
>      >      > Gtests are hotspot coding and there are sections running
>      >      > with #ifdef ASSERT, though I am not sure why that would matter?
>      >
>      >     For the same reason we sometimes need to @require that a VM
>      >     is a debug VM with a jtreg test - because the flag passed in
>      >     is a non-product flag.
>      >     Also when we run test suites we often pass
>      >     -XX:+IgnoreUnrecognizedVMOptions, again so that non-product
>      >     flags don't cause a failure with release bits.
>      >
>      >
>      > Okay, I understand.
>      >
>      >      >
>      >      >     I think this needs wider review from people familiar
>      >      >     with how our gtest tests are run. (I have no idea - I
>      >      >     never use it.)
>      >      >
>      >      >
>      >      > I am quite familiar with it since I use it almost daily. I
>      >      > really depend on it being able to interpret JVM options.
>      >      >
>      >      > In fact I am a bit dismayed by this bug since I write tons of
>      >      > tests for JEP387 and was feeling very smug about the tests
>      >      > running through in all my test scenarios, only to find that
>      >      > for days I have kept running the same - default - scenario
>      >      > over and over again :( No, that should really be fixed.
>      >
>      >     I don't understand what you mean. This setting should only
>      >     affect what happens with unrecognised "bad" arguments (as per
>      >     your subject).
>      >
>      >
>      > It hides errors. Let's say I want to run metaspace gtests with a
>     policy
>      > different from the standard, I'd specify
>      >
>      > gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
>      >
>      > but if I mistype the option - which happened to me - the VM
>     ignores it
>      > silently and the tests run in default settings, all being green.
>     Which
>      > gave me a false feeling of security until, much later, I found
>      > during debugging that the setting had been ignored.
>      > The correct behaviour would have been for gtestLauncher to give
>     me an
>      > "Unrecognized VM option" error right away.
>      >
>      >      > Gtestlauncher is called as part of the jtreg tests by the
>      >      > GtestWrapper, but that does not pass any options to it.
>      >
>      >     Pardon my ignorance but how does one specify options for a
>      >     gtest then?
>      >
>      >
>      > You just pass them on the command line. One can intermix them freely
>      > with the googletest options and those of the gtestlauncher itself.
>      > For example:
>      >
>      > ./hotspot/variant-server/libjvm/gtest/gtestLauncher -Xlog:metaspace*
>      > -jdk:./images/jdk/  --gtest_filter=metaspace.*
>      > -XX:MetaspaceReclaimStrategy=none
>      >
>      > - -jdk gets consumed by our gtestLauncher itself
>      > - any one of the gtest_... options gets eaten by the googletest
>      >   framework
>      > - -Xlog and the -XX options are the remaining VM options and get
>      >   handed to the VM
>      >
>      >      > But seriously, if there are tests which pass options to
>      >      > it, they probably want those options to do something in the
>      >      > jvm, so ignoring them silently is not good.
>      >
>      >     Again only "bad" options are ignored - which should only
>      >     impact use of non-product flags with product bits.
>      >
>      >
>      > Same reasoning as above.
>      >
>      >     David
>      >     -----
>      >
>      >
>      > My reasoning is this:
>      >
>      > launcher should not ignore unrecognized VM options because whoever
>      > specifies them expects them to arrive in the VM and do something. In
>      > case this patch shakes loose a scenario where someone specified the
>      > wrong option he/she should look at that case. Either it was wrong to
>      > specify the option, or more probably it was a mistype like it
>     happened
>      > to me.
>      >
>      > As for how could that happen. ATM I see in our sources three
>     invocations
>      > of gtestLauncher.
>      >
>      > - from make files when one runs tests via make, make/RunTests.gmk.
>      > There, we can specify VM options via command line similar to what
>     I do
>      > above. Though I am not sure this even works: the associated make
>      > variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway, this
>      > case is to my knowledge only executed manually and so the same
>     reasoning
>      > applies: whoever does this probably does not want a mistyped VM
>     option
>      > silently ignored.
>      > - from the one jtreg test which runs the gtests. It does not
>     specify any
>      > VM options.
>      > - from a vscode IDE integration script which does not either
>      >
>      > Of course I do not know what Oracle does internally and whether they
>      > rely on bad options being ignored in some scripts somewhere.
>      >
>      > --
>      >
>      > So this is a bit of an impasse. Who do you suggest should look at
>     this?
>      > I am not sure how to proceed.
>      >
>      > Cheers, Thomas
>      >
>      >      > Thanks, Thomas
>      >      >
>      >      >
>      >      >     Thanks,
>      >      >     David
>      >      >
>      >      >      > issue: https://bugs.openjdk.java.net/browse/JDK-8249748
>      >      >      > webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
>      >      >      >
>      >      >      > Thanks, Thomas
>      >      >      >
>      >      >
>      >
> 

From thomas.stuefe at gmail.com  Tue Jul 21 12:13:41 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 21 Jul 2020 14:13:41 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <b94578c8-cdb2-0fb5-2fa0-6af7efd5a518@oracle.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
 <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
 <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>
 <b94578c8-cdb2-0fb5-2fa0-6af7efd5a518@oracle.com>
Message-ID: <CAA-vtUz0YQEGW60bo8gu4OaF_0O6TrLOzCbkeieD09i9BujkNA@mail.gmail.com>

thanks!

I'll wait until tomorrow; if no one else has opinions about this, I'll push.

..Thomas

On Tue, Jul 21, 2020 at 2:09 PM David Holmes <david.holmes at oracle.com>
wrote:

> On 21/07/2020 6:35 pm, Thomas Stüfe wrote:
> >
> >
> > On Tue, Jul 21, 2020 at 9:32 AM David Holmes <david.holmes at oracle.com
> > <mailto:david.holmes at oracle.com>> wrote:
> >
> >     Hi Thomas,
> >
> >     I'm running this through our tier 1-3 testing to see if it exposes
> >     any issue. If not then let's proceed unless someone else chimes in.
> >     I'll also flag this RFR internally.
> >
> >
> > Thank you David!
>
> Tests all clear.
>
> Cheers,
> David
>
> >
> >     To be clear my concern was that under the current uninitialized code
> it
> >     would more often act as
> >
> >     args.ignoreUnrecognized = JNI_TRUE;
> >
> >     than
> >
> >     args.ignoreUnrecognized = JNI_FALSE;
> >
> >     and so your change may perturb existing testing regimes.
> >
> >
> > I understand.
> >
> >     Sorry to belabour this.
> >
> >
> > No problem at all, that is why we do Reviews.
> > Thanks Thomas
> >
> >     Thanks,
> >     David
> >     -----
> >
> >     On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
> >      > Hi David,
> >      >
> >      > On Tue, Jul 21, 2020 at 12:06 AM David Holmes
> >     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> >      > <mailto:david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>>> wrote:
> >      >
> >      >     Hi Thomas,
> >      >
> >      >     On 21/07/2020 12:17 am, Thomas Stüfe wrote:
> >      >      > Hi David,
> >      >      >
> >      >      >
> >      >      > On Mon, Jul 20, 2020 at 2:54 PM David Holmes
> >      >     <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
> >     <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
> >      >      > <mailto:david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>
> >      >     <mailto:david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>>>> wrote:
> >      >      >
> >      >      >     Hi Thomas,
> >      >      >
> >      >      >     On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> >      >      >      > Hi,
> >      >      >      >
> >      >      >      > could I please have reviews for this very trivial
> >     patch?
> >      >      >      >
> >      >      >      > I found that gtest ignores invalid jvm options
> randomly
> >      >     since it
> >      >      >     relies on
> >      >      >      > an uninitialized variable.
> >      >      >
> >      >      >     Is it really random? I would have expected it to be
> >     basically
> >      >     always
> >      >      >     non-zero and hence always ignore unknown options.
> >      >      >
> >      >      >
> >      >      > I remember seeing cmdline option errors from gtest, and
> >     since this
> >      >      > coding is very old, this must be either my failing memory
> >     or the
> >      >     random
> >      >      > content of the underlying uninitialized stack memory.
> >      >
> >      >     Yes but there is only one value of that uninitialized memory
> >     (zero)
> >      >     that
> >      >     would cause unrecognised options to not be ignored.
> >      >
> >      >
> >      > So you are saying that it was in the past more probable that
> options
> >      > were ignored than that they were heeded.
> >      >
> >      > I am saying that they were occasionally not ignored. If that were
> >     bad we
> >      > would have seen sporadic errors in the past. Especially as this
> >     is a day
> >      > zero bug, there since integration of gtests in 2016.
> >      >
> >      >      >     I'm not at all clear
> >      >      >     if running of the gtests might rely on always ignoring
> >      >      >     unexpected/unknown flags -
> >      >      >
> >      >      >
> >      >      > It does not. googletest fishes its own arguments from the
> >     initial
> >      >      > argument vector and passes the rest off to the JVM. So the
> JVM
> >      >     ignoring
> >      >      > or not ignoring arguments will not change anything.
> >      >      >
> >      >      >     does it have the capability to distinguish
> >      >      >     product and non-product test runs?
> >      >      >
> >      >      >
> >      >      > Gtests are hotspot coding and there are sections
> >     running with #ifdef
> >      >      > ASSERT, though I am not sure why that would matter?
> >      >
> >      >     For the same reason we sometimes need to @require that a VM
> >     is a debug
> >      >     VM with a jtreg test - because the flag passed in is a
> >     non-product
> >      >     flag.
> >      >     Also when we run test suites we often pass
> >      >     -XX:+IgnoreUnrecognizedVMOptions, again so that non-product
> >     flags don't
> >      >     cause a failure with release bits.
> >      >
> >      >
> >      > Okay, I understand.
> >      >
> >      >      >
> >      >      >     I think this needs wider review from people familiar
> >     with how
> >      >     our gtest
> >      >      >     tests are run. (I have no idea - I never use it.)
> >      >      >
> >      >      >
> >      >      > I am quite familiar with it since I use it almost daily. I
> >     really
> >      >     depend
> >      >      > on it being able to interpret JVM options.
> >      >      >
> >      >      > In fact I am a bit dismayed by this bug since I write tons
> of
> >      >     tests for
> >      >      > JEP387 and was feeling very smug about the tests running
> >     through
> >      >     in all
> >      >      > my test scenarios, only to find that since days I keep
> running
> >      >     the same
> >      >      > - default - scenario over and over again :( No, that
> should be
> >      >     really fixed.
> >      >
> >      >     I don't understand what you mean. This setting should only
> >     affect what
> >      >     happens with unrecognised "bad" arguments (as per your
> subject).
> >      >
> >      >
> >      > It hides errors. Let's say I want to run metaspace gtests with a
> >     policy
> >      > different from the standard, I'd specify
> >      >
> >      > gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
> >      >
> >      > but if I mistype the option - which happened to me - the VM
> >     ignores it
> >      > silently and the tests run in default settings, all being green.
> >     Which
> >      > gave me a false feeling of security until, much later, I found
> >      > during debugging that the setting had been ignored.
> >      > The correct behaviour would have been for gtestLauncher to give
> >     me an
> >      > "Unrecognized VM option" error right away.
> >      >
> >      >      > Gtestlauncher is called as part of the jtreg tests by the
> >      >     GtestWrapper,
> >      >      > but that does not pass any options to it.
> >      >
> >      >     Pardon my ignorance but how does one specify options for a
> >     gtest then?
> >      >
> >      >
> >      > You just pass them on the command line. One can intermix them
> freely
> >      > with the googletest options and those of the gtestlauncher itself.
> >      > For example:
> >      >
> >      > ./hotspot/variant-server/libjvm/gtest/gtestLauncher
> -Xlog:metaspace*
> >      > -jdk:./images/jdk/  --gtest_filter=metaspace.*
> >      > -XX:MetaspaceReclaimStrategy=none
> >      >
> >      > -jdk gets consumed by our gtestLauncher itself
> >      > -any one of the gtest_... options get eaten by the googletest
> >     framework
> >      > -Xlog and the -XX options are the remaining VM options and get
> >     handed to
> >      > the VM
> >      >
> >      >      > But seriously, if there are tests which pass options to
> >     it, they
> >      >      > probably want those options to do something in the jvm, so
> >      >     ignoring them
> >      >      > silently is not good.
> >      >
> >      >     Again only "bad" options are ignored - which should only
> >     impact use of
> >      >     non-product flags with product bits.
> >      >
> >      >
> >      > Same reasoning as above.
> >      >
> >      >     David
> >      >     -----
> >      >
> >      >
> >      > My reasoning is this:
> >      >
> >      > launcher should not ignore unrecognized VM options because whoever
> >      > specifies them expects them to arrive in the VM and do something.
> In
> >      > case this patch shakes loose a scenario where someone specified
> the
> >      > wrong option he/she should look at that case. Either it was wrong
> to
> >      > specify the option, or more probably it was a mistype like it
> >     happened
> >      > to me.
> >      >
> >      > As for how could that happen. ATM I see in our sources three
> >     invocations
> >      > of gtestLauncher.
> >      >
> >      > - from make files when one runs tests via make, make/RunTests.gmk.
> >      > There, we can specify VM options via command line similar to what
> >     I do
> >      > above. Though I am not sure this even works: the associated make
> >      > variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway,
> this
> >      > case is to my knowledge only executed manually and so the same
> >     reasoning
> >      > applies: whoever does this probably does not want a mistyped VM
> >     option
> >      > silently ignored.
> >      > - from the one jtreg test which runs the gtests. It does not
> >     specify any
> >      > VM options.
> >      > - from a vscode IDE integration script which does not either
> >      >
> >      > Of course I do not know what Oracle does internally and whether
> they
> >      > rely on bad options being ignored in some scripts somewhere.
> >      >
> >      > --
> >      >
> >      > So this is a bit of an impasse. Who do you suggest should look at
> >     this?
> >      > I am not sure how to proceed.
> >      >
> >      > Cheers, Thomas
> >      >
> >      >      > Thanks, Thomas
> >      >      >
> >      >      >
> >      >      >     Thanks,
> >      >      >     David
> >      >      >
> >      >      >      > issue:
> https://bugs.openjdk.java.net/browse/JDK-8249748
> >      >      >      > webrev:
> >      >      >      >
> >      >      >
> >      >
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> >      >      >      >
> >      >      >      > Thanks, Thomas
> >      >      >      >
> >      >      >
> >      >
> >
>

From igor.ignatyev at oracle.com  Tue Jul 21 14:39:28 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 21 Jul 2020 07:39:28 -0700
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <CAA-vtUz0YQEGW60bo8gu4OaF_0O6TrLOzCbkeieD09i9BujkNA@mail.gmail.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
 <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
 <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>
 <b94578c8-cdb2-0fb5-2fa0-6af7efd5a518@oracle.com>
 <CAA-vtUz0YQEGW60bo8gu4OaF_0O6TrLOzCbkeieD09i9BujkNA@mail.gmail.com>
Message-ID: <D17D763F-415E-4129-B212-B7709695AD26@oracle.com>

Hi Thomas,

The fix looks good to me, and your reasoning is completely correct: this is just an oversight in the JEP 281 implementation. Neither the gtest integration, nor Oracle's internal infra, nor any other code depends on that bug. Thanks for fixing!

Cheers,
-- Igor
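
The pattern discussed in this thread, a VM init-args structure whose
ignoreUnrecognized member must be set explicitly, can be sketched as
follows. This is only a sketch under stated assumptions: FakeInitArgs,
FAKE_JNI_FALSE, FAKE_JNI_TRUE, make_init_args() and startup_succeeds() are
hypothetical stand-ins so the example builds without jni.h, and it is not
the code from the webrev.

```cpp
#include <cstdint>

// Hypothetical stand-in that borrows the member names of JNI's
// JavaVMInitArgs, so the sketch compiles without jni.h.
struct FakeInitArgs {
  std::int32_t version;
  std::int32_t nOptions;
  void*        options;
  std::uint8_t ignoreUnrecognized;  // a jboolean in the real API
};

const std::uint8_t FAKE_JNI_FALSE = 0;
const std::uint8_t FAKE_JNI_TRUE  = 1;

// Builds the arguments with every member given a defined value.
// "FakeInitArgs args;" without an initializer would leave
// ignoreUnrecognized holding whatever was on the stack, and any
// non-zero byte there would behave like "ignore bad options".
inline FakeInitArgs make_init_args(std::int32_t version) {
  FakeInitArgs args{};                       // value-initialize: all zero
  args.version = version;
  args.ignoreUnrecognized = FAKE_JNI_FALSE;  // state the intent explicitly
  return args;
}

// Hypothetical model of the outcome: startup fails on an unrecognized
// option unless the caller asked for such options to be ignored.
inline bool startup_succeeds(const FakeInitArgs& args, bool all_recognized) {
  return all_recognized || args.ignoreUnrecognized != FAKE_JNI_FALSE;
}
```

With the member set to false, a mistyped option makes startup_succeeds()
return false instead of letting the run continue silently with default
settings, which matches the behaviour argued for in the review.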


> On Jul 21, 2020, at 5:13 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> 
> thanks!
> 
> I'll wait till tomorrow, if no one has opinions about this I push.
> 
> ..Thomas
> 
> On Tue, Jul 21, 2020 at 2:09 PM David Holmes <david.holmes at oracle.com>
> wrote:
> 
>> On 21/07/2020 6:35 pm, Thomas Stüfe wrote:
>>> 
>>> 
>>> On Tue, Jul 21, 2020 at 9:32 AM David Holmes <david.holmes at oracle.com
>>> <mailto:david.holmes at oracle.com>> wrote:
>>> 
>>>    Hi Thomas,
>>> 
>>>    I'm running this through our tier 1-3 testing to see if it exposes
>>>    any issue. If not then let's proceed unless someone else chimes in.
>>>    I'll also flag this RFR internally.
>>> 
>>> 
>>> Thank you David!
>> 
>> Tests all clear.
>> 
>> Cheers,
>> David
>> 
>>> 
>>>    To be clear my concern was that under the current uninitialized code
>> it
>>>    would more often act as
>>> 
>>>    args.ignoreUnrecognized = JNI_TRUE;
>>> 
>>>    than
>>> 
>>>    args.ignoreUnrecognized = JNI_FALSE;
>>> 
>>>    and so your change may perturb existing testing regimes.
>>> 
>>> 
>>> I understand.
>>> 
>>>    Sorry to belabour this.
>>> 
>>> 
>>> No problem at all, that is why we do Reviews.
>>> Thanks Thomas
>>> 
>>>    Thanks,
>>>    David
>>>    -----
>>> 
>>>    On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
>>>> Hi David,
>>>> 
>>>> On Tue, Jul 21, 2020 at 12:06 AM David Holmes
>>>    <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>>> <mailto:david.holmes at oracle.com
>>>    <mailto:david.holmes at oracle.com>>> wrote:
>>>> 
>>>>    Hi Thomas,
>>>> 
>>>>    On 21/07/2020 12:17 am, Thomas Stüfe wrote:
>>>>> Hi David,
>>>>> 
>>>>> 
>>>>> On Mon, Jul 20, 2020 at 2:54 PM David Holmes
>>>>    <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>>    <mailto:david.holmes at oracle.com <mailto:david.holmes at oracle.com>>
>>>>> <mailto:david.holmes at oracle.com
>>>    <mailto:david.holmes at oracle.com>
>>>>    <mailto:david.holmes at oracle.com
>>>    <mailto:david.holmes at oracle.com>>>> wrote:
>>>>> 
>>>>>    Hi Thomas,
>>>>> 
>>>>>    On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> could I please have reviews for this very trivial
>>>    patch?
>>>>>> 
>>>>>> I found that gtest ignores invalid jvm options
>> randomly
>>>>    since it
>>>>>    relies on
>>>>>> an uninitialized variable.
>>>>> 
>>>>>    Is it really random? I would have expected it to be
>>>    basically
>>>>    always
>>>>>    non-zero and hence always ignore unknown options.
>>>>> 
>>>>> 
>>>>> I remember seeing cmdline option errors from gtest, and
>>>    since this
>>>>> coding is very old, this must be either my failing memory
>>>    or the
>>>>    random
>>>>> content of the underlying uninitialized stack memory.
>>>> 
>>>>    Yes but there is only one value of that uninitialized memory
>>>    (zero)
>>>>    that
>>>>    would cause unrecognised options to not be ignored.
>>>> 
>>>> 
>>>> So you are saying that it was in the past more probable that
>> options
>>>> were ignored than that they were heeded.
>>>> 
>>>> I am saying that they were occasionally not ignored. If that were
>>>    bad we
>>>> would have seen sporadic errors in the past. Especially as this
>>>    is a day
>>>> zero bug, there since integration of gtests in 2016.
>>>> 
>>>>>    I'm not at all clear
>>>>>    if running of the gtests might rely on always ignoring
>>>>>    unexpected/unknown flags -
>>>>> 
>>>>> 
>>>>> It does not. googletest fishes its own arguments from the
>>>    initial
>>>>> argument vector and passes the rest off to the JVM. So the
>> JVM
>>>>    ignoring
>>>>> or not ignoring arguments will not change anything.
>>>>> 
>>>>>    does it have the capability to distinguish
>>>>>    product and non-product test runs?
>>>>> 
>>>>> 
>>>>> Gtests are hotspot coding and there are sections
>>>    running with #ifdef
>>>>> ASSERT, though I am not sure why that would matter?
>>>> 
>>>>    For the same reason we sometimes need to @require that a VM
>>>    is a debug
>>>>    VM with a jtreg test - because the flag passed in is a
>>>    non-product
>>>>    flag.
>>>>    Also when we run test suites we often pass
>>>>    -XX:+IgnoreUnrecognisedVMOptions, again so that non-product
>>>    flags don't
>>>>    cause a failure with release bits.
>>>> 
>>>> 
>>>> Okay, I understand.
>>>> 
>>>>> 
>>>>>    I think this needs wider review from people familiar
>>>    with how
>>>>    our gtest
>>>>>    tests are run. (I have no idea - I never use it.)
>>>>> 
>>>>> 
>>>>> I am quite familiar with it since I use it almost daily. I
>>>    really
>>>>    depend
>>>>> on it being able to interpret JVM options.
>>>>> 
>>>>> In fact I am a bit dismayed by this bug since I write tons
>> of
>>>>    tests for
>>>>> JEP387 and was feeling very smug about the tests running
>>>    through
>>>>    in all
>>>>> my test scenarios, only to find that since days I keep
>> running
>>>>    the same
>>>>> - default - scenario over and over again :( No, that
>> should be
>>>>    really fixed.
>>>> 
>>>>    I don't understand what you mean. This setting should only
>>>    affect what
>>>>    happens with unrecognised "bad" arguments (as per your
>> subject).
>>>> 
>>>> 
>>>> It hides errors. Let's say I want to run metaspace gtests with a
>>>    policy
>>>> different from the standard, I'd specify
>>>> 
>>>> gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
>>>> 
>>>> but if I mistype the option - which happened to me - the VM
>>>    ignores it
>>>> silently and the tests run in default settings, all being green.
>>>    Which
>>>> gave me a false feeling of security until much later I found
>>>> during debugging that the setting had been ignored.
>>>> The correct behaviour would have been for gtestLauncher to give
>>>    me an
>>>> "Unrecognized VM option" error right away.
>>>> 
>>>>> Gtestlauncher is called as part of the jtreg tests by the
>>>>    GtestWrapper,
>>>>> but that does not pass any options to it.
>>>> 
>>>>    Pardon my ignorance but how does one specify options for a
>>>    gtest then?
>>>> 
>>>> 
>>>> You just pass them on the command line. One can intermix them
>> freely
>>>> with the googletest options and those of the gtestlauncher itself.
>>>> For example:
>>>> 
>>>> ./hotspot/variant-server/libjvm/gtest/gtestLauncher
>> -Xlog:metaspace*
>>>> -jdk:./images/jdk/  --gtest_filter=metaspace.*
>>>> -XX:MetaspaceReclaimStrategy=none
>>>> 
>>>> -jdk gets consumed by our gtestLauncher itself
>>>> -any one of the gtest_... options get eaten by the googletest
>>>    framework
>>>> -Xlog and the -XX options are the remaining VM options and get
>>>    handed to
>>>> the VM
>>>> 
>>>>> But seriously, if there are tests which pass options to
>>>    it, they
>>>>> probably want those options to do something in the jvm, so
>>>>    ignoring them
>>>>> silently is not good.
>>>> 
>>>>    Again only "bad" options are ignored - which should only
>>>    impact use of
>>>>    non-product flags with product bits.
>>>> 
>>>> 
>>>> Same reasoning as above.
>>>> 
>>>>    David
>>>>    -----
>>>> 
>>>> 
>>>> My reasoning is this:
>>>> 
>>>> launcher should not ignore unrecognized VM options because whoever
>>>> specifies them expects them to arrive in the VM and do something.
>> In
>>>> case this patch shakes loose a scenario where someone specified
>> the
>>>> wrong option he/she should look at that case. Either it was wrong
>> to
>>>> specify the option, or more probably it was a mistype like it
>>>    happened
>>>> to me.
>>>> 
>>>> As for how could that happen. ATM I see in our sources three
>>>    invocations
>>>> of gtestLauncher.
>>>> 
>>>> - from make files when one runs tests via make, make/RunTests.gmk.
>>>> There, we can specify VM options via command line similar to what
>>>    I do
>>>> above. Though I am not sure this even works: the associated make
>>>> variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway,
>> this
>>>> case is to my knowledge only executed manually and so the same
>>>    reasoning
>>>> applies: whoever does this probably does not want a mistyped VM
>>>    option
>>>> silently ignored.
>>>> - from the one jtreg test which runs the gtests. It does not
>>>    specify any
>>>> VM options.
>>>> - from a vscode IDE integration script which does not either
>>>> 
>>>> Of course I do not know what Oracle does internally and whether
>> they
>>>> rely on bad options being ignored in some scripts somewhere.
>>>> 
>>>> --
>>>> 
>>>> So this is a bit of an impasse. Who do you suggest should look at
>>>    this?
>>>> I am not sure how to proceed.
>>>> 
>>>> Cheers, Thomas
>>>> 
>>>>> Thanks, Thomas
>>>>> 
>>>>> 
>>>>>    Thanks,
>>>>>    David
>>>>> 
>>>>>> issue: https://bugs.openjdk.java.net/browse/JDK-8249748
>>>>>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
>>>>>> 
>>>>>> Thanks, Thomas
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 


From thomas.stuefe at gmail.com  Tue Jul 21 14:44:40 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 21 Jul 2020 16:44:40 +0200
Subject: RFR(xxs): 8249748: gtest silently ignores bad jvm arguments
In-Reply-To: <D17D763F-415E-4129-B212-B7709695AD26@oracle.com>
References: <CAA-vtUw7oE3CxAS_QeuQ4TJw3LWR_+8AyZVVECQ5rXDLeopNTQ@mail.gmail.com>
 <afc88d17-cd41-87cf-2eff-c44c7e72ddbf@oracle.com>
 <CAA-vtUxRO_3+8p4k0RiiKhB3TXrDo86gww1iM100iJVM3E1TmQ@mail.gmail.com>
 <b859287d-3a8f-20dd-7bb1-8b1e07036a8b@oracle.com>
 <CAA-vtUyycw2u947pfq7tvjFfhO_Amb5=7PGc-RtaHNEfFxZk=Q@mail.gmail.com>
 <d043432f-d1fe-de4f-394f-b1aba538c34d@oracle.com>
 <CAA-vtUwZ=zVtrgidHvSTwQDHqBe3vmU25dVkXdYC9v0dbAbneA@mail.gmail.com>
 <b94578c8-cdb2-0fb5-2fa0-6af7efd5a518@oracle.com>
 <CAA-vtUz0YQEGW60bo8gu4OaF_0O6TrLOzCbkeieD09i9BujkNA@mail.gmail.com>
 <D17D763F-415E-4129-B212-B7709695AD26@oracle.com>
Message-ID: <CAA-vtUwcR00QhLjpuA-buk31GKiNsNx6_wRfg2tVZtpaN3ZRDA@mail.gmail.com>

Thanks Igor! I'll push it then.

On Tue, Jul 21, 2020 at 4:41 PM Igor Ignatyev <igor.ignatyev at oracle.com>
wrote:

> Hi Thomas,
>
> the fix looks good to me, and your reasoning is completely correct: this
> is just an oversight in the JEP 281 implementation, and neither the gtest
> integration, nor Oracle internal infra, nor any other code depends on that
> bug. Thanks for fixing!
>
> Cheers,
> -- Igor
>
>
> > On Jul 21, 2020, at 5:13 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
> >
> > thanks!
> >
> > I'll wait till tomorrow, if no one has opinions about this I push.
> >
> > ..Thomas
> >
> > On Tue, Jul 21, 2020 at 2:09 PM David Holmes <david.holmes at oracle.com>
> > wrote:
> >
> >> On 21/07/2020 6:35 pm, Thomas Stüfe wrote:
> >>>
> >>>
> >>> On Tue, Jul 21, 2020 at 9:32 AM David Holmes <david.holmes at oracle.com> wrote:
> >>>
> >>>    Hi Thomas,
> >>>
> >>>    I'm running this through our tier 1-3 testing to see if it exposes
> >> any
> >>>    issue. If not then lets proceed unless someone else chimes in. I'll
> >>>    also
> >>>    flag this RFR internally.
> >>>
> >>>
> >>> Thank you David!
> >>
> >> Tests all clear.
> >>
> >> Cheers,
> >> David
> >>
> >>>
> >>>    To be clear my concern was that under the current uninitialized code
> >> it
> >>>    would more often act as
> >>>
> >>>    args.ignoreUnrecognized = JNI_TRUE;
> >>>
> >>>    than
> >>>
> >>>    args.ignoreUnrecognized = JNI_FALSE;
> >>>
> >>>    and so your change may perturb existing testing regimes.
> >>>
> >>>
> >>> I understand.
> >>>
> >>>    Sorry to belabour this.
> >>>
> >>>
> >>> No problem at all, that is why we do Reviews.
> >>> Thanks Thomas
> >>>
> >>>    Thanks,
> >>>    David
> >>>    -----
> >>>
> >>>    On 21/07/2020 4:16 pm, Thomas Stüfe wrote:
> >>>> Hi David,
> >>>>
> >>>> On Tue, Jul 21, 2020 at 12:06 AM David Holmes <david.holmes at oracle.com> wrote:
> >>>>
> >>>>    Hi Thomas,
> >>>>
> >>>>    On 21/07/2020 12:17 am, Thomas Stüfe wrote:
> >>>>> Hi David,
> >>>>>
> >>>>>
> >>>>> On Mon, Jul 20, 2020 at 2:54 PM David Holmes <david.holmes at oracle.com> wrote:
> >>>>>
> >>>>>    Hi Thomas,
> >>>>>
> >>>>>    On 20/07/2020 8:59 pm, Thomas Stüfe wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> could I please have reviews for this very trivial
> >>>    patch?
> >>>>>>
> >>>>>> I found that gtest ignores invalid jvm options
> >> randomly
> >>>>    since it
> >>>>>    relies on
> >>>>>> an uninitialized variable.
> >>>>>
> >>>>>    Is it really random? I would have expected it to be
> >>>    basically
> >>>>    always
> >>>>>    non-zero and hence always ignore unknown options.
> >>>>>
> >>>>>
> >>>>> I remember seeing cmdline option errors from gtest, and
> >>>    since this
> >>>>> coding is very old, this must be either my failing memory
> >>>    or the
> >>>>    random
> >>>>> content of the underlying uninitialized stack memory.
> >>>>
> >>>>    Yes but there is only one value of that uninitialized memory
> >>>    (zero)
> >>>>    that
> >>>>    would cause unrecognised options to not be ignored.
> >>>>
> >>>>
> >>>> So you are saying that it was in the past more probable that
> >> options
> >>>> were ignored than that they were heeded.
> >>>>
> >>>> I am saying that they were occasionally not ignored. If that were
> >>>    bad we
> >>>> would have seen sporadic errors in the past. Especially as this
> >>>    is a day
> >>>> zero bug, there since integration of gtests in 2016.
> >>>>
> >>>>>    I'm not at all clear
> >>>>>    if running of the gtests might rely on always ignoring
> >>>>>    unexpected/unknown flags -
> >>>>>
> >>>>>
> >>>>> It does not. googletest fishes its own arguments from the
> >>>    initial
> >>>>> argument vector and passes the rest off to the JVM. So the
> >> JVM
> >>>>    ignoring
> >>>>> or not ignoring arguments will not change anything.
> >>>>>
> >>>>>    does it have the capability to distinguish
> >>>>>    product and non-product test runs?
> >>>>>
> >>>>>
> >>>>> Gtests are hotspot coding and there are sections
> >>>    running with #ifdef
> >>>>> ASSERT, though I am not sure why that would matter?
> >>>>
> >>>>    For the same reason we sometimes need to @require that a VM
> >>>    is a debug
> >>>>    VM with a jtreg test - because the flag passed in is a
> >>>    non-product
> >>>>    flag.
> >>>>    Also when we run test suites we often pass
> >>>>    -XX:+IgnoreUnrecognisedVMOptions, again so that non-product
> >>>    flags don't
> >>>>    cause a failure with release bits.
> >>>>
> >>>>
> >>>> Okay, I understand.
> >>>>
> >>>>>
> >>>>>    I think this needs wider review from people familiar
> >>>    with how
> >>>>    our gtest
> >>>>>    tests are run. (I have no idea - I never use it.)
> >>>>>
> >>>>>
> >>>>> I am quite familiar with it since I use it almost daily. I
> >>>    really
> >>>>    depend
> >>>>> on it being able to interpret JVM options.
> >>>>>
> >>>>> In fact I am a bit dismayed by this bug since I write tons
> >> of
> >>>>    tests for
> >>>>> JEP387 and was feeling very smug about the tests running
> >>>    through
> >>>>    in all
> >>>>> my test scenarios, only to find that since days I keep
> >> running
> >>>>    the same
> >>>>> - default - scenario over and over again :( No, that
> >> should be
> >>>>    really fixed.
> >>>>
> >>>>    I don't understand what you mean. This setting should only
> >>>    affect what
> >>>>    happens with unrecognised "bad" arguments (as per your
> >> subject).
> >>>>
> >>>>
> >>>> It hides errors. Let's say I want to run metaspace gtests with a
> >>>    policy
> >>>> different from the standard, I'd specify
> >>>>
> >>>> gtestLauncher -jdk:<jdk>  -XX:MetaspaceReclaimStrategy=none
> >>>>
> >>>> but if I mistype the option - which happened to me - the VM
> >>>    ignores it
> >>>> silently and the tests run in default settings, all being green.
> >>>    Which
> >>>> gave me a false feeling of security until much later I found
> >>>> during debugging that the setting had been ignored.
> >>>> The correct behaviour would have been for gtestLauncher to give
> >>>    me an
> >>>> "Unrecognized VM option" error right away.
> >>>>
> >>>>> Gtestlauncher is called as part of the jtreg tests by the
> >>>>    GtestWrapper,
> >>>>> but that does not pass any options to it.
> >>>>
> >>>>    Pardon my ignorance but how does one specify options for a
> >>>    gtest then?
> >>>>
> >>>>
> >>>> You just pass them on the command line. One can intermix them
> >> freely
> >>>> with the googletest options and those of the gtestlauncher itself.
> >>>> For example:
> >>>>
> >>>> ./hotspot/variant-server/libjvm/gtest/gtestLauncher
> >> -Xlog:metaspace*
> >>>> -jdk:./images/jdk/  --gtest_filter=metaspace.*
> >>>> -XX:MetaspaceReclaimStrategy=none
> >>>>
> >>>> -jdk gets consumed by our gtestLauncher itself
> >>>> -any one of the gtest_... options get eaten by the googletest
> >>>    framework
> >>>> -Xlog and the -XX options are the remaining VM options and get
> >>>    handed to
> >>>> the VM
> >>>>
> >>>>> But seriously, if there are tests which pass options to
> >>>    it, they
> >>>>> probably want those options to do something in the jvm, so
> >>>>    ignoring them
> >>>>> silently is not good.
> >>>>
> >>>>    Again only "bad" options are ignored - which should only
> >>>    impact use of
> >>>>    non-product flags with product bits.
> >>>>
> >>>>
> >>>> Same reasoning as above.
> >>>>
> >>>>    David
> >>>>    -----
> >>>>
> >>>>
> >>>> My reasoning is this:
> >>>>
> >>>> launcher should not ignore unrecognized VM options because whoever
> >>>> specifies them expects them to arrive in the VM and do something.
> >> In
> >>>> case this patch shakes loose a scenario where someone specified
> >> the
> >>>> wrong option he/she should look at that case. Either it was wrong
> >> to
> >>>> specify the option, or more probably it was a mistype like it
> >>>    happened
> >>>> to me.
> >>>>
> >>>> As for how could that happen. ATM I see in our sources three
> >>>    invocations
> >>>> of gtestLauncher.
> >>>>
> >>>> - from make files when one runs tests via make, make/RunTests.gmk.
> >>>> There, we can specify VM options via command line similar to what
> >>>    I do
> >>>> above. Though I am not sure this even works: the associated make
> >>>> variable GTEST_JAVA_OPTIONS is not set anywhere AFAICS. Anyway,
> >> this
> >>>> case is to my knowledge only executed manually and so the same
> >>>    reasoning
> >>>> applies: whoever does this probably does not want a mistyped VM
> >>>    option
> >>>> silently ignored.
> >>>> - from the one jtreg test which runs the gtests. It does not
> >>>    specify any
> >>>> VM options.
> >>>> - from a vscode IDE integration script which does not either
> >>>>
> >>>> Of course I do not know what Oracle does internally and whether
> >> they
> >>>> rely on bad options being ignored in some scripts somewhere.
> >>>>
> >>>> --
> >>>>
> >>>> So this is a bit of an impasse. Who do you suggest should look at
> >>>    this?
> >>>> I am not sure how to proceed.
> >>>>
> >>>> Cheers, Thomas
> >>>>
> >>>>> Thanks, Thomas
> >>>>>
> >>>>>
> >>>>>    Thanks,
> >>>>>    David
> >>>>>
> >>>>>> issue: https://bugs.openjdk.java.net/browse/JDK-8249748
> >>>>>> webrev: http://cr.openjdk.java.net/~stuefe/webrevs/8249748-gtest-silently-ignores-bad-jvm-arguments/webrev.00/webrev/
> >>>>>>
> >>>>>> Thanks, Thomas
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
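
For readers following the thread above, the fix under discussion amounts to setting every field of the VM init arguments explicitly instead of leaving ignoreUnrecognized to whatever the stack happened to contain. The following is only a minimal sketch under stated assumptions: InitArgsSketch, make_args and rejected_options are hypothetical names that stand in for the JNI JavaVMInitArgs struct and the VM's option parsing, and the "known options" list is made up for illustration. It is not the code from the webrev.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the JNI JavaVMInitArgs struct. The field names
// follow the JNI specification, but this is not the real type.
struct InitArgsSketch {
  int32_t      version;
  int32_t      nOptions;
  const char** options;
  uint8_t      ignoreUnrecognized;  // jboolean in the real struct
};

// Fill in every field explicitly so nothing depends on stale stack contents.
InitArgsSketch make_args(const char** opts, int32_t count) {
  InitArgsSketch args{};          // value-initialization zeroes all fields
  args.version = 0x00010008;      // numeric value of JNI_VERSION_1_8
  args.nOptions = count;
  args.options = opts;
  args.ignoreUnrecognized = 0;    // the equivalent of JNI_FALSE
  return args;
}

// Toy model of option parsing (an assumption, not HotSpot code): returns the
// options that would be reported as unrecognized. When ignoreUnrecognized is
// set, unknown options are dropped silently and nothing is reported.
std::vector<std::string> rejected_options(const InitArgsSketch& args,
                                          const std::vector<std::string>& known) {
  std::vector<std::string> rejected;
  if (args.ignoreUnrecognized != 0) {
    return rejected;
  }
  for (int32_t i = 0; i < args.nOptions; ++i) {
    bool found = false;
    for (const std::string& k : known) {
      if (k == args.options[i]) { found = true; break; }
    }
    if (!found) rejected.push_back(args.options[i]);
  }
  return rejected;
}
```

With the flag explicitly cleared, a mistyped option shows up in the rejected list right away. With the flag set, which is how the uninitialized field would mostly have behaved according to the discussion, the same typo passes unnoticed.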

From lois.foltan at oracle.com  Tue Jul 21 15:06:31 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 21 Jul 2020 11:06:31 -0400
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
Message-ID: <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>

On 7/21/2020 2:24 AM, Ioi Lam wrote:
>
>
> On 7/20/20 11:12 PM, Florian Weimer wrote:
>> * Ioi Lam:
>>
>>> Hi please review this very simple fix:
>>>
>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>> @@ -51,8 +51,11 @@
>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>    _length = length;
>>> -  _body[0] = 0;  // in case length == 0
>>>    memcpy(_body, name, length);
>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>> +  // of the Symbol.
>>>  }
>> Won't this still trip memory debuggers?  Symbol::identity_hash() implies
>> that the result is eventually used in a conditional operation (a hash
>> comparison perhaps).  If it's possible one day to run Hotspot under
>> valgrind, this would result in false positives.
>
> Are you saying that valgrind will modify uninitialized memory 
> periodically after the constructor has returned, and thus will cause 
> Symbol::identity_hash() to return a different value?
>
> Without my patch, _body[1] is uninitialized for Symbols whose length 
> is 0 or 1. We have not heard of any issues related to valgrind and 
> Symbol::identity_hash().
>
> In fact, looking at the code history, the setting of "_body[0] = 0" in 
> Symbol::Symbol was introduced only recently (Feb 2020):
>
> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>
>
> I'll check with Lois who added the code to see the reason for doing it.

Hi Ioi,

Reviewing this JBS issue, I have concerns over leaving both _body[0] and 
now even _body[1] uninitialized.  The signature processing frequently 
checks the first character of a Symbol via Symbol::char_at(0) to 
determine what type it is dealing with.  Is there a danger that the 
uninitialized memory actually has a valid type indicator in it like an 
'[' character for example?  The signature processing could potentially 
make wrong assumptions about the type it is trying to process.

Thanks,
Lois
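
The concern above can be illustrated with a small model. This is only a sketch under stated assumptions: SymbolSketch, make_symbol and starts_with are hypothetical names, not HotSpot code, and the body is deliberately pre-filled with '[' to imitate stale memory in a deterministic way (reading truly uninitialized memory would be undefined behaviour). The thread does not say whether the real Symbol::char_at performs a length check, so the guard below is shown purely as one possible way to keep a reader from interpreting bytes beyond the symbol's length.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical, simplified stand-in for a symbol with an inline body.
// Callers must keep length <= sizeof(body) in this sketch.
struct SymbolSketch {
  unsigned short length;
  unsigned char  body[8];
};

// Copies only `length` bytes of the name. The body is pre-filled with '['
// to model leftover memory content, so an empty symbol still has a
// plausible-looking type indicator in body[0].
SymbolSketch make_symbol(const char* name, unsigned short length) {
  SymbolSketch s;
  std::memset(s.body, '[', sizeof(s.body));
  s.length = length;
  std::memcpy(s.body, name, length);
  return s;
}

// Guarded first-character test: checks the length before looking at the body,
// so stale bytes are never interpreted as part of the symbol.
bool starts_with(const SymbolSketch& s, char c) {
  return s.length > 0 && s.body[0] == static_cast<unsigned char>(c);
}
```

In this model an unguarded comparison of body[0] against '[' would succeed for the empty symbol, while the guarded starts_with does not.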

>
>
> Thanks
> - Ioi
>
>> Thanks,
>> Florian
>>
>


From coleen.phillimore at oracle.com  Tue Jul 21 17:57:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 21 Jul 2020 13:57:36 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <2b52127c-8637-ed24-2a63-0b1372d4bff0@oracle.com>


One note below:

On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>   return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>     JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>     Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).

The thing I like about THREAD if it's available is that it's assumed to 
be *always* the current thread, so I have to wonder no further. Also, 
"thread" is generally the current thread too, but if you have a choice, 
my preference would be to use THREAD.

I wouldn't want to see this changed.

Thanks,
Coleen
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From coleen.phillimore at oracle.com  Tue Jul 21 18:01:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 21 Jul 2020 14:01:36 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>


This looks like a nice cleanup.

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html

I'm wondering why you took out the NULL return for make_local() without 
a thread argument?  Here you may call Thread::current() unnecessarily.

  jobject JNIHandles::make_local(oop obj) {
-   if (obj == NULL) {
-     return NULL;                // ignore null handles
-   } else {
-     Thread* thread = Thread::current();
-     assert(oopDesc::is_oop(obj), "not an oop");
-     assert(!current_thread_in_native(), "must not be in native");
-     return thread->active_handles()->allocate_handle(obj);
-   }
+   return make_local(Thread::current(), obj);
  }
  

Beyond the scope of this fix, but it'd be cool to not have a version 
that doesn't take thread, since there may be many more callers that 
already have Thread::current().

Coleen
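
To make the point about the removed NULL check concrete, here is a minimal sketch under stated assumptions. ObjSketch, ThreadSketch, current_thread and the three make_local variants are hypothetical stand-ins, not the HotSpot types or the webrev code; the sketch also assumes that the thread-taking variant handles a null object itself. A counter records how often the (potentially costly) current-thread lookup is performed.

```cpp
#include <cstddef>

// Hypothetical stand-ins for oop and Thread; not the HotSpot types.
struct ObjSketch    { int payload; };
struct ThreadSketch { int handles_allocated = 0; };

static ThreadSketch g_thread;
static int g_current_thread_lookups = 0;

// Models Thread::current(); the counter makes each lookup observable.
ThreadSketch* current_thread() {
  ++g_current_thread_lookups;
  return &g_thread;
}

// Variant for callers that already have the thread at hand.
const ObjSketch* make_local(ThreadSketch* thread, const ObjSketch* obj) {
  if (obj == nullptr) return nullptr;   // ignore null handles
  ++thread->handles_allocated;          // models allocate_handle(obj)
  return obj;
}

// Thread-less variant that simply forwards: it always pays for the lookup,
// even when obj is null and no handle is going to be allocated.
const ObjSketch* make_local_forwarding(const ObjSketch* obj) {
  return make_local(current_thread(), obj);
}

// Thread-less variant with the early null check kept: the lookup is skipped
// for null objects.
const ObjSketch* make_local_checked(const ObjSketch* obj) {
  if (obj == nullptr) return nullptr;
  return make_local(current_thread(), obj);
}
```

Both thread-less variants return the same results; they differ only in whether a null argument triggers the thread lookup, which is the cost being questioned above.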


On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>   return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>     JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>     Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From serguei.spitsyn at oracle.com  Tue Jul 21 19:25:31 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 21 Jul 2020 12:25:31 -0700
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>

Hi David,

The fix looks good to me.

Thanks,
Serguei


On 7/19/20 22:53, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had a choice between
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From david.holmes at oracle.com  Wed Jul 22 02:34:02 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:34:02 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>
Message-ID: <63ff96e0-bcba-5041-0844-fb55b4fbfc1f@oracle.com>

Thanks Serguei!

David

On 22/07/2020 5:25 am, serguei.spitsyn at oracle.com wrote:
> Hi David,
> 
> The fix looks good to me.
> 
> Thanks,
> Serguei
> 

From david.holmes at oracle.com  Wed Jul 22 02:46:26 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:46:26 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>
Message-ID: <4d763c6f-96e1-5c9b-8739-a441ee3b4b31@oracle.com>

Hi Dan,

On 21/07/2020 3:07 am, Daniel D. Daugherty wrote:
> On 7/20/20 1:53 AM, David Holmes wrote:
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> I like this cleanup very much!

Thanks for looking at it.

> 
> src/hotspot/share/classfile/javaClasses.cpp
>     No comments.
>
> src/hotspot/share/classfile/verifier.cpp
>     L298:   JavaThread* thread = (JavaThread*)THREAD;
>     L307:   ResourceMark rm(THREAD);
>         Since we've gone to the trouble of creating the 'thread' variable,
>         I would prefer it to be used instead of THREAD where possible.

Okay I made this change as we already use "thread" throughout that method.

> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
>     L1021:   HandleMark hm;
>         Can this be 'hm(THREAD)'? (Not your problem, but while you're
>         in that file?)

It probably could but there are around 8 such uses and I don't want to 
expand this change any further than necessary for the current issue. I 
filed a general RFE for things that should take advantage of having a 
current thread reference already (that will encompass Coleen's 
make_local(obj) change as well).

https://bugs.openjdk.java.net/browse/JDK-8249837

> src/hotspot/share/prims/jni.cpp
>     No comments.
>
> src/hotspot/share/prims/jvm.cpp
>     L140:   ResourceMark rm;
>         Can this be 'rm(THREAD)'? (Not your problem, but while you're
>         in that file?)
>
>     L611:   Handle stackStream_h(THREAD, JNIHandles::resolve_non_null(stackStream));
>     L617:   objArrayHandle frames_array_h(THREAD, fa);
>     L626:   return JNIHandles::make_local(THREAD, result);
>         Since we've gone to the trouble of creating the 'jt' variable,
>         I would prefer it to be used instead of THREAD where possible.
>
>     L767:   vframeStream vfst(thread);
>     L788         return (jclass) JNIHandles::make_local(THREAD, m->method_holder()->java_mirror());
>         Can we use 'thread' on L788? (preferred)
>         Can we use 'THREAD' on L767? (less preferred)
>
>     L949:   ResourceMark rm(THREAD);
>     L951:   Handle class_loader (THREAD, JNIHandles::resolve(loader));
>     L955:                            THREAD);
>     L957:   Handle protection_domain (THREAD, JNIHandles::resolve(pd));
>     L968:   return (jclass) JNIHandles::make_local(THREAD, k->java_mirror());
>         Since we've gone to the trouble of creating the 'jt' variable,
>         I would prefer it to be used instead of THREAD where possible.

As per our slack chat, and the fact you are okay with things as-is, I 
will forego a more general "consistency" pass as it is unclear what is 
best here. As Coleen notes THREAD is generally understood to always be 
the current thread, while thread/jthread/jt could be any old thread in 
general. Also THREAD usage can highlight a Thread* API, while "thread" 
has to be used for JavaThread* API - but obviously that needs to be 
carefully and consistently applied to be useful. :)
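
To make the Thread* versus JavaThread* distinction concrete, here is a minimal sketch. The types and the two helper functions are hypothetical stand-ins, not the real HotSpot declarations; only the relationship "THREAD is a Thread* view of the same JavaThread* thread" is taken from the entry wrappers described in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's Thread and JavaThread.
struct Thread { virtual ~Thread() = default; };
struct JavaThread : Thread { int pending_events = 0; };

// Accepts any Thread*, so either variable can be passed.
bool takes_thread(Thread* t) { return t != nullptr; }

// Needs a JavaThread*, so only the more specific variable can be passed
// without a cast.
int takes_java_thread(JavaThread* jt) { return ++jt->pending_events; }

int entry_body(JavaThread* thread) {
  Thread* THREAD = thread;  // same object, viewed through the base type
  assert(static_cast<Thread*>(thread) == THREAD);
  bool ok = takes_thread(THREAD) && takes_thread(thread);
  // takes_java_thread(THREAD);  // would not compile: no implicit downcast
  return ok ? takes_java_thread(thread) : -1;
}
```

In this sketch the two names are interchangeable for takes_thread(), while takes_java_thread() forces the use of "thread", which is the same constraint noted for JvmtiExport::post_vm_object_alloc.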

>     L986:   JavaThread* jt = (JavaThread*) THREAD;
>         This 'jt' is unused and can be deleted (Not your problem, but
>         while you're in that file?)

Fixed (and another case elsewhere).

>     L1154:   while (*p != '\0') {
>     L1155:       if (*p == '.') {
>     L1156:           *p = '/';
>     L1157:       }
>     L1158:       p++;
>         Nit - the indents are wrong on L1155-58. (Not your problem, but
>         while you're in that file?)

Fixed

>     L1389:   ResourceMark rm(THREAD);
>     L1446:     return JNIHandles::make_local(THREAD, result);
>     L1460:   return JNIHandles::make_local(THREAD, result);
>         Can we use 'thread' on L1389? (preferred) And then the line you
>         touched could also be 'thread' and we'll be consistent in this
>         function...

Left as-is.

>     L3287:   oop jthread = thread->threadObj();
>     L3288:   assert (thread != NULL, "no current thread!");
>         I think the assert is wrong. It should be:
>
>             assert(jthread != NULL, "no current thread!");
>
>         If 'thread == NULL', then we would have crashed at L3287.
>         Also notice that I deleted the extra ' ' before '('. (Not
>         your problem, but while you're in that file?)

Fixed. I was initially concerned about bootstrapping but it is fine - we 
ensure we set threadObj() before executing any Java code.
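
As an illustration of why the assert has to test the fetched value rather than the pointer that was just dereferenced, here is a minimal sketch. FakeJavaThread, ThreadObj and current_thread_object() are hypothetical names used only for this example; they are not the HotSpot types.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real HotSpot JavaThread or oop types.
struct ThreadObj { int id; };
struct FakeJavaThread {
  ThreadObj* obj;
  ThreadObj* threadObj() const { return obj; }
};

ThreadObj* current_thread_object(FakeJavaThread* thread) {
  ThreadObj* jthread = thread->threadObj();  // dereferences 'thread'
  // Check the value just fetched. A check on 'thread' at this point could
  // never report anything useful, because a null 'thread' would already
  // have failed on the line above.
  assert(jthread != nullptr && "no current thread!");
  return jthread;
}
```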

>     L3289:   return JNIHandles::make_local(THREAD, jthread);
>         Can you use 'thread' instead of 'THREAD' here for consistency?
>
>     L3681:     method_handle = Handle(THREAD, JNIHandles::resolve(method));
>     L3682:     Handle receiver(THREAD, JNIHandles::resolve(obj));
>     L3683:     objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
>     L3685:     jobject res = JNIHandles::make_local(THREAD, result);
>         Can you use 'thread' instead of 'THREAD' here for consistency?
>
>     L3705:   objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
>     L3707   jobject res = JNIHandles::make_local(THREAD, result);
>         Can you use 'thread' instead of 'THREAD' here for consistency?

Left as-is.

> src/hotspot/share/prims/methodHandles.cpp
>     No comments.
>
> src/hotspot/share/prims/methodHandles.hpp
>     No comments.
>
> src/hotspot/share/prims/unsafe.cpp
>     No comments.
>
> src/hotspot/share/prims/whitebox.cpp
>     No comments.
>
> src/hotspot/share/runtime/jniHandles.cpp
>     No comments.
>
> src/hotspot/share/runtime/jniHandles.hpp
>     No comments.
>
> src/hotspot/share/services/management.cpp
>     No comments.
> 
> 
> None of my comments above are "must do". If you choose to make the
> changes, a new webrev isn't required, but would be useful for a
> sanity check.

In addition to the tweak above I found a bunch of make_local(obj)
usages in jvm.cpp and jni.cpp thanks to Coleen, which I have also fixed. 
Updated webrev:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/

If this passes tier 1-3 re-testing then I plan to push.

Thanks,
David
-----

> Thumbs up.
> 
> Dan

From david.holmes at oracle.com  Wed Jul 22 02:46:56 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:46:56 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
Message-ID: <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>

Hi Coleen,

On 22/07/2020 4:01 am, coleen.phillimore at oracle.com wrote:
> 
> This looks like a nice cleanup.

Thanks for looking at this.

> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html
> 
> I'm wondering why you took out the NULL return for make_local() without 
> a thread argument?  Here you may call Thread::current() unnecessarily.
> 
>   jobject JNIHandles::make_local(oop obj) {
> -   if (obj == NULL) {
> -     return NULL; // ignore null handles
> -   } else {
> -     Thread* thread = Thread::current();
> -     assert(oopDesc::is_oop(obj), "not an oop");
> -     assert(!current_thread_in_native(), "must not be in native");
> -     return thread->active_handles()->allocate_handle(obj);
> -   }
> +   return make_local(Thread::current(), obj);
>   }

I was simply using a standard call forwarding pattern to avoid code 
duplication. I suspect passing NULL is very rare so the unnecessary 
Thread::current() call is not an issue. Otherwise, if not NULL, the NULL 
check would happen twice (unless I keep the duplicated implementations).
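
The trade-off in that forwarding pattern can be sketched as follows. This is a self-contained illustration under stated assumptions: FakeThread, Obj, Handle, current_thread() and the lookup counter are hypothetical stand-ins for the HotSpot types and for Thread::current(), introduced only to make the cost visible.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for HotSpot types; not the real declarations.
struct Obj { int value; };
struct Handle { const Obj* target; };

struct FakeThread {
  Handle slots[16];
  std::size_t used = 0;
  // Stand-in for thread->active_handles()->allocate_handle(obj).
  Handle* allocate_handle(const Obj* obj) {
    assert(used < 16 && "handle block full");
    slots[used].target = obj;
    return &slots[used++];
  }
};

// Stand-in for Thread::current(); counts lookups so the cost is visible.
static FakeThread g_thread;
static int g_current_lookups = 0;
FakeThread* current_thread() {
  ++g_current_lookups;
  return &g_thread;
}

// The single implementation: the null check lives here only.
Handle* make_local(FakeThread* thread, const Obj* obj) {
  if (obj == nullptr) {
    return nullptr;  // ignore null handles
  }
  return thread->allocate_handle(obj);
}

// Forwarding overload: always looks up the current thread, even for null.
Handle* make_local(const Obj* obj) {
  return make_local(current_thread(), obj);
}
```

With this shape a null argument still pays for one current-thread lookup, while the non-null path performs exactly one null check instead of two.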

> Beyond the scope of this fix, but it'd be cool to not have a version 
> that doesn't take thread, since there may be many more callers that 
> already have Thread::current().

Indeed! And in fact I had missed a number of these in jvm.cpp and 
jni.cpp so I have fixed those. I've filed an RFE for other cases:

https://bugs.openjdk.java.net/browse/JDK-8249837

Updated webrev:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/

If this passes tier 1-3 re-testing then I plan to push.

Thanks,
David
-----

> Coleen

From david.holmes at oracle.com  Wed Jul 22 04:17:34 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 14:17:34 +1000
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
 <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
Message-ID: <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>

Hi Lois,

On 22/07/2020 1:06 am, Lois Foltan wrote:
> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>
>>
>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>> * Ioi Lam:
>>>
>>>> Hi please review this very simple fix:
>>>>
>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020
>>>> -0700
>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020
>>>> -0700
>>>> @@ -51,8 +51,11 @@
>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>     _hash_and_refcount = pack_hash_and_refcount((short)os::random(),
>>>> refcount);
>>>>     _length = length;
>>>> -  _body[0] = 0;  // in case length == 0
>>>>     memcpy(_body, name, length);
>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are
>>>> uninitialized and may
>>>> +  // contain random values, which will only be read by
>>>> Symbol::identity_hash(),
>>>> +  // which would tolerate such randomness. These values never change
>>>> during the lifetime
>>>> +  // of the Symbol.
>>>>   }
>>> Won't this still trip memory debuggers?  Symbol::identity_hash() implies
>>> that the result is eventually used in a conditional operation (a hash
>>> comparison perhaps).  If it's possible one day to run Hotspot under
>>> valgrind, this would result in false positives.
>>
>> Are you saying that valgrind will modify uninitialized memory 
>> periodically after the constructor has returned, and thus will cause 
>> Symbol::identity_hash() to return a different value?
>>
>> Without my patch, _body[1] is uninitialized for Symbols whose length 
>> is 0 or 1. We have not heard of any issues related to valgrind and 
>> Symbol::identity_hash().
>>
>> In fact, looking at the code history, the setting of "_body[0] = 0" in 
>> Symbol::Symbol was introduced only recently (Feb 2020):
>>
>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>
>>
>> I'll check with Lois who added the code to see the reason for doing it.
> 
> Hi Ioi,
> 
> Reviewing this JBS issue, I have concerns over leaving both _body[0] and 
> now even _body[1] uninitialized.  The signature processing frequently
> checks the first character of a Symbol via Symbol::char_at(0) to
> determine what type it is dealing with.  Is there a danger that the
> uninitialized memory actually has a valid type indicator in it like an
> '[' character for example?  The signature processing could potentially
> make wrong assumptions about the type it is trying to process.

Aren't all the signature related symbols already guaranteed to not have 
zero-length, or else the length is being pre-tested for zero?
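
A minimal sketch of the "length is pre-tested" case, assuming a simplified length-prefixed symbol. MiniSymbol and is_array_signature() are hypothetical and are not HotSpot's Symbol or its signature code; the sketch only shows that a length check in front of char_at(0) keeps uninitialized body bytes from ever being read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical miniature of a length-prefixed symbol; not HotSpot's Symbol.
struct MiniSymbol {
  int length;
  char body[8];  // bytes at index >= length are left unspecified

  MiniSymbol(const char* name, int len) : length(len) {
    assert(len >= 0 && len <= 8);
    // Copies nothing when len == 0, so body stays uninitialized.
    std::memcpy(body, name, static_cast<std::size_t>(len));
  }
  char char_at(int i) const {
    assert(i >= 0 && i < length && "index out of range");
    return body[i];
  }
};

// The length test short-circuits, so char_at(0) is never reached for an
// empty symbol and the uninitialized byte is never inspected.
bool is_array_signature(const MiniSymbol& s) {
  return s.length > 0 && s.char_at(0) == '[';
}
```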

Thanks,
David
-----

> Thanks,
> Lois
> 
>>
>>
>> Thanks
>> - Ioi
>>
>>> Thanks,
>>> Florian
>>>
>>
> 

From xxinliu at amazon.com  Wed Jul 22 07:12:40 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 22 Jul 2020 07:12:40 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>,
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
Message-ID: <1595401959932.33284@amazon.com>

Hi Tobias,

Thank you for reviewing my patch.
I have made changes according to your feedback. Here is the updated revision:
https://cr.openjdk.java.net/~xliu/8247732/01/webrev/

1. I moved the validation logic for compiler directives to compilerOracle::scan_flag_and_value.
If something goes wrong in the parser, the patch will "gracefully" exit the JVM using jvm_exit(1). Is that okay?
Here is an example:

$./build/linux-x86_64-server-release/jdk/bin/java -XX:CompileCommand=option,java.util.HashMap::putVal,ccstrlist,DisableIntrinsic,_hello -version
CompileCommand: An error occurred during parsing
Line: option,java/util/HashMap  putVal ccstrlist DisableIntrinsic _hello
Error: Unrecognized intrinsic detected in DisableIntrinsic: _hello

Usage: '-XX:CompileCommand=command,"package/Class.method()"'
Use:   '-XX:CompileCommand=help' for more information.

2. I removed Method::external_name_short().

3. I fixed the indentation issue.
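
The validation described in item 1 can be pictured roughly as below. This is only a sketch under assumptions: the table of known names, the comma-separated list format and validate_intrinsic_list() are hypothetical and do not reflect the actual patch or the vmIntrinsics tables. The error text follows the message shown in the example output above.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical table of known names; the real list lives in the VM.
static const std::set<std::string> kKnownIntrinsics = {
    "_hashCode", "_getClass", "_dabs"};

// Returns an empty string when every name in the list is known, otherwise
// an error message naming the first unknown entry. Reporting the problem
// to the caller lets the parser print it and exit cleanly instead of
// hitting a guarantee() later at compile time.
std::string validate_intrinsic_list(const std::string& option,
                                    const std::string& list) {
  std::istringstream in(list);
  std::string name;
  while (std::getline(in, name, ',')) {
    if (kKnownIntrinsics.count(name) == 0) {
      return "Unrecognized intrinsic detected in " + option + ": " + name;
    }
  }
  return "";
}
```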

Test: 
hotspot:tier1 and gtest:all

thanks,
--lx


________________________________________
From: Tobias Hartmann <tobias.hartmann at oracle.com>
Sent: Monday, July 20, 2020 1:23 AM
To: Liu, Xin; Nils Eliasson; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev
Subject: RE: [EXTERNAL] RFR(S): 8247732: validate user-input intrinsic_ids in ControlIntrinsic


Hi,

On 08.07.20 10:26, Liu, Xin wrote:
> ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.
>
> I validate at that time and the JVM will crash if it does not meet the guarantee() statement.

I don't think a guarantee should be used here, i.e. the VM shouldn't crash but we should exit
gracefully with an error message. Isn't it possible to piggy-back on the error mechanism in
DirectivesParser?

> I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".
>
> Probably hotspot has had similar code, but I failed to discover. please let me know and I will remove it.

I would just use name_and_sig_as_C_string().

jvmFlagConstraintList.cpp:180/181
- Wrong indentation

jvmFlagConstraintsCompiler.cpp:388/400
- Maybe change the error message to "Unrecognized intrinsic detected in DisableIntrinsic [...]"

Best regards,
Tobias

From richard.reingruber at sap.com  Wed Jul 22 08:20:15 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 08:20:15 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write 
> a review of my findings thereafter.

Sure. I've trimmed my citations to the relevant parts.

> > The delta includes many changes in comments, renaming of names, etc. So
> > I'd like to summarize
> > functional changes:
> > 
> > * Collected all the code for the testing feature DeoptimizeObjectsALot in
> > compileBroker.cpp and reworked it.
> Thanks, this makes it much more compact.

> >   With DeoptimizeObjectsALot enabled internal threads are started that
> > deoptimize frames and
> >   objects. The number of threads started are given with
> > DeoptimizeObjectsALotThreadCountAll and
> >   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> > threads whereas the
> >   latter operates on a single thread selected round robin.
> > 
> >   I removed the mode where deoptimizations were performed at every nth
> > exit from the runtime. I never used it.

> Do I get it right? You have a n:1 and a n:all test scenario.
>  n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
>  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?

Not quite.

-XX:+DeoptimizeObjectsALot // required
-XX:DeoptimizeObjectsALotThreadCountAll=m
-XX:DeoptimizeObjectsALotThreadCountSingle=n

This will start m+n threads, each operating on all existing JavaThreads using EscapeBarriers. The
difference between the two thread types is that one distinct EscapeBarrier targets either just a
single thread or all existing threads at once. If just a single thread is targeted per
EscapeBarrier, then it is not always the same thread; threads are selected round robin. So there
will be n threads independently selecting single threads round robin per EscapeBarrier and m threads
that target all threads in every EscapeBarrier.
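
The two kinds of deoptimizer threads can be modelled with a small sketch. It is not taken from the webrev: Java threads are reduced to indices, and SingleTargetSelector and all_targets are hypothetical names used only to show the selection policy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: Java threads are identified by index 0..count-1.
// One instance belongs to one "single" deoptimizer thread, so each such
// thread advances its own cursor independently of the others.
struct SingleTargetSelector {
  std::size_t next = 0;

  // Returns the one thread targeted by the next barrier (round robin).
  std::vector<std::size_t> next_targets(std::size_t java_thread_count) {
    std::vector<std::size_t> targets;
    if (java_thread_count == 0) return targets;
    targets.push_back(next % java_thread_count);
    next = (next + 1) % java_thread_count;
    return targets;
  }
};

// Returns every thread: what an "all" deoptimizer targets in each barrier.
std::vector<std::size_t> all_targets(std::size_t java_thread_count) {
  std::vector<std::size_t> targets;
  for (std::size_t i = 0; i < java_thread_count; ++i) targets.push_back(i);
  return targets;
}
```

With n selectors and m callers of all_targets running concurrently, this corresponds to the m+n threads described above.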


> > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > execute it always independently
> >   of is_thread_fully_suspended().
> Is this also a performance optimization?

Maybe a minor one.

> > * JavaThread::wait_for_object_deoptimization():
> >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > safepoint check! This
> >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> OK. As I understand, there was one safepoint check in the old version, 
> now there is one in each iteration.  I assume this is intended, right?

Yes it is. The important thing here is (A) a safepoint check is needed /after/ leaving a safe state
(_thread_in_native, _thread_blocked). (B) Shared variables that are modified at safepoints or with
handshakes need to be reread /after/ the safepoint check.

BTW: I only noticed now that since JDK-8240918 JavaThreads themselves must disarm their polling
page. Originally (before handshakes) this was done by the VM thread. With handshakes it was done by
the thread executing the handshake op. This was changed for OrderAccess::cross_modify_fence(), where
the poll is left armed if the thread is in native, and since JDK-8240918 it is always left armed. So
when a thread leaves a safe state (native, blocked) and there was a handshake/vm op, it will always
call SafepointMechanism::block_if_requested_slow(), even if the handshake/vm operation has been
processed already and everybody else is happily executing bytecodes :)

Still (A) and (B) hold.

> >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > microbenchmark [1]
> Ok.  Nice improvement, nice catch!

Yes. It certainly took some time to find out.
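
The idea of limited spinning before blocking can be shown with standard C++ primitives. This is a simplified model and not the HotSpot code: DeoptSuspend, resume and wait_until_resumed are hypothetical names, and the safepoint check of the real implementation is left out. The one property it shares is that the flag is reread after every wakeup.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical model of a "suspended for object deoptimization" flag.
struct DeoptSuspend {
  std::atomic<bool> suspended{false};
  std::mutex mu;
  std::condition_variable cv;

  // Called by the requester once it is done with the target thread.
  void resume() {
    {
      std::lock_guard<std::mutex> lock(mu);
      suspended.store(false);
    }
    cv.notify_all();
  }

  // Called by the target thread. Spins a bounded number of times before
  // falling back to blocking. Returns true if spinning alone sufficed.
  bool wait_until_resumed(int max_spins) {
    for (int i = 0; i < max_spins; ++i) {
      if (!suspended.load()) return true;
      std::this_thread::yield();
    }
    std::unique_lock<std::mutex> lock(mu);
    // The predicate rereads the flag after every wakeup.
    cv.wait(lock, [this] { return !suspended.load(); });
    return false;
  }
};
```

Short suspensions are then served by the spin loop, and only longer ones pay for blocking on the condition variable.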

> > 
> > I refer to some more changes answering your questions and comments inline
> > below.
> > 
> > Thanks,
> > Richard.
> > 
> > [1] Microbenchmark:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > 


> > > I understand you annotate at safepoints where the escape analysis
> > > finds out that an object is "better" than global escape.
> > > This are the cases where the analysis identifies optimization
> > > opportunities. These annotations are then used to deoptimize
> > > frames and the objects referenced by them.
> > > Doesn't this overestimate the optimized
> > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > out.
> > 
> > Yes, the implementation is conservative, but it is comparatively simple and
> > the additional debug
> > info is just 2 flags per safepoint. 
> Thanks. It also helped that you explained to me offline that 
> there are more optimizations than only lock elimination and scalar
> replacement done based on the ea information.
> The ea refines the IR graph with allows follow up optimizations 
> which can not easily be tracked back to the escaping objects or 
> the call sites where they do not escape. 
> Thus, if there are non-global escaping objects, you have to 
> deoptimize the frame.
> Did I repeat that correctly?

Mostly, but there are also cases where deoptimization is required if and only if ea-local objects
are passed as arguments. This is the case when values are not read directly from a frame, but from
a callee frame.

> With this understanding, a row of my proposed renamings/comments
> are obsolete.

Ok.


> > On the other hand, those JVMTI operations
> > that really trigger
> > deoptimizations are expected to be comparatively infrequent such that
> > switching to the interpreter
> > for a few microseconds will hardly have an effect.
> That sounds reasonable.

> > I've done microbenchmarking to check this.
> > 
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > 
> > I found that in the worst case performance can be impacted by 10%. If the
> > agent is extremely active
> > and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> > millisecond or more often,
> > then the performance impact can be 30%. But I would think that this is not
> > realistic. These calls
> > are issued in interactive sessions to analyze deadlocks.
> Ok. 
 
> > We could get more precise deoptimizations by adding a third flag per
> > safepoint for ea-local objects
> > among the owned monitors. This would help improve the worst case in the
> > benchmark. But I'm not
> > convinced, if it is worth it.
> > 
> > Refer to the README.txt of the microbenchmark for a more detailed
> > discussion.
 
> > > pcDesc.hpp
> > >
> > > I would like to see some documentation of the methods. 
> > Done. I didn't take your text, though, because I only noticed it after writing
> > my own. Let me know if you are not ok with it.
> That's fine. My texts were only proposals, you as author know better
> what goes on anyways.

Ok.

> > > scopeDesc.cpp
> > >
> > >   Besides refactoring copy escape info from pcDesc to scopeDesc
> > >   and add accessors. Trivial.
> > >
> > >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> > >   This are opto terms, but scopeDesc is a shared datastructure
> > >   that does not depend on a specific compiler.
> > >   Please explain what is going on without using these terms.
> > 
> > Actually these are not too opto specific terms. They are used in the paper
> > referenced in
> > escape.hpp. Also you can easily google them. I'd rather keep the comments
> > as they are.
> Hmm, I'm not really happy with this, as also the papers
> are for the compiler community, and probably not familiar to 
> others that work with HotSpot.
> But stay with your terms if you think it makes it clearer.
> Anyways, with now understanding why you use conservative
> Information (see above), the descriptions I had in mind are not precise.

Ok.

> > > callnode.hpp
> > >
> > > You add functionality to annotate callnodes with escape information
> > > This is carried through code generation to final output where it is
> > > added to the compiled methods meta information.
> > >
> > > At Safepoints in general jvmti can access
> > >   - Objects that were scalar replaced. They must be reallocated.
> > >     (Flag EliminateAllocations)
> > >   - Objects that should be locked but are not because they never
> > >     escape the thread. They need to be relocked.
> > >
> > > At calls, Objects where locks have been removed escape to callees.
> > > We must persist this information so that if jvmti accesses the
> > > object in a callee, we can determine by looking at the caller that
> > > it needs to be relocked.
> > 
> > Note that the ea-optimization must not be at the current location, it can also
> > follow when control
> > returns to the caller. Lock elimination isn't the only relevant optimization.
> Yes, I understood now, see above. Thanks for explaining.

Ok.

> > Accesses to instance
> > members or array elements can be optimized as well.
> You mean the compiler can/will ignore volatile or memory ordering
> requirements for non-escaping objects? Sounds reasonable to do.

Yes, for instance. Also without volatile modifiers it will eliminate accesses. Here is an example:
Method A has a NoEscape allocation O that is not scalar replaced. A calls Method B, which is not
inlined. When you use your debugger to break in B, then modify a field of O, then this modification
would have no effect without deoptimization, because the jit assumes that B cannot modify O without
a reference to it.

> > You are right, it is not correct how flags are checked. Especially if only
> > running with the JVMCI compiler.
> >
> > I changed Deoptimization::deoptimize_objects_internal() to make
> > reallocation and relocking dependent
> > on similar checks as in Deoptimization::fetch_unroll_info_helper().
> > Furthermore EscapeBarriers are
> > conditionally activated depending on the following (see EscapeBarrier ctors):
> > 
> > JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> > COMPILER2_PRESENT(|| DoEscapeAnalysis)
> > 
> > So the enhancement can be practically completely disabled by disabling
> > DoEscapeAnalysis, which is
> > what C2 currently does if JVMTI capabilities that allow access to local
> > references are taken.
> Thanks for fixing. 

Thanks for finding :)

> > I went for the latter.
> > 
> > > In fetch_unroll_info_helper, I don't understand why you need
> > >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > > for eliminated locks, but not for scalar replaced objects?
> > 
> > In short reallocation is idempotent, relocking is not.
> > 
> > Without the enhancement Deoptimization::realloc_objects() can already be
> > called more than once for a frame:
> > 
> > First call in materializeVirtualObjects() (also iterateFrames()).
> > 
> > Second (indirect) call in fetch_unroll_info_helper().
> > 
> > The objects from the first call are saved as jvmti deferred updates when
> > realloc_objects()
> > returns. Note that there is no relationship to jvmti. The thing in common is
> > that updates cannot be
> > directly installed into a compiled frame, it is necessary to deoptimize the
> > frame and defer the
> > updates until the compiled frame gets replaced. Every time the vframes
> > corresponding to the owner
> > frame are iterated, they get the deferred updates. So in
> > fetch_unroll_info_helper() the
> > GrowableArray<compiledVFrame*>* chunk reference them too. All
> > references to the objects created by
> > the second (indirect) call to realloc_objects() are never used, because
> > compiledVFrame accessors to
> > locals, expressions, and monitors override them with the deferred updates.
> > The objects become
> > unreachable and get gc'ed.
> OK, so repeatedly computed vFrames always have the first version of 
> reallocated objects by construction, so it needs not be handled here.
> But also due to construction, objects might be allocated just to be
> discarded.

Yes.
 
> > materializeVirtualObjects() does not bother with relocking.
> > deoptimize_objects_internal(), which is
> > introduced by the enhancement, does relock objects, after all the lock
> > elimination becomes illegal 
> > with the change in escape state. Relocking twice does not work, so the
> > enhancement avoids it by
> > checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> > 
> > Note that materializeVirtualObjects() can be called more than once and will
> > always return the very
> > same objects, even though it calls realloc_objects() again.
> Ok.
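
Why a non-idempotent step needs a guard can be shown with a toy model. It does not reflect the actual HotSpot data structures: FrameModel, DeoptState and relock_once are hypothetical, and the per-frame flag only plays the role that the objs_are_deoptimized() check has in the discussion above.

```cpp
#include <map>
#include <vector>

// Hypothetical model of one compiled frame.
struct FrameModel {
  int frame_id;
  std::vector<int> eliminated_locks;  // ids of objects whose locks were elided
};

struct DeoptState {
  std::map<int, int> lock_counts;        // object id -> lock recursion count
  std::map<int, bool> objs_deoptimized;  // frame id -> already handled?

  // Relocking increments lock counts, so running it twice would over-lock.
  // The per-frame flag makes the overall operation safe to repeat.
  void relock_once(const FrameModel& f) {
    if (objs_deoptimized[f.frame_id]) return;
    for (int obj : f.eliminated_locks) ++lock_counts[obj];
    objs_deoptimized[f.frame_id] = true;
  }
};
```

Calling relock_once twice for the same frame leaves each lock count at one, which is the property the guard is meant to provide.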


> > > I would guess it is because the eliminated locks can be applied to
> > > argEscape, but scalar replacement only to noescape objects?
> > > I.e. it might have been done before?
> > >
> > > But why isn't this the case for eliminate_allocations?
> > > deoptimize_objects_internal does both unconditionally,
> > > so both can happen to inner frames, right?
> > 
> > Sorry, I don't quite understand. Hope the explanation above helps.
> Yes.  I was guessing wrong :)

Ok, good :)

> > 
> > > Code will get much more simple if BiasedLocking is removed.
> > >
> > > EscapeBarrier:: ...
> > >
> > > (This class maybe would qualify for a file of its own.)
> > >
> > > deoptimize_objects()
> > > I would mention escape analysis only as side remark.  Also, as I understand,
> > > there is only one frame at given depth?
> > > // Deoptimize frames with optimized objects. This can be omitted locks and
> > > // objects not allocated but replaced by scalars. In C2, these optimizations
> > > // are based on escape analysis.
> > > // Up to depth, deoptimize frames with any optimized objects.
> > > // From depth to entry_frame, deoptimize only frames that
> > > // pass optimized objects to their callees.
> > > (First part similar for the comment above
> > EscapeBarrier::deoptimize_objects_internal().)
> > 
> > I've reworked the comment. Let me know if you still think it needs to be
> > improved.
> Good now, thanks (maybe break the long line ...)

Ok. Will do in next webrev.7

> > > Synchronization: looks good. I think others had a look at this before.
> > >
> > > EscapeBarrier::deoptimize_objects_internal()
> > >   The method name is misleading, it is not used by
> > >   deoptimize_objects().
> > >   Also, a method with the same name is in Deoptimization.
> > >   Proposal: deoptimize_objects_thread() ?
> > 
> > Sorry, but I don't see, why it would be misleading.
> > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > understand that name.
> 1. I have no idea why it's called "_internal". Because it is private?
>    By the name, I would expect that EscapeBarrier::deoptimize_objects()
>    calls it for some internal tasks. But it does not.

Well, I'd say it is pretty internal, what's happening in that method. So IMHO the suffix _internal
is a match.

> 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> and calls deoptimize_objects(_one)_thread(thread) for each of these.
> That's how I would have named it. 
> But no bike shedding, if you don't see what I mean it's not obvious.

Ok. We could have a quick call, too, if you like.

> > > Renaming deferred_locals to deferred_updates is good, as well as
> > > adding a datastructure for it.
> > > (Adding this data structure might be a breakout, too.)
> > >
> > > good.
> > >
> > > thread.cpp
> > >
> > > good.
> > >
> > > vframe.cpp
> > >
> > > Is this a bug in existing code?
> > > Makes sense.
> > 
> > Depends on your definition of bug. There are no references to
> > vframe::is_entry_frame() in the
> > existing code. I would think it is a bug.
> So it is :)

I'm just afraid it could get fixed by removing the class entryVFrame.

> > 
> > >
> > > vframe_hp.hpp
> > > (What stands _hp for? helper? The file should be named
> > compiledVFrame ...)
> > >
> > > not_global_escape_in_scope() ...
> > > Again, you mention escape analysis here. Comments above hold, too.
> > 
> > I think it is the right name, because it is meaningful and simple.
> Ok, accepted ... given my understandings from above.

Ok.

> > 
> > > You introduce JvmtiDeferredUpdates. Good.
> > >
> > > vframe_hp.cpp
> > >
> > > Changes for JvmtiDeferredUpdates, escape state accessors,
> > >
> > > line 422:
> > > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> > >
> > >
> > > macros.hpp
> > >   Good.
> > >
> > >
> > > Test coding
> > > ============
> > >
> > > compileBroker.h|cpp
> > >
> > > You introduce a third class of threads handled here and
> > > add a new flag to distinguish it. Before, the two kinds
> > > of threads were distinguished implicitly by passing in
> > > a compiler for compiler threads.
> > > The new thread kind is only used for testing in debug.
> > >
> > > make_thread:
> > > You could assert (comp != NULL...) to assure previous
> > > conditions.
> > 
> > I've replaced the if-statements with a switch-statement, made sure all
> > enum-elements are covered, and
> > added the assertion you suggested.
> > 
> > > line 989 indentation broken
> > 
> > You are referring to this block I assume:
> > (from
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> > 
> >  976   if (MethodFlushing) {
> >  977     // Initialize the sweeper thread
> >  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
> >  979     jobject thread_handle = JNIHandles::make_local(THREAD,
> > thread_oop());
> >  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
> >  981   }
> >  982
> >  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
> >  984   if (DeoptimizeObjectsALot == 2) {
> >  985     // Initialize and start the object deoptimizer threads
> >  986     for (int thread_count = 0; thread_count <
> > DeoptimizeObjectsALotThreadCount; thread_count++) {
> >  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot
> > thread", CHECK);
> >  988       jobject thread_handle = JNIHandles::make_local(THREAD,
> > thread_oop());
> >  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
> >  990     }
> >  991   }
> >  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> > 
> > I cannot really see broken indentation here. Am I looking at the wrong
> > location?
> I don't have the source version I reviewed last time any more, so 
> I can't check. But maybe an artefact from patching ... if there were
> tabs jcheck would have told you, so that's not it. No problem.

Ok.

Thanks again!

Cheers, Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Donnerstag, 16. Juli 2020 18:30
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard, 

I'll answer to the obvious things in this mail now.
I'll go through the code thoroughly again and write 
a review of my findings thereafter.

> So here is the new webrev.6
> 
> Webrev.6:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/
Thanks for the incremental webrev, it's helpful!
 
> I spent most of the time running a microbenchmark [1] I wrote to answer
> questions from your
> review. At first I had trouble with variance in the results until I found out it
> was due to the NUMA
> architecture of the server I used. After that I noticed that there was a
> performance regression of
> about 5% even at low agent activity. I finally found out that it was due to the
> implementation of
> JavaThread::wait_for_object_deoptimization() which is called by the target
> of the JVMTI operation to
> self suspend for object deoptimization. I fixed this by adding limited spinning
> before calling
> wait() on the monitor.
> 
> The delta includes many changes in comments, renaming of names, etc. So
> I'd like to summarize
> functional changes:
> 
> * Collected all the code for the testing feature DeoptimizeObjectsALot in
> compileBroker.cpp and reworked it.
Thanks, this makes it much more compact.

>   With DeoptimizeObjectsALot enabled internal threads are started that
> deoptimize frames and
>   objects. The number of threads started are given with
> DeoptimizeObjectsALotThreadCountAll and
>   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> threads whereas the
>   latter operates on a single thread selected round robin.
> 
>   I removed the mode where deoptimizations were performed at every nth
> exit from the runtime. I never used it.

Do I get it right? You have a n:1 and a n:all test scenario.
 n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
 n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?

> * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> execute it always independently
>   of is_thread_fully_suspended().
Is this also a performance optimization?

> * Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found
> this testing with DeoptimizeObjectsALot.
Ok.

> * Added EscapeBarrier::thread_removed().
Ok.

> * EscapeBarrier constructors: barriers can now be entirely disabled by
> disabling DoEscapeAnalysis.
>   This effectively disables the enhancement.
Good!

> * JavaThread::wait_for_object_deoptimization():
>   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> safepoint check! This
>     caused issues with not walkable stacks with DeoptimizeObjectsALot.
OK. As I understand, there was one safepoint check in the old version, 
now there is one in each iteration.  I assume this is intended, right?

>   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> microbenchmark [1]
Ok.  Nice improvement, nice catch!

> 
> I refer to some more changes answering your questions and comments inline
> below.
> 
> Thanks,
> Richard.
> 
> [1] Microbenchmark:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 


> > I understand you annotate at safepoints where the escape analysis
> > finds out that an object is "better" than global escape.
> > This are the cases where the analysis identifies optimization
> > opportunities. These annotations are then used to deoptimize
> > frames and the objects referenced by them.
> > Doesn't this overestimate the optimized
> > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > out.
> 
> Yes, the implementation is conservative, but it is comparatively simple and
> the additional debug
> info is just 2 flags per safepoint. 
Thanks. It also helped that you explained to me offline that 
there are more optimizations than only lock elimination and scalar
replacement done based on the ea information.
The ea refines the IR graph, which allows follow-up optimizations 
that cannot easily be traced back to the escaping objects or 
the call sites where they do not escape. 
Thus, if there are non-global escaping objects, you have to 
deoptimize the frame.
Did I repeat that correctly?
With this understanding, a number of my proposed renamings/comments
are obsolete.


> On the other hand, those JVMTI operations
> that really trigger
> deoptimizations are expected to be comparatively infrequent such that
> switching to the interpreter
> for a few microseconds will hardly have an effect.
That sounds reasonable.

> I've done microbenchmarking to check this.
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 
> I found that in the worst case performance can be impacted by 10%. If the
> agent is extremely active
> and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> millisecond or more often,
> then the performance impact can be 30%. But I would think that this is not
> realistic. These calls
> are issued in interactive sessions to analyze deadlocks.
Ok. 
 
> We could get more precise deoptimizations by adding a third flag per
> safepoint for ea-local objects
> among the owned monitors. This would help improve the worst case in the
> benchmark. But I'm not
> convinced, if it is worth it.
> 
> Refer to the README.txt of the microbenchmark for a more detailed
> discussion.
 
> > pcDesc.hpp
> >
> > I would like to see some documentation of the methods. 
> Done. I didn't take your text, though, because I only noticed it after writing
> my own. Let me know if you are not ok with it.
That's fine. My texts were only proposals, you as author know better
what goes on anyways.

> > scopeDesc.cpp
> >
> >   Besides refactoring copy escape info from pcDesc to scopeDesc
> >   and add accessors. Trivial.
> >
> >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> >   This are opto terms, but scopeDesc is a shared datastructure
> >   that does not depend on a specific compiler.
> >   Please explain what is going on without using these terms.
> 
> Actually these are not too opto specific terms. They are used in the paper
> referenced in
> escape.hpp. Also you can easily google them. I'd rather keep the comments
> as they are.
Hmm, I'm not really happy with this, as also the papers
are for the compiler community, and probably not familiar to 
others that work with HotSpot.
But stay with your terms if you think it makes it clearer.
Anyways, with now understanding why you use conservative
Information (see above), the descriptions I had in mind are not precise.

> > callnode.hpp
> >
> > You add functionality to annotate callnodes with escape information
> > This is carried through code generation to final output where it is
> > added to the compiled methods meta information.
> >
> > At Safepoints in general jvmti can access
> >   - Objects that were scalar replaced. They must be reallocated.
> >     (Flag EliminateAllocations)
> >   - Objects that should be locked but are not because they never
> >     escape the thread. They need to be relocked.
> >
> > At calls, Objects where locks have been removed escape to callees.
> > We must persist this information so that if jvmti accesses the
> > object in a callee, we can determine by looking at the caller that
> > it needs to be relocked.
> 
> Note that the ea-optimization must not be at the current location, it can also
> follow when control
> returns to the caller. Lock elimination isn't the only relevant optimization.
Yes, I understood now, see above. Thanks for explaining.
> Accesses to instance
> members or array elements can be optimized as well.
You mean the compiler can/will ignore volatile or memory ordering
requirements for non-escaping objects? Sounds reasonable to do.

> > // Returns true if at least one of the arguments to the call is an oop
> > // that does not escape globally.
> > bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {
> 
> IMHO the method names are descriptive and don't need the comments. But I
> give in :) (only replaced
> "oop" with "object")
Thanks. Yes, object is better than oop.

> You are right, it is not correct how flags are checked. Especially if only
> running with the JVMCI compiler.
>
> I changed Deoptimization::deoptimize_objects_internal() to make
> reallocation and relocking dependent
> on similar checks as in Deoptimization::fetch_unroll_info_helper().
> Furthermore EscapeBarriers are
> conditionally activated depending on the following (see EscapeBarrier ctors):
> 
> JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> COMPILER2_PRESENT(|| DoEscapeAnalysis)
> 
> So the enhancement can be practically completely disabled by disabling
> DoEscapeAnalysis, which is
> what C2 currently does if JVMTI capabilities that allow access to local
> references are taken.
Thanks for fixing. 

> I went for the latter.
> 
> > In fetch_unroll_info_helper, I don't understand why you need
> >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > for eliminated locks, but not for scalar replaced objects?
> 
> In short reallocation is idempotent, relocking is not.
> 
> Without the enhancement Deoptimization::realloc_objects() can already be
> called more than once for a frame:
> 
> First call in materializeVirtualObjects() (also iterateFrames()).
> 
> Second (indirect) call in fetch_unroll_info_helper().
> 
> The objects from the first call are saved as jvmti deferred updates when
> realloc_objects()
> returns. Note that there is no relationship to jvmti. The thing in common is
> that updates cannot be
> directly installed into a compiled frame, it is necessary to deoptimize the
> frame and defer the
> updates until the compiled frame gets replaced. Every time the vframes
> corresponding to the owner
> frame are iterated, they get the deferred updates. So in
> fetch_unroll_info_helper() the
> GrowableArray<compiledVFrame*>* chunk reference them too. All
> references to the objects created by
> the second (indirect) call to realloc_objects() are never used, because
> compiledVFrame accessors to
> locals, expressions, and monitors override them with the deferred updates.
> The objects become
> unreachable and get gc'ed.
OK, so repeatedly computed vFrames always have the first version of 
reallocated objects by construction, so it needs not be handled here.
But also due to construction, objects might be allocated just to be
discarded.
 
> materializeVirtualObjects() does not bother with relocking.
> deoptimize_objects_internal(), which is
> introduced by the enhancement, does relock objects, after all the lock
> elimination becomes illegal 
> with the change in escape state. Relocking twice does not work, so the
> enhancement avoids it by
> checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> 
> Note that materializeVirtualObjects() can be called more than once and will
> always return the very
> same objects, even though it calls realloc_objects() again.
Ok.
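
For readers of the archive, the asymmetry discussed above (reallocation is
idempotent, relocking is not) can be sketched in standalone C++. This is not
HotSpot code: Frame, Monitor, realloc_objects(), relock_objects() and the
objs_are_deoptimized flag are simplified, hypothetical stand-ins, and the
recursion counter only models why a second relock would be wrong.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a monitor whose lock was eliminated by the JIT.
struct Monitor {
  int recursions = 0;  // how often the owner holds the lock
};

// Hypothetical stand-in for a compiled frame with optimized objects.
struct Frame {
  bool objs_are_deoptimized = false;  // models EscapeBarrier::objs_are_deoptimized()
  int  first_objects = 0;             // id of the objects kept as deferred updates (0 = none yet)
  std::vector<Monitor> monitors;
};

// Reallocation is idempotent for the caller: later calls may allocate again,
// but the objects of the first call are the ones that stay visible.
int realloc_objects(Frame& f, int fresh_objects) {
  if (f.first_objects == 0) {
    f.first_objects = fresh_objects;  // first result is kept
  }
  return f.first_objects;             // later results are dropped (garbage)
}

// Relocking is not idempotent: every call acquires each monitor once more.
void relock_objects(Frame& f) {
  for (Monitor& m : f.monitors) {
    m.recursions++;
  }
}

// Guarded path: reallocate unconditionally, relock only the first time.
void deoptimize_frame(Frame& f, int fresh_objects) {
  realloc_objects(f, fresh_objects);
  if (!f.objs_are_deoptimized) {
    relock_objects(f);
    f.objs_are_deoptimized = true;
  }
}
```

Calling deoptimize_frame() twice on the same Frame leaves every monitor with
a recursion count of 1, which is the property the objs_are_deoptimized check
is meant to protect in this model.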


> > I would guess it is because the eliminated locks can be applied to
> > argEscape, but scalar replacement only to noescape objects?
> > I.e. it might have been done before?
> >
> > But why isn't this the case for eliminate_allocations?
> > deoptimize_objects_internal does both unconditionally,
> > so both can happen to inner frames, right?
> 
> Sorry, I don't quite understand. Hope the explanation above helps.
Yes.  I was guessing wrong :)

> >   I like if boolean operators are at the beginning of broken lines,
> >   but I think hotspot convention is to have them at the end.
> Ok, fixed.
Thanks.

> 
> > Code will get much more simple if BiasedLocking is removed.
> >
> > EscapeBarrier:: ...
> >
> > (This class maybe would qualify for a file of its own.)
> >
> > deoptimize_objects()
> > I would mention escape analysis only as side remark.  Also, as I understand,
> > there is only one frame at given depth?
> > // Deoptimize frames with optimized objects. This can be omitted locks and
> > // objects not allocated but replaced by scalars. In C2, these optimizations
> > // are based on escape analysis.
> > // Up to depth, deoptimize frames with any optimized objects.
> > // From depth to entry_frame, deoptimize only frames that
> > // pass optimized objects to their callees.
> > (First part similar for the comment above
> EscapeBarrier::deoptimize_objects_internal().)
> 
> I've reworked the comment. Let me know if you still think it needs to be
> improved.
Good now, thanks (maybe break the long line ...)


> > What is the check (cur_depth <= depth) good for? Can you
> > ever walk past entry_frame?
> 
> Yes (assuming you mean the outer while-statement), there are java frames
> beyond the entry frame if a
> native method calls java methods again. So we visit all frames up to the given
> depth and from there
> we continue to the entry frame. It is not necessary to continue beyond that
> entry frame, because
> escape analysis assumes that arguments to native functions escape globally.
> 
> Example: Let the java stack look like this:
> 
> +---------+
> | Frame A |
> +---------+
> | Frame N |
> +---------+
> | Frame B |
> +---------+ <- top of stack
> 
> Where java method A calls native method N and N calls java method B.
> 
> Very simplified the native stack will look like this
> 
> +-------------------------+
> | Frame of JIT Compiled A |
> +-------------------------+
> | Frame N                 |
> +-------------------------+
> | Entry Frame             |
> +-------------------------+
> | Frame B                 |
> +-------------------------+ <- top of stack
> 
> The entry frame is an activation of the call stub, which is a small assembler
> routine that
> translates from the native calling convention to the java calling convention.
> 
> There cannot be any ArgEscape that is passed to B (see above), therefore we
> can stop the stackwalk
> at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A,
> as it is directly
> accessed. 
Ok, thanks, nice explanation!!
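
The walk described above can be sketched as a small standalone C++ model. It
is only an approximation under stated assumptions: Frame, Kind and
frames_to_visit() are hypothetical names, the stack is a plain vector with
the top of stack at index 0, and depth counts the frames of the java stack
(B, N, A) but not the entry frame.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Java, Native, Entry };

struct Frame {
  std::string name;
  Kind kind;
};

// Returns the names of the frames the walk looks at, top of stack first.
// All frames up to 'depth' are visited; beyond that the walk continues only
// until it reaches the next entry frame.
std::vector<std::string> frames_to_visit(const std::vector<Frame>& stack, int depth) {
  std::vector<std::string> visited;
  int cur_depth = 0;  // counts java stack frames seen so far
  for (const Frame& f : stack) {
    if (f.kind == Kind::Entry && cur_depth >= depth) {
      break;  // arguments passed through native code escape globally
    }
    visited.push_back(f.name);
    if (f.kind != Kind::Entry) {
      cur_depth++;
    }
  }
  return visited;
}
```

With the stack B, Entry, N, A from the example, depth 1 stops at the entry
frame and visits only B, while depth 3 walks all the way to A.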

> > Isn't vf->is_compiled_frame() prerequisite that "Move to next physical
> frame"
> > is needed? You could move it into the other check.
> > If so, similar for deoptimize_objects_all_threads().
> 
> Only compiledVFrame require moving to the /top/ frame. Fixed.
Thanks, this looks better.

> > Synchronization: looks good. I think others had a look at this before.
> >
> > EscapeBarrier::deoptimize_objects_internal()
> >   The method name is misleading, it is not used by
> >   deoptimize_objects().
> >   Also, a method with the same name is in Deoptimization.
> >   Proposal: deoptimize_objects_thread() ?
> 
> Sorry, but I don't see, why it would be misleading.
> What would be the meaning of 'deoptimize_objects_thread'? I don't
> understand that name.
1. I have no idea why it's called "_internal". Because it is private?
   By the name, I would expect that EscapeBarrier::deoptimize_objects()
   calls it for some internal tasks. But it does not.
2. My proposal: deoptimize_objects_all_threads() iterates all threads 
and calls deoptimize_objects(_one)_thread(thread) for each of these.
That's how I would have named it. 
But no bike shedding, if you don't see what I mean it's not obvious.


> > C1 stubs: this really shows you tested all configurations, great!
> >
> >
> > mutexLocker: ok.
> > objectMonitor.cpp: ok
> > stackValue.hpp   Is this missing clearing a bug?
> 
> In short: that change is not needed anymore. I'll remove it again.
Good. Thanks for the details.

> > Renaming deferred_locals to deferred_updates is good, as well as
> > adding a datastructure for it.
> > (Adding this data structure might be a breakout, too.)
> >
> > good.
> >
> > thread.cpp
> >
> > good.
> >
> > vframe.cpp
> >
> > Is this a bug in existing code?
> > Makes sense.
> 
> Depends on your definition of bug. There are no references to
> vframe::is_entry_frame() in the
> existing code. I would think it is a bug.
So it is :)

> 
> >
> > vframe_hp.hpp
> > (What stands _hp for? helper? The file should be named
> compiledVFrame ...)
> >
> > not_global_escape_in_scope() ...
> > Again, you mention escape analysis here. Comments above hold, too.
> 
> I think it is the right name, because it is meaningful and simple.
Ok, accepted ... given my understandings from above.

> 
> > You introduce JvmtiDeferredUpdates. Good.
> >
> > vframe_hp.cpp
> >
> > Changes for JvmtiDeferredUpdates, escape state accessors,
> >
> > line 422:
> > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> >
> >
> > macros.hpp
> >   Good.
> >
> >
> > Test coding
> > ============
> >
> > compileBroker.h|cpp
> >
> > You introduce a third class of threads handled here and
> > add a new flag to distinguish it. Before, the two kinds
> > of threads were distinguished implicitly by passing in
> > a compiler for compiler threads.
> > The new thread kind is only used for testing in debug.
> >
> > make_thread:
> > You could assert (comp != NULL...) to assure previous
> > conditions.
> 
> I replaced the if-statements with a switch-statement, made sure all
> enum-elements are covered, and added the assertion you suggested.
> 
> > line 989 indentation broken
> 
> You are referring to this block I assume:
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> 
>  976   if (MethodFlushing) {
>  977     // Initialize the sweeper thread
>  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
>  979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
>  981   }
>  982
>  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
>  984   if (DeoptimizeObjectsALot == 2) {
>  985     // Initialize and start the object deoptimizer threads
>  986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
>  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
>  988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
>  990     }
>  991   }
>  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> 
> I cannot really see broken indentation here. Am I looking at the wrong
> location?
I don't have the source version I reviewed last time any more, so 
I can't check. But maybe an artefact from patching ... if there were
tabs jcheck would have told you, so that's not it. No problem.

Best regards,
  Goetz.

From coleen.phillimore at oracle.com  Wed Jul 22 12:25:13 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 22 Jul 2020 08:25:13 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
 <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>
Message-ID: <e6a0a004-7805-7985-d844-5a2e74cf0814@oracle.com>

Ok, looks good to me.
Coleen

On 7/21/20 10:46 PM, David Holmes wrote:
> Hi Coleen,
>
> On 22/07/2020 4:01 am, coleen.phillimore at oracle.com wrote:
>>
>> This looks like a nice cleanup.
>
> Thanks for looking at this.
>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
>>
>>
>> I'm wondering why you took out the NULL return for make_local() 
>> without a thread argument? Here you may call Thread::current() 
>> unnecessarily.
>>
>>   jobject JNIHandles::make_local(oop obj) {
>> - if (obj == NULL) {
>> - return NULL; // ignore null handles
>> - } else {
>> - Thread* thread = Thread::current();
>> - assert(oopDesc::is_oop(obj), "not an oop");
>> - assert(!current_thread_in_native(), "must not be in native");
>> - return thread->active_handles()->allocate_handle(obj);
>> - }
>> + return make_local(Thread::current(), obj);
>>   }
>
> I was simply using a standard call forwarding pattern to avoid code 
> duplication. I suspect passing NULL is very rare so the unnecessary 
> Thread::current() call is not an issue. Otherwise, if not NULL, the 
> NULL check would happen twice (unless I keep the duplicated 
> implementations).
>
>> Beyond the scope of this fix, but it'd be cool to not have a version 
>> that doesn't take thread, since there may be many more callers that 
>> already have Thread::current().
>
> Indeed! And in fact I had missed a number of these in jvm.cpp and 
> jni.cpp so I have fixed those. I've filed a RFE for other cases:
>
> https://bugs.openjdk.java.net/browse/JDK-8249837
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/
>
> If this passes tier 1-3 re-testing then I plan to push.
>
> Thanks,
> David
> -----
>
>> Coleen
>>
>>
>> On 7/20/20 1:53 AM, David Holmes wrote:
>>> Hi Kim,
>>>
>>> Thanks for looking at this.
>>>
>>> Updated webrev at:
>>>
>>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>>>
>>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>>> On Jul 20, 2020, at 12:16 AM, David Holmes 
>>>>> <david.holmes at oracle.com> wrote:
>>>>>
>>>>> Subject line got truncated by accident ...
>>>>>
>>>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>>>> This is a simple cleanup that touches files across a number of VM 
>>>>>> areas - hence the cross-post.
>>>>>> Whilst working on a different JNI fix I noticed that in most 
>>>>>> cases in jni.cpp we were using the following form of make_local:
>>>>>> JNIHandles::make_local(env, obj);
>>>>>> and what that form does is first extract the thread from the JNIEnv:
>>>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>>>> return thread->active_handles()->allocate_handle(obj);
>>>>>> but there is also another, faster, variant for when you already 
>>>>>> have the "thread":
>>>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>>>   return thread->active_handles()->allocate_handle(obj);
>>>>>> }
>>>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>>>> from the JNIEnv:
>>>>>>     JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>>>> and further defined:
>>>>>>     Thread* THREAD = thread;
>>>>>> so we always already have direct access to the "thread" available 
>>>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>>>> Along the way I spotted some related issues with unnecessary use 
>>>>>> of Thread::current() when it is already available from TRAPS, and 
>>>>>> some other cases where we extracted the JNIEnv from a thread only 
>>>>>> to later extract the thread from the JNIEnv.
>>>>>> Testing: tiers 1 - 3
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/classfile/javaClasses.cpp
>>>>   439     JNIEnv *env = thread->jni_environment();
>>>>
>>>> Since env is no longer used on the next line, move this down to where
>>>> it is used, at line 444.
>>>
>>> Fixed.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/classfile/verifier.cpp
>>>>   299   JNIEnv *env = thread->jni_environment();
>>>>
>>>> env now seems to only be used at line 320. Move this closer.
>>>
>>> Fixed.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/prims/jni.cpp
>>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>>>
>>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>>> previously it just used "thread". Maybe this change shouldn't be made?
>>>> Or can the other uses be changed to THREAD for consistency?
>>>
>>> "thread" and "THREAD" are interchangeable for anything expecting a 
>>> "Thread*" (and somewhat surprisingly a number of API's that only 
>>> work for JavaThreads actually take a Thread*. :( ). I had choice 
>>> between trying to be file-wide consistent with the make_local calls, 
>>> versus local-code consistent, and used THREAD as it is available in 
>>> both JNI_ENTRY and via TRAPS. But I can certainly make a local 
>>> change to "thread" for local consistency.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/prims/jvm.cpp
>>>>
>>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>>> instead of "THREAD", even though other places nearby are using
>>>> "THREAD". That inconsistency is kind of unfortunate, but doesn't seem
>>>> easily avoidable.
>>>
>>> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
>>> use "thread" instead. But I'm not sure it's a consistency worth 
>>> pursuing at least as part of these changes (there are likely similar 
>>> issues with most of the touched files).
>>>
>>> Thanks,
>>> David
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>>
>>
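
A note for readers of the archive: the call-forwarding pattern discussed in
the quoted text can be sketched in standalone C++. This is not the HotSpot
implementation. Obj, Thread, current_thread() and the counters are
hypothetical stand-ins, and the NULL check in the two-argument variant is an
assumption taken from the remark that the check would otherwise happen twice.

```cpp
#include <cassert>
#include <cstddef>

struct Obj {};  // hypothetical stand-in for an oop

// Hypothetical stand-in for a thread with a local handle block.
struct Thread {
  int handles_allocated = 0;
  const Obj* allocate_handle(const Obj* obj) {
    handles_allocated++;
    return obj;  // the sketch simply hands the pointer back as the "handle"
  }
};

static Thread g_current;           // stand-in for the current thread
static int g_current_lookups = 0;  // counts calls that model Thread::current()

Thread* current_thread() {
  g_current_lookups++;
  return &g_current;
}

// Variant for callers that already have the thread at hand.
const Obj* make_local(Thread* thread, const Obj* obj) {
  if (obj == nullptr) {
    return nullptr;  // ignore null handles
  }
  return thread->allocate_handle(obj);
}

// Forwarding variant: no duplicated body, at the price of one thread lookup
// that is wasted when obj is null.
const Obj* make_local(const Obj* obj) {
  return make_local(current_thread(), obj);
}
```

In this model a null argument costs one unnecessary lookup and allocates no
handle, which is the trade-off weighed in the thread.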


From goetz.lindenmaier at sap.com  Wed Jul 22 16:21:38 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 22 Jul 2020 16:21:38 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled internal threads are started that
> > > deoptimize frames and
> > >   objects. The number of threads started are given with
> > > DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all
> existing
> > > threads whereas the
> > >   latter operates on a single thread selected round robin.
> > >
> > >   I removed the mode where deoptimizations were performed at every nth
> > > exit from the runtime. I never used it.
> 
> > Do I get it right? You have a n:1 and a n:all test scenario.
> >  n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
> >  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?
> 
> Not quite.
> 
> -XX:+DeoptimizeObjectsALot // required
> -XX:DeoptimizeObjectsALotThreadCountAll=m
> -XX:DeoptimizeObjectsALotThreadCountSingle=n
> 
> Will start m+n threads. Each operating on all existing JavaThreads using
> EscapeBarriers. The
> difference between the 2 thread types is that one distinct EscapeBarrier
> targets either just a
> single thread or all existing threads at once. If just one single thread is
> targeted per
> EscapeBarrier, then it is not always the same thread, but threads are selected
> round robin. So there
> will be n threads selecting independently single threads round robin per
> EscapeBarrier and m threads
> that target all threads in every EscapeBarrier.
Ok, yes, that is how I understood it. 
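
To make the two thread types concrete, here is a small standalone C++ sketch
of the target selection. It is a model only: RoundRobinSelector and
barrier_targets() are hypothetical names, and threads are represented by
their index in the list of existing threads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-deoptimizer-thread state for the "single" flavour:
// each such thread keeps its own cursor and advances it independently.
class RoundRobinSelector {
 public:
  // Returns the index of the next target among num_threads existing threads.
  std::size_t next(std::size_t num_threads) {
    assert(num_threads > 0);
    std::size_t idx = _cursor % num_threads;  // the thread list may have shrunk
    _cursor = idx + 1;
    return idx;
  }

 private:
  std::size_t _cursor = 0;
};

// Targets of one barrier: every thread for the "all" flavour, exactly one
// round-robin selected thread for the "single" flavour.
std::vector<std::size_t> barrier_targets(bool target_all,
                                         std::size_t num_threads,
                                         RoundRobinSelector& selector) {
  std::vector<std::size_t> targets;
  if (target_all) {
    for (std::size_t i = 0; i < num_threads; i++) {
      targets.push_back(i);
    }
  } else {
    targets.push_back(selector.next(num_threads));
  }
  return targets;
}
```

With three existing threads, successive "single" barriers from one
deoptimizer thread target index 0, 1, 2 and then 0 again, while an "all"
barrier targets all three at once.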
 
> > > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > > execute it always independently
> > >   of is_thread_fully_suspended().
> > Is this also a performance optimization?
> 
> Maybe a minor one.
OK

> > > * JavaThread::wait_for_object_deoptimization():
> > >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > > safepoint check! This
> > >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> > OK. As I understand, there was one safepoint check in the old version,
> > now there is one in each iteration.  I assume this is intended, right?
> 
> Yes it is. The important thing here is (A) a safepoint check is needed /after/
> leaving a safe state
> (_thread_in_native, _thread_blocked). (B) Shared variables that are modified
> at safepoints or with handshakes need to be reread /after/ the safepoint check.
> 
> BTW: I only noticed now that since JDK-8240918 JavaThreads themselves
> must disarm their polling
> page. Originally (before handshakes) this was done by the VM thread. With
> handshakes it was done by
> the thread executing the handshake op. This was changed for
> OrderAccess::cross_modify_fence() where
> the poll is left armed if the thread is in native and since JDK-8240918 it is
> always left armed. So
> when a thread leaves a safe state (native, blocked) and there was a
> handshake/vm op, it will always
> call SafepointMechanism::block_if_requested_slow(), even if the
> handshake/vm operation have been
> processed already and everybody else is happily executing bytecodes :)
Ok.

> Still (A) and (B) hold.
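
Points (A) and (B) can be illustrated with a single-threaded C++ model. This
is a sketch under loud assumptions, not the code in
JavaThread::wait_for_object_deoptimization(): JavaThreadModel, its members
and the pending_handshake hook are hypothetical, and the "resume" is
simulated inline instead of by another thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

struct JavaThreadModel {
  std::atomic<bool> obj_deopt_suspend{false};  // shared flag, set by a handshake
  std::function<void()> pending_handshake;     // operation run by the safepoint check
  int blocked_iterations = 0;                  // how often the thread had to wait

  // Models the safepoint/handshake poll: executes a pending operation, if any.
  void safepoint_check() {
    if (pending_handshake) {
      std::function<void()> op = pending_handshake;
      pending_handshake = nullptr;
      op();
    }
  }

  void wait_for_object_deoptimization() {
    bool suspended;
    do {
      // (A) poll after leaving the safe state
      safepoint_check();
      // (B) re-read the shared flag only after the poll, because the poll
      //     itself may have run the operation that sets it
      suspended = obj_deopt_suspend.load();
      if (suspended) {
        blocked_iterations++;
        // A real thread would block here until resumed; the sketch clears
        // the flag itself to stand in for the resuming thread.
        obj_deopt_suspend.store(false);
      }
    } while (suspended);
  }
};
```

If the flag were read before safepoint_check(), a suspend request installed
by the pending operation would go unnoticed in this model, which is the kind
of bug the fix addresses.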

> > >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > > microbenchmark [1]
> > Ok.  Nice improvement, nice catch!
> 
> Yes. It certainly took some time to find out.
> 
> > >
> > > I refer to some more changes answering your questions and comments
> inline
> > > below.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > [1] Microbenchmark:
> > >
> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > >
> 
> 
> > > > I understand you annotate at safepoints where the escape analysis
> > > > finds out that an object is "better" than global escape.
> > > > These are the cases where the analysis identifies optimization
> > > > opportunities. These annotations are then used to deoptimize
> > > > frames and the objects referenced by them.
> > > > Doesn't this overestimate the optimized
> > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > out.
> > >
> > > Yes, the implementation is conservative, but it is comparatively simple
> and
> > > the additional debug
> > > info is just 2 flags per safepoint.
> > Thanks. It also helped that you explained to me offline that
> > there are more optimizations than only lock elimination and scalar
> > replacement done based on the ea information.
> > The ea refines the IR graph which allows follow-up optimizations
> > which can not easily be tracked back to the escaping objects or
> > the call sites where they do not escape.
> > Thus, if there are non-global escaping objects, you have to
> > deoptimize the frame.
> > Did I repeat that correctly?
> 
> Mostly, but there are also cases where deoptimization is required if and only
> if ea-local objects
> are passed as arguments. This is the case when values are not read directly
> from a frame, but from a callee frame.
Hmm, don't get this completely, but ok.
  
> > > Accesses to instance
> > > members or array elements can be optimized as well.
> > You mean the compiler can/will ignore volatile or memory ordering
> > requirements for non-escaping objects? Sounds reasonable to do.
> 
> Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> Here is an example:
> Method A has a NoEscape allocation O that is not scalar replaced. A calls
> Method B, which is not
> inlined. When you use your debugger to break in B, then modify a field of O,
> then this modification
> would have no effect without deoptimization, because the jit assumes that B
> cannot modify O without
> a reference to it.
Yes, A can keep O in a register, while the JVMTI thread would write to 
the location in the stack where the local is held (if it was written back).

> > > > Synchronization: looks good. I think others had a look at this before.
> > > >
> > > > EscapeBarrier::deoptimize_objects_internal()
> > > >   The method name is misleading, it is not used by
> > > >   deoptimize_objects().
> > > >   Also, a method with the same name is in Deoptimization.
> > > >   Proposal: deoptimize_objects_thread() ?
> > >
> > > Sorry, but I don't see, why it would be misleading.
> > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > understand that name.
> > 1. I have no idea why it's called "_internal". Because it is private?
> >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> >    calls it for some internal tasks. But it does not.
> 
> Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> the suffix _internal
> is a match.
> 
> > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > That's how I would have named it.
> > But no bike shedding, if you don't see what I mean it's not obvious.
> Ok. We could have a quick call, too, if you like.

Ok, I think I have understood the remaining points.  I'm fine with this 
so far.

Thanks,
  Goetz.


From lois.foltan at oracle.com  Wed Jul 22 17:12:53 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Wed, 22 Jul 2020 13:12:53 -0400
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
 <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
 <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>
Message-ID: <f531bd56-4ce0-6581-b28e-3f8316764536@oracle.com>

On 7/22/2020 12:17 AM, David Holmes wrote:
> Hi Lois,
>
> On 22/07/2020 1:06 am, Lois Foltan wrote:
>> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>>
>>>
>>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>>> * Ioi Lam:
>>>>
>>>>> Hi please review this very simple fix:
>>>>>
>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>>>> @@ -51,8 +51,11 @@
>>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>     _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>>>     _length = length;
>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>     memcpy(_body, name, length);
>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>>>> +  // of the Symbol.
>>>>>   }
>>>> Won't this still trip memory debuggers? Symbol::identity_hash() 
>>>> implies
>>>> that the result is eventually used in a conditional operation (a hash
>>>> comparison perhaps). If it's possible one day to run Hotspot under
>>>> valgrind, this would result in false positives.
>>>
>>> Are you saying that valgrind will modify uninitialized memory 
>>> periodically after the constructor has returned, and thus will cause 
>>> Symbol::identity_hash() to return a different value?
>>>
>>> Without my patch, _body[1] is uninitialized for Symbols whose length 
>>> is 0 or 1. We have not heard of any issues related to valgrind and 
>>> Symbol::identity_hash().
>>>
>>> In fact, looking at the code history, the setting of "_body[0] = 0" 
>>> in Symbol::Symbol was introduced only recently (Feb 2020):
>>>
>>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>>
>>>
>>> I'll check with Lois who added the code to see the reason for doing it.
>>
>> Hi Ioi,
>>
>> Reviewing this JBS issue, I have concerns over leaving both _body[0] 
>> and now even _body[1] uninitialized. The signature processing 
>> frequently checks the first character of a Symbol via 
>> Symbol::char_at(0) to determine what type it is dealing with. Is 
>> there a danger that the uninitialized memory actually has a valid 
>> type indicator in it like an '[' character for example? The signature 
>> processing could potentially make wrong assumptions about the type it 
>> is trying to process.
>
> Aren't all the signature related symbols already guaranteed to not 
> have zero-length, or else the length is being pre-tested for zero?

Hi David,

I believe you are correct in that signature related symbols are already 
guaranteed to not have zero-length. I tried several different jasm 
files to introduce an empty string either for a class name or for a type 
of a field reference and did not get past class file parsing without a 
ClassFormatError.

Error: LinkageError occurred while loading main class HelloEmptyString
         java.lang.ClassFormatError: Class name is empty or contains 
illegal character in descriptor in class file HelloEmptyString


However, after reading the original issue in JDK-8249087, the objection 
wasn't that there was a needless initialization of _body[0] but instead 
it was around the inconsistency of initializing _body[0] and not 
_body[1]. I think it should be the responsibility of the Symbol class 
API to ensure initialization so that as a consumer I don't have to worry 
about _body[0] or _body[1]'s validity. So I prefer the change to 
actually add the initialization of _body[1]. Something like, "_body[0] 
= _body[1] = 0".
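
As a minimal sketch of that suggestion, assuming a simplified model rather
than the real Symbol class: SymbolModel, kMaxLen and init() are hypothetical,
and the body has a small fixed capacity here, whereas the real Symbol stores
longer names in memory that follows the header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified model of a symbol with a small inline body.
struct SymbolModel {
  static const int kMaxLen = 16;  // sketch-only capacity

  uint32_t hash_and_refcount;
  uint16_t length;
  uint8_t  body[kMaxLen];

  void init(const uint8_t* name, int len) {
    assert(len >= 0 && len <= kMaxLen);
    length = static_cast<uint16_t>(len);
    body[0] = body[1] = 0;          // defined values even for length 0 or 1
    std::memcpy(body, name, static_cast<std::size_t>(len));  // overwrites them when the name is long enough
  }
};
```

With this ordering, a consumer that looks at body[0] or body[1] of a very
short symbol sees 0 rather than whatever happened to be in memory before.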

Thanks,
Lois

>
> Thanks,
> David
> -----
>
>> Thanks,
>> Lois
>>
>>>
>>>
>>> Thanks
>>> - Ioi
>>>
>>>> Thanks,
>>>> Florian
>>>>
>>>
>>


From ioi.lam at oracle.com  Wed Jul 22 17:42:16 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 22 Jul 2020 10:42:16 -0700
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <f531bd56-4ce0-6581-b28e-3f8316764536@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
 <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
 <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>
 <f531bd56-4ce0-6581-b28e-3f8316764536@oracle.com>
Message-ID: <10e54211-4124-b055-ac8b-d31f0fbaca30@oracle.com>


On 7/22/20 10:12 AM, Lois Foltan wrote:
> On 7/22/2020 12:17 AM, David Holmes wrote:
>> Hi Lois,
>>
>> On 22/07/2020 1:06 am, Lois Foltan wrote:
>>> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>>>
>>>>
>>>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>>>> * Ioi Lam:
>>>>>
>>>>>> Hi please review this very simple fix:
>>>>>>
>>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>>>>> @@ -51,8 +51,11 @@
>>>>>>   Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>>     _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>>>>     _length = length;
>>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>>     memcpy(_body, name, length);
>>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>>>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>>>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>>>>> +  // of the Symbol.
>>>>>>   }
>>>>> Won't this still trip memory debuggers? Symbol::identity_hash() 
>>>>> implies
>>>>> that the result is eventually used in a conditional operation (a hash
>>>>> comparison perhaps). If it's possible one day to run Hotspot under
>>>>> valgrind, this would result in false positives.
>>>>
>>>> Are you saying that valgrind will modify uninitialized memory 
>>>> periodically after the constructor has returned, and thus will 
>>>> cause Symbol::identity_hash() to return a different value?
>>>>
>>>> Without my patch, _body[1] is uninitialized for Symbols whose 
>>>> length is 0 or 1. We have not heard of any issues related to 
>>>> valgrind and Symbol::identity_hash().
>>>>
>>>> In fact, looking at the code history, the setting of "_body[0] = 0" 
>>>> in Symbol::Symbol was introduced only recently (Feb 2020):
>>>>
>>>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>>>
>>>>
>>>> I'll check with Lois who added the code to see the reason for doing 
>>>> it.
>>>
>>> Hi Ioi,
>>>
>>> Reviewing this JBS issue, I have concerns over leaving both _body[0] 
>>> and now even _body[1] uninitialized. The signature processing 
>>> frequently checks the first character of a Symbol via 
>>> Symbol::char_at(0) to determine what type it is dealing with. Is 
>>> there a danger that the uninitialized memory actually has a valid 
>>> type indicator in it like an '[' character for example? The 
>>> signature processing could potentially make wrong assumptions about 
>>> the type it is trying to process.
>>
>> Aren't all the signature related symbols already guaranteed to not 
>> have zero-length, or else the length is being pre-tested for zero?
>
> Hi David,
>
> I believe you are correct in that signature related symbols are 
> already guaranteed to not have zero-length. I tried several different 
> jasm files to introduce an empty string either for a class name or for 
> a type of a field reference and did not get past class file parsing 
> without a ClassFormatError.
>
> Error: LinkageError occurred while loading main class HelloEmptyString
>         java.lang.ClassFormatError: Class name is empty or contains illegal character in descriptor in class file HelloEmptyString
>
>
> However, after reading the original issue in JDK-8249087, the 
> objection wasn't that there was a needless initialization of _body[0] 
> but instead it was around the inconsistency of initializing _body[0] 
> and not _body[1]. I think it should be the responsibility of the 
> Symbol class API to ensure initialization so that as a consumer I 
> don't have to worry about _body[0] or _body[1]'s validity. So I 
> prefer the change to actually add the initialization of _body[1]. 
> Something like, "_body[0] = _body[1] = 0".
>
Hi Lois,

The fact that _body[0..1] is in the Symbol header is just by coincidence 
-- we need only 6 bytes of meta-info about the Symbol, so we have 2 
bytes left over. We use these two left-over bytes for 
Symbol::identity_hash(). However, no one else should unconditionally 
read these 2 bytes -- if we change the Symbol header in the future, 
these two bytes may not be allocated anymore.

It looks like leaving _body[0..1] uninitialized is confusing, and will 
possibly lead to problems with valgrind as pointed out by Florian. How 
about this:

Symbol::Symbol(const u1* name, int length, int refcount) {
  _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
  _length = length;
  // _body[0..1] are allocated in the header just by coincidence in the current
  // implementation of Symbol. They are read by identity_hash(), so make sure they
  // are initialized.
  // No other code should assume that _body[0..1] are always allocated. E.g., do
  // not unconditionally read base()[0] as that will be invalid for an empty Symbol.
  _body[0] = _body[1] = 0;
  memcpy(_body, name, length);
}

I'll also change the bug title to "Always initialize _body[0..1] in 
Symbol constructor"
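
For illustration, here is a minimal standalone sketch of the pattern in the
constructor above: zero the two body bytes that share the header word, then
copy the name, so that a hash mixing those bytes never reads uninitialized
memory. MiniSymbol, its fixed 64-byte body, and mix_first_two_bytes() are
hypothetical stand-ins for this sketch only; they are not the HotSpot Symbol
class or its identity_hash().

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Symbol: 6 bytes of metadata followed by the body.
// In this sketch the body has a fixed capacity; the real class sizes it
// dynamically.
struct MiniSymbol {
  std::uint32_t hash_and_refcount;
  std::uint16_t length;
  std::uint8_t  body[64];

  MiniSymbol(const std::uint8_t* name, int len, std::uint16_t hash) {
    assert(len >= 0 && len <= static_cast<int>(sizeof(body)));
    hash_and_refcount = (static_cast<std::uint32_t>(hash) << 16) | 1u;
    length = static_cast<std::uint16_t>(len);
    // Always initialize the first two bytes, even for length 0 or 1.
    body[0] = body[1] = 0;
    if (len > 0) {
      std::memcpy(body, name, static_cast<std::size_t>(len));
    }
  }

  // Reads body[0..1] unconditionally, which is only safe because the
  // constructor initialized them.
  unsigned mix_first_two_bytes() const {
    unsigned first_two = (static_cast<unsigned>(body[0]) << 8) | body[1];
    return (static_cast<unsigned>(length) << 8) ^ first_two;
  }
};
```

With this shape an empty MiniSymbol yields a deterministic value from
mix_first_two_bytes(), which is the property a memory checker such as
valgrind would care about.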

----

As I discussed with Lois off-line:

There's Signature code that unconditionally reads _body[0], which would 
assert for an empty Symbol (but class loading checks for invalid 
signatures, which prevents this from happening):

BasicType Signature::basic_type(const Symbol* signature) {
  return basic_type(signature->char_at(0));
}

char Symbol::char_at(int index) const {
  assert(index >= 0 && index < length(), "symbol index overflow");
  return (char)base()[index];
}

Signature::basic_type() should be fixed to check the length and/or 
assert that the signature is a valid signature.
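
A sketch of what a length-checked first-character read could look like.
BasicKind, kind_of_first_char() and basic_kind_checked() are hypothetical
names, and std::string_view stands in for Symbol; this is an illustration of
the idea, not the actual HotSpot change.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, reduced set of type kinds for this sketch.
enum class BasicKind { Illegal, Int, Long, Object, Array };

// Maps the leading descriptor character to a kind.
BasicKind kind_of_first_char(char c) {
  switch (c) {
    case 'I': return BasicKind::Int;
    case 'J': return BasicKind::Long;
    case 'L': return BasicKind::Object;
    case '[': return BasicKind::Array;
    default:  return BasicKind::Illegal;
  }
}

// Checks the length before touching the first character, so an empty
// signature is reported as Illegal instead of reading past the end.
BasicKind basic_kind_checked(std::string_view signature) {
  if (signature.empty()) {
    return BasicKind::Illegal;
  }
  return kind_of_first_char(signature[0]);
}
```

The alternative mentioned above, asserting that the signature is valid,
would replace the early return with an assert and keep release builds free
of the extra branch.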

Thanks
- Ioi


From lois.foltan at oracle.com  Wed Jul 22 18:54:52 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Wed, 22 Jul 2020 14:54:52 -0400
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <10e54211-4124-b055-ac8b-d31f0fbaca30@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
 <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
 <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>
 <f531bd56-4ce0-6581-b28e-3f8316764536@oracle.com>
 <10e54211-4124-b055-ac8b-d31f0fbaca30@oracle.com>
Message-ID: <53d28cb5-072a-d5bc-72c8-614ac7e98242@oracle.com>

On 7/22/2020 1:42 PM, Ioi Lam wrote:
>
>
> On 7/22/20 10:12 AM, Lois Foltan wrote:
>> On 7/22/2020 12:17 AM, David Holmes wrote:
>>> Hi Lois,
>>>
>>> On 22/07/2020 1:06 am, Lois Foltan wrote:
>>>> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>>>>> * Ioi Lam:
>>>>>>
>>>>>>> Hi please review this very simple fix:
>>>>>>>
>>>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>>>>>> @@ -51,8 +51,11 @@
>>>>>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>>>>>    _length = length;
>>>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>>>    memcpy(_body, name, length);
>>>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>>>>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>>>>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>>>>>> +  // of the Symbol.
>>>>>>>  }
>>>>>> Won't this still trip memory debuggers? Symbol::identity_hash() implies
>>>>>> that the result is eventually used in a conditional operation (a hash
>>>>>> comparison perhaps). If it's possible one day to run Hotspot under
>>>>>> valgrind, this would result in false positives.
>>>>>
>>>>> Are you saying that valgrind will modify uninitialized memory 
>>>>> periodically after the constructor has returned, and thus will 
>>>>> cause Symbol::identity_hash() to return a different value?
>>>>>
>>>>> Without my patch, _body[1] is uninitialized for Symbols whose 
>>>>> length is 0 or 1. We have not heard of any issues related to 
>>>>> valgrind and Symbol::identity_hash().
>>>>>
>>>>> In fact, looking at the code history, the setting of "_body[0] = 
>>>>> 0" in Symbol::Symbol was introduced only recently (Feb 2020):
>>>>>
>>>>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>>>>
>>>>>
>>>>> I'll check with Lois who added the code to see the reason for 
>>>>> doing it.
>>>>
>>>> Hi Ioi,
>>>>
>>>> Reviewing this JBS issue, I have concerns over leaving both 
>>>> _body[0] and now even _body[1] uninitialized. The signature 
>>>> processing frequently checks the first character of a Symbol via 
>>>> Symbol::char_at(0) to determine what type it is dealing with. Is 
>>>> there a danger that the uninitialized memory actually has a valid 
>>>> type indicator in it like an '[' character for example? The 
>>>> signature processing could potentially make wrong assumptions about 
>>>> the type it is trying to process.
>>>
>>> Aren't all the signature related symbols already guaranteed to not 
>>> have zero-length, or else the length is being pre-tested for zero?
>>
>> Hi David,
>>
>> I believe you are correct in that signature related symbols are 
>> already guaranteed to not have zero-length. I tried several 
>> different jasm files to introduce an empty string either for a class 
>> name or for a type of a field reference and did not get past class 
>> file parsing without a ClassFormatError.
>>
>> Error: LinkageError occurred while loading main class HelloEmptyString
>>         java.lang.ClassFormatError: Class name is empty or contains illegal character in descriptor in class file HelloEmptyString
>>
>>
>> However, after reading the original issue in JDK-8249087, the 
>> objection wasn't that there was a needless initialization of _body[0] 
>> but instead it was around the inconsistency of initializing _body[0] 
>> and not _body[1]. I think it should be the responsibility of the 
>> Symbol class API to ensure initialization so that as a consumer I 
>> don't have to worry about _body[0] or _body[1]'s validity. So I 
>> prefer the change to actually add the initialization of _body[1]. 
>> Something like, "_body[0] = _body[1] = 0".
>>
> Hi Lois,
>
> The fact that _body[0..1] is in the Symbol header is just by 
> coincidence -- we need only 6 bytes of meta-info about the Symbol, so 
> we have 2 bytes left over. We use these two left-over bytes for 
> Symbol::identity_hash(). However, no one else should unconditionally 
> read these 2 bytes -- if we change the Symbol header in the future, 
> these two bytes may not be allocated anymore.
>
> It looks like leaving _body[0..1] uninitialized is confusing, and will 
> possibly lead to problems with valgrind as pointed out by Florian. How 
> about this:
>
> Symbol::Symbol(const u1* name, int length, int refcount) {
>   _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>   _length = length;
>   // _body[0..1] are allocated in the header just by coincidence in the current
>   // implementation of Symbol. They are read by identity_hash(), so make sure they
>   // are initialized.
>   // No other code should assume that _body[0..1] are always allocated. E.g., do
>   // not unconditionally read base()[0] as that will be invalid for an empty Symbol.
>   _body[0] = _body[1] = 0;
>   memcpy(_body, name, length);
> }
>
> I'll also change the bug title to "Always initialize _body[0..1] in 
> Symbol constructor"

This looks good to me. Thanks Ioi for changing it.

>
> ----
>
> As I discussed with Lois off-line:
>
> There's Signature code that unconditionally reads _body[0], which 
> would assert for an empty Symbol (but class loading checks for invalid 
> signatures, which prevents this from happening):
>
> BasicType Signature::basic_type(const Symbol* signature) {
>   return basic_type(signature->char_at(0));
> }
>
> char Symbol::char_at(int index) const {
>   assert(index >= 0 && index < length(), "symbol index overflow");
>   return (char)base()[index];
> }
>
> Signature::basic_type() should be fixed to check the length and/or 
> assert that the signature is a valid signature.

I just created an RFE for this, 
https://bugs.openjdk.java.net/browse/JDK-8249931

Thanks,
Lois

>
> Thanks
> - Ioi
>
>
>
>


From ioi.lam at oracle.com  Wed Jul 22 19:36:17 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 22 Jul 2020 12:36:17 -0700
Subject: RFR(L) 8244778 Archive full module graph in CDS
Message-ID: <c3b14359-e91e-5bab-bc92-384bb50a6949@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8244778
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v01/

Please review this patch that stores the full module graph in the CDS
archive heap. This reduces the initialization time of the basic JVM by
about 22%:

$ perf stat -r 100 bin/java -version
before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%)
after:  55,835,815 instructions 0.03109 secs elapsed (+- 0.65%)
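
As a quick check of the numbers above, the relative reduction can be
computed as (before - after) / before. The helper names below are
hypothetical. With the figures quoted, elapsed time drops by about 21.7%
(the "about 22%" in the summary) and the instruction count by roughly 43%.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction in percent; 'before' must be non-zero.
double percent_reduction(double before, double after) {
  assert(before != 0.0);
  return (before - after) / before * 100.0;
}

// Small tolerance helper for comparing floating-point results.
bool approx(double value, double expected, double tolerance) {
  return std::fabs(value - expected) < tolerance;
}
```

For example, percent_reduction(0.03971, 0.03109) gives the elapsed-time
figure and percent_reduction(98219329.0, 55835815.0) the instruction-count
figure.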

[1] Start with ModuleBootstrap.java. The current implementation is
    quite restrictive: the archived module graph is used only when no
    module options are specified.

    See ModuleBootstrap.mayUseArchivedBootLayer().

    We can probably support options such as main module and module path
    in a future RFE.

[2] In the current JDK implementation, there is no single object
    that represents "the module graph". Most of the information
    is stored in the archive bootLayer object, but a few additional
    restoration operations need to be performed:

    + See ModuleBootstrap.getArchivedBootLayer()
    + Some static fields need to be archived/restored in
      Module.java, BuiltinClassLoader.java, ClassLoaders.java
      and BootLoader.java

[3] I ran into a complication with two loader instances of
    PlatformClassLoader and AppClassLoader. They are stored in
    multiple tables inside the module graph (e.g.,
    BuiltinClassLoader$LoadedModule) so I cannot easily recreate
    them at runtime.

    However, these two loaders contain information specific to the
    dump time VM lifecycle (such as the classes that were loaded
    during CDS dumping) that needs to be scrubbed. I couldn't find an
    elegant way of doing this, so I added a private "resetArchivedStates"
    method to a few classes. They are called inside
    HeapShared::reset_archived_object_states().

[4] Related native data structures (PackageEntry and ModuleEntry)
    are also archived. Start with classLoaderData.cpp

Passes mach5 tiers 1-4. I will test with additional tiers.

Thanks
- Ioi

From richard.reingruber at sap.com  Wed Jul 22 20:18:23 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 20:18:23 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> > I'll answer to the obvious things in this mail now.
> > I'll go through the code thoroughly again and write
> > a review of my findings thereafter.
> As promised, a detailed walk-through, but without any major findings:

> c1_IR.hpp: ok
> ci_Env.h|cpp: ok
> compiledMethod.cpp, nmethod.cpp: ok
> debugInfoRec.h|cpp: ok
> scopeDesc.h|cpp ok

> compileBroker.h|cpp: 
> Maybe a bit of documentation on how and why you start 
> the threads? I had expected that there are two test
> scenarios run after each other, but now I understand 'Single'
> and 'All' run simultaneously.  Well, this really is a stress test!
> It is also good that the two variants of deoptimization are
> stressed against each other.
> Besides that, it is really nice that it's all in one place.

Done.

> rootResolver.cpp: ok
> jvmciCodeInstaller.cpp: ok

> c2compiler.cpp: The essence of this change! Just one line :)
> Great!

:)

> callnode.hpp ok
> escape.h|cpp ok
> macro.cpp 
> I was not that happy with the names saying not_global_escape
> and similar. I now agree you have to use the terms of the escape
> analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with 
> the 'not' in the term. I always try to expand the name to some
> sentence with a negated verb, but it makes no sense.
> For example, "has_not_global_escape_in_scope" expands to 
> "Hasn't a global escape in its scope." in my thinking, which makes 
> no sense. You probably mean
> "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
> in its scope."

> C2 is using the word "non" in this context, e.g., here 
> alloc->is_non_escaping.

There is also ConnectionGraph::not_global_escape()

> 'non' obviously negates the adjective 'global';
> non-global or even nonglobal is an English term I find on the 
> net. 
> So what about "has_non_global_escape_in_scope"?

And what about has_ea_local_in_scope?

> matcher.cpp ok

> output.cpp:1071
> Please break the long line.

Done.

> jvmtiCodeBlobEvents.cpp ok

> jvmtiEnv.cpp
> MaxJavaStackTraceDepth is only documented to affect
> the exceptions stack trace depth, not to limit jvmti 
> operations. Therefore I wondered why it is used here. 
> None of your business, but the flag should
> document this in globals.hpp, too.  
> Does jvmti specify that the same limits are used ...?
> ok on your side.

I don't know and didn't find anything in a quick search.

> jvmtiEnvBase.cpp  ok
> jvmtiImpl.h|cpp  ok
> jvmtiTagMap.cpp ok
> whitebox.cpp ok

> deoptimization.cpp

> line 177: Please break line
> line 246, 281: Please break line
> 1578, 1583, 1589, 1632, 1649, 1651 Break line

> 1651: You use 'non'-terms, too: non-escaping :)

I know :) At least here it is wrong, I'd say. "...has to be a not escaping obj..." sounds better
(hopefully not only to my German ears).

> 2805, 2929, 2946ff, break lines

> deoptimization.hpp

> 158, 174, 176 ... I would break lines too, but here you are in
> good company :)

Done.

> globals.hpp ok
> mutexLocker.h|cpp ok
> objectMonitor.cpp ok

> thread.cpp 

> 2631 typo: sapfepont --> safepoint

Done.

> thread.hpp ok
> thread.inline.hpp ok
> vframe.cpp ok
> vframe_hp.cpp   458ff break lines
> vframe_hp.hpp ok
> macros.hpp ok
> TEST.ROOT ok
> WhiteBox.java ok

> IterateHeapWithEscapeAnalysisEnabled.java

> line 415:
> msg("wait until target thread has set testMethod_result");
> while (testMethod_result == 0) {
>     Thread.sleep(50);
> }
> Might the test run into timeouts at this place?
> The field is volatile, i.e. it will be reloaded
> in each iteration. But will dontinline_testMethod
> write it back to main memory in time?

You mean, the test could hang in that loop for a couple of minutes? I don't
think so. There are cache coherence protocols in place which will invalidate
stale data promptly.

> libIterateHeapWithEscapeAnalysisEnabled.c ok

> EATests.java

> This is a very elaborate test.
> I found a row of test cases illustrating issues
> we talked about before. Really helpful!

> 1311: TypeO materialize -> materialized

Found and fixed the typo at line 1369.
(Probably the cursor was on 1311 and your eyes on 1369 ;))

> 1640: setting local variable i triggers always deoptimization
>   --> setting local variable i always triggers deoptimization

Fixed.

> 2176: dontinline_calee --> dontinline_callee
> 2510: poping --> popping  ... but I'm not sure here.

Done.

> https://www.urbandictionary.com/define.php?term=poping
> poping
> Drinking large amounts of Dextromethorphan Hydrobromide (DXM) based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
> ;)
> Don't do it!

OMG! How come you know?! ;)

> EATestsJVMTI.java

> I think you can just copy this test description into the other
> test. You can have two @test comments, they will be treated
> as separate tests.  The @requires will be evaluated accordingly.
> For an example see 
> test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
> which has two different compile setups for the test class (-g).

Done.

> so, that's it for reading code ...


> Some general remarks, maybe a bit picky ...:
> I think you could use fewer commas ',' in comments.
> As I understand, you need a comma if the relative
> sentence is at the beginning, but not if it is at 
> the end:
>   If Corona is over, I go to the office.
> but
>   I go to the office if Corona is over.

That seems to be correct, except that "If Corona is over" isn't a relative sentence
but a conditional sentence, isn't it?

The general rule seems to be: the subordinate clause is separated with a comma
from a following main clause. No comma separation is needed if the subordinate
clause follows the main clause.

Thanks, that's a lesson I learned!

> I think the same holds for 'because', 'while' etc.
> E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
> vframe_hp.hpp 104ff

Ok. I've removed quite a lot of the occurrences.

> Also, I like full sentences in comments.  
> Especially for me as a foreign speaker, this makes
> things much clearer. I.e., I try to make it
> a real sentence with articles, capitalized and with a
> dot at the end, if there is a subject and a verb
> in the first place.
> E.g., jvmtiEnvBase.cpp:1327

Are you referring to the following?
(from http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hotspot/share/prims/jvmtiEnvBase.cpp.frames.html)

1326 
1327   // If the frame is a compiled one, need to deoptimize it.
1328   if (vf->is_compiled_frame()) {

This line 1327 is preexisting.

> In many places, your comments read really 
> well but some are quite abbreviated I think.

Yeah, but not only because I'm lazy... It is the style that I prefer and I think
it matches the surrounding code quite well.

> E.g. thread.cpp:2601 is an example where a simple
> 'a' helps a lot.
> "Single deoptimization is typically very short."
> I would add 'A': "A single deoptimization is typically very short (fast?)."
> Another meaning of the comment I first considered is this:
> "Single deoptimization is typically very short, all_threads deoptimization takes longer"
> having in mind the functions
> EscapeBarrier::deoptimize_objects_all_threads()  
> and 
> EscapeBarrier::deoptimize_objects() doing a single thread.
> German with its compound nouns is helpful here :)

> Einzeldeoptimierung <--> eine einzelne Deoptimierung

I've added the 'A' and I'll try to use complete sentences in the future. The
telegram style has advantages, too, though ;)

Thanks!

Cheers, Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Friday, 17 July 2020 14:31
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write
> a review of my findings thereafter.
As promised, a detailed walk-through, but without any major findings:

c1_IR.hpp: ok
ci_Env.h|cpp: ok
compiledMethod.cpp, nmethod.cpp: ok
debugInfoRec.h|cpp: ok
scopeDesc.h|cpp ok

compileBroker.h|cpp: 
Maybe a bit of documentation on how and why you start 
the threads? I had expected that there are two test
scenarios run after each other, but now I understand 'Single'
and 'All' run simultaneously.  Well, this really is a stress test!
It is also good that the two variants of deoptimization are
stressed against each other.
Besides that, it is really nice that it's all in one place.

rootResolver.cpp: ok
jvmciCodeInstaller.cpp: ok

c2compiler.cpp: The essence of this change! Just one line :)
Great!

callnode.hpp ok
escape.h|cpp ok
macro.cpp 
I was not that happy with the names saying not_global_escape
and similar. I now agree you have to use the terms of the escape
analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with 
the 'not' in the term. I always try to expand the name to some
sentence with a negated verb, but it makes no sense.
For example, "has_not_global_escape_in_scope" expands to 
"Hasn't a global escape in its scope." in my thinking, which makes 
no sense. You probably mean
"Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
in its scope."

C2 is using the word "non" in this context, e.g., here 
alloc->is_non_escaping.

'non' obviously negates the adjective 'global';
non-global or even nonglobal is an English term I find on the 
net. 
So what about "has_non_global_escape_in_scope"?

matcher.cpp ok

output.cpp:1071
Please break the long line.

jvmtiCodeBlobEvents.cpp ok

jvmtiEnv.cpp
MaxJavaStackTraceDepth is only documented to affect
the exceptions stack trace depth, not to limit jvmti 
operations. Therefore I wondered why it is used here. 
None of your business, but the flag should
document this in globals.hpp, too.  
Does jvmti specify that the same limits are used ...?
ok on your side.

jvmtiEnvBase.cpp  ok
jvmtiImpl.h|cpp  ok
jvmtiTagMap.cpp ok
whitebox.cpp ok

deoptimization.cpp

line 177: Please break line
line 246, 281: Please break line
1578, 1583, 1589, 1632, 1649, 1651 Break line

1651: You use 'non'-terms, too: non-escaping :)

2805, 2929, 2946ff, break lines

deoptimization.hpp

158, 174, 176 ... I would break lines too, but here you are in
good company :)

globals.hpp ok
mutexLocker.h|cpp ok
objectMonitor.cpp ok

thread.cpp 

2631 typo: sapfepont --> safepoint

thread.hpp ok
thread.inline.hpp ok
vframe.cpp ok
vframe_hp.cpp   458ff break lines
vframe_hp.hpp ok
macros.hpp ok
TEST.ROOT ok
WhiteBox.java ok

IterateHeapWithEscapeAnalysisEnabled.java

line 415:
msg("wait until target thread has set testMethod_result");
while (testMethod_result == 0) {
    Thread.sleep(50);
}
Might the test run into timeouts at this place?
The field is volatile, i.e. it will be reloaded
in each iteration. But will dontinline_testMethod
write it back to main memory in time?

libIterateHeapWithEscapeAnalysisEnabled.c ok

EATests.java

This is a very elaborate test.
I found a row of test cases illustrating issues
we talked about before. Really helpful!

1311: TypeO materialize -> materialized

1640: setting local variable i triggers always deoptimization
  --> setting local variable i always triggers deoptimization

2176: dontinline_calee --> dontinline_callee
2510: poping --> popping  ... but I'm not sure here.

https://www.urbandictionary.com/define.php?term=poping
poping
Drinking large amounts of Dextromethorphan Hydrobromide (DXM) based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
;)
Don't do it!

EATestsJVMTI.java

I think you can just copy this test description into the other
test. You can have two @test comments, they will be treated
as separate tests.  The @requires will be evaluated accordingly.
For an example see 
test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
which has two different compile setups for the test class (-g).

so, that's it for reading code ...


Some general remarks, maybe a bit picky ...:
I think you could use fewer commas ',' in comments.
As I understand, you need a comma if the relative
sentence is at the beginning, but not if it is at 
the end:
  If Corona is over, I go to the office.
but
  I go to the office if Corona is over.
I think the same holds for 'because', 'while' etc.
E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
vframe_hp.hpp 104ff

Also, I like full sentences in comments.  
Especially for me as a foreign speaker, this makes
things much clearer. I.e., I try to make it
a real sentence with articles, capitalized and with a
dot at the end, if there is a subject and a verb
in the first place.
E.g., jvmtiEnvBase.cpp:1327
In many places, your comments read really 
well but some are quite abbreviated I think.

E.g. thread.cpp:2601 is an example where a simple
'a' helps a lot.
"Single deoptimization is typically very short."
I would add 'A': "A single deoptimization is typically very short (fast?)."
Another meaning of the comment I first considered is this:
"Single deoptimization is typically very short, all_threads deoptimization takes longer"
having in mind the functions
EscapeBarrier::deoptimize_objects_all_threads()  
and 
EscapeBarrier::deoptimize_objects() doing a single thread.
German with its compound nouns is helpful here :)

Einzeldeoptimierung <--> eine einzelne Deoptimierung

Best regards,
  Goetz.


From yumin.qi at oracle.com  Wed Jul 22 20:47:07 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 22 Jul 2020 13:47:07 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
Message-ID: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>

Hi, please review this tiny change to a comment:

bug: https://bugs.openjdk.java.net/browse/JDK-8249624

webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/


Note that 8081416 is already marked as fixed (thanks Ioi); please read the 
comment on https://bugs.openjdk.java.net/browse/JDK-8081416

Now that CDS can be used with UseCompressedOops disabled, the test already 
has the correct result.


Thanks

Yumin


From richard.reingruber at sap.com  Wed Jul 22 20:53:19 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 20:53:19 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331CCE0DCF038DE3E6838BF9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> Thanks for the quick reply.

Yes, this time it didn't take that long...

[... snip ...]

> > > > > I understand you annotate at safepoints where the escape analysis
> > > > > finds out that an object is "better" than global escape.
> > > > > These are the cases where the analysis identifies optimization
> > > > > opportunities. These annotations are then used to deoptimize
> > > > > frames and the objects referenced by them.
> > > > > Doesn't this overestimate the optimized
> > > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > > out.
> > > >
> > > > Yes, the implementation is conservative, but it is comparatively simple
> > > > and the additional debug info is just 2 flags per safepoint.
> > > Thanks. It also helped that you explained to me offline that
> > > there are more optimizations than only lock elimination and scalar
> > > replacement done based on the ea information.
> > > The ea refines the IR graph, which allows follow-up optimizations
> > > that cannot easily be traced back to the escaping objects or
> > > the call sites where they do not escape.
> > > Thus, if there are non-global escaping objects, you have to
> > > deoptimize the frame.
> > > Did I repeat that correctly?
> > 
> > Mostly, but there are also cases where deoptimization is required if and only
> > if ea-local objects are passed as arguments. This is the case when values are
> > not read directly from a frame, but from a callee frame.
> Hmm, don't get this completely, but ok.

Let C be a callee frame of B which is a callee of A. If you use JVMTI to read an
object reference from a local variable of C then the implementation of
JDK-8227745 deoptimizes A if it passes any ea-local as argument, because the
reference could be ea-local in A and there might be optimizations that are
invalid after the escape state change.
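
A rough Java sketch of the A/B/C situation, for illustration only. The class
and method names are made up and do not come from the webrev or its tests.
Whether the JIT really classifies p as ea-local in a() depends on inlining
decisions, so that part is an assumption.

```java
// Hypothetical illustration, not part of the JDK-8227745 change.
final class EaLocalArgumentSketch {
    static final class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    // Frame A: p is never stored into a field or static, so escape analysis
    // may treat it as ea-local (assumption: b() is compiled but not inlined).
    static int a() {
        Point p = new Point(1);
        b(p);            // A passes an ea-local object as an argument
        return p.x;      // compiled code may rely on p.x still being 1
    }

    // Frame B: only forwards the reference to its callee.
    static void b(Point p) {
        c(p);
    }

    // Frame C: an agent stopped here could read the local q through JVMTI.
    // The reference would then escape, which is why A has to be deoptimized.
    static void c(Point q) {
        if (q == null) {
            throw new IllegalArgumentException("q must not be null");
        }
    }

    public static void main(String[] args) {
        System.out.println(a());
    }
}
```

Without an agent attached the sketch just returns the stored value; the
interesting part is only what an agent could observe while stopped in c().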
  
> > > > Accesses to instance
> > > > members or array elements can be optimized as well.
> > > You mean the compiler can/will ignore volatile or memory ordering
> > > requirements for non-escaping objects? Sounds reasonable to do.
> > 
> > Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> > Here is an example:
> > Method A has a NoEscape allocation O that is not scalar replaced. A calls
> > Method B, which is not inlined. When you use your debugger to break in B, then
> > modify a field of O, then this modification would have no effect without
> > deoptimization, because the jit assumes that B cannot modify O without
> > a reference to it.
> Yes, A can keep O in a register, while the JVMTI thread would write to 
> the location in the stack where the local is held (if it was written back).

Not quite. It is the value of the field of O that is in a register, not the
reference to O itself. The agent changes the field's value in the /java heap/
(remember: O is _not_ scalar replaced), but the field's value is not reloaded
after return from B.
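
The example can be written down roughly as follows. This is a sketch with
invented names (StaleFieldSketch, Box). It assumes that the JIT does not
inline b() and, for whatever reason, does not scalar replace o; neither is
guaranteed for code this small.

```java
// Hypothetical illustration of the stale field value, not taken from the webrev.
final class StaleFieldSketch {
    static final class Box {
        int value;
    }

    // Method A: o is a NoEscape allocation because it is never passed on.
    static int a() {
        Box o = new Box();
        o.value = 42;    // the JIT may keep 42 in a register from here on
        b();             // o is not an argument, so B cannot modify it
        return o.value;  // may be answered from the register, without a reload
    }

    // Method B: the place where a debugger would break. An agent that changes
    // o.value in the Java heap at this point is only seen by a() if a() is
    // deoptimized first.
    static void b() {
    }

    public static void main(String[] args) {
        System.out.println(a());
    }
}
```

To keep b() out of line when experimenting, something like
-XX:CompileCommand=dontinline,StaleFieldSketch::b could be tried.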

> > > > > Synchronization: looks good. I think others had a look at this before.
> > > > >
> > > > > EscapeBarrier::deoptimize_objects_internal()
> > > > >   The method name is misleading, it is not used by
> > > > >   deoptimize_objects().
> > > > >   Also, a method with the same name is in Deoptimization.
> > > > >   Proposal: deoptimize_objects_thread() ?
> > > >
> > > > Sorry, but I don't see, why it would be misleading.
> > > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > > understand that name.
> > > 1. I have no idea why it's called "_internal". Because it is private?
> > >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> > >    calls it for some internal tasks. But it does not.
> > 
> > Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> > the suffix _internal is a match.
> > 
> > > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > > That's how I would have named it.
> > > But no bike shedding, if you don't see what I mean it's not obvious.
> > Ok. We could have a quick call, too, if you like.

> Ok, I think I have understood the remaining points.  I'm fine with this 
> so far.

Thanks again and best regards,
Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Mittwoch, 22. Juli 2020 18:22
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled, internal threads are started that
> > >   deoptimize frames and objects. The number of threads started is given with
> > >   DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> > >   threads whereas the latter operates on a single thread selected round robin.
> > >
> > >   I removed the mode where deoptimizations were performed at every nth
> > >   exit from the runtime. I never used it.
> 
> > Do I get it right? You have a n:1 and a n:all test scenario.
> >  n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
> >  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?
> 
> Not quite.
> 
> -XX:+DeoptimizeObjectsALot // required
> -XX:DeoptimizeObjectsALotThreadCountAll=m
> -XX:DeoptimizeObjectsALotThreadCountSingle=n
> 
> Will start m+n threads. Each operates on all existing JavaThreads using
> EscapeBarriers. The difference between the 2 thread types is that one distinct
> EscapeBarrier targets either just a single thread or all existing threads at
> once. If just one single thread is targeted per EscapeBarrier, then it is not
> always the same thread, but threads are selected round robin. So there will be
> n threads selecting independently single threads round robin per EscapeBarrier
> and m threads that target all threads in every EscapeBarrier.
Ok, yes, that is how I understood it. 
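
The two kinds of worker threads can be modelled in a few lines of Java. This
is only a toy model of the target selection described above, with invented
names; it is not how the VM implements DeoptimizeObjectsALot, and thread names
stand in for JavaThreads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the target selection, not VM code.
final class DeoptALotTargetsSketch {

    // Worker of the "single" kind: one target per barrier, chosen round robin.
    // Every worker keeps its own cursor, so workers select independently.
    static final class SingleTargetWorker {
        private int cursor;

        String nextTarget(List<String> javaThreads) {
            if (javaThreads.isEmpty()) {
                throw new IllegalArgumentException("no threads to target");
            }
            String target = javaThreads.get(cursor % javaThreads.size());
            cursor = (cursor + 1) % javaThreads.size();
            return target;
        }
    }

    // Worker of the "all" kind: every barrier targets all existing threads.
    static List<String> allTargets(List<String> javaThreads) {
        return new ArrayList<>(javaThreads);
    }

    // Targets picked by one "single" worker over the given number of barriers.
    static List<String> singleTargets(List<String> javaThreads, int barriers) {
        SingleTargetWorker worker = new SingleTargetWorker();
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < barriers; i++) {
            picked.add(worker.nextTarget(javaThreads));
        }
        return picked;
    }

    public static void main(String[] args) {
        List<String> threads = List.of("T1", "T2", "T3");
        System.out.println("all:    " + allTargets(threads));
        System.out.println("single: " + singleTargets(threads, 4));
    }
}
```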
 
> > > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > > execute it always independently
> > >   of is_thread_fully_suspended().
> > Is this also a performance optimization?
> 
> Maybe a minor one.
OK

> > > * JavaThread::wait_for_object_deoptimization():
> > >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > > safepoint check! This
> > >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> > OK. As I understand, there was one safepoint check in the old version,
> > now there is one in each iteration.  I assume this is intended, right?
> 
> Yes it is. The important thing here is (A) a safepoint check is needed /after/
> leaving a safe state (_thread_in_native, _thread_blocked). (B) Shared variables
> that are modified at safepoints or with handshakes need to be reread /after/
> the safepoint check.
> 
> BTW: I only noticed now that since JDK-8240918 JavaThreads themselves must
> disarm their polling page. Originally (before handshakes) this was done by the
> VM thread. With handshakes it was done by the thread executing the handshake op.
> This was changed for OrderAccess::cross_modify_fence() where the poll is left
> armed if the thread is in native and since JDK-8240918 it is always left armed.
> So when a thread leaves a safe state (native, blocked) and there was a
> handshake/vm op, it will always call
> SafepointMechanism::block_if_requested_slow(), even if the handshake/vm
> operation has been processed already and everybody else is happily executing
> bytecodes :)
Ok.

> Still (A) and (B) hold.

> > >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > > microbenchmark [1]
> > Ok.  Nice improvement, nice catch!
> 
> Yes. It certainly took some time to find out.
> 
> > >
> > > I refer to some more changes answering your questions and comments inline
> > > below.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > [1] Microbenchmark:
> > >
> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > >
> 
> 
> > > > I understand you annotate at safepoints where the escape analysis
> > > > finds out that an object is "better" than global escape.
> > > > These are the cases where the analysis identifies optimization
> > > > opportunities. These annotations are then used to deoptimize
> > > > frames and the objects referenced by them.
> > > > Doesn't this overestimate the optimized
> > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > out.
> > >
> > > Yes, the implementation is conservative, but it is comparatively simple and
> > > the additional debug info is just 2 flags per safepoint.
> > Thanks. It also helped that you explained to me offline that
> > there are more optimizations than only lock elimination and scalar
> > replacement done based on the ea information.
> > The ea refines the IR graph, which allows follow-up optimizations
> > that cannot easily be traced back to the escaping objects or
> > the call sites where they do not escape.
> > Thus, if there are non-global escaping objects, you have to
> > deoptimize the frame.
> > Did I repeat that correctly?
> 
> Mostly, but there are also cases where deoptimization is required if and only
> if ea-local objects are passed as arguments. This is the case when values are
> not read directly from a frame, but from a callee frame.
Hmm, don't get this completely, but ok.
  
> > > Accesses to instance
> > > members or array elements can be optimized as well.
> > You mean the compiler can/will ignore volatile or memory ordering
> > requirements for non-escaping objects? Sounds reasonable to do.
> 
> Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> Here is an example:
> Method A has a NoEscape allocation O that is not scalar replaced. A calls
> Method B, which is not inlined. When you use your debugger to break in B, then
> modify a field of O, then this modification would have no effect without
> deoptimization, because the jit assumes that B cannot modify O without
> a reference to it.
Yes, A can keep O in a register, while the JVMTI thread would write to 
the location in the stack where the local is held (if it was written back).

> > > > Synchronization: looks good. I think others had a look at this before.
> > > >
> > > > EscapeBarrier::deoptimize_objects_internal()
> > > >   The method name is misleading, it is not used by
> > > >   deoptimize_objects().
> > > >   Also, a method with the same name is in Deoptimization.
> > > >   Proposal: deoptimize_objects_thread() ?
> > >
> > > Sorry, but I don't see, why it would be misleading.
> > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > understand that name.
> > 1. I have no idea why it's called "_internal". Because it is private?
> >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> >    calls it for some internal tasks. But it does not.
> 
> Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> the suffix _internal is a match.
> 
> > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > That's how I would have named it.
> > But no bike shedding, if you don't see what I mean it's not obvious.
> Ok. We could have a quick call, too, if you like.

Ok, I think I have understood the remaining points.  I'm fine with this 
so far.

Thanks,
  Goetz.


From ioi.lam at oracle.com  Wed Jul 22 21:06:49 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Wed, 22 Jul 2020 14:06:49 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
Message-ID: <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>

Hi Yumin,

Just small nits on the comments:

// UseCompressedOops default is turned on when heap is under 32G but will be

-> UseCompressedOops is turned on by default ....

// turned off when heap is greater than 32G. This leads inconsistency

-> This leads to inconsistency ...

// of UseCompressedOops at dump time and runtime.


Thanks
- Ioi

On 7/22/20 1:47 PM, Yumin Qi wrote:
> Hi, Please review this tiny change on comment:
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>
> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>
>
> Note that 8081416 is already marked as fixed (thanks Ioi), please read the
> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>
> Since CDS can be used with UseCompressedOops disabled, the test already
> has the correct result.
>
>
> Thanks
>
> Yumin
>
>


From calvin.cheung at oracle.com  Wed Jul 22 21:32:53 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Wed, 22 Jul 2020 14:32:53 -0700
Subject: RFR(XS): 8249630: unused is_static_archive parameter in
 SystemDictionaryShared::write_dictionary
Message-ID: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>

JBS: https://bugs.openjdk.java.net/browse/JDK-8249630

webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8249630/webrev.00/

Please review this small cleanup.

Passed tier1 and tier2 tests.

thanks,

Calvin


From harold.seigel at oracle.com  Wed Jul 22 22:05:10 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Wed, 22 Jul 2020 18:05:10 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
Message-ID: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>

Hi,

Please review this small fix to avoid running test 
AbstractMethodErrorTest.java with Graal and remove it from the ProblemList.

Open Webrev: 
http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582

The change was tested by using mach5 testing and checking that the test 
was not run in tier*-graal tasks but was run in non-graal tasks.

Thanks, Harold


From yumin.qi at oracle.com  Wed Jul 22 22:55:29 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 22 Jul 2020 15:55:29 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
Message-ID: <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>

Hi Ioi,

I have updated the wording as you suggested. Also, to be more precise, the
max heap size for compressed oops is around 31G, which is calculated
by max_heap_for_compressed_oops().

Updated on the same webrev.

$J6/bin/java -Xshare:on -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
java version "16-internal" 2021-03-16
Java(TM) SE Runtime Environment (slowdebug build 16-internal+0-adhoc.minqi.open)
Java HotSpot(TM) 64-Bit Server VM (slowdebug build 16-internal+0-adhoc.minqi.open, mixed mode, sharing)

$J6/bin/java -Xshare:on -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
An error has occurred while processing the shared archive file.
Unable to use shared archive.
The saved state of UseCompressedOops and UseCompressedClassPointers is different from runtime, CDS will be disabled.
Error occurred during initialization of VM
Unable to use shared archive.
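
To see which way the default went for a given -Xmx, the flag can also be read
from inside the running VM. A minimal sketch, assuming a HotSpot VM that
exposes HotSpotDiagnosticMXBean; the class name CompressedOopsCheck is made up
for this example, and "unavailable" is returned where the bean or the option
does not exist.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the test being reviewed.
final class CompressedOopsCheck {

    // Returns "true" or "false" as reported by the VM, or "unavailable"
    // if the option cannot be queried on this VM.
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the bean or the option is unknown to this VM.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("UseCompressedOops: " + useCompressedOops());
    }
}
```

Running it once with -Xmx31G and once with -Xmx32G should show the flag
flipping, in line with the two runs above.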


Thanks

Yumin

On 7/22/20 2:06 PM, Ioi Lam wrote:
> Hi Yumin,
>
> Just small nits on the comments:
>
> // UseCompressedOops default is turned on when heap is under 32G but 
> will be
>
> -> UseCompressedOops is turned on by default ....
>
> // turned off when heap is greater than 32G. This leads inconsistency
>
> -> This leads to inconsistency ...
>
> // of UseCompressedOops at dump time and runtime.
>
>
> Thanks
> - Ioi
>
> On 7/22/20 1:47 PM, Yumin Qi wrote:
>> Hi, Please review this tiny change on comment:
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>
>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>
>>
>> Note that 8081416 is already marked as fixed (thanks Ioi), please read the
>> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>
>> Since CDS can be used with UseCompressedOops disabled, the test
>> already has the correct result.
>>
>>
>> Thanks
>>
>> Yumin
>>
>>
>

From yumin.qi at oracle.com  Wed Jul 22 22:58:27 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Wed, 22 Jul 2020 15:58:27 -0700
Subject: RFR(XS): 8249630: unused is_static_archive parameter in
 SystemDictionaryShared::write_dictionary
In-Reply-To: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
References: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
Message-ID: <bab6559e-a8cf-5987-19f5-0a46e3b65186@oracle.com>

Looks good!


Thanks

Yumin

On 7/22/20 2:32 PM, Calvin Cheung wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249630
>
> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8249630/webrev.00/
>
> Please review this small cleanup.
>
> Passed tier1 and tier2 tests.
>
> thanks,
>
> Calvin
>

From david.holmes at oracle.com  Wed Jul 22 23:00:43 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 09:00:43 +1000
Subject: RFR (trivial) 8249940: Remove unnecessary includes of jni_util.h in
 native tests
Message-ID: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8249940
webrev: http://cr.openjdk.java.net/~dholmes/8249940/webrev/

A number of native tests in hotspot and jdk include the jni_util.h 
header file which is part of the sources for libjava and not part of the 
testing framework, nor an exported interface for the JDK. This seems to 
have occurred through copy-and-paste when creating the tests as the 
include is not needed.

test/hotspot/jtreg/runtime/jni/FindClass/libbootLoaderTest.c
test/hotspot/jtreg/runtime/jni/registerNativesWarning/libregisterNativesWarning.c
test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
test/jdk/java/lang/ClassLoader/nativeLibrary/libnativeLibraryTest.c
test/jdk/java/lang/ProcessBuilder/checkHandles/libCheckHandles.c
test/jdk/jdk/internal/loader/NativeLibraries/libnativeLibrariesTest.c

There is one test that includes jni_util.h and uses the utility function 
declared there:
./jdk/java/lang/String/nativeEncoding/libstringPlatformChars.c
so that is left as-is.

Thanks,
David
-----

From igor.ignatyev at oracle.com  Wed Jul 22 23:09:13 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 22 Jul 2020 16:09:13 -0700
Subject: RFR (trivial) 8249940: Remove unnecessary includes of jni_util.h
 in native tests
In-Reply-To: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
References: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
Message-ID: <498178C1-A8AD-4A4A-B94F-F61DC02C2B0F@oracle.com>

Hi David,

looks good to me.

-- Igor

> On Jul 22, 2020, at 4:00 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8249940
> webrev: http://cr.openjdk.java.net/~dholmes/8249940/webrev/
> 
> A number of native tests in hotspot and jdk include the jni_util.h header file which is part of the sources for libjava and not part of the testing framework, nor an exported interface for the JDK. This seems to have occurred through copy-and-paste when creating the tests as the include is not needed.
> 
> test/hotspot/jtreg/runtime/jni/FindClass/libbootLoaderTest.c
> test/hotspot/jtreg/runtime/jni/registerNativesWarning/libregisterNativesWarning.c
> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
> test/jdk/java/lang/ClassLoader/nativeLibrary/libnativeLibraryTest.c
> test/jdk/java/lang/ProcessBuilder/checkHandles/libCheckHandles.c
> test/jdk/jdk/internal/loader/NativeLibraries/libnativeLibrariesTest.c
> 
> There is one test that includes jni_util.h and uses the utility function declared there:
> ./jdk/java/lang/String/nativeEncoding/libstringPlatformChars.c
> so that is left as-is.
> 
> Thanks,
> David
> -----


From david.holmes at oracle.com  Wed Jul 22 23:29:45 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 09:29:45 +1000
Subject: RFR (trivial) 8249940: Remove unnecessary includes of jni_util.h
 in native tests
In-Reply-To: <498178C1-A8AD-4A4A-B94F-F61DC02C2B0F@oracle.com>
References: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
 <498178C1-A8AD-4A4A-B94F-F61DC02C2B0F@oracle.com>
Message-ID: <2358514a-c8a5-3b10-b0c3-6fba4d1848dc@oracle.com>

Thanks Igor!

David

On 23/07/2020 9:09 am, Igor Ignatyev wrote:
> Hi David,
> 
> looks good to me.
> 
> -- Igor
> 
>> On Jul 22, 2020, at 4:00 PM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249940
>> webrev: http://cr.openjdk.java.net/~dholmes/8249940/webrev/
>>
>> A number of native tests in hotspot and jdk include the jni_util.h header file which is part of the sources for libjava and not part of the testing framework, nor an exported interface for the JDK. This seems to have occurred through copy-and-paste when creating the tests as the include is not needed.
>>
>> test/hotspot/jtreg/runtime/jni/FindClass/libbootLoaderTest.c
>> test/hotspot/jtreg/runtime/jni/registerNativesWarning/libregisterNativesWarning.c
>> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
>> test/jdk/java/lang/ClassLoader/nativeLibrary/libnativeLibraryTest.c
>> test/jdk/java/lang/ProcessBuilder/checkHandles/libCheckHandles.c
>> test/jdk/jdk/internal/loader/NativeLibraries/libnativeLibrariesTest.c
>>
>> There is one test that includes jni_util.h and uses the utility function declared there:
>> ./jdk/java/lang/String/nativeEncoding/libstringPlatformChars.c
>> so that is left as-is.
>>
>> Thanks,
>> David
>> -----
> 

From david.holmes at oracle.com  Thu Jul 23 03:00:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 13:00:31 +1000
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
Message-ID: <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>

Hi Harold,

On 23/07/2020 8:05 am, Harold Seigel wrote:
> Hi,
> 
> Please review this small fix to avoid running test 
> AbstractMethodErrorTest.java with Graal and remove it from the ProblemList.
> 
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html

You seem to have lost the requirement that tiered compilation be enabled.

Thanks,
David
-----

> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
> 
> The change was tested by using mach5 testing and checking that the test 
> was not run in tier*-graal tasks but was run in non-graal tasks.
> 
> Thanks, Harold
> 

From david.holmes at oracle.com  Thu Jul 23 04:07:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 14:07:37 +1000
Subject: RFR(XXS) 8249087 Symbol constructor unnecessarily initializes
 _body[0]
In-Reply-To: <10e54211-4124-b055-ac8b-d31f0fbaca30@oracle.com>
References: <ef2ebc63-bf85-6577-1619-d1025e3f4e18@oracle.com>
 <87blk9tmzq.fsf@oldenburg2.str.redhat.com>
 <9b4b10c1-4fc4-2272-5609-e3456f0bffed@oracle.com>
 <369109fa-4aba-d8f1-3ce4-afb25c7e137a@oracle.com>
 <dc7c3f13-c3b6-28de-77f4-ff10ff3e670a@oracle.com>
 <f531bd56-4ce0-6581-b28e-3f8316764536@oracle.com>
 <10e54211-4124-b055-ac8b-d31f0fbaca30@oracle.com>
Message-ID: <2dac3bc3-b614-0bce-60ea-b09fb1fb7350@oracle.com>

On 23/07/2020 3:42 am, Ioi Lam wrote:
> On 7/22/20 10:12 AM, Lois Foltan wrote:
>> On 7/22/2020 12:17 AM, David Holmes wrote:
>>> Hi Lois,
>>>
>>> On 22/07/2020 1:06 am, Lois Foltan wrote:
>>>> On 7/21/2020 2:24 AM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 7/20/20 11:12 PM, Florian Weimer wrote:
>>>>>> * Ioi Lam:
>>>>>>
>>>>>>> Hi please review this very simple fix:
>>>>>>>
>>>>>>> diff -r 19f26d72a8d0 src/hotspot/share/oops/symbol.cpp
>>>>>>> --- a/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 14:24:19 2020 -0700
>>>>>>> +++ b/src/hotspot/share/oops/symbol.cpp    Mon Jul 20 17:11:57 2020 -0700
>>>>>>> @@ -51,8 +51,11 @@
>>>>>>>  Symbol::Symbol(const u1* name, int length, int refcount) {
>>>>>>>    _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>>>>>>>    _length = length;
>>>>>>> -  _body[0] = 0;  // in case length == 0
>>>>>>>    memcpy(_body, name, length);
>>>>>>> +  // For symbols of length 0 and 1: _body[0] (and _body[1]) are uninitialized and may
>>>>>>> +  // contain random values, which will only be read by Symbol::identity_hash(),
>>>>>>> +  // which would tolerate such randomness. These values never change during the lifetime
>>>>>>> +  // of the Symbol.
>>>>>>>  }
>>>>>> Won't this still trip memory debuggers? Symbol::identity_hash() 
>>>>>> implies
>>>>>> that the result is eventually used in a conditional operation (a hash
>>>>>> comparison perhaps).  If it's possible one day to run Hotspot under
>>>>>> valgrind, this would result in false positives.
>>>>>
>>>>> Are you saying that valgrind will modify uninitialized memory 
>>>>> periodically after the constructor has returned, and thus will 
>>>>> cause Symbol::identity_hash() to return a different value?
>>>>>
>>>>> Without my patch, _body[1] is uninitialized for Symbols whose 
>>>>> length is 0 or 1. We have not heard of any issues related to 
>>>>> valgrind and Symbol::identity_hash().
>>>>>
>>>>> In fact, looking at the code history, the setting of "_body[0] = 0" 
>>>>> in Symbol::Symbol was introduced only recently (Feb 2020):
>>>>>
>>>>> http://hg.openjdk.java.net/jdk/jdk/annotate/4a4d185098e2/src/hotspot/share/oops/symbol.cpp#l55 
>>>>>
>>>>>
>>>>> I'll check with Lois who added the code to see the reason for doing 
>>>>> it.
>>>>
>>>> Hi Ioi,
>>>>
>>>> Reviewing this JBS issue, I have concerns over leaving both _body[0]
>>>> and now even _body[1] uninitialized.  The signature processing
>>>> frequently checks the first character of a Symbol via
>>>> Symbol::char_at(0) to determine what type it is dealing with.  Is
>>>> there a danger that the uninitialized memory actually has a valid
>>>> type indicator in it like an '[' character for example?  The
>>>> signature processing could potentially make wrong assumptions about
>>>> the type it is trying to process.
>>>
>>> Aren't all the signature related symbols already guaranteed to not 
>>> have zero-length, or else the length is being pre-tested for zero?
>>
>> Hi David,
>>
>> I believe you are correct in that signature related symbols are
>> already guaranteed to not have zero-length.  I tried several different
>> jasm files to introduce an empty string either for a class name or for
>> a type of a field reference and did not get past class file parsing
>> without a ClassFormatError.
>>
>> Error: LinkageError occurred while loading main class HelloEmptyString
>>         java.lang.ClassFormatError: Class name is empty or contains illegal character in descriptor in class file HelloEmptyString
>>
>>
>> However, after reading the original issue in JDK-8249087, the
>> objection wasn't that there was a needless initialization of _body[0]
>> but instead it was around the inconsistency of initializing _body[0]
>> and not _body[1].  I think it should be the responsibility of the
>> Symbol class API to ensure initialization so that as a consumer I
>> don't have to worry about _body[0] or _body[1]'s validity.  So I
>> prefer the change to actually add the initialization of _body[1].
>> Something like, "_body[0] = _body[1] = 0".
>>
> Hi Lois,
> 
> The fact that _body[0..1] is in the Symbol header is just by coincidence
> -- we need only 6 bytes of meta-info about the Symbol, so we have 2
> bytes left over. We use these two left-over bytes for
> Symbol::identity_hash(). However, no one else should unconditionally
> read these 2 bytes -- if we change the Symbol header in the future,
> these two bytes may not be allocated anymore.
> 
> It looks like leaving _body[0..1] uninitialized is confusing, and will 
> possibly lead to problems with valgrind as pointed out by Florian. How 
> about this:
> 
> Symbol::Symbol(const u1* name, int length, int refcount) {
>   _hash_and_refcount = pack_hash_and_refcount((short)os::random(), refcount);
>   _length = length;
>   // _body[0..1] are allocated in the header just by coincidence in the current
>   // implementation of Symbol. They are read by identity_hash(), so make sure they
>   // are initialized.
>   // No other code should assume that _body[0..1] are always allocated. E.g., do
>   // not unconditionally read base()[0] as that will be invalid for an empty Symbol.
>   _body[0] = _body[1] = 0;
>   memcpy(_body, name, length);
> }
> 
> I'll also change the bug title to "Always initialize _body[0..1] in 
> Symbol constructor"

That works for me.

> ----
> 
> As I discussed with Lois off-line:
> 
> There's Signature code that unconditionally reads _body[0], which would 
> assert (but class loading checks for invalid signatures that prevents 
> this from happening)

Right - and that's what I was referring to before. The signature symbol 
validity has already been established and length must be > 0. I would 
not want to see unnecessary length checks inserted there. At most an 
assertion.

Thanks,
David
-----

> BasicType Signature::basic_type(const Symbol* signature) {
>   return basic_type(signature->char_at(0));
> }
> 
> char Symbol::char_at(int index) const {
>   assert(index >=0 && index < length(), "symbol index overflow");
>   return (char)base()[index];
> }
> 
> Signature::basic_type() should be fixed to either check for length, 
> and/or assert that signature is a valid signature.
> 
> Thanks
> - Ioi
> 
> 
> 
> 

From david.holmes at oracle.com  Thu Jul 23 04:03:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 14:03:27 +1000
Subject: RFR(XS): 8249630: unused is_static_archive parameter in
 SystemDictionaryShared::write_dictionary
In-Reply-To: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
References: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
Message-ID: <14bcb264-61fd-1c46-7366-29bfa5ec19f3@oracle.com>

Looks good and trivial.

Thanks,
David

On 23/07/2020 7:32 am, Calvin Cheung wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249630
> 
> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8249630/webrev.00/
> 
> Please review this small cleanup.
> 
> Passed tier1 and tier2 tests.
> 
> thanks,
> 
> Calvin
> 

From mandy.chung at oracle.com  Thu Jul 23 04:22:17 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 22 Jul 2020 21:22:17 -0700
Subject: RFR (trivial) 8249940: Remove unnecessary includes of jni_util.h
 in native tests
In-Reply-To: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
References: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
Message-ID: <6ed5dca4-2258-a3bc-4e71-8541c43f885f@oracle.com>

Hi David,

Looks good.

Mandy

On 7/22/20 4:00 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8249940
> webrev: http://cr.openjdk.java.net/~dholmes/8249940/webrev/
>
> A number of native tests in hotspot and jdk include the jni_util.h 
> header file which is part of the sources for libjava and not part of 
> the testing framework, nor an exported interface for the JDK. This 
> seems to have occurred through copy-and-paste when creating the tests 
> as the include is not needed.
>
> test/hotspot/jtreg/runtime/jni/FindClass/libbootLoaderTest.c
> test/hotspot/jtreg/runtime/jni/registerNativesWarning/libregisterNativesWarning.c
> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
> test/jdk/java/lang/ClassLoader/nativeLibrary/libnativeLibraryTest.c
> test/jdk/java/lang/ProcessBuilder/checkHandles/libCheckHandles.c
> test/jdk/jdk/internal/loader/NativeLibraries/libnativeLibrariesTest.c
>
> There is one test that includes jni_util.h and uses the utility 
> function declared there:
> ./jdk/java/lang/String/nativeEncoding/libstringPlatformChars.c
> so that is left as-is.
>
> Thanks,
> David
> -----


From david.holmes at oracle.com  Thu Jul 23 04:30:50 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 14:30:50 +1000
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
Message-ID: <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>

Hi Yumin,

Given we have the earlier test:

  112         // ======= archive with compressed oops, run w/o

it would seem better if we had:

  112         // Explicitly archive with compressed oops, run without.

and:

  127         // Implicitly archive with compressed oops, run without.
  128         // Max heap size for compressed oops is around 31G.
  129         // UseCompressedOops is turned on by default when the heap
  130         // size is under 31G, but will be turned off when the heap
  131         // size is greater than that.

And should we also have the opposite test:

// Explicitly archive without compressed oops and run with.
// Implicitly archive without compressed oops and run with.

Thanks,
David
-----

On 23/07/2020 8:55 am, Yumin Qi wrote:
> Hi Ioi,
> 
>    I have updated the wording as you suggested, and also made it more 
> precise: the max heap size for compressed oops is around 31G, which is 
> calculated by max_heap_for_compressed_oops().
> 
>    Updated in the same webrev.
> 
> $J6/bin/java -Xshare:on 
> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
> java version "16-internal" 2021-03-16
> Java(TM) SE Runtime Environment (slowdebug build 
> 16-internal+0-adhoc.minqi.open)
> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
> 
> $J6/bin/java -Xshare:on 
> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
> An error has occurred while processing the shared archive file.
> Unable to use shared archive.
> The saved state of UseCompressedOops and UseCompressedClassPointers is 
> different from runtime, CDS will be disabled.
> Error occurred during initialization of VM
> Unable to use shared archive.
> 
> 
> Thanks
> 
> Yumin
> 
> On 7/22/20 2:06 PM, Ioi Lam wrote:
>> Hi Yumin,
>>
>> Just small nits on the comments:
>>
>> // UseCompressedOops default is turned on when heap is under 32G but 
>> will be
>>
>> -> UseCompressedOops is turned on by default ....
>>
>> // turned off when heap is greater than 32G. This leads inconsistency
>>
>> -> This leads to inconsistency ...
>>
>> // of UseCompressedOops at dump time and runtime.
>>
>>
>> Thanks
>> - Ioi
>>
>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>> Hi, Please review this tiny change on comment:
>>>
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>
>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>
>>>
>>> Note that 8081416 is already marked as fixed (thanks Ioi); please read 
>>> the comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>
>>> Since CDS archiving can be done with UseCompressedOops disabled, the 
>>> test already has the correct result.
>>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>>
>>

From david.holmes at oracle.com  Thu Jul 23 04:42:55 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 14:42:55 +1000
Subject: RFR (trivial) 8249940: Remove unnecessary includes of jni_util.h
 in native tests
In-Reply-To: <6ed5dca4-2258-a3bc-4e71-8541c43f885f@oracle.com>
References: <8eb62661-dc95-bba1-ac2f-092d637f57c5@oracle.com>
 <6ed5dca4-2258-a3bc-4e71-8541c43f885f@oracle.com>
Message-ID: <0048a4d0-9a94-3e94-734e-fc77945504da@oracle.com>

Thanks Mandy!

David

On 23/07/2020 2:22 pm, Mandy Chung wrote:
> Hi David,
> 
> Looks good.
> 
> Mandy
> 
> On 7/22/20 4:00 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249940
>> webrev: http://cr.openjdk.java.net/~dholmes/8249940/webrev/
>>
>> A number of native tests in hotspot and jdk include the jni_util.h 
>> header file which is part of the sources for libjava and not part of 
>> the testing framework, nor an exported interface for the JDK. This 
>> seems to have occurred through copy-and-paste when creating the tests 
>> as the include is not needed.
>>
>> test/hotspot/jtreg/runtime/jni/FindClass/libbootLoaderTest.c
>> test/hotspot/jtreg/runtime/jni/registerNativesWarning/libregisterNativesWarning.c 
>>
>> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
>> test/jdk/java/lang/ClassLoader/nativeLibrary/libnativeLibraryTest.c
>> test/jdk/java/lang/ProcessBuilder/checkHandles/libCheckHandles.c
>> test/jdk/jdk/internal/loader/NativeLibraries/libnativeLibrariesTest.c
>>
>> There is one test that includes jni_util.h and uses the utility 
>> function declared there:
>> ./jdk/java/lang/String/nativeEncoding/libstringPlatformChars.c
>> so that is left as-is.
>>
>> Thanks,
>> David
>> -----
> 

From david.holmes at oracle.com  Thu Jul 23 08:52:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 18:52:31 +1000
Subject: RFR (S): 8194309: JNI handle allocation failure not reported correctly
Message-ID: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>

webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
bug: https://bugs.openjdk.java.net/browse/JDK-8194309

The JNI specification states that NewLocalRef and NewGlobalRef return 
NULL on out-of-memory, whilst NewWeakGlobalRef throws OutOfMemoryError. 
But the hotspot implementation will abort on out-of-memory in all cases.

The oop storage code for global and weak-global handles already supports 
taking an AllocFailStrategy, so we simply pass the RETURN_NULL strategy 
through - and in the weak case throw a newly defined OutOfMemoryError 
for "C heap space" (as a counterpart to "Java heap space").

For NewLocalRef we pass the strategy through to 
JNIHandleBlock::allocate_block, where we explicitly use "new 
(std::nothrow)" to get a NULL on out-of-memory, so that we can pass it back.
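
As a rough illustration only (this is not the code in the webrev), the 
pass-through of the failure strategy can be sketched as below. The names 
AllocFail, HandleBlock, g_inject_failure and this allocate_block are 
hypothetical stand-ins, and the injection flag merely imitates the kind 
of fault-injection hook mentioned under Testing.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for an allocation-failure strategy enum.
enum class AllocFail { EXIT_OOM, RETURN_NULL };

// Hypothetical stand-in for a block of handle slots; not the real JNIHandleBlock.
struct HandleBlock {
  void* slots[32];
};

// Fault-injection switch for this sketch: when true, allocation "fails".
static bool g_inject_failure = false;

// Allocate a block. With RETURN_NULL the caller sees nullptr on failure;
// with EXIT_OOM the process aborts, mimicking the pre-fix behaviour.
HandleBlock* allocate_block(AllocFail mode) {
  HandleBlock* block =
      g_inject_failure ? nullptr : new (std::nothrow) HandleBlock();
  if (block == nullptr && mode == AllocFail::EXIT_OOM) {
    std::abort();
  }
  return block;
}
```

A caller that passes AllocFail::RETURN_NULL must check the result before 
using it, which is the same obligation the NULL checks added in jni.cpp 
take on.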

There are three internal calls to NewGlobalRef in jni.cpp that needed a 
NULL check added to support the new behaviour.

In reality we know that if any of these things actually return NULL 
because C-heap is exhausted then chances are we are going to abort soon 
in any case. But to be spec compliant we make the changes.

Note that I deliberately do not change any of the internal 
JNIHandles::make_local calls (contained in the majority of JNI methods) 
to get NULL on out-of-memory. This is because none of those APIs are 
specified in a way that even considers what should happen if an internal 
request to create a local-ref fails - so we can neither return NULL nor 
throw an exception in general. All this fix addresses are the three 
specific JNI entry points themselves.

Also note there is no attempt with this changeset to add NULL checks to 
all the JNI code in the other JDK libraries that use these APIs. 
Interestingly quite a number already include the NULL checks.

Testing:

There is no practical way to test this for real so I had to use 
fault-injection. A version of the webrev with the fault-injection hooks 
and a test case is presented here (for the record):

http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/

The test is constructed so that only the JNI calls in the test should 
possibly encounter the NULL returns.

Otherwise only sanity testing of tiers 1-3.

Thanks,
David

From tobias.hartmann at oracle.com  Thu Jul 23 09:13:41 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 23 Jul 2020 11:13:41 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <1595401959932.33284@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>
Message-ID: <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>


On 22.07.20 09:12, Liu, Xin wrote:
> 1. I moved the validation logic for compiler directives to compilerOracle::scan_flag_and_value.
> If something goes wrong in the parser, the patch will "gracefully" quit the JVM using jvm_exit(1). Is that okay?

With "piggy-back on the error mechanism" I meant that you should use the existing bailout mechanism
in the parser. In this case, couldn't you simply put the error message in 'errorbuf' and let the
caller take care of handling it?
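
To make the suggestion concrete, here is a minimal sketch of the pattern, 
not the actual compilerOracle code. The function name scan_intrinsic_id, 
its signature, the message text and the two listed ids are assumptions 
made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical validator: on bad input it fills errorbuf and returns false,
// leaving it to the caller to decide how to report the error or bail out.
bool scan_intrinsic_id(const char* name, char* errorbuf, size_t buf_size) {
  static const char* const known[] = { "_dabs", "_dsqrt" };  // illustrative ids only
  for (const char* k : known) {
    if (std::strcmp(name, k) == 0) {
      return true;
    }
  }
  std::snprintf(errorbuf, buf_size, "Unrecognized intrinsic id: %s", name);
  return false;
}
```

The validator never exits the process itself; the caller inspects the 
return value and prints or propagates errorbuf as it already does for 
other parse errors.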

Best regards,
Tobias


From kim.barrett at oracle.com  Thu Jul 23 12:27:06 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 23 Jul 2020 08:27:06 -0400
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
Message-ID: <D01BB27F-644B-4227-8A83-057330DF2B51@oracle.com>

> On Jul 23, 2020, at 4:52 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
> 
> The JNI specification states that NewLocalRef and NewGlobalRef return NULL on out-of-memory, whilst NewWeakGlobalRef throws OutOfMemoryError. But the hotspot implementation will abort on out-of-memory in all cases.
> 
> The oop storage code for global and weak-global handles already supports taking an AllocFailStrategy, so we simply pass the RETURN_NULL strategy through - and in the weak case throw a newly defined OutOfMemoryError for "C heap space" (as a corollary for "Java heap space").
> 
> For NewLocalRef we pass the strategy through to JNIHandleBlock::allocate_block, where we explicitly use "new (std::nothrow)" to get a NULL on out-of-memory, so that we can pass it back.
> 
> There are three internal calls to NewGlobalRef in jni.cpp that needed a NULL check added to support the new behaviour.
> 
> In reality we know that if any of these things actually return NULL because C-heap is exhausted then chances are we are going to abort soon in any case. But to be spec compliant we make the changes.
> 
> Note that I deliberately do not change any of the internal JNIHandles::make_local calls (contained in the majority of JNI methods) to get NULL on out-of-memory. This is because none of those APIs are specified in a way that even considers what should happen if an internal request to create a local-ref fails - so we can neither return NULL nor throw an exception in general. All this fix addresses are the three specific JNI entry points themselves.
> 
> Also note there is no attempt with this changeset to add NULL checks to all the JNI code in the other JDK libraries that uses these API's. Interestingly quite a number already include the NULL checks.
> 
> Testing:
> 
> There is no practical way to test this for real so I had to use fault-injection. A version of the webrev with the fault-injection hooks and a test case, is presented here (for the record):
> 
> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
> 
> The test is constructed so that only the JNI calls in the test should possibly encounter the NULL returns.
> 
> Otherwise only sanity testing of tiers 1-3.
> 
> Thanks,
> David

For consistency, should this:
  57 jobject JNIHandles::make_local(oop obj) {
have an optional AllocFailType argument?

Other than that, this looks good.


From coleen.phillimore at oracle.com  Thu Jul 23 12:46:04 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 23 Jul 2020 08:46:04 -0400
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
Message-ID: <c365c92a-fdb7-d41f-bade-94c680778e53@oracle.com>


http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/src/hotspot/share/runtime/jniHandles.cpp.udiff.html

These functions shouldn't use UseNewCode which is false by default. 
UseNewCode is for testing experimental features locally, not for checked 
in code.  You should use logging for this.

Coleen

On 7/23/20 4:52 AM, David Holmes wrote:
> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
>
> The JNI specification states that NewLocalRef and NewGlobalRef return 
> NULL on out-of-memory, whilst NewWeakGlobalRef throws 
> OutOfMemoryError. But the hotspot implementation will abort on 
> out-of-memory in all cases.
>
> The oop storage code for global and weak-global handles already 
> supports taking an AllocFailStrategy, so we simply pass the 
> RETURN_NULL strategy through - and in the weak case throw a newly 
> defined OutOfMemoryError for "C heap space" (as a corollary for "Java 
> heap space").
>
> For NewLocalRef we pass the strategy through to 
> JNIHandleBlock::allocate_block, where we explicitly use "new 
> (std::nothrow)" to get a NULL on out-of-memory, so that we can pass it 
> back.
>
> There are three internal calls to NewGlobalRef in jni.cpp that needed 
> a NULL check added to support the new behaviour.
>
> In reality we know that if any of these things actually return NULL 
> because C-heap is exhausted then chances are we are going to abort 
> soon in any case. But to be spec compliant we make the changes.
>
> Note that I deliberately do not change any of the internal 
> JNIHandles::make_local calls (contained in the majority of JNI methods) 
> to get NULL on out-of-memory. This is because none of those APIs are 
> specified in a way that even considers what should happen if an 
> internal request to create a local-ref fails - so we can neither 
> return NULL nor throw an exception in general. All this fix addresses 
> are the three specific JNI entry points themselves.
>
> Also note there is no attempt with this changeset to add NULL checks 
> to all the JNI code in the other JDK libraries that uses these API's. 
> Interestingly quite a number already include the NULL checks.
>
> Testing:
>
> There is no practical way to test this for real so I had to use 
> fault-injection. A version of the webrev with the fault-injection 
> hooks and a test case, is presented here (for the record):
>
> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
>
> The test is constructed so that only the JNI calls in the test should 
> possibly encounter the NULL returns.
>
> Otherwise only sanity testing of tiers 1-3.
>
> Thanks,
> David


From harold.seigel at oracle.com  Thu Jul 23 13:22:17 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 23 Jul 2020 09:22:17 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
Message-ID: <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>

Hi David,

Thanks for looking at this.

The existing @requires for test AbstractMethodErrorTest.java contained 
this clause:

    (!vm.graal.enabled | vm.opt.TieredCompilation == true)

This clause evaluated to TRUE if either Graal was disabled or 
vm.opt.TieredCompilation was true.  Since now Graal is always disabled, 
this clause would always be TRUE, regardless of the value of 
vm.opt.TieredCompilation.  There is no requirement that tiered 
compilation be enabled for this test.
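
The reasoning can be checked with a small truth-table sketch. The 
function old_clause and its two boolean parameters are stand-ins I am 
introducing here for the two properties; this is not how jtreg evaluates 
@requires.

```cpp
#include <cassert>

// Models the quoted clause
//   (!vm.graal.enabled | vm.opt.TieredCompilation == true)
// with plain booleans standing in for the two jtreg properties.
constexpr bool old_clause(bool graal_enabled, bool tiered) {
  return !graal_enabled || tiered;
}

// With Graal disabled the clause holds for either tiered setting,
// so it no longer constrains the test at all.
static_assert(old_clause(false, false), "Graal off, tiered off");
static_assert(old_clause(false, true), "Graal off, tiered on");
```

Only the combination of Graal enabled with tiered compilation off makes 
the clause false, and that combination can no longer occur once Graal is 
excluded outright.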

Thanks, Harold

On 7/22/2020 11:00 PM, David Holmes wrote:
> Hi Harold,
>
> On 23/07/2020 8:05 am, Harold Seigel wrote:
>> Hi,
>>
>> Please review this small fix to avoid running test 
>> AbstractMethodErrorTest.java with Graal and remove it from the 
>> ProblemList.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>
> You seem to have lost the requirement that tiered compilation be 
> enabled. ??
>
> Thanks,
> David
> -----
>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>
>> The change was tested by using mach5 testing and checking that the 
>> test was not run in tier*-graal tasks but was run in non-graal tasks.
>>
>> Thanks, Harold
>>

From david.holmes at oracle.com  Thu Jul 23 13:38:32 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 23:38:32 +1000
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <D01BB27F-644B-4227-8A83-057330DF2B51@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
 <D01BB27F-644B-4227-8A83-057330DF2B51@oracle.com>
Message-ID: <6a887ff2-708a-a5f3-aafa-526aced16726@oracle.com>

Hi Kim,

Thanks for taking a look at this.

On 23/07/2020 10:27 pm, Kim Barrett wrote:
>> On Jul 23, 2020, at 4:52 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
>>
>> The JNI specification states that NewLocalRef and NewGlobalRef return NULL on out-of-memory, whilst NewWeakGlobalRef throws OutOfMemoryError. But the hotspot implementation will abort on out-of-memory in all cases.
>>
>> The oop storage code for global and weak-global handles already supports taking an AllocFailStrategy, so we simply pass the RETURN_NULL strategy through - and in the weak case throw a newly defined OutOfMemoryError for "C heap space" (as a corollary for "Java heap space").
>>
>> For NewLocalRef we pass the strategy through to JNIHandleBlock::allocate_block, where we explicitly use "new (std::nothrow)" to get a NULL on out-of-memory, so that we can pass it back.
>>
>> There are three internal calls to NewGlobalRef in jni.cpp that needed a NULL check added to support the new behaviour.
>>
>> In reality we know that if any of these things actually return NULL because C-heap is exhausted then chances are we are going to abort soon in any case. But to be spec compliant we make the changes.
>>
>> Note that I deliberately do not change any of the internal JNIHandles::make_local calls (contained in the majority of JNI methods) to get NULL on out-of-memory. This is because none of those APIs are specified in a way that even considers what should happen if an internal request to create a local-ref fails - so we can neither return NULL nor throw an exception in general. All this fix addresses are the three specific JNI entry points themselves.
>>
>> Also note there is no attempt with this changeset to add NULL checks to all the JNI code in the other JDK libraries that uses these API's. Interestingly quite a number already include the NULL checks.
>>
>> Testing:
>>
>> There is no practical way to test this for real so I had to use fault-injection. A version of the webrev with the fault-injection hooks and a test case, is presented here (for the record):
>>
>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
>>
>> The test is constructed so that only the JNI calls in the test should possibly encounter the NULL returns.
>>
>> Otherwise only sanity testing of tiers 1-3.
>>
>> Thanks,
>> David
> 
> For consistency, should this:
>    57 jobject JNIHandles::make_local(oop obj) {
> have an optional AllocFailType argument?
> 
> Other than that, this looks good.

Given that is only used internally, and we don't ever want/need the 
return-NULL behaviour internally, and we are looking at possibly 
eradicating that overload in any case ... I saw no reason to expand that 
API.

Thanks,
David

> 

From david.holmes at oracle.com  Thu Jul 23 13:39:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 23:39:44 +1000
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <c365c92a-fdb7-d41f-bade-94c680778e53@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
 <c365c92a-fdb7-d41f-bade-94c680778e53@oracle.com>
Message-ID: <a0074c61-98ba-d9f6-b402-ec98ecd2969c@oracle.com>

On 23/07/2020 10:46 pm, coleen.phillimore at oracle.com wrote:
> 
> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
> 
> 
> These functions shouldn't use UseNewCode which is false by default. 
> UseNewCode is for testing experimental features locally, not for checked 
> in code.  You should use logging for this.

That code is not being checked in. It was purely for my testing purposes.

David
-----

> Coleen
> 
> On 7/23/20 4:52 AM, David Holmes wrote:
>> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
>> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
>>
>> The JNI specification states that NewLocalRef and NewGlobalRef return 
>> NULL on out-of-memory, whilst NewWeakGlobalRef throws 
>> OutOfMemoryError. But the hotspot implementation will abort on 
>> out-of-memory in all cases.
>>
>> The oop storage code for global and weak-global handles already 
>> supports taking an AllocFailStrategy, so we simply pass the 
>> RETURN_NULL strategy through - and in the weak case throw a newly 
>> defined OutOfMemoryError for "C heap space" (as a corollary for "Java 
>> heap space").
>>
>> For NewLocalRef we pass the strategy through to 
>> JNIHandleBlock::allocate_block, where we explicitly use "new 
>> (std::nothrow)" to get a NULL on out-of-memory, so that we can pass it 
>> back.
>>
>> There are three internal calls to NewGlobalRef in jni.cpp that needed 
>> a NULL check added to support the new behaviour.
>>
>> In reality we know that if any of these things actually return NULL 
>> because C-heap is exhausted then chances are we are going to abort 
>> soon in any case. But to be spec compliant we make the changes.
>>
>> Note that I deliberately do not change any of the internal 
>> JNIHandles::make_local calls (contained in the majority of JNI methods) 
>> to get NULL on out-of-memory. This is because none of those APIs are 
>> specified in a way that even considers what should happen if an 
>> internal request to create a local-ref fails - so we can neither 
>> return NULL nor throw an exception in general. All this fix addresses 
>> are the three specific JNI entry points themselves.
>>
>> Also note there is no attempt with this changeset to add NULL checks 
>> to all the JNI code in the other JDK libraries that uses these API's. 
>> Interestingly quite a number already include the NULL checks.
>>
>> Testing:
>>
>> There is no practical way to test this for real so I had to use 
>> fault-injection. A version of the webrev with the fault-injection 
>> hooks and a test case, is presented here (for the record):
>>
>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
>>
>> The test is constructed so that only the JNI calls in the test should 
>> possibly encounter the NULL returns.
>>
>> Otherwise only sanity testing of tiers 1-3.
>>
>> Thanks,
>> David
> 

From david.holmes at oracle.com  Thu Jul 23 13:51:51 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 23:51:51 +1000
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
Message-ID: <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>

Hi Harold,

On 23/07/2020 11:22 pm, Harold Seigel wrote:
> Hi David,
> 
> Thanks for looking at this.
> 
> The existing @requires for test AbstractMethodErrorTest.java contained 
> this clause:
> 
>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
> 
> This clause evaluated to TRUE if either Graal was disabled or 
> vm.opt.TieredCompilation was true.

Okay so this claimed the test was okay with Graal as long as tiered was 
enabled but ...

> Since now Graal is always disabled, this clause would always be TRUE,

... we decided no Graal under any conditions ... okay ...

> regardless of the value of vm.opt.TieredCompilation.  There is no
> requirement that tiered compilation be enabled for this test.

... but if tiered is not enabled then what is the significance of 
"vm.opt.TieredStopAtLevel==4" ?

Sorry but this is one of the most complex and obscure @requires 
conditions that I've seen. And I don't see how it achieves the goal of 
running under the interpreter and compiler (per the synopsis)?

Thanks,
David

> Thanks, Harold
> 
> On 7/22/2020 11:00 PM, David Holmes wrote:
>> Hi Harold,
>>
>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this small fix to avoid running test 
>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>> ProblemList.
>>>
>>> Open Webrev: 
>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>
>> You seem to have lost the requirement that tiered compilation be 
>> enabled. ??
>>
>> Thanks,
>> David
>> -----
>>
>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>
>>> The change was tested by using mach5 testing and checking that the 
>>> test was not run in tier*-graal tasks but was run in non-graal tasks.
>>>
>>> Thanks, Harold
>>>

From david.holmes at oracle.com  Thu Jul 23 13:58:09 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 23 Jul 2020 23:58:09 +1000
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <44d84c2d-6126-7cf4-766f-315f955b6269@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
 <c365c92a-fdb7-d41f-bade-94c680778e53@oracle.com>
 <a0074c61-98ba-d9f6-b402-ec98ecd2969c@oracle.com>
 <44d84c2d-6126-7cf4-766f-315f955b6269@oracle.com>
Message-ID: <39bdebc3-ec0e-1ccc-97eb-3e3a767b3d80@oracle.com>

On 23/07/2020 11:54 pm, coleen.phillimore at oracle.com wrote:
> On 7/23/20 9:39 AM, David Holmes wrote:
>> On 23/07/2020 10:46 pm, coleen.phillimore at oracle.com wrote:
>>>
>>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
>>>
>>>
>>> These functions shouldn't use UseNewCode which is false by default. 
>>> UseNewCode is for testing experimental features locally, not for 
>>> checked in code.  You should use logging for this.
>>
>> That code is not being checked in. It was purely for my testing purposes.
> 
> I really need to start reading to the bottom of the mail, I clicked 
> first.  The real version looks good to me.

Thanks for the review Coleen!

David

> Coleen
> 
>>
>> David
>> -----
>>
>>> Coleen
>>>
>>> On 7/23/20 4:52 AM, David Holmes wrote:
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
>>>>
>>>> The JNI specification states that NewLocalRef and NewGlobalRef return 
>>>> NULL on out-of-memory, whilst NewWeakGlobalRef throws 
>>>> OutOfMemoryError. But the hotspot implementation will abort on 
>>>> out-of-memory in all cases.
>>>>
>>>> The oop storage code for global and weak-global handles already 
>>>> supports taking an AllocFailStrategy, so we simply pass the 
>>>> RETURN_NULL strategy through - and in the weak case throw a newly 
>>>> defined OutOfMemoryError for "C heap space" (as a corollary for 
>>>> "Java heap space").
>>>>
>>>> For NewLocalRef we pass the strategy through to 
>>>> JNIHandleBlock::allocate_block, where we explicitly use "new 
>>>> (std::nothrow)" to get a NULL on out-of-memory, so that we can pass 
>>>> it back.
>>>>
>>>> There are three internal calls to NewGlobalRef in jni.cpp that 
>>>> needed a NULL check added to support the new behaviour.
>>>>
>>>> In reality we know that if any of these things actually return NULL 
>>>> because C-heap is exhausted then chances are we are going to abort 
>>>> soon in any case. But to be spec compliant we make the changes.
>>>>
>>>> Note that I deliberately do not change any of the internal 
>>>> JNIHandles::make_local calls (contained in the majority of JNI 
>>>> methods) to get NULL on out-of-memory. This is because none of those 
>>>> APIs are specified in a way that even considers what should happen 
>>>> if an internal request to create a local-ref fails - so we can 
>>>> neither return NULL nor throw an exception in general. All this fix 
>>>> addresses are the three specific JNI entry points themselves.
>>>>
>>>> Also note there is no attempt with this changeset to add NULL checks 
>>>> to all the JNI code in the other JDK libraries that uses these 
>>>> API's. Interestingly quite a number already include the NULL checks.
>>>>
>>>> Testing:
>>>>
>>>> There is no practical way to test this for real so I had to use 
>>>> fault-injection. A version of the webrev with the fault-injection 
>>>> hooks and a test case, is presented here (for the record):
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
>>>>
>>>> The test is constructed so that only the JNI calls in the test 
>>>> should possibly encounter the NULL returns.
>>>>
>>>> Otherwise only sanity testing of tiers 1-3.
>>>>
>>>> Thanks,
>>>> David
>>>
> 

From coleen.phillimore at oracle.com  Thu Jul 23 13:54:19 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 23 Jul 2020 09:54:19 -0400
Subject: RFR (S): 8194309: JNI handle allocation failure not reported
 correctly
In-Reply-To: <a0074c61-98ba-d9f6-b402-ec98ecd2969c@oracle.com>
References: <ec1cfc48-fdb1-2268-f4a3-029845dc65d4@oracle.com>
 <c365c92a-fdb7-d41f-bade-94c680778e53@oracle.com>
 <a0074c61-98ba-d9f6-b402-ec98ecd2969c@oracle.com>
Message-ID: <44d84c2d-6126-7cf4-766f-315f955b6269@oracle.com>


On 7/23/20 9:39 AM, David Holmes wrote:
> On 23/07/2020 10:46 pm, coleen.phillimore at oracle.com wrote:
>>
>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
>>
>>
>> These functions shouldn't use UseNewCode which is false by default. 
>> UseNewCode is for testing experimental features locally, not for 
>> checked in code.  You should use logging for this.
>
> That code is not being checked in. It was purely for my testing purposes.

I really need to start reading to the bottom of the mail, I clicked 
first.  The real version looks good to me.
Coleen

>
> David
> -----
>
>> Coleen
>>
>> On 7/23/20 4:52 AM, David Holmes wrote:
>>> webrev: http://cr.openjdk.java.net/~dholmes/8194309/webrev/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8194309
>>>
>>> The JNI specification states that NewLocalRef and NewGlobalRef return 
>>> NULL on out-of-memory, whilst NewWeakGlobalRef throws 
>>> OutOfMemoryError. But the hotspot implementation will abort on 
>>> out-of-memory in all cases.
>>>
>>> The oop storage code for global and weak-global handles already 
>>> supports taking an AllocFailStrategy, so we simply pass the 
>>> RETURN_NULL strategy through - and in the weak case throw a newly 
>>> defined OutOfMemoryError for "C heap space" (as a corollary for 
>>> "Java heap space").
>>>
>>> For NewLocalRef we pass the strategy through to 
>>> JNIHandleBlock::allocate_block, where we explicitly use "new 
>>> (std::nothrow)" to get a NULL on out-of-memory, so that we can pass 
>>> it back.
>>>
>>> There are three internal calls to NewGlobalRef in jni.cpp that 
>>> needed a NULL check added to support the new behaviour.
>>>
>>> In reality we know that if any of these things actually return NULL 
>>> because C-heap is exhausted then chances are we are going to abort 
>>> soon in any case. But to be spec compliant we make the changes.
>>>
>>> Note that I deliberately do not change any of the internal 
>>> JNIHandles::make_local calls (contained in the majority of JNI 
>>> methods) to get NULL on out-of-memory. This is because none of those 
>>> APIs are specified in a way that even considers what should happen 
>>> if an internal request to create a local-ref fails - so we can 
>>> neither return NULL nor throw an exception in general. All this fix 
>>> addresses are the three specific JNI entry points themselves.
>>>
>>> Also note there is no attempt with this changeset to add NULL checks 
>>> to all the JNI code in the other JDK libraries that uses these 
>>> API's. Interestingly quite a number already include the NULL checks.
>>>
>>> Testing:
>>>
>>> There is no practical way to test this for real so I had to use 
>>> fault-injection. A version of the webrev with the fault-injection 
>>> hooks and a test case, is presented here (for the record):
>>>
>>> http://cr.openjdk.java.net/~dholmes/8194309/webrev.with-test-hooks/
>>>
>>> The test is constructed so that only the JNI calls in the test 
>>> should possibly encounter the NULL returns.
>>>
>>> Otherwise only sanity testing of tiers 1-3.
>>>
>>> Thanks,
>>> David
>>


From goetz.lindenmaier at sap.com  Thu Jul 23 14:19:57 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 23 Jul 2020 14:19:57 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB296490252335D6D6D638277AEC760@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard, 

Thanks for your two further explanations in the other thread. 
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agree you have to use the terms of the escape
> > analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
> 
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
> 
> There is also ConnectionGraph::not_global_escape()
That talks about a single node that represents a single 
Object. An object has a single state wrt. ea.
You use the term for safepoint which tracks a set of objects.
Here, has_not_global_escape can mean
  1. None of the several objects escapes globally.
  2. There is at least one object that escapes globally.

> > 'non' obviously negates the adjective 'global';
> > non-global, or even nonglobal, is an English term I find on the
> > net.
> > So what about "has_non_global_escape_in_scope?"
> 
> And what about has_ea_local_in_scope?
That's good. Please document somewhere that 
Ea_local == ArgEscape | NoEscape.
That's what it is, right?
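
A minimal sketch of the definition asked about above, written in Java purely for illustration. The real escape states are C++ values inside C2; the names EaLocalSketch, isEaLocal and hasEaLocalInScope are hypothetical, and the sketch assumes the "at least one object in the scope" reading of the predicate.

```java
import java.util.List;

final class EaLocalSketch {
    // Illustrative stand-in for the escape states used by C2's escape analysis.
    enum EscapeState { NO_ESCAPE, ARG_ESCAPE, GLOBAL_ESCAPE }

    // "ea_local" as discussed in this thread: ArgEscape or NoEscape.
    static boolean isEaLocal(EscapeState state) {
        return state == EscapeState.NO_ESCAPE || state == EscapeState.ARG_ESCAPE;
    }

    // Assumed reading: true if at least one object tracked in the scope is ea-local.
    static boolean hasEaLocalInScope(List<EscapeState> objectsInScope) {
        return objectsInScope.stream().anyMatch(EaLocalSketch::isEaLocal);
    }

    public static void main(String[] args) {
        System.out.println(hasEaLocalInScope(
            List.of(EscapeState.GLOBAL_ESCAPE, EscapeState.ARG_ESCAPE)));
        System.out.println(hasEaLocalInScope(
            List.of(EscapeState.GLOBAL_ESCAPE)));
    }
}
```

Naming the predicate after the positive property (ea-local) avoids the ambiguity of a negated name when it is applied to a set of objects.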

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
> 
> I don't know and didn't find anything in a quick search.
Ok, not your business.

> 
> > jvmtiEnvBase.cpp  ok
> > jvmtiImpl.h|cpp  ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
> 
> > deoptimization.cpp
> 
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
> 
> > 1651: You use 'non'-terms, too: non-escaping :)
> 
> I know :) At least here it is wrong I'd say. "...has to be a not escaping obj..."
> sounds better
> (hopefully not only to my german ears).
I thought the term non-escaping makes it quite clear.
I just wanted to point out that using non above would
be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
> 
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> >     Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
> 
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> stale data very timely.
Ok, anyways, it would only be a hanging test.
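
If a hang there ever became a concern, the wait could be bounded. The following is only a sketch, not the code from the webrev: the class name, the field name testMethodResult and the helper waitForResult are hypothetical, and the 50 ms sleep is taken from the loop quoted above.

```java
import java.util.concurrent.TimeUnit;

final class VolatilePollSketch {
    // Written by the target thread, read by the waiting thread.
    // volatile guarantees the waiting thread eventually sees the write.
    static volatile int testMethodResult = 0;

    // Polls the field until it is non-zero or the timeout elapses.
    // Returns true if a result was seen, false on timeout or interrupt.
    static boolean waitForResult(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (testMethodResult == 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of hanging until the harness timeout
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Thread target = new Thread(() -> testMethodResult = 42);
        target.start();
        System.out.println(waitForResult(5_000) ? "result seen" : "timed out");
    }
}
```

With a deadline, a missing write shows up as an explicit failure with a message rather than as a harness timeout.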
> 
> Ok. I've removed quite a lot of the occurrences.
> 
> > Also, I like full sentences in comments.
> > Especially for me as foreign speaker, this makes
> > things much more clear. I.e., I try to make it
> > a real sentence with articles, capitalized and a
> > dot at the end if there is a subject and a verb
> > in first place.
> > E.g., jvmtiEnvBase.cpp:1327
> 
> Are you referring to the following?
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hots
> pot/share/prims/jvmtiEnvBase.cpp.frames.html)
> 
> 1326
> 1327   // If the frame is a compiled one, need to deoptimize it.
> 1328   if (vf->is_compiled_frame()) {
> 
> This line 1327 is preexisting.
Sorry, wrong line number again. 
I think I meant
1333 // eagerly reallocate scalar replaced objects.

But I must admit, the subject is missing. It's one of these 
imperative sentences where the subject is left out, which 
are used throughout documentation.
Bad example, but still a correct sentence, so qualifies 
for punctuation?

Best regards,
  Goetz.


From xxinliu at amazon.com  Thu Jul 23 16:02:42 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 23 Jul 2020 16:02:42 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>,
 <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
Message-ID: <1595520162373.22868@amazon.com>

hi, Tobias, 

That is my intention too, but CompilerOracle doesn't exit the JVM when it encounters parsing errors.
It just extracts as much information from CompileCommand as possible. That makes sense because compiler "directives" are supposed to be optional for program execution.

I do put the error message in the parser's errorbuf. I set a flag "exit_on_error" to quit the JVM after it dumps the parser errors. Yes, I treat undefined intrinsics as fatal errors.
This behavior follows Nils's comment: "I want to see an error on startup if the user has specified unknown intrinsic names." It is also consistent with the JVM option -XX:ControlIntrinsic=.
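
The flow described above can be sketched roughly as follows. This is a Java illustration only, not the HotSpot C++ code: the class, the KNOWN set and the validate helper are hypothetical, and the "+_name,-_name" list format is assumed from the -XX:ControlIntrinsic option.

```java
import java.util.Set;

final class IntrinsicValidationSketch {
    // Hypothetical stand-in for the VM's table of known intrinsic names.
    static final Set<String> KNOWN = Set.of("_hashCode", "_getClass", "_dabs");

    // Checks a list such as "+_hashCode,-_dabs". Unknown names are reported
    // in errorBuf; the caller decides whether that is fatal.
    static boolean validate(String option, StringBuilder errorBuf) {
        boolean ok = true;
        for (String token : option.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue;
            }
            // Strip the leading '+' (enable) or '-' (disable) marker.
            String name = (t.startsWith("+") || t.startsWith("-")) ? t.substring(1) : t;
            if (!KNOWN.contains(name)) {
                errorBuf.append("Unrecognized intrinsic: ").append(name).append('\n');
                ok = false;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        StringBuilder errorBuf = new StringBuilder();
        boolean exitOnError = !validate("+_hashCode,-_noSuchIntrinsic", errorBuf);
        System.out.print(errorBuf);
        System.out.println("exit_on_error = " + exitOnError);
    }
}
```

Keeping the message in the buffer and the decision in a flag lets the existing error reporting run first and leaves the choice to exit with the caller.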

thanks, 
--lx

________________________________________
From: Tobias Hartmann <tobias.hartmann at oracle.com>
Sent: Thursday, July 23, 2020 2:13 AM
To: Liu, Xin; Nils Eliasson; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev
Subject: RE: [EXTERNAL] RFR(S): 8247732: validate user-input intrinsic_ids in ControlIntrinsic


On 22.07.20 09:12, Liu, Xin wrote:
> 1. I moved the validation logic for compiler directives to compilerOracle::scan_flag_and_value.
> If something goes wrong in the parser, the patch will "gracefully" quit the JVM using jvm_exit(1). Is that okay?

With "piggy-back on the error mechanism" I meant that you should use the existing bailout mechanism
in the parser. In this case, couldn't you simply put the error message in 'errorbuf' and let the
caller take care of handling it?

Best regards,
Tobias


From calvin.cheung at oracle.com  Thu Jul 23 16:40:46 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 23 Jul 2020 09:40:46 -0700
Subject: RFR(XS): 8249630: unused is_static_archive parameter in
 SystemDictionaryShared::write_dictionary
In-Reply-To: <bab6559e-a8cf-5987-19f5-0a46e3b65186@oracle.com>
References: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
 <bab6559e-a8cf-5987-19f5-0a46e3b65186@oracle.com>
Message-ID: <517e10ae-c406-44e4-11c9-b3173c6f6b0a@oracle.com>

Thanks Yumin!

Calvin

On 7/22/20 3:58 PM, Yumin Qi wrote:
> Looks good!
>
>
> Thanks
>
> Yumin
>
> On 7/22/20 2:32 PM, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249630
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8249630/webrev.00/
>>
>> Please review this small cleanup.
>>
>> Passed tier1 and tier2 tests.
>>
>> thanks,
>>
>> Calvin
>>

From calvin.cheung at oracle.com  Thu Jul 23 16:41:08 2020
From: calvin.cheung at oracle.com (Calvin Cheung)
Date: Thu, 23 Jul 2020 09:41:08 -0700
Subject: RFR(XS): 8249630: unused is_static_archive parameter in
 SystemDictionaryShared::write_dictionary
In-Reply-To: <14bcb264-61fd-1c46-7366-29bfa5ec19f3@oracle.com>
References: <7bef8c0f-edd4-4852-f6c8-59b61417e604@oracle.com>
 <14bcb264-61fd-1c46-7366-29bfa5ec19f3@oracle.com>
Message-ID: <cd7a17be-706c-0896-4f43-3d08f1267812@oracle.com>

Thanks David!

Calvin

On 7/22/20 9:03 PM, David Holmes wrote:
> Looks good and trivial.
>
> Thanks,
> David
>
> On 23/07/2020 7:32 am, Calvin Cheung wrote:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249630
>>
>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8249630/webrev.00/
>>
>> Please review this small cleanup.
>>
>> Passed tier1 and tier2 tests.
>>
>> thanks,
>>
>> Calvin
>>

From coleen.phillimore at oracle.com  Thu Jul 23 17:05:15 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 23 Jul 2020 13:05:15 -0400
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
Message-ID: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>

Summary: Save and restore mirror oops to temporary array for CDS, and 
move them to OopStorage once restored.

This is a subtask of moving oops out of Universe.  I ran performance 
tests of this and there is no performance change.  There was a slight decrease 
in the number of instructions (improvement!) in Perfstartup-Noop that was 
flagged as significant - 0.10%

Tested with mach5 tier1-3.

open webrev at http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8249938

Thanks,
Coleen

From yumin.qi at oracle.com  Thu Jul 23 17:10:15 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 10:10:15 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
Message-ID: <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>

Hi David,

   Thanks for the review. I've updated the webrev at a new link with your suggestions:

http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
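
For anyone who wants to check which default a given -Xmx produces, as described in the test comments quoted below, a small sketch using the HotSpotDiagnosticMXBean management API is shown here. The class name and the vmOption helper are hypothetical, and the sketch assumes a HotSpot JVM; on other VMs the helper simply returns an empty result.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class CompressedOopsCheck {
    // Returns the current value of a -XX flag, or empty if it cannot be read.
    static Optional<String> vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (RuntimeException e) {
            // For example IllegalArgumentException for an unknown flag.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("UseCompressedOops: "
            + vmOption("UseCompressedOops").orElse("unavailable"));
    }
}
```

Running it once with -Xmx31G and once with -Xmx32G should show whether the flag flips on the build in question.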


Thanks

Yumin


On 7/22/20 9:30 PM, David Holmes wrote:
> Hi Yumin,
>
> Given we have the earlier test:
>
>  112         // ======= archive with compressed oops, run w/o
>
> it would seem better if we had:
>
>  112         // Explicitly archive with compressed oops, run without.
>
> and:
>
>  127         // Implicitly archive with compressed oops, run without.
>  128         // Max heap size for compressed oops is around 31G.
>  129         // UseCompressedOops is turned on by default when the heap
>  130         // size is under 31G, but will be turned off when the heap
>  131         // size is greater than that.
>
> And should we also have the opposite test:
>
> // Explicitly archive without compressed oops and run with.
> // Implicitly archive without compressed oops and run with.
>
> Thanks,
> David
> -----
>
> On 23/07/2020 8:55 am, Yumin Qi wrote:
>> Hi Ioi,
>>
>>    I have updated the wording as you suggested, and also stated more precisely 
>> that the max heap size for compressed oops is around 31G, which is 
>> calculated by max_heap_for_compressed_oops().
>>
>>    Updated in the same webrev.
>>
>> $J6/bin/java -Xshare:on 
>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
>> java version "16-internal" 2021-03-16
>> Java(TM) SE Runtime Environment (slowdebug build 
>> 16-internal+0-adhoc.minqi.open)
>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>
>> $J6/bin/java -Xshare:on 
>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
>> An error has occurred while processing the shared archive file.
>> Unable to use shared archive.
>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>> is different from runtime, CDS will be disabled.
>> Error occurred during initialization of VM
>> Unable to use shared archive.
>>
>>
>> Thanks
>>
>> Yumin
>>
>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>> Hi Yumin,
>>>
>>> Just small nits on the comments:
>>>
>>> // UseCompressedOops default is turned on when heap is under 32G but 
>>> will be
>>>
>>> -> UseCompressedOops is turned on by default ....
>>>
>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>
>>> -> This leads to inconsistency ...
>>>
>>> // of UseCompressedOops at dump time and runtime.
>>>
>>>
>>> Thanks
>>> - Ioi
>>>
>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>> Hi, Please review this tiny change on comment:
>>>>
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>
>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>
>>>>
>>>> Note 8081416 already marked as fixed (thanks Ioi), please read the 
>>>> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>
>>>> Now that CDS can be used with UseCompressedOops disabled, the test 
>>>> already has the correct result.
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Yumin
>>>>
>>>>
>>>

From yumin.qi at oracle.com  Thu Jul 23 17:13:33 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 10:13:33 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
Message-ID: <14ec180b-ccf4-bf58-62a5-af4879e120c6@oracle.com>

Tests: passed local jtreg. Submitted for mach5 tier1/2 and in testing.


On 7/23/20 10:10 AM, Yumin Qi wrote:
> Hi David,
>
>   Thanks for the review. I've updated the webrev at a new link with your suggestions:
>
> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>
>
> Thanks
>
> Yumin
>
>
> On 7/22/20 9:30 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> Given we have the earlier test:
>>
>>  112         // ======= archive with compressed oops, run w/o
>>
>> it would seem better if we had:
>>
>>  112         // Explicitly archive with compressed oops, run without.
>>
>> and:
>>
>>  127         // Implicitly archive with compressed oops, run without.
>>  128         // Max heap size for compressed oops is around 31G.
>>  129         // UseCompressedOops is turned on by default when the heap
>>  130         // size is under 31G, but will be turned off when the heap
>>  131         // size is greater than that.
>>
>> And should we also have the opposite test:
>>
>> // Explicitly archive without compressed oops and run with.
>> // Implicitly archive without compressed oops and run with.
>>
>> Thanks,
>> David
>> -----
>>
>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>> Hi Ioi,
>>>
>>>    I have updated the wording as you suggested, and also stated more precisely 
>>> that the max heap size for compressed oops is around 31G, which is 
>>> calculated by max_heap_for_compressed_oops().
>>>
>>>    Updated in the same webrev.
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
>>> java version "16-internal" 2021-03-16
>>> Java(TM) SE Runtime Environment (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open)
>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
>>> An error has occurred while processing the shared archive file.
>>> Unable to use shared archive.
>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>> is different from runtime, CDS will be disabled.
>>> Error occurred during initialization of VM
>>> Unable to use shared archive.
>>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>> Hi Yumin,
>>>>
>>>> Just small nits on the comments:
>>>>
>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>> but will be
>>>>
>>>> -> UseCompressedOops is turned on by default ....
>>>>
>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>
>>>> -> This leads to inconsistency ...
>>>>
>>>> // of UseCompressedOops at dump time and runtime.
>>>>
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>> Hi, Please review this tiny change on comment:
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>
>>>>>
>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read the 
>>>>> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>
>>>>> Now that CDS can be used with UseCompressedOops disabled, the test 
>>>>> already has the correct result.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yumin
>>>>>
>>>>>
>>>>

From harold.seigel at oracle.com  Thu Jul 23 20:09:37 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Thu, 23 Jul 2020 16:09:37 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
 <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
Message-ID: <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>

Hi David,

The test achieves its goal of running under the interpreter and compiler 
by running with "vm.compMode=Xmixed".

The test fails unless both the interpreter and JIT compiled code 
generate AbstractMethodError exceptions.  I ran the test in tiers 1-5 
without it failing.  So, at least for those test runs, the test achieved 
its goal of running under both the compiler and interpreter.

Perhaps the purpose of "vm.opt.TieredStopAtLevel==4" is to specify the 
tiered behavior if TieredCompilation is specified?
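
As a side note on the old clause quoted below, its redundancy once Graal is excluded can be checked mechanically. This is a plain Java sketch of the boolean expression only; the class and method names are hypothetical, and it models the two jtreg properties as simple booleans.

```java
final class RequiresClauseSketch {
    // Old clause from the test header:
    //   (!vm.graal.enabled | vm.opt.TieredCompilation == true)
    static boolean oldClause(boolean graalEnabled, boolean tieredCompilation) {
        return !graalEnabled | tieredCompilation;
    }

    public static void main(String[] args) {
        // With Graal always excluded, the clause holds for either tiered setting.
        for (boolean tiered : new boolean[] {false, true}) {
            System.out.println("graal=false tiered=" + tiered
                + " -> " + oldClause(false, tiered));
        }
    }
}
```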

Thanks, Harold

On 7/23/2020 9:51 AM, David Holmes wrote:
> Hi Harold,
>
> On 23/07/2020 11:22 pm, Harold Seigel wrote:
>> Hi David,
>>
>> Thanks for looking at this.
>>
>> The existing @requires for test AbstractMethodErrorTest.java 
>> contained this clause:
>>
>>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
>>
>> This clause evaluated to TRUE if either Graal was disabled or 
>> vm.opt.TieredCompilation was true.
>
> Okay so this claimed the test was okay with Graal as long as tiered 
> was enabled but ...
>
>> Since now Graal is always disabled, this clause would always be TRUE,
>
> ... we decided no Graal under any conditions ... okay ...
>
>> regardless of the value of vm.opt.TieredCompilation.  There is no
>> requirement that tiered compilation be enabled for this test.
>
> ... but if tiered is not enabled then what is the significance of 
> "vm.opt.TieredStopAtLevel==4" ?
>
> Sorry but this is one of the most complex and obscure @requires 
> conditions that I've seen. And I don't see how it achieves the goal of 
> running under the interpreter and compiler (per the synopsis)?
>
> Thanks,
> David
>
>> Thanks, Harold
>>
>> On 7/22/2020 11:00 PM, David Holmes wrote:
>>> Hi Harold,
>>>
>>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>>> Hi,
>>>>
>>>> Please review this small fix to avoid running test 
>>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>>> ProblemList.
>>>>
>>>> Open Webrev: 
>>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>>
>>> You seem to have lost the requirement that tiered compilation be 
>>> enabled. ??
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>>
>>>> The change was tested by using mach5 testing and checking that the 
>>>> test was not run in tier*-graal tasks but was run in non-graal tasks.
>>>>
>>>> Thanks, Harold
>>>>

From ioi.lam at oracle.com  Thu Jul 23 20:18:45 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 23 Jul 2020 13:18:45 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
Message-ID: <55616e1d-8cf4-bbca-ce88-19e0c1a20f29@oracle.com>

Looks good to me.

Thanks
- Ioi

On 7/23/20 10:10 AM, Yumin Qi wrote:
> Hi David,
>
>   Thanks for the review. I've updated the webrev at a new link with your suggestions:
>
> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>
>
> Thanks
>
> Yumin
>
>
> On 7/22/20 9:30 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> Given we have the earlier test:
>>
>>  112         // ======= archive with compressed oops, run w/o
>>
>> it would seem better if we had:
>>
>>  112         // Explicitly archive with compressed oops, run without.
>>
>> and:
>>
>>  127         // Implicitly archive with compressed oops, run without.
>>  128         // Max heap size for compressed oops is around 31G.
>>  129         // UseCompressedOops is turned on by default when the heap
>>  130         // size is under 31G, but will be turned off when the heap
>>  131         // size is greater than that.
>>
>> And should we also have the opposite test:
>>
>> // Explicitly archive without compressed oops and run with.
>> // Implicitly archive without compressed oops and run with.
>>
>> Thanks,
>> David
>> -----
>>
>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>> Hi Ioi,
>>>
>>>    I have updated the wording as you suggested, and also stated more precisely 
>>> that the max heap size for compressed oops is around 31G, which is 
>>> calculated by max_heap_for_compressed_oops().
>>>
>>>    Updated in the same webrev.
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
>>> java version "16-internal" 2021-03-16
>>> Java(TM) SE Runtime Environment (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open)
>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
>>> An error has occurred while processing the shared archive file.
>>> Unable to use shared archive.
>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>> is different from runtime, CDS will be disabled.
>>> Error occurred during initialization of VM
>>> Unable to use shared archive.
>>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>> Hi Yumin,
>>>>
>>>> Just small nits on the comments:
>>>>
>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>> but will be
>>>>
>>>> -> UseCompressedOops is turned on by default ....
>>>>
>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>
>>>> -> This leads to inconsistency ...
>>>>
>>>> // of UseCompressedOops at dump time and runtime.
>>>>
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>> Hi, Please review this tiny change on comment:
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>
>>>>>
>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read the 
>>>>> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>
>>>>> Now that CDS can be used with UseCompressedOops disabled, the test 
>>>>> already has the correct result.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yumin
>>>>>
>>>>>
>>>>


From daniel.daugherty at oracle.com  Thu Jul 23 20:28:18 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 23 Jul 2020 16:28:18 -0400
Subject: RFR(T): 8250236: ProblemList
 java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
Message-ID: <28f472e4-b3c5-783d-b0a1-b5c0af72e6df@oracle.com>

Greetings,

I'm making another pass at reducing the noise in the JDK16 CI.

There are currently 34 sightings of this test failure since I filed
the bug on 2020.07.08. I think it is time to ProblemList this test. For
whatever reason, this failure only seems to be happening on Linux-X64
machines.

Here's the context diff for this trivial review:

$ hg diff
diff -r d62da6fc4074 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt    Thu Jul 23 20:25:41 2020 +0100
+++ b/test/jdk/ProblemList.txt    Thu Jul 23 16:20:37 2020 -0400
@@ -571,6 +571,7 @@
 java/lang/ProcessHandle/InfoTest.java 8211847 aix-ppc64
 java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java 8151492 generic-all
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java 8078602 generic-all
+java/lang/invoke/lambda/LambdaFileEncodingSerialization.java 8249079 linux-x64

 ############################################################################


Thanks, in advance, for any comments, questions or suggestions.

Dan

From daniel.daugherty at oracle.com  Thu Jul 23 20:30:38 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 23 Jul 2020 16:30:38 -0400
Subject: RFR(T): 8250236: ProblemList
 java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
Message-ID: <5a510876-73de-4043-47eb-65520fb973eb@oracle.com>

Accidentally pushed send before I was done. Please ignore the previous 
email...

Greetings,

I'm making another pass at reducing the noise in the JDK16 CI.

There are currently 34 sightings of this test failure since I filed
the bug on 2020.07.08. I think it is time to ProblemList this test. For
whatever reason, this failure only seems to be happening on Linux-X64
machines.

Here's the bug:

    JDK-8249079 LambdaFileEncodingSerialization.java failed "exitCode = 1 expected [true] but found [false]"
    https://bugs.openjdk.java.net/browse/JDK-8249079

Here's the ProblemListing bug:

    JDK-8250236 ProblemList java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
    https://bugs.openjdk.java.net/browse/JDK-8250236

Here's the context diff for this trivial review:

$ hg diff
diff -r d62da6fc4074 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt    Thu Jul 23 20:25:41 2020 +0100
+++ b/test/jdk/ProblemList.txt    Thu Jul 23 16:20:37 2020 -0400
@@ -571,6 +571,7 @@
 java/lang/ProcessHandle/InfoTest.java 8211847 aix-ppc64
 java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java 8151492 generic-all
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java 8078602 generic-all
+java/lang/invoke/lambda/LambdaFileEncodingSerialization.java 8249079 linux-x64

 ############################################################################


Thanks, in advance, for any comments, questions or suggestions.

Dan


From Roger.Riggs at oracle.com  Thu Jul 23 20:34:40 2020
From: Roger.Riggs at oracle.com (Roger Riggs)
Date: Thu, 23 Jul 2020 16:34:40 -0400
Subject: RFR(T): 8250236: ProblemList
 java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
In-Reply-To: <5a510876-73de-4043-47eb-65520fb973eb@oracle.com>
References: <5a510876-73de-4043-47eb-65520fb973eb@oracle.com>
Message-ID: <7b772558-3f78-e075-18dd-d3d6a40cb9dd@oracle.com>

Looks good, thanks

On 7/23/20 4:30 PM, Daniel D. Daugherty wrote:
> Accidentally pushed send before I was done. Please ignore the previous 
> email...
>
> Greetings,
>
> I'm making another pass at reducing the noise in the JDK16 CI.
>
> There are currently 34 sightings of this test failure since I filed
> the bug on 2020.07.08. I think it is time to ProblemList this test. For
> whatever reason, this failure only seems to be happening on Linux-X64
> machines.
>
> Here's the bug:
>
>     JDK-8249079 LambdaFileEncodingSerialization.java failed "exitCode = 1 expected [true] but found [false]"
>     https://bugs.openjdk.java.net/browse/JDK-8249079
>
> Here's the ProblemListing bug:
>
>     JDK-8250236 ProblemList java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
>     https://bugs.openjdk.java.net/browse/JDK-8250236
>
> Here's the context diff for this trivial review:
>
> $ hg diff
> diff -r d62da6fc4074 test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt    Thu Jul 23 20:25:41 2020 +0100
> +++ b/test/jdk/ProblemList.txt    Thu Jul 23 16:20:37 2020 -0400
> @@ -571,6 +571,7 @@
>  java/lang/ProcessHandle/InfoTest.java 8211847 aix-ppc64
>  java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java 8151492 generic-all
>  java/lang/invoke/LFCaching/LFGarbageCollectedTest.java 8078602 generic-all
> +java/lang/invoke/lambda/LambdaFileEncodingSerialization.java 8249079 linux-x64
>
>  ############################################################################
>
>
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>


From daniel.daugherty at oracle.com  Thu Jul 23 20:35:34 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 23 Jul 2020 16:35:34 -0400
Subject: RFR(T): 8250236: ProblemList
 java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on linux-x64
In-Reply-To: <7b772558-3f78-e075-18dd-d3d6a40cb9dd@oracle.com>
References: <5a510876-73de-4043-47eb-65520fb973eb@oracle.com>
 <7b772558-3f78-e075-18dd-d3d6a40cb9dd@oracle.com>
Message-ID: <7ec832a4-4484-631e-da1c-6e7704a48991@oracle.com>

Thanks for the fast review.

Dan


On 7/23/20 4:34 PM, Roger Riggs wrote:
> Looks good, thanks
>
> On 7/23/20 4:30 PM, Daniel D. Daugherty wrote:
>> Accidentally pushed send before I was done. Please ignore the 
>> previous email...
>>
>> Greetings,
>>
>> I'm making another pass at reducing the noise in the JDK16 CI.
>>
>> There are currently 34 sightings of this test failure since I filed
>> the bug on 2020.07.08. I think it is time to ProblemList this test. For
>> whatever reason, this failure only seems to be happening on Linux-X64
>> machines.
>>
>> Here's the bug:
>>
>>     JDK-8249079 LambdaFileEncodingSerialization.java failed "exitCode = 1 expected [true] but found [false]"
>>     https://bugs.openjdk.java.net/browse/JDK-8249079
>>
>> Here's the ProblemListing bug:
>>
>>     JDK-8250236 ProblemList 
>> java/lang/invoke/lambda/LambdaFileEncodingSerialization.java on 
>> linux-x64
>>     https://bugs.openjdk.java.net/browse/JDK-8250236
>>
>> Here's the context diff for this trivial review:
>>
>> $ hg diff
>> diff -r d62da6fc4074 test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt    Thu Jul 23 20:25:41 2020 +0100
>> +++ b/test/jdk/ProblemList.txt    Thu Jul 23 16:20:37 2020 -0400
>> @@ -571,6 +571,7 @@
>>  java/lang/ProcessHandle/InfoTest.java 8211847 aix-ppc64
>>  java/lang/invoke/LFCaching/LFMultiThreadCachingTest.java 8151492 generic-all
>>  java/lang/invoke/LFCaching/LFGarbageCollectedTest.java 8078602 generic-all
>> +java/lang/invoke/lambda/LambdaFileEncodingSerialization.java 8249079 linux-x64
>>
>>  ############################################################################
>>
>>
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>


From ioi.lam at oracle.com  Thu Jul 23 20:41:39 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 23 Jul 2020 13:41:39 -0700
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
In-Reply-To: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
References: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
Message-ID: <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>


On 7/23/20 10:05 AM, coleen.phillimore at oracle.com wrote:
> Summary: Save and restore mirror oops to temporary array for CDS, and 
> move them to OopStorage once restored.
>
> This is a subtask of moving oops out of Universe.  I ran performance 
> tests of this and there is no performance change. There was a slight 
> decrease in the number of instructions (improvement!) in Perfstartup-Noop 
> that was flagged as significant - 0.10%
>
> Tested with mach5 tier1-3.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8249938
>
> Thanks,
> Coleen

Hi Coleen,

The changes look good. I think Universe::serialize() can be simplified 
as (not tested):

void Universe::serialize(SerializeClosure* f) {

#if INCLUDE_CDS_JAVA_HEAP
  {
    oop mirror_oop;
    for (int i = T_BOOLEAN; i < T_VOID+1; i++) {
      if (f->is_reading()) {
        f->do_oop(&mirror_oop); // read from archive
        _mirrors[i] = OopHandle(vm_global(), mirror_oop);
      } else {
        mirror_oop = _mirrors[i].resolve();
        f->do_oop(&mirror_oop); // write to archive
      }
      if (mirror_oop != NULL) { // may be null if archived heap is disabled
        java_lang_Class::update_archived_primitive_mirror_native_pointers(mirror_oop);
      }
    }
  }
#endif
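
For readers unfamiliar with the closure pattern used above, here is a minimal, self-contained sketch of the same idea: one loop body serves both dumping and restoring, with a single temporary passed to the closure. It is only an illustration under stated assumptions. The class names mirror HotSpot's, but everything here (the int payload, do_value, kSlots, the vector used as an "archive") is a hypothetical stand-in and not the real HotSpot API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a serialize closure: one interface, two directions.
class SerializeClosure {
public:
  virtual ~SerializeClosure() = default;
  virtual bool reading() const = 0;   // true when restoring from the archive
  virtual void do_value(int* p) = 0;  // reads into *p or writes *p out
};

// Dump time: appends each visited value to the archive.
class WriteClosure : public SerializeClosure {
  std::vector<int>& _archive;
public:
  explicit WriteClosure(std::vector<int>& archive) : _archive(archive) {}
  bool reading() const override { return false; }
  void do_value(int* p) override { _archive.push_back(*p); }
};

// Run time: hands back archived values in the order they were written.
class ReadClosure : public SerializeClosure {
  const std::vector<int>& _archive;
  std::size_t _pos = 0;
public:
  explicit ReadClosure(const std::vector<int>& archive) : _archive(archive) {}
  bool reading() const override { return true; }
  void do_value(int* p) override { *p = _archive.at(_pos++); }
};

const int kSlots = 4;  // hypothetical number of slots

// The same temporary is reused for every slot; the closure decides the direction.
void serialize(SerializeClosure* f, int (&slots)[kSlots]) {
  int tmp = 0;
  for (int i = 0; i < kSlots; i++) {
    if (f->reading()) {
      f->do_value(&tmp);  // read from archive
      slots[i] = tmp;     // then move into the long-lived storage
    } else {
      tmp = slots[i];
      f->do_value(&tmp);  // write to archive
    }
  }
}
```

Running serialize() with a WriteClosure and then with a ReadClosure over the same vector should round-trip the slot contents.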

Thanks
- Ioi

From coleen.phillimore at oracle.com  Thu Jul 23 21:49:50 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 23 Jul 2020 17:49:50 -0400
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
In-Reply-To: <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>
References: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
 <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>
Message-ID: <2c10f73c-81be-8c00-5004-5fd4bdf9b985@oracle.com>


Thank you for reviewing, Ioi.

On 7/23/20 4:41 PM, Ioi Lam wrote:
>
>
> On 7/23/20 10:05 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Save and restore mirror oops to temporary array for CDS, and 
>> move them to OopStorage once restored.
>>
>> This is a subtask of moving oops out of Universe.  I ran performance 
>> tests of this and there is no performance change. There was a slight 
>> decrease in the number of instructions (improvement!) in Perfstartup-Noop 
>> that was flagged as significant - 0.10%
>>
>> Tested with mach5 tier1-3.
>>
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8249938
>>
>> Thanks,
>> Coleen
>
> Hi Coleen,
>
> The changes looks good. I think Universe::serialize() can be 
> simplified as (not tested):
>
> void Universe::serialize(SerializeClosure* f) {
>
> #if INCLUDE_CDS_JAVA_HEAP
>   {
>     oop mirror_oop;
>     for (int i = T_BOOLEAN; i < T_VOID+1; i++) {
>       if (f->is_reading()) {   // f->reading() ...
>         f->do_oop(&mirror_oop); // read from archive
>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>       } else {
>         mirror_oop = _mirrors[i].resolve();
>         f->do_oop(&mirror_oop); // write to archive
>       }
>       if (mirror_oop != NULL) { // may be null if archived heap is disabled
>         java_lang_Class::update_archived_primitive_mirror_native_pointers(mirror_oop);
>       }
>     }
>   }
> #endif

Yes that works and is better.  I think GC hates when the location of 
mirror_oop is the same, but serializing doesn't care.  I've rerun tier1 
tests on it and it all passes.
Thanks!
Coleen
>
> Thanks
> - Ioi


From yumin.qi at oracle.com  Thu Jul 23 23:23:28 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 16:23:28 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <55616e1d-8cf4-bbca-ce88-19e0c1a20f29@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
 <55616e1d-8cf4-bbca-ce88-19e0c1a20f29@oracle.com>
Message-ID: <ffc69ff3-e2b2-122b-45f5-5e113967c4ed@oracle.com>

Hi, Ioi

  Thanks for the re-review!


Yumin

On 7/23/20 1:18 PM, Ioi Lam wrote:
> Looks good to me.
>
> Thanks
> - Ioi
>
> On 7/23/20 10:10 AM, Yumin Qi wrote:
>> Hi, David
>>
>>   Thanks for the review. Updated on new link with your suggestion:
>>
>> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>>
>>
>> Thanks
>>
>> Yumin
>>
>>
>> On 7/22/20 9:30 PM, David Holmes wrote:
>>> Hi Yumin,
>>>
>>> Given we have the earlier test:
>>>
>>>  112         // ======= archive with compressed oops, run w/o
>>>
>>> it would seem better if we had:
>>>
>>>  112         // Explicitly archive with compressed oops, run without.
>>>
>>> and:
>>>
>>>  127         // Implicitly archive with compressed oops, run without.
>>>  128         // Max heap size for compressed oops is around 31G.
>>>  129         // UseCompressedOops is turned on by default when the heap
>>>  130         // size is under 31G, but will be turned off when the heap
>>>  131         // size is greater than that.
>>>
>>> And should we also have the opposite test:
>>>
>>> // Explicitly archive without compressed oops and run with.
>>> // Implicitly archive without compressed oops and run with.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>>> Hi Ioi,
>>>>
>>>>    I have updated the wording as you suggested, and stated more 
>>>> precisely that the max heap size for compressed oops is around 31G, 
>>>> which is calculated by max_heap_for_compressed_oops().
>>>>
>>>>    Updated on the same webrev.
>>>>
>>>> $J6/bin/java -Xshare:on 
>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx31G -version
>>>> java version "16-internal" 2021-03-16
>>>> Java(TM) SE Runtime Environment (slowdebug build 
>>>> 16-internal+0-adhoc.minqi.open)
>>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>>
>>>> $J6/bin/java -Xshare:on 
>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx32G -version
>>>> An error has occurred while processing the shared archive file.
>>>> Unable to use shared archive.
>>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>>> is different from runtime, CDS will be disabled.
>>>> Error occurred during initialization of VM
>>>> Unable to use shared archive.
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Yumin
>>>>
>>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>>> Hi Yumin,
>>>>>
>>>>> Just small nits on the comments:
>>>>>
>>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>>> but will be
>>>>>
>>>>> -> UseCompressedOops is turned on by default ....
>>>>>
>>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>>
>>>>> -> This leads to inconsistency ...
>>>>>
>>>>> // of UseCompressedOops at dump time and runtime.
>>>>>
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>>> Hi, Please review this tiny change on comment:
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>>
>>>>>>
>>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read 
>>>>>> the comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>>
>>>>>> With CDS can be done with UseCompressedOops disabled, the test 
>>>>>> already has correct result.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>
>

From david.holmes at oracle.com  Thu Jul 23 23:25:02 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 09:25:02 +1000
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
 <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
 <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>
Message-ID: <aeea97ae-8894-3a32-9a2d-a3a730db7f5d@oracle.com>

On 24/07/2020 6:09 am, Harold Seigel wrote:
> Hi David,
> 
> The test achieves its goal of running under the interpreter and compiler 
> by running with "vm.compMode=Xmixed".

Okay .. and that precludes Xcomp as well.

> The test fails unless both the interpreter and JIT compiled code 
> generate AbstractMethodError exceptions.  I ran the test in tiers 1-5 
> without it failing.  So, at least for those test runs, the test achieved 
> its goal of running under both the compiler and interpreter.
> 
> Perhaps the purpose of "vm.opt.TieredStopAtLevel==4" is to specify the 
> tiered behavior if TieredCompilation is specified?

Not clear - especially with the explicit WB compilation. This is one 
complex test! :)

Sorry to belabour what should be a trivial fix. As long as it works 
without Graal and now excludes Graal, it is good to go.

Thanks,
David
-----

> Thanks, Harold
> 
> On 7/23/2020 9:51 AM, David Holmes wrote:
>> Hi Harold,
>>
>> On 23/07/2020 11:22 pm, Harold Seigel wrote:
>>> Hi David,
>>>
>>> Thanks for looking at this.
>>>
>>> The existing @requires for test AbstractMethodErrorTest.java 
>>> contained this clause:
>>>
>>>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
>>>
>>> This clause evaluated to TRUE if either Graal was disabled or 
>>> vm.opt.TieredCompilation was true.
>>
>> Okay so this claimed the test was okay with Graal as long as tiered 
>> was enabled but ...
>>
>>> Since now Graal is always disabled, this clause would always be TRUE,
>>
>> ... we decided no Graal under any conditions ... okay ...
>>
>>> regardless of the value of vm.opt.TieredCompilation.  There is no
>>> requirement that tiered compilation be enabled for this test.
>>
>> ... but if tiered is not enabled then what is the significance of 
>> "vm.opt.TieredStopAtLevel==4" ?
>>
>> Sorry but this is one of the most complex and obscure @requires 
>> conditions that I've seen. And I don't see how it achieves the goal of 
>> running under the interpreter and compiler (per the synopsis)?
>>
>> Thanks,
>> David
>>
>>> Thanks, Harold
>>>
>>> On 7/22/2020 11:00 PM, David Holmes wrote:
>>>> Hi Harold,
>>>>
>>>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>>>> Hi,
>>>>>
>>>>> Please review this small fix to avoid running test 
>>>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>>>> ProblemList.
>>>>>
>>>>> Open Webrev: 
>>>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>>>
>>>> You seem to have lost the requirement that tiered compilation be 
>>>> enabled. ??
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>>>
>>>>> The change was tested by using mach5 testing and checking that the 
>>>>> test was not run in tier*-graal tasks but was run in non-graal tasks.
>>>>>
>>>>> Thanks, Harold
>>>>>

From david.holmes at oracle.com  Thu Jul 23 23:31:07 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 09:31:07 +1000
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
Message-ID: <267b24e5-5104-c770-3679-602e745ac46b@oracle.com>

Hi Yumin,

On 24/07/2020 3:10 am, Yumin Qi wrote:
> Hi, David
> 
>    Thanks for the review. Updated on new link with your suggestion:
> 
> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/

  142         // Explicitly archive without compressed oops and run with.
  143         testDump(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
null, false);
  144         testExec(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
null, false);

That should be using +UseCompressedOops for testExec and expect failure.

  145         // Implicitly archive without compressed oops and run with.
  146         testDump(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
  147         testExec(14, "-XX:+UseG1GC", "-Xmx32G", null, false);

That should be using -Xmx1G for testExec and expect failure.
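
To make the intended test matrix concrete, here is a small sketch of the dump-time versus run-time consistency rule being exercised. It is a simplified model under stated assumptions, not HotSpot's implementation: the type and function names are hypothetical, and the threshold is modelled simply as "strictly below 32 GiB" to match the -Xmx31G / -Xmx32G behaviour shown earlier in the thread. The real limit comes from max_heap_for_compressed_oops() and the real VM also adjusts explicit flags in cases this model ignores.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGiB = 1024ull * 1024 * 1024;

// Assumption for illustration only: heaps strictly below 32 GiB get
// compressed oops by default; the exact HotSpot limit is slightly different.
constexpr std::uint64_t kAssumedCompressedOopsLimit = 32 * kGiB;

// Hypothetical model of -XX:+/-UseCompressedOops on the command line.
enum class OopsFlag { Unset, On, Off };

struct VmOptions {
  OopsFlag flag;                 // explicit setting, if any
  std::uint64_t max_heap_bytes;  // value of -Xmx
};

// An explicit flag wins; otherwise the default follows from the heap size.
bool uses_compressed_oops(const VmOptions& o) {
  if (o.flag == OopsFlag::On)  return true;
  if (o.flag == OopsFlag::Off) return false;
  return o.max_heap_bytes < kAssumedCompressedOopsLimit;
}

// The archive is only usable when dump time and run time agree.
bool archive_usable(const VmOptions& dump_time, const VmOptions& run_time) {
  return uses_compressed_oops(dump_time) == uses_compressed_oops(run_time);
}
```

Under this model, dumping with -XX:-UseCompressedOops and running with -XX:+UseCompressedOops is a mismatch, and so is dumping with -Xmx32G and running with -Xmx1G, which is why both testExec calls are expected to fail.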

David
-----

> 
> Thanks
> 
> Yumin
> 
> 
> On 7/22/20 9:30 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> Given we have the earlier test:
>>
>>  112         // ======= archive with compressed oops, run w/o
>>
>> it would seem better if we had:
>>
>>  112         // Explicitly archive with compressed oops, run without.
>>
>> and:
>>
>>  127         // Implicitly archive with compressed oops, run without.
>>  128         // Max heap size for compressed oops is around 31G.
>>  129         // UseCompressedOops is turned on by default when the heap
>>  130         // size is under 31G, but will be turned off when the heap
>>  131         // size is greater than that.
>>
>> And should we also have the opposite test:
>>
>> // Explicitly archive without compressed oops and run with.
>> // Implicitly archive without compressed oops and run with.
>>
>> Thanks,
>> David
>> -----
>>
>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>> Hi Ioi,
>>>
>>>    I have updated the wording as you suggested, and stated more 
>>> precisely that the max heap size for compressed oops is around 31G, 
>>> which is calculated by max_heap_for_compressed_oops().
>>>
>>>    Updated on the same webrev.
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx31G -version
>>> java version "16-internal" 2021-03-16
>>> Java(TM) SE Runtime Environment (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open)
>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>
>>> $J6/bin/java -Xshare:on 
>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx32G -version
>>> An error has occurred while processing the shared archive file.
>>> Unable to use shared archive.
>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>> is different from runtime, CDS will be disabled.
>>> Error occurred during initialization of VM
>>> Unable to use shared archive.
>>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>> Hi Yumin,
>>>>
>>>> Just small nits on the comments:
>>>>
>>>> // UseCompressedOops default is turned on when heap is under 32G but 
>>>> will be
>>>>
>>>> -> UseCompressedOops is turned on by default ....
>>>>
>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>
>>>> -> This leads to inconsistency ...
>>>>
>>>> // of UseCompressedOops at dump time and runtime.
>>>>
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>> Hi, Please review this tiny change on comment:
>>>>>
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>
>>>>>
>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read the 
>>>>> comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>
>>>>> With CDS can be done with UseCompressedOops disabled, the test 
>>>>> already has correct result.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yumin
>>>>>
>>>>>
>>>>

From coleen.phillimore at oracle.com  Fri Jul 24 01:43:23 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 23 Jul 2020 21:43:23 -0400
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
In-Reply-To: <2c10f73c-81be-8c00-5004-5fd4bdf9b985@oracle.com>
References: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
 <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>
 <2c10f73c-81be-8c00-5004-5fd4bdf9b985@oracle.com>
Message-ID: <ef886ba2-724c-f357-1b2e-b667f12063c8@oracle.com>


For the record and for reviewer #2, this is the incremental webrev, with 
a bug fixed in ReadClosure::do_oop.

incr webrev at 
http://cr.openjdk.java.net/~coleenp/2020/8249938.02.incr/webrev
full webrev at http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev

thanks,
Coleen


On 7/23/20 5:49 PM, coleen.phillimore at oracle.com wrote:
>
> Thank you for reviewing, Ioi.
>
> On 7/23/20 4:41 PM, Ioi Lam wrote:
>>
>>
>> On 7/23/20 10:05 AM, coleen.phillimore at oracle.com wrote:
>>> Summary: Save and restore mirror oops to temporary array for CDS, 
>>> and move them to OopStorage once restored.
>>>
>>> This is a subtask of moving oops out of Universe.  I ran performance 
>>> tests of this and there is no performance change. There was a slight 
>>> decrease in the number of instructions (improvement!) in 
>>> Perfstartup-Noop that was flagged as significant - 0.10%
>>>
>>> Tested with mach5 tier1-3.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8249938
>>>
>>> Thanks,
>>> Coleen
>>
>> Hi Coleen,
>>
>> The changes looks good. I think Universe::serialize() can be 
>> simplified as (not tested):
>>
>> void Universe::serialize(SerializeClosure* f) {
>>
>> #if INCLUDE_CDS_JAVA_HEAP
>>   {
>>     oop mirror_oop;
>>     for (int i = T_BOOLEAN; i < T_VOID+1; i++) {
>>       if (f->is_reading()) {   // f->reading() ...
>>         f->do_oop(&mirror_oop); // read from archive
>>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>>       } else {
>>         mirror_oop = _mirrors[i].resolve();
>>         f->do_oop(&mirror_oop); // write to archive
>>       }
>>       if (mirror_oop != NULL) { // may be null if archived heap is disabled
>>         java_lang_Class::update_archived_primitive_mirror_native_pointers(mirror_oop);
>>       }
>>     }
>>   }
>> #endif
>
> Yes that works and is better.  I think GC hates when the location of 
> mirror_oop is the same, but serializing doesn't care.  I've rerun 
> tier1 tests on it and it all passes.


> Thanks!
> Coleen
>>
>> Thanks
>> - Ioi
>


From Nikola.Grcevski at microsoft.com  Fri Jul 24 01:47:41 2020
From: Nikola.Grcevski at microsoft.com (Nikola Grcevski)
Date: Fri, 24 Jul 2020 01:47:41 +0000
Subject: RFR(s): Support graceful application termination on Windows
 shutdown/logoff 
Message-ID: <DM6PR21MB12890934E3B440B16CB962B1F5770@DM6PR21MB1289.namprd21.prod.outlook.com>

Hello hotspot-runtime-dev,

After some recent investigation into stale files remaining after Java process terminates
on Windows shutdown, we noticed that there's missing support for detecting Windows
shutdown/logoff events for interactive Java applications. Given that Java loads both 
GDI32.dll and USER32.dll, even for console applications, this means that almost all Java 
processes launched on Windows don't run any shutdown hooks at the moment on
user logoff or system shutdown/restart.

Since Windows 7, all Windows applications that load (or transitively call) GDI32.dll or USER32.dll
will not receive the CTRL_LOGOFF_EVENT and CTRL_SHUTDOWN_EVENT events, but
instead they will be sent WM_ENDSESSION.

This is documented in MSDN under the following article:

https://docs.microsoft.com/en-us/windows/console/setconsolectrlhandler

It appears that this issue was logged in JBS at some point, but it was
closed as a duplicate of another issue:

https://bugs.openjdk.java.net/browse/JDK-8079631 

The behaviour changed going from Windows Vista to Windows 7.

I've made a proposal patch to address this issue under the following webrev:

http://cr.openjdk.java.net/~adityam/nikola/wm_endsession_handling/

At the moment only AWT applications would terminate gracefully on 
shutdown/logoff, because they have support for listening on WM_ENDSESSION.
There's a bug in the AWT code, it doesn't check for wparam upon receiving the
event, but it will work in most cases. If this patch is accepted I can submit a
follow-up patch for AWT to resolve the possible issues.
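
As an illustration of the wparam check mentioned above, here is a portable sketch of the decision a window procedure would make. It is a model under stated assumptions rather than code from the webrev or from AWT: the function name is hypothetical, the constant is defined locally instead of including the Windows headers, and it relies on the documented meaning of WM_ENDSESSION, where a non-zero wParam means the session really is ending.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of the WM_ENDSESSION message id (0x0016 in the Windows headers),
// defined here so the sketch compiles without <windows.h>.
constexpr std::uint32_t kWmEndSession = 0x0016;

// Hypothetical helper: returns true only when the message is WM_ENDSESSION
// and wParam says the session is actually ending. A handler that ignores
// wParam would also react when the end of the session was cancelled.
bool should_start_shutdown(std::uint32_t message, std::uintptr_t wparam) {
  return message == kWmEndSession && wparam != 0;
}
```

A message handler could call this helper before kicking off the JVM shutdown sequence, so that shutdown hooks only run when the logoff or shutdown is really going ahead.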

Finally, there is a third set of events for service processes, for example
Java applications which are started with a Windows Service wrapper. These
services work with SERVICE_ACCEPT_SHUTDOWN and SERVICE_CONTROL_SHUTDOWN.
Once the most common case is resolved, I'd like to submit perhaps a follow-up patch
to support graceful termination of Java as Windows service programs.

We are working to amend the MSDN documentation for SetConsoleCtrlHandler
to specify that this behaviour change is also present on server OSs. The documentation only 
mentions the workstation OS flavours at the moment.

Thanks in advance for reviewing this.

Nikola Grcevski
Microsoft

From david.holmes at oracle.com  Fri Jul 24 03:01:56 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 13:01:56 +1000
Subject: RFR(s): Support graceful application termination on Windows
 shutdown/logoff
In-Reply-To: <DM6PR21MB12890934E3B440B16CB962B1F5770@DM6PR21MB1289.namprd21.prod.outlook.com>
References: <DM6PR21MB12890934E3B440B16CB962B1F5770@DM6PR21MB1289.namprd21.prod.outlook.com>
Message-ID: <b1d37679-f376-b9d8-0745-23c4f78ba304@oracle.com>

Hi Nikola,

I'm redirecting this to the core-libs team initially because this is an 
issue that has been raised and discussed considerably in the past 
(possibly with some misunderstanding relating to the WM_ENDSESSION 
event). The core-libs team need to confirm the intended semantics here 
and we (runtime) can then implement whatever is determined to be needed. 
Interaction with the client team for AWT interoperability may also be 
needed.

Thanks,
David

On 24/07/2020 11:47 am, Nikola Grcevski wrote:
> Hello hotspot-runtime-dev,
> 
> After some recent investigation into stale files remaining after Java process terminates
> on Windows shutdown, we noticed that there's missing support for detecting Windows
> shutdown/logoff events for interactive Java applications. Given that Java loads both
> GDI32.dll and USER32.dll, even for console applications, this means that almost all Java
> processes launched on Windows don't run any shutdown hooks at the moment on
> user logoff or system shutdown/restart.
> 
> Since Windows 7, all Windows applications that load (or transitively call) GDI32.dll or USER32.dll
> will not receive the CTRL_LOGOFF_EVENT and CTRL_SHUTDOWN_EVENT events, but
> instead they will be sent WM_ENDSESSION.
> 
> This is documented in MSDN under the following article:
> 
> https://docs.microsoft.com/en-us/windows/console/setconsolectrlhandler
> 
> It appears that this issue was logged in JBS at some point, but it was
> closed as a duplicate of another issue:
> 
> https://bugs.openjdk.java.net/browse/JDK-8079631
> 
> The behaviour changed going from Windows Vista to Windows 7.
> 
> I've made a proposal patch to address this issue under the following webrev:
> 
> http://cr.openjdk.java.net/~adityam/nikola/wm_endsession_handling/
> 
> At the moment only AWT applications would terminate gracefully on
> shutdown/logoff, because they have support for listening on WM_ENDSESSION.
> There's a bug in the AWT code, it doesn't check for wparam upon receiving the
> event, but it will work in most cases. If this patch is accepted I can submit a
> follow-up patch for AWT to resolve the possible issues.
> 
> Finally, there is a third set of events for service processes, for example
> Java applications which are started with a Windows Service wrapper. These
> services work with SERVICE_ACCEPT_SHUTDOWN and SERVICE_CONTROL_SHUTDOWN.
> Once the most common case is resolved, I'd like to submit perhaps a follow-up patch
> to support graceful termination of Java as Windows service programs.
> 
> We are working to amend the MSDN documentation for SetConsoleCtrlHandler
> to specify that this behaviour change is also present on server OSs. The documentation only
> mentions the workstation OS flavours at the moment.
> 
> Thanks in advance for reviewing this.
> 
> Nikola Grcevski
> Microsoft
> 

From david.holmes at oracle.com  Fri Jul 24 03:49:58 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 13:49:58 +1000
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
Message-ID: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/

Please review this simple optimization contributed by Christoph Dreis in 
its initial form and then expanded by me to cover other cases in jvm.cpp.

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040025.html

There is a common pattern of code of the form:

if (java_lang_Class::is_primitive(JNIHandles::resolve_non_null(ofClass)) ||
    !java_lang_Class::as_Klass(JNIHandles::resolve_non_null(ofClass))->is_instance_klass()) {

which resolves ofClass twice. There are also duplicate calls to as_Klass 
that can be removed in a couple of cases.
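
The shape of the optimization can be sketched with stand-in types. This is not the jvm.cpp code from the webrev: Mirror, Klass, the free functions and the call counter are hypothetical placeholders for oop, Klass, JNIHandles::resolve_non_null and the java_lang_Class helpers, used only to show that resolving once into a local gives the same answer with fewer resolve calls.

```cpp
#include <cassert>

// Hypothetical stand-ins for HotSpot's Klass and the java.lang.Class mirror.
struct Klass {
  bool instance_klass;
  bool is_instance_klass() const { return instance_klass; }
};
struct Mirror {
  bool primitive;
  const Klass* klass;
};

static int g_resolve_calls = 0;  // counts handle resolutions for the demo

// Stand-in for JNIHandles::resolve_non_null.
const Mirror* resolve_non_null(const Mirror* handle) {
  ++g_resolve_calls;
  return handle;
}
bool is_primitive(const Mirror* m) { return m->primitive; }
const Klass* as_klass(const Mirror* m) { return m->klass; }

// Before: the handle is resolved once per use.
bool rejects_before(const Mirror* handle) {
  return is_primitive(resolve_non_null(handle)) ||
         !as_klass(resolve_non_null(handle))->is_instance_klass();
}

// After: resolve once, keep the result in a local, reuse it.
bool rejects_after(const Mirror* handle) {
  const Mirror* mirror = resolve_non_null(handle);
  return is_primitive(mirror) || !as_klass(mirror)->is_instance_klass();
}
```

For a non-primitive, non-instance class (for example an array class) the first form resolves the handle twice and the second only once, while both return the same result.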

Testing: tiers 1 - 3

Thanks,
David

From yumin.qi at oracle.com  Fri Jul 24 05:01:58 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 22:01:58 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <267b24e5-5104-c770-3679-602e745ac46b@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
 <267b24e5-5104-c770-3679-602e745ac46b@oracle.com>
Message-ID: <ed88834d-2686-6558-cffd-0371d126a0e8@oracle.com>

Hi, David

  Sorry, I misunderstood the sentence as "without xxxx and run without 
(xxxx)".

  Updated at new link: 
http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-02/

  Also re-grouped similar option tests. Passed local jtreg.

Thanks

Yumin


On 7/23/20 4:31 PM, David Holmes wrote:
> Hi Yumin,
>
> On 24/07/2020 3:10 am, Yumin Qi wrote:
>> Hi, David
>>
>>    Thanks for the review. Updated on new link with your suggestion:
>>
>> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>
>  142         // Explicitly archive without compressed oops and run with.
>  143         testDump(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
> null, false);
>  144         testExec(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
> null, false);
>
> That should be using +UseCompressedOops for testExec and expect failure.
>
>  145         // Implicitly archive without compressed oops and run with.
>  146         testDump(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>  147         testExec(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>
> That should be using -Xmx1G for testExec and expect failure.
>
> David
> -----
>
>>
>> Thanks
>>
>> Yumin
>>
>>
>> On 7/22/20 9:30 PM, David Holmes wrote:
>>> Hi Yumin,
>>>
>>> Given we have the earlier test:
>>>
>>>  112         // ======= archive with compressed oops, run w/o
>>>
>>> it would seem better if we had:
>>>
>>>  112         // Explicitly archive with compressed oops, run without.
>>>
>>> and:
>>>
>>>  127         // Implicitly archive with compressed oops, run without.
>>>  128         // Max heap size for compressed oops is around 31G.
>>>  129         // UseCompressedOops is turned on by default when the heap
>>>  130         // size is under 31G, but will be turned off when the heap
>>>  131         // size is greater than that.
>>>
>>> And should we also have the opposite test:
>>>
>>> // Explicitly archive without compressed oops and run with.
>>> // Implicitly archive without compressed oops and run with.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>>> Hi Ioi,
>>>>
>>>>    I have updated the wording as you suggested, and stated more 
>>>> precisely that the max heap size for compressed oops is around 31G, 
>>>> which is calculated by max_heap_for_compressed_oops().
>>>>
>>>>    Updated on the same webrev.
>>>>
>>>> $J6/bin/java -Xshare:on 
>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx31G -version
>>>> java version "16-internal" 2021-03-16
>>>> Java(TM) SE Runtime Environment (slowdebug build 
>>>> 16-internal+0-adhoc.minqi.open)
>>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>>
>>>> $J6/bin/java -Xshare:on 
>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa  -Xmx32G -version
>>>> An error has occurred while processing the shared archive file.
>>>> Unable to use shared archive.
>>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>>> is different from runtime, CDS will be disabled.
>>>> Error occurred during initialization of VM
>>>> Unable to use shared archive.
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Yumin
>>>>
>>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>>> Hi Yumin,
>>>>>
>>>>> Just small nits on the comments:
>>>>>
>>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>>> but will be
>>>>>
>>>>> -> UseCompressedOops is turned on by default ....
>>>>>
>>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>>
>>>>> -> This leads to inconsistency ...
>>>>>
>>>>> // of UseCompressedOops at dump time and runtime.
>>>>>
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>>> Hi, Please review this tiny change on comment:
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>>
>>>>>>
>>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read 
>>>>>> the comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>>
>>>>>> With CDS can be done with UseCompressedOops disabled, the test 
>>>>>> already has correct result.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Yumin
>>>>>>
>>>>>>
>>>>>

From david.holmes at oracle.com  Fri Jul 24 05:13:56 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 15:13:56 +1000
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <ed88834d-2686-6558-cffd-0371d126a0e8@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
 <267b24e5-5104-c770-3679-602e745ac46b@oracle.com>
 <ed88834d-2686-6558-cffd-0371d126a0e8@oracle.com>
Message-ID: <5ac8ea05-f81e-b1a9-1af2-f052bdfd538e@oracle.com>

Looks good!

Thanks,
David

On 24/07/2020 3:01 pm, Yumin Qi wrote:
> Hi, David
> 
>    Sorry, I misunderstood the sentence as "without xxxx and run without 
> (xxxx)".
> 
>    Updated at new link: 
> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-02/
> 
>    Also regrouped similar option tests. Passed local jtreg.
> 
> Thanks
> 
> Yumin
> 
> 
> On 7/23/20 4:31 PM, David Holmes wrote:
>> Hi Yumin,
>>
>> On 24/07/2020 3:10 am, Yumin Qi wrote:
>>> HI, David
>>>
>>>    Thanks for the review. Updated on new link with your suggestion:
>>>
>>> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>>
>>  142         // Explicitly archive without compressed oops and run with.
>>  143         testDump(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
>> null, false);
>>  144         testExec(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
>> null, false);
>>
>> That should be using +UseCompressedOops for testExec and expect failure.
>>
>>  145         // Implicitly archive without compressed oops and run with.
>>  146         testDump(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>>  147         testExec(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>>
>> That should be using -Xmx1G for testExec and expect failure.
>>
>> David
>> -----
>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>>
>>> On 7/22/20 9:30 PM, David Holmes wrote:
>>>> Hi Yumin,
>>>>
>>>> Given we have the earlier test:
>>>>
>>>>  112         // ======= archive with compressed oops, run w/o
>>>>
>>>> it would seem better if we had:
>>>>
>>>>  112         // Explicitly archive with compressed oops, run without.
>>>>
>>>> and:
>>>>
>>>>  127         // Implicitly archive with compressed oops, run without.
>>>>  128         // Max heap size for compressed oops is around 31G.
>>>>  129         // UseCompressedOops is turned on by default when the heap
>>>>  130         // size is under 31G, but will be turned off when the heap
>>>>  131         // size is greater than that.
>>>>
>>>> And should we also have the opposite test:
>>>>
>>>> // Explicitly archive without compressed oops and run with.
>>>> // Implicitly archive without compressed oops and run with.
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>>>> Hi Ioi,
>>>>>
>>>>>    I have updated the wording as you suggested, and also made it more 
>>>>> precise: the max heap size for compressed oops is around 31G, which is 
>>>>> calculated by max_heap_for_compressed_oops().
>>>>>
>>>>>    Updated on the same webrev.
>>>>>
>>>>> $J6/bin/java -Xshare:on 
>>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
>>>>> java version "16-internal" 2021-03-16
>>>>> Java(TM) SE Runtime Environment (slowdebug build 
>>>>> 16-internal+0-adhoc.minqi.open)
>>>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>>>
>>>>> $J6/bin/java -Xshare:on 
>>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
>>>>> An error has occurred while processing the shared archive file.
>>>>> Unable to use shared archive.
>>>>> The saved state of UseCompressedOops and UseCompressedClassPointers 
>>>>> is different from runtime, CDS will be disabled.
>>>>> Error occurred during initialization of VM
>>>>> Unable to use shared archive.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Yumin
>>>>>
>>>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>>>> Hi Yumin,
>>>>>>
>>>>>> Just small nits on the comments:
>>>>>>
>>>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>>>> but will be
>>>>>>
>>>>>> -> UseCompressedOops is turned on by default ....
>>>>>>
>>>>>> // turned off when heap is greater than 32G. This leads inconsistency
>>>>>>
>>>>>> -> This leads to inconsistency ...
>>>>>>
>>>>>> // of UseCompressedOops at dump time and runtime.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>>>> Hi, Please review this tiny change on comment:
>>>>>>>
>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>>>
>>>>>>>
>>>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read 
>>>>>>> the comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>>>
>>>>>>> Since CDS can be done with UseCompressedOops disabled, the test 
>>>>>>> already has the correct result.
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Yumin
>>>>>>>
>>>>>>>
>>>>>>
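
The four dump/run combinations discussed in this thread (explicit or implicit, with or without compressed oops) can be sketched as a small model. This is only an illustration under stated assumptions: the 31G limit is taken from the "around 31G" figure quoted above rather than computed, and `Flag`, `effective_compressed_oops`, and `archive_usable` are hypothetical names, not HotSpot functions.

```cpp
#include <cstdint>

// One gigabyte in bytes.
const std::uint64_t G = 1024ULL * 1024ULL * 1024ULL;

// ASSUMPTION: the thread says the limit is "around 31G"; the real value is
// computed by the VM at startup and is not exactly this constant.
const std::uint64_t kAssumedCompressedOopsLimit = 31 * G;

// Hypothetical model of how -XX:+/-UseCompressedOops was given on the command line.
enum class Flag { Default, On, Off };

// An explicit flag wins; otherwise the default depends on the max heap size.
bool effective_compressed_oops(Flag flag, std::uint64_t max_heap_bytes) {
  if (flag == Flag::On)  return true;
  if (flag == Flag::Off) return false;
  return max_heap_bytes <= kAssumedCompressedOopsLimit;
}

// The archive is usable only if the dump-time and run-time settings agree.
bool archive_usable(bool dump_time_setting, bool run_time_setting) {
  return dump_time_setting == run_time_setting;
}
```

Under this model, dumping with -Xmx32G and running with -Xmx1G is a mismatch, as is dumping with -XX:-UseCompressedOops and running with -XX:+UseCompressedOops, which matches the failures the review asks the test to expect.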

From yumin.qi at oracle.com  Fri Jul 24 05:14:55 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 22:14:55 -0700
Subject: RFR(XXS) 8249624: update appcds/sharedStrings/IncompatibleOptions
 test in view of 8081416 closed as WNF
In-Reply-To: <5ac8ea05-f81e-b1a9-1af2-f052bdfd538e@oracle.com>
References: <89e9c141-9ade-537a-7e67-b8283b7b2879@oracle.com>
 <b462ace6-2333-8e08-3cb0-40da72be6ec0@oracle.com>
 <3f32be0d-358f-7b5a-28d1-50b9192ed832@oracle.com>
 <aecb5d60-a47f-f383-4e20-230423986ee3@oracle.com>
 <505684f4-42a7-98cc-1a4d-93235de63252@oracle.com>
 <267b24e5-5104-c770-3679-602e745ac46b@oracle.com>
 <ed88834d-2686-6558-cffd-0371d126a0e8@oracle.com>
 <5ac8ea05-f81e-b1a9-1af2-f052bdfd538e@oracle.com>
Message-ID: <43fda4d8-52c6-236f-46f1-65fd3e612c11@oracle.com>

Thanks David!

Yumin

On 7/23/20 10:13 PM, David Holmes wrote:
> Looks good!
>
> Thanks,
> David
>
> On 24/07/2020 3:01 pm, Yumin Qi wrote:
>> Hi, David
>>
>>    Sorry, I misunderstood the sentence as "without xxxx and run 
>> without (xxxx)".
>>
>>    Updated at new link: 
>> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-02/
>>
>>    Also regrouped similar option tests. Passed local jtreg.
>>
>> Thanks
>>
>> Yumin
>>
>>
>> On 7/23/20 4:31 PM, David Holmes wrote:
>>> Hi Yumin,
>>>
>>> On 24/07/2020 3:10 am, Yumin Qi wrote:
>>>> HI, David
>>>>
>>>>    Thanks for the review. Updated on new link with your suggestion:
>>>>
>>>> http://cr.openjdk.java.net/~minqi/2020/8249624/webrev-01/
>>>
>>>  142         // Explicitly archive without compressed oops and run 
>>> with.
>>>  143         testDump(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
>>> null, false);
>>>  144         testExec(13, "-XX:+UseG1GC", "-XX:-UseCompressedOops", 
>>> null, false);
>>>
>>> That should be using +UseCompressedOops for testExec and expect 
>>> failure.
>>>
>>>  145         // Implicitly archive without compressed oops and run 
>>> with.
>>>  146         testDump(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>>>  147         testExec(14, "-XX:+UseG1GC", "-Xmx32G", null, false);
>>>
>>> That should be using -Xmx1G for testExec and expect failure.
>>>
>>> David
>>> -----
>>>
>>>>
>>>> Thanks
>>>>
>>>> Yumin
>>>>
>>>>
>>>> On 7/22/20 9:30 PM, David Holmes wrote:
>>>>> Hi Yumin,
>>>>>
>>>>> Given we have the earlier test:
>>>>>
>>>>>  112         // ======= archive with compressed oops, run w/o
>>>>>
>>>>> it would seem better if we had:
>>>>>
>>>>>  112         // Explicitly archive with compressed oops, run without.
>>>>>
>>>>> and:
>>>>>
>>>>>  127         // Implicitly archive with compressed oops, run without.
>>>>>  128         // Max heap size for compressed oops is around 31G.
>>>>>  129         // UseCompressedOops is turned on by default when the 
>>>>> heap
>>>>>  130         // size is under 31G, but will be turned off when the 
>>>>> heap
>>>>>  131         // size is greater than that.
>>>>>
>>>>> And should we also have the opposite test:
>>>>>
>>>>> // Explicitly archive without compressed oops and run with.
>>>>> // Implicitly archive without compressed oops and run with.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>> On 23/07/2020 8:55 am, Yumin Qi wrote:
>>>>>> Hi Ioi,
>>>>>>
>>>>>>    I have updated the wording as you suggested, and also made it 
>>>>>> more precise: the max heap size for compressed oops is around 31G, 
>>>>>> which is calculated by max_heap_for_compressed_oops().
>>>>>>
>>>>>>    Updated on the same webrev.
>>>>>>
>>>>>> $J6/bin/java -Xshare:on 
>>>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx31G -version
>>>>>> java version "16-internal" 2021-03-16
>>>>>> Java(TM) SE Runtime Environment (slowdebug build 
>>>>>> 16-internal+0-adhoc.minqi.open)
>>>>>> Java HotSpot(TM) 64-Bit Server VM (slowdebug build 
>>>>>> 16-internal+0-adhoc.minqi.open, mixed mode, sharing)
>>>>>>
>>>>>> $J6/bin/java -Xshare:on 
>>>>>> -XX:SharedArchiveFile=$J6/lib/server/classes.jsa -Xmx32G -version
>>>>>> An error has occurred while processing the shared archive file.
>>>>>> Unable to use shared archive.
>>>>>> The saved state of UseCompressedOops and 
>>>>>> UseCompressedClassPointers is different from runtime, CDS will be 
>>>>>> disabled.
>>>>>> Error occurred during initialization of VM
>>>>>> Unable to use shared archive.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Yumin
>>>>>>
>>>>>> On 7/22/20 2:06 PM, Ioi Lam wrote:
>>>>>>> Hi Yumin,
>>>>>>>
>>>>>>> Just small nits on the comments:
>>>>>>>
>>>>>>> // UseCompressedOops default is turned on when heap is under 32G 
>>>>>>> but will be
>>>>>>>
>>>>>>> -> UseCompressedOops is turned on by default ....
>>>>>>>
>>>>>>> // turned off when heap is greater than 32G. This leads 
>>>>>>> inconsistency
>>>>>>>
>>>>>>> -> This leads to inconsistency ...
>>>>>>>
>>>>>>> // of UseCompressedOops at dump time and runtime.
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>> On 7/22/20 1:47 PM, Yumin Qi wrote:
>>>>>>>> Hi, Please review this tiny change on comment:
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8249624
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/2020/8249624/webrev/
>>>>>>>>
>>>>>>>>
>>>>>>>> Note 8081416 already marked as fixed (thanks Ioi), please read 
>>>>>>>> the comment on https://bugs.openjdk.java.net/browse/JDK-8081416
>>>>>>>>
>>>>>>>> Since CDS can be done with UseCompressedOops disabled, the test 
>>>>>>>> already has the correct result.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>>
>>>>>>>

From david.holmes at oracle.com  Fri Jul 24 05:30:03 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 15:30:03 +1000
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
In-Reply-To: <ef886ba2-724c-f357-1b2e-b667f12063c8@oracle.com>
References: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
 <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>
 <2c10f73c-81be-8c00-5004-5fd4bdf9b985@oracle.com>
 <ef886ba2-724c-f357-1b2e-b667f12063c8@oracle.com>
Message-ID: <ddc2e5bc-c2c5-3c43-445a-06a5b21061db@oracle.com>

Hi Coleen,

This all seems fine to me.

Thanks,
David

On 24/07/2020 11:43 am, coleen.phillimore at oracle.com wrote:
> 
> For the record and for reviewer #2, this is the incremental webrev, with 
> a bug fixed in ReadClosure::do_oop.
> 
> incr webrev at 
> http://cr.openjdk.java.net/~coleenp/2020/8249938.02.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev
> 
> thanks,
> Coleen
> 
> 
> On 7/23/20 5:49 PM, coleen.phillimore at oracle.com wrote:
>>
>> Thank you for reviewing, Ioi.
>>
>> On 7/23/20 4:41 PM, Ioi Lam wrote:
>>>
>>>
>>> On 7/23/20 10:05 AM, coleen.phillimore at oracle.com wrote:
>>>> Summary: Save and restore mirror oops to temporary array for CDS, 
>>>> and move them to OopStorage once restored.
>>>>
>>>> This is a subtask of moving oops out of Universe.  I ran performance 
>>>> tests on this and there is no performance change. There was a slight 
>>>> decrease in the number of instructions (improvement!) in 
>>>> Perfstartup-Noop that was flagged as significant: 0.10%
>>>>
>>>> Tested with mach5 tier1-3.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8249938
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>> Hi Coleen,
>>>
>>> The changes look good. I think Universe::serialize() can be 
>>> simplified as (not tested):
>>>
>>> void Universe::serialize(SerializeClosure* f) {
>>>
>>> #if INCLUDE_CDS_JAVA_HEAP
>>>   {
>>>     oop mirror_oop;
>>>     for (int i = T_BOOLEAN; i < T_VOID+1; i++) {
>>>       if (f->is_reading()) {   // f->reading() ...
>>>         f->do_oop(&mirror_oop); // read from archive
>>>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>>>       } else {
>>>         mirror_oop = _mirrors[i].resolve();
>>>         f->do_oop(&mirror_oop); // write to archive
>>>       }
>>>       if (mirror_oop != NULL) { // may be null if archived heap is disabled
>>>         java_lang_Class::update_archived_primitive_mirror_native_pointers(mirror_oop);
>>>       }
>>>     }
>>>   }
>>> #endif
>>
>> Yes, that works and is better.  I think GC hates when the location of 
>> mirror_oop is the same, but serializing doesn't care.  I've rerun 
>> tier1 tests on it and it all passes.
> 
> 
>> Thanks!
>> Coleen
>>>
>>> Thanks
>>> - Ioi
>>
> 
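
The simplification suggested above relies on one loop serving both directions, with the closure deciding whether a slot is read or written. The following is a minimal standalone sketch of that shape, using plain `int` slots instead of oops. `SerializeClosure`, `WriteClosure`, `ReadClosure`, and `serialize_slots` here are simplified stand-ins written for illustration; they are assumptions and not the HotSpot classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical miniature of a serialize closure; not the HotSpot class.
struct SerializeClosure {
  virtual ~SerializeClosure() {}
  virtual void do_int(int* p) = 0;   // read into *p or write *p out
  virtual bool reading() const = 0;  // true when restoring from the archive
};

// Appends every visited value to an in-memory "archive".
struct WriteClosure : SerializeClosure {
  std::vector<int> archive;
  void do_int(int* p) override { archive.push_back(*p); }
  bool reading() const override { return false; }
};

// Replays the archived values in the order they were written.
struct ReadClosure : SerializeClosure {
  std::vector<int> archive;
  std::size_t pos = 0;
  explicit ReadClosure(const std::vector<int>& a) : archive(a) {}
  void do_int(int* p) override { *p = archive.at(pos++); }
  bool reading() const override { return true; }
};

// One loop for both directions, reusing a single temporary as in the
// suggestion: on read the temporary is filled and then stored into the slot;
// on write the slot is copied into the temporary and then emitted.
void serialize_slots(SerializeClosure* f, int* slots, std::size_t n) {
  int tmp = 0;
  for (std::size_t i = 0; i < n; i++) {
    if (f->reading()) {
      f->do_int(&tmp);
      slots[i] = tmp;
    } else {
      tmp = slots[i];
      f->do_int(&tmp);
    }
  }
}
```

Writing an array through a `WriteClosure` and then replaying its archive through a `ReadClosure` restores the same values in the same order, which is the property the shared loop depends on.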

From yumin.qi at oracle.com  Fri Jul 24 05:54:31 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Thu, 23 Jul 2020 22:54:31 -0700
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
Message-ID: <7974f5e3-9cfe-cf33-cf4f-a0efed33ac70@oracle.com>

Hi, David

   Looks good to me. I have done a quick search and found (may not cover 
all):

1) 
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compileBroker.cpp#L834

    Where thread_handle is resolved multiple times.

2) 
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/jni.cpp#L1174

    Where clazz is resolved twice.

    Do you want to include those two files in your list?

Thanks

Yumin

On 7/23/20 8:49 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/
>
> Please review this simple optimization contributed by Christoph Dreis 
> in its initial form and then expanded by me to cover other cases in 
> jvm.cpp.
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040025.html 
>
>
> There is a common pattern of code of the form:
>
> if (java_lang_Class::is_primitive(JNIHandles::resolve_non_null(ofClass)) ||
>     !java_lang_Class::as_Klass(JNIHandles::resolve_non_null(ofClass))->is_instance_klass()) {
>
> which resolves cls twice. There are also duplicate calls to as_Klass 
> that can be removed in a couple of cases.
>
> Testing: tiers 1 - 3
>
> Thanks,
> David

From david.holmes at oracle.com  Fri Jul 24 06:44:00 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 16:44:00 +1000
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <7974f5e3-9cfe-cf33-cf4f-a0efed33ac70@oracle.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
 <7974f5e3-9cfe-cf33-cf4f-a0efed33ac70@oracle.com>
Message-ID: <2712b135-e8c4-1b5b-b59f-c7b0be201516@oracle.com>

Hi Yumin,

On 24/07/2020 3:54 pm, Yumin Qi wrote:
> Hi, David
> 
>    Looks good to me.

Thanks for taking a look at this.

> I have done a quick search and found (may not cover all):

I must stress I am sponsoring Christoph's change and only extended it 
within the current file. :) I'm sure there are many, many more 
opportunities for similar optimisations. Though you have to be careful 
to ensure you don't expose an unhandled oop.

> 1) 
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compileBroker.cpp#L834 
> 
>     Where thread_handle is resolved multiple times.

True, that is unnecessary, but this is compiler code and I'd need to 
extend the review so ... I'll pass on this one.

> 2) 
> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/jni.cpp#L1174 
> 
>     Where clazz is resolved twice.

Fixed this as it is the same pattern in a core runtime file. Webrev 
updated in place.

Thanks,
David


>     Do you want to include those two files in your list?
> 
> Thanks
> 
> Yumin
> 
> On 7/23/20 8:49 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
>> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/
>>
>> Please review this simple optimization contributed by Christoph Dreis 
>> in its initial form and then expanded by me to cover other cases in 
>> jvm.cpp.
>>
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040025.html 
>>
>>
>> There is a common pattern of code of the form:
>>
>> if (java_lang_Class::is_primitive(JNIHandles::resolve_non_null(ofClass)) ||
>>     !java_lang_Class::as_Klass(JNIHandles::resolve_non_null(ofClass))->is_instance_klass()) {
>>
>> which resolves cls twice. There are also duplicate calls to as_Klass 
>> that can be removed in a couple of cases.
>>
>> Testing: tiers 1 - 3
>>
>> Thanks,
>> David

From shade at redhat.com  Fri Jul 24 07:22:51 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 24 Jul 2020 09:22:51 +0200
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
Message-ID: <50b7c60f-97da-b205-a270-72e43cf9879f@redhat.com>

On 7/24/20 5:49 AM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/

This looks fine to me.

So we need to be careful that the naked oop is still valid at each use, which at least implies that
no allocations (and associated safepoints) happen anywhere in between? The patch looks safe in that
regard.

-- 
Thanks,
-Aleksey


From david.holmes at oracle.com  Fri Jul 24 08:23:07 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 24 Jul 2020 18:23:07 +1000
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <50b7c60f-97da-b205-a270-72e43cf9879f@redhat.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
 <50b7c60f-97da-b205-a270-72e43cf9879f@redhat.com>
Message-ID: <7332a240-d73b-3269-f460-0bbf5e0a79a4@oracle.com>

Hi Aleksey,

Thanks for taking a look.

On 24/07/2020 5:22 pm, Aleksey Shipilev wrote:
> On 7/24/20 5:49 AM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
>> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/
> 
> This looks fine to me.
> 
> So we need to be careful that the naked oop is still valid at each use, which at least implies that
> no allocations (and associated safepoints) happen anywhere in between? The patch looks safe in that
> regard.

Right - no safepoints (or handshakes?) allowed. But these cases are all 
okay. Places that can cross safepoints tend to wrap the oop straight 
into a Handle to maintain safe access.

Cheers,
David
-----
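
The optimization under review replaces two resolutions of the same handle with one local variable. Below is a minimal standalone sketch of that before/after shape. All types and functions in it (`Mirror`, `Klass`, `resolve_non_null`, `is_primitive`, `as_Klass`, and the call counter) are hypothetical stand-ins so the example compiles on its own; they are not the HotSpot definitions, and the sketch does not model safepoints, so the caveat above about keeping a naked oop only where no safepoint can occur still applies to real code.

```cpp
#include <cassert>

// Hypothetical stand-ins for the VM types; not the HotSpot definitions.
struct Klass {
  bool instance;
  bool is_instance_klass() const { return instance; }
};
struct Mirror {
  bool primitive;
  Klass* klass;
};
typedef Mirror*  oop_t;     // stands in for a naked oop
typedef Mirror** jclass_t;  // stands in for a JNI handle (pointer to a slot)

// Counts resolutions so the saving is observable.
static int resolve_count = 0;

static oop_t resolve_non_null(jclass_t handle) {
  ++resolve_count;
  assert(handle != nullptr && *handle != nullptr);
  return *handle;
}
static bool is_primitive(oop_t mirror) { return mirror->primitive; }
static Klass* as_Klass(oop_t mirror) { return mirror->klass; }

// Before: the handle is resolved once per use.
static bool not_instance_class_twice(jclass_t ofClass) {
  return is_primitive(resolve_non_null(ofClass)) ||
         !as_Klass(resolve_non_null(ofClass))->is_instance_klass();
}

// After: resolve once and reuse the result for both checks.
static bool not_instance_class_once(jclass_t ofClass) {
  oop_t mirror = resolve_non_null(ofClass);
  return is_primitive(mirror) || !as_Klass(mirror)->is_instance_klass();
}
```

Both functions return the same answer for the same input; only the number of `resolve_non_null` calls differs.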


From tobias.hartmann at oracle.com  Fri Jul 24 09:52:42 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 24 Jul 2020 11:52:42 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <1595520162373.22868@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>
 <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
 <1595520162373.22868@amazon.com>
Message-ID: <916b3a4a-5617-941d-6161-840f3ea900bd@oracle.com>

Hi Liu,

On 23.07.20 18:02, Liu, Xin wrote:
> That is my intention too, but CompilerOracle doesn't exit JVM when it encounters parsing errors. 
> It just extracts as much information from CompileCommand as possible. That makes sense because compiler "directives" are supposed to be optional for program execution. 
> 
> I do put the error message in parser's errorbuf.  I set a flag "exit_on_error" to quit JVM after it dumps parser errors. yes, I treat undefined intrinsics as fatal errors.  
> This behavior is from Nils' comment: "I want to see an error on startup if the user has specified unknown intrinsic names."  It is also consistent with JVM option -XX:ControlIntrinsic=. 

Okay, thanks for the explanation! I would prefer consistency in error handling of compiler
directives, i.e., handle all parser failures the same way. But I leave it to Nils to decide.

Best regards,
Tobias

From coleen.phillimore at oracle.com  Fri Jul 24 12:08:35 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 08:08:35 -0400
Subject: RFR (S) 8249938: Move mirror oops from Universe into OopStorage
In-Reply-To: <ddc2e5bc-c2c5-3c43-445a-06a5b21061db@oracle.com>
References: <368cf365-cfd2-5269-509c-b64b19509150@oracle.com>
 <58e90095-199d-2ed1-4139-53536ad2fb08@oracle.com>
 <2c10f73c-81be-8c00-5004-5fd4bdf9b985@oracle.com>
 <ef886ba2-724c-f357-1b2e-b667f12063c8@oracle.com>
 <ddc2e5bc-c2c5-3c43-445a-06a5b21061db@oracle.com>
Message-ID: <30bbee46-aa6d-d9a6-039f-8c99532c2e84@oracle.com>


Thanks David!
Coleen

On 7/24/20 1:30 AM, David Holmes wrote:
> Hi Coleen,
>
> This all seems fine to me.
>
> Thanks,
> David
>
> On 24/07/2020 11:43 am, coleen.phillimore at oracle.com wrote:
>>
>> For the record and for reviewer #2, this is the incremental webrev, 
>> with a bug fixed in ReadClosure::do_oop.
>>
>> incr webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8249938.02.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev
>>
>> thanks,
>> Coleen
>>
>>
>> On 7/23/20 5:49 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Thank you for reviewing, Ioi.
>>>
>>> On 7/23/20 4:41 PM, Ioi Lam wrote:
>>>>
>>>>
>>>> On 7/23/20 10:05 AM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Save and restore mirror oops to temporary array for CDS, 
>>>>> and move them to OopStorage once restored.
>>>>>
>>>>> This is a subtask of moving oops out of Universe.  I ran 
>>>>> performance tests on this and there is no performance change. 
>>>>> There was a slight decrease in the number of instructions (improvement!) 
>>>>> in Perfstartup-Noop that was flagged as significant: 0.10%
>>>>>
>>>>> Tested with mach5 tier1-3.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8249938.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8249938
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>> Hi Coleen,
>>>>
>>>> The changes look good. I think Universe::serialize() can be 
>>>> simplified as (not tested):
>>>>
>>>> void Universe::serialize(SerializeClosure* f) {
>>>>
>>>> #if INCLUDE_CDS_JAVA_HEAP
>>>>   {
>>>>     oop mirror_oop;
>>>>     for (int i = T_BOOLEAN; i < T_VOID+1; i++) {
>>>>       if (f->is_reading()) {   // f->reading() ...
>>>>         f->do_oop(&mirror_oop); // read from archive
>>>>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>>>>       } else {
>>>>         mirror_oop = _mirrors[i].resolve();
>>>>         f->do_oop(&mirror_oop); // write to archive
>>>>       }
>>>>       if (mirror_oop != NULL) { // may be null if archived heap is disabled
>>>>         java_lang_Class::update_archived_primitive_mirror_native_pointers(mirror_oop);
>>>>       }
>>>>     }
>>>>   }
>>>> #endif
>>>
>>> Yes, that works and is better.  I think GC hates when the location of 
>>> mirror_oop is the same, but serializing doesn't care.  I've rerun 
>>> tier1 tests on it and it all passes.
>>
>>
>>> Thanks!
>>> Coleen
>>>>
>>>> Thanks
>>>> - Ioi
>>>
>>


From harold.seigel at oracle.com  Fri Jul 24 12:20:32 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 24 Jul 2020 08:20:32 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <aeea97ae-8894-3a32-9a2d-a3a730db7f5d@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
 <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
 <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>
 <aeea97ae-8894-3a32-9a2d-a3a730db7f5d@oracle.com>
Message-ID: <fe2457f0-4b06-07c6-df39-928a35890c87@oracle.com>

Thanks David for taking a close look at this!

Harold

On 7/23/2020 7:25 PM, David Holmes wrote:
> On 24/07/2020 6:09 am, Harold Seigel wrote:
>> Hi David,
>>
>> The test achieves its goal of running under the interpreter and 
>> compiler by running with "vm.compMode=Xmixed".
>
> Okay .. and that precludes Xcomp as well.
>
>> The test fails unless both the interpreter and JIT compiled code 
>> generate AbstractMethodError exceptions. I ran the test in tiers 1-5 
>> without it failing.  So, at least for those test runs, the test 
>> achieved its goal of running under both the compiler and interpreter.
>>
>> Perhaps the purpose of "vm.opt.TieredStopAtLevel==4" is to specify 
>> the tiered behavior if TieredCompilation is specified?
>
> Not clear - especially with the explicit WB compilation. This is one 
> complex test! :)
>
> Sorry to belabour what should be a trivial fix. As long as it works 
> without Graal and now excludes Graal, it is good to go.
>
> Thanks,
> David
> -----
>
>> Thanks, Harold
>>
>> On 7/23/2020 9:51 AM, David Holmes wrote:
>>> Hi Harold,
>>>
>>> On 23/07/2020 11:22 pm, Harold Seigel wrote:
>>>> Hi David,
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> The existing @requires for test AbstractMethodErrorTest.java 
>>>> contained this clause:
>>>>
>>>>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
>>>>
>>>> This clause evaluated to TRUE if either Graal was disabled or 
>>>> vm.opt.TieredCompilation was true.
>>>
>>> Okay so this claimed the test was okay with Graal as long as tiered 
>>> was enabled but ...
>>>
>>>> Since now Graal is always disabled, this clause would always be TRUE,
>>>
>>> ... we decided no Graal under any conditions ... okay ...
>>>
>>>> regardless of the value of vm.opt.TieredCompilation.  There is no
>>>> requirement that tiered compilation be enabled for this test.
>>>
>>> ... but if tiered is not enabled then what is the significance of 
>>> "vm.opt.TieredStopAtLevel==4" ?
>>>
>>> Sorry but this is one of the most complex and obscure @requires 
>>> conditions that I've seen. And I don't see how it achieves the goal 
>>> of running under the interpreter and compiler (per the synopsis)?
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks, Harold
>>>>
>>>> On 7/22/2020 11:00 PM, David Holmes wrote:
>>>>> Hi Harold,
>>>>>
>>>>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please review this small fix to avoid running test 
>>>>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>>>>> ProblemList.
>>>>>>
>>>>>> Open Webrev: 
>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>>>>
>>>>> You seem to have lost the requirement that tiered compilation be 
>>>>> enabled. ??
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>>>>
>>>>>> The change was tested by using mach5 testing and checking that 
>>>>>> the test was not run in tier*-graal tasks but was run in 
>>>>>> non-graal tasks.
>>>>>>
>>>>>> Thanks, Harold
>>>>>>

From coleen.phillimore at oracle.com  Fri Jul 24 12:30:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 08:30:36 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <fe2457f0-4b06-07c6-df39-928a35890c87@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
 <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
 <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>
 <aeea97ae-8894-3a32-9a2d-a3a730db7f5d@oracle.com>
 <fe2457f0-4b06-07c6-df39-928a35890c87@oracle.com>
Message-ID: <84be282a-56ee-29b9-9206-38ba67b24773@oracle.com>


Looks good (although somewhat insane) to me.
Coleen

On 7/24/20 8:20 AM, Harold Seigel wrote:
> Thanks David for taking a close look at this!
>
> Harold
>
> On 7/23/2020 7:25 PM, David Holmes wrote:
>> On 24/07/2020 6:09 am, Harold Seigel wrote:
>>> Hi David,
>>>
>>> The test achieves its goal of running under the interpreter and 
>>> compiler by running with "vm.compMode=Xmixed".
>>
>> Okay .. and that precludes Xcomp as well.
>>
>>> The test fails unless both the interpreter and JIT compiled code 
>>> generate AbstractMethodError exceptions. I ran the test in tiers 1-5 
>>> without it failing. So, at least for those test runs, the test 
>>> achieved its goal of running under both the compiler and interpreter.
>>>
>>> Perhaps the purpose of "vm.opt.TieredStopAtLevel==4" is to specify 
>>> the tiered behavior if TieredCompilation is specified?
>>
>> Not clear - especially with the explicit WB compilation. This is one 
>> complex test! :)
>>
>> Sorry to belabour what should be a trivial fix. As long as it works 
>> without Graal and now excludes Graal, it is good to go.
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks, Harold
>>>
>>> On 7/23/2020 9:51 AM, David Holmes wrote:
>>>> Hi Harold,
>>>>
>>>> On 23/07/2020 11:22 pm, Harold Seigel wrote:
>>>>> Hi David,
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>> The existing @requires for test AbstractMethodErrorTest.java 
>>>>> contained this clause:
>>>>>
>>>>>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
>>>>>
>>>>> This clause evaluated to TRUE if either Graal was disabled or 
>>>>> vm.opt.TieredCompilation was true.
>>>>
>>>> Okay so this claimed the test was okay with Graal as long as tiered 
>>>> was enabled but ...
>>>>
>>>>> Since now Graal is always disabled, this clause would always be TRUE,
>>>>
>>>> ... we decided no Graal under any conditions ... okay ...
>>>>
>>>>> regardless of the value of vm.opt.TieredCompilation.  There is no
>>>>> requirement that tiered compilation be enabled for this test.
>>>>
>>>> ... but if tiered is not enabled then what is the significance of 
>>>> "vm.opt.TieredStopAtLevel==4" ?
>>>>
>>>> Sorry but this is one of the most complex and obscure @requires 
>>>> conditions that I've seen. And I don't see how it achieves the goal 
>>>> of running under the interpreter and compiler (per the synopsis)?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks, Harold
>>>>>
>>>>> On 7/22/2020 11:00 PM, David Holmes wrote:
>>>>>> Hi Harold,
>>>>>>
>>>>>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please review this small fix to avoid running test 
>>>>>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>>>>>> ProblemList.
>>>>>>>
>>>>>>> Open Webrev: 
>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>>>>>
>>>>>> You seem to have lost the requirement that tiered compilation be 
>>>>>> enabled. ??
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>>>>>
>>>>>>> The change was tested by using mach5 testing and checking that 
>>>>>>> the test was not run in tier*-graal tasks but was run in 
>>>>>>> non-graal tasks.
>>>>>>>
>>>>>>> Thanks, Harold
>>>>>>>


From harold.seigel at oracle.com  Fri Jul 24 12:32:13 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 24 Jul 2020 08:32:13 -0400
Subject: RFR(S) 8222582: [TESTBUG] AbstractMethodErrorTest.java fails with
 "did not test both cases (interpreted and compiled)."
In-Reply-To: <84be282a-56ee-29b9-9206-38ba67b24773@oracle.com>
References: <4e274634-ef93-ad89-23ac-9c8fe280c27a@oracle.com>
 <8408ac68-1e09-8cfa-9640-96e8a7adf930@oracle.com>
 <8b8651c7-8034-aaae-edf3-a8c7f0b5039f@oracle.com>
 <3be37a17-b80f-8515-a831-d6268aef7708@oracle.com>
 <2ea6bd32-8c90-0c74-7bfd-c04a00741463@oracle.com>
 <aeea97ae-8894-3a32-9a2d-a3a730db7f5d@oracle.com>
 <fe2457f0-4b06-07c6-df39-928a35890c87@oracle.com>
 <84be282a-56ee-29b9-9206-38ba67b24773@oracle.com>
Message-ID: <5ff5fe75-2950-cbc6-cf6e-74effb0d06d3@oracle.com>

Thanks Coleen!

Harold

On 7/24/2020 8:30 AM, coleen.phillimore at oracle.com wrote:
>
> Looks good (although somewhat insane) to me.
> Coleen
>
> On 7/24/20 8:20 AM, Harold Seigel wrote:
>> Thanks David for taking a close look at this!
>>
>> Harold
>>
>> On 7/23/2020 7:25 PM, David Holmes wrote:
>>> On 24/07/2020 6:09 am, Harold Seigel wrote:
>>>> Hi David,
>>>>
>>>> The test achieves its goal of running under the interpreter and 
>>>> compiler by running with "vm.compMode=Xmixed".
>>>
>>> Okay .. and that precludes Xcomp as well.
>>>
>>>> The test fails unless both the interpreter and JIT compiled code 
>>>> generate AbstractMethodError exceptions. I ran the test in tiers 
>>>> 1-5 without it failing. So, at least for those test runs, the test 
>>>> achieved its goal of running under both the compiler and interpreter.
>>>>
>>>> Perhaps the purpose of "vm.opt.TieredStopAtLevel==4" is to specify 
>>>> the tiered behavior if TieredCompilation is specified?
>>>
>>> Not clear - especially with the explicit WB compilation. This is one 
>>> complex test! :)
>>>
>>> Sorry to belabour what should be a trivial fix. As long as it works 
>>> without Graal and now excludes Graal, it is good to go.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Thanks, Harold
>>>>
>>>> On 7/23/2020 9:51 AM, David Holmes wrote:
>>>>> Hi Harold,
>>>>>
>>>>> On 23/07/2020 11:22 pm, Harold Seigel wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> The existing @requires for test AbstractMethodErrorTest.java 
>>>>>> contained this clause:
>>>>>>
>>>>>>     (!vm.graal.enabled | vm.opt.TieredCompilation == true)
>>>>>>
>>>>>> This clause evaluated to TRUE if either Graal was disabled or 
>>>>>> vm.opt.TieredCompilation was true.
>>>>>
>>>>> Okay so this claimed the test was okay with Graal as long as 
>>>>> tiered was enabled but ...
>>>>>
>>>>>> Since now Graal is always disabled, this clause would always be 
>>>>>> TRUE,
>>>>>
>>>>> ... we decided no Graal under any conditions ... okay ...
>>>>>
>>>>>> regardless of the value of vm.opt.TieredCompilation. There is no
>>>>>> requirement that tiered compilation be enabled for this test.
>>>>>
>>>>> ... but if tiered is not enabled then what is the significance of 
>>>>> "vm.opt.TieredStopAtLevel==4" ?
>>>>>
>>>>> Sorry but this is one of the most complex and obscure @requires 
>>>>> conditions that I've seen. And I don't see how it achieves the 
>>>>> goal of running under the interpreter and compiler (per the 
>>>>> synopsis)?
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Thanks, Harold
>>>>>>
>>>>>> On 7/22/2020 11:00 PM, David Holmes wrote:
>>>>>>> Hi Harold,
>>>>>>>
>>>>>>> On 23/07/2020 8:05 am, Harold Seigel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review this small fix to avoid running test 
>>>>>>>> AbstractMethodErrorTest.java with Graal and remove it from the 
>>>>>>>> ProblemList.
>>>>>>>>
>>>>>>>> Open Webrev: 
>>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8222582/webrev/index.html
>>>>>>>
>>>>>>> You seem to have lost the requirement that tiered compilation be 
>>>>>>> enabled. ??
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8222582
>>>>>>>>
>>>>>>>> The change was tested by using mach5 testing and checking that 
>>>>>>>> the test was not run in tier*-graal tasks but was run in 
>>>>>>>> non-graal tasks.
>>>>>>>>
>>>>>>>> Thanks, Harold
>>>>>>>>
>

From coleen.phillimore at oracle.com  Fri Jul 24 14:37:47 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 10:37:47 -0400
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe into
 OopStorage
Message-ID: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>

open webrev at http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8250516

The backout was clean.
Thanks,
Coleen

From harold.seigel at oracle.com  Fri Jul 24 14:48:29 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 24 Jul 2020 10:48:29 -0400
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
Message-ID: <2738c827-96dc-eebd-d4de-d28e150372ca@oracle.com>

Hi Coleen,

The backout looks good.

Thanks, Harold

On 7/24/2020 10:37 AM, coleen.phillimore at oracle.com wrote:
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
>
> The backout was clean.
> Thanks,
> Coleen

From david.holmes at oracle.com  Fri Jul 24 14:49:03 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 25 Jul 2020 00:49:03 +1000
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
Message-ID: <46f6a5dc-2035-ab00-e0fb-487ab5fcf15f@oracle.com>

Looks good.

Thanks,
David

On 25/07/2020 12:37 am, coleen.phillimore at oracle.com wrote:
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
> 
> The backout was clean.
> Thanks,
> Coleen

From thomas.schatzl at oracle.com  Fri Jul 24 14:49:12 2020
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 24 Jul 2020 16:49:12 +0200
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
Message-ID: <cabfbb5a-8f92-ad24-7139-21ba4cf39474@oracle.com>

Hi,

On 24.07.20 16:37, coleen.phillimore at oracle.com wrote:
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
> 
> The backout was clean.
> Thanks,
> Coleen

   looks like a valid backout of 8249938: Move mirror oops from Universe 
into OopStorage. Ship it (before a submit run)

Thanks,
   Thomas

From daniel.daugherty at oracle.com  Fri Jul 24 14:51:16 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 24 Jul 2020 10:51:16 -0400
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
Message-ID: <91098654-c007-3406-d836-c311a6ffa204@oracle.com>

Backout looks good. Mechanically verified the backout against
the originals.

Dan

On 7/24/20 10:37 AM, coleen.phillimore at oracle.com wrote:
> open webrev at http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
>
> The backout was clean.
> Thanks,
> Coleen


From andrei.pangin at gmail.com  Fri Jul 24 14:53:03 2020
From: andrei.pangin at gmail.com (Andrei Pangin)
Date: Fri, 24 Jul 2020 17:53:03 +0300
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
Message-ID: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>

Hi,

Please review a small fix to a not-so-small performance issue that we've
seen when migrating a production application from JDK 8 to JDK 14.

On certain workloads, where Nashorn produces thousands of MethodHandles,
ResolvedMethodTable operations become extremely slow due to a degenerate
hashcode. This patch basically fixes the hashcode by including the method
holder's name in the computation. More details are in the bug report.

CR: https://bugs.openjdk.java.net/browse/JDK-8249719
Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/

Tested: tier1-2, hotspot*runtime

I'll be glad if someone could sponsor the patch.

Thank you,
Andrei Pangin
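
To illustrate why leaving the holder out of the hash can degenerate,
here is a minimal, self-contained C++ sketch. It is not the code from
the webrev: MethodKey, hash_without_holder, hash_with_holder and the
generated "Script$N" holder names are hypothetical, and the idea that
the old hash combined only the method name and signature is an
assumption inferred from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a resolved method: holder class, name, signature.
struct MethodKey {
  std::string holder;
  std::string name;
  std::string signature;
};

// Assumed shape of the weak hash: the holder is ignored, so methods that
// share a name and signature across many classes all collide.
inline std::size_t hash_without_holder(const MethodKey& m) {
  std::hash<std::string> h;
  return h(m.name) ^ h(m.signature);
}

// Sketch of the fix: mix the holder's name into the computation.
inline std::size_t hash_with_holder(const MethodKey& m) {
  std::hash<std::string> h;
  return h(m.name) ^ h(m.signature) ^ h(m.holder);
}

// Builds n methods with identical name/signature in distinct generated
// holder classes, roughly what a script engine might produce.
inline std::vector<MethodKey> make_generated_methods(int n) {
  assert(n >= 0);
  std::vector<MethodKey> methods;
  for (int i = 0; i < n; ++i) {
    methods.push_back({"Script$" + std::to_string(i), "invoke",
                       "(Ljava/lang/Object;)Ljava/lang/Object;"});
  }
  return methods;
}

// Counts how many distinct hash values a hash function produces.
template <typename HashFn>
std::size_t distinct_hashes(const std::vector<MethodKey>& methods, HashFn fn) {
  std::unordered_set<std::size_t> seen;
  for (const auto& m : methods) {
    seen.insert(fn(m));
  }
  return seen.size();
}
```

Under these assumptions, distinct_hashes(make_generated_methods(1000),
hash_without_holder) yields a single hash value, so every entry would
land in one bucket, while the holder-aware variant spreads them out.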

From coleen.phillimore at oracle.com  Fri Jul 24 14:55:03 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 10:55:03 -0400
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <91098654-c007-3406-d836-c311a6ffa204@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
 <91098654-c007-3406-d836-c311a6ffa204@oracle.com>
Message-ID: <a7d3202d-35f7-d3d1-3aa8-b7b15b01cc1b@oracle.com>


Thanks for the reviews and sorry for the breakage. Thanks to Harold for
the alert. I only put David and Thomas as reviewers because they
reviewed it on Slack before this email came through.
thanks,
Coleen

On 7/24/20 10:51 AM, Daniel D. Daugherty wrote:
> Backout looks good. Mechanically verified the backout against
> the originals.
>
> Dan
>
> On 7/24/20 10:37 AM, coleen.phillimore at oracle.com wrote:
>> open webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
>>
>> The backout was clean.
>> Thanks,
>> Coleen
>


From daniel.daugherty at oracle.com  Fri Jul 24 14:57:23 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 24 Jul 2020 10:57:23 -0400
Subject: RFR (urgent) 8250516: [BACKOUT] Move mirror oops from Universe
 into OopStorage
In-Reply-To: <a7d3202d-35f7-d3d1-3aa8-b7b15b01cc1b@oracle.com>
References: <ef1bca51-6437-d05d-94f2-62e31edf0691@oracle.com>
 <91098654-c007-3406-d836-c311a6ffa204@oracle.com>
 <a7d3202d-35f7-d3d1-3aa8-b7b15b01cc1b@oracle.com>
Message-ID: <6c758b04-3866-4d79-324c-056bde7d2bae@oracle.com>

On 7/24/20 10:55 AM, coleen.phillimore at oracle.com wrote:
>
> Thanks for the reviews and sorry for the breakage. Thanks to Harold
> for the alert. I only put David and Thomas as reviewers because they
> reviewed it on Slack before this email came through.

No worries. You said "urgent" and three reviewers responded.
That's outstanding!!

Dan


> thanks,
> Coleen
>
> On 7/24/20 10:51 AM, Daniel D. Daugherty wrote:
>> Backout looks good. Mechanically verified the backout against
>> the originals.
>>
>> Dan
>>
>> On 7/24/20 10:37 AM, coleen.phillimore at oracle.com wrote:
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250516.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250516
>>>
>>> The backout was clean.
>>> Thanks,
>>> Coleen
>>
>


From yumin.qi at oracle.com  Fri Jul 24 15:37:38 2020
From: yumin.qi at oracle.com (Yumin Qi)
Date: Fri, 24 Jul 2020 08:37:38 -0700
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <2712b135-e8c4-1b5b-b59f-c7b0be201516@oracle.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
 <7974f5e3-9cfe-cf33-cf4f-a0efed33ac70@oracle.com>
 <2712b135-e8c4-1b5b-b59f-c7b0be201516@oracle.com>
Message-ID: <0be6ee77-be76-1960-b5d2-d2f66b23f76d@oracle.com>

Hi, David

On 7/23/20 11:44 PM, David Holmes wrote:
>
> I must stress I am sponsoring Christoph's change and only extended it 
> within the current file. :) I'm sure there are many, many more 
> opportunities for similar optimisations. Though you have to be careful 
> to ensure you don't expose an unhandled oop.
>
>> 1) 
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compileBroker.cpp#L834 
>>
>>     Where thread_handle is resolved multiple times.
>
> True, that is unnecessary, but this is compiler code and I'd need to
> extend the review so ... I'll pass on this one.
>
That is OK.
>> 2) 
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/jni.cpp#L1174 
>>
>>     Where clazz is resolved twice.
>
> Fixed this as it is the same pattern in a core runtime file. Webrev 
> updated in place.
>
Looks good!


Thanks

Yumin

> Thanks,
> David
>
>
>
>>     Do you want to include those two files in your list?
>>
>> Thanks
>>
>> Yumin
>>
>> On 7/23/20 8:49 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
>>> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/
>>>
>>> Please review this simple optimization contributed by Christoph 
>>> Dreis in its initial form and then expanded by me to cover other 
>>> cases in jvm.cpp.
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040025.html 
>>>
>>>
>>> There is a common pattern of code of the form:
>>>
>>> if (java_lang_Class::is_primitive(JNIHandles::resolve_non_null(ofClass)) ||
>>>     !java_lang_Class::as_Klass(JNIHandles::resolve_non_null(ofClass))->is_instance_klass()) {
>>>
>>> which resolves cls twice. There are also duplicate calls to as_Klass 
>>> that can be removed in a couple of cases.
>>>
>>> Testing: tiers 1 - 3
>>>
>>> Thanks,
>>> David
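
The "resolve once, reuse the result" pattern discussed in this thread
can be sketched in isolation as follows. This is only an illustration
with hypothetical stand-ins (Klass, Mirror, jclass_like, the free
functions and the g_resolve_calls counter are invented here); it is not
the jvm.cpp or jni.cpp code from the webrev.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are the real HotSpot types.
struct Klass {
  bool instance;
  bool is_instance_klass() const { return instance; }
};
struct Mirror {
  bool primitive;
  Klass* klass;
};
using jclass_like = Mirror*;

static int g_resolve_calls = 0;  // counts how often a handle is resolved

// Stand-in for resolving a handle to the object it refers to.
inline Mirror* resolve_non_null(jclass_like handle) {
  assert(handle != nullptr);
  ++g_resolve_calls;
  return handle;
}

inline bool is_primitive(const Mirror* m) { return m->primitive; }
inline Klass* as_Klass(const Mirror* m) { return m->klass; }

// Before: the handle is resolved separately in each sub-expression.
inline bool not_instance_class_before(jclass_like ofClass) {
  return is_primitive(resolve_non_null(ofClass)) ||
         !as_Klass(resolve_non_null(ofClass))->is_instance_klass();
}

// After: resolve once and reuse the resulting pointer.
inline bool not_instance_class_after(jclass_like ofClass) {
  Mirror* mirror = resolve_non_null(ofClass);
  return is_primitive(mirror) || !as_Klass(mirror)->is_instance_klass();
}
```

For a non-primitive class the first variant performs two resolutions
and the second performs one, with the same result. In the real VM the
resolved value is a raw oop, so the caveat raised in this thread about
not exposing an unhandled oop still applies to any such refactoring.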

From luhenry at microsoft.com  Fri Jul 24 15:39:31 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 24 Jul 2020 15:39:31 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
 <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi,

A quick follow-up on that change. Is there anything else you'd like to see changed to get it merged?

Following an offline discussion with David, I'm working with relevant Microsoft teams to figure out where relevant documentation is. I'll keep you posted on that.

Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/8248657.patch

Thank you,

--
Ludovic

-----Original Message-----
From: Ludovic Henry <luhenry at microsoft.com> 
Sent: Wednesday, July 15, 2020 10:00 AM
To: David Holmes <david.holmes at oracle.com>; Andrew Haley <aph at redhat.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
Cc: Kim Barrett <kim.barrett at oracle.com>; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

Hi David,

>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
>
> That is good to know. But this is something that Microsoft should be
> documenting explicitly - even if just a blanket statement that all
> syscalls (which are what exactly?) provide an implicit memory barrier
> (of what type exactly?).

I don't think we can assume SetEvent has a barrier merely because it is a syscall (even though syscalls do guarantee a barrier); it's more that SetEvent is the equivalent of sem_post. And if you cannot assume that sem_post or SetEvent guarantees a memory barrier (full, or at least store_release), then you could not trust any standard locking mechanism (what's the point of synchronizing if the CPU can move loads and stores outside of the critical section?).

>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.
>
> jcstress tests will execute the native runtime code of course, but they
> won't be "stressing" it as such.

Makes sense, thanks for the clarification.

--
Ludovic

I agree with you on the value of more explicit documentation, and I'll go look for that. If it doesn't exist, I'll put in a request to have it documented somewhere on docs.microsoft.com. In the meantime, it is safe to assume that SetEvent contains a memory barrier that has at least store_release semantics. Similarly, WaitForSingleObject and WaitForMultipleObjects have at least a load_acquire memory barrier, and are also syscalls (actually guaranteeing a full memory barrier).
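
As a portable analogue of the ordering requirement discussed here (a
plain store that must become visible no later than the signal that
follows it), the sketch below uses std::atomic with release/acquire
ordering. It deliberately does not call the Windows API and says nothing
about what SetEvent actually guarantees; Mailbox, publish, consume and
run_handoff are hypothetical names chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical: payload is ordinary data (like lock_owner), and ready
// plays the role of the event being signaled.
struct Mailbox {
  int payload = 0;
  std::atomic<bool> ready{false};
};

// Writer: the release store keeps the payload write from being
// reordered after the signal, by either the compiler or the hardware.
inline void publish(Mailbox& box, int value) {
  box.payload = value;
  box.ready.store(true, std::memory_order_release);
}

// Reader: the acquire load pairs with the release store, so once
// ready is observed as true the payload write is visible as well.
inline int consume(Mailbox& box) {
  while (!box.ready.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return box.payload;
}

// Runs one writer/reader handoff and returns what the reader saw.
inline int run_handoff(int value) {
  assert(value != -1);  // -1 is reserved as the "nothing seen" marker
  Mailbox box;
  int seen = -1;
  std::thread consumer([&] { seen = consume(box); });
  publish(box, value);
  consumer.join();
  return seen;
}
```

run_handoff(42) returns 42: the acquire load that observes ready as
true also makes the earlier plain store to payload visible. Without the
release/acquire pairing, a weakly ordered CPU would be free to make the
flag visible before the data, which is the failure mode the thread is
worried about for the store to lock_owner.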

________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Monday, July 13, 2020 19:25
To: Ludovic Henry; Andrew Haley; Thomas Stüfe
Cc: Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

Hi Ludovic,

On 14/07/2020 11:28 am, Ludovic Henry wrote:
> Hello,
>
>> But if we are dealing with non-TSO races then it would be good to get
>> some guidance from Microsoft as to the memory ordering properties of
>> various API's to ensure that we are maintaining correct ordering. For
>> example, in the destructor we have:
>>
>> 81     lock_owner = 0;
>> 82     // No lost wakeups, lock_event stays signaled until reset.
>> 83     DWORD ret = SetEvent(lock_event);
>>
>> but unless we are guaranteed that the store to lock_owner cannot be
>> reordered by the compiler or the hardware, to appear to be after the
>> SetEvent, then the logic is broken. Generally, because Windows only
>> supported TSO systems, we have assumed that the compiler will not
>> reorder code across these kind of API calls. But now we also need
>> hardware guarantees.
>
> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.

That is good to know. But this is something that Microsoft should be
documenting explicitly - even if just a blanket statement that all
syscalls (which are what exactly?) provide an implicit memory barrier
(of what type exactly?).

> As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.

The /volatile:ms is the kind of build control I was wondering about.
Thanks for the pointer.

> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.

jcstress tests will execute the native runtime code of course, but they
won't be "stressing" it as such.

Cheers,
David
-----

> I hope this helps to answer your questions.
>
> [1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior
>
> --
> Ludovic
> ________________________________________
> From: Andrew Haley <aph at redhat.com>
> Sent: Monday, July 13, 2020 01:36
> To: David Holmes; Thomas Stüfe
> Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>
> On 13/07/2020 06:48, David Holmes wrote:
>> Hi Thomas,
>>
>> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>>
>>> Can a compiler reorder system calls and stores? How would it determine
>>> if this is safe to do?
>
> I very much doubt it.
>
>> A compiler can reorder anything it likes if it can determine it is safe
>> to do so. :)
>
> I'm fairly sure the compiler doesn't care about that!
>
>>> I'd be surprised if Microsoft loosened up reordering since this would
>>> mean existing software cannot just be recompiled for arm and expected to
>>> work. But this is just a guess of course.
>>
>> It's an interesting point because I would expect there to be a lot of
>> software written for Windows that contains assumptions of TSO that would
>> in fact fail when run on Aarch64. I don't know if there are any special
>> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
>> (or build flags), that would allow for ease of migration.
>
> There's no standard hardware mechanism that would do so.
>
> I've been very surprised at how little software has broken on AArch64
> because of memory ordering. Like you, I initially assumed that stuff
> would break all over the place, but by and large it was OK. I know of
> two reasons: firstly, programmers are pretty conservative and tend to
> use simple and reliable mechanisms such as safe publication and
> mutexes for inter-thread communication. But also, and maybe more
> importantly, the kinds of reordering the hardware can do are not very
> different from those compilers do. Therefore, anyone playing fast and
> loose with TSO has probably already been bitten by the compiler.
>
>> But unless all Windows software will run in such a mode there is a
>> need for MS to document what the memory consistency properties of
>> various APIs are (as POSIX does [1]).
>
> Indeed. I would have thought it existed somewhere.
>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com/>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>

From volker.simonis at gmail.com  Fri Jul 24 18:50:48 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Fri, 24 Jul 2020 20:50:48 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
Message-ID: <CA+3eh13gxZydCXLeP2z=hs6J0WD90ORvhD1Nou5KD_=ja6Hn7Q@mail.gmail.com>

Hi Andrei,

nice finding :)

I think your fix looks good and I'm happy to sponsor your change once
we get a second review.

I only have a question about your test. How long does it usually run
before and after your fix? Just trying to understand if it is stable
enough in cases where a test machine might be overloaded or if we
should rather make it a manual test.

Thank you and best regards,
Volker

On Fri, Jul 24, 2020 at 4:54 PM Andrei Pangin <andrei.pangin at gmail.com> wrote:
>
> Hi,
>
> Please review a small fix to a not-so-small performance issue that we've
> seen when migrating a production application from JDK 8 to JDK 14.
>
> On certain workloads, where Nashorn produces thousands of MethodHandles,
> ResolvedMethodTable operations become extremely slow due to a degenerate
> hashcode. This patch basically fixes the hashcode by including the method
> holder's name in the computation. More details are in the bug report.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>
> Tested: tier1-2, hotspot*runtime
>
> I'll be glad if someone could sponsor the patch.
>
> Thank you,
> Andrei Pangin

From coleen.phillimore at oracle.com  Fri Jul 24 21:07:40 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 17:07:40 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into OopStorage
Message-ID: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>

Summary: The original patch but add a null pointer check so an OopHandle 
is not created if a NULL is read from the archive.

If a NULL is read from the archive, we shouldn't create an OopHandle for 
it because one will be created in initialize_basic_type_mirrors. The
assert added in the previous patch I pushed detected that.

incremental change to original change: 
http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev

Retested with tier1-3.

Thanks,
Coleen
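
The shape of the null check described in this summary can be sketched
as below. This is a simplified illustration, not the code in the
webrev: oop_like, HandleLike, StorageLike, restore_mirror and
initialize_mirror are hypothetical stand-ins for the oop, OopHandle,
OopStorage and Universe code involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-ins; the real oop, OopHandle and OopStorage differ.
using oop_like = const char*;

struct HandleLike {
  oop_like* slot = nullptr;
  bool is_empty() const { return slot == nullptr; }
};

struct StorageLike {
  std::deque<oop_like> slots;  // deque keeps element addresses stable
  HandleLike allocate(oop_like obj) {
    slots.push_back(obj);
    HandleLike h;
    h.slot = &slots.back();
    return h;
  }
};

// Restoring from the archive: only create a handle when a non-null
// value was read, so a later initializer does not create a second one.
inline void restore_mirror(StorageLike& storage, HandleLike& mirror,
                           oop_like archived) {
  if (archived != nullptr) {
    assert(mirror.is_empty());
    mirror = storage.allocate(archived);
  }
}

// Later initialization creates handles for mirrors that are still empty.
inline void initialize_mirror(StorageLike& storage, HandleLike& mirror,
                              oop_like fresh) {
  if (mirror.is_empty()) {
    mirror = storage.allocate(fresh);
  }
}
```

With a null archived value, restore_mirror leaves the handle empty and
initialize_mirror then allocates exactly one slot for it, instead of
one slot holding null plus a second slot created later.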


From daniel.daugherty at oracle.com  Fri Jul 24 21:37:25 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 24 Jul 2020 17:37:25 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
Message-ID: <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>

On 7/24/20 5:07 PM, coleen.phillimore at oracle.com wrote:
> Summary: The original patch but add a null pointer check so an 
> OopHandle is not created if a NULL is read from the archive.
>
> If a NULL is read from the archive, we shouldn't create an OopHandle 
> for it because one will be created in initialize_basic_type_mirrors.
> The assert added in the previous patch I pushed detected that.
>
> incremental change to original change: 
> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>
> Retested with tier1-3.
>
> Thanks,
> Coleen

Okay. So I compared this patch:

http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev/open.patch

with this patch:

http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev/open.patch

And in addition to the change described above, there are two changes
in the original patch that are not in the new patch:

--- old/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:56.177635044 -0400
+++ new/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:55.717623090 -0400
@@ -52,6 +52,8 @@

   inline void release(OopStorage* storage);

+  inline void replace(oop obj);
+
   // Used only for removing handle.
   oop* ptr_raw() const { return _obj; }
 };
--- old/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.721649181 -0400
+++ new/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.257637123 -0400
@@ -54,4 +54,10 @@
   }
 }

+inline void OopHandle::replace(oop obj) {
+  oop* ptr = ptr_raw();
+  assert(ptr != NULL, "should not use replace");
+  NativeAccess<>::oop_store(ptr, obj);
+}
+
 #endif // SHARE_OOPS_OOPHANDLE_INLINE_HPP

So it looks like you are removing this OopHandle::replace() function
also. Is that intentional?

Dan

From coleen.phillimore at oracle.com  Fri Jul 24 21:43:58 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 17:43:58 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
Message-ID: <4c9eaa9e-7709-c25c-0a9d-5e9fa0990a8e@oracle.com>


Dan, Thank you for noticing this. Yes, when I checked in the OopStorage
mirror patch, I removed the "replace" function because it was the same 
function in the patch where I added the assert for this bug (what I 
called the "assert" patch):

https://bugs.openjdk.java.net/browse/JDK-8249822

https://hg.openjdk.java.net/jdk/jdk/rev/a36b9f6adbf2

So I actually had already checked this function in.

The only difference in this 8250519 should be the NULL check.

Thanks,
Coleen


On 7/24/20 5:37 PM, Daniel D. Daugherty wrote:
> On 7/24/20 5:07 PM, coleen.phillimore at oracle.com wrote:
>> Summary: The original patch but add a null pointer check so an 
>> OopHandle is not created if a NULL is read from the archive.
>>
>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>> for it because one will be created in initialize_basic_type_mirrors.
>> The assert added in the previous patch I pushed detected that.
>>
>> incremental change to original change: 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>
>> Retested with tier1-3.
>>
>> Thanks,
>> Coleen
>
> Okay. So I compared this patch:
>
> http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev/open.patch
>
> with this patch:
>
> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev/open.patch
>
> And in addition to the change described above, there are two changes
> in the original patch that are not in the new patch:
>
> --- old/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:56.177635044 -0400
> +++ new/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:55.717623090 -0400
> @@ -52,6 +52,8 @@
>
>    inline void release(OopStorage* storage);
>
> +  inline void replace(oop obj);
> +
>    // Used only for removing handle.
>    oop* ptr_raw() const { return _obj; }
>  };
> --- old/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.721649181 -0400
> +++ new/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.257637123 -0400
> @@ -54,4 +54,10 @@
>    }
>  }
>
> +inline void OopHandle::replace(oop obj) {
> +  oop* ptr = ptr_raw();
> +  assert(ptr != NULL, "should not use replace");
> +  NativeAccess<>::oop_store(ptr, obj);
> +}
> +
>  #endif // SHARE_OOPS_OOPHANDLE_INLINE_HPP
>
> So it looks like you are removing this OopHandle::replace() function
> also. Is that intentional?
>
> Dan


From daniel.daugherty at oracle.com  Fri Jul 24 21:44:18 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 24 Jul 2020 17:44:18 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
Message-ID: <2e4adfe1-7041-9025-4541-9d2bf3787d33@oracle.com>

On 7/24/20 5:37 PM, Daniel D. Daugherty wrote:
> On 7/24/20 5:07 PM, coleen.phillimore at oracle.com wrote:
>> Summary: The original patch but add a null pointer check so an 
>> OopHandle is not created if a NULL is read from the archive.
>>
>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>> for it because one will be created in initialize_basic_type_mirrors.
>> The assert added in the previous patch I pushed detected that.
>>
>> incremental change to original change: 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>
>> Retested with tier1-3.
>>
>> Thanks,
>> Coleen
>
> Okay. So I compared this patch:
>
> http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev/open.patch
>
> with this patch:
>
> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev/open.patch

Okay. I looked at the bits that were pushed using 8249938 (changeset
e9c7deca9a98) and those bits don't have the code below either.

The patch above compared with the bits that were pushed using 8249938
(changeset e9c7deca9a98) only shows the change described above so we're
good here.

Thumbs up!

Dan

>
> And in addition to the change described above, there are two changes
> in the original patch that are not in the new patch:
>
> --- old/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:56.177635044 -0400
> +++ new/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:55.717623090 -0400
> @@ -52,6 +52,8 @@
>
>    inline void release(OopStorage* storage);
>
> +  inline void replace(oop obj);
> +
>    // Used only for removing handle.
>    oop* ptr_raw() const { return _obj; }
>  };
> --- old/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.721649181 -0400
> +++ new/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.257637123 -0400
> @@ -54,4 +54,10 @@
>    }
>  }
>
> +inline void OopHandle::replace(oop obj) {
> +  oop* ptr = ptr_raw();
> +  assert(ptr != NULL, "should not use replace");
> +  NativeAccess<>::oop_store(ptr, obj);
> +}
> +
>  #endif // SHARE_OOPS_OOPHANDLE_INLINE_HPP
>
> So it looks like you are removing this OopHandle::replace() function
> also. Is that intentional?
>
> Dan


From daniel.daugherty at oracle.com  Fri Jul 24 21:45:32 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 24 Jul 2020 17:45:32 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <2e4adfe1-7041-9025-4541-9d2bf3787d33@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
 <2e4adfe1-7041-9025-4541-9d2bf3787d33@oracle.com>
Message-ID: <c07b3db6-1805-3078-8cc8-961a6063081b@oracle.com>

Crossed in the ether!!

And I forgot to say that I concur that this is a trivial change relative
to the original 8249938 patch (changeset e9c7deca9a98).

Dan


On 7/24/20 5:44 PM, Daniel D. Daugherty wrote:
> On 7/24/20 5:37 PM, Daniel D. Daugherty wrote:
>> On 7/24/20 5:07 PM, coleen.phillimore at oracle.com wrote:
>>> Summary: The original patch but add a null pointer check so an 
>>> OopHandle is not created if a NULL is read from the archive.
>>>
>>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>>> for it because one will be created in 
>>> initialize_basic_type_mirrors. The assert added in the previous 
>>> patch I pushed detected that.
>>>
>>> incremental change to original change: 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>>> full webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>>
>>> Retested with tier1-3.
>>>
>>> Thanks,
>>> Coleen
>>
>> Okay. So I compared this patch:
>>
>> http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev/open.patch
>>
>> with this patch:
>>
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev/open.patch
>
> Okay. I looked at the bits that were pushed using 8249938 (changeset 
> e9c7deca9a98)
> and those bits don't have the code below either.
>
> The patch above compared with the bits that were pushed using 8249938 
> (changeset
> e9c7deca9a98) only shows the change described above so we're good here.
>
> Thumbs up!
>
> Dan
>
>>
>> And in addition to the change described above, there are two changes
>> in the original patch that are not in the new patch:
>>
>> --- old/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:56.177635044 -0400
>> +++ new/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:55.717623090 -0400
>> @@ -52,6 +52,8 @@
>>
>>    inline void release(OopStorage* storage);
>>
>> +  inline void replace(oop obj);
>> +
>>    // Used only for removing handle.
>>    oop* ptr_raw() const { return _obj; }
>>  };
>> --- old/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.721649181 -0400
>> +++ new/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.257637123 -0400
>> @@ -54,4 +54,10 @@
>>    }
>>  }
>>
>> +inline void OopHandle::replace(oop obj) {
>> +  oop* ptr = ptr_raw();
>> +  assert(ptr != NULL, "should not use replace");
>> +  NativeAccess<>::oop_store(ptr, obj);
>> +}
>> +
>>  #endif // SHARE_OOPS_OOPHANDLE_INLINE_HPP
>>
>> So it looks like you are removing this OopHandle::replace() function
>> also. Is that intentional?
>>
>> Dan
>


From coleen.phillimore at oracle.com  Fri Jul 24 21:46:41 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 17:46:41 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <2e4adfe1-7041-9025-4541-9d2bf3787d33@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <716df75c-ef30-4bcc-9a54-edfbbd053361@oracle.com>
 <2e4adfe1-7041-9025-4541-9d2bf3787d33@oracle.com>
Message-ID: <8e350a84-d9dd-8a55-6eed-3a60b42d7576@oracle.com>


Dan, Thank you for your thoroughness!
Coleen

On 7/24/20 5:44 PM, Daniel D. Daugherty wrote:
> On 7/24/20 5:37 PM, Daniel D. Daugherty wrote:
>> On 7/24/20 5:07 PM, coleen.phillimore at oracle.com wrote:
>>> Summary: The original patch but add a null pointer check so an 
>>> OopHandle is not created if a NULL is read from the archive.
>>>
>>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>>> for it because one will be created in 
>>> initialize_basic_type_mirrors. The assert added in the previous 
>>> patch I pushed detected that.
>>>
>>> incremental change to original change: 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>>> full webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>>
>>> Retested with tier1-3.
>>>
>>> Thanks,
>>> Coleen
>>
>> Okay. So I compared this patch:
>>
>> http://cr.openjdk.java.net/~coleenp/2020/8249938.02/webrev/open.patch
>>
>> with this patch:
>>
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev/open.patch
>
> Okay. I looked at the bits that were pushed using 8249938 (changeset 
> e9c7deca9a98)
> and those bits don't have the code below either.
>
> The patch above compared with the bits that were pushed using 8249938 
> (changeset
> e9c7deca9a98) only shows the change described above so we're good here.
>
> Thumbs up!
>
> Dan
>
>>
>> And in addition to the change described above, there are two changes
>> in the original patch that are not in the new patch:
>>
>> --- old/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:56.177635044 -0400
>> +++ new/src/hotspot/share/oops/oopHandle.hpp    2020-07-23 21:40:55.717623090 -0400
>> @@ -52,6 +52,8 @@
>>
>>    inline void release(OopStorage* storage);
>>
>> +  inline void replace(oop obj);
>> +
>>    // Used only for removing handle.
>>    oop* ptr_raw() const { return _obj; }
>>  };
>> --- old/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.721649181 -0400
>> +++ new/src/hotspot/share/oops/oopHandle.inline.hpp    2020-07-23 21:40:56.257637123 -0400
>> @@ -54,4 +54,10 @@
>>    }
>>  }
>>
>> +inline void OopHandle::replace(oop obj) {
>> +  oop* ptr = ptr_raw();
>> +  assert(ptr != NULL, "should not use replace");
>> +  NativeAccess<>::oop_store(ptr, obj);
>> +}
>> +
>>  #endif // SHARE_OOPS_OOPHANDLE_INLINE_HPP
>>
>> So it looks like you are removing this OopHandle::replace() function
>> also. Is that intentional?
>>
>> Dan
>


From andrei.pangin at gmail.com  Fri Jul 24 22:23:33 2020
From: andrei.pangin at gmail.com (Andrei Pangin)
Date: Sat, 25 Jul 2020 01:23:33 +0300
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CA+3eh13gxZydCXLeP2z=hs6J0WD90ORvhD1Nou5KD_=ja6Hn7Q@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CA+3eh13gxZydCXLeP2z=hs6J0WD90ORvhD1Nou5KD_=ja6Hn7Q@mail.gmail.com>
Message-ID: <CAAnUXC=z6ieJDQDXB=H_HjgZruNOr-ssLer7rvtKxbJ2CM2uaQ@mail.gmail.com>

Hi Volker,

Many thanks for the review and for your willingness to sponsor the change.

The test runs 4-5 seconds on my laptop with the fix, and about 35 minutes
without. Even on ARM32 (Raspberry Pi) it takes 22 seconds, far less than
the timeout. Seems acceptable to me. If not - just let me know, and I'll
change the test to manual.

Andrei

On Fri, 24 Jul 2020 at 21:51, Volker Simonis <volker.simonis at gmail.com> wrote:

> Hi Andrei,
>
> nice finding :)
>
> I think your fix looks good and I'm happy to sponsor your change once
> we get a second review.
>
> I only have a question about your test. How long does it usually run
> before and after your fix? Just trying to understand if it is stable
> enough in cases where a test machine might be overloaded or if we
> should rather make it a manual test.
>
> Thank you and best regards,
> Volker
>
> On Fri, Jul 24, 2020 at 4:54 PM Andrei Pangin <andrei.pangin at gmail.com>
> wrote:
> >
> > Hi,
> >
> > Please review a small fix to a not-so-small performance issue that we've
> > seen when migrating a production application from JDK 8 to JDK 14.
> >
> > On certain workloads, where Nashorn produces thousands of MethodHandles,
> > ResolvedMethodTable operations become extremely slow due to degenerate
> > hashcode. This patch basically fixes hashcode by including the method
> > holder's name in the computation. More details in the bug report.
> >
> > CR: https://bugs.openjdk.java.net/browse/JDK-8249719
> > Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
> >
> > Tested: tier1-2, hotspot*runtime
> >
> > I'll be glad if someone could sponsor the patch.
> >
> > Thank you,
> > Andrei Pangin
>
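
A minimal sketch of why a hash that ignores the method holder degenerates,
assuming (hypothetically) that the old hash combined only the method name and
signature. ToyMethod, the FNV-1a string hash and both hash functions are
illustrative stand-ins, not the HotSpot code or the actual patch:

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical stand-in for a resolved method; not a HotSpot type.
struct ToyMethod {
  std::string holder;     // name of the class that declares the method
  std::string name;
  std::string signature;
};

// 64-bit FNV-1a, used here only as a simple deterministic string hash.
static uint64_t fnv1a(const std::string& s) {
  uint64_t h = 14695981039346656037ULL;
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

// Assumed "before" shape: the holder is ignored, so generated classes that
// all declare the same method collapse to a single hash value.
static uint64_t hash_without_holder(const ToyMethod& m) {
  return fnv1a(m.name) ^ fnv1a(m.signature);
}

// Assumed "after" shape: the holder's name takes part in the computation.
static uint64_t hash_with_holder(const ToyMethod& m) {
  uint64_t h = fnv1a(m.holder);
  h = (h * 31) ^ fnv1a(m.name);
  h = (h * 31) ^ fnv1a(m.signature);
  return h;
}

// Counts distinct hash values over `count` generated holder classes that
// each declare a method with the same name and signature.
template <typename HashFn>
static size_t distinct_hashes(size_t count, HashFn hash) {
  std::set<uint64_t> seen;
  for (size_t i = 0; i < count; i++) {
    ToyMethod m{"Script$" + std::to_string(i), "invoke", "()V"};
    seen.insert(hash(m));
  }
  return seen.size();
}
```

With the holder left out, every generated class lands in the same bucket
however many there are, so table lookups degrade to a linear scan.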

From david.holmes at oracle.com  Fri Jul 24 23:44:32 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 25 Jul 2020 09:44:32 +1000
Subject: RFR (S) 8247296: Optimize JVM_GetDeclaringClass
In-Reply-To: <0be6ee77-be76-1960-b5d2-d2f66b23f76d@oracle.com>
References: <0f0dd12a-c36c-211f-a2df-9521c3455146@oracle.com>
 <7974f5e3-9cfe-cf33-cf4f-a0efed33ac70@oracle.com>
 <2712b135-e8c4-1b5b-b59f-c7b0be201516@oracle.com>
 <0be6ee77-be76-1960-b5d2-d2f66b23f76d@oracle.com>
Message-ID: <18f69ff0-7949-2e8a-fed6-7b1fd5851cd0@oracle.com>

Thanks for the second look Yumin!

David

On 25/07/2020 1:37 am, Yumin Qi wrote:
> Hi, David
> 
> On 7/23/20 11:44 PM, David Holmes wrote:
>>
>> I must stress I am sponsoring Christoph's change and only extended it 
>> within the current file. :) I'm sure there are many, many more 
>> opportunities for similar optimisations. Though you have to be careful 
>> to ensure you don't expose an unhandled oop.
>>
>>> 1) https://github.com/openjdk/jdk/blob/master/src/hotspot/share/compiler/compileBroker.cpp#L834
>>>
>>>     Where thread_handle is resolved multiple times.
>>
>> True, that is unnecessary, but this is compiler code and I'd need to 
>> extend the review so ... I'll pass on this one.
>>
> That is OK.
>>> 2) https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/jni.cpp#L1174
>>>
>>>     Where clazz is resolved twice.
>>
>> Fixed this as it is the same pattern in a core runtime file. Webrev 
>> updated in place.
>>
> Looks good!
> 
> 
> Thanks
> 
> Yumin
> 
>> Thanks,
>> David
>>
>>
>>
>>>     Do you want to include those two files in your list?
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>> On 7/23/20 8:49 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8247296
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8247296/webrev/
>>>>
>>>> Please review this simple optimization contributed by Christoph 
>>>> Dreis in its initial form and then expanded by me to cover other 
>>>> cases in jvm.cpp.
>>>>
>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-June/040025.html 
>>>>
>>>>
>>>> There is a common pattern of code of the form:
>>>>
>>>> if (java_lang_Class::is_primitive(JNIHandles::resolve_non_null(ofClass)) ||
>>>>     !java_lang_Class::as_Klass(JNIHandles::resolve_non_null(ofClass))->is_instance_klass()) {
>>>>
>>>> which resolves ofClass twice. There are also duplicate calls to as_Klass 
>>>> that can be removed in a couple of cases.
>>>>
>>>> Testing: tiers 1 - 3
>>>>
>>>> Thanks,
>>>> David
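
A rough sketch of the double-resolve pattern quoted above and of the
resolve-once alternative. ToyMirror, resolve_non_null and the call counter
are hypothetical stand-ins for illustration only; the real JNIHandles and
java_lang_Class code differs:

```cpp
#include <cassert>

// Hypothetical stand-in for a class mirror; not a HotSpot type.
struct ToyMirror {
  bool primitive;
  bool instance_klass;
};

// Counts handle resolutions so the difference is observable.
static int g_resolve_calls = 0;

// Stand-in for resolving a handle to the object it refers to.
static const ToyMirror* resolve_non_null(const ToyMirror* handle) {
  assert(handle != nullptr);
  g_resolve_calls++;
  return handle;
}

// Before: the handle is resolved once per use.
static bool rejects_twice(const ToyMirror* handle) {
  return resolve_non_null(handle)->primitive ||
         !resolve_non_null(handle)->instance_klass;
}

// After: resolve once into a local and reuse the result.
static bool rejects_once(const ToyMirror* handle) {
  const ToyMirror* mirror = resolve_non_null(handle);
  return mirror->primitive || !mirror->instance_klass;
}
```

As cautioned earlier in the thread, in the real VM the resolved value is a raw
oop, so holding it in a local is only appropriate where it cannot end up as an
unhandled oop.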

From david.holmes at oracle.com  Fri Jul 24 23:52:33 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 25 Jul 2020 09:52:33 +1000
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
Message-ID: <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>

Hi Coleen,

On 25/07/2020 7:07 am, coleen.phillimore at oracle.com wrote:
> Summary: The original patch but add a null pointer check so an OopHandle 
> is not created if a NULL is read from the archive.
> 
> If a NULL is read from the archive, we shouldn't create an OopHandle for 
> it because one will be created in initialize_basic_type_mirrors. The 
> assert added in the previous patch I pushed detected that.

Your fix to skip NULL seems reasonable but I'm left confused as to when 
and why we will find NULL for these basic type mirrors. Is it only the 
reference types that will be NULL?

Thanks,
David

> incremental change to original change: 
> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
> 
> Retested with tier1-3.
> 
> Thanks,
> Coleen
> 
> 

From ioi.lam at oracle.com  Fri Jul 24 23:59:38 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 24 Jul 2020 16:59:38 -0700
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>
Message-ID: <a3330f05-1ec9-dee0-e9d1-406953f66877@oracle.com>


On 7/24/20 4:52 PM, David Holmes wrote:
> Hi Coleen,
>
> On 25/07/2020 7:07 am, coleen.phillimore at oracle.com wrote:
>> Summary: The original patch but add a null pointer check so an 
>> OopHandle is not created if a NULL is read from the archive.
>>
>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>> for it because one will be created in initialize_basic_type_mirrors. 
>> The assert added in the previous patch I pushed detected that.
>
> Your fix to skip NULL seems reasonable but I'm left confused as to 
> when and why we will find NULL for these basic type mirrors. Is it 
> only the reference types that will be NULL?
>

Hi David,

The null skipping happens here:

    if (f->reading()) {
      f->do_oop(&mirror_oop); // read from archive
      assert(oopDesc::is_oop_or_null(mirror_oop), "is oop");
      // Only create an OopHandle for non-null mirrors
      if (mirror_oop != NULL) {
        _mirrors[i] = OopHandle(vm_global(), mirror_oop);
      }

where the f->do_oop() will read an archived oop from the CDS heap. There 
are cases where a NULL is returned inside mirror_oop:

(1) The CDS image was created WITHOUT an archived heap (this could 
happen when you run -Xshare:dump with a GC or compressed oop encoding 
that's not compatible with the archived heap).

(2) The CDS image has an archived heap, but the current VM is using a GC 
or compressed oop encoding that's not compatible with the archived heap.

Thanks
- Ioi


> Thanks,
> David
>
>> incremental change to original change: 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>
>> Retested with tier1-3.
>>
>> Thanks,
>> Coleen
>>
>>


From felix.yang at huawei.com  Sat Jul 25 02:16:15 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Sat, 25 Jul 2020 02:16:15 +0000
Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E897E5@dggeml507-mbx.china.huawei.com>

Hi,

    Bug: https://bugs.openjdk.java.net/browse/JDK-8165404 
    Webrev: http://cr.openjdk.java.net/~fyang/8165404/webrev.00/ 

    This implements the SHA-384/SHA-512 transformation using the AArch64 v8.2 SHA512 Crypto Extensions.
    Reference implementation:
        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha512-ce-core.S?h=v5.4.52

    We used the QEMU system emulator, which supports SHA512 instructions, to test the functionality.
    SHA512 basic functionality is tested with: http://cr.openjdk.java.net/~fyang/8165404/SHA512.java 
    The patch passed jtreg tier1-3 tests under the QEMU system emulator.
    We've also run the full jtreg tests without SHA512 instructions on aarch64-linux-gnu, to make sure that there's no regression.

    We've also created a JMH benchmark for performance testing: http://cr.openjdk.java.net/~fyang/8165404/TestSHA512.java 
    We measured the performance benefit with a cycle-accurate simulator.
    The patch delivers more than a 2x performance gain, measured with three different message sizes.

    Comments?

Thanks,
Felix

From david.holmes at oracle.com  Sat Jul 25 02:22:57 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 25 Jul 2020 12:22:57 +1000
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <a3330f05-1ec9-dee0-e9d1-406953f66877@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>
 <a3330f05-1ec9-dee0-e9d1-406953f66877@oracle.com>
Message-ID: <bfdfce21-d015-afca-2f01-67f5995509a6@oracle.com>

Hi Ioi,

On 25/07/2020 9:59 am, Ioi Lam wrote:
> 
> 
> On 7/24/20 4:52 PM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 25/07/2020 7:07 am, coleen.phillimore at oracle.com wrote:
>>> Summary: The original patch but add a null pointer check so an 
>>> OopHandle is not created if a NULL is read from the archive.
>>>
>>> If a NULL is read from the archive, we shouldn't create an OopHandle 
>>> for it because one will be created in initialize_basic_type_mirrors. 
>>> The assert added in the previous patch I pushed detected that.
>>
>> Your fix to skip NULL seems reasonable but I'm left confused as to 
>> when and why we will find NULL for these basic type mirrors. Is it 
>> only the reference types that will be NULL?
>>
> 
> Hi David,
> 
> The null skipping happens here:
> 
>     if (f->reading()) {
>       f->do_oop(&mirror_oop); // read from archive
>       assert(oopDesc::is_oop_or_null(mirror_oop), "is oop");
>       // Only create an OopHandle for non-null mirrors
>       if (mirror_oop != NULL) {
>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>       }
> 
> where the f->do_oop() will read an archived oop from the CDS heap. There 
> are cases where a NULL is returned inside mirror_oop:
> 
> (1) The CDS image was created WITHOUT an archived heap (this could 
> happen when you run -Xshare:dump with a GC or compressed oop encoding 
> that's not compatible with the archived heap).
> 
> (2) The CDS image has an archived heap, but the current VM is using a GC 
> or compressed oop encoding that's not compatible with the archived heap.

Thanks for the explanation. I'm having trouble seeing the equivalence of 
the old Universe::serialize code and the new version. The new version 
seems to perform both writing and reading - which seems odd for something 
called "serialize". ??

Thanks,
David
-----

> Thanks
> - Ioi
> 
> 
> 
>> Thanks,
>> David
>>
>>> incremental change to original change: 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>>> full webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>>
>>> Retested with tier1-3.
>>>
>>> Thanks,
>>> Coleen
>>>
>>>
> 

From ioi.lam at oracle.com  Sat Jul 25 02:48:29 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 24 Jul 2020 19:48:29 -0700
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <bfdfce21-d015-afca-2f01-67f5995509a6@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>
 <a3330f05-1ec9-dee0-e9d1-406953f66877@oracle.com>
 <bfdfce21-d015-afca-2f01-67f5995509a6@oracle.com>
Message-ID: <ab8336a8-edda-6b94-dd04-04940bee252b@oracle.com>


On 7/24/20 7:22 PM, David Holmes wrote:
> Hi Ioi,
>
> On 25/07/2020 9:59 am, Ioi Lam wrote:
>>
>>
>> On 7/24/20 4:52 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 25/07/2020 7:07 am, coleen.phillimore at oracle.com wrote:
>>>> Summary: The original patch but add a null pointer check so an 
>>>> OopHandle is not created if a NULL is read from the archive.
>>>>
>>>> If a NULL is read from the archive, we shouldn't create an 
>>>> OopHandle for it because one will be created in 
>>>> initialize_basic_type_mirrors. The assert added in the previous 
>>>> patch I pushed detected that.
>>>
>>> Your fix to skip NULL seems reasonable but I'm left confused as to 
>>> when and why we will find NULL for these basic type mirrors. Is it 
>>> only the reference types that will be NULL?
>>>
>>
>> Hi David,
>>
>> The null skipping happens here:
>>
>>     if (f->reading()) {
>>       f->do_oop(&mirror_oop); // read from archive
>>       assert(oopDesc::is_oop_or_null(mirror_oop), "is oop");
>>       // Only create an OopHandle for non-null mirrors
>>       if (mirror_oop != NULL) {
>>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>>       }
>>
>> where the f->do_oop() will read an archived oop from the CDS heap. 
>> There are cases where a NULL is returned inside mirror_oop:
>>
>> (1) The CDS image was created WITHOUT an archived heap (this could 
>> happen when you run -Xshare:dump with a GC or compressed oop encoding 
>> that's not compatible with the archived heap).
>>
>> (2) The CDS image has an archived heap, but the current VM is using a 
>> GC or compressed oop encoding that's not compatible with the archived 
>> heap.
>
> Thanks for the explanation. I'm having trouble seeing the equivalence 
> of the old Universe::serialize code and the new version. The new 
> version seems to perform both writing and reading - which seem odd for 
> something called "serialize". ??
>

CDS "serialize" is used for both reading and writing. It's probably a bad 
name but has been there since the beginning of CDS ....
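
A small sketch of that dual-direction idea, including the null skipping on
the read side. ToyClosure, do_ptr and serialize_slots are hypothetical names
for illustration and do not reproduce the HotSpot closure or
Universe::serialize:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an archive stream; not the HotSpot closure.
class ToyClosure {
 public:
  // Writing mode: values are appended to the archive.
  ToyClosure() : _reading(false), _pos(0) {}
  // Reading mode: values are consumed from an existing archive.
  explicit ToyClosure(const std::vector<const int*>& archive)
      : _reading(true), _archive(archive), _pos(0) {}

  bool reading() const { return _reading; }

  // One entry point serves both directions.
  void do_ptr(const int** p) {
    if (_reading) {
      assert(_pos < _archive.size());
      *p = _archive[_pos++];
    } else {
      _archive.push_back(*p);
    }
  }

  const std::vector<const int*>& archive() const { return _archive; }

 private:
  bool _reading;
  std::vector<const int*> _archive;
  size_t _pos;
};

// Serializes a table of slots. On read, a null entry leaves the slot
// untouched so that later initialization can fill it in.
static void serialize_slots(ToyClosure* f, std::vector<const int*>& slots) {
  for (size_t i = 0; i < slots.size(); i++) {
    const int* value = slots[i];
    if (f->reading()) {
      value = nullptr;
      f->do_ptr(&value);          // read from archive
      if (value != nullptr) {
        slots[i] = value;         // only install non-null entries
      }
    } else {
      f->do_ptr(&value);          // write to archive
    }
  }
}
```

The same function is called at dump time with a writing closure and at
startup with a reading one, which is why the body has to ask which mode it
is in.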

Thanks
- Ioi

> Thanks,
> David
> -----
>
>> Thanks
>> - Ioi
>>
>>
>>
>>> Thanks,
>>> David
>>>
>>>> incremental change to original change: 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>>>> full webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>>>
>>>> Retested with tier1-3.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>
>>


From suenaga at oss.nttdata.com  Sat Jul 25 05:51:15 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 25 Jul 2020 14:51:15 +0900
Subject: Hypervisor detector for Windows
Message-ID: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>

Hi all,

When I got an hs_err log on Windows, I saw "HyperV virtualization detected" in it even though I was running on the host OS.

I intend to file this in JBS and fix it, but I have some questions about it.
(This feature was introduced in JDK-8219241)


   - According to [1] (it is mentioned in the source code), we need to check bit 31 in ECX when CPUID is called with EAX = 1h. Why does the code not do so?

   - Why does VM_Version::check_virtualizations() call CPUID with 40000000h to 4000FF00h? 40000000h should be enough if we want to get the vendor ID.

   - Why is VM_Version::check_virt_cpuid() implemented separately for GNU C (GAS) and MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.

   - In the case of Hyper-V, the host OS is treated as the root partition [2], so we cannot use this CPUID solution for Hyper-V. I guess we need to check it with other solutions like [3].
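
The two CPUID checks mentioned above can be sketched as pure decoding
helpers. This assumes the caller has already executed CPUID (for example via
a compiler intrinsic) and passes in the raw register values; the function
names are hypothetical and this is not the VM_Version code:

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>

// CPUID leaf 1: bit 31 of ECX is the "hypervisor present" bit.
static bool hypervisor_bit_set(uint32_t leaf1_ecx) {
  return ((leaf1_ecx >> 31) & 1u) != 0;
}

// CPUID leaf 0x40000000: EBX, ECX and EDX carry a 12-byte vendor signature,
// four ASCII bytes per register, least significant byte first.
static std::string hypervisor_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx) {
  std::string vendor;
  for (uint32_t reg : {ebx, ecx, edx}) {
    for (int shift = 0; shift < 32; shift += 8) {
      vendor.push_back(static_cast<char>((reg >> shift) & 0xFF));
    }
  }
  return vendor;
}
```

For instance, the register values 0x7263694D, 0x666F736F and 0x76482074
decode to the signature "Microsoft Hv". As the last point above notes, such a
signature alone does not tell a Hyper-V guest apart from the root partition.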


Thanks,

Yasumasa


[1] https://kb.vmware.com/s/article/1009458
[2] https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
[3] https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c

From thomas.stuefe at gmail.com  Sat Jul 25 07:18:38 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 25 Jul 2020 09:18:38 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
Message-ID: <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>

Hi, Andrei,

Good find. I played around with a test of generating lots of lambdas and
yes, all the hashes are equal. With your patch invocation time went down by
half (that was for 10000 lambdas).

The test looks fine though the normal way to do this seems to be jcod. I
personally don't care since the test is nice and self contained that way,
but someone from the Oracle runtime group should confirm this is fine
(ccing Coleen).

JDK11 seems to be affected too.

This probably also affects jruby.

+1 from me.

..Thomas

On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
wrote:

> Hi,
>
> Please review a small fix to a not-so-small performance issue that we've
> seen when migrating a production application from JDK 8 to JDK 14.
>
> On certain workloads, where Nashorn produces thousands of MethodHandles,
> ResolvedMethodTable operations become extremely slow due to degenerate
> hashcode. This patch basically fixes hashcode by including the method
> holder's name in the computation. More details in the bug report.
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>
> Tested: tier1-2, hotspot*runtime
>
> I'll be glad if someone could sponsor the patch.
>
> Thank you,
> Andrei Pangin
>

From david.holmes at oracle.com  Sat Jul 25 12:13:45 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 25 Jul 2020 22:13:45 +1000
Subject: Hypervisor detector for Windows
In-Reply-To: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
Message-ID: <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>

Hi Yasumasa,

My recollection from reviewing this was that it was all based on vendor 
strings. Best to ask Matthias (cc'd) if you need more details.

Cheers,
David

On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
> Hi all,
> 
> When I got hs_err log on Windows, I saw "HyperV virtualization detected" 
> in it in spite of running on host OS.
> 
> I tried to file it to JBS and to fix it, but I have some questions for 
> this.
> (This feature has been introduced in JDK-8219241)
> 
> 
>    - According to [1] (it is mentioned in the source code), we need to 
> check bit 31 in ECX when CPUID is called with EAX = 1h. Why it would not 
> do so?
> 
>    - Why would VM_Version::check_virtualizations() call CPUID with 
> 40000000h to 4000FF00h? 40000000h should be used if we want to get 
> vendor ID.
> 
>    - Why VM_Version::check_virt_cpuid() is separated for GNU C (GAS) and 
> MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.
> 
>    - In case of Hyper-V, host OS is treated as root partition [2], so we 
> cannot use this CPUID solution for Hyper-V. I guess we need to check it 
> with other solutions like [3].
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> [1] https://kb.vmware.com/s/article/1009458
> [2] 
> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
> 
> [3] 
> https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c 
> 

From coleen.phillimore at oracle.com  Sat Jul 25 14:19:03 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Sat, 25 Jul 2020 10:19:03 -0400
Subject: RFR (T) 8250519: [REDO] Move mirror oops from Universe into
 OopStorage
In-Reply-To: <ab8336a8-edda-6b94-dd04-04940bee252b@oracle.com>
References: <26a74821-ef3a-4395-6ebd-9bea22d32b32@oracle.com>
 <1d5af712-7be4-7276-2ab0-8d898245c277@oracle.com>
 <a3330f05-1ec9-dee0-e9d1-406953f66877@oracle.com>
 <bfdfce21-d015-afca-2f01-67f5995509a6@oracle.com>
 <ab8336a8-edda-6b94-dd04-04940bee252b@oracle.com>
Message-ID: <409db805-313b-470a-d362-cf4b4229f5ac@oracle.com>


On 7/24/20 10:48 PM, Ioi Lam wrote:
>
>
> On 7/24/20 7:22 PM, David Holmes wrote:
>> Hi Ioi,
>>
>> On 25/07/2020 9:59 am, Ioi Lam wrote:
>>>
>>>
>>> On 7/24/20 4:52 PM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 25/07/2020 7:07 am, coleen.phillimore at oracle.com wrote:
>>>>> Summary: The original patch but add a null pointer check so an 
>>>>> OopHandle is not created if a NULL is read from the archive.
>>>>>
>>>>> If a NULL is read from the archive, we shouldn't create an 
>>>>> OopHandle for it because one will be created in 
>>>>> initialize_basic_type_mirrors. The assert added in the previous 
>>>>> patch I pushed detected that.
>>>>
>>>> Your fix to skip NULL seems reasonable but I'm left confused as to 
>>>> when and why we will find NULL for these basic type mirrors. Is it 
>>>> only the reference types that will be NULL?
>>>>
>>>
>>> Hi David,
>>>
>>> The null skipping happens here:
>>>
>>>     if (f->reading()) {
>>>       f->do_oop(&mirror_oop); // read from archive
>>>       assert(oopDesc::is_oop_or_null(mirror_oop), "is oop");
>>>       // Only create an OopHandle for non-null mirrors
>>>       if (mirror_oop != NULL) {
>>>         _mirrors[i] = OopHandle(vm_global(), mirror_oop);
>>>       }
>>>
>>> where the f->do_oop() will read an archived oop from the CDS heap. 
>>> There are cases where a NULL is returned inside mirror_oop:
>>>
>>> (1) The CDS image was created WITHOUT an archived heap (this could 
>>> happen when you run -Xshare:dump with a GC or compressed oop 
>>> encoding that's not compatible with the archived heap).
>>>
>>> (2) The CDS image has an archived heap, but the current VM is using 
>>> a GC or compressed oop encoding that's not compatible with the 
>>> archived heap.
>>
>> Thanks for the explanation. I'm having trouble seeing the equivalence 
>> of the old Universe::serialize code and the new version. The new 
>> version seems to perform both writing and reading - which seem odd 
>> for something called "serialize". ??
>>
>
> CDS "serialize" is used for both read and writing. It's probably a bad 
> name but has been there since the beginning of CDS ....

Generally, serialize doesn't have to test whether it's reading or 
writing. It's at least a useful name to find in the sources.

Thanks for answering David's question!
Coleen
>
> Thanks
> - Ioi
>
>> Thanks,
>> David
>> -----
>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> incremental change to original change: 
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01.incr/webrev
>>>>> full webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8250519.01/webrev
>>>>>
>>>>> Retested with tier1-3.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>
>>>
>


From coleen.phillimore at oracle.com  Sat Jul 25 14:34:22 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Sat, 25 Jul 2020 10:34:22 -0400
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
Message-ID: <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>


Hi Andrei,
This looks good. Thank you for finding this bug. And thanks to Volker 
for sponsoring it as well.
Nice to see you on the list, Andrei!
Coleen

On 7/25/20 3:18 AM, Thomas Stüfe wrote:
> Hi, Andrei,
>
> Good find. I played around with a test of generating lots of lambdas 
> and yes, all the hashes are equal. With your patch invocation time 
> went down by half (that was for 10000 lambdas).
>
> The test looks fine though the normal way to do this seems to be jcod. 
> I personally don't care since the test is nice and self contained that 
> way, but someone from the Oracle runtime group should confirm this is 
> fine (ccing Coleen).
>
> JDK11 seems to be affected too.
>
> This probably also affects jruby.
>
> +1 from me.
>
> ..Thomas
>
> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com 
> <mailto:andrei.pangin at gmail.com>> wrote:
>
>     Hi,
>
>     Please review a small fix to a not-so-small performance issue that
>     we've
>     seen when migrating a production application from JDK 8 to JDK 14.
>
>     On certain workloads, where Nashorn produces thousands of MethodHandles,
>     ResolvedMethodTable operations become extremely slow due to a degenerate
>     hashcode. This patch basically fixes the hashcode by including the method
>     holder's name in the computation. More details in the bug report.
>
>     CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>     Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>
>     Tested: tier1-2, hotspot*runtime
>
>     I'll be glad if someone could sponsor the patch.
>
>     Thank you,
>     Andrei Pangin
>
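
For readers following along, a minimal sketch of the effect described above, assuming a simple FNV-1a string hash rather than the actual HotSpot hash. The MethodRef struct, the function names, and the sample class names are hypothetical. When the holder is left out, generated methods that share a name and signature all land on the same hash value; mixing in the holder's name separates them.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a resolved method: holder class, name, signature.
struct MethodRef {
  std::string holder;
  std::string name;
  std::string signature;
};

// FNV-1a step over a string, continuing from a previous hash state.
static uint32_t fnv1a(uint32_t h, const std::string& s) {
  for (unsigned char c : s) {
    h = (h ^ c) * 16777619u;
  }
  return h;
}

// Degenerate variant: ignores the holder, so same name + signature collide.
uint32_t hash_without_holder(const MethodRef& m) {
  return fnv1a(fnv1a(2166136261u, m.name), m.signature);
}

// Variant that also mixes in the holder's name.
uint32_t hash_with_holder(const MethodRef& m) {
  return fnv1a(fnv1a(fnv1a(2166136261u, m.holder), m.name), m.signature);
}
```

With thousands of generated classes that each declare a method of the same name and signature, the first variant puts every entry in one bucket, which matches the slowdown reported in the thread.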


From suenaga at oss.nttdata.com  Sat Jul 25 15:01:56 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 26 Jul 2020 00:01:56 +0900
Subject: Hypervisor detector for Windows
In-Reply-To: <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
Message-ID: <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>

Thanks David!

On 2020/07/25 21:13, David Holmes wrote:
> Hi Yasumasa,
> 
> My recollection from reviewing this was that it was all based on vendor strings. Best to ask Matthias (cc'd) if you need more details.

I think we can fix it as in the following webrev. It works fine on Windows 10 (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
Matthias, what do you think? If this webrev looks good, I will file it in JBS and send a review request.

   http://cr.openjdk.java.net/~ysuenaga/hv-detection/


Thanks,

Yasumasa


> Cheers,
> David
> 
> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>
>> I tried to file it to JBS and to fix it, but I have some questions for this.
>> (This feature has been introduced in JDK-8219241)
>>
>>
>>    - According to [1] (it is mentioned in the source code), we need to check bit 31 in ECX when CPUID is called with EAX = 1h. Why it would not do so?
>>
>>    - Why would VM_Version::check_virtualizations() call CPUID with 40000000h to 4000FF00h? 40000000h should be used if we want to get vendor ID.
>>
>>    - Why VM_Version::check_virt_cpuid() is separated for GNU C (GAS) and MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.
>>
>>    - In case of Hyper-V, host OS is treated as root partition [2], so we cannot use this CPUID solution for Hyper-V. I guess we need to check it with other solutions like [3].
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://kb.vmware.com/s/article/1009458
>> [2] https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>> [3] https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c

From david.holmes at oracle.com  Mon Jul 27 01:39:41 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 11:39:41 +1000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
 <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <8a9577be-bbfa-8ce2-abcc-1bcd837b00ea@oracle.com>

Hi Ludovic,

This patch is good to go as far as I am concerned.

Thanks,
David

On 25/07/2020 1:39 am, Ludovic Henry wrote:
> Hi,
> 
> A quick follow-up on that change. Is there anything else you'd like to see changed to get it merged?
> 
> Following an offline discussion with David, I'm working with relevant Microsoft teams to figure out where relevant documentation is. I'll keep you posted on that.
> 
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/8248657.patch
> 
> Thank you,
> 
> --
> Ludovic
> 
> -----Original Message-----
> From: Ludovic Henry <luhenry at microsoft.com>
> Sent: Wednesday, July 15, 2020 10:00 AM
> To: David Holmes <david.holmes at oracle.com>; Andrew Haley <aph at redhat.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
> Cc: Kim Barrett <kim.barrett at oracle.com>; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
> 
> Hi David,
> 
>>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
>>
>> That is good to know. But this is something that Microsoft should be
>> documenting explicitly - even if just a blanket statement that all
>> syscalls (which are what exactly?) provide an implicit memory barrier
>> (of what type exactly?).
> 
> I don't think it's because SetEvent is a syscall that we can assume it has a barrier (even though syscalls do guarantee a barrier), it's more that SetEvent is an equivalent to sem_post. And if you cannot assume that sem_post or SetEvent guarantee a memory barrier (full or at least store_release), then you could not trust any standard locking mechanism (what's the point of synchronizing if the CPU can load and store outside of the critical section).
> 
>>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We run successfully jcstress in `tough` mode.
>>
>> jcstress tests will execute the native runtime code of course, but they
>> won't be "stressing" it as such.
> 
> Makes sense, thanks for the clarification.
> 
> --
> Ludovic
> 
> I agree with you on the value of more explicit documentation, and I'll go look for that. If it doesn't exist, I'll put the request to have it documented somewhere on docs.microsoft.com. In the meantime, it is safe to assume that SetEvent contains a memory barrier that has at least a store_release semantic. Similarly, WaitForSingleObject and WaitForMultipleObjects have at least a load_acquire memory barrier, and are also syscalls (actually guaranteeing a full memory barrier).
> 
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Monday, July 13, 2020 19:25
> To: Ludovic Henry; Andrew Haley; Thomas Stüfe
> Cc: Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
> 
> Hi Ludovic,
> 
> On 14/07/2020 11:28 am, Ludovic Henry wrote:
>> Hello,
>>
>>> But if we are dealing with non-TSO races then it would be good to get
>>> some guidance from Microsoft as to the memory ordering properties of
>>> various API's to ensure that we are maintaining correct ordering. For
>>> example, in the destructor we have:
>>>
>>> 81     lock_owner = 0;
>>> 82     // No lost wakeups, lock_event stays signaled until reset.
>>> 83     DWORD ret = SetEvent(lock_event);
>>>
>>> but unless we are guaranteed that the store to lock_owner cannot be
>>> reordered by the compiler or the hardware, to appear to be after the
>>> SetEvent, then the logic is broken. Generally, because Windows only
>>> supported TSO systems, we have assumed that the compiler will not
>>> reorder code across these kind of API calls. But now we also need
>>> hardware guarantees.
>>
>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
> 
> That is good to know. But this is something that Microsoft should be
> documenting explicitly - even if just a blanket statement that all
> syscalls (which are what exactly?) provide an implicit memory barrier
> (of what type exactly?).
> 
>> As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.
> 
> The /volatile:ms is the kind of build control I was wondering about.
> Thanks for the pointer.
> 
>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We run successfully jcstress in `tough` mode.
> 
> jcstress tests will execute the native runtime code of course, but they
> won't be "stressing" it as such.
> 
> Cheers,
> David
> -----
> 
>> I hope this helps to answer your questions.
>>
>> [1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior
>>
>> --
>> Ludovic
>> ________________________________________
>> From: Andrew Haley <aph at redhat.com>
>> Sent: Monday, July 13, 2020 01:36
>> To: David Holmes; Thomas Stüfe
>> Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>>
>> On 13/07/2020 06:48, David Holmes wrote:
>>> Hi Thomas,
>>>
>>> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>>>
>>>> Can a compiler reorder system calls and stores? How would it determine
>>>> if this is safe to do?
>>
>> I very much doubt it.
>>
>>> A compiler can reorder anything it likes if it can determine it is safe
>>> to do so. :)
>>
>> I'm fairly sure the compiler doesn't care about that!
>>
>>>> I'd be surprised if Microsoft loosened up reordering since this would
>>>> mean existing software cannot just be recompiled for arm and expected to
>>>> work. But this is just a guess of course.
>>>
>>> It's an interesting point because I would expect there to be a lot of
>>> software written for Windows that contains assumptions of TSO that would
>>> in fact fail when run on Aarch64. I don't know if there are any special
>>> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
>>> (or build flags), that would allow for ease of migration.
>>
>> There's no standard hardware mechanism that would do so.
>>
>> I've been very surprised at how little software has broken on AArch64
>> because of memory ordering. Like you, I initially assumed that stuff
>> would break all over the place, but by and large it was OK. I know of
>> two reasons: firstly, programmers are pretty conservative and tend to
>> use simple and reliable mechanisms such as safe publication and
>> mutexes for inter-thread communication. But also, and maybe more
>> importantly, the kinds of reordering the hardware can do are not very
>> different from those compilers do. Therefore, anyone playing fast and
>> loose with TSO has probably already been bitten by the compiler.
>>
>>> But unless all Windows software will run in such a mode there is a
>>> need for MS to document what the memory consistency properties of
>>> various APIs are (as POSIX does [1]).
>>
>> Indeed. I would have thought it existed somewhere.
>>
>> --
>> Andrew Haley  (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com/>
>> https://keybase.io/andrewhaley
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>
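
As a compact illustration of the ordering requirement discussed in this thread, here is a minimal sketch that models it with std::atomic instead of the Windows API. The Handoff struct and its member names are assumptions for this example, and the atomic flag only stands in for the event; the sketch says nothing about what SetEvent itself guarantees. The store to lock_owner must become visible before the signal, which a release store on the flag ensures, and the waiter pairs it with an acquire operation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of the lock handoff: plain data published through a flag.
struct Handoff {
  int lock_owner = 1;                 // plain (non-atomic) field, initially owned by thread 1
  std::atomic<bool> signaled{false};  // stands in for the auto-reset event

  // Releasing side: clear the owner, then signal with release semantics so
  // the store to lock_owner cannot be reordered after the signal.
  void release_lock() {
    lock_owner = 0;
    signaled.store(true, std::memory_order_release);
  }

  // Acquiring side: consume the signal with acquire semantics, so the
  // earlier store to lock_owner is visible before we overwrite it.
  bool try_acquire(int self) {
    if (!signaled.exchange(false, std::memory_order_acquire)) {
      return false;                   // not signaled, lock still held
    }
    lock_owner = self;
    return true;
  }
};
```

If the release store were relaxed instead, a weakly ordered CPU such as AArch64 would be allowed to make the signal visible before the cleared lock_owner, which is the broken case described above.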

From david.holmes at oracle.com  Mon Jul 27 01:50:28 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 11:50:28 +1000
Subject: Hypervisor detector for Windows
In-Reply-To: <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
 <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
Message-ID: <52687df4-c644-ca47-4f20-1f41a9abd66a@oracle.com>

On 26/07/2020 1:01 am, Yasumasa Suenaga wrote:
> Thanks David!
> 
> On 2020/07/25 21:13, David Holmes wrote:
>> Hi Yasumasa,
>>
>> My recollection from reviewing this was that it was all based on 
>> vendor strings. Best to ask Matthias (cc'd) if you need more details.
> 
> I think we can fix like following webrev. It works fine on Windows 10 
> (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
> Matthias, what do you think? If this webrev seems good, I will file it 
> to JBS and will send review request.
> 
>    http://cr.openjdk.java.net/~ysuenaga/hv-detection/

By all means file a JBS issue but I can't comment on the correctness of 
the fix.

Cheers,
David

> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Cheers,
>> David
>>
>> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> When I got hs_err log on Windows, I saw "HyperV virtualization 
>>> detected" in it in spite of running on host OS.
>>>
>>> I tried to file it to JBS and to fix it, but I have some questions 
>>> for this.
>>> (This feature has been introduced in JDK-8219241)
>>>
>>>
>>>    - According to [1] (it is mentioned in the source code), we need 
>>> to check bit 31 in ECX when CPUID is called with EAX = 1h. Why it 
>>> would not do so?
>>>
>>>    - Why would VM_Version::check_virtualizations() call CPUID with 
>>> 40000000h to 4000FF00h? 40000000h should be used if we want to get 
>>> vendor ID.
>>>
>>>    - Why VM_Version::check_virt_cpuid() is separated for GNU C (GAS) 
>>> and MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.
>>>
>>>    - In case of Hyper-V, host OS is treated as root partition [2], so 
>>> we cannot use this CPUID solution for Hyper-V. I guess we need to 
>>> check it with other solutions like [3].
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://kb.vmware.com/s/article/1009458
>>> [2] 
>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>
>>> [3] 
>>> https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c 
>>>

From suenaga at oss.nttdata.com  Mon Jul 27 02:49:47 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 11:49:47 +0900
Subject: Hypervisor detector for Windows
In-Reply-To: <52687df4-c644-ca47-4f20-1f41a9abd66a@oracle.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
 <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
 <52687df4-c644-ca47-4f20-1f41a9abd66a@oracle.com>
Message-ID: <b1ede801-2752-f656-d71f-1ab524db778e@oss.nttdata.com>

On 2020/07/27 10:50, David Holmes wrote:
> On 26/07/2020 1:01 am, Yasumasa Suenaga wrote:
>> Thanks David!
>>
>> On 2020/07/25 21:13, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> My recollection from reviewing this was that it was all based on vendor strings. Best to ask Matthias (cc'd) if you need more details.
>>
>> I think we can fix like following webrev. It works fine on Windows 10 (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
>> Matthias, what do you think? If this webrev seems good, I will file it to JBS and will send review request.
>>
>>    http://cr.openjdk.java.net/~ysuenaga/hv-detection/
> 
> By all means file a JBS issue but I can't comment on the correctness of the fix.

I filed it in JBS:
   https://bugs.openjdk.java.net/browse/JDK-8250598

I will send a review request later, and I would like to discuss it in the review thread.


Thanks,

Yasumasa


> Cheers,
> David
> 
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Cheers,
>>> David
>>>
>>> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>>
>>>> I tried to file it to JBS and to fix it, but I have some questions for this.
>>>> (This feature has been introduced in JDK-8219241)
>>>>
>>>>
>>>>    - According to [1] (it is mentioned in the source code), we need to check bit 31 in ECX when CPUID is called with EAX = 1h. Why it would not do so?
>>>>
>>>>    - Why would VM_Version::check_virtualizations() call CPUID with 40000000h to 4000FF00h? 40000000h should be used if we want to get vendor ID.
>>>>
>>>>    - Why VM_Version::check_virt_cpuid() is separated for GNU C (GAS) and MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.
>>>>
>>>>    - In case of Hyper-V, host OS is treated as root partition [2], so we cannot use this CPUID solution for Hyper-V. I guess we need to check it with other solutions like [3].
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] https://kb.vmware.com/s/article/1009458
>>>> [2] https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>> [3] https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c

From suenaga at oss.nttdata.com  Mon Jul 27 04:24:57 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 13:24:57 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
Message-ID: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>

Hi all,

Please review this change:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/

When I got an hs_err log on Windows, I saw "HyperV virtualization detected" in it even though the JVM was running on the host OS.

The hypervisor detector was introduced in JDK-8219241, but it has the following problems:

   - Hyper-V is detected on Windows even when running on the host OS
   - It calls CPUID with values other than EAX = 40000000h (this is not described in the spec [1])
   - It does not check the CPUID hypervisor present bit [1]
   - It does not support the x86 (32bit) platform
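
The two CPUID checks named in the list above can be sketched as pure functions on register values. This is only an illustration, not the code in the webrev: the function names are hypothetical, and the raw CPUID instruction is left out so the logic can be exercised anywhere. It assumes the usual convention that leaf 1 reports the hypervisor present bit in bit 31 of ECX, and that leaf 40000000h returns a 12-byte vendor signature in EBX, ECX, EDX.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// CPUID leaf 1: bit 31 of ECX is the "hypervisor present" bit.
bool hypervisor_present(uint32_t leaf1_ecx) {
  return ((leaf1_ecx >> 31) & 1u) != 0;
}

// CPUID leaf 0x40000000: the vendor signature is the 12 bytes held in
// EBX, ECX, EDX (in that order), least significant byte first.
std::string hypervisor_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx) {
  std::string s;
  for (uint32_t reg : {ebx, ecx, edx}) {
    for (int shift = 0; shift < 32; shift += 8) {
      s.push_back(static_cast<char>((reg >> shift) & 0xFF));
    }
  }
  return s;
}
```

For example, the register values 0x7263694D, 0x666F736F, 0x76482074 decode to "Microsoft Hv", the signature reported in this thread for Hyper-V.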

I've tested this change on the submit repo, and have checked the output of the VM.info jcmd in the following environments:

   - Windows x64 (host)
   - Windows x64 (Hyper-V guest)
   - Fedora32 x64 (Hyper-V guest)
   - 32 bit JDK on Fedora32 x64 (Hyper-V guest)


Thanks,

Yasumasa


[1] https://kb.vmware.com/s/article/1009458

From david.holmes at oracle.com  Mon Jul 27 05:02:06 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 15:02:06 +1000
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
Message-ID: <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>


On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
> Hi all,
> 
> Please review this change:
> 
>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
> 
> When I got hs_err log on Windows, I saw "HyperV virtualization detected" 
> in it in spite of running on host OS.
> 
> Hypervisor detector has been introduced in JDK-8219241, but it has some 
> problems as below:
> 
>    - Hyper-V is detected on Windows in spite of running on host OS
>    - Call CPUID with other than EAX = 40000000h (it is not described in 
> the spec [1])

That VMWare document is not a "spec" for anything other than VMware. So 
this may work for VMWare:

   a->movl(rax, 0x40000000);

but may not work for all other HV environments - which is why the 
original code checks a range of addresses within the reserved area. See 
this related code for example:

http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD

   71   /* Most hypervisors only have information in leaf 0x40000000.
   72    *
   73    * Some hypervisors have "Viridian [HyperV] extensions", and those
   74    * must appear in slot 0x40000000, but they will also have the true
   75    * hypervisor in a higher slot.

You have to be able to check this on a range of HV's to ensure you have 
not broken anything.
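
To make the range check concrete, here is a minimal sketch of scanning the reserved hypervisor leaves and collecting every signature found. It is not the existing HotSpot code: the names CpuidRegs, CpuidFn and scan_hypervisor_leaves are hypothetical, the CPUID call is injected as a function so the loop can be tested without real hardware, and the 0x100 step between leaves is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

struct CpuidRegs { uint32_t eax, ebx, ecx, edx; };

// Injected CPUID accessor (hypothetical): maps a leaf number to register values.
using CpuidFn = std::function<CpuidRegs(uint32_t leaf)>;

// Decodes the 12-byte signature held in EBX, ECX, EDX.
static std::string signature(const CpuidRegs& r) {
  std::string s;
  const uint32_t regs[3] = {r.ebx, r.ecx, r.edx};
  for (uint32_t reg : regs) {
    for (int shift = 0; shift < 32; shift += 8) {
      s.push_back(static_cast<char>((reg >> shift) & 0xFF));
    }
  }
  return s;
}

// Walks leaves 0x40000000 .. 0x4000FF00 (assumed step 0x100) and returns
// every non-empty signature, lowest leaf first.
std::vector<std::string> scan_hypervisor_leaves(const CpuidFn& cpuid) {
  std::vector<std::string> found;
  const std::string empty(12, '\0');
  for (uint32_t leaf = 0x40000000u; leaf <= 0x4000FF00u; leaf += 0x100u) {
    std::string sig = signature(cpuid(leaf));
    if (sig != empty) {
      found.push_back(sig);
    }
  }
  return found;
}
```

On a host that exposes Hyper-V extensions in the first slot and the true hypervisor in a higher one, this returns both signatures, so a caller can decide which one to report.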

Did you actually diagnose why the existing code mis-detects Hyper-V 
under Windows?

David
-----

>    - Does not check CPUID hypervisor present bit [1]
>    - Does not support x86 (32bit) platform
> 
> I've tested this change on submit repo, and have checked output from 
> VM.info jcmd on following environment:
> 
>    - Windows x64 (host)
>    - Windows x64 (Hyper-V guest)
>    - Fedora32 x64 (Hyper-V guest)
>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> [1] https://kb.vmware.com/s/article/1009458

From ioi.lam at oracle.com  Mon Jul 27 05:20:04 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Sun, 26 Jul 2020 22:20:04 -0700
Subject: RFR(S) 8249276 CDS archived objects must have "neutral" markwords
Message-ID: <5d5b4b17-a40f-b0f4-fa50-fdd783def9d4@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8249276
http://cr.openjdk.java.net/~iklam/jdk16/8249276-reset-archive-obj-headers.v01/

Please review this change (initial patch provided by David Holmes; I 
rearranged it a little and added more comments).

We reinitialize the markWord of all archived objects to remove any side 
effect of GC/locking/etc during "java -Xshare:dump".

(David mentioned in the bug comments about assertions inside the 
identity_hash() call, but maybe this should be fixed in a different bug?)

Tested with mach5 tiers 1-4.

Thanks
- Ioi

From suenaga at oss.nttdata.com  Mon Jul 27 05:21:05 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 14:21:05 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
Message-ID: <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>

Hi David,

On 2020/07/27 14:02, David Holmes wrote:
> 
> 
> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change:
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>
>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>
>> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
>>
>>    - Hyper-V is detected on Windows in spite of running on host OS
>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
> 
> That VMWare document is not a "spec" for anything other than VMware. So this may work for VMWare:
> 
>    a->movl(rax, 0x40000000);
> 
> but may not work for all other HV environments - which is why the original code checks a range of addresses within the reserved area. See this related code for example:
> 
> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD
> 
>    71   /* Most hypervisors only have information in leaf 0x40000000.
>    72    *
>    73    * Some hypervisors have "Viridian [HyperV] extensions", and those
>    74    * must appear in slot 0x40000000, but they will also have the true
>    75    * hypervisor in a higher slot.
> 
> You have to be able to check this on a range of HV's to ensure you have not broken anything.

Currently this feature supports VMware, Hyper-V, KVM, and Xen.
We can distinguish them using CPUID with 40000000h, so we should not check anything other than 40000000h.

    VMware: https://kb.vmware.com/s/article/1009458
   Hyper-V: https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
       KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
       Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html


> Did you actually diagnose why the existing code mis-detects Hyper-V under Windows?

CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is installed. I guess it is caused by the Hyper-V architecture.
According to [1], the root partition acts as the host OS, so I guess the JVM would detect that it is running on Hyper-V even if it is running on the host OS.


Thanks,

Yasumasa


[1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture


> David
> -----
> 
>>    - Does not check CPUID hypervisor present bit [1]
>>    - Does not support x86 (32bit) platform
>>
>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>
>>    - Windows x64 (host)
>>    - Windows x64 (Hyper-V guest)
>>    - Fedora32 x64 (Hyper-V guest)
>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 05:23:12 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 14:23:12 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
Message-ID: <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>

On 2020/07/27 14:21, Yasumasa Suenaga wrote:
> Hi David,
> 
> On 2020/07/27 14:02, David Holmes wrote:
>>
>>
>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> Please review this change:
>>>
>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>
>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>
>>> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
>>>
>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>
>> That VMWare document is not a "spec" for anything other than VMware. So this may work for VMWare:
>>
>>    a->movl(rax, 0x40000000);
>>
>> but may not work for all other HV environments - which is why the original code checks a range of addresses within the reserved area. See this related code for example:
>>
>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD
>>
>>    71   /* Most hypervisors only have information in leaf 0x40000000.
>>    72    *
>>    73    * Some hypervisors have "Viridian [HyperV] extensions", and those
>>    74    * must appear in slot 0x40000000, but they will also have the true
>>    75    * hypervisor in a higher slot.
>>
>> You have to be able to check this on a range of HV's to ensure you have not broken anything.
> 
> Currently this feature supports VMware, Hyper-V, KVM, Xen.
> We can distinguish them from CPUID with 40000000h. So we should not check other than 40000000h.
> 
>     VMware: https://kb.vmware.com/s/article/1009458
>    Hyper-V: https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture

Sorry, the Hyper-V spec is here:

   https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs

In the Windows Server 2019 version, CPUID is described in 2.4.1.


>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>        Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html
> 
> 
>> Did you actually diagnose why the existing code mis-detects Hyper-V under Windows?
> 
> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is installed. I guess it is caused by Hyper-V architecture.
> According to [1], root partition is as a host OS, so I guess JVM would detect which is running on Hyper-V even if it is running on host OS.
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> [1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
> 
> 
>> David
>> -----
>>
>>>    - Does not check CPUID hypervisor present bit [1]
>>>    - Does not support x86 (32bit) platform
>>>
>>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>>
>>>    - Windows x64 (host)
>>>    - Windows x64 (Hyper-V guest)
>>>    - Fedora32 x64 (Hyper-V guest)
>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://kb.vmware.com/s/article/1009458

From matthias.baesken at sap.com  Mon Jul 27 06:59:49 2020
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Mon, 27 Jul 2020 06:59:49 +0000
Subject: Hypervisor detector for Windows
In-Reply-To: <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
 <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
Message-ID: <AM0PR02MB5412E6CBA26AEFD727B268F893720@AM0PR02MB5412.eurprd02.prod.outlook.com>

Hi Yasumasa, I'll put your patch into our build/test queue and check the outcome.

And thanks for filing the JBS issue too.

Regarding ...

> Does not support x86 (32bit) platform

Yes, I think 32bit is not very important any more in current OpenJDK.
But if your patch addresses it too, then it's not a bad thing for sure!

Best regards, Matthias


-----Original Message-----
From: Yasumasa Suenaga <suenaga at oss.nttdata.com> 
Sent: Samstag, 25. Juli 2020 17:02
To: David Holmes <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias <matthias.baesken at sap.com>
Subject: Re: Hypervisor detector for Windows

Thanks David!

On 2020/07/25 21:13, David Holmes wrote:
> Hi Yasumasa,
> 
> My recollection from reviewing this was that it was all based on vendor strings. Best to ask Matthias (cc'd) if you need more details.

I think we can fix like following webrev. It works fine on Windows 10 (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
Matthias, what do you think? If this webrev seems good, I will file it to JBS and will send review request.

   http://cr.openjdk.java.net/~ysuenaga/hv-detection/


Thanks,

Yasumasa


> Cheers,
> David
> 
> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> When I got an hs_err log on Windows, I saw "HyperV virtualization detected" in it even though the JVM was running on the host OS.
>>
>> I intended to file this in JBS and fix it, but I have some questions about it.
>> (This feature was introduced in JDK-8219241.)
>>
>>
>>    - According to [1] (which is mentioned in the source code), we need to check bit 31 in ECX when CPUID is called with EAX = 1h. Why does the code not do so?
>>
>>    - Why does VM_Version::check_virtualizations() call CPUID with 40000000h to 4000FF00h? 40000000h should be enough if we want to get the vendor ID.
>>
>>    - Why is VM_Version::check_virt_cpuid() implemented separately for GNU C (GAS) and MacroAssembler? I guess we can use MacroAssembler for both x86 and x86_64.
>>
>>    - In the case of Hyper-V, the host OS is treated as the root partition [2], so we cannot use this CPUID approach for Hyper-V. I guess we need to check it in another way, such as [3].
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://kb.vmware.com/s/article/1009458
>> [2] https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>> [3] https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c

From david.holmes at oracle.com  Mon Jul 27 07:09:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 17:09:48 +1000
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
Message-ID: <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>

On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>> Hi David,
>>
>> On 2020/07/27 14:02, David Holmes wrote:
>>>
>>>
>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> Please review this change:
>>>>
>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>
>>>> When I got hs_err log on Windows, I saw "HyperV virtualization 
>>>> detected" in it in spite of running on host OS.
>>>>
>>>> Hypervisor detector has been introduced in JDK-8219241, but it has 
>>>> some problems as below:
>>>>
>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>    - Call CPUID with other than EAX = 40000000h (it is not described
>>>> in the spec [1])
>>>
>>> That VMWare document is not a "spec" for anything other than VMware. 
>>> So this may work for VMWare:
>>>
>>>    a->movl(rax, 0x40000000);
>>>
>>> but may not work for all other HV environments - which is why the 
>>> original code checks a range of addresses within the reserved area. 
>>> See this related code for example:
>>>
>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD 
>>>
>>>
>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>     *
>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>     * must appear in slot 0x40000000, but they will also have the true
>>>     * hypervisor in a higher slot.
>>>
>>> You have to be able to check this on a range of HV's to ensure you 
>>> have not broken anything.
>>
>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>> We can distinguish them from CPUID with 40000000h. So we should not 
>> check other than 40000000h.
>>
>>     VMware: https://kb.vmware.com/s/article/1009458
>>    Hyper-V:
>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>
> 
> Sorry, Hyper-V spec is here:
> 
>    
> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs
> 
> In Windows Server 2019, you can see CPUID in 2.4.1 .
> 
> 
>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>        Xen:
>> https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html 
>>
>>
>>
>>> Did you actually diagnose why the existing code mis-detects Hyper-V 
>>> under Windows?
>>
>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is 
>> installed. I guess it is caused by Hyper-V architecture.

But do we actually check the enabled bit:

"Bit 31 returned in ECX is defined as Not Used, and will always return 0 
from the physical CPU. A hypervisor conformant with the Microsoft 
hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to indicate its 
presence to software."

?
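
The check that quote describes comes down to a single bit test on the ECX
value returned by CPUID leaf 1. The following is a minimal sketch, not the
HotSpot code: the names kHypervisorPresentBit and hypervisor_present are
made up for illustration, and the caller is assumed to have obtained ECX
from CPUID with EAX = 1 by some other means.

```cpp
#include <cstdint>

// Bit 31 of ECX from CPUID leaf 1 is the "hypervisor present" bit.
// (Illustrative constant name, not taken from HotSpot.)
constexpr std::uint32_t kHypervisorPresentBit = 1u << 31;

// Returns true if the given CPUID.1:ECX value has bit 31 set.
// ECX is passed in as a plain value so the logic can be exercised
// without executing the CPUID instruction.
constexpr bool hypervisor_present(std::uint32_t leaf1_ecx) {
  return (leaf1_ecx & kHypervisorPresentBit) != 0;
}
```

Passing the register value in, rather than executing CPUID inside the
function, keeps the sketch portable and easy to test on any machine.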

Thanks,
David
-----

>> According to [1], root partition is as a host OS, so I guess JVM would 
>> detect which is running on Hyper-V even if it is running on host OS.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] 
>> https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>
>>
>>
>>> David
>>> -----
>>>
>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>    - Does not support x86 (32bit) platform
>>>>
>>>> I've tested this change on submit repo, and have checked output from
>>>> VM.info jcmd on following environment:
>>>>
>>>>    - Windows x64 (host)
>>>>    - Windows x64 (Hyper-V guest)
>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 07:16:08 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 16:16:08 +0900
Subject: Hypervisor detector for Windows
In-Reply-To: <AM0PR02MB5412E6CBA26AEFD727B268F893720@AM0PR02MB5412.eurprd02.prod.outlook.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
 <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
 <AM0PR02MB5412E6CBA26AEFD727B268F893720@AM0PR02MB5412.eurprd02.prod.outlook.com>
Message-ID: <8686efd0-43f6-35ba-eb70-8655f9746e68@oss.nttdata.com>

On 2020/07/27 15:59, Baesken, Matthias wrote:
> Hi  Yasumasa , I'll  put  your  patch  into our build/test  queue  and check the outcome.

Thanks Matthias, can you check this webrev rather than the one in my first email?

   http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/


Yasumasa


> And thanks for filing the JBS issue too .
>   
> Regarding ...
> 
>> Does not support x86 (32bit) platform
> 
> Yes , I think 32bit is not very important any more in current OpenJDK .
> But if your patch addresses it too , then  it's not a bad thing for sure !
> 
> Best regards, Matthias
> 
> 
> 
> 
> -----Original Message-----
> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
> Sent: Saturday, 25 July 2020 17:02
> To: David Holmes <david.holmes at oracle.com>; hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias <matthias.baesken at sap.com>
> Subject: Re: Hypervisor detector for Windows
> 
> Thanks David!
> 
> On 2020/07/25 21:13, David Holmes wrote:
>> Hi Yasumasa,
>>
>> My recollection from reviewing this was that it was all based on vendor strings. Best to ask Matthias (cc'd) if you need more details.
> 
> I think we can fix like following webrev. It works fine on Windows 10 (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
> Matthias, what do you think? If this webrev seems good, I will file it to JBS and will send review request.
> 
>     http://cr.openjdk.java.net/~ysuenaga/hv-detection/
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Cheers,
>> David
>>
>> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>>> Hi all,
>>>
>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>
>>> I tried to file it to JBS and to fix it, but I have some questions for this.
>>> (This feature has been introduced in JDK-8219241)
>>>
>>>
>>>     - According to [1] (it is mentioned in the source code), we need to check bit 31 in ECX when CPUID is called with EAX = 1h. Why it would not do so?
>>>
>>>     - Why would VM_Version::check_virtualizations() call CPUID with 40000000h to 4000FF00h? 40000000h should be used if we want to get vendor ID.
>>>
>>>     - Why VM_Version::check_virt_cpuid() is separated for GNU C (GAS) and MacroAssembler? I guess we can use MacroAssembler for x86 / x86_64.
>>>
>>>     - In case of Hyper-V, host OS is treated as root partition [2], so we cannot use this CPUID solution for Hyper-V. I guess we need to check it with other solutions like [3].
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://kb.vmware.com/s/article/1009458
>>> [2] https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>> [3] https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c

From david.holmes at oracle.com  Mon Jul 27 07:18:19 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 17:18:19 +1000
Subject: Hypervisor detector for Windows
In-Reply-To: <8686efd0-43f6-35ba-eb70-8655f9746e68@oss.nttdata.com>
References: <b42f0008-bfdf-f73a-0d24-a54e7202f973@oss.nttdata.com>
 <c0823991-e790-5d09-e963-cc9507c1d1b0@oracle.com>
 <2cd75fd8-5a9d-415d-4817-be086b311c88@oss.nttdata.com>
 <AM0PR02MB5412E6CBA26AEFD727B268F893720@AM0PR02MB5412.eurprd02.prod.outlook.com>
 <8686efd0-43f6-35ba-eb70-8655f9746e68@oss.nttdata.com>
Message-ID: <4059199a-b9f4-65da-7203-b8adf8aaa881@oracle.com>

Please follow up on the review thread.

Thanks,
David

On 27/07/2020 5:16 pm, Yasumasa Suenaga wrote:
> On 2020/07/27 15:59, Baesken, Matthias wrote:
>> Hi Yasumasa, I'll put your patch into our build/test queue and
>> check the outcome.
> 
> Thanks Matthias, can you check this webrev rather than in my first email?
> 
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
> 
> 
> Yasumasa
> 
> 
>> And thanks for filing the JBS issue too .
>> Regarding ...
>>
>>> Does not support x86 (32bit) platform
>>
>> Yes, I think 32-bit is not very important any more in current OpenJDK.
>> But if your patch addresses it too, then it's not a bad thing for
>> sure!
>>
>> Best regards, Matthias
>>
>>
>>
>>
>> -----Original Message-----
>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>> Sent: Saturday, 25 July 2020 17:02
>> To: David Holmes <david.holmes at oracle.com>; 
>> hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias 
>> <matthias.baesken at sap.com>
>> Subject: Re: Hypervisor detector for Windows
>>
>> Thanks David!
>>
>> On 2020/07/25 21:13, David Holmes wrote:
>>> Hi Yasumasa,
>>>
>>> My recollection from reviewing this was that it was all based on 
>>> vendor strings. Best to ask Matthias (cc'd) if you need more details.
>>
>> I think we can fix like following webrev. It works fine on Windows 10 
>> (host), Windows 10 (guest) on Hyper-V, and Fedora 32 on Hyper-V.
>> Matthias, what do you think? If this webrev seems good, I will file it 
>> to JBS and will send review request.
>>
>>     http://cr.openjdk.java.net/~ysuenaga/hv-detection/
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Cheers,
>>> David
>>>
>>> On 25/07/2020 3:51 pm, Yasumasa Suenaga wrote:
>>>> Hi all,
>>>>
>>>> When I got hs_err log on Windows, I saw "HyperV virtualization 
>>>> detected" in it in spite of running on host OS.
>>>>
>>>> I tried to file it to JBS and to fix it, but I have some questions 
>>>> for this.
>>>> (This feature has been introduced in JDK-8219241)
>>>>
>>>>
>>>>     - According to [1] (it is mentioned in the source code), we need
>>>> to check bit 31 in ECX when CPUID is called with EAX = 1h. Why it
>>>> would not do so?
>>>>
>>>>     - Why would VM_Version::check_virtualizations() call CPUID with
>>>> 40000000h to 4000FF00h? 40000000h should be used if we want to get
>>>> vendor ID.
>>>>
>>>>     - Why VM_Version::check_virt_cpuid() is separated for GNU C
>>>> (GAS) and MacroAssembler? I guess we can use MacroAssembler for x86
>>>> / x86_64.
>>>>
>>>>     - In case of Hyper-V, host OS is treated as root partition [2],
>>>> so we cannot use this CPUID solution for Hyper-V. I guess we need to
>>>> check it with other solutions like [3].
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] https://kb.vmware.com/s/article/1009458
>>>> [2] 
>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>>
>>>> [3] 
>>>> https://stackoverflow.com/questions/10544498/detect-the-virtualization-layer-from-a-guest-instancevm-vpc-or-hyper-v-in-c 
>>>>

From suenaga at oss.nttdata.com  Mon Jul 27 07:20:05 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 16:20:05 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
Message-ID: <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>

On 2020/07/27 16:09, David Holmes wrote:
> On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
>> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>>> Hi David,
>>>
>>> On 2020/07/27 14:02, David Holmes wrote:
>>>>
>>>>
>>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this change:
>>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>>
>>>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>>>
>>>>> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
>>>>>
>>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>>>
>>>> That VMWare document is not a "spec" for anything other than VMware. So this may work for VMWare:
>>>>
>>>>    a->movl(rax, 0x40000000);
>>>>
>>>> but may not work for all other HV environments - which is why the original code checks a range of addresses within the reserved area. See this related code for example:
>>>>
>>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD
>>>>
>>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>>     *
>>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>>     * must appear in slot 0x40000000, but they will also have the true
>>>>     * hypervisor in a higher slot.
>>>>
>>>> You have to be able to check this on a range of HV's to ensure you have not broken anything.
>>>
>>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>>> We can distinguish them from CPUID with 40000000h. So we should not check other than 40000000h.
>>>
>>>     VMware: https://kb.vmware.com/s/article/1009458
>>>    Hyper-V: https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>
>> Sorry, Hyper-V spec is here:
>>
>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs
>>
>> In Windows Server 2019, you can see CPUID in 2.4.1 .
>>
>>
>>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>        Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html
>>>
>>>
>>>> Did you actually diagnose why the existing code mis-detects Hyper-V under Windows?
>>>
>>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is installed. I guess it is caused by Hyper-V architecture.
> 
> But do we actually check the enabled bit:
> 
> "Bit 31 returned in ECX is defined as Not Used, and will always return 0 from the physical CPU. A hypervisor conformant with the Microsoft hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to indicate its presence to software."
> 
> ?

CPUID with EAX = 40000000h would be handled by the vCPU, so it is worth checking bit 31.
The KVM documentation implies this.

   https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
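
The ordering suggested here (look at bit 31 first, and only then trust the
vendor leaf) can be sketched as below. This is an illustration under
assumptions, not the code in the webrev: the Hypervisor enum and both
function names are hypothetical, the vendor signature is assumed to have
been read from leaf 0x40000000 already with trailing NUL padding removed,
and the sketch does not by itself address the Hyper-V root partition case
discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative result type covering the four hypervisors named in this thread.
enum class Hypervisor { None, VMware, HyperV, KVM, Xen, Unknown };

// Maps a vendor signature (as reported by CPUID leaf 0x40000000, with any
// trailing NUL padding already stripped) to a known hypervisor.
Hypervisor classify_signature(const std::string& signature) {
  assert(signature.size() <= 12);  // EBX + ECX + EDX hold at most 12 bytes
  if (signature == "VMwareVMware") return Hypervisor::VMware;
  if (signature == "Microsoft Hv") return Hypervisor::HyperV;
  if (signature == "KVMKVMKVM")    return Hypervisor::KVM;
  if (signature == "XenVMMXenVMM") return Hypervisor::Xen;
  return Hypervisor::Unknown;
}

// Consults the vendor signature only when CPUID.1:ECX bit 31 is set;
// otherwise reports that no hypervisor is present.
Hypervisor detect(std::uint32_t leaf1_ecx, const std::string& signature) {
  const bool present = (leaf1_ecx & (1u << 31)) != 0;
  return present ? classify_signature(signature) : Hypervisor::None;
}
```

With this ordering, a stale or meaningless value in the 0x40000000 leaf on
bare metal is never interpreted, because the signature is ignored whenever
bit 31 is clear.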


Thanks,

Yasumasa


> Thanks,
> David
> -----
> 
>>> According to [1], root partition is as a host OS, so I guess JVM would detect which is running on Hyper-V even if it is running on host OS.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> [1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>
>>>
>>>> David
>>>> -----
>>>>
>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>    - Does not support x86 (32bit) platform
>>>>>
>>>>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>>>>
>>>>>    - Windows x64 (host)
>>>>>    - Windows x64 (Hyper-V guest)
>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1] https://kb.vmware.com/s/article/1009458

From david.holmes at oracle.com  Mon Jul 27 07:48:25 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 17:48:25 +1000
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
Message-ID: <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>

On 27/07/2020 5:20 pm, Yasumasa Suenaga wrote:
> On 2020/07/27 16:09, David Holmes wrote:
>> On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
>>> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>>>> Hi David,
>>>>
>>>> On 2020/07/27 14:02, David Holmes wrote:
>>>>>
>>>>>
>>>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this change:
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>>>    webrev:
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>>>
>>>>>> When I got hs_err log on Windows, I saw "HyperV virtualization 
>>>>>> detected" in it in spite of running on host OS.
>>>>>>
>>>>>> Hypervisor detector has been introduced in JDK-8219241, but it has 
>>>>>> some problems as below:
>>>>>>
>>>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>>>    - Call CPUID with other than EAX = 40000000h (it is not
>>>>>> described in the spec [1])
>>>>>
>>>>> That VMWare document is not a "spec" for anything other than 
>>>>> VMware. So this may work for VMWare:
>>>>>
>>>>>    a->movl(rax, 0x40000000);
>>>>>
>>>>> but may not work for all other HV environments - which is why the 
>>>>> original code checks a range of addresses within the reserved area. 
>>>>> See this related code for example:
>>>>>
>>>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD 
>>>>>
>>>>>
>>>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>>>     *
>>>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>>>     * must appear in slot 0x40000000, but they will also have the true
>>>>>     * hypervisor in a higher slot.
>>>>>
>>>>> You have to be able to check this on a range of HV's to ensure you 
>>>>> have not broken anything.
>>>>
>>>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>>>> We can distinguish them from CPUID with 40000000h. So we should not 
>>>> check other than 40000000h.
>>>>
>>>>     VMware: https://kb.vmware.com/s/article/1009458
>>>>    Hyper-V:
>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>>
>>>
>>> Sorry, Hyper-V spec is here:
>>>
>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs 
>>>
>>>
>>> In Windows Server 2019, you can see CPUID in 2.4.1 .
>>>
>>>
>>>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>>        Xen:
>>>> https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html 
>>>>
>>>>
>>>>
>>>>> Did you actually diagnose why the existing code mis-detects Hyper-V 
>>>>> under Windows?
>>>>
>>>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is 
>>>> installed. I guess it is caused by Hyper-V architecture.
>>
>> But do we actually check the enabled bit:
>>
>> "Bit 31 returned in ECX is defined as Not Used, and will always return 
>> 0 from the physical CPU. A hypervisor conformant with the Microsoft 
>> hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to indicate its 
>> presence to software."
>>
>> ?
> 
> CPUID with EAX = 40000000h would be handled in vCPU, so it is worth to 
> check bit 31.
> KVM document implies it.
> 
>    https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html

Let me try this again. This bug report was created because the hypervisor
detection code is falsely reporting we are executing in Hyper-V under
Windows even when we are actually running a non-virtualized native OS.
My question about the original code is what part of it, exactly, is
responsible for that false report?

Thanks,
David

> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Thanks,
>> David
>> -----
>>
>>>> According to [1], root partition is as a host OS, so I guess JVM 
>>>> would detect which is running on Hyper-V even if it is running on 
>>>> host OS.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> [1] 
>>>> https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>>
>>>>
>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>>    - Does not support x86 (32bit) platform
>>>>>>
>>>>>> I've tested this change on submit repo, and have checked output
>>>>>> from VM.info jcmd on following environment:
>>>>>>
>>>>>>    - Windows x64 (host)
>>>>>>    - Windows x64 (Hyper-V guest)
>>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> [1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 08:05:53 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 17:05:53 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
Message-ID: <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>

On 2020/07/27 16:48, David Holmes wrote:
> On 27/07/2020 5:20 pm, Yasumasa Suenaga wrote:
>> On 2020/07/27 16:09, David Holmes wrote:
>>> On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
>>>> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>>>>> Hi David,
>>>>>
>>>>> On 2020/07/27 14:02, David Holmes wrote:
>>>>>>
>>>>>>
>>>>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change:
>>>>>>>
>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>>>>
>>>>>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>>>>>
>>>>>>> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
>>>>>>>
>>>>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>>>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>>>>>
>>>>>> That VMWare document is not a "spec" for anything other than VMware. So this may work for VMWare:
>>>>>>
>>>>>>    a->movl(rax, 0x40000000);
>>>>>>
>>>>>> but may not work for all other HV environments - which is why the original code checks a range of addresses within the reserved area. See this related code for example:
>>>>>>
>>>>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD
>>>>>>
>>>>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>>>>     *
>>>>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>>>>     * must appear in slot 0x40000000, but they will also have the true
>>>>>>     * hypervisor in a higher slot.
>>>>>>
>>>>>> You have to be able to check this on a range of HV's to ensure you have not broken anything.
>>>>>
>>>>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>>>>> We can distinguish them from CPUID with 40000000h. So we should not check other than 40000000h.
>>>>>
>>>>>     VMware: https://kb.vmware.com/s/article/1009458
>>>>>    Hyper-V: https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>>
>>>> Sorry, Hyper-V spec is here:
>>>>
>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs
>>>>
>>>> In Windows Server 2019, you can see CPUID in 2.4.1 .
>>>>
>>>>
>>>>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>>>        Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html
>>>>>
>>>>>
>>>>>> Did you actually diagnose why the existing code mis-detects Hyper-V under Windows?
>>>>>
>>>>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is installed. I guess it is caused by Hyper-V architecture.
>>>
>>> But do we actually check the enabled bit:
>>>
>>> "Bit 31 returned in ECX is defined as Not Used, and will always return 0 from the physical CPU. A hypervisor conformant with the Microsoft hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to indicate its presence to software."
>>>
>>> ?
>>
>> CPUID with EAX = 40000000h would be handled in vCPU, so it is worth to check bit 31.
>> KVM document implies it.
>>
>>    https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
> 
> Let me try this again. This bug report was created because the hypervisor detection code is falsely reporting we are executing in Hyper-V under Windows even when we are actually running a non-virtualized native OS. My question about the original code is what part of it, exactly, is responsible for that false report?

VM_Version::check_virtualizations() checks whether the process is running on a virtual machine by calling CPUID (EAX = 40000000h to 4000ff00h). However, it detects Hyper-V (CPUID returns "Microsoft Hv") even if the process is running on the host OS. That case should be checked in another way (e.g. WMI).
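
To make the failure mode concrete, here is a minimal sketch of a
signature-only scan of that leaf range. It is not the HotSpot
implementation: CpuidRegs, CpuidFn and both function names are
hypothetical, the CPUID instruction is replaced by a caller-supplied
callback, and the 0x100 step between leaves is an assumption. Because the
scan looks only at the vendor string, it gives the same answer for a guest
and for a host OS that reports "Microsoft Hv".

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>

// Register values returned by one CPUID invocation.
struct CpuidRegs { std::uint32_t eax, ebx, ecx, edx; };

// Hypothetical callback standing in for the real CPUID instruction.
using CpuidFn = std::function<CpuidRegs(std::uint32_t leaf)>;

// Rebuilds the vendor signature from EBX, ECX, EDX (in that order) and
// drops trailing NUL padding. Assumes a little-endian host, as on x86.
std::string vendor_signature(const CpuidRegs& r) {
  char buf[12];
  std::memcpy(buf + 0, &r.ebx, 4);
  std::memcpy(buf + 4, &r.ecx, 4);
  std::memcpy(buf + 8, &r.edx, 4);
  std::string sig(buf, sizeof(buf));
  while (!sig.empty() && sig.back() == '\0') sig.pop_back();
  return sig;
}

// Scans leaves 0x40000000 .. 0x4000ff00 and reports whether any of them
// carries the "Microsoft Hv" signature. The 0x100 step is an assumption.
bool sees_hyperv_signature(const CpuidFn& cpuid) {
  assert(cpuid && "a CPUID callback is required");
  for (std::uint32_t leaf = 0x40000000u; leaf <= 0x4000ff00u; leaf += 0x100u) {
    if (vendor_signature(cpuid(leaf)) == "Microsoft Hv") return true;
  }
  return false;
}
```

Nothing in this scan can tell a root partition from a guest partition, so
any such distinction would have to come from a separate source of
information.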

This is a problem in Hyper-V, but I found some related issues in the hypervisor detection code, so I want to fix them together:

   - Lack of x86 (32 bit) support
   - Lack of hypervisor present bit (bit 31) check
   - Calling CPUID with other than EAX = 40000000h

If they are causing confusion, I can separate them into other issues.
(They are not Hyper-V specific issues.)


Thanks,

Yasumasa


> Thanks,
> David
> 
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>> According to [1], root partition is as a host OS, so I guess JVM would detect which is running on Hyper-V even if it is running on host OS.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>>>
>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>>>    - Does not support x86 (32bit) platform
>>>>>>>
>>>>>>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>>>>>>
>>>>>>>    - Windows x64 (host)
>>>>>>>    - Windows x64 (Hyper-V guest)
>>>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> [1] https://kb.vmware.com/s/article/1009458

From david.holmes at oracle.com  Mon Jul 27 08:15:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 18:15:35 +1000
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
 <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
Message-ID: <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>

On 27/07/2020 6:05 pm, Yasumasa Suenaga wrote:
> On 2020/07/27 16:48, David Holmes wrote:
>> On 27/07/2020 5:20 pm, Yasumasa Suenaga wrote:
>>> On 2020/07/27 16:09, David Holmes wrote:
>>>> On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
>>>>> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> On 2020/07/27 14:02, David Holmes wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Please review this change:
>>>>>>>>
>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>>>>>    webrev:
>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>>>>>
>>>>>>>> When I got hs_err log on Windows, I saw "HyperV virtualization 
>>>>>>>> detected" in it in spite of running on host OS.
>>>>>>>>
>>>>>>>> Hypervisor detector has been introduced in JDK-8219241, but it 
>>>>>>>> has some problems as below:
>>>>>>>>
>>>>>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>>>>>    - Call CPUID with other than EAX = 40000000h (it is not
>>>>>>>> described in the spec [1])
>>>>>>>
>>>>>>> That VMWare document is not a "spec" for anything other than 
>>>>>>> VMware. So this may work for VMWare:
>>>>>>>
>>>>>>>    a->movl(rax, 0x40000000);
>>>>>>>
>>>>>>> but may not work for all other HV environments - which is why the 
>>>>>>> original code checks a range of addresses within the reserved 
>>>>>>> area. See this related code for example:
>>>>>>>
>>>>>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD 
>>>>>>>
>>>>>>>
>>>>>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>>>>>     *
>>>>>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>>>>>     * must appear in slot 0x40000000, but they will also have the true
>>>>>>>     * hypervisor in a higher slot.
>>>>>>>
>>>>>>> You have to be able to check this on a range of HV's to ensure 
>>>>>>> you have not broken anything.
>>>>>>
>>>>>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>>>>>> We can distinguish them from CPUID with 40000000h. So we should 
>>>>>> not check other than 40000000h.
>>>>>>
>>>>>>     VMware: https://kb.vmware.com/s/article/1009458
>>>>>>    Hyper-V:
>>>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>>>>
>>>>>
>>>>> Sorry, Hyper-V spec is here:
>>>>>
>>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs 
>>>>>
>>>>>
>>>>> In Windows Server 2019, you can see CPUID in 2.4.1 .
>>>>>
>>>>>
>>>>>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>>>>        Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Did you actually diagnose why the existing code mis-detects 
>>>>>>> Hyper-V under Windows?
>>>>>>
>>>>>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is 
>>>>>> installed. I guess it is caused by Hyper-V architecture.
>>>>
>>>> But do we actually check the enabled bit:
>>>>
>>>> "Bit 31 returned in ECX is defined as Not Used, and will always 
>>>> return 0 from the physical CPU. A hypervisor conformant with the 
>>>> Microsoft hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to 
>>>> indicate its presence to software."
>>>>
>>>> ?
>>>
>>> CPUID with EAX = 40000000h would be handled in vCPU, so it is worth 
>>> checking bit 31.
>>> The KVM document implies it.
>>>
>>>    https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>
>> Let me try this again. This bug report was created because the 
>> hypervisor detection code is falsely reporting we are executing in 
>> Hyper-V under Windows even when we are actually running a 
>> non-virtualized native OS. My question about the original code is 
>> what part of it, exactly, is responsible for that false report?
> 
> VM_Version::check_virtualizations() checks whether the process is 
> running on a virtual machine with CPUID (EAX = 40000000h to 4000ff00h); 
> however, it detects Hyper-V (CPUID returns "Microsoft Hv") even if 
> the process is running on the host OS. This case should be checked 
> another way (e.g. WMI).
> 
> This is a problem in Hyper-V, but I found some related issues in the 
> hypervisor detection code, so I want to fix them together:
> 
>    - Lack of x86 (32 bit) support
>    - Lack of hypervisor present bit (bit 31) check

Okay so I'm going to assume that a system might fill out the CPUID leaf 
values unconditionally and expect the programmers to check bit 31 and 
only if it is enabled use the information from the leaf. Hence the 
current code reports Hyper-V when it isn't enabled. But the new code can 
just skip all the leaf reading when it isn't enabled - right? (so 
startup should be marginally faster)
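
A minimal sketch of the ordering described above, not the code from the webrev: check CPUID.1:ECX bit 31 first and query leaf 0x40000000 only when it is set. The names CpuidRegs, read_hypervisor_leaf and FakeCpu are hypothetical, and the CPUID instruction is replaced by a fake source so the sketch does not depend on the machine it runs on. The EAX value returned for leaf 0x40000000 is illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Register values produced by one CPUID query.
struct CpuidRegs {
  uint32_t eax, ebx, ecx, edx;
};

// CPUID.1:ECX bit 31 is the "hypervisor present" bit.
inline bool hypervisor_present(const CpuidRegs& leaf1) {
  return ((leaf1.ecx >> 31) & 1u) != 0;
}

// Reads leaf 0x40000000 only when the hypervisor-present bit is set.
// `cpuid` is any callable taking a leaf number and returning CpuidRegs
// (hypothetical hook; real code would execute the CPUID instruction).
template <typename Cpuid>
bool read_hypervisor_leaf(Cpuid& cpuid, CpuidRegs* out) {
  assert(out != nullptr);
  if (!hypervisor_present(cpuid(0x1u))) {
    return false;  // nothing advertised: skip the 0x4000xxxx range entirely
  }
  *out = cpuid(0x40000000u);
  return true;
}

// Fake CPUID source for illustration, so the sketch runs on any machine.
struct FakeCpu {
  uint32_t leaf1_ecx;  // ECX value reported for leaf 1
  int calls;           // number of queries issued so far
  explicit FakeCpu(uint32_t ecx) : leaf1_ecx(ecx), calls(0) {}
  CpuidRegs operator()(uint32_t leaf) {
    ++calls;
    if (leaf == 0x1u) return CpuidRegs{0, 0, leaf1_ecx, 0};
    // "Microsoft Hv" as it would appear in EBX, ECX, EDX of leaf 0x40000000.
    return CpuidRegs{0x4000000Bu, 0x7263694Du, 0x666F736Fu, 0x76482074u};
  }
};
```

With FakeCpu(0) only one query is issued; with FakeCpu(0x80000000) the leaf is read as well. As reported later in this thread, the bit also appears to be set in the Hyper-V root partition, so this check alone would not tell host and guest apart.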

Thanks,
David
-----

>    - Calling CPUID with other than EAX = 40000000h
> 
> If they are causing confusion, I can split them out into separate issues.
> (They are not Hyper-V specific issues)
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Thanks,
>> David
>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>> According to [1], the root partition acts as the host OS, so I guess 
>>>>>> the JVM detects that it is running on Hyper-V even when it is 
>>>>>> running on the host OS.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> [1] 
>>>>>> https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture 
>>>>>>
>>>>>>
>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>>>>    - Does not support x86 (32bit) platform
>>>>>>>>
>>>>>>>> I've tested this change on submit repo, and have checked output 
>>>>>>>> from VM.info jcmd on following environment:
>>>>>>>>
>>>>>>>>    - Windows x64 (host)
>>>>>>>>    - Windows x64 (Hyper-V guest)
>>>>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 08:33:58 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 17:33:58 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
 <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
 <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>
Message-ID: <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com>

On 2020/07/27 17:15, David Holmes wrote:
> On 27/07/2020 6:05 pm, Yasumasa Suenaga wrote:
>> On 2020/07/27 16:48, David Holmes wrote:
>>> On 27/07/2020 5:20 pm, Yasumasa Suenaga wrote:
>>>> On 2020/07/27 16:09, David Holmes wrote:
>>>>> On 27/07/2020 3:23 pm, Yasumasa Suenaga wrote:
>>>>>> On 2020/07/27 14:21, Yasumasa Suenaga wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> On 2020/07/27 14:02, David Holmes wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/07/2020 2:24 pm, Yasumasa Suenaga wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please review this change:
>>>>>>>>>
>>>>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>>>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>>>>>>>>
>>>>>>>>> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
>>>>>>>>>
>>>>>>>>> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
>>>>>>>>>
>>>>>>>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>>>>>>>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>>>>>>>
>>>>>>>> That VMWare document is not a "spec" for anything other than VMware. So this may work for VMWare:
>>>>>>>>
>>>>>>>>    a->movl(rax, 0x40000000);
>>>>>>>>
>>>>>>>> but may not work for all other HV environments - which is why the original code checks a range of addresses within the reserved area. See this related code for example:
>>>>>>>>
>>>>>>>> http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what-cpuid-helper.c;h=9c6cdb290105ca86868e2c7935ed42a55598b0f7;hb=HEAD
>>>>>>>>
>>>>>>>>    /* Most hypervisors only have information in leaf 0x40000000.
>>>>>>>>     *
>>>>>>>>     * Some hypervisors have "Viridian [HyperV] extensions", and those
>>>>>>>>     * must appear in slot 0x40000000, but they will also have the true
>>>>>>>>     * hypervisor in a higher slot.
>>>>>>>>
>>>>>>>> You have to be able to check this on a range of HV's to ensure you have not broken anything.
>>>>>>>
>>>>>>> Currently this feature supports VMware, Hyper-V, KVM, Xen.
>>>>>>> We can distinguish them from CPUID with 40000000h. So we should not check other than 40000000h.
>>>>>>>
>>>>>>>     VMware: https://kb.vmware.com/s/article/1009458
>>>>>>>    Hyper-V: https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>>>>
>>>>>> Sorry, Hyper-V spec is here:
>>>>>>
>>>>>> https://docs.microsoft.com/virtualization/hyper-v-on-windows/reference/tlfs
>>>>>>
>>>>>> In Windows Server 2019, you can see CPUID in 2.4.1 .
>>>>>>
>>>>>>
>>>>>>>        KVM: https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>>>>>        Xen: https://xenbits.xen.org/docs/unstable/hypercall/x86_32/include,public,arch-x86,cpuid.h.html
>>>>>>>
>>>>>>>
>>>>>>>> Did you actually diagnose why the existing code mis-detects Hyper-V under Windows?
>>>>>>>
>>>>>>> CPUID with EAX = 40000000h returns "Microsoft Hv" if Hyper-V is installed. I guess it is caused by Hyper-V architecture.
>>>>>
>>>>> But do we actually check the enabled bit:
>>>>>
>>>>> "Bit 31 returned in ECX is defined as Not Used, and will always return 0 from the physical CPU. A hypervisor conformant with the Microsoft hypervisor interface will set CPUID.1:ECX [bit 31] = 1 to indicate its presence to software."
>>>>>
>>>>> ?
>>>>
>>>> CPUID with EAX = 40000000h would be handled in vCPU, so it is worth checking bit 31.
>>>> The KVM document implies it.
>>>>
>>>>    https://www.kernel.org/doc/html/latest/virt/kvm/cpuid.html
>>>
>>> Let me try this again. This bug report was created because the hypervisor detection code is falsely reporting we are executing in Hyper-V under Windows even when we are actually running a non-virtualized native OS. My question about the original code is what part of it, exactly, is responsible for that false report?
>>
>> VM_Version::check_virtualizations() checks whether the process is running on a virtual machine with CPUID (EAX = 40000000h to 4000ff00h); however, it detects Hyper-V (CPUID returns "Microsoft Hv") even if the process is running on the host OS. This case should be checked another way (e.g. WMI).
>>
>> This is a problem in Hyper-V, but I found some related issues in the hypervisor detection code, so I want to fix them together:
>>
>>    - Lack of x86 (32 bit) support
>>    - Lack of hypervisor present bit (bit 31) check
> 
> Okay so I'm going to assume that a system might fill out the CPUID leaf values unconditionally and expect the programmers to check bit 31 and only if it is enabled use the information from the leaf. Hence the current code reports Hyper-V when it isn't enabled. But the new code can just skip all the leaf reading when it isn't enabled - right? (so startup should be marginally faster)

The current code reports Hyper-V even when it is running on the native OS (the root partition, not a guest OS).
In the root partition, bit 31 is always 1, so Hyper-V is always reported (CPUID always returns "Microsoft Hv").

To avoid this problem, the new code uses WMI to check whether it is running on a virtual machine.


> Thanks,
> David
> -----
> 
>>    - Calling CPUID with other than EAX = 40000000h
>>
>> If they are causing confusion, I can split them out into separate issues.
>> (They are not Hyper-V specific issues)
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>>> Thanks,
>>> David
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>>> According to [1], the root partition acts as the host OS, so I guess the JVM detects that it is running on Hyper-V even when it is running on the host OS.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> [1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>>>>>
>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>>>>>    - Does not support x86 (32bit) platform
>>>>>>>>>
>>>>>>>>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>>>>>>>>
>>>>>>>>>    - Windows x64 (host)
>>>>>>>>>    - Windows x64 (Hyper-V guest)
>>>>>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 08:42:21 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 17:42:21 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
 <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
 <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>
 <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com>
Message-ID: <a4eb2df7-9dff-59bd-90d4-e0713da22634@oss.nttdata.com>

On 2020/07/27 17:33, Yasumasa Suenaga wrote:

...snip...

>>> VM_Version::check_virtualizations() checks whether the process is running on a virtual machine with CPUID (EAX = 40000000h to 4000ff00h); however, it detects Hyper-V (CPUID returns "Microsoft Hv") even if the process is running on the host OS. This case should be checked another way (e.g. WMI).
>>>
>>> This is a problem in Hyper-V, but I found some related issues in the hypervisor detection code, so I want to fix them together:
>>>
>>>    - Lack of x86 (32 bit) support
>>>    - Lack of hypervisor present bit (bit 31) check
>>
>> Okay so I'm going to assume that a system might fill out the CPUID leaf values unconditionally and expect the programmers to check bit 31 and only if it is enabled use the information from the leaf. Hence the current code reports Hyper-V when it isn't enabled. But the new code can just skip all the leaf reading when it isn't enabled - right? (so startup should be marginally faster)
> 
> The current code reports Hyper-V even if it is running on native OS (root partition - not a guest OS).
> In root partition, bit 31 is always 1 and Hyper-V is always reported (CPUID always returns "Microsoft Hv")
> 
> To avoid this problem, the new code would check with WMI whether it is running on Virtual Machine or not.

I cannot verify this, but I guess bit 31 is 0 when Hyper-V is not installed.
If so, the new code skips all the leaf reading, as you said.


Thanks,

Yasumasa


>> Thanks,
>> David
>> -----
>>
>>>    - Calling CPUID with other than EAX = 40000000h
>>>
>>> If they are causing confusion, I can split them out into separate issues.
>>> (They are not Hyper-V specific issues)
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>>> Thanks,
>>>> David
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>> According to [1], the root partition acts as the host OS, so I guess the JVM detects that it is running on Hyper-V even when it is running on the host OS.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://docs.microsoft.com/ja-jp/virtualization/hyper-v-on-windows/reference/hyper-v-architecture
>>>>>>>>
>>>>>>>>
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>>    - Does not check CPUID hypervisor present bit [1]
>>>>>>>>>>    - Does not support x86 (32bit) platform
>>>>>>>>>>
>>>>>>>>>> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
>>>>>>>>>>
>>>>>>>>>>    - Windows x64 (host)
>>>>>>>>>>    - Windows x64 (Hyper-V guest)
>>>>>>>>>>    - Fedora32 x64 (Hyper-V guest)
>>>>>>>>>>    - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] https://kb.vmware.com/s/article/1009458

From fweimer at redhat.com  Mon Jul 27 08:42:20 2020
From: fweimer at redhat.com (Florian Weimer)
Date: Mon, 27 Jul 2020 10:42:20 +0200
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com> (Yasumasa
 Suenaga's message of "Mon, 27 Jul 2020 17:33:58 +0900")
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
 <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
 <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>
 <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com>
Message-ID: <87mu3l2vtf.fsf@oldenburg2.str.redhat.com>

* Yasumasa Suenaga:

> The current code reports Hyper-V even if it is running on native OS
> (root partition - not a guest OS).  In root partition, bit 31 is
> always 1 and Hyper-V is always reported (CPUID always returns
> "Microsoft Hv")
>
> To avoid this problem, the new code would check with WMI whether it is
> running on Virtual Machine or not.

But isn't it correct to say that the OS is running under virtualization
in this case?  It is running on a hypervisor, and not bare metal, after
all.

The KVM model, where you can run on the virtualization host itself
(pretty much like bare metal), is somewhat unusual and not even
supported directly by all CPU architectures.  Not all hypervisors use
this model.

Thanks,
Florian


From suenaga at oss.nttdata.com  Mon Jul 27 08:53:25 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 17:53:25 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <87mu3l2vtf.fsf@oldenburg2.str.redhat.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <5a23147f-cc05-e85d-d7ed-479a8096ca5f@oracle.com>
 <2f4e81a6-096b-205d-ff9d-123e23bec595@oss.nttdata.com>
 <94a3d854-ea93-bf75-0391-12dbb0a223f0@oss.nttdata.com>
 <6227e45b-e564-8760-3bf2-258818f1153a@oracle.com>
 <f5ead183-f7be-701a-b454-a561af3f90ba@oss.nttdata.com>
 <5d69c718-06b4-d510-4773-6f545c82dc70@oracle.com>
 <2f8e68dd-0f30-4b05-5144-7847590519aa@oss.nttdata.com>
 <d9d593cc-fba4-93ad-57a0-832ba9944481@oracle.com>
 <d870e6b8-38b4-7521-6b36-dbe4cb7a1ec3@oss.nttdata.com>
 <87mu3l2vtf.fsf@oldenburg2.str.redhat.com>
Message-ID: <d2b37c17-2d98-34dd-7ace-54b98a3ba07e@oss.nttdata.com>

On 2020/07/27 17:42, Florian Weimer wrote:
> * Yasumasa Suenaga:
> 
>> The current code reports Hyper-V even if it is running on native OS
>> (root partition - not a guest OS).  In root partition, bit 31 is
>> always 1 and Hyper-V is always reported (CPUID always returns
>> "Microsoft Hv")
>>
>> To avoid this problem, the new code would check with WMI whether it is
>> running on Virtual Machine or not.
> 
> But isn't it correct to say that the OS is running under virtualization
> in this case?  It is running on a hypervisor, and not bare metal, after
> all.

Technically you are right, but I think the hypervisor detector is expected to report whether the process is running on a guest OS, like the distinction between Dom0 and DomU in Xen. I think it would be useful information to get from customers for troubleshooting.

At least, the hs_err log generated on Windows confused me: why does it report Hyper-V when I have not installed Windows as a guest?


Thanks,

Yasumasa


> The KVM model, where you can run on the virtualization host itself
> (pretty much like bare metal), is somewhat unusual and not even
> supported directly by all CPU architectures.  Not all hypervisors use
> this model.
> 
> Thanks,
> Florian
> 

From matthias.baesken at sap.com  Mon Jul 27 10:36:43 2020
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Mon, 27 Jul 2020 10:36:43 +0000
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
Message-ID: <AM0PR02MB5412723053A1F4715A9387FE93720@AM0PR02MB5412.eurprd02.prod.outlook.com>

Hi Yasumasa, I put your patch into our internal nightly build/test queue.


>   - Hyper-V is detected on Windows in spite of running on host OS

So in your case it is a host that runs Hyper-V guests, and the host also reports "HyperV virtualization detected".
Do I understand you correctly?
Maybe we should be more specific in this case (however, I have not seen much of a problem with the current output so far).

>   - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])

I think we had some virtualization environments that needed CPUID checks with values other than EAX = 40000000h for detection,
but these may have been old or buggy versions; I cannot find many details about it at the moment.

>   - Does not check CPUID hypervisor present bit [1]

Yes, this could/should indeed be added (we even did this in our internal JVM a while ago).

>   - Does not support x86 (32bit) platform

This is true; it was not added because I think 32-bit support is not very important any more in the current JDK.
However, if the new version supports 32 bit too, that is certainly a good thing!

Best regards, Matthias


-----Original Message-----
From: Yasumasa Suenaga <suenaga at oss.nttdata.com> 
Sent: Montag, 27. Juli 2020 06:25
To: hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias <matthias.baesken at sap.com>
Cc: David Holmes <david.holmes at oracle.com>
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS

Hi all,

Please review this change:

   JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/

When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.

Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:

   - Hyper-V is detected on Windows in spite of running on host OS
   - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
   - Does not check CPUID hypervisor present bit [1]
   - Does not support x86 (32bit) platform

I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:

   - Windows x64 (host)
   - Windows x64 (Hyper-V guest)
   - Fedora32 x64 (Hyper-V guest)
   - 32 bit JDK on Fedora32 x64 (Hyper-V guest)


Thanks,

Yasumasa


[1] https://kb.vmware.com/s/article/1009458

From suenaga at oss.nttdata.com  Mon Jul 27 12:21:26 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 27 Jul 2020 21:21:26 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <AM0PR02MB5412723053A1F4715A9387FE93720@AM0PR02MB5412.eurprd02.prod.outlook.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <AM0PR02MB5412723053A1F4715A9387FE93720@AM0PR02MB5412.eurprd02.prod.outlook.com>
Message-ID: <05f4a122-6b17-37f9-5ee4-228704308a6f@oss.nttdata.com>

On 2020/07/27 19:36, Baesken, Matthias wrote:
> Hi Yasumasa, I put your patch into our internal  nightly  build/test  queue .

Thanks!

>>    - Hyper-V is detected on Windows in spite of running on host OS
> 
> So it is in your case a host that  runs  Hyper-V guests.  And the host  reports too  "HyperV virtualization detected" .
> Do I get you  right ?

Yes, that's right!

> Maybe we should be more specific in this case (however I did not see much problem so far with the  current output ).
> 
>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
> 
> I think we had  some virtualizations that needed checking  CPUID with other than EAX = 40000000h  for detection,
>   But might be these were old or buggy versions , cannot find much details about it atm .

As David showed, virt-what checks CPUID with values other than EAX = 40000000h, but at least the currently supported hypervisors (VMware, Hyper-V, KVM, Xen) do not seem to need it.
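
To illustrate distinguishing the four hypervisors from leaf 40000000h alone, here is a hedged sketch; the helper names are hypothetical and this is not HotSpot code. It assumes the vendor signature is returned in EBX, ECX, EDX and uses the commonly documented signature strings. A hypervisor that places its own leaves in a higher slot, as in the virt-what comment David quoted, would not be recognized by this single-leaf lookup.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Rebuilds the 12-byte vendor signature from EBX, ECX, EDX of CPUID leaf
// 0x40000000 (each register holds four characters, lowest byte first).
std::string vendor_signature(uint32_t ebx, uint32_t ecx, uint32_t edx) {
  const uint32_t regs[3] = {ebx, ecx, edx};
  std::string sig;
  for (int i = 0; i < 3; ++i) {
    for (int shift = 0; shift < 32; shift += 8) {
      sig.push_back(static_cast<char>((regs[i] >> shift) & 0xFFu));
    }
  }
  assert(sig.size() == 12);
  return sig;
}

// Maps a 12-byte signature to a display name (illustrative mapping only).
const char* hypervisor_name(const std::string& sig) {
  if (sig == "VMwareVMware") return "VMware";
  if (sig == "Microsoft Hv") return "Hyper-V";
  // The KVM signature is "KVMKVMKVM" padded with three NUL bytes.
  if (sig.compare(0, 9, "KVMKVMKVM") == 0) return "KVM";
  if (sig == "XenVMMXenVMM") return "Xen";
  return "unknown";
}
```

For example, the register values 0x7263694D, 0x666F736F, 0x76482074 decode to "Microsoft Hv".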

>>    - Does not check CPUID hypervisor present bit [1]
> 
> Yes, this could/should indeed be added  ( we even did this in our internal JVM a while ago).
> 
>>    - Does not support x86 (32bit) platform
> 
> This is true , it was not added because   I think 32bit support has not much importance any more in current jdk .
> However in case the new version does  support 32bit too, it is for sure a good thing!

I found "TODO support 32 bit" in the current code, so I am attempting to fix it :)


Thanks,

Yasumasa


> Best regards, Matthias
> 
> 
> 
> -----Original Message-----
> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
> Sent: Montag, 27. Juli 2020 06:25
> To: hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias <matthias.baesken at sap.com>
> Cc: David Holmes <david.holmes at oracle.com>
> Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
> 
> Hi all,
> 
> Please review this change:
> 
>     JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>     webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
> 
> When I got hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on host OS.
> 
> Hypervisor detector has been introduced in JDK-8219241, but it has some problems as below:
> 
>     - Hyper-V is detected on Windows in spite of running on host OS
>     - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>     - Does not check CPUID hypervisor present bit [1]
>     - Does not support x86 (32bit) platform
> 
> I've tested this change on submit repo, and have checked output from VM.info jcmd on following environment:
> 
>     - Windows x64 (host)
>     - Windows x64 (Hyper-V guest)
>     - Fedora32 x64 (Hyper-V guest)
>     - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> [1] https://kb.vmware.com/s/article/1009458
> 

From gerard.ziemski at oracle.com  Mon Jul 27 16:08:20 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Mon, 27 Jul 2020 11:08:20 -0500
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <4dfca747-833e-5ee7-0e7f-36b4958d464b@oss.nttdata.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <4dfca747-833e-5ee7-0e7f-36b4958d464b@oss.nttdata.com>
Message-ID: <b8b9fa60-08ba-a421-03f2-baa5b22248a8@oracle.com>

hi Yasumasa,

On 7/17/20 7:19 PM, Yasumasa Suenaga wrote:
> Hi Gerard,
>
> I cannot review it because I do not have a Mac and am not familiar 
> with it, but I have some comments.
>
>
>   - You set OS name to `os` with strncpy(), but can you use #define 
> for them? For example:
>       #ifdef __APPLE__
>       #define OSNAME "Darwin"
>       #elif defined __OpenBSD__
>       #define OSNAME "OpenBSD"
>       #else
>       #define OSNAME "BSD"
>       #endif
>
>         :
>
>        snprintf(buf, buflen, OSNAME " %s, macOS %s", release, osproductversion);
>
>
>   - You can replace strncpy() with a direct write of '\0'
>       strncpy(release, "", sizeof(release));  ->  release[0] = '\0';
Thank you for taking a look, but the code you are referencing is 
existing code, and even though it could be improved, I'd rather not 
modify it in this fix.
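
For reference, a minimal sketch of the #define approach quoted above, using hypothetical names (format_os_summary, the 128-byte buffer) that are not in the webrev. Adjacent string literals are concatenated by the compiler, so OSNAME becomes part of the format string and no separate `os` buffer is needed.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#ifdef __APPLE__
#define OSNAME "Darwin"
#elif defined(__OpenBSD__)
#define OSNAME "OpenBSD"
#else
#define OSNAME "BSD"
#endif

// Formats e.g. "Darwin 19.5.0, macOS 10.15.5" on a Mac. OSNAME is pasted
// onto the format string at compile time, so only two arguments remain.
std::string format_os_summary(const char* release, const char* product_version) {
  assert(release != nullptr && product_version != nullptr);
  char buf[128];
  std::snprintf(buf, sizeof(buf), OSNAME " %s, macOS %s", release, product_version);
  return std::string(buf);
}
```

On a non-Apple build the prefix would be "OpenBSD" or "BSD"; the ", macOS" suffix is kept here only to mirror the quoted line.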


cheers

From gerard.ziemski at oracle.com  Mon Jul 27 16:12:18 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Mon, 27 Jul 2020 11:12:18 -0500
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
Message-ID: <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>

Thank you David for taking a look.


On 7/19/20 11:37 PM, David Holmes wrote:
> Hi Gerard,
>
> On 18/07/2020 5:19 am, gerard ziemski wrote:
>> Hi all,
>>
>> Please review this small fix that adds the OS version and the OS 
>> build number to the hs_err_pidXXX.log output in the "Summary" section 
>> for Mac platform (it's easier to use for developers than the Darwin 
>> kernel version that we display right now).
>>
>> This is how things used to look:
>>
>>
>> --------------- S U M M A R Y ------------
>>
>> Command Line: Crasher
>>
>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
>> Darwin 19.5.0
>> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds (0d 
>> 0h 0m 1s)
>>
>>
>> And this is how the "Summary" section looks with the proposed 
>> change:
>>
>>
>> --------------- S U M M A R Y ------------
>>
>> Command Line: Crasher
>>
>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
>> Darwin 19.5.0, macOS 10.15.5 (19F101)
>> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds (0d 
>> 0h 0m 0s)
>>
>>
>> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
>> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
>> testing Mach5 hs_tier1,2,3,4,5 in progress
>
> Just to be clear, the changes prior to:
>
> 1555 #ifdef __APPLE__
>
> are just fixing up existing indentation errors - correct?

Yes, hope that's OK, as this was the only spot in the function that 
stood out with inconsistent indentation.

>
> The actual change seems okay, just one query:
>
> 1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };
>
> I couldn't find KERN_OSVERSION documented for sysctl - is it a 
> "recent" addition?

Yes it is. Apple added it back in 2018 (see bug comments or this link 
https://github.com/apple/darwin-xnu/commit/5bbb823c13f3ab1ab58878f96b35433a29882676?diff=split#diff-6651b0c84a045f400bc45faa9f61c9e1 
)


cheers

From andrei.pangin at gmail.com  Mon Jul 27 17:51:48 2020
From: andrei.pangin at gmail.com (Andrei Pangin)
Date: Mon, 27 Jul 2020 20:51:48 +0300
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
Message-ID: <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>

Hi,

Coleen, Thomas, thank you for your reviews. It turns out though that my new
test times out on debug builds (thanks Volker for double-checking).

Test: runtime/MemberName/ResolvedMethodTableHash.java
Tier: tier1
Platform: linux-aarch64-debug
Description: Error: Program `...java' timed out (timeout set to ...ms,
elapsed time including timeout handling was ...ms).

Test: runtime/MemberName/ResolvedMethodTableHash.java
Tier: tier1
Platform: windows-x64-debug
Description: Error: Program
`c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
timed out (timeout set to ...ms, elapsed time including timeout handling
was ...ms).

Test: runtime/MemberName/ResolvedMethodTableHash.java
Tier: tier1
Platform: macosx-x64-debug
Description: Error: Program `...java' timed out (timeout set to ...ms,
elapsed time including timeout handling was ...ms).

Apparently, we can't rely on timing, since this test runs two orders of
magnitude slower on debug JVM than on product one. I could simply mark the
test as manual, or is there a better idea? Maybe, adjust the parameters and
run the test automatically only on product or only on debug JVM? What do
you think?

Regards,
Andrei

On Sat, 25 Jul 2020 at 17:34, <coleen.phillimore at oracle.com> wrote:

>
> Hi Andrei,
> This looks good.  Thank you for finding this bug.  And thanks to Volker
> for sponsoring it as well.
> Nice to see you on the list, Andrei!
> Coleen
>
> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>
> Hi, Andrei,
>
> Good find. I played around with a test of generating lots of lambdas and
> yes, all the hashes are equal. With your patch invocation time went down by
> half (that was for 10000 lambdas).
>
> The test looks fine though the normal way to do this seems to be jcod. I
> personally don't care since the test is nice and self contained that way,
> but someone from the Oracle runtime group should confirm this is fine
> (ccing Coleen).
>
> JDK11 seems to be affected too.
>
> This probably also affects jruby.
>
> +1 from me.
>
> ..Thomas
>
> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
> wrote:
>
>> Hi,
>>
>> Please review a small fix to a not-so-small performance issue that we've
>> seen when migrating a production application from JDK 8 to JDK 14.
>>
>> On certain workloads, where Nashorn produces thousands of MethodHandles,
>> ResolvedMethodTable operations become extremely slow due to degenerate
>> hashcode. This patch basically fixes hashcode by including the method
>> holder's name in the computation. More details in the bug report.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>
>> Tested: tier1-2, hotspot*runtime
>>
>> I'll be glad if someone could sponsor the patch.
>>
>> Thank you,
>> Andrei Pangin
>>
>
>

From thomas.stuefe at gmail.com  Mon Jul 27 18:04:36 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 27 Jul 2020 20:04:36 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
Message-ID: <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>

Hi Andrei,

I was afraid this might happen, but only occurred to me after the review.

My proposal would be:

- in resolvedMethodTable.cpp, extend the log message in log_insert() and
print out the hashcode too (e.g. "ResolvedMethod entry added for ...
(4711)").

- In the test, start your test as a sub process, give it
-Xlog:membername+table=debug, and scan the output line for "ResolvedMethod
entry added for <your class name> (<hash>)". Extract the hash. Read two or
three lines and compare the hash.

The advantage would be that you do not need to load 200k classes to see
that your regression test works, and it is timing independent. For examples
of sub process spawning and output scanning, there are many jtreg tests
already (e.g. runtime/ErrorHandling has a few).
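
A minimal sketch of the output-scanning step described above. It assumes the
log line ends with the hash in parentheses, which is only the format proposed
in this mail and not what log_insert() prints today. The class name, the
method names and the sample lines are made up for illustration. A real jtreg
test would more likely get the lines from a spawned JVM run with
-Xlog:membername+table=debug through the test library's process helpers.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResolvedMethodLogScan {
    // Hypothetical line format: "... ResolvedMethod entry added for <method> (<hash>)"
    private static final Pattern ENTRY =
        Pattern.compile("ResolvedMethod entry added for (.+?) \\((\\d+)\\)\\s*$");

    // Collects the hash of every logged entry whose method name starts with the prefix.
    static List<Long> extractHashes(List<String> lines, String methodPrefix) {
        List<Long> hashes = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find() && m.group(1).startsWith(methodPrefix)) {
                hashes.add(Long.parseLong(m.group(2)));
            }
        }
        return hashes;
    }

    // True if no two extracted hashes are equal.
    static boolean allDistinct(List<Long> hashes) {
        return new HashSet<>(hashes).size() == hashes.size();
    }

    public static void main(String[] args) {
        // Made-up sample output; a real test would read the child JVM's stdout.
        List<String> sample = List.of(
            "[debug][membername,table] ResolvedMethod entry added for Gen0.run (4711)",
            "[debug][membername,table] ResolvedMethod entry added for Gen1.run (4712)",
            "[debug][membername,table] ResolvedMethod entry added for Other.run (4711)");
        List<Long> hashes = extractHashes(sample, "Gen");
        System.out.println(hashes + " distinct=" + allDistinct(hashes));
    }
}
```

Comparing two or three extracted hashes is enough to detect the degenerate
case, because with the old hash function all of them would be equal.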

Just my 5 cent. Maybe the others have different proposals.

Cheers, Thomas


On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin <andrei.pangin at gmail.com>
wrote:

> Hi,
>
> Coleen, Thomas, thank you for your reviews. It turns out though that my
> new test times out on debug builds (thanks Volker for double-checking).
>
> Test Tier Platform Description
> runtime/MemberName/ResolvedMethodTableHash.java tier1 linux-aarch64-debug Error:
> Program `...java' timed out (timeout set to ...ms, elapsed time including
> timeout handling was ...ms).
> runtime/MemberName/ResolvedMethodTableHash.java tier1 windows-x64-debug Error:
> Program
> `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
> timed out (timeout set to ...ms, elapsed time including timeout handling
> was ...ms).
> runtime/MemberName/ResolvedMethodTableHash.java tier1 macosx-x64-debug Error:
> Program `...java' timed out (timeout set to ...ms, elapsed time including
> timeout handling was ...ms).
>
> Apparently, we can't rely on timing, since this test runs two orders of
> magnitude slower on debug JVM than on product one. I could simply mark the
> test as manual, or is there a better idea? Maybe, adjust the parameters and
> run the test automatically only on product or only on debug JVM? What do
> you think?
>
> Regards,
> Andrei
>
> On Sat, Jul 25, 2020 at 17:34, <coleen.phillimore at oracle.com> wrote:
>
>>
>> Hi Andrei,
>> This looks good.  Thank you for finding this bug.  And thanks to Volker
>> for sponsoring it as well.
>> Nice to see you on the list, Andrei!
>> Coleen
>>
>> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>
>> Hi, Andrei,
>>
>> Good find. I played around with a test of generating lots of lambdas and
>> yes, all the hashes are equal. With your patch invocation time went down by
>> half (that was for 10000 lambdas).
>>
>> The test looks fine though the normal way to do this seems to be jcod. I
>> personally don't care since the test is nice and self contained that way,
>> but someone from the Oracle runtime group should confirm this is fine
>> (ccing Coleen).
>>
>> JDK11 seems to be affected too.
>>
>> This probably also affects jruby.
>>
>> +1 from me.
>>
>> ..Thomas
>>
>> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Please review a small fix to a not-so-small performance issue that we've
>>> seen when migrating a production application from JDK 8 to JDK 14.
>>>
>>> On certain workloads, where Nashorn produces thousands MethodHandles,
>>> ResolvedMethodTable operations become extremely slow due to degenerate
>>> hashcode. This patch basically fixes hashcode by including the method
>>> holder's name in the computation. More details in the bug report.
>>>
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>
>>> Tested: tier1-2, hotspot*runtime
>>>
>>> I'll be glad if someone could sponsor the patch.
>>>
>>> Thank you,
>>> Andrei Pangin
>>>
>>
>>

From volker.simonis at gmail.com  Mon Jul 27 18:33:42 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 27 Jul 2020 20:33:42 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
Message-ID: <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>

I like this solution although it introduces a hard dependency on the exact
log format. But that's probably not a big issue.

Another possibility would probably be to add a gtest.

But I'd also be happy to make the current test a manual test.

Let's see what Coleen's opinion is. We can use her as a tiebreaker :)

Otherwise I'll leave it up to Andrei to choose his preferred option :)


On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Andrei,
>
> I was afraid this might happen, but only occurred to me after the review.
>
> My proposal would be:
>
> - in resolvedMethodTable.cpp, extend the log message in log_insert() and
> print out the hashcode too (e.g. "ResolvedMethod entry added for ...
> (4711)").
>
> - In the test, start your test as a sub process, give it
> -Xlog:membername+table=debug, and scan the output line for "ResolvedMethod
> entry added for <your class name> (<hash>)". Extract the hash. Read two or
> three lines and compare the hash.
>
> The advantage would be that you do not need to load 200k classes to see
> that your regression test works, and it is timing independent. For examples
> of sub process spawning and output scanning, there are many jtreg tests
> already (e.g. runtime/ErrorHandling has a few).
>
> Just my 5 cent. Maybe the others have different proposals.
>
> Cheers, Thomas
>
>
>
>
> On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin <andrei.pangin at gmail.com>
> wrote:
>
>> Hi,
>>
>> Coleen, Thomas, thank you for your reviews. It turns out though that my
>> new test times out on debug builds (thanks Volker for double-checking).
>>
>> Test Tier Platform Description
>> runtime/MemberName/ResolvedMethodTableHash.java tier1 linux-aarch64-debug Error:
>> Program `...java' timed out (timeout set to ...ms, elapsed time including
>> timeout handling was ...ms).
>> runtime/MemberName/ResolvedMethodTableHash.java tier1 windows-x64-debug Error:
>> Program
>> `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>> timed out (timeout set to ...ms, elapsed time including timeout handling
>> was ...ms).
>> runtime/MemberName/ResolvedMethodTableHash.java tier1 macosx-x64-debug Error:
>> Program `...java' timed out (timeout set to ...ms, elapsed time including
>> timeout handling was ...ms).
>>
>> Apparently, we can't rely on timing, since this test runs two orders of
>> magnitude slower on debug JVM than on product one. I could simply mark the
>> test as manual, or is there a better idea? Maybe, adjust the parameters and
>> run the test automatically only on product or only on debug JVM? What do
>> you think?
>>
>> Regards,
>> Andrei
>>
>> On Sat, Jul 25, 2020 at 17:34, <coleen.phillimore at oracle.com> wrote:
>>
>>>
>>> Hi Andrei,
>>> This looks good.  Thank you for finding this bug.  And thanks to Volker
>>> for sponsoring it as well.
>>> Nice to see you on the list, Andrei!
>>> Coleen
>>>
>>> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>>
>>> Hi, Andrei,
>>>
>>> Good find. I played around with a test of generating lots of lambdas and
>>> yes, all the hashes are equal. With your patch invocation time went down by
>>> half (that was for 10000 lambdas).
>>>
>>> The test looks fine though the normal way to do this seems to be jcod. I
>>> personally don't care since the test is nice and self contained that way,
>>> but someone from the Oracle runtime group should confirm this is fine
>>> (ccing Coleen).
>>>
>>> JDK11 seems to be affected too.
>>>
>>> This probably also affects jruby.
>>>
>>> +1 from me.
>>>
>>> ..Thomas
>>>
>>> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Please review a small fix to a not-so-small performance issue that we've
>>>> seen when migrating a production application from JDK 8 to JDK 14.
>>>>
>>>> On certain workloads, where Nashorn produces thousands MethodHandles,
>>>> ResolvedMethodTable operations become extremely slow due to degenerate
>>>> hashcode. This patch basically fixes hashcode by including the method
>>>> holder's name in the computation. More details in the bug report.
>>>>
>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>>> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>>
>>>> Tested: tier1-2, hotspot*runtime
>>>>
>>>> I'll be glad if someone could sponsor the patch.
>>>>
>>>> Thank you,
>>>> Andrei Pangin
>>>>
>>>
>>>

From coleen.phillimore at oracle.com  Mon Jul 27 18:37:00 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 27 Jul 2020 14:37:00 -0400
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
Message-ID: <79ab40c3-c0f6-164e-1467-8f65bc8f0b7c@oracle.com>


On 7/27/20 2:04 PM, Thomas Stüfe wrote:
> Hi Andrei,
>
> I was afraid this might happen, but only occurred to me after the review.
>
> My proposal would be:
>
> - in resolvedMethodTable.cpp, extend the log message in log_insert() 
> and print out the hashcode too (e.g. "ResolvedMethod entry added for 
> ... (4711)").
>
> - In the test, start your test as a sub process, give it 
> -Xlog:membername+table=debug, and scan the output line for 
> "ResolvedMethod entry added for <your class name> (<hash>)". Extract 
> the hash. Read two or three lines and compare the hash.
>
> The advantage would be that you do not need to load 200k classes to 
> see that your regression test works, and it is timing independent. For 
> examples of sub process spawning and output scanning, there are many
> jtreg tests already (e.g. runtime/ErrorHandling has a few).
>
> Just my 5 cent. Maybe the others have different proposals.

There is a 4th argument to the hashtable get() function that detects
that we need rehashing. You could assert it's false, and have your test
insert 100 entries and see if it crashes. I think if the hashcode is
good, this should always return false. I don't know this for a fact,
but you could experiment with this. Otherwise, do what Thomas suggests.
thanks,
Coleen


>
> Cheers, Thomas
>
>
>
>
> On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin <andrei.pangin at gmail.com 
> <mailto:andrei.pangin at gmail.com>> wrote:
>
>     Hi,
>
>     Coleen, Thomas, thank you for your reviews. It turns out though
>     that my new test times out on debug builds (thanks Volker for
>     double-checking).
>
>     Test 	Tier 	Platform 	Description
>     runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>     linux-aarch64-debug 	Error: Program `...java' timed out (timeout
>     set to ...ms, elapsed time including timeout handling was ...ms).
>     runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>     windows-x64-debug 	Error: Program
>     `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>     timed out (timeout set to ...ms, elapsed time including timeout
>     handling was ...ms).
>     runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>     macosx-x64-debug 	Error: Program `...java' timed out (timeout set
>     to ...ms, elapsed time including timeout handling was ...ms).
>
>
>     Apparently, we can't rely on timing, since this test runs two
>     orders of magnitude slower on debug JVM than on product one. I
>     could simply mark the test as manual, or is there a better idea?
>     Maybe, adjust the parameters and run the test automatically only
>     on product or only on debug JVM? What do you think?
>
>     Regards,
>     Andrei
>
>     On Sat, Jul 25, 2020 at 17:34, <coleen.phillimore at oracle.com
>     <mailto:coleen.phillimore at oracle.com>> wrote:
>
>
>         Hi Andrei,
>         This looks good. Thank you for finding this bug. And thanks
>         to Volker for sponsoring it as well.
>         Nice to see you on the list, Andrei!
>         Coleen
>
>         On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>         Hi, Andrei,
>>
>>         Good find. I played around with a test of generating lots of
>>         lambdas and yes, all the hashes are equal. With your patch
>>         invocation time went down by half (that was for 10000 lambdas).
>>
>>         The test looks fine though the normal way to do this seems to
>>         be jcod. I personally don't care since the test is nice and
>>         self contained that way, but someone from the Oracle runtime
>>         group should confirm this is fine (ccing Coleen).
>>
>>         JDK11 seems to be affected too.
>>
>>         This probably also affects jruby.
>>
>>         +1 from me.
>>
>>         ..Thomas
>>
>>         On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin
>>         <andrei.pangin at gmail.com <mailto:andrei.pangin at gmail.com>> wrote:
>>
>>             Hi,
>>
>>             Please review a small fix to a not-so-small performance
>>             issue that we've
>>             seen when migrating a production application from JDK 8
>>             to JDK 14.
>>
>>             On certain workloads, where Nashorn produces thousands
>>             MethodHandles,
>>             ResolvedMethodTable operations become extremely slow due
>>             to degenerate
>>             hashcode. This patch basically fixes hashcode by
>>             including the method
>>             holder's name in the computation. More details in the bug
>>             report.
>>
>>             CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>             Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>
>>             Tested: tier1-2, hotspot*runtime
>>
>>             I'll be glad if someone could sponsor the patch.
>>
>>             Thank you,
>>             Andrei Pangin
>>
>


From coleen.phillimore at oracle.com  Mon Jul 27 18:41:25 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 27 Jul 2020 14:41:25 -0400
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
Message-ID: <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>


On 7/27/20 2:33 PM, Volker Simonis wrote:
> I like this solution although it introduces a hard dependency on the
> exact log format. But that's probably not a big issue.
>
> Another possibility would probably be to add a gtest.
>
> But I'd also be happy to make the current test a manual test.
>
> Let's see what Coleen's opinion is. We can use her as a tiebreaker :)
>
> Otherwise I'll leave it up to Andrei to choose his preferred option :)

So what I suggested is risky (I'm curious to see if it works). My
second choice is to make it a manual test. Hopefully we won't be
changing up the hashcode all the time and risk breaking this again.
thanks,
Coleen
>
> On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe <thomas.stuefe at gmail.com
> <mailto:thomas.stuefe at gmail.com>> wrote:
>
>     Hi Andrei,
>
>     I was afraid this might happen, but only occurred to me after the
>     review.
>
>     My proposal would be:
>
>     - in resolvedMethodTable.cpp, extend the log message in
>     log_insert() and print out the hashcode too (e.g. "ResolvedMethod
>     entry added for ... (4711)").
>
>     - In the test, start your test as a sub process, give it
>     -Xlog:membername+table=debug, and scan the output line for
>     "ResolvedMethod entry added for <your class name> (<hash>)".
>     Extract the hash. Read two or three lines and compare the hash.
>
>     The advantage would be that you do not need to load 200k classes
>     to see that your regression test works, and it is timing
>     independent. For examples of sub process spawning and
>     output scanning, there are many jtreg tests already (e.g.
>     runtime/ErrorHandling has a few).
>
>     Just my 5 cent. Maybe the others have different proposals.
>
>     Cheers, Thomas
>
>
>
>
>     On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin
>     <andrei.pangin at gmail.com <mailto:andrei.pangin at gmail.com>> wrote:
>
>         Hi,
>
>         Coleen, Thomas, thank you for your reviews. It turns out
>         though that my new test times out on debug builds (thanks
>         Volker for double-checking).
>
>         Test 	Tier 	Platform 	Description
>         runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>         linux-aarch64-debug 	Error: Program `...java' timed out
>         (timeout set to ...ms, elapsed time including timeout handling
>         was ...ms).
>         runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>         windows-x64-debug 	Error: Program
>         `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>         timed out (timeout set to ...ms, elapsed time including
>         timeout handling was ...ms).
>         runtime/MemberName/ResolvedMethodTableHash.java 	tier1
>         macosx-x64-debug 	Error: Program `...java' timed out (timeout
>         set to ...ms, elapsed time including timeout handling was ...ms).
>
>
>         Apparently, we can't rely on timing, since this test runs two
>         orders of magnitude slower on debug JVM than on product one. I
>         could simply mark the test as manual, or is there a better
>         idea? Maybe, adjust the parameters and run the test
>         automatically only on product or only on debug JVM? What do
>         you think?
>
>         Regards,
>         Andrei
>
>         On Sat, Jul 25, 2020 at 17:34, <coleen.phillimore at oracle.com
>         <mailto:coleen.phillimore at oracle.com>> wrote:
>
>
>             Hi Andrei,
>             This looks good. Thank you for finding this bug. And
>             thanks to Volker for sponsoring it as well.
>             Nice to see you on the list, Andrei!
>             Coleen
>
>             On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>             Hi, Andrei,
>>
>>             Good find. I played around with a test of generating lots
>>             of lambdas and yes, all the hashes are equal. With your
>>             patch invocation time went down by half (that was for
>>             10000 lambdas).
>>
>>             The test looks fine though the normal way to do this
>>             seems to be jcod. I personally don't care since the test
>>             is nice and self contained that way, but someone from the
>>             Oracle runtime group should confirm this is fine (ccing
>>             Coleen).
>>
>>             JDK11 seems to be affected too.
>>
>>             This probably also affects jruby.
>>
>>             +1 from me.
>>
>>             ..Thomas
>>
>>             On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin
>>             <andrei.pangin at gmail.com
>>             <mailto:andrei.pangin at gmail.com>> wrote:
>>
>>                 Hi,
>>
>>                 Please review a small fix to a not-so-small
>>                 performance issue that we've
>>                 seen when migrating a production application from JDK
>>                 8 to JDK 14.
>>
>>                 On certain workloads, where Nashorn produces
>>                 thousands MethodHandles,
>>                 ResolvedMethodTable operations become extremely slow
>>                 due to degenerate
>>                 hashcode. This patch basically fixes hashcode by
>>                 including the method
>>                 holder's name in the computation. More details in the
>>                 bug report.
>>
>>                 CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>                 Webrev:
>>                 https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>
>>                 Tested: tier1-2, hotspot*runtime
>>
>>                 I'll be glad if someone could sponsor the patch.
>>
>>                 Thank you,
>>                 Andrei Pangin
>>
>


From thomas.stuefe at gmail.com  Mon Jul 27 18:59:51 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 27 Jul 2020 20:59:51 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
 <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
Message-ID: <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>

A manual test may be enough too, I mean how probable is it that this
particular error would ever resurface. Better would be, maybe, to just add
an assert into the insert path which fires if a bucket length is above a
certain threshold. That would catch all kinds of bad hashes, not just this
one case.
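
To illustrate the bucket-length idea, here is a toy sketch in Java. It is not
the HotSpot table code: the class CheckedTable, the 256 buckets and the
threshold of 64 are all assumptions picked for the example, and the "method
name only" hash merely stands in for the degenerate hash discussed in this
thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Toy chained hash table that complains when a single bucket grows too long. */
class CheckedTable<T> {
    private final List<List<T>> buckets = new ArrayList<>();
    private final ToIntFunction<T> hash;
    private final int maxBucketLength;

    CheckedTable(int bucketCount, int maxBucketLength, ToIntFunction<T> hash) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        this.hash = hash;
        this.maxBucketLength = maxBucketLength;
    }

    /** Adds the key and fails if its bucket is now longer than the threshold. */
    void insert(T key) {
        int index = Math.floorMod(hash.applyAsInt(key), buckets.size());
        List<T> bucket = buckets.get(index);
        bucket.add(key);
        if (bucket.size() > maxBucketLength) {
            throw new IllegalStateException("bucket " + index + " holds "
                + bucket.size() + " entries; the hash function looks degenerate");
        }
    }

    /** Inserts Holder0.run .. Holder(n-1).run; returns true if the check never fired. */
    static boolean survives(int n, ToIntFunction<String> hash) {
        CheckedTable<String> table = new CheckedTable<>(256, 64, hash);
        try {
            for (int i = 0; i < n; i++) {
                table.insert("Holder" + i + ".run");
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hash of the method name only: every key lands in the same bucket.
        System.out.println(survives(1000, k -> k.substring(k.indexOf('.') + 1).hashCode()));
        // Hash that also covers the holder name: keys spread over the buckets.
        System.out.println(survives(1000, String::hashCode));
    }
}
```

A check like this does not depend on timing, so it behaves the same on debug
and product builds, but the threshold has to be generous enough that a healthy
table never trips it.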

BTW maybe we should add a jcmd option to print this table, like we have for
the symbol- and stringtable.

Cheers Thomas

On Mon, Jul 27, 2020 at 8:44 PM <coleen.phillimore at oracle.com> wrote:

>
>
> On 7/27/20 2:33 PM, Volker Simonis wrote:
>
> I like this solution although it introduces a hard dependency on the exact
> log format. But that's probably not a big issue.
>
> Another possibility would probably be to add a gtest.
>
> But I'd also be happy to make the current test a manual test.
>
> Let's see what Coleen's opinion is. We can use her as a tiebreaker :)
>
> Otherwise I'll leave it up to Andrei to choose his preferred option :)
>
>
> So what I suggested is risky (I'm curious to see if it works).  My second
> choice is to make it a manual test.  Hopefully we won't be changing up the
> hashcode all the time and risk breaking this again.
> thanks,
> Coleen
>
>
>
> On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
>> Hi Andrei,
>>
>> I was afraid this might happen, but only occurred to me after the review.
>>
>> My proposal would be:
>>
>> - in resolvedMethodTable.cpp, extend the log message in log_insert() and
>> print out the hashcode too (e.g. "ResolvedMethod entry added for ...
>> (4711)").
>>
>> - In the test, start your test as a sub process, give it
>> -Xlog:membername+table=debug, and scan the output line for "ResolvedMethod
>> entry added for <your class name> (<hash>)". Extract the hash. Read two or
>> three lines and compare the hash.
>>
>> The advantage would be that you do not need to load 200k classes to see
>> that your regression test works, and it is timing independent. For examples
>> of sub process spawning and output scanning, there are many jtreg tests
>> already (e.g. runtime/ErrorHandling has a few).
>>
>> Just my 5 cent. Maybe the others have different proposals.
>>
>> Cheers, Thomas
>>
>>
>>
>>
>> On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin <andrei.pangin at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Coleen, Thomas, thank you for your reviews. It turns out though that my
>>> new test times out on debug builds (thanks Volker for double-checking).
>>>
>>> Test Tier Platform Description
>>> runtime/MemberName/ResolvedMethodTableHash.java tier1
>>> linux-aarch64-debug Error: Program `...java' timed out (timeout set to
>>> ...ms, elapsed time including timeout handling was ...ms).
>>> runtime/MemberName/ResolvedMethodTableHash.java tier1 windows-x64-debug Error:
>>> Program
>>> `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>>> timed out (timeout set to ...ms, elapsed time including timeout handling
>>> was ...ms).
>>> runtime/MemberName/ResolvedMethodTableHash.java tier1 macosx-x64-debug Error:
>>> Program `...java' timed out (timeout set to ...ms, elapsed time including
>>> timeout handling was ...ms).
>>>
>>> Apparently, we can't rely on timing, since this test runs two orders of
>>> magnitude slower on debug JVM than on product one. I could simply mark the
>>> test as manual, or is there a better idea? Maybe, adjust the parameters and
>>> run the test automatically only on product or only on debug JVM? What do
>>> you think?
>>>
>>> Regards,
>>> Andrei
>>>
>>> On Sat, Jul 25, 2020 at 17:34, <coleen.phillimore at oracle.com> wrote:
>>>
>>>>
>>>> Hi Andrei,
>>>> This looks good.  Thank you for finding this bug.  And thanks to Volker
>>>> for sponsoring it as well.
>>>> Nice to see you on the list, Andrei!
>>>> Coleen
>>>>
>>>> On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>>>
>>>> Hi, Andrei,
>>>>
>>>> Good find. I played around with a test of generating lots of lambdas
>>>> and yes, all the hashes are equal. With your patch invocation time went
>>>> down by half (that was for 10000 lambdas).
>>>>
>>>> The test looks fine though the normal way to do this seems to be jcod.
>>>> I personally don't care since the test is nice and self contained that way,
>>>> but someone from the Oracle runtime group should confirm this is fine
>>>> (ccing Coleen).
>>>>
>>>> JDK11 seems to be affected too.
>>>>
>>>> This probably also affects jruby.
>>>>
>>>> +1 from me.
>>>>
>>>> ..Thomas
>>>>
>>>> On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin <andrei.pangin at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Please review a small fix to a not-so-small performance issue that
>>>>> we've
>>>>> seen when migrating a production application from JDK 8 to JDK 14.
>>>>>
>>>>> On certain workloads, where Nashorn produces thousands MethodHandles,
>>>>> ResolvedMethodTable operations become extremely slow due to degenerate
>>>>> hashcode. This patch basically fixes hashcode by including the method
>>>>> holder's name in the computation. More details in the bug report.
>>>>>
>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>>>> Webrev: https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>>>
>>>>> Tested: tier1-2, hotspot*runtime
>>>>>
>>>>> I'll be glad if someone could sponsor the patch.
>>>>>
>>>>> Thank you,
>>>>> Andrei Pangin
>>>>>
>>>>
>>>>
>

From coleen.phillimore at oracle.com  Mon Jul 27 19:12:09 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 27 Jul 2020 15:12:09 -0400
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
 <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
 <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>
Message-ID: <46590d43-ec6b-9185-8c02-11585d3ec62e@oracle.com>


On 7/27/20 2:59 PM, Thomas Stüfe wrote:
> A manual test may be enough too, I mean how probable is it that this 
> particular error would ever resurface. Better would be, maybe, to just
> add an assert into the insert path which fires if a bucket length is 
> above a certain threshold. That would catch all kinds of bad hashes, 
> not just this one case.
>
> BTW maybe we should add a jcmd option to print this table, like we 
> have for the symbol- and stringtable.

Erik Osterlund removes this table with the New Invoke Bindings JEP so 
it's not worth adding more to this right now.
thanks,
Coleen
>
> Cheers Thomas
>
> On Mon, Jul 27, 2020 at 8:44 PM <coleen.phillimore at oracle.com 
> <mailto:coleen.phillimore at oracle.com>> wrote:
>
>
>
>     On 7/27/20 2:33 PM, Volker Simonis wrote:
>>     I like this solution although it introduces a hard dependency on
>>     the exact log format. But that's probably not a big issue.
>>
>>     Another possibility would probably be to add a gtest.
>>
>>     But I'd also be happy to make the current test a manual test.
>>
>>     Let's see what Coleen's opinion is. We can use her as a tiebreaker :)
>>
>>     Otherwise I'll leave it up to Andrei to choose his preferred
>>     option :)
>
>     So what I suggested is risky (I'm curious to see if it works). My
>     second choice is to make it a manual test. Hopefully we won't be
>     changing up the hashcode all the time and risk breaking this again.
>     thanks,
>     Coleen
>>
>>     On Mon, Jul 27, 2020 at 8:04 PM Thomas Stüfe
>>     <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>> wrote:
>>
>>         Hi Andrei,
>>
>>         I was afraid this might happen, but only occurred to me after
>>         the review.
>>
>>         My proposal would be:
>>
>>         - in resolvedMethodTable.cpp, extend the log message in
>>         log_insert() and print out the hashcode too (e.g.
>>         "ResolvedMethod entry added for ... (4711)").
>>
>>         - In the test, start your test as a sub process, give it
>>         -Xlog:membername+table=debug, and scan the output line for
>>         "ResolvedMethod entry added for <your class name> (<hash>)".
>>         Extract the hash. Read two or three lines and compare the hash.
>>
>>         The advantage would be that you do not need to load 200k
>>         classes to see that your regression test works, and it is
>>         timing independent. For examples of sub process spawning and
>>         output scanning, there are many jtreg tests already (e.g.
>>         runtime/ErrorHandling has a few).
>>
>>         Just my 5 cent. Maybe the others have different proposals.
>>
>>         Cheers, Thomas
>>
>>
>>
>>
>>         On Mon, Jul 27, 2020 at 7:52 PM Andrei Pangin
>>         <andrei.pangin at gmail.com <mailto:andrei.pangin at gmail.com>> wrote:
>>
>>             Hi,
>>
>>             Coleen, Thomas, thank you for your reviews. It turns out
>>             though that my new test times out on debug builds (thanks
>>             Volker for double-checking).
>>
>>             Test: runtime/MemberName/ResolvedMethodTableHash.java (tier1)
>>
>>             Platform: linux-aarch64-debug
>>             Error: Program `...java' timed out (timeout set to ...ms,
>>             elapsed time including timeout handling was ...ms).
>>
>>             Platform: windows-x64-debug
>>             Error: Program
>>             `c\:\\ade\\mesos\\work_dir\\jib-master\\install\\...-07-27-0826248.andrei.pangin.source\\windows-x64-debug.jdk\\jdk-16\\fastdebug\\bin\\java'
>>             timed out (timeout set to ...ms, elapsed time including
>>             timeout handling was ...ms).
>>
>>             Platform: macosx-x64-debug
>>             Error: Program `...java' timed out (timeout set to ...ms,
>>             elapsed time including timeout handling was ...ms).
>>
>>
>>             Apparently, we can't rely on timing, since this test runs
>>             two orders of magnitude slower on debug JVM than on
>>             product one. I could simply mark the test as manual, or
>>             is there a better idea? Maybe, adjust the parameters and
>>             run the test automatically only on product or only on
>>             debug JVM? What do you think?
>>
>>             Regards,
>>             Andrei
>>
>>             On Sat, Jul 25, 2020 at 17:34,
>>             <coleen.phillimore at oracle.com
>>             <mailto:coleen.phillimore at oracle.com>> wrote:
>>
>>
>>                 Hi Andrei,
>>                 This looks good. Thank you for finding this bug.
>>                 And thanks to Volker for sponsoring it as well.
>>                 Nice to see you on the list, Andrei!
>>                 Coleen
>>
>>                 On 7/25/20 3:18 AM, Thomas Stüfe wrote:
>>>                 Hi, Andrei,
>>>
>>>                 Good find. I played around with a test of generating
>>>                 lots of lambdas and yes, all the hashes are equal.
>>>                 With your patch invocation time went down by half
>>>                 (that was for 10000 lambdas).
>>>
>>>                 The test looks fine though the normal way to do this
>>>                 seems to be jcod. I personally don't care since the
>>>                 test is nice and self contained that way, but
>>>                 someone from the Oracle runtime group should confirm
>>>                 this is fine (ccing Coleen).
>>>
>>>                 JDK11 seems to be affected too.
>>>
>>>                 This probably also affects jruby.
>>>
>>>                 +1 from me.
>>>
>>>                 ..Thomas
>>>
>>>                 On Fri, Jul 24, 2020 at 4:53 PM Andrei Pangin
>>>                 <andrei.pangin at gmail.com
>>>                 <mailto:andrei.pangin at gmail.com>> wrote:
>>>
>>>                     Hi,
>>>
>>>                     Please review a small fix to a not-so-small
>>>                     performance issue that we've
>>>                     seen when migrating a production application
>>>                     from JDK 8 to JDK 14.
>>>
>>>                     On certain workloads, where Nashorn produces
>>>                     thousands of MethodHandles,
>>>                     ResolvedMethodTable operations become extremely
>>>                     slow due to degenerate
>>>                     hashcode. This patch basically fixes hashcode by
>>>                     including the method
>>>                     holder's name in the computation. More details
>>>                     in the bug report.
>>>
>>>                     CR: https://bugs.openjdk.java.net/browse/JDK-8249719
>>>                     Webrev:
>>>                     https://cr.openjdk.java.net/~apangin/8249719/webrev/
>>>
>>>                     Tested: tier1-2, hotspot*runtime
>>>
>>>                     I'll be glad if someone could sponsor the patch.
>>>
>>>                     Thank you,
>>>                     Andrei Pangin
>>>
>>
>


From andrei.pangin at gmail.com  Mon Jul 27 20:29:46 2020
From: andrei.pangin at gmail.com (Andrei Pangin)
Date: Mon, 27 Jul 2020 23:29:46 +0300
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <46590d43-ec6b-9185-8c02-11585d3ec62e@oracle.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
 <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
 <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>
 <46590d43-ec6b-9185-8c02-11585d3ec62e@oracle.com>
Message-ID: <CAAnUXCmpR0N+rUG1upXB_LNeFuKm+7axsTcjGSMM2EjPAZUxhA@mail.gmail.com>

Thank you for the ideas. Checking the real bucket length looks attractive,
but as far as I see, it does not always work. First, because the table
expansion is triggered only after GC. Second, because it's still possible
to intentionally generate hash collisions, but I believe the JVM should not
crash even in the worst case scenario.
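
To make the bucket-length concern concrete, here is a toy sketch in Java.
It is not HotSpot code: the class BucketLengthSketch, both hash functions
and the fixed-size table are made up for illustration, and the real
ResolvedMethodTable works on Symbol identity hashes rather than String
hashes. It only shows why a hash that ignores the method holder puts every
entry with the same method name into a single bucket.

```java
import java.util.function.ToIntBiFunction;

public class BucketLengthSketch {
    // Toy model of a degenerate hash: only the method name contributes,
    // so the same method name in different holder classes always collides.
    public static int nameOnlyHash(String holder, String method) {
        return method.hashCode();
    }

    // Toy model of the fixed hash: the holder's name is mixed in as well.
    public static int holderAwareHash(String holder, String method) {
        return 31 * holder.hashCode() + method.hashCode();
    }

    // Inserts `entries` methods named "invoke" from distinct holders into a
    // fixed-size table and returns the length of the longest bucket.
    public static int longestBucket(int entries, int buckets,
                                    ToIntBiFunction<String, String> hash) {
        int[] lengths = new int[buckets];
        int longest = 0;
        for (int i = 0; i < entries; i++) {
            int index = Math.floorMod(hash.applyAsInt("Holder" + i, "invoke"), buckets);
            longest = Math.max(longest, ++lengths[index]);
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println("name only:     "
            + longestBucket(10000, 1024, BucketLengthSketch::nameOnlyHash));
        System.out.println("holder + name: "
            + longestBucket(10000, 1024, BucketLengthSketch::holderAwareHash));
    }
}
```

With the name-only hash all entries land in one bucket, so each lookup walks
the whole chain. An assert on bucket length would fire in that case, but it
would also fire for deliberately constructed collisions, which is the reason
for the hesitation above.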

Printing the hash value and parsing the logs looks a bit too artificial to
me, while my original intention was to test the performance (or algorithmic
complexity) of the code rather than the particular hash code algorithm.
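
For reference, the log-scanning part of that proposal could look roughly
like the sketch below. It rests on assumptions: the line format
"ResolvedMethod entry added for <name> (<hash>)" is the one proposed in
this thread and is not what the JVM prints today, and the class
LogHashScanSketch and its methods are hypothetical. Spawning the sub
process with -Xlog:membername+table=debug is left out; the sketch only
covers extracting and comparing the hashes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogHashScanSketch {
    // Assumed format: "... ResolvedMethod entry added for <name> (<hash>)".
    private static final Pattern ENTRY =
        Pattern.compile("ResolvedMethod entry added for (.+) \\((\\d+)\\)\\s*$");

    // Returns the hash printed on every matching line; other lines are skipped.
    public static List<Long> extractHashes(List<String> lines) {
        List<Long> hashes = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (m.find()) {
                hashes.add(Long.parseLong(m.group(2)));
            }
        }
        return hashes;
    }

    // A degenerate hash function shows up as a single repeated value.
    public static boolean hasDistinctHashes(List<String> lines) {
        return new HashSet<>(extractHashes(lines)).size() > 1;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "[0.1s][debug][membername,table] ResolvedMethod entry added for A.run()V (4711)",
            "[0.2s][debug][membername,table] ResolvedMethod entry added for B.run()V (4712)");
        System.out.println("distinct hashes: " + hasDistinctHashes(sample));
    }
}
```

Such a check would tie the test to the exact log format and to one
particular hash algorithm, which is the drawback mentioned above.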

Anyway, if you all don't mind making the test manual, let me just
stick with this option - not to make the test much more complicated than
the fix itself :) especially if the table will be removed some time soon.

Here's the updated webrev, just in case. It differs only in changing
"timeout=300" to "manual".
http://cr.openjdk.java.net/~apangin/8249719/webrev.2/

Thanks again,
Andrei


On Mon, Jul 27, 2020 at 22:12, <coleen.phillimore at oracle.com> wrote:

>
>
> On 7/27/20 2:59 PM, Thomas Stüfe wrote:
>
> A manual test may be enough too, I mean how probable is it that this
> particular error would ever resurface. Better would be, maybe, to just add
> an assert into the insert path which fires if a bucket length is above a
> certain threshold. That would catch all kinds of bad hashes, not just this
> one case.
>
> BTW maybe we should add a jcmd option to print this table, like we have
> for the symbol- and stringtable.
>
>
> Erik Osterlund removes this table with the New Invoke Bindings JEP so it's
> not worth adding more to this right now.
> thanks,
> Coleen
>
>
> Cheers Thomas

From thomas.stuefe at gmail.com  Mon Jul 27 20:37:38 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 27 Jul 2020 22:37:38 +0200
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAAnUXCmpR0N+rUG1upXB_LNeFuKm+7axsTcjGSMM2EjPAZUxhA@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
 <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
 <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>
 <46590d43-ec6b-9185-8c02-11585d3ec62e@oracle.com>
 <CAAnUXCmpR0N+rUG1upXB_LNeFuKm+7axsTcjGSMM2EjPAZUxhA@mail.gmail.com>
Message-ID: <CAA-vtUyXwFsjTLmw5C9dQCXECu5QX0Swnrv-4kNx=gpmdyKJGQ@mail.gmail.com>

I'm fine with that version.

Thanks, Thomas


From coleen.phillimore at oracle.com  Mon Jul 27 21:08:01 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 27 Jul 2020 17:08:01 -0400
Subject: RFR(XS) 8249719: MethodHandle performance suffers from bad
 ResolvedMethodTable hash function
In-Reply-To: <CAA-vtUyXwFsjTLmw5C9dQCXECu5QX0Swnrv-4kNx=gpmdyKJGQ@mail.gmail.com>
References: <CAAnUXCmRtGdo=nVi8QSD7+zyfjNzTjD9CoMM0_3Ro5mSj9gFqQ@mail.gmail.com>
 <CAA-vtUw1B8KVWx4fkLLJyz_yz0gSejkCWsCo1KT3rvOHfaS8gA@mail.gmail.com>
 <9e58ce8c-07a8-9bac-1c7b-cd5c64c54ee5@oracle.com>
 <CAAnUXCnO8ivgsXUcW_h8AeYM3WVk=adrj2Kf9wLoe5T_BxezKg@mail.gmail.com>
 <CAA-vtUxXGqNtbq7yAJXim93f3qOM2mzTN1r1PUEm_A2=f5e=ZQ@mail.gmail.com>
 <CA+3eh104RhivscWM-+4JgEZA0igfiVwK2eno2bdgG1h_Rg4BDg@mail.gmail.com>
 <67f535f6-0913-3cd4-e67f-4c3962d68ae3@oracle.com>
 <CAA-vtUwHkQmHEEMNtokuC115pb=WR9HT1Q0OHSXoK7pJAEWsNw@mail.gmail.com>
 <46590d43-ec6b-9185-8c02-11585d3ec62e@oracle.com>
 <CAAnUXCmpR0N+rUG1upXB_LNeFuKm+7axsTcjGSMM2EjPAZUxhA@mail.gmail.com>
 <CAA-vtUyXwFsjTLmw5C9dQCXECu5QX0Swnrv-4kNx=gpmdyKJGQ@mail.gmail.com>
Message-ID: <be37b6c0-1c71-07b4-1e9c-6611af3aa4a4@oracle.com>


+1.

On 7/27/20 4:37 PM, Thomas Stüfe wrote:
> I'm fine with that version.
>
> Thanks, Thomas
>
> On Mon 27. Jul 2020 at 22:30, Andrei Pangin <andrei.pangin at gmail.com 
> <mailto:andrei.pangin at gmail.com>> wrote:
>
>     Thank you for the ideas. Checking the real bucket length looks
>     attractive, but as far as I see, it does not always work. First,
>     because the table expansion is triggered only after GC. Second,
>     because it's still possible to intentionally generate hash
>     collisions, but I believe the JVM should not crash even in the
>     worst case scenario.
>

Yes, I agree. It might not be safe.

Making this manual is fine.
Thanks for fixing this.
Coleen


From david.holmes at oracle.com  Mon Jul 27 23:21:15 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 28 Jul 2020 09:21:15 +1000
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
 <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>
Message-ID: <f26d8a90-cfb9-73b9-aba2-ed57c5c74292@oracle.com>

On 28/07/2020 2:12 am, gerard ziemski wrote:
> Thank you David for taking a look.
> 
> 
> On 7/19/20 11:37 PM, David Holmes wrote:
>> Hi Gerard,
>>
>> On 18/07/2020 5:19 am, gerard ziemski wrote:
>>> Hi all,
>>>
>>> Please review this small fix that adds the OS version and the OS 
>>> build number to the hs_err_pidXXX.log output in the "Summary" section 
>>> for Mac platform (it's easier to use for developers than the Darwin 
>>> kernel version that we display right now).
>>>
>>> This is how things used to look:
>>>
>>>
>>> --------------- S U M M A R Y ------------
>>>
>>> Command Line: Crasher
>>>
>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
>>> Darwin 19.5.0
>>> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds (0d 
>>> 0h 0m 1s)
>>>
>>>
>>> And this is how the "Summary" section looks with the proposed 
>>> change:
>>>
>>>
>>> --------------- S U M M A R Y ------------
>>>
>>> Command Line: Crasher
>>>
>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 32G, 
>>> Darwin 19.5.0, macOS 10.15.5 (19F101)
>>> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds (0d 
>>> 0h 0m 0s)
>>>
>>>
>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
>>> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
>>> testing Mach5 hs_tier1,2,3,4,5 in progress
>>
>> Just to be clear, the changes prior to:
>>
>> 1555 #ifdef __APPLE__
>>
>> are just fixing up existing indentation errors - correct?
> 
> Yes, hope that's OK, as this was the only spot in the function that 
> stood out with inconsistent indentation.

Yes that is fine.

>>
>> The actual change seems okay, just one query:
>>
>> 1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };
>>
>> I couldn't find KERN_OSVERSION documented for sysctl - is it a 
>> "recent" addition?
> 
> Yes it is. Apple added it back in 2018 (see bug comments or this link 
> https://github.com/apple/darwin-xnu/commit/5bbb823c13f3ab1ab58878f96b35433a29882676?diff=split#diff-6651b0c84a045f400bc45faa9f61c9e1 
> )

That link shows the addition of sysctl_osproductversion which I assume 
underpins "kern.osproductversion". But my question was on 
KERN_OSVERSION. That definition seems to already exist prior to the 
change you link. My concern is whether it was also fairly recently 
introduced and so referring to it would require a minimum macOS version 
on the build machine?
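
To make the build-machine concern concrete, here is a minimal sketch of the query with a compile-time guard. It is not what the webrev does: os_build_string is a hypothetical helper, and the assumption is that an SDK lacking the KERN_OSVERSION macro should simply skip the query instead of failing to compile.

```cpp
#include <cstddef>
#include <string>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Returns the OS build string (for example "19F101") on macOS, or an empty
// string when the query fails, the macro is missing from the SDK headers,
// or the platform is not macOS.
inline std::string os_build_string() {
#if defined(__APPLE__) && defined(KERN_OSVERSION)
  int mib_build[] = { CTL_KERN, KERN_OSVERSION };
  char buf[100] = {0};
  // Leave room for a terminating NUL regardless of what the kernel writes.
  std::size_t size = sizeof(buf) - 1;
  if (sysctl(mib_build, 2, buf, &size, nullptr, 0) != 0) {
    return std::string();
  }
  return std::string(buf);
#else
  return std::string();
#endif
}
```

The guard only covers the build side; whether the running kernel answers the query is still decided by the return value of sysctl.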

Thanks,
David

> 
> cheers

From richard.reingruber at sap.com  Tue Jul 28 07:32:58 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 28 Jul 2020 07:32:58 +0000
Subject: RFR(T) 8250610: SafepointMechanism::disarm_if_needed() is declared
 but not used
Message-ID: <AM0PR0202MB333147294E031F5992CFAD649B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi,

please review this trivial clean-up, which removes the declaration of
SafepointMechanism::disarm_if_needed() (forgotten in JDK-8240918).

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8250610/webrev.0/
Bug:    https://bugs.openjdk.java.net/browse/JDK-8250610

Testing: Submit repo and nightly regression testing @SAP.

Thanks, Richard.

From shade at redhat.com  Tue Jul 28 07:36:40 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 28 Jul 2020 09:36:40 +0200
Subject: RFR(T) 8250610: SafepointMechanism::disarm_if_needed() is
 declared but not used
In-Reply-To: <AM0PR0202MB333147294E031F5992CFAD649B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <AM0PR0202MB333147294E031F5992CFAD649B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <b3411ea2-f11e-e0dc-8ab5-499809156401@redhat.com>

On 7/28/20 9:32 AM, Reingruber, Richard wrote:
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8250610/webrev.0/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8250610

Looks good.

-- 
Thanks,
-Aleksey


From david.holmes at oracle.com  Tue Jul 28 07:59:16 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 28 Jul 2020 17:59:16 +1000
Subject: RFR(T) 8250610: SafepointMechanism::disarm_if_needed() is
 declared but not used
In-Reply-To: <b3411ea2-f11e-e0dc-8ab5-499809156401@redhat.com>
References: <AM0PR0202MB333147294E031F5992CFAD649B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <b3411ea2-f11e-e0dc-8ab5-499809156401@redhat.com>
Message-ID: <79e7c1d0-3b00-af9a-5e3a-187bd3406003@oracle.com>

+1

Thanks,
David

On 28/07/2020 5:36 pm, Aleksey Shipilev wrote:
> On 7/28/20 9:32 AM, Reingruber, Richard wrote:
>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8250610/webrev.0/
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8250610
> 
> Looks good.
> 

From richard.reingruber at sap.com  Tue Jul 28 08:01:52 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 28 Jul 2020 08:01:52 +0000
Subject: RFR(T) 8250610: SafepointMechanism::disarm_if_needed() is
 declared but not used
In-Reply-To: <79e7c1d0-3b00-af9a-5e3a-187bd3406003@oracle.com>
References: <AM0PR0202MB333147294E031F5992CFAD649B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <b3411ea2-f11e-e0dc-8ab5-499809156401@redhat.com>
 <79e7c1d0-3b00-af9a-5e3a-187bd3406003@oracle.com>
Message-ID: <AM0PR0202MB3331CF0C794D94FD576B092D9B730@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Thanks Aleksey and David!
Richard.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Tuesday, 28 July 2020 09:59
To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-runtime-dev at openjdk.java.net
Cc: Aleksey Shipilev <shade at redhat.com>
Subject: Re: RFR(T) 8250610: SafepointMechanism::disarm_if_needed() is declared but not used

+1

Thanks,
David

On 28/07/2020 5:36 pm, Aleksey Shipilev wrote:
> On 7/28/20 9:32 AM, Reingruber, Richard wrote:
>> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8250610/webrev.0/
>> Bug:    https://bugs.openjdk.java.net/browse/JDK-8250610
> 
> Looks good.
> 

From luhenry at microsoft.com  Tue Jul 28 16:26:58 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 28 Jul 2020 16:26:58 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>,
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Kim, David,

A quick follow-up on that change. Anything else you'd like to see changed? If not, could one of you please sponsor for it to be merged? Let me know of anything I should do to get it merged.

Thank you
Ludovic

________________________________________
From: Ludovic Henry <luhenry at microsoft.com>
Sent: Friday, July 17, 2020 11:26
To: Kim Barrett
Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

Hi Kim,

I've updated the webrev at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 with these spacing fixes.

________________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: Thursday, July 16, 2020 18:43
To: Ludovic Henry
Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
>
> I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04

The change from "StubName" => "IntrinsicName" made the indenting of
arguments in the calls no longer lined up normally. Line 65, line 82,
and lines 104-5 are now abnormally indented.

Other than that, looks good.  I don't need another webrev for a fix of
the indentation.


From luhenry at microsoft.com  Tue Jul 28 16:29:49 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 28 Jul 2020 16:29:49 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <8a9577be-bbfa-8ce2-abcc-1bcd837b00ea@oracle.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
 <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <8a9577be-bbfa-8ce2-abcc-1bcd837b00ea@oracle.com>
Message-ID: <MWHPR21MB05110E24D1291BC0554DCFB4B0730@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi David, Andrew, Thomas, Kim,

If the change looks good to you, could one of you please sponsor it to get it merged? Let me know of anything I would need to do.

Thank you
Ludovic

________________________________________
From: David Holmes <david.holmes at oracle.com>
Sent: Sunday, July 26, 2020 18:39
To: Ludovic Henry; Andrew Haley; Thomas Stüfe
Cc: Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

Hi Ludovic,

This patch is good to go as far as I am concerned.

Thanks,
David

On 25/07/2020 1:39 am, Ludovic Henry wrote:
> Hi,
>
> A quick follow-up on that change. Is there anything else you'd like to see changed to get it merged?
>
> Following an offline discussion with David, I'm working with relevant Microsoft teams to figure out where relevant documentation is. I'll keep you posted on that.
>
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/8248657.patch
>
> Thank you,
>
> --
> Ludovic
>
> -----Original Message-----
> From: Ludovic Henry <luhenry at microsoft.com>
> Sent: Wednesday, July 15, 2020 10:00 AM
> To: David Holmes <david.holmes at oracle.com>; Andrew Haley <aph at redhat.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
> Cc: Kim Barrett <kim.barrett at oracle.com>; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>
> Hi David,
>
>>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
>>
>> That is good to know. But this is something that Microsoft should be
>> documenting explicitly - even if just a blanket statement that all
>> syscalls (which are what exactly?) provide an implicit memory barrier
>> (of what type exactly?).
>
> I don't think we can assume SetEvent has a barrier merely because it is a syscall (even though syscalls do guarantee a barrier); it's more that SetEvent is equivalent to sem_post. And if you cannot assume that sem_post or SetEvent guarantees a memory barrier (full, or at least store_release), then you could not trust any standard locking mechanism (what's the point of synchronizing if the CPU can load and store outside of the critical section?).
>
>>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.
>>
>> jcstress tests will execute the native runtime code of course, but they
>> won't be "stressing" it as such.
>
> Makes sense, thanks for the clarification.
>
> --
> Ludovic
>
> I agree with you on the value of more explicit documentation, and I'll go look for that. If it doesn't exist, I'll put in a request to have it documented somewhere on docs.microsoft.com. In the meantime, it is safe to assume that SetEvent contains a memory barrier that has at least store_release semantics. Similarly, WaitForSingleObject and WaitForMultipleObjects have at least a load_acquire memory barrier, and are also syscalls (actually guaranteeing a full memory barrier).
>
> ________________________________________
> From: David Holmes <david.holmes at oracle.com>
> Sent: Monday, July 13, 2020 19:25
> To: Ludovic Henry; Andrew Haley; Thomas Stüfe
> Cc: Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>
> Hi Ludovic,
>
> On 14/07/2020 11:28 am, Ludovic Henry wrote:
>> Hello,
>>
>>> But if we are dealing with non-TSO races then it would be good to get
>>> some guidance from Microsoft as to the memory ordering properties of
>>> various APIs to ensure that we are maintaining correct ordering. For
>>> example, in the destructor we have:
>>>
>>> 81     lock_owner = 0;
>>> 82     // No lost wakeups, lock_event stays signaled until reset.
>>> 83     DWORD ret = SetEvent(lock_event);
>>>
>>> but unless we are guaranteed that the store to lock_owner cannot be
>>> reordered by the compiler or the hardware, to appear to be after the
>>> SetEvent, then the logic is broken. Generally, because Windows only
>>> supported TSO systems, we have assumed that the compiler will not
>>> reorder code across these kinds of API calls. But now we also need
>>> hardware guarantees.
>>
>> I can confirm that calls such as SetEvent have an implicit memory barrier as they are syscalls. This specific instance would then not suffer from any memory reordering issues.
>
> That is good to know. But this is something that Microsoft should be
> documenting explicitly - even if just a blanket statement that all
> syscalls (which are what exactly?) provide an implicit memory barrier
> (of what type exactly?).
>
>> As for the general question around platforms with weaker memory models, AArch64 is not the first such platform that MSVC and Windows have been ported to. It is safe to assume that MSVC has a similar approach to GCC and Clang on memory reordering optimizations. [1] also gives some pointers on some MSVC specific knobs for working around the weaker memory model.
>
> The /volatile:ms is the kind of build control I was wondering about.
> Thanks for the pointer.
>
>> Also, would jcstress help root out these kinds of issues in Hotspot, or does that only test the code generated with the Interpreter/C1/C2? We successfully run jcstress in `tough` mode.
>
> jcstress tests will execute the native runtime code of course, but they
> won't be "stressing" it as such.
>
> Cheers,
> David
> -----
>
>> I hope this helps to answer your questions.
>>
>> [1] https://docs.microsoft.com/en-us/cpp/build/common-visual-cpp-arm-migration-issues?view=vs-2019#volatile-keyword-default-behavior
>>
>> --
>> Ludovic
>> ________________________________________
>> From: Andrew Haley <aph at redhat.com>
>> Sent: Monday, July 13, 2020 01:36
>> To: David Holmes; Thomas Stüfe
>> Cc: Kim Barrett; Ludovic Henry; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model
>>
>> On 13/07/2020 06:48, David Holmes wrote:
>>> Hi Thomas,
>>>
>>> On 13/07/2020 2:41 pm, Thomas Stüfe wrote:
>>>>
>>>> Can a compiler reorder system calls and stores? How would it determine
>>>> if this is safe to do?
>>
>> I very much doubt it.
>>
>>> A compiler can reorder anything it likes if it can determine it is safe
>>> to do so. :)
>>
>> I'm fairly sure the compiler doesn't care about that!
>>
>>>> I'd be surprised if Microsoft loosened up reordering since this would
>>>> mean existing software cannot just be recompiled for arm and expected to
>>>> work. But this is just a guess of course.
>>>
>>> It's an interesting point because I would expect there to be a lot of
>>> software written for Windows that contains assumptions of TSO that would
>>> in fact fail when run on Aarch64. I don't know if there are any special
>>> mechanisms to force a binary to run in TSO mode on Aarch64 under Windows
>>> (or build flags), that would allow for ease of migration.
>>
>> There's no standard hardware mechanism that would do so.
>>
>> I've been very surprised at how little software has broken on AArch64
>> because of memory ordering. Like you, I initially assumed that stuff
>> would break all over the place, but by and large it was OK. I know of
>> two reasons: firstly, programmers are pretty conservative and tend to
>> use simple and reliable mechanisms such as safe publication and
>> mutexes for inter-thread communication. But also, and maybe more
>> importantly, the kinds of reordering the hardware can do are not very
>> different from those compilers do. Therefore, anyone playing fast and
>> loose with TSO has probably already been bitten by the compiler.
>>
>>> But unless all Windows software will run in such a mode there is a
>>> need for MS to document what the memory consistency properties of
>>> various APIs are (as POSIX does [1]).
>>
>> Indeed. I would have thought it existed somewhere.
>>
>> --
>> Andrew Haley  (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com/>
>> https://keybase.io/andrewhaley
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>
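
The ordering requirement discussed in this thread (the store to lock_owner must not appear to happen after the SetEvent) can be modelled in portable C++. This is only a sketch under stated assumptions: it does not call the Windows API, Handoff is a hypothetical type, and an atomic flag with release/acquire ordering stands in for the store_release and load_acquire semantics attributed above to SetEvent and WaitForSingleObject.

```cpp
#include <atomic>
#include <thread>

// Hypothetical model: 'payload' plays the role of the plain store (such as
// lock_owner) and 'signaled' plays the role of the event.
struct Handoff {
  int payload = 0;
  std::atomic<bool> signaled{false};
};

// Signalling side: the release store guarantees that the earlier plain
// store to payload cannot be reordered after the signal.
inline void publish(Handoff& h, int value) {
  h.payload = value;
  h.signaled.store(true, std::memory_order_release);
}

// Waiting side: the acquire load guarantees that once the signal is seen,
// the payload written before it is visible too.
inline int consume(Handoff& h) {
  while (!h.signaled.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return h.payload;
}
```

In real use publish and consume would run on different threads. If the store in publish were relaxed instead of release, a weakly ordered CPU such as AArch64 would be allowed to make the flag visible before the payload, which is exactly the breakage David describes.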

From thomas.stuefe at gmail.com  Tue Jul 28 17:08:04 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 28 Jul 2020 19:08:04 +0200
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <MWHPR21MB05110E24D1291BC0554DCFB4B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
 <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <8a9577be-bbfa-8ce2-abcc-1bcd837b00ea@oracle.com>
 <MWHPR21MB05110E24D1291BC0554DCFB4B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <CAA-vtUyAiY-0PQAOVxUNcG4F+oWPftBwCn9aqmckrnA=haX4Qw@mail.gmail.com>

Hi Ludovic,

I can sponsor it. What is the last webrev?

Cheers, Thomas

From luhenry at microsoft.com  Tue Jul 28 17:16:04 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 28 Jul 2020 17:16:04 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in
 ThreadCritical regarding memory model
In-Reply-To: <CAA-vtUyAiY-0PQAOVxUNcG4F+oWPftBwCn9aqmckrnA=haX4Qw@mail.gmail.com>
References: <MWHPR21MB05112E7BFAB4686CA2BB194FB0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d7e8ef00-e2ca-8fd4-fb04-04690c686e28@oracle.com>
 <F4A7FFAD-5AF2-4AB9-A214-9C0B3887F537@oracle.com>
 <7992f43b-6c02-9bba-495e-e3c090bd528d@redhat.com>
 <a08c7ba8-12ee-e9a4-795e-19c9e1218373@oracle.com>
 <CAA-vtUxO+0-UDA+_J+PY=mAUQT-mdm1-5DLFrKqsva2hU3VUnw@mail.gmail.com>
 <75b00982-1fb5-1825-7128-25a6e45a7630@oracle.com>
 <a8a55361-0af0-b8ca-6187-783f8892a959@redhat.com>
 <MWHPR21MB0511BABCE82EE496476826D4B0610@MWHPR21MB0511.namprd21.prod.outlook.com>
 <d5a5e563-e0c5-ec0a-8640-ea940c05f738@oracle.com>
 <MWHPR21MB05114BF8C3AB71CF2125FB25B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511168A11BCC1501E85A3BCB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <8a9577be-bbfa-8ce2-abcc-1bcd837b00ea@oracle.com>
 <MWHPR21MB05110E24D1291BC0554DCFB4B0730@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CAA-vtUyAiY-0PQAOVxUNcG4F+oWPftBwCn9aqmckrnA=haX4Qw@mail.gmail.com>
Message-ID: <MWHPR21MB05113910CCD8D9CF5CE360FBB0730@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Thomas,

The last webrev is at http://cr.openjdk.java.net/~burban/luhenry/8248657/webrev.00/ .

Thank you!

________________________________________
From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Tuesday, July 28, 2020 10:08
To: Ludovic Henry
Cc: David Holmes; Andrew Haley; Kim Barrett; hotspot-runtime-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248657: Windows: strengthening in ThreadCritical regarding memory model

Hi Ludovic,

I can sponsor it. What is the last webrev?

Cheers, Thomas

From patricio.chilano.mateo at oracle.com  Tue Jul 28 19:16:14 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Tue, 28 Jul 2020 16:16:14 -0300
Subject: RFR 8242263: Diagnose synchronization on primitive wrappers
Message-ID: <55fd6e9c-4007-2e67-ddff-f13c278cfefc@oracle.com>

Hi all,

Please review the following change that adds diagnostic capabilities 
when synchronizing on primitive wrapper classes.

Bug: https://bugs.openjdk.java.net/browse/JDK-8242263
Webrev: http://cr.openjdk.java.net/~pchilanomate/8242263/v1/webrev/

The new flag makes it possible to identify synchronization on these classes and to 
take one of the following actions: exit the VM with fatal error, log a 
warning message, or issue a JFR event. The implementation uses a simple 
approach where a check is added at the beginning of the monitorenter 
generated code when the flag is enabled to check whether the object is 
of a primitive wrapper class. If it is, we jump to the slow path, 
otherwise we just continue as always. The extra instructions will be: 
load the klass of the object, load the _access_flags field for that 
klass, AND with a constant, and branch based on the result. The code 
will only be generated whenever the new opt-in diagnostic flag is 
enabled so performance won't be affected when off.
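
As a rough illustration of the check described above, here is a sketch written as ordinary C++ instead of generated code. All names (SketchKlass, SketchOop, kPrimitiveWrapperBit, choose_monitorenter_path) and the bit value are hypothetical, and the runtime boolean only models the fact that the real instructions are emitted solely when the flag is enabled.

```cpp
#include <cstdint>

// Hypothetical flag bit marking a primitive wrapper class; the value is
// illustrative and not the constant used in the patch.
constexpr uint32_t kPrimitiveWrapperBit = 1u << 26;

struct SketchKlass { uint32_t access_flags; };
struct SketchOop   { const SketchKlass* klass; };

enum class LockPath { Fast, Slow };

// Mirrors the described sequence: load the klass, load its access flags,
// AND with a constant, and branch on the result.
inline LockPath choose_monitorenter_path(const SketchOop& obj,
                                         bool diagnose_enabled) {
  if (diagnose_enabled) {
    const uint32_t flags = obj.klass->access_flags;
    if ((flags & kPrimitiveWrapperBit) != 0) {
      return LockPath::Slow;  // slow path reports the synchronization
    }
  }
  return LockPath::Fast;      // all other objects continue as before
}
```

With the flag off the function always returns LockPath::Fast, matching the statement that performance is unaffected when the diagnostic is disabled.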

In addition to the purpose described in the description of the bug, this 
flag will also be useful when trying to diagnose possible 
synchronization issues if these classes ever become inline types as part 
of the Valhalla project.

I added test SyncOnPrimitiveWrapperTest.java that tests for the exit and 
logging cases. I added test TestSyncOnPrimitiveWrapperEvent.java to test 
for the JFR event case.
I tested the patch running tiers1-6 in mach5 with the flag set to 
DiagnoseSyncOnPrimitiveWrappers=2.
I checked it builds with arm32 and ppc but can't run any tests on those 
platforms, so it would be good if somebody can run the new test included 
in the patch.

Let me know if you think I should run or add any more tests.

Thanks!

Patricio

From harold.seigel at oracle.com  Tue Jul 28 20:04:07 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 28 Jul 2020 16:04:07 -0400
Subject: RFR(T) 8250562: Clean up weird comment in vmTestbase class
 Terminator.java
Message-ID: <065de86d-dd47-46c6-1680-5cbd114a4d26@oracle.com>

Hi,

Please review this trivial change to fix JDK-8250562.

Open Webrev: 
http://cr.openjdk.java.net/~hseigel/bug_8250562/webrev/index.html

JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250562

Thanks, Harold


From lois.foltan at oracle.com  Tue Jul 28 20:06:36 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Tue, 28 Jul 2020 16:06:36 -0400
Subject: RFR(T) 8250562: Clean up weird comment in vmTestbase class
 Terminator.java
In-Reply-To: <065de86d-dd47-46c6-1680-5cbd114a4d26@oracle.com>
References: <065de86d-dd47-46c6-1680-5cbd114a4d26@oracle.com>
Message-ID: <8c72a2b5-5a84-ce07-7fed-588911a3022e@oracle.com>

Looks good and trivial.
Lois

On 7/28/2020 4:04 PM, Harold Seigel wrote:
> Hi,
>
> Please review this trivial change to fix JDK-8250562.
>
> Open Webrev: 
> http://cr.openjdk.java.net/~hseigel/bug_8250562/webrev/index.html
>
> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250562
>
> Thanks, Harold
>


From harold.seigel at oracle.com  Tue Jul 28 20:11:40 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Tue, 28 Jul 2020 16:11:40 -0400
Subject: RFR(T) 8250562: Clean up weird comment in vmTestbase class
 Terminator.java
In-Reply-To: <8c72a2b5-5a84-ce07-7fed-588911a3022e@oracle.com>
References: <065de86d-dd47-46c6-1680-5cbd114a4d26@oracle.com>
 <8c72a2b5-5a84-ce07-7fed-588911a3022e@oracle.com>
Message-ID: <e9c4357c-daa9-6ea0-cbc9-0c00dac5029c@oracle.com>

Thanks Lois!

Harold

On 7/28/2020 4:06 PM, Lois Foltan wrote:
> Looks good and trivial.
> Lois
>
> On 7/28/2020 4:04 PM, Harold Seigel wrote:
>> Hi,
>>
>> Please review this trivial change to fix JDK-8250562.
>>
>> Open Webrev: 
>> http://cr.openjdk.java.net/~hseigel/bug_8250562/webrev/index.html
>>
>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8250562
>>
>> Thanks, Harold
>>
>

From david.holmes at oracle.com  Wed Jul 29 09:30:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 29 Jul 2020 19:30:44 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>

Hi Ludovic,

I was on vacation today but back tomorrow (13 hours from this email) and 
I can sponsor this if Kim doesn't get there first. :)

Cheers,
David

On 29/07/2020 2:26 am, Ludovic Henry wrote:
> Hi Kim, David,
> 
> A quick follow-up on that change. Anything else you'd like to see changed? If not, could one of you please sponsor for it to be merged? Let me know of anything I should do to get it merged.
> 
> Thank you
> Ludovic
> 
> ________________________________________
> From: Ludovic Henry <luhenry at microsoft.com>
> Sent: Friday, July 17, 2020 11:26
> To: Kim Barrett
> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
> 
> Hi Kim,
> 
> I've updated the webrev at http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 with these spacing fixes.
> 
> ________________________________________
> From: Kim Barrett <kim.barrett at oracle.com>
> Sent: Thursday, July 16, 2020 18:43
> To: Ludovic Henry
> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
> 
>> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
>>
>> I've uploaded these latest changes to http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04
> 
> The change from "StubName" => "IntrinsicName" made the indenting of
> arguments in the calls no longer lined up normally. Line 65, line 82,
> and lines 104-5 are now abnormally indented.
> 
> Other than that, looks good.  I don't need another webrev for a fix of
> the indentation.
> 

From gerard.ziemski at oracle.com  Wed Jul 29 18:51:04 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Wed, 29 Jul 2020 13:51:04 -0500
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <f26d8a90-cfb9-73b9-aba2-ed57c5c74292@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
 <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>
 <f26d8a90-cfb9-73b9-aba2-ed57c5c74292@oracle.com>
Message-ID: <56bb935f-1277-e97e-aeb2-3f488bc825a7@oracle.com>


On 7/27/20 6:21 PM, David Holmes wrote:
> On 28/07/2020 2:12 am, gerard ziemski wrote:
>> Thank you David for taking a look.
>>
>>
>> On 7/19/20 11:37 PM, David Holmes wrote:
>>> Hi Gerard,
>>>
>>> On 18/07/2020 5:19 am, gerard ziemski wrote:
>>>> Hi all,
>>>>
>>>> Please review this small fix that adds the OS version and the OS 
>>>> build number to the hs_err_pidXXX.log output in the "Summary" 
>>>> section for Mac platform (it's easier to use for developers than 
>>>> the Darwin kernel version that we display right now).
>>>>
>>>> This is how things used to look:
>>>>
>>>>
>>>> --------------- S U M M A R Y ------------
>>>>
>>>> Command Line: Crasher
>>>>
>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>> 32G, Darwin 19.5.0
>>>> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds 
>>>> (0d 0h 0m 1s)
>>>>
>>>>
>>>> And this is how the "Summary" section looks with the proposed 
>>>> change:
>>>>
>>>>
>>>> --------------- S U M M A R Y ------------
>>>>
>>>> Command Line: Crasher
>>>>
>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>> 32G, Darwin 19.5.0, macOS 10.15.5 (19F101)
>>>> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds 
>>>> (0d 0h 0m 0s)
>>>>
>>>>
>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
>>>> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
>>>> testing Mach5 hs_tier1,2,3,4,5 in progress
>>>
>>> Just to be clear, the changes prior to:
>>>
>>> 1555 #ifdef __APPLE__
>>>
>>> are just fixing up existing indentation errors - correct?
>>
>> Yes, hope that's OK, as this was the only spot in the function that 
>> stood out with inconsistent indentation.
>
> Yes that is fine.
>
>>>
>>> The actual change seems okay, just one query:
>>>
>>> 1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };
>>>
>>> I couldn't find KERN_OSVERSION documented for sysctl - is it a 
>>> "recent" addition?
>>
>> Yes it is. Apple added it back in 2018 (see bug comments or this link 
>> https://github.com/apple/darwin-xnu/commit/5bbb823c13f3ab1ab58878f96b35433a29882676?diff=split#diff-6651b0c84a045f400bc45faa9f61c9e1 
>> )
>
> That link shows the addition of sysctl_osproductversion which I assume 
> underpins "kern.osproductversion". But my question was on 
> KERN_OSVERSION. That definition seems to already exist prior to the 
> change you link. My concern is whether it was also fairly recently 
> introduced and so referring to it would require a minimum macOS 
> version on the build machine?

Sorry, I thought you meant "kern.osproductversion", not KERN_OSVERSION, 
but that's a valid question.

I found Apple using KERN_OSVERSION in its own code since macOS 10.7, 
i.e. 
https://opensource.apple.com/source/Libc/Libc-763.11/gen/assumes.c.auto.html 
, though I could not find any documentation of it either.
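For readers following along, here is a minimal sketch of the kind of query being discussed. It is not the code from the webrev: the helper name macos_build_version and the buffer size are assumptions made for illustration; only the { CTL_KERN, KERN_OSVERSION } pair comes from the quoted line 1562. On non-Apple platforms the sketch simply returns an empty string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical helper (not the code under review): returns the macOS build
// string, e.g. "19F101", or an empty string if the query fails or the
// platform is not macOS.
std::string macos_build_version() {
#if defined(__APPLE__)
  int mib_build[] = { CTL_KERN, KERN_OSVERSION };
  char build[100] = { 0 };          // buffer size is an arbitrary assumption
  size_t size = sizeof(build);
  if (sysctl(mib_build, 2, build, &size, nullptr, 0) != 0) {
    return std::string();           // query failed
  }
  assert(size <= sizeof(build));    // kernel reports bytes actually written
  build[sizeof(build) - 1] = '\0';  // defensive termination
  return std::string(build);
#else
  return std::string();             // not macOS: nothing to report
#endif
}
```

Because KERN_OSVERSION is a compile-time constant from sysctl.h, the build-machine concern raised above is about whether the SDK header defines it, not about the runtime call itself.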


cheers


From david.holmes at oracle.com  Thu Jul 30 00:34:12 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 30 Jul 2020 10:34:12 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
Message-ID: <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>

Pushed.

David

On 29/07/2020 7:30 pm, David Holmes wrote:
> Hi Ludovic,
> 
> I was on vacation today but back tomorrow (13 hours from this email) and 
> I can sponsor this if Kim doesn't get there first. :)
> 
> Cheers,
> David
> 
> On 29/07/2020 2:26 am, Ludovic Henry wrote:
>> Hi Kim, David,
>>
>> A quick follow-up on that change. Anything else you'd like to see 
>> changed? If not, could one of you please sponsor for it to be merged? 
>> Let me know of anything I should do to get it merged.
>>
>> Thank you
>> Ludovic
>>
>> ________________________________________
>> From: Ludovic Henry <luhenry at microsoft.com>
>> Sent: Friday, July 17, 2020 11:26
>> To: Kim Barrett
>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>> Hi Kim,
>>
>> I've updated the webrev at 
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 
>> with these spacing fixes.
>>
>> ________________________________________
>> From: Kim Barrett <kim.barrett at oracle.com>
>> Sent: Thursday, July 16, 2020 18:43
>> To: Ludovic Henry
>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>>> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> 
>>> wrote:
>>>
>>> I've uploaded these latest changes to 
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 
>>>
>>
>> The change from "StubName" => "IntrinsicName" made the indenting of
>> arguments in the calls no longer lined up normally. Line 65, line 82,
>> and lines 104-5 are now abnormally indented.
>>
>> Other than that, looks good.  I don't need another webrev for a fix of
>> the indentation.
>>

From luhenry at microsoft.com  Thu Jul 30 01:30:19 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 30 Jul 2020 01:30:19 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
Message-ID: <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>

Perfect, thank you!

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Wednesday, July 29, 2020 5:34 PM
To: Ludovic Henry <luhenry at microsoft.com>; Kim Barrett <kim.barrett at oracle.com>
Cc: hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

Pushed.

David

On 29/07/2020 7:30 pm, David Holmes wrote:
> Hi Ludovic,
> 
> I was on vacation today but back tomorrow (13 hours from this email) and 
> I can sponsor this if Kim doesn't get there first. :)
> 
> Cheers,
> David
> 
> On 29/07/2020 2:26 am, Ludovic Henry wrote:
>> Hi Kim, David,
>>
>> A quick follow-up on that change. Anything else you'd like to see 
>> changed? If not, could one of you please sponsor for it to be merged? 
>> Let me know of anything I should do to get it merged.
>>
>> Thank you
>> Ludovic
>>
>> ________________________________________
>> From: Ludovic Henry <luhenry at microsoft.com>
>> Sent: Friday, July 17, 2020 11:26
>> To: Kim Barrett
>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>> Hi Kim,
>>
>> I've updated the webrev at 
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 
>> with these spacing fixes.
>>
>> ________________________________________
>> From: Kim Barrett <kim.barrett at oracle.com>
>> Sent: Thursday, July 16, 2020 18:43
>> To: Ludovic Henry
>> Cc: David Holmes; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64
>> Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code
>>
>>> On Jul 16, 2020, at 6:00 PM, Ludovic Henry <luhenry at microsoft.com> 
>>> wrote:
>>>
>>> I've uploaded these latest changes to 
>>> http://cr.openjdk.java.net/~burban/luhenry/8248817-atomics/webrev.04 
>>>
>>
>> The change from "StubName" => "IntrinsicName" made the indenting of
>> arguments in the calls no longer lined up normally. Line 65, line 82,
>> and lines 104-5 are now abnormally indented.
>>
>> Other than that, looks good.  I don't need another webrev for a fix of
>> the indentation.
>>

From luhenry at microsoft.com  Thu Jul 30 01:45:21 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 30 Jul 2020 01:45:21 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <751f047f-6ac6-225b-d6d2-513558aa6209@oracle.com>
 <MWHPR21MB05110D20181CCEA8E84D3EBAB0600@MWHPR21MB0511.namprd21.prod.outlook.com>
 <9df48223-d138-7f80-be83-0860fd6bc062@oracle.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
Message-ID: <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>

Hi David,

Looking at the change pushed [1], it seems like you pushed only part of the whole change (the bits related to atomics). It's missing the other two webrevs linked in the first email ([2] and [3]). What would be the best way to proceed? Would you like me to submit another review?

Thanks,
Ludovic

[1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
[2] http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
[3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/

From david.holmes at oracle.com  Thu Jul 30 06:10:40 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 30 Jul 2020 16:10:40 +1000
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <56bb935f-1277-e97e-aeb2-3f488bc825a7@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
 <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>
 <f26d8a90-cfb9-73b9-aba2-ed57c5c74292@oracle.com>
 <56bb935f-1277-e97e-aeb2-3f488bc825a7@oracle.com>
Message-ID: <dbd47bd3-fdef-6ff1-e552-e1cfd49d0dc7@oracle.com>

On 30/07/2020 4:51 am, gerard ziemski wrote:
> On 7/27/20 6:21 PM, David Holmes wrote:
>> On 28/07/2020 2:12 am, gerard ziemski wrote:
>>> Thank you David for taking a look.
>>>
>>>
>>> On 7/19/20 11:37 PM, David Holmes wrote:
>>>> Hi Gerard,
>>>>
>>>> On 18/07/2020 5:19 am, gerard ziemski wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this small fix that adds the OS version and the OS 
>>>>> build number to the hs_err_pidXXX.log output in the "Summary" 
>>>>> section for Mac platform (it's easier to use for developers than 
>>>>> the Darwin kernel version that we display right now).
>>>>>
>>>>> This is how things used to look:
>>>>>
>>>>>
>>>>> --------------- S U M M A R Y ------------
>>>>>
>>>>> Command Line: Crasher
>>>>>
>>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>>> 32G, Darwin 19.5.0
>>>>> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds 
>>>>> (0d 0h 0m 1s)
>>>>>
>>>>>
>>>>> And this is how the "Summary" section looks with the proposed 
>>>>> change:
>>>>>
>>>>>
>>>>> --------------- S U M M A R Y ------------
>>>>>
>>>>> Command Line: Crasher
>>>>>
>>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>>> 32G, Darwin 19.5.0, macOS 10.15.5 (19F101)
>>>>> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds 
>>>>> (0d 0h 0m 0s)
>>>>>
>>>>>
>>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
>>>>> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
>>>>> testing Mach5 hs_tier1,2,3,4,5 in progress
>>>>
>>>> Just to be clear, the changes prior to:
>>>>
>>>> 1555 #ifdef __APPLE__
>>>>
>>>> are just fixing up existing indentation errors - correct?
>>>
>>> Yes, hope that's OK, as this was the only spot in the function that 
>>> stood out with inconsistent indentation.
>>
>> Yes that is fine.
>>
>>>>
>>>> The actual change seems okay, just one query:
>>>>
>>>> 1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };
>>>>
>>>> I couldn't find KERN_OSVERSION documented for sysctl - is it a 
>>>> "recent" addition?
>>>
>>> Yes it is. Apple added it back in 2018 (see bug comments or this link 
>>> https://github.com/apple/darwin-xnu/commit/5bbb823c13f3ab1ab58878f96b35433a29882676?diff=split#diff-6651b0c84a045f400bc45faa9f61c9e1 
>>> )
>>
>> That link shows the addition of sysctl_osproductversion which I assume 
>> underpins "kern.osproductversion". But my question was on 
>> KERN_OSVERSION. That definition seems to already exist prior to the 
>> change you link. My concern is whether it was also fairly recently 
>> introduced and so referring to it would require a minimum macOS 
>> version on the build machine?
> 
> Sorry, I thought you meant "kern.osproductversion", not KERN_OSVERSION, 
> but that's a valid question.
> 
> I found Apple using KERN_OSVERSION in its own code since macOS 10.7, 
> i.e. 
> https://opensource.apple.com/source/Libc/Libc-763.11/gen/assumes.c.auto.html 
> , though I could not find any documentation of it either.

I found it in sysctl.h from 10.5 dev kit as well, so that looks fine to use.

Thanks for checking.
David
-----

> 
> cheers
> 

From david.holmes at oracle.com  Thu Jul 30 06:40:45 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 30 Jul 2020 16:40:45 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511F0ED860523341C33113CB07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
Message-ID: <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>

Hi Ludovic,

On 30/07/2020 11:45 am, Ludovic Henry wrote:
> Hi David,
> 
> Looking at the change pushed [1], it seems like you pushed only part of the whole change (the bits related to atomics). It's missing the other two webrevs linked in the first email ([2] and [3]). What would be the best way to proceed? Would you like me to submit another review?

Aaarghhh! Sorry. I'll file a new bug and push the other bits under it.

David

> Thanks,
> Ludovic
> 
> [1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
> [2] http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
> [3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
> 

From david.holmes at oracle.com  Thu Jul 30 06:56:45 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 30 Jul 2020 16:56:45 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>
Message-ID: <1dfdbace-944d-0c20-04de-7387c2cd1f5a@oracle.com>

Changeset: c35fba4bce35
Author:    dholmes
Date:      2020-07-30 02:47 -0400
URL:       https://hg.openjdk.java.net/jdk/jdk/rev/c35fba4bce35

8250810: Push missing parts of JDK-8248817
Summary: Push changes from JDK-8248817 that were accidentally excluded 
from the commit.
Reviewed-by: kbarrett, dholmes
Contributed-by: Ludovic Henry <luhenry at microsoft.com>

! src/hotspot/os/windows/os_windows.cpp
! src/hotspot/os_cpu/windows_x86/thread_windows_x86.cpp


On 30/07/2020 4:40 pm, David Holmes wrote:
> Hi Ludovic,
> 
> On 30/07/2020 11:45 am, Ludovic Henry wrote:
>> Hi David,
>>
>> Looking at the change pushed [1], it seems like you pushed only part 
>> of the whole change (the bits related to atomics). It's missing the 
>> other two webrevs linked in the first email ([2] and [3]). What would 
>> be the best way to proceed? Would you like me to submit another review?
> 
> Aaarghhh! Sorry. I'll file a new bug and push the other bits under it.
> 
> David
> 
>> Thanks,
>> Ludovic
>>
>> [1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
>> [2] 
>> http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>> [3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
>>

From dcherepanov at azul.com  Thu Jul 30 09:32:55 2020
From: dcherepanov at azul.com (Dmitry Cherepanov)
Date: Thu, 30 Jul 2020 12:32:55 +0300
Subject: RFR: 8250636: iso8601_time returns incorrect offset part on MacOS
Message-ID: <777fc30a-45ef-aa45-8f04-0bbe80c5a83d@azul.com>

Hello,

Please review a small change that fixes the offset (timezone) part of the
string returned by os::iso8601_time on macOS. The patch negates
tm_gmtoff and skips the DST adjustment when tm_gmtoff is used. The
behavior on other platforms should remain unchanged. More details are in
the bug report.
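
To illustrate the sign issue described above, here is a minimal sketch, not the patch itself. It assumes the usual conventions: tm_gmtoff counts seconds east of UTC and already reflects DST, while the POSIX `timezone` variable counts seconds west of UTC and does not. The helper names format_utc_offset and west_to_east are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper, not the HotSpot code: formats a UTC offset given in
// seconds EAST of UTC (the tm_gmtoff convention) as the ISO 8601 suffix
// "+hhmm" or "-hhmm".
std::string format_utc_offset(long seconds_east_of_utc) {
  // Real-world offsets are well within one day.
  assert(std::labs(seconds_east_of_utc) < 24L * 60 * 60);
  const char sign = (seconds_east_of_utc < 0) ? '-' : '+';
  const long abs_seconds = std::labs(seconds_east_of_utc);
  const long hours = abs_seconds / 3600;
  const long minutes = (abs_seconds % 3600) / 60;
  char buf[16];
  std::snprintf(buf, sizeof(buf), "%c%02ld%02ld", sign, hours, minutes);
  return std::string(buf);
}

// Converts an offset in seconds WEST of UTC (the POSIX `timezone`
// convention) to seconds east, so both sources can share one formatter.
long west_to_east(long seconds_west_of_utc) {
  return -seconds_west_of_utc;
}
```

With these conventions, a zone three hours east of UTC yields "+0300" from tm_gmtoff = 10800, and mixing up the two sign conventions would flip it to "-0300", which matches the kind of wrong offset the bug title describes.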

JBS: https://bugs.openjdk.java.net/browse/JDK-8250636
Webrev: http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v3/

Thanks,

Dmitry


From kim.barrett at oracle.com  Thu Jul 30 11:54:50 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 30 Jul 2020 07:54:50 -0400
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <1dfdbace-944d-0c20-04de-7387c2cd1f5a@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <a2662ad0-f23f-a223-498a-eaa357e8af44@oracle.com>
 <MWHPR21MB0511FE13098B53E2A665044EB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>
 <1dfdbace-944d-0c20-04de-7387c2cd1f5a@oracle.com>
Message-ID: <4BB7CDD2-5B3C-4EE5-99A1-B336207AB67E@oracle.com>

> On Jul 30, 2020, at 2:56 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Changeset: c35fba4bce35
> Author:    dholmes
> Date:      2020-07-30 02:47 -0400
> URL:       https://hg.openjdk.java.net/jdk/jdk/rev/c35fba4bce35
> 
> 8250810: Push missing parts of JDK-8248817
> Summary: Push changes from JDK-8248817 that were accidentally excluded from the commit.
> Reviewed-by: kbarrett, dholmes
> Contributed-by: Ludovic Henry <luhenry at microsoft.com>
> 
> ! src/hotspot/os/windows/os_windows.cpp
> ! src/hotspot/os_cpu/windows_x86/thread_windows_x86.cpp

For the record, I only reviewed the atomics changes (I always referred to
that change with my comments and approval; sorry that apparently wasn't
clear), so I think the other two parts (pushed under JDK-8250810) were only
reviewed by David. Not a disaster, but potentially an oops.

Just in general, I think it's better to split unrelated changes like this,
rather than having an omnibus bug/webrev.  The changes were split into
multiple webrevs, but all under the same bug, which led to the confusion
about who had reviewed what.

> On 30/07/2020 4:40 pm, David Holmes wrote:
>> Hi Ludovic,
>> On 30/07/2020 11:45 am, Ludovic Henry wrote:
>>> Hi David,
>>> 
>>> Looking at the change pushed [1], it seems like you pushed only part of the whole change (the bits related to atomics). It's missing the other two webrevs linked in the first email ([2] and [3]). What would be the best way to proceed? Would you like me to submit another review?
>> Aaarghhh! Sorry. I'll file a new bug and push the other bits under it.
>> David
>>> Thanks,
>>> Ludovic
>>> 
>>> [1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
>>> [2] http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>>> [3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/


From david.holmes at oracle.com  Thu Jul 30 12:20:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 30 Jul 2020 22:20:37 +1000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <4BB7CDD2-5B3C-4EE5-99A1-B336207AB67E@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>
 <1dfdbace-944d-0c20-04de-7387c2cd1f5a@oracle.com>
 <4BB7CDD2-5B3C-4EE5-99A1-B336207AB67E@oracle.com>
Message-ID: <5f871cba-cd9c-db39-df16-a3d8c605584e@oracle.com>

On 30/07/2020 9:54 pm, Kim Barrett wrote:
>> On Jul 30, 2020, at 2:56 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Changeset: c35fba4bce35
>> Author:    dholmes
>> Date:      2020-07-30 02:47 -0400
>> URL:       https://hg.openjdk.java.net/jdk/jdk/rev/c35fba4bce35
>>
>> 8250810: Push missing parts of JDK-8248817
>> Summary: Push changes from JDK-8248817 that were accidentally excluded from the commit.
>> Reviewed-by: kbarrett, dholmes
>> Contributed-by: Ludovic Henry <luhenry at microsoft.com>
>>
>> ! src/hotspot/os/windows/os_windows.cpp
>> ! src/hotspot/os_cpu/windows_x86/thread_windows_x86.cpp
> 
> For the record, I only reviewed the atomics changes (I always referred to
> that change with my comments and approval; sorry that apparently wasn't
> clear), so I think the other two parts (pushed under JDK-8250810) were only
> reviewed by David. Not a disaster, but potentially an oops.

My fault. With the time lag I overlooked the split with the initial 
commit and I didn't then go back and re-read the email thread to check 
the details.

David
-----

> Just in general, I think it's better to split unrelated changes like this,
> rather than having an omnibus bug/webrev.  The changes were split into
> multiple webrevs, but all under the same bug, which led to the confusion
> about who had reviewed what.
> 
>> On 30/07/2020 4:40 pm, David Holmes wrote:
>>> Hi Ludovic,
>>> On 30/07/2020 11:45 am, Ludovic Henry wrote:
>>>> Hi David,
>>>>
>>>> Looking at the change pushed [1], it seems like you pushed only part of the whole change (the bits related to atomics). It's missing the other two webrevs linked in the first email ([2] and [3]). What would be the best way to proceed? Would you like me to submit another review?
>>> Aaarghhh! Sorry. I'll file a new bug and push the other bits under it.
>>> David
>>>> Thanks,
>>>> Ludovic
>>>>
>>>> [1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
>>>> [2] http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>>>> [3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
> 
> 

From luhenry at microsoft.com  Thu Jul 30 14:11:18 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 30 Jul 2020 14:11:18 +0000
Subject: RFR: 8248817: Windows: Improving common cross-platform code
In-Reply-To: <5f871cba-cd9c-db39-df16-a3d8c605584e@oracle.com>
References: <MWHPR21MB0511E28F31DF6AF57089235EB0670@MWHPR21MB0511.namprd21.prod.outlook.com>
 <2F90F784-6829-4BFF-B20B-4F7E7FD0FAC3@oracle.com>
 <MWHPR21MB0511BB5454544B322E3EBE8DB07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A603594E2594180A08D2B07F0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <BE58A4B9-989F-4E47-AE9E-26A86BA28DBD@oracle.com>
 <MWHPR21MB05115677FF53C90F05303EE9B07C0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511AC06807028FECFCAD042B0730@MWHPR21MB0511.namprd21.prod.outlook.com>
 <db48d3e7-b0af-38c2-3929-fbc090132d45@oracle.com>
 <ff337193-c37a-175d-d7b3-a1d4c283bc70@oracle.com>
 <DM5PR21MB05076A5D713EAE09E8FBC065B0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <DM5PR21MB05078740C00EB63B70D7A3CFB0710@DM5PR21MB0507.namprd21.prod.outlook.com>
 <fa743b5d-5498-8e27-3e23-e55751ece193@oracle.com>
 <1dfdbace-944d-0c20-04de-7387c2cd1f5a@oracle.com>
 <4BB7CDD2-5B3C-4EE5-99A1-B336207AB67E@oracle.com>
 <5f871cba-cd9c-db39-df16-a3d8c605584e@oracle.com>
Message-ID: <CY4PR21MB050283151C59D93855AB7E8CB0710@CY4PR21MB0502.namprd21.prod.outlook.com>

Hi David, Kim,

Thanks again for pushing that and sorry for the confusion.

> Just in general, I think it's better to split unrelated changes like this,
> rather than having an omnibus bug/webrev.  The changes were split into
> multiple webrevs, but all under the same bug, which led to the confusion
> about who had reviewed what.

Definitely lesson learned for me. I'll make sure to have 1 Webrev for 1 JBS issue in the future.

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: Thursday, July 30, 2020 5:21 AM
To: Kim Barrett <kim.barrett at oracle.com>
Cc: Ludovic Henry <luhenry at microsoft.com>; hotspot-runtime-dev at openjdk.java.net; openjdk-aarch64 <openjdk-aarch64 at microsoft.com>
Subject: Re: RFR: 8248817: Windows: Improving common cross-platform code

On 30/07/2020 9:54 pm, Kim Barrett wrote:
>> On Jul 30, 2020, at 2:56 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Changeset: c35fba4bce35
>> Author:    dholmes
>> Date:      2020-07-30 02:47 -0400
>> URL:       https://hg.openjdk.java.net/jdk/jdk/rev/c35fba4bce35
>>
>> 8250810: Push missing parts of JDK-8248817
>> Summary: Push changes from JDK-8248817 that were accidentally excluded from the commit.
>> Reviewed-by: kbarrett, dholmes
>> Contributed-by: Ludovic Henry <luhenry at microsoft.com>
>>
>> ! src/hotspot/os/windows/os_windows.cpp
>> ! src/hotspot/os_cpu/windows_x86/thread_windows_x86.cpp
> 
> For the record, I only reviewed the atomics changes (I always referred to
> that change with my comments and approval; sorry that apparently wasn't
> clear), so I think the other two parts (pushed under JDK-8250810) were only
> reviewed by David. Not a disaster, but potentially an oops.

My fault. With the time lag I overlooked the split with the initial 
commit and I didn't then go back and re-read the email thread to check 
the details.

David
-----

> Just in general, I think it's better to split unrelated changes like this,
> rather than having an omnibus bug/webrev.  The changes were split into
> multiple webrevs, but all under the same bug, which led to the confusion
> about who had reviewed what.
> 
>> On 30/07/2020 4:40 pm, David Holmes wrote:
>>> Hi Ludovic,
>>> On 30/07/2020 11:45 am, Ludovic Henry wrote:
>>>> Hi David,
>>>>
>>>> Looking at the change pushed [1], it seems like you pushed only part of the whole change (the bits related to atomics). It's missing the other two webrevs linked in the first email ([2] and [3]). What would be the best way to proceed? Would you like me to submit another review?
>>> Aaarghhh! Sorry. I'll file a new bug and push the other bits under it.
>>> David
>>>> Thanks,
>>>> Ludovic
>>>>
>>>> [1] https://hg.openjdk.java.net/jdk/jdk/rev/bda65def14de
>>>> [2] http://cr.openjdk.java.net/~burban/luhenry/8248817-exception-handling/
>>>> [3] http://cr.openjdk.java.net/~burban/luhenry/8248817-frames/
> 
> 

From Nikola.Grcevski at microsoft.com  Thu Jul 30 16:25:38 2020
From: Nikola.Grcevski at microsoft.com (Nikola Grcevski)
Date: Thu, 30 Jul 2020 16:25:38 +0000
Subject: RFR(s): Support graceful application termination on Windows
 shutdown/logoff
In-Reply-To: <b1d37679-f376-b9d8-0745-23c4f78ba304@oracle.com>
References: <DM6PR21MB12890934E3B440B16CB962B1F5770@DM6PR21MB1289.namprd21.prod.outlook.com>
 <b1d37679-f376-b9d8-0745-23c4f78ba304@oracle.com>
Message-ID: <DM6PR21MB1289F17FF6B8DE7F51746ED9F5710@DM6PR21MB1289.namprd21.prod.outlook.com>

Hi core-libs-dev,

I've searched the mailing list archives and JBS extensively, but I cannot
find any more information on this topic, apart from the two linked bug reports [1] and [2].

I apologize for my newbie search skills. Can someone please help review the email
I sent to hotspot-runtime-dev below?

Essentially, no shutdown code runs on normal Windows logoff/shutdown since Windows 7, 
which I believe was reported as a change in behaviour under [1].

I want to point out that the issue has nothing to do with virtual machines; the JVM
simply doesn't receive the correct events unless there's an AWT window open.
The implementation for WM_ENDSESSION in AWT seems to have a small issue: the java
process will terminate even if the user changes their mind about shutting down.
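
To illustrate the wparam point, here is a minimal C++ sketch. It is not taken
from the webrev or from the AWT code. The two message values match the Win32
headers and are redefined locally so the snippet compiles without <windows.h>;
SessionAction and on_window_message are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstdint>

// Win32 message identifiers, redefined locally for portability.
constexpr std::uint32_t kWmQueryEndSession = 0x0011;
constexpr std::uint32_t kWmEndSession      = 0x0016;

// Hypothetical result type: what the process should do with a message.
enum class SessionAction { ignore, run_shutdown_hooks };

// Hypothetical handler. For WM_ENDSESSION, wparam is non-zero only when the
// session is really ending; it is zero when the shutdown was cancelled.
SessionAction on_window_message(std::uint32_t message, std::uintptr_t wparam) {
  if (message == kWmEndSession && wparam != 0) {
    return SessionAction::run_shutdown_hooks;
  }
  // WM_QUERYENDSESSION only asks whether shutdown may proceed, and
  // WM_ENDSESSION with wparam == 0 means the session continues.
  return SessionAction::ignore;
}
```

A handler that skips the wparam test would treat a cancelled shutdown the same
as a real one, which is the behaviour described above for AWT.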

Thanks,
Nikola

[1] https://bugs.openjdk.java.net/browse/JDK-8079631
[2] https://bugs.openjdk.java.net/browse/JDK-7068835

-----Original Message-----
From: David Holmes <david.holmes at oracle.com> 
Sent: July 23, 2020 11:02 PM
To: Nikola Grcevski <Nikola.Grcevski at microsoft.com>; core-libs-dev Libs <core-libs-dev at openjdk.java.net>
Cc: hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR(s): Support graceful application termination on Windows shutdown/logoff

Hi Nikola,

I'm redirecting this to the core-libs team initially because this is an issue that has been raised and discussed considerably in the past (possibly with some misunderstanding relating to the WM_ENDSESSION event). The core-libs team need to confirm the intended semantics here and we (runtime) can then implement whatever is determined to be needed. 
Interaction with the client team for AWT interoperability may also be needed.

Thanks,
David

On 24/07/2020 11:47 am, Nikola Grcevski wrote:
> Hello hotspot-runtime-dev,
> 
> After some recent investigation into stale files remaining after Java 
> process terminates on Windows shutdown, we noticed that there's 
> missing support for detecting Windows shutdown/logoff events for 
> interactive Java applications. Given that Java loads both GDI32.dll 
> and USER32.dll, even for console applications, this means that almost 
> all Java processes launched on Windows don't run any shutdown hooks at the moment on user logoff or system shutdown/restart.
> 
> Since Windows 7, all Windows applications that load (or transitively 
> call) GDI32.dll or USER32.dll will not receive the CTRL_LOGOFF_EVENT 
> and CTRL_SHUTDOWN_EVENT events, but instead they will be sent WM_ENDSESSION.
> 
> This is documented in MSDN under the following article:
> 
> https://docs.microsoft.com/en-us/windows/console/setconsolectrlhandler
> 
> It appears that this issue was logged in JBS at some point, but it was 
> made a duplicate of another issue:
> 
> https://bugs.openjdk.java.net/browse/JDK-8079631
> 
> The behaviour changed going from Windows Vista to Windows 7.
> 
> I've made a proposal patch to address this issue under the following webrev:
> 
> http://cr.openjdk.java.net/~adityam/nikola/wm_endsession_handling/
> 
> At the moment only AWT applications would terminate gracefully on 
> shutdown/logoff, because they have support for listening on WM_ENDSESSION.
> There's a bug in the AWT code: it doesn't check wparam upon 
> receiving the event, but it will work in most cases. If this patch is 
> accepted I can submit a follow-up patch for AWT to resolve the possible issues.
> 
> Finally, there is a third set of events for service processes, for 
> example java applications which are started with a Windows Service 
> wrapper. These services work with SERVICE_ACCEPT_SHUTDOWN and SERVICE_CONTROL_SHUTDOWN.
> Once the most common case is resolved, I'd like to submit a 
> follow-up patch to support graceful termination of Java programs running as Windows services.
> 
> We are working to amend the MSDN documentation for 
> SetConsoleCtrlHandler to specify that this behaviour change is also 
> present on server OSs. The documentation only mentions the workstation OS flavours at the moment.
> 
> Thanks in advance for reviewing this.
> 
> Nikola Grcevski
> Microsoft
> 

From gerard.ziemski at oracle.com  Thu Jul 30 16:52:39 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Thu, 30 Jul 2020 11:52:39 -0500
Subject: RFR: 8250636: iso8601_time returns incorrect offset part on MacOS
In-Reply-To: <777fc30a-45ef-aa45-8f04-0bbe80c5a83d@azul.com>
References: <777fc30a-45ef-aa45-8f04-0bbe80c5a83d@azul.com>
Message-ID: <e202b41a-6f0f-17e7-6e6e-d25ed1b2febe@oracle.com>

hi Dmitry,

Looks good when it comes to the core approach of the fix, but may I 
suggest simplifying the code a bit? Perhaps something like this:

http://cr.openjdk.java.net/~gziemski/8250636_rev1/index.html

I do not like the duplication of "const time_t" constants and I don't 
think that including more logic code in "get_timezone()" does much for 
readability of the code here.

Thank you for catching and fixing it!


cheers


On 7/30/20 4:32 AM, Dmitry Cherepanov wrote:
> Hello,
>
> Please review a small change for fixing offset (timezone) part of the
> string returned by os::iso8601_time on MacOS. The patch negates
> tm_gmtoff and skips the DST adjustment when tm_gmtoff is used. The
> behavior on other platforms should remain unchanged. More details are in
> the bug report.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8250636
> Webrev: http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v3/
>
> Thanks,
>
> Dmitry
>
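
For readers unfamiliar with the two sign conventions involved, a minimal C++
sketch follows. It is not the webrev code. It assumes tm_gmtoff counts seconds
east of UTC and already includes any DST shift, while the POSIX `timezone`
variable counts seconds west of UTC and does not. The function names are
hypothetical, and the one-hour DST shift is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: formats an offset given in seconds east of UTC
// (the tm_gmtoff convention) as "+HHMM" or "-HHMM".
std::string format_utc_offset(long seconds_east_of_utc) {
  assert(std::labs(seconds_east_of_utc) < 24L * 60 * 60 && "offset out of range");
  const char sign = (seconds_east_of_utc < 0) ? '-' : '+';
  const long abs_seconds = std::labs(seconds_east_of_utc);
  const long hours = abs_seconds / 3600;
  const long minutes = (abs_seconds % 3600) / 60;
  char buf[16];
  std::snprintf(buf, sizeof(buf), "%c%02ld%02ld", sign, hours, minutes);
  return std::string(buf);
}

// Hypothetical helper: converts a POSIX-style `timezone` value (seconds WEST
// of UTC, without DST) into the seconds-east convention used above.
long seconds_east_from_posix_timezone(long seconds_west, bool dst_active) {
  long east = -seconds_west;   // flip the sign convention
  if (dst_active) {
    east += 3600;              // assumption: DST shifts the clock by one hour
  }
  return east;
}
```

Mixing the two conventions without the sign flip, or applying the DST shift to
a value that already includes it, yields a wrong offset string.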


From gerard.ziemski at oracle.com  Thu Jul 30 16:54:32 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Thu, 30 Jul 2020 11:54:32 -0500
Subject: RFR (S) 8237591: Mac: include OS X version in hs_err_pid crash
 log file
In-Reply-To: <dbd47bd3-fdef-6ff1-e552-e1cfd49d0dc7@oracle.com>
References: <49AB6201-C2C2-4862-A019-B60EEE44E515@me.com>
 <74c08f37-673c-84ae-a512-6f5afbe08050@oracle.com>
 <70bb6f74-e626-bd54-ddf0-568bebe933e9@oracle.com>
 <b94bd6f1-9b37-a7d3-16da-e3f0d211fcb8@oracle.com>
 <f26d8a90-cfb9-73b9-aba2-ed57c5c74292@oracle.com>
 <56bb935f-1277-e97e-aeb2-3f488bc825a7@oracle.com>
 <dbd47bd3-fdef-6ff1-e552-e1cfd49d0dc7@oracle.com>
Message-ID: <e95ae076-0bd4-d938-18e7-51bd2f49e051@oracle.com>

Thank you for the review.

May I please have a 2nd reviewer? (or would this be considered trivial?)


cheers

On 7/30/20 1:10 AM, David Holmes wrote:
> On 30/07/2020 4:51 am, gerard ziemski wrote:
>> On 7/27/20 6:21 PM, David Holmes wrote:
>>> On 28/07/2020 2:12 am, gerard ziemski wrote:
>>>> Thank you David for taking a look.
>>>>
>>>>
>>>> On 7/19/20 11:37 PM, David Holmes wrote:
>>>>> Hi Gerard,
>>>>>
>>>>> On 18/07/2020 5:19 am, gerard ziemski wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this small fix that adds the OS version and the OS 
>>>>>> build number to the hs_err_pidXXX.log output in the "Summary" 
>>>>>> section for Mac platform (it's easier to use for developers than 
>>>>>> the Darwin kernel version that we display right now).
>>>>>>
>>>>>> This is how things used to look:
>>>>>>
>>>>>>
>>>>>> --------------- S U M M A R Y ------------
>>>>>>
>>>>>> Command Line: Crasher
>>>>>>
>>>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>>>> 32G, Darwin 19.5.0
>>>>>> Time: Thu Jul 16 14:01:46 2020 CDT elapsed time: 1.089465 seconds 
>>>>>> (0d 0h 0m 1s)
>>>>>>
>>>>>>
>>>>>> And this is how the "Summary" section looks with the 
>>>>>> proposed change:
>>>>>>
>>>>>>
>>>>>> --------------- S U M M A R Y ------------
>>>>>>
>>>>>> Command Line: Crasher
>>>>>>
>>>>>> Host: Gerards-MBP-16, MacBookPro16,1 x86_64 2600 MHz, 12 cores, 
>>>>>> 32G, Darwin 19.5.0, macOS 10.15.5 (19F101)
>>>>>> Time: Thu Jul 16 14:02:29 2020 CDT elapsed time: 0.360881 seconds 
>>>>>> (0d 0h 0m 0s)
>>>>>>
>>>>>>
>>>>>> bug link at https://bugs.openjdk.java.net/browse/JDK-8237591
>>>>>> open webrev at http://cr.openjdk.java.net/~gziemski/8237591_rev1
>>>>>> testing Mach5 hs_tier1,2,3,4,5 in progress
>>>>>
>>>>> Just to be clear, the changes prior to:
>>>>>
>>>>> 1555 #ifdef __APPLE__
>>>>>
>>>>> are just fixing up existing indentation errors - correct?
>>>>
>>>> Yes, hope that's OK, as this was the only spot in the function that 
>>>> stood out with inconsistent indentation.
>>>
>>> Yes that is fine.
>>>
>>>>>
>>>>> The actual change seems okay, just one query:
>>>>>
>>>>> 1562     int mib_build[] = { CTL_KERN, KERN_OSVERSION };
>>>>>
>>>>> I couldn't find KERN_OSVERSION documented for sysctl - is it a 
>>>>> "recent" addition?
>>>>
>>>> Yes it is. Apple added it back in 2018 (see bug comments or this 
>>>> link 
>>>> https://github.com/apple/darwin-xnu/commit/5bbb823c13f3ab1ab58878f96b35433a29882676?diff=split#diff-6651b0c84a045f400bc45faa9f61c9e1 
>>>> )
>>>
>>> That link shows the addition of sysctl_osproductversion which I 
>>> assume underpins "kern.osproductversion". But my question was on 
>>> KERN_OSVERSION. That definition seems to already exist prior to the 
>>> change you link. My concern is whether it was also fairly recently 
>>> introduced and so referring to it would require a minimum macOS 
>>> version on the build machine?
>>
>> Sorry, I thought you meant "kern.osproductversion", not 
>> KERN_OSVERSION, but that's a valid question.
>>
>> I found Apple using KERN_OSVERSION in its own code since macOS 10.7, 
>> i.e. 
>> https://opensource.apple.com/source/Libc/Libc-763.11/gen/assumes.c.auto.html 
>> , though I could not find any documentation of it either.
>
> I found it in sysctl.h from 10.5 dev kit as well, so that looks fine 
> to use.
>
> Thanks for checking.
> David
> -----


From shade at redhat.com  Thu Jul 30 19:03:58 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 30 Jul 2020 21:03:58 +0200
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors check the
 bounds
Message-ID: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8250844

I was debugging some new VM patch, and figured it was a memory stomp due to wrong index passed to
objArrayOopDesc::obj_at_put. That method does not assert the index at all, which hides the errors
and silently corrupts the heap, until something else discovers it. Some objArrayOopDesc accessors do
verify the index against the bounds. Same thing goes for typeArrayOopDesc.

Fix:
  https://cr.openjdk.java.net/~shade/8250844/webrev.01/

Testing: tier{1,2} locally; jdk-submit (running)
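
A minimal C++ sketch of the kind of index assertion described above, for
illustration only. It is not the webrev code: CheckedArray and its members are
hypothetical stand-ins for the array accessors, and a std::vector stands in
for the heap payload.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an array object with checked accessors.
class CheckedArray {
 public:
  explicit CheckedArray(int length)
      : _data(static_cast<std::size_t>(length), 0) {}

  int length() const { return static_cast<int>(_data.size()); }

  bool is_within_bounds(int index) const {
    return 0 <= index && index < length();
  }

  // Reads are asserted against the bounds.
  int at(int index) const {
    assert(is_within_bounds(index) && "index out of bounds");
    return _data[static_cast<std::size_t>(index)];
  }

  // Writes are asserted too, so a bad index fails at the store instead of
  // silently overwriting whatever follows the array.
  void at_put(int index, int value) {
    assert(is_within_bounds(index) && "index out of bounds");
    _data[static_cast<std::size_t>(index)] = value;
  }

 private:
  std::vector<int> _data;
};
```

The point of asserting in the accessor is that the failure is reported where
the wrong index is used, not later when something else trips over the damage.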

-- 
Thanks,
-Aleksey


From felix.yang at huawei.com  Fri Jul 31 01:39:11 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 31 Jul 2020 01:39:11 +0000
Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E912DF@dggeml507-mbx.china.huawei.com>

Ping...    Any suggestions?

Thanks,
Felix

> -----Original Message-----
> From: Yangfei (Felix)
> Sent: Saturday, July 25, 2020 10:16 AM
> To: hotspot-runtime-dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
> 
> Hi,
> 
>     Bug: https://bugs.openjdk.java.net/browse/JDK-8165404
>     Webrev: http://cr.openjdk.java.net/~fyang/8165404/webrev.00/
> 
>     This implements the SHA-384/SHA-512 transformation using the AArch64 v8.2
> SHA512 Crypto Extensions.
>     Reference implementation:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha512-ce-core.S?h=v5.4.52
> 
>     We used QEMU system emulator which supports SHA512 instructions to
> test the functionality.
>     SHA512 basic functionality is tested with:
> http://cr.openjdk.java.net/~fyang/8165404/SHA512.java
>     Patch passed jtreg tier1-3 test with QEMU system emulator.
>     We've also verified it with full jtreg tests without SHA512 instructions on
> aarch64-linux-gnu, to make sure that there's no regression.
> 
>     We've also created a JMH for performance test:
> http://cr.openjdk.java.net/~fyang/8165404/TestSHA512.java
>     We measured the performance benefit with a cycle-accurate simulator.
>     The patch delivers more than a 2x performance gain, measured with three
> different message sizes.
> 
>     Comments?
> 
> Thanks,
> Felix

From aph at redhat.com  Fri Jul 31 08:34:08 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 31 Jul 2020 09:34:08 +0100
Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E912DF@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E912DF@dggeml507-mbx.china.huawei.com>
Message-ID: <654e8a2e-87a1-9c3d-4c58-36b4215bb0ad@redhat.com>

On 7/31/20 2:39 AM, Yangfei (Felix) wrote:
> Ping...    Any suggestions?

Sorry for the slowness. I've been thinking about what to do about patches we
can't really test. Given that no one has this hardware, I think we should
accept this patch but not enable it by default.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From richard.reingruber at sap.com  Fri Jul 31 09:11:38 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 31 Jul 2020 09:11:38 +0000
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors
 check the bounds
In-Reply-To: <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
References: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>
 <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
Message-ID: <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Aleksey,

it does make sense to add the range checks and your patch looks good to me.
// Not Reviewer though.

Maybe bounds should be checked even at a lower level, namely in HeapAccess?
At least when storing to the heap? Or would that be too pedantic, too expensive?
Just curious what you and others think.

Thanks, Richard. 

---------- Forwarded message ---------
From: Aleksey Shipilev <mailto:shade at redhat.com>
Date: Thu, Jul 30, 2020 at 9:04 PM
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors check the bounds
To: mailto:hotspot-runtime-dev at openjdk.java.net <mailto:hotspot-runtime-dev at openjdk.java.net>


RFE:
  https://bugs.openjdk.java.net/browse/JDK-8250844

I was debugging some new VM patch, and figured it was a memory stomp due to wrong index passed to
objArrayOopDesc::obj_at_put. That method does not assert the index at all, which hides the errors
and silently corrupts the heap, until something else discovers it. Some objArrayOopDesc accessors do
verify the index against the bounds. Same thing goes for typeArrayOopDesc.

Fix:
  https://cr.openjdk.java.net/~shade/8250844/webrev.01/

Testing: tier{1,2} locally; jdk-submit (running)

-- 
Thanks,
-Aleksey

From shade at redhat.com  Fri Jul 31 09:26:39 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 31 Jul 2020 11:26:39 +0200
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors
 check the bounds
In-Reply-To: <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>
 <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
 <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <935516a9-231f-ef6e-2de0-3bf3451f8322@redhat.com>

On 7/31/20 11:11 AM, Reingruber, Richard wrote:
> it does make sense to add the range checks and your patch looks good to me.
> // Not Reviewer though.

Thanks!

> Maybe bounds should be checked even at a lower level, namely in HeapAccess?

I briefly considered it yesterday. But AFAIU, HeapAccess is too low-level: it does not know about
arrays, only about oop + offset. So {type,obj}ArrayOopDesc seems as low as we can get.

-- 
-Aleksey


From felix.yang at huawei.com  Fri Jul 31 09:57:05 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 31 Jul 2020 09:57:05 +0000
Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
In-Reply-To: <654e8a2e-87a1-9c3d-4c58-36b4215bb0ad@redhat.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E912DF@dggeml507-mbx.china.huawei.com>
 <654e8a2e-87a1-9c3d-4c58-36b4215bb0ad@redhat.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E91982@dggeml507-mbx.china.huawei.com>

Hi Andrew,

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Friday, July 31, 2020 4:34 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-runtime-
> dev at openjdk.java.net
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
> 
> On 7/31/20 2:39 AM, Yangfei (Felix) wrote:
> > Ping...    Any suggestions?
> 
> Sorry for the slowness. I've been thinking what to do about patches we can't
> really test. Given that no one has this hardware, I think we should accept this
> patch but not enable it by default.

Thanks for the reply.
Since this will not be enabled when the hardware feature is missing,  do you think we need any more changes?

Felix


From dcherepanov at azul.com  Fri Jul 31 10:29:49 2020
From: dcherepanov at azul.com (Dmitry Cherepanov)
Date: Fri, 31 Jul 2020 13:29:49 +0300
Subject: RFR: 8250636: iso8601_time returns incorrect offset part on MacOS
In-Reply-To: <e202b41a-6f0f-17e7-6e6e-d25ed1b2febe@oracle.com>
References: <777fc30a-45ef-aa45-8f04-0bbe80c5a83d@azul.com>
 <e202b41a-6f0f-17e7-6e6e-d25ed1b2febe@oracle.com>
Message-ID: <0e1ed1ae-2011-2e4a-4f58-d99b57ede74c@azul.com>

Hi Gerard,

Thanks for reviewing this, moving the logic of get_timezone to
iso8601_time looks good to me too.
Here's an updated webrev
http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v4/

Thanks,

Dmitry

On 30.07.2020 19:52, gerard ziemski wrote:
> hi Dmitry,
>
> Looks good when it comes to the core approach of the fix, but may I
> suggest simplifying the code a bit? Perhaps something like this:
>
> http://cr.openjdk.java.net/~gziemski/8250636_rev1/index.html
>
> I do not like the duplication of "const time_t" constants and I don't
> think that including more logic code in "get_timezone()" does much for
> readability of the code here.
>
> Thank you for catching and fixing it!
>
>
> cheers
>
>
> On 7/30/20 4:32 AM, Dmitry Cherepanov wrote:
>> Hello,
>>
>> Please review a small change for fixing offset (timezone) part of the
>> string returned by os::iso8601_time on MacOS. The patch negates
>> tm_gmtoff and skips the DST adjustment when tm_gmtoff is used. The
>> behavior on other platforms should remain unchanged. More details are in
>> the bug report.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250636
>> Webrev: http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v3/
>>
>> Thanks,
>>
>> Dmitry
>>


From richard.reingruber at sap.com  Fri Jul 31 11:37:40 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 31 Jul 2020 11:37:40 +0000
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors
 check the bounds
In-Reply-To: <935516a9-231f-ef6e-2de0-3bf3451f8322@redhat.com>
References: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>
 <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
 <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <935516a9-231f-ef6e-2de0-3bf3451f8322@redhat.com>
Message-ID: <AM0PR0202MB3331C10894A9C0539AECBF949B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

I noticed this too. But then I thought, the offset could be checked for every kind of java heap
object, maybe guarded by some PedanticChecks flag. On the other hand one could argue that for
instance field accesses the offset is correct by construction.

Well, I reckon your fix is a good and balanced solution.

Thanks, Richard. 

-----Original Message-----
From: Aleksey Shipilev <shade at redhat.com> 
Sent: Freitag, 31. Juli 2020 11:27
To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-runtime-dev at openjdk.java.net
Subject: Re: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors check the bounds

On 7/31/20 11:11 AM, Reingruber, Richard wrote:
> it does make sense to add the range checks and your patch looks good to me.
> // Not Reviewer though.

Thanks!

> Maybe bounds should be checked even at a lower level, namely in HeapAccess?

I briefly considered it yesterday. But AFAIU, HeapAccess is too low-level: it does not know about
arrays, only about oop + offset. So {type,obj}ArrayOopDesc seems as low as we can get.

-- 
-Aleksey


From shade at redhat.com  Fri Jul 31 12:53:30 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 31 Jul 2020 14:53:30 +0200
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors
 check the bounds
In-Reply-To: <AM0PR0202MB3331C10894A9C0539AECBF949B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>
 <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
 <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <935516a9-231f-ef6e-2de0-3bf3451f8322@redhat.com>
 <AM0PR0202MB3331C10894A9C0539AECBF949B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <be9fda4a-4e0f-ae94-fb6e-b062634e250f@redhat.com>

On 7/31/20 1:37 PM, Reingruber, Richard wrote:
> Well, I reckon your fix is a good and balanced solution.

Yeah. Thanks for review!

Anyone else?

-- 
-Aleksey


From coleen.phillimore at oracle.com  Fri Jul 31 14:12:16 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 31 Jul 2020 10:12:16 -0400
Subject: RFR (S) 8250844: Make sure {type, obj}ArrayOopDesc accessors
 check the bounds
In-Reply-To: <be9fda4a-4e0f-ae94-fb6e-b062634e250f@redhat.com>
References: <9a8f1e02-c24d-c8e1-1b2c-1b928ddbe23e@redhat.com>
 <CAPE47PPZsa0_7f2aF+d3eBQ7ZF=n1Yod1AeBi90RVMJS8HvZCA@mail.gmail.com>
 <AM0PR0202MB3331DFD0296F65E8AB4D42539B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <935516a9-231f-ef6e-2de0-3bf3451f8322@redhat.com>
 <AM0PR0202MB3331C10894A9C0539AECBF949B4E0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <be9fda4a-4e0f-ae94-fb6e-b062634e250f@redhat.com>
Message-ID: <4db2ac51-2120-5139-28c5-bd9321c45565@oracle.com>


I like this change a lot! I'm surprised but relieved that it didn't 
find any existing bugs.
LGTM!
Coleen

On 7/31/20 8:53 AM, Aleksey Shipilev wrote:
> On 7/31/20 1:37 PM, Reingruber, Richard wrote:
>> Well, I reckon your fix is a good and balanced solution.
> Yeah. Thanks for review!
>
> Anyone else?
>


From gerard.ziemski at oracle.com  Fri Jul 31 15:08:55 2020
From: gerard.ziemski at oracle.com (gerard ziemski)
Date: Fri, 31 Jul 2020 10:08:55 -0500
Subject: RFR: 8250636: iso8601_time returns incorrect offset part on MacOS
In-Reply-To: <0e1ed1ae-2011-2e4a-4f58-d99b57ede74c@azul.com>
References: <777fc30a-45ef-aa45-8f04-0bbe80c5a83d@azul.com>
 <e202b41a-6f0f-17e7-6e6e-d25ed1b2febe@oracle.com>
 <0e1ed1ae-2011-2e4a-4f58-d99b57ede74c@azul.com>
Message-ID: <86ed9c10-5798-d135-1adc-6a4c495d0dc6@oracle.com>

Looks good to me, thank you.


On 7/31/20 5:29 AM, Dmitry Cherepanov wrote:
> Hi Gerard,
>
> Thanks for reviewing this, moving the logic of get_timezone to
> iso8601_time looks good to me too.
> Here's an updated webrev
> http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v4/
>
> Thanks,
>
> Dmitry
>
> On 30.07.2020 19:52, gerard ziemski wrote:
>> hi Dmitry,
>>
>> Looks good when it comes to the core approach of the fix, but may I
>> suggest simplifying the code a bit? Perhaps something like this:
>>
>> http://cr.openjdk.java.net/~gziemski/8250636_rev1/index.html
>>
>> I do not like the duplication of "const time_t" constants and I don't
>> think that including more logic code in "get_timezone()" does much for
>> readability of the code here.
>>
>> Thank you for catching and fixing it!
>>
>>
>> cheers
>>
>>
>> On 7/30/20 4:32 AM, Dmitry Cherepanov wrote:
>>> Hello,
>>>
>>> Please review a small change for fixing offset (timezone) part of the
>>> string returned by os::iso8601_time on MacOS. The patch negates
>>> tm_gmtoff and skips the DST adjustment when tm_gmtoff is used. The
>>> behavior on other platforms should remain unchanged. More details are in
>>> the bug report.
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250636
>>> Webrev: http://cr.openjdk.java.net/~dcherepanov/8250636/webrev.v3/
>>>
>>> Thanks,
>>>
>>> Dmitry
>>>


From aph at redhat.com  Fri Jul 31 15:56:04 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 31 Jul 2020 16:56:04 +0100
Subject: RFR: 8165404: AArch64: Implement SHA512 accelerator/intrinsic
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E91982@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E912DF@dggeml507-mbx.china.huawei.com>
 <654e8a2e-87a1-9c3d-4c58-36b4215bb0ad@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E91982@dggeml507-mbx.china.huawei.com>
Message-ID: <5350e745-87af-ce34-8711-2b652a38e323@redhat.com>

On 7/31/20 10:57 AM, Yangfei (Felix) wrote:
> Since this will not be enabled when the hardware feature is missing,  do you think we need any more changes?

I do. I do not want the accelerator intrinsic to be auto-enabled until
it has been tested on hardware.
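
The difference between "not enabled when the hardware feature is missing" and
"not auto-enabled" can be sketched as below. This is a hypothetical model in
C++, not the patch and not HotSpot's flag-processing code; IntrinsicRequest and
use_sha512_intrinsic are names invented for the illustration.

```cpp
#include <cassert>

// Hypothetical inputs to the decision.
struct IntrinsicRequest {
  bool hardware_has_sha512;  // CPU reports the SHA512 extension
  bool flag_set_by_user;     // the user passed the flag explicitly
  bool flag_value;           // the value the user asked for
};

// Hypothetical policy: the intrinsic is used only when the hardware supports
// it AND the user explicitly asked for it. Hardware support alone does not
// turn it on.
bool use_sha512_intrinsic(const IntrinsicRequest& req) {
  if (!req.hardware_has_sha512) {
    return false;            // cannot be enabled without the feature
  }
  if (req.flag_set_by_user) {
    return req.flag_value;   // explicit opt-in or opt-out wins
  }
  return false;              // default stays off until tested on hardware
}
```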

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From lois.foltan at oracle.com  Fri Jul 31 17:55:17 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Fri, 31 Jul 2020 13:55:17 -0400
Subject: RFR(S): JDK-8247938: Change various JVM enums like
 LinkInfo::AccessCheck and Klass::DefaultsLookupMode to enum class
Message-ID: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>

Please review the webrev to change the following unscoped enumeration 
declarations to scoped enumerations.

Klass::DefaultsLookupMode
Klass::OverpassLookupMode
Klass::StaticLookupMode
Klass::PrivateLookupMode
LinkInfo::AccessCheck

With C++11/14, enum class provides added type safety by preventing 0 
(integral) or false (boolean) from being implicitly converted to 
AccessCheck::needs_access_check, for example.
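
A minimal C++ sketch of the type-safety difference, for illustration only and
not taken from the webrev. The enumerator skip_access_check, the unscoped
LegacyAccessCheck enum and performs_check are assumptions made for the
example; the static_asserts state what the language guarantees.

```cpp
#include <cassert>
#include <type_traits>

// Unscoped form (hypothetical): enumerators convert implicitly to int.
enum LegacyAccessCheck { legacy_needs_access_check, legacy_skip_access_check };

// Scoped form: no implicit conversion to or from integral types.
enum class AccessCheck { needs_access_check, skip_access_check };

static_assert(std::is_convertible<LegacyAccessCheck, int>::value,
              "unscoped enums convert to int implicitly");
static_assert(!std::is_convertible<AccessCheck, int>::value,
              "scoped enums do not convert to int implicitly");
static_assert(!std::is_convertible<AccessCheck, bool>::value,
              "scoped enums do not convert to bool implicitly");
static_assert(!std::is_convertible<int, AccessCheck>::value,
              "integers do not convert to scoped enums implicitly");

// Hypothetical caller-facing function: only an AccessCheck value is accepted,
// so the intent is spelled out at every call site.
bool performs_check(AccessCheck check) {
  return check == AccessCheck::needs_access_check;
}
```

With the scoped form, a call such as performs_check(false) or a test such as
`if (check)` does not compile, so mixing the enum with booleans or integers
has to be written out as an explicit cast.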

open webrev 
at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/ 
<http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/index.html>
bug link: https://bugs.openjdk.java.net/browse/JDK-8247938

Testing: hs-tier1 & 2 complete, hs-tier3-6 in progress

Thanks,
Lois

From harold.seigel at oracle.com  Fri Jul 31 18:31:20 2020
From: harold.seigel at oracle.com (Harold Seigel)
Date: Fri, 31 Jul 2020 14:31:20 -0400
Subject: RFR(S): JDK-8247938: Change various JVM enums like
 LinkInfo::AccessCheck and Klass::DefaultsLookupMode to enum class
In-Reply-To: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>
References: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>
Message-ID: <02fce8c5-7d6f-8bd0-2024-f201a9feda55@oracle.com>

Hi Lois,

The changes look good.

Thanks, Harold

On 7/31/2020 1:55 PM, Lois Foltan wrote:
> Please review the webrev to change the following unscoped enumeration 
> declarations to scoped enumerations.
>
> Klass::DefaultsLookupMode
> Klass::OverpassLookupMode
> Klass::StaticLookupMode
> Klass::PrivateLookupMode
> LinkInfo::AccessCheck
>
> With C++11/14 enum class provides the added type safety to prevent 0 
> (integral) or false (boolean) to be implicitly converted to 
> AccessCheck::needs_access_check for example.
>
> open webrev 
> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/ 
> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/index.html>
> bug link: https://bugs.openjdk.java.net/browse/JDK-8247938
>
> Testing: hs-tier1 & 2 complete, hs-tier3-6 in progress
>
> Thanks,
> Lois

From coleen.phillimore at oracle.com  Fri Jul 31 18:50:11 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 31 Jul 2020 14:50:11 -0400
Subject: RFR(S): JDK-8247938: Change various JVM enums like
 LinkInfo::AccessCheck and Klass::DefaultsLookupMode to enum class
In-Reply-To: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>
References: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>
Message-ID: <81ee29bb-9d85-3d42-f2f2-a55a2eb4e594@oracle.com>

Yes, I agree. This looks good!
Coleen

On 7/31/20 1:55 PM, Lois Foltan wrote:
> Please review the webrev to change the following unscoped enumeration 
> declarations to scoped enumerations.
>
> Klass::DefaultsLookupMode
> Klass::OverpassLookupMode
> Klass::StaticLookupMode
> Klass::PrivateLookupMode
> LinkInfo::AccessCheck
>
> With C++11/14 enum class provides the added type safety to prevent 0 
> (integral) or false (boolean) to be implicitly converted to 
> AccessCheck::needs_access_check for example.
>
> open webrev 
> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/ 
> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/index.html>
> bug link: https://bugs.openjdk.java.net/browse/JDK-8247938
>
> Testing: hs-tier1 & 2 complete, hs-tier3-6 in progress
>
> Thanks,
> Lois


From lois.foltan at oracle.com  Fri Jul 31 19:27:34 2020
From: lois.foltan at oracle.com (Lois Foltan)
Date: Fri, 31 Jul 2020 15:27:34 -0400
Subject: RFR(S): JDK-8247938: Change various JVM enums like
 LinkInfo::AccessCheck and Klass::DefaultsLookupMode to enum class
In-Reply-To: <81ee29bb-9d85-3d42-f2f2-a55a2eb4e594@oracle.com>
References: <b8ed8618-df95-ff49-ee6e-1cffb08e5222@oracle.com>
 <81ee29bb-9d85-3d42-f2f2-a55a2eb4e594@oracle.com>
Message-ID: <851ffb96-bc93-a177-49c5-a9d8c9173240@oracle.com>

Thank you Harold & Coleen for the review!
Lois

On 7/31/2020 2:50 PM, coleen.phillimore at oracle.com wrote:
> Yes, I agree. This looks good!
> Coleen
>
> On 7/31/20 1:55 PM, Lois Foltan wrote:
>> Please review the webrev to change the following unscoped enumeration 
>> declarations to scoped enumerations.
>>
>> Klass::DefaultsLookupMode
>> Klass::OverpassLookupMode
>> Klass::StaticLookupMode
>> Klass::PrivateLookupMode
>> LinkInfo::AccessCheck
>>
>> With C++11/14, enum class provides added type safety by preventing 0
>> (integral) or false (boolean) from being implicitly converted to
>> AccessCheck::needs_access_check, for example.
>>
>> open webrev 
>> at:http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/ 
>> <http://cr.openjdk.java.net/~lfoltan/bug_jdk8247938.0/webrev/index.html>
>> bug link: https://bugs.openjdk.java.net/browse/JDK-8247938
>>
>> Testing: hs-tier1 & 2 complete, hs-tier3-6 in progress
>>
>> Thanks,
>> Lois
>


From suenaga at oss.nttdata.com  Fri Jul 31 23:04:22 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 1 Aug 2020 08:04:22 +0900
Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
In-Reply-To: <05f4a122-6b17-37f9-5ee4-228704308a6f@oss.nttdata.com>
References: <3a1a5ff7-308d-c003-fc34-14b276ecaf43@oss.nttdata.com>
 <AM0PR02MB5412723053A1F4715A9387FE93720@AM0PR02MB5412.eurprd02.prod.outlook.com>
 <05f4a122-6b17-37f9-5ee4-228704308a6f@oss.nttdata.com>
Message-ID: <247113e2-ca18-85c2-4743-e73ddd7dca14@oss.nttdata.com>

Hi Matthias,

Have you got the result from your internal nightly build / test?
Also all comments are welcome.


Thanks,

Yasumasa


On 2020/07/27 21:21, Yasumasa Suenaga wrote:
> On 2020/07/27 19:36, Baesken, Matthias wrote:
>> Hi Yasumasa, I put your patch into our internal nightly build/test queue.
> 
> Thanks!
> 
>>>    - Hyper-V is detected on Windows in spite of running on host OS
>>
>> So in your case it is a host that runs Hyper-V guests, and the host also reports "HyperV virtualization detected".
>> Do I understand you correctly?
> 
> Yes, that's right!
> 
>> Maybe we should be more specific in this case (however, I have not seen much of a problem so far with the current output).
>>
>>>    - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>
>> I think we had some virtualizations that needed checking CPUID with other than EAX = 40000000h for detection,
>> but these may have been old or buggy versions; I cannot find many details about it at the moment.
> 
> As David shows, virt-what checks CPUID with other than EAX = 40000000h, but at least the currently supported hypervisors (VMware, Hyper-V, KVM, Xen) do not seem to need it.
> 
>>>    - Does not check CPUID hypervisor present bit [1]
>>
>> Yes, this could/should indeed be added (we even did this in our internal JVM a while ago).
>>
>>>    - Does not support x86 (32bit) platform
>>
>> This is true; it was not added because I think 32-bit support is not very important any more in the current JDK.
>> However, if the new version supports 32-bit too, that is certainly a good thing!
> 
> I found "TODO support 32 bit" in the current code, so I am attempting to fix it :)
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
>> Best regards, Matthias
>>
>>
>>
>> -----Original Message-----
>> From: Yasumasa Suenaga <suenaga at oss.nttdata.com>
>> Sent: Montag, 27. Juli 2020 06:25
>> To: hotspot-runtime-dev at openjdk.java.net; Baesken, Matthias <matthias.baesken at sap.com>
>> Cc: David Holmes <david.holmes at oracle.com>
>> Subject: RFR: 8250598: Hyper-V is detected in spite of running on host OS
>>
>> Hi all,
>>
>> Please review this change:
>>
>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8250598
>>     webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8250598/webrev.00/
>>
>> When I got an hs_err log on Windows, I saw "HyperV virtualization detected" in it in spite of running on the host OS.
>>
>> The hypervisor detector was introduced in JDK-8219241, but it has the following problems:
>>
>>     - Hyper-V is detected on Windows in spite of running on host OS
>>     - Call CPUID with other than EAX = 40000000h (it is not described in the spec [1])
>>     - Does not check CPUID hypervisor present bit [1]
>>     - Does not support x86 (32bit) platform
>>
>> I've tested this change on the submit repo, and have checked the output of the VM.info jcmd in the following environments:
>>
>>     - Windows x64 (host)
>>     - Windows x64 (Hyper-V guest)
>>     - Fedora32 x64 (Hyper-V guest)
>>     - 32 bit JDK on Fedora32 x64 (Hyper-V guest)
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://kb.vmware.com/s/article/1009458
>>

From vladimir.a.ivanov at intel.com  Fri Jul 17 21:57:53 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 21:57:53 -0000
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
Message-ID: <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

>  +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
That method also reads the "/proc/cpuinfo" file, and I reused it as a "template" for the new method.
It is better to use one pattern when working with exactly the same file, but in general you are right.
X86 is defined in the file ./share/utilities/macros.hpp as:
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)

The question here: can I delete these "ifdefs", given that this method should work on x86 only?

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 2:26 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: add microcode version to the hs_err files


On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>> wrote:
Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?


Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.

Thanks, Thomas

But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Hello,

could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/

This patch adds the microcode version for different OSes, which may be useful in the issue resolution process.


The reported microcode version on different OSes looks like this:


Linux (RHEL7.7):

# cat hs_err_pid251046.log |grep microc

CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb


Windows (Win10, v1809):

CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt


MacOS (Darwin):

$ cat hs_err_pid95187.log |grep microc

CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt


Thanks, Vladimir


  Thanks, Vladimir

From vladimir.a.ivanov at intel.com  Fri Jul 17 22:52:56 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 22:52:56 -0000
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
Message-ID: <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

Thanks for your comment.
The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.01/

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 3:02 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: add microcode version to the hs_err files

Hi Vladimir,

On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>  +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
That method also reads the "/proc/cpuinfo" file, and I reused it as a "template" for the new method.
It is better to use one pattern when working with exactly the same file, but in general you are right.
X86 is defined in the file ./share/utilities/macros.hpp as:
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)

The question here: can I delete these "ifdefs", given that this method should work on x86 only?


os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp is shared among all architectures.

So, in the former you do not need to exclude non-x86 architectures.

Cheers, Thomas
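
To make the point about the guard concrete, here is a small sketch of how such macros behave. It mirrors the excerpt from macros.hpp quoted above, but the #else branch, the mapping from compiler-provided macros to IA32/AMD64, and the arch_family() function are assumptions added for illustration, not a copy of the HotSpot sources.

```cpp
#include <string>

// Assumption: derive HotSpot-style IA32/AMD64 names from the macros that
// common compilers predefine. In HotSpot these come from the build system.
#if defined(__x86_64__) || defined(_M_X64)
#define AMD64
#elif defined(__i386__) || defined(_M_IX86)
#define IA32
#endif

// Same shape as the quoted excerpt; the #else branch is an assumed completion.
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)
#else
#define X86_ONLY(code)
#define NOT_X86(code) code
#endif

// In a file shared by all architectures, the macros select the code to keep.
// Exactly one of the two statements survives preprocessing.
inline std::string arch_family() {
  X86_ONLY(return "x86";)
  NOT_X86(return "other";)
}
```

A file that is only ever compiled for x86, as described for os_linux_x86.cpp, needs neither the #if guard nor these macros, because the non-x86 branch can never be selected there.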

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
Sent: Friday, July 17, 2020 2:26 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>
Cc: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: add microcode version to the hs_err files


On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>> wrote:
Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?


Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.

Thanks, Thomas

But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Hello,

could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/

This patch adds the microcode version for different OSes, which may be useful in the issue resolution process.


The reported microcode version on different OSes looks like this:


Linux (RHEL7.7):

# cat hs_err_pid251046.log |grep microc

CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb


Windows (Win10, v1809):

CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt


MacOS (Darwin):

$ cat hs_err_pid95187.log |grep microc

CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt


Thanks, Vladimir


  Thanks, Vladimir

From sandhya.viswanathan at intel.com  Tue Jul 21 16:51:23 2020
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Tue, 21 Jul 2020 16:51:23 -0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
 <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>
Message-ID: <BYAPR11MB35432A7DA631DDFD29E58B5EEF780@BYAPR11MB3543.namprd11.prod.outlook.com>

Hi VladimirK,

Please let me know if I can push this onto jdk/jdk.

Best Regards,
Sandhya


-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Vladimir Kozlov
Sent: Monday, July 20, 2020 3:37 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Looks good.

Passed my tier1 testing.

Thanks,
Vladimir

On 7/20/20 10:12 AM, Ivanov, Vladimir A wrote:
> HI,
> The updated patch is available at
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
> It uses "fgets" instead of "getline" so that it works with local memory.
> The tier1 tests passed on the release and fastdebug builds on Linux and on the fastdebug builds on MacOS systems.
> Test results are the same for patched and non-patched builds.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 10:25 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net>; 
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> Oh, sorry, you are right :(
> 
> I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.
> 
> In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.
> 
> Thank you,
> 
> Thomas
> 
> 
> 
> On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Hi,
> It seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, using the c-heap is not a problem.
> The "::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds. But these tests do not generate a real hs_err file.
> It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe 
> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
> Sent: Friday, July 17, 2020 9:42 PM
> To: Ivanov, Vladimir A 
> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Vladimir Kozlov 
> <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>; 
> Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openj
> dk.java.net>>; 
> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at open
> jdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> Hi,
> 
> yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).
> 
> But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.
> 
> If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().
> 
> Cheers, Thomas
> 
> 
> 
> 
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Thanks, I expected C functions here. Let's wait a little for the Runtime team and then update the buffer handling.
> 
>   Thanks, Vladimir
> 
> -----Original Message-----
> From: Vladimir Kozlov 
> <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe 
> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>; Ivanov, 
> Vladimir A 
> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openj
> dk.java.net>>; 
> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at open
> jdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.
> 
> Someone from Runtime may suggest what is the best for this case.
> 
> Thanks,
> Vladimir K
> 
> [1] 
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share
> /runtime/os.cpp#l792
> 
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
>> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>>
>> I moved RFE to runtime group as Thomas said:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8249672
>>
>> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>>
>> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
>> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>>
>> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
>> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
>> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
>> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
>> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
>> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
>> V  [libjvm.so+0xcb2895]  init_globals()+0x55
>> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>>
>>
>> Regards,
>> Vladimir K
>>
>> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>>> Hi Vladimir,
>>>
>>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A < 
>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>
>>>>>    +#if defined(IA32) || defined(AMD64)
>>>>>
>>>>> Is that not synonymous with x86?
>>>>
>>>> This pattern was copied from the method "print_model_name_and_flags"
>>>> (file os/linux/os_linux.cpp).
>>>>
>>>> That method also reads the "/proc/cpuinfo" file, and I reused it as a
>>>> "template" for the new method.
>>>>
>>>> It is better to use one pattern when working with exactly the same file,
>>>> but in general you are right.
>>>>
>>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>>
>>>> #if defined(IA32) || defined(AMD64)
>>>>
>>>> #define X86
>>>>
>>>> #define X86_ONLY(code) code
>>>>
>>>> #define NOT_X86(code)
>>>>
>>>>
>>>>
>>>> The question here: can I delete these "ifdefs", given that this method
>>>> should work on x86 only?
>>>>
>>>>
>>>>
>>>
>>> os_linux_x86.cpp is compiled for x86 platforms only, whereas 
>>> os_linux.cpp is shared among all architectures.
>>>
>>> So, in the former you do not need to exclude non-x86 architectures.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>
>>>> *From:* Thomas Stüfe 
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>>> *To:* Ivanov, Vladimir A 
>>>> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; 
>>>> Hotspot dev runtime 
>>>> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at op
>>>> enjdk.java.net>>
>>>> *Cc:* 
>>>> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at o
>>>> penjdk.java.net>
>>>> *Subject:* Re: add microcode version to the hs_err files
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe 
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>>
>>>>
>>>> I think this would be more suited to hotspot-runtime.
>>>>
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>>
>>>>
>>>>
>>>> +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>>
>>>>
>>>>
>>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>>> +        char* rev = strchr(line, ':');
>>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>>> +        break;
>>>> +      }
>>>> +    }
>>>> +    free(line);
>>>>
>>>>
>>>>
>>>> Not sure this works as intended. At the first call to getline() it 
>>>> will allocate a line buffer for you and return it. That buffer will 
>>>> be as large as the first line you happen to read. You then pass 
>>>> that same buffer into getline to fetch the next lines, but what if 
>>>> those are longer than the first?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Forget that point, getline calls realloc() on the line buffer to 
>>>> resize it, so this should be okay.
>>>>
>>>>
>>>>
>>>> Thanks, Thomas
>>>>
>>>>
>>>>
>>>> But anyway it would be better to pass a simple caller provided 
>>>> buffer in - stack allocated. Since this function is called at crash 
>>>> time and the C heap could be corrupted.
>>>>
>>>>
>>>>
>>>> Cheers, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A < 
>>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> could you please review the patch
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>>
>>>> This patch adds the microcode version for different OSes, which may be
>>>> useful in the issue resolution process.
>>>>
>>>>
>>>>
>>>> The reported microcode version on different OSes looks like this:
>>>>
>>>>
>>>>
>>>> Linux (RHEL7.7):
>>>>
>>>> # cat hs_err_pid251046.log |grep microc
>>>>
>>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads 
>>>> per
>>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>>
>>>>
>>>>
>>>> Windows (Win10, v1809):
>>>>
>>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> MacOS (Darwin):
>>>>
>>>> $ cat hs_err_pid95187.log |grep microc
>>>>
>>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, 
>>>> tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>     Thanks, Vladimir
>>>>
>>>>

From vladimir.a.ivanov at intel.com  Mon Jul 20 17:13:26 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Mon, 20 Jul 2020 17:13:26 -0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
Message-ID: <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>

Hi,
The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
It uses "fgets" instead of "getline" so that it works with local memory.
The tier1 tests passed on the release and fastdebug builds on Linux and on the fastdebug builds on MacOS systems.
Test results are the same for patched and non-patched builds.

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 10:25 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Oh, sorry, you are right :(

I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.

In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.

Thank you,

Thomas


On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Hi,
It seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, using the c-heap is not a problem.
The "::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds. But these tests do not generate a real hs_err file.
It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
Sent: Friday, July 17, 2020 9:42 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().

Cheers, Thomas
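
The fgets() variant suggested above could look roughly like the following. This is a sketch under stated assumptions rather than the code from webrev.03: the function name parse_microcode_revision() and the FILE* parameter are hypothetical (taking a stream makes the parsing testable without /proc/cpuinfo), and the 2048-byte buffer follows the "2k byte array" estimate given elsewhere in this thread.

```cpp
#include <stdio.h>
#include <string.h>

// Scans a cpuinfo-style stream for the first line containing "microcode"
// and returns the hexadecimal value after the ':' separator.
// Returns 0 if no such line is found or the value cannot be parsed.
// Only a stack buffer is used, so nothing is allocated on the C heap.
inline unsigned int parse_microcode_revision(FILE* fp) {
  char line[2048];  // assumed to be large enough for one cpuinfo record
  unsigned int result = 0;
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (strstr(line, "microcode") != NULL) {
      const char* rev = strchr(line, ':');
      // "%x" skips leading whitespace and accepts an optional "0x" prefix.
      if (rev == NULL || sscanf(rev + 1, "%x", &result) != 1) {
        result = 0;
      }
      break;  // only the first matching line is considered
    }
  }
  return result;
}
```

A caller would open the file itself, for example with fopen("/proc/cpuinfo", "r"), pass the stream in, and close it afterwards. Lines longer than the buffer are delivered by fgets() in several pieces instead of triggering a reallocation.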


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Thanks, I expected C functions here. Let's wait a little for the Runtime team and then update the buffer handling.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or you need to allocate the 'line' buffer with HotSpot's os::malloc().

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>
> I moved RFE to runtime group as Thomas said:
>
> https://bugs.openjdk.java.net/browse/JDK-8249672
>
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>
>
> Regards,
> Vladimir K
>
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method "print_model_name_and_flags"
>>> (file os/linux/os_linux.cpp).
>>>
>>> That method also reads the "/proc/cpuinfo" file, and I reused it as a
>>> "template" for the new method.
>>>
>>> It is better to use one pattern when working with exactly the same
>>> file, but in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these "ifdefs", given that this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it
>>> will allocate a line buffer for you and return it. That buffer will
>>> be as large as the first line you happen to read. You then pass that
>>> same buffer into getline to fetch the next lines, but what if those
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided
>>> buffer in - stack allocated. Since this function is called at crash
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From vladimir.a.ivanov at intel.com  Fri Jul 17 23:24:54 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 23:24:54 -0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
Message-ID: <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

Thanks, I expected the C functions here. Let's wait a bit for the Runtime team and then update the buffer handling.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com> 
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or you need to allocate the 'line' buffer with HotSpot's os::malloc().
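
To illustrate the pairing rule, here is a minimal sketch that keeps allocation and release on the same allocator. count_lines is a hypothetical helper written only for this example, not code from the patch: the buffer that getline() hands back is released with the global ::free(), never with a wrapper that expects its own bookkeeping around the block.

```cpp
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

// Hypothetical helper: counts the lines in a stream using POSIX getline().
static int count_lines(FILE* fp) {
  assert(fp != nullptr);
  char* line = nullptr;  // getline() allocates (and may realloc) this buffer
  size_t cap = 0;
  int count = 0;
  while (getline(&line, &cap, fp) != -1) {
    count++;
  }
  // The buffer came from libc's allocator, so it goes back to libc's
  // ::free(). An allocator wrapper that expects its own headers or guard
  // bytes around the block must not be used here.
  ::free(line);
  return count;
}
```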

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
> 
> I moved RFE to runtime group as Thomas said:
> 
> https://bugs.openjdk.java.net/browse/JDK-8249672
> 
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
> 
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> 
> 
> Regards,
> Vladimir K
> 
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method "print_model_name_and_flags"
>>> (file os/linux/os_linux.cpp).
>>>
>>> That method also reads the "/proc/cpuinfo" file, and I reused it as a
>>> "template" for the new method.
>>>
>>> It is better to use one pattern when working with exactly the same
>>> file, but in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these "ifdefs", given that this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas 
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev 
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it 
>>> will allocate a line buffer for you and return it. That buffer will 
>>> be as large as the first line you happen to read. You then pass that 
>>> same buffer into getline to fetch the next lines, but what if those 
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to 
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided 
>>> buffer in - stack allocated. Since this function is called at crash 
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A < 
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per 
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, 
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per 
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, 
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, 
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, 
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per 
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, 
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, 
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, 
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From vladimir.a.ivanov at intel.com  Sat Jul 18 05:13:23 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Sat, 18 Jul 2020 05:13:23 -0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
Message-ID: <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>

Hi,
It seems this info is collected during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, using the C heap is not a problem.
"::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds, but these tests do not generate a real case for hs_err files.
It looks like a 2K byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.
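
A rough sketch of what the local-buffer version might look like, under stated assumptions: microcode_revision_from is a hypothetical name, the function takes an already opened stream so it can be exercised on any cpuinfo-like text, and the 2048-byte size is simply the 2K figure mentioned above. It is not the code from the webrev.

```cpp
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical sketch, not the code from the webrev: scans a
// /proc/cpuinfo-style stream for the first "microcode" line and parses the
// hexadecimal value after the colon. Returns 0 if no such line is found.
static unsigned int microcode_revision_from(FILE* fp) {
  assert(fp != nullptr);
  char data[2048];  // local buffer; the size is the 2K assumed above
  unsigned int result = 0;
  while (fgets(data, sizeof(data), fp) != nullptr) {
    if (strstr(data, "microcode") != nullptr) {
      const char* rev = strchr(data, ':');
      if (rev != nullptr) {
        // %x skips leading blanks and accepts an optional "0x" prefix.
        sscanf(rev + 1, "%x", &result);
      }
      break;
    }
  }
  return result;
}
```

A line longer than the buffer would be delivered in pieces by fgets(), so the sketch relies on cpuinfo lines staying below 2K.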

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 9:42 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()), I would pass a stack-allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().

Cheers, Thomas


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
Thanks, I expected the C functions here. Let's wait a bit for the Runtime team and then update the buffer handling.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or you need to allocate the 'line' buffer with HotSpot's os::malloc().

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>
> I moved RFE to runtime group as Thomas said:
>
> https://bugs.openjdk.java.net/browse/JDK-8249672
>
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>
>
> Regards,
> Vladimir K
>
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method "print_model_name_and_flags"
>>> (file os/linux/os_linux.cpp).
>>>
>>> That method also reads the "/proc/cpuinfo" file, and I reused it as a
>>> "template" for the new method.
>>>
>>> It is better to use one pattern when working with exactly the same
>>> file, but in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these "ifdefs", given that this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it
>>> will allocate a line buffer for you and return it. That buffer will
>>> be as large as the first line you happen to read. You then pass that
>>> same buffer into getline to fetch the next lines, but what if those
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided
>>> buffer in - stack allocated. Since this function is called at crash
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>